Q and A with the experts: What is watermarking?
Waterloo professor Florian Kerschbaum discusses the effectiveness of watermarking AI-generated content
By Media Relations
AI companies such as OpenAI, Alphabet, the parent company of Google, and Meta Platforms have made voluntary commitments to implement measures such as watermarking AI-generated content to help make the technology safer. Watermarking is a process where a message is embedded into content, e.g., a text or an image, which can later be retrieved even if the content has been modified. Dr. Florian Kerschbaum, professor at the David R. Cheriton School of Computer Science and member of the Cybersecurity and Privacy Institute at the University of Waterloo, discusses the effectiveness of watermarking AI-generated content.
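To make the embed-and-retrieve idea concrete, here is a deliberately simplified sketch, not any company's actual scheme: it hides a short bit string in the least-significant bits of 8-bit "pixel" values, so the message can be read back while the content looks essentially unchanged. Real watermarks for AI-generated text and images are far more sophisticated and are designed to survive modification.

```python
# Toy watermark sketch (hypothetical, for illustration only):
# hide message bits in the least-significant bits of 8-bit values.

def embed(pixels, bits):
    """Overwrite the LSB of each leading pixel with a message bit."""
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b  # clear the LSB, then set it to b
    return out

def extract(pixels, n_bits):
    """Read the message back from the least-significant bits."""
    return [p & 1 for p in pixels[:n_bits]]

pixels = [200, 13, 77, 142, 91, 54, 230, 8]
message = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(pixels, message)

assert extract(marked, 8) == message          # message is retrievable
assert all(abs(a - b) <= 1                    # each value changed by at most 1
           for a, b in zip(pixels, marked))
```

Because each pixel changes by at most one intensity level, the embedded message is invisible to a casual viewer; that trade-off between imperceptibility and recoverability is the core of any watermarking scheme.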
Does watermarking AI-generated content truly enhance safety and security?
For a watermark to be useful as a security and safety feature of generative AI, it must be accessible and, at the same time, reliable. 'Accessible' means that it must be possible to test whether the content is AI-generated. When deploying watermarks on the scale suggested by the White House, big tech companies must decide who gets to do these tests and how. 'Reliable' means that malicious actors should not be able to remove the watermark trivially, even though they may be allowed to test for its presence. This is a significant technical challenge.
Is it possible for malicious actors to remove watermarking?
Watermarking AI content works only if the content generator cooperates by embedding the watermark. The pledge by the tech companies at the White House meeting provides some hope that they will do so for most of their AI-generated content. However, even though the watermark is seemingly entangled with the content, a skilled attacker can still remove it. Scientists have investigated this question for decades, and the answers remain unsatisfactory; we still rely primarily on a trial-and-error approach.
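The fragility problem can be seen even in a toy scheme. Continuing the simplified, hypothetical least-significant-bit example (not a real product's watermark): an attacker who guesses where the watermark lives can strip it with a change of at most one intensity level per pixel, which is visually imperceptible, yet the message is gone.

```python
# Toy removal attack on a hypothetical LSB watermark (illustration only):
# zeroing every least-significant bit destroys the message while
# changing each 8-bit value by at most 1.

def extract_lsb(pixels, n_bits):
    """Read watermark bits from the least-significant bits."""
    return [p & 1 for p in pixels[:n_bits]]

def strip_watermark(pixels):
    """Attacker's move: zero every LSB."""
    return [p & ~1 for p in pixels]

watermarked = [201, 12, 77, 143, 90, 54, 231, 8]  # carries bits 1,0,1,1,0,0,1,0
assert extract_lsb(watermarked, 8) == [1, 0, 1, 1, 0, 0, 1, 0]

stripped = strip_watermark(watermarked)
assert extract_lsb(stripped, 8) == [0] * 8        # message destroyed
assert all(abs(a - b) <= 1                        # content barely altered
           for a, b in zip(watermarked, stripped))
```

Robust watermarking research aims to survive exactly this kind of low-distortion modification, which is why the cat-and-mouse dynamic between embedders and attackers has persisted for decades.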
Can AI be used to generate watermarks?
AI itself may help design better watermarks. AI's greatest weakness is that humans do not understand how it works. But the fact that it exceeds human performance in many tasks, such as image recognition, may help in designing more robust watermarks. Using AI, one can embed watermarks that only AI can detect. Again, however, it is still unclear to scientists whether this approach is reliable enough for deployment at the scale required by Amazon, Anthropic, Meta, Google, Inflection, and OpenAI.
This series is produced for the media, and its purpose is to share the expertise of UWaterloo researchers. To reach this researcher, please contact media relations.