Google DeepMind SynthID AI Watermarking Technology Open-Sourced to Businesses and Developers

SynthID can watermark AI-generated content across several modalities, including text, images, audio, and video.

On Wednesday, Google DeepMind made a new watermarking technology for AI-generated text publicly available. SynthID is an artificial intelligence (AI) watermarking tool that works across a variety of modalities, including text, images, videos, and audio. For now, however, Google is offering only the text watermarking tool to developers and businesses. The company hopes that broader adoption will make AI-generated content easier to identify. The tool is available to both individuals and businesses through the Mountain View-based tech giant’s updated Responsible Generative AI Toolkit.

Google DeepMind’s AI Text Watermarking Technology Is Open-Source

In a post on X (formerly known as Twitter), Google DeepMind’s official handle announced that SynthID’s text watermarking capability is now freely available to developers and businesses. In addition to the Responsible GenAI Toolkit, the tool can be downloaded from Google’s Hugging Face listing.

There is already a lot of AI-generated content available online. According to a study released earlier this year by the Amazon Web Services AI Lab, up to 57.1 percent of all online sentences translated into two or more languages may have been produced by AI tools.

While AI chatbots filling the Internet with gibberish might appear to be a case of harmless spamming, there is a darker side to it. In the hands of bad actors, AI tools can be used to mass-generate misinformation or misleading content. Because a large share of social discourse takes place online, such content could influence real-world events like elections or be used to spread propaganda against public figures.

Of all the modalities, AI-generated text has so far proven the most difficult to detect. This is mostly because words themselves cannot be watermarked the way pixels can, and even if they could, malicious actors could always run the text through a second generation pass to reword it.

Google DeepMind’s SynthID, however, takes a different approach to watermarking AI-generated text. The tool uses machine learning to predict which words might follow a given word in a sentence. Take the statement “John was feeling extremely tired after working the entire day,” for example. Only a limited number of words can plausibly come after the word “extremely.”

Based on an analysis of the content generation styles of different AI models, SynthID can anticipate the word that will come after “extremely” and substitute it with a synonym from its database. The watermarking tool embeds such words throughout the entire piece of content. Later, when checking whether content is AI-generated, the tool measures how frequently these words occur.
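The word-preference idea described above can be illustrated with a small toy sketch. This is not SynthID's actual algorithm or API: the secret key, the hash-based `is_favoured` rule, the candidate lists, and every function name here are assumptions made purely for illustration. The generator prefers candidate words that a keyed hash marks as "favoured", and the detector measures how often favoured words appear.

```python
import hashlib

SECRET_KEY = "demo-key"  # hypothetical key shared by generator and detector


def is_favoured(prev_word, word, key=SECRET_KEY):
    """Keyed pseudo-random rule: roughly half of all words are 'favoured'
    in any given context (assumed rule, not SynthID's)."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0


def choose_word(prev_word, candidates, key=SECRET_KEY):
    """Pick the first favoured candidate, else fall back to the first one."""
    for word in candidates:
        if is_favoured(prev_word, word, key):
            return word
    return candidates[0]


def generate(slots, key=SECRET_KEY):
    """Build a sentence from lists of interchangeable candidate words."""
    words, prev = [], "<s>"
    for candidates in slots:
        word = choose_word(prev, candidates, key)
        words.append(word)
        prev = word
    return words


def favoured_fraction(words, key=SECRET_KEY):
    """Detector: share of words that are favoured given the previous word."""
    if not words:
        return 0.0
    contexts = ["<s>"] + words[:-1]
    hits = sum(is_favoured(p, w, key) for p, w in zip(contexts, words))
    return hits / len(words)


slots = [
    ["John"], ["was", "felt"], ["feeling", "getting"],
    ["extremely", "very", "really", "incredibly"],
    ["tired", "exhausted", "weary", "drained"],
    ["after", "following"], ["working", "labouring", "toiling"],
    ["the"], ["entire", "whole", "full"], ["day", "shift"],
]

watermarked = generate(slots)
plain = [candidates[0] for candidates in slots]  # no watermark preference
print(" ".join(watermarked), favoured_fraction(watermarked))
print(" ".join(plain), favoured_fraction(plain))
```

In this sketch, text with no watermark should score near 0.5 on average over long passages, while watermarked text tends to score higher wherever the generator had alternatives to choose from. A ten-word sentence is far too short for a reliable verdict, so the printed fractions are only indicative.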

It is interesting to note that, for images and videos, SynthID embeds the watermark directly into the pixel structure, making it imperceptible to the human eye yet still detectable by the tool. For audio, the sound waves are first transformed into a spectrogram, and the watermark is then added to that visual representation. Currently, no one outside of Google has access to these features.
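The first step of the audio process, turning a waveform into a spectrogram, can be sketched in a few lines. This is only a minimal illustration using a plain short-time Fourier transform; the frame size, hop length, window, and test tone are assumptions, and how SynthID actually embeds its watermark in the resulting image is not covered here.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each a list of frequency-bin magnitudes."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window to reduce leakage between frequency bins
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Direct DFT for bin k (slow, but dependency-free)
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical input: a 1 kHz sine tone sampled at 8 kHz
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]

spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), "frames,", len(spec[0]), "bins, peak at bin", peak_bin)
```

Each row of `spec` is one time slice and each column one frequency band, which is what makes the audio look like an image that a watermark could, in principle, be written into.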