A tool that watermarks AI-generated text so it can be detected has been developed

A study published in the journal Nature describes a tool capable of inserting watermarks into text generated by large language models (LLMs), a type of artificial intelligence (AI) system, making artificially created content easier to identify and trace. The tool uses a sampling algorithm to subtly bias the model's choice of words, embedding a signature that detection software can recognise.

23/10/2024 - 17:00 CEST
Expert reactions

Pablo Haya Coll

Researcher at the Computational Linguistics Laboratory of the Autonomous University of Madrid (UAM) and director of Business & Language Analytics (BLA) at the Institute of Knowledge Engineering (IIC)

Science Media Centre Spain

The article presents a technically robust solution for identifying AI-generated text through watermarking. A watermark adds invisible information to digital content (such as images, video, audio or text) to identify its origin. In this case, the watermark is embedded by altering the word-generation algorithm so that the chosen words follow a traceable statistical pattern without changing the meaning.

For example, suppose the large language model (LLM) had produced the following sentence: 'The dossier shows that the market has seen significant growth over the last quarter.'

The watermarking algorithm generates an equivalent sentence, choosing a series of words that, without modifying the meaning, follow a statistical relationship known to the algorithm: 'The report indicates that the market has had a notable advance during the last quarter.'

In this example, the chance that these four words, 'report', 'indicates', 'notable' and 'advance', rather than others, all appear in the same sentence is low for an ordinary LLM and high if the watermarking algorithm has been used.
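To make the idea concrete, the sketch below shows one well-known way such a statistical bias can be implemented: a secret key and the previous token deterministically select a 'green' subset of the vocabulary, and the sampler slightly boosts those tokens' probabilities. This is a simplified illustration of the general technique, not the specific algorithm of the Nature paper; the names (green_mask, watermarked_sample) and parameter values (GREEN_FRACTION, BIAS) are hypothetical.

```python
# Minimal sketch of statistical text watermarking via biased sampling.
# NOT the paper's algorithm: a simplified keyed "green list" scheme.
import hashlib
import numpy as np

VOCAB_SIZE = 50_000
GREEN_FRACTION = 0.5   # hypothetical share of the vocabulary marked "green" each step
BIAS = 2.0             # hypothetical logit boost applied to green tokens

def green_mask(prev_token: int, key: str) -> np.ndarray:
    """Deterministically derive this step's green list from the secret key
    and the previous token, so a detector holding the key can reproduce it."""
    seed = int.from_bytes(
        hashlib.sha256(f"{key}:{prev_token}".encode()).digest()[:8], "big"
    )
    rng = np.random.default_rng(seed)
    return rng.random(VOCAB_SIZE) < GREEN_FRACTION

def watermarked_sample(logits: np.ndarray, prev_token: int, key: str) -> int:
    """Sample the next token after nudging green-token logits upward."""
    biased = logits + BIAS * green_mask(prev_token, key)
    probs = np.exp(biased - biased.max())   # stable softmax
    probs /= probs.sum()
    return int(np.random.default_rng().choice(VOCAB_SIZE, p=probs))
```

Because the boost is small and spread across half the vocabulary, the model can usually still pick a fluent synonym ('report' instead of 'dossier'), which is why the meaning is preserved while the statistical fingerprint accumulates over many words.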

While it is easy to insert watermarks into images, video or audio, text presents a challenge, as any alteration of the words can significantly affect the meaning and quality of the content. Current systems for detecting whether a document has been generated by AI have low success rates, so technologies that make authorship easier to establish are much needed. Moreover, these techniques align with the transparency obligations of the EU's AI Regulation, which requires providers, at certain risk levels, to ensure that AI-generated content is identifiable.
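Under the same toy scheme sketched above, detection amounts to recomputing each step's green list with the secret key and testing whether green tokens are over-represented; a one-sided z-test is a common choice for this. Again, this is an illustrative sketch consistent with the hypothetical green_mask and GREEN_FRACTION defined earlier, not the paper's detector.

```python
# Illustrative detector for the toy green-list scheme above (hypothetical).
import math

def detect(tokens: list[int], key: str) -> float:
    """Return a z-score: large positive values suggest the watermark is present."""
    hits = sum(
        int(green_mask(prev, key)[tok])     # was this token on its step's green list?
        for prev, tok in zip(tokens, tokens[1:])
    )
    n = len(tokens) - 1                     # number of scored transitions
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std

# Unwatermarked text lands on the green lists only about half the time,
# so a z-score above roughly 4 would be very unlikely for human-written text.
```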

However, the widespread adoption of these technologies remains a challenge, mainly because this type of watermarking is vulnerable to subsequent manipulation, such as edits to the text or the use of paraphrasing techniques, which reduce how reliably the mark can be detected.

The author has not responded to our request to declare conflicts of interest
Publications
Scalable watermarking for identifying large language model outputs
  • Research article
  • Peer reviewed
Journal
Nature
Publication date
23/10/2024
Authors

Dathathri et al.
