Synopsis

  • Task: Insert a watermark into a given text; then, after the text has been attacked, detect the inserted watermark.
  • Registration: [CLEF labs]
  • Important dates:
    • May 07, 2026: software submission
    • May 28, 2026: participant notebook submission [template] [submission – select "Stylometry and Digital Text Forensics (PAN)"]
  • Data: TBA
  • Evaluation Measures: Balanced Accuracy, BLEU, BERTScore

Task Overview

In the Text Watermarking task, participants are given a text and must insert a watermark into it. Participants submit their watermarking system, including a watermark detection algorithm, through TIRA; the system is then run on the test dataset, and the watermarked texts are subjected to various attacks. The objective is to detect the watermark after the text has been attacked, thereby demonstrating robustness against an attacker.

The task is structured as follows:

  1. Develop a text watermarking system that inserts a watermark into a text. You can use the provided training dataset for this purpose.
  2. Submit your watermarking system, including the watermark detection, through the TIRA platform.
  3. We will run your submitted watermarking system on our test dataset.
  4. We will then carry out attacks of varying severity on the watermarked texts from the test dataset.
  5. We will run your watermark detection system on the attacked texts to evaluate its performance in detecting watermarks.

Important: The watermarked text must remain semantically close to the original. Watermarked texts that deviate significantly in meaning from the original will be penalized during evaluation.
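
To make the insert/detect interplay concrete, here is a toy sketch of a keyed synonym-substitution watermark in Python. It is an illustration only, not a baseline: the key, the three synonym pairs, and the 0.9 agreement threshold are all made up, and competitive schemes typically bias token choices during LLM generation rather than post-editing individual words.

import hashlib

KEY = b"secret-key"  # hypothetical shared secret
PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]
VARIANTS = {w: pair for pair in PAIRS for w in pair}

def keyed_choice(pair, prev_word):
    # Deterministically pick one variant of the pair, keyed on the previous word.
    h = hashlib.sha256(KEY + prev_word.lower().encode()).digest()
    return pair[h[0] % 2]

def insert_watermark(text):
    out = []
    for tok in text.split():
        low = tok.lower()
        if low in VARIANTS and out:
            out.append(keyed_choice(VARIANTS[low], out[-1]))
        else:
            out.append(tok)
    return " ".join(out)

def detect_watermark(text, threshold=0.9):
    # Fraction of variant words that agree with the keyed choice; an
    # unwatermarked text agrees only about half the time.
    tokens = text.split()
    hits = total = 0
    for i in range(1, len(tokens)):
        low = tokens[i].lower()
        if low in VARIANTS:
            total += 1
            hits += low == keyed_choice(VARIANTS[low], tokens[i - 1])
    return 1.0 if total and hits / total >= threshold else 0.0

Because substitutions only swap near-synonyms, the watermarked text stays close to the original, and detection degrades gracefully as an attack paraphrases more of the marked words, which is exactly the trade-off the evaluation measures.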

Submission

Participants will submit their systems as Docker images through the TIRA platform. Submitted systems are not expected to be trained on TIRA, but they must be standalone and runnable on the platform without contacting the outside world (evaluation runs will be sandboxed).
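
For orientation, a submission image could be built from a minimal Dockerfile along the following lines. The file names are placeholders; the TIRA documentation is authoritative for the exact requirements.

# Hypothetical minimal submission image; file names are placeholders.
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
# Install all dependencies at build time: the sandboxed evaluation run
# cannot download anything.
RUN pip install --no-cache-dir -r requirements.txt
COPY mySoftware.py .

# TIRA appends the input file and output directory as arguments at run time.
ENTRYPOINT ["python", "/app/mySoftware.py"]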

The submitted software must be executable inside the container via a command line call. The script must take two arguments: an input file (an absolute path to the input JSONL file) and an output directory (an absolute path to where the results will be written).

Within TIRA, the input file will be called dataset.jsonl, so with the pre-defined TIRA placeholders, your software should be invoked like this:

$ mySoftware $inputDataset/dataset.jsonl $outputDir

Within $outputDir, a single (!) file with the file extension *.jsonl must be created with the following format:

{"id": "bea8cccd-0c99-4977-9c1b-8423a9e1ed96", "label": 1.0}
{"id": "a963d7a0-d7e9-47c0-be84-a40ccc2005c7", "label": 0.0}
...
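
A minimal sketch of such an entry point is shown below. The input field name "text", the output file name labels.jsonl, and the detect() stub are assumptions to be replaced by your actual system.

import json
import sys
from pathlib import Path

def detect(text):
    # Placeholder: replace with your watermark detection (1.0 = watermarked).
    return 0.0

def main():
    input_file, output_dir = sys.argv[1], Path(sys.argv[2])
    output_dir.mkdir(parents=True, exist_ok=True)
    with open(input_file, encoding="utf-8") as fin, \
         open(output_dir / "labels.jsonl", "w", encoding="utf-8") as fout:
        for line in fin:
            record = json.loads(line)
            label = detect(record.get("text", ""))
            fout.write(json.dumps({"id": record["id"], "label": label}) + "\n")

if __name__ == "__main__":
    main()

Invoked as shown above, this writes exactly one *.jsonl file with one id/label pair per input line into $outputDir.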

Evaluation

Systems will be evaluated with the following metrics:

  • Overall score, called Text Watermarking Fidelity (TWF): TWF = max(BLEU, BERTScore) · Balanced Accuracy (the computation is spelled out in the sketch after this list).
  • Balanced Accuracy: the average of the true positive rate and the true negative rate of the watermark detection.
  • BLEU: surface (n-gram) similarity of the watermarked text to the original text.
  • BERTScore: semantic similarity between the watermarked text and the original text.
  • In addition, the confusion matrix is reported, from which the true/false positive/negative rates are calculated.
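
Plugged into the example evaluator output below, TWF = max(0.901, 0.85) · 0.974 ≈ 0.878. The following sketch spells out the arithmetic; the confusion-matrix layout ([[TN, FP], [FN, TP]], as in scikit-learn) is an assumption.

def balanced_accuracy(confusion):
    (tn, fp), (fn, tp) = confusion
    tpr = tp / (tp + fn)  # true positive rate: recall on watermarked texts
    tnr = tn / (tn + fp)  # true negative rate: recall on unwatermarked texts
    return (tpr + tnr) / 2

def twf(bleu, bert_score, bal_acc):
    # A detector is only rewarded to the extent that the watermarked
    # text stays close to the original.
    return max(bleu, bert_score) * bal_acc

print(twf(bleu=0.901, bert_score=0.85, bal_acc=0.974))  # ≈ 0.878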

The evaluator for the task will output the above measures as JSON like so:

{
    "twf": 0.878,
    "balanced_accuracy": 0.974,
    "bleu": 0.901,
    "bert-score": 0.85,
    "confusion": [
        [
            1211,
            66
        ],
        [
            27,
            2285
        ]
    ]
}

Baselines

TBA

Leaderboard

TBD

Task Committee