Generative Plagiarism Detection 2026

Synopsis

  • Task: Given a document and a collection of documents, identify all sources in the collection that the document plagiarizes.
  • Details will be announced shortly.
  • Important dates:
    • February 28, 2026: dataset release
    • May 07, 2026: software submission [Tira]
    • May 28, 2026: participant notebook submission [template] [submission – select "Stylometry and Digital Text Forensics (PAN)"]

Task Overview

This task follows the classic retrieval task design. The given document (the query) is fully generated by an undisclosed LLM that was instructed to write a new scientific document based on multiple (at least two) source documents. The goal is to identify all sources of the given query document within a given collection.
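
To illustrate the retrieval framing, here is a minimal baseline sketch that ranks every document in a collection by TF-IDF cosine similarity to the query. It is not an official baseline: the function name `rank_sources`, the dictionary-based collection layout, and the document ids are assumptions made for illustration, and the actual data format, submission format, and evaluation measure have not been announced.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_sources(query, collection):
    """Rank collection documents by TF-IDF cosine similarity to the query.

    `collection` maps a (hypothetical) document id to its plain text.
    Returns a list of (doc_id, score) pairs, highest score first.
    """
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in collection.items()}
    n_docs = len(doc_tokens)

    # Document frequency: number of collection documents containing each term.
    df = Counter()
    for tokens in doc_tokens.values():
        df.update(set(tokens))

    def vectorize(tokens):
        # Smoothed IDF keeps weights positive, even for unseen query terms.
        counts = Counter(tokens)
        return {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = vectorize(tokenize(query))
    scores = {
        doc_id: cosine(query_vec, vectorize(tokens))
        for doc_id, tokens in doc_tokens.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

Because each query draws on at least two sources, a system built on such a ranking would still need a rule for how many top-ranked documents to report, for example a score threshold tuned on the released data. Lexical overlap is also likely to be a weak signal when the LLM paraphrases heavily, so this sketch is best read as a starting point rather than a competitive approach.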

Data

Please register first at Tira. The dataset contains copyrighted material and may be used only for research purposes. Redistribution is not allowed.

Further information will be published soon.

Results

tba.
  1. Plagiarism Detection, PAN @ CLEF'25
  2. Plagiarism Detection, PAN @ CLEF'14
  3. Plagiarism Detection, PAN @ CLEF'13
  4. Plagiarism Detection, PAN @ CLEF'12
  5. Plagiarism Detection, PAN @ CLEF'11
  6. Plagiarism Detection, PAN @ CLEF'10
  7. Plagiarism Detection, PAN @ SEPLN'09

Task Committee