Intrinsic Plagiarism Detection 2009
Synopsis
- Task: Given a set of suspicious documents, the task is to identify all plagiarized text passages, e.g., by detecting writing style breaches. Comparing a suspicious document with other documents is not allowed in this task.
- Input: [data]
- Evaluator: [code]
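To make the idea of a writing style breach concrete, here is a minimal sketch that compares the character trigram profile of each sliding window with the profile of the whole document and flags windows that deviate strongly. It is an illustration only and is not any participant's method; the function names, the dissimilarity formula, the window and step sizes, and the threshold are all assumptions chosen for the example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def profile_distance(window_profile, document_profile):
    """Dissimilarity in [0, 1] between relative n-gram frequencies.

    Only n-grams that occur in the window contribute (an assumption of
    this sketch), so short windows are not penalized for missing n-grams.
    """
    window_total = sum(window_profile.values())
    document_total = sum(document_profile.values())
    if window_total == 0 or document_total == 0:
        return 0.0
    distance = 0.0
    for gram, count in window_profile.items():
        p = count / window_total
        q = document_profile[gram] / document_total
        distance += ((p - q) / (p + q)) ** 2
    return distance / len(window_profile)


def flag_style_breaches(text, window=1000, step=500, n=3, threshold=0.5):
    """Return (offset, length) pairs of windows whose style deviates.

    The threshold is arbitrary here and would need tuning on the
    training corpus.
    """
    document_profile = char_ngrams(text, n)
    flagged = []
    for start in range(0, max(len(text) - window, 0) + 1, step):
        chunk = text[start:start + window]
        score = profile_distance(char_ngrams(chunk, n), document_profile)
        if score > threshold:
            flagged.append((start, len(chunk)))
    return flagged
```

Note that only the suspicious document itself is inspected, in line with the rule that no other documents may be consulted. Overlapping flagged windows would normally be merged before reporting, which this sketch leaves out.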
Award
We are happy to announce the following overall winner of the 1st International Competition on Plagiarism Detection, who will be awarded 500 euros sponsored by Yahoo! Research:
- The winner of the intrinsic analysis task is Efstathios Stamatatos from the University of the Aegean.
Congratulations!
Input
To develop your approach, we provide you with a training corpus which comprises a set of suspicious documents, each of which may contain plagiarized passages.
Output
For each suspicious document suspicious-documentXYZ.txt found in the evaluation corpora, your plagiarism detector shall output an XML file suspicious-documentXYZ.xml which contains meta information about all plagiarism cases detected within:
<document reference="suspicious-documentXYZ.txt">
<feature name="detected-plagiarism"
this_offset="5"
this_length="1000"
/>
...
</document>
The XML documents must be valid with respect to the XML schema found here.
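As an illustration of the output format above, the following sketch writes one such XML file with Python's standard library. The helper names `build_detection_xml` and `write_detection_xml` are hypothetical, and the detections are assumed to be available as (offset, length) pairs; the sketch does not validate against the schema, so the schema remains the authority on what is accepted.

```python
import xml.etree.ElementTree as ET


def build_detection_xml(document_name, detections):
    """Return an XML string listing detected passages.

    detections is assumed to be an iterable of (offset, length) pairs.
    """
    root = ET.Element("document", {"reference": document_name})
    for offset, length in detections:
        ET.SubElement(
            root,
            "feature",
            {
                "name": "detected-plagiarism",
                "this_offset": str(offset),
                "this_length": str(length),
            },
        )
    return ET.tostring(root, encoding="unicode")


def write_detection_xml(document_name, detections, output_path):
    """Write the detections for one suspicious document to output_path."""
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.write(build_detection_xml(document_name, detections))
```

For example, `write_detection_xml("suspicious-documentXYZ.txt", [(5, 1000)], "suspicious-documentXYZ.xml")` would produce a file with a single `feature` element carrying the same attributes as the sample above.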
Evaluation
Performance will be measured using macro-averaged precision and recall, granularity, and the plagdet score, which is a combination of the first three measures. For your convenience, we provide a reference implementation of the measures written in Python.
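The page does not spell out how the three measures are combined. As a rough sketch, assuming the commonly stated definition in which plagdet is the harmonic mean of precision and recall divided by log2(1 + granularity), the score can be computed from already-measured values as follows. The function names are illustrative, and the provided reference implementation remains authoritative.

```python
import math


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def plagdet(precision, recall, granularity):
    """Combine the three measures, assuming F / log2(1 + granularity).

    A granularity of 1 (each plagiarized passage detected in one piece)
    leaves the F-measure unchanged; larger values lower the score.
    """
    if granularity < 1:
        raise ValueError("granularity must be at least 1")
    return f_measure(precision, recall) / math.log2(1 + granularity)
```

Under this assumption, a detector that reports every passage in three fragments has its F-measure halved, since log2(1 + 3) = 2.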
Results
Intrinsic Plagiarism Detection Performance

| Plagdet | Participant |
|---|---|
| 0.2462 | E. Stamatatos (University of the Aegean, Greece) |
| 0.1955 | B. Hagbi and M. Koppel (Bar Ilan University, Israel) |
| 0.1766 | M. Zechner, M. Muhr, R. Kern, and M. Granitzer (Know-Center Graz, Austria) |
| 0.1219 | L. M. Seaward and S. Matwin (University of Ottawa, Canada) |
A more detailed analysis of the detection performances with respect to precision, recall, and granularity can be found in the overview paper accompanying this task.




