A key ingredient of evaluation is data. For PAN's shared tasks on digital text forensics, a number of datasets have been compiled and used to evaluate dozens of approaches. Using these datasets in your research ensures comparability. You are also welcome to submit datasets of your own making to PAN's shared tasks.
Datasets that comprise documents with text reused from the ClueWeb.
Datasets that comprise pairs of documents that may share reused text.
Datasets that comprise samples of Wikipedia edits along with annotations indicating whether or not they are vandalism.
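As an illustration, a vandalism corpus like the one described above typically pairs edit identifiers with gold labels. The sketch below uses a hypothetical CSV layout (column names `editid` and `class` are assumptions, not the documented PAN format — consult the corresponding task page for the actual schema):

```python
import csv
import io

# Hypothetical annotation file: each row pairs an edit id with a
# gold label ("vandalism" or "regular"). The real PAN corpus layout
# may differ; this only illustrates the general structure.
sample = """editid,class
1001,regular
1002,vandalism
1003,regular
"""

# Map each edit id to its gold label.
labels = {row["editid"]: row["class"] for row in csv.DictReader(io.StringIO(sample))}

# Collect the ids of edits annotated as vandalism.
vandalism_ids = [eid for eid, cls in labels.items() if cls == "vandalism"]
print(vandalism_ids)  # ['1002']
```

A classifier evaluated on such a corpus would be scored by comparing its predictions against these gold labels.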
Got a new dataset for one of PAN's tasks? We welcome submissions of new datasets for all tasks. As long as a dataset is formatted the same way as the other datasets for the same task, all software previously submitted to that task can be run against it.