Synopsis

  • Task: Given a set of edits on Wikipedia articles, separate the ill-intentioned edits from the well-intentioned edits.
  • Input: [data (de, es)] [data (en)]

Introduction

Vandalism has always been one of Wikipedia's biggest problems, yet there are only a few automatic countermeasures. Instead, volunteers spend their time reverting vandalism edits, time that is not spent on improving other parts of Wikipedia. The goal of this evaluation campaign is to research and develop new, reliable ways to detect vandalism edits, which can be used to aid Wikipedians.

Vandalism is defined as "any addition, removal, or change of content made in a deliberate attempt to compromise the integrity of Wikipedia". Put another way, a vandalism edit is an edit made with bad intentions.

Solutions to vandalism detection will resemble those of, e.g., spam detection. Hence, applying machine learning to this problem is straightforward, which makes the engineering of features for an edit model that discriminates vandalism edits from regular edits one of the primary topics. You can use all features imaginable for your edit model, with one exception: you may not look into an edit's future. That is, to classify an edit, you may not analyze succeeding edits on the same article to see what became of it. Such a feature would be unusable in practice, since succeeding edits do not yet exist at the time a new edit has to be judged.
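As a rough illustration of what an edit model's features could look like, the following sketch derives a few values from nothing but the old and the new revision text of a single edit, so it never looks into the edit's future. The function names and the particular features (size change, amount of inserted text, share of uppercase letters in the inserted text) are assumptions chosen for illustration; they are not prescribed by the task.

```python
import difflib


def inserted_text(old_text: str, new_text: str) -> str:
    """Return the words that appear in the new revision but not in the old one."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    inserted = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both introduce new words on the new side.
        if tag in ("insert", "replace"):
            inserted.extend(new_words[j1:j2])
    return " ".join(inserted)


def edit_features(old_text: str, new_text: str) -> dict:
    """Compute illustrative features from the two revisions of one edit."""
    added = inserted_text(old_text, new_text)
    letters = [ch for ch in added if ch.isalpha()]
    upper_ratio = (
        sum(ch.isupper() for ch in letters) / len(letters) if letters else 0.0
    )
    return {
        "size_delta": len(new_text) - len(old_text),  # growth or shrinkage in characters
        "inserted_chars": len(added),                 # amount of newly added text
        "upper_ratio": upper_ratio,                   # shouting is a common vandalism cue
    }


if __name__ == "__main__":
    print(edit_features("The cat sat", "The cat sat LOL STUPID"))
```

Feature dictionaries of this kind can be fed to any standard classifier; which features actually discriminate well is exactly the open question the task poses.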

Input

To develop your approach, we provide you with a training corpus which comprises a set of edits on Wikipedia articles. All of these edits have been manually annotated as to whether they constitute vandalism or not.

Output

For all edits found in the evaluation corpora, your vandalism detector shall output a file classification.txt as follows:

26864258 27932250 V 0.92
28689695 87188208 R 0.50
85047080 85047157 V 0.67
80637222 91249168 R 0.43
...
  • The first column is the edit's old revision ID.
  • The second column is the edit's new revision ID.
  • The third column denotes whether the edit is vandalism (V) or regular (R), as determined by your classifier.
  • The fourth column denotes your classifier's confidence. Providing these confidence values is optional.
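The format described above can be produced with a few lines of code. This is a minimal sketch, assuming whitespace-separated columns and two decimal places for the confidence, as in the example lines; the `Prediction` type and the function names are hypothetical helpers, not part of any official tooling.

```python
from typing import Iterable, NamedTuple, Optional


class Prediction(NamedTuple):
    old_revision_id: int
    new_revision_id: int
    is_vandalism: bool
    confidence: Optional[float] = None  # optional fourth column


def format_line(prediction: Prediction) -> str:
    """Render one prediction as a line of classification.txt."""
    label = "V" if prediction.is_vandalism else "R"
    fields = [
        str(prediction.old_revision_id),
        str(prediction.new_revision_id),
        label,
    ]
    # The confidence column is optional, so it is only written when present.
    if prediction.confidence is not None:
        fields.append(f"{prediction.confidence:.2f}")
    return " ".join(fields)


def write_classification(
    predictions: Iterable[Prediction], path: str = "classification.txt"
) -> None:
    """Write all predictions, one per line, to the output file."""
    with open(path, "w", encoding="utf-8") as handle:
        for prediction in predictions:
            handle.write(format_line(prediction) + "\n")


if __name__ == "__main__":
    write_classification(
        [
            Prediction(26864258, 27932250, True, 0.92),
            Prediction(28689695, 87188208, False, 0.50),
        ]
    )
```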

Evaluation

Performance is measured using the receiver operating characteristic (ROC) [Wikipedia | paper]. More specifically, we measure the area under the ROC curve (AUC); the algorithm that maximizes the AUC performs best.
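To check a detector against the training corpus before submitting, the AUC can be computed directly from its scores. The sketch below uses the rank interpretation of the AUC: the probability that a randomly chosen vandalism edit receives a higher score than a randomly chosen regular edit, with ties counted as one half. It assumes that each score expresses how likely the edit is to be vandalism; the function name is hypothetical, and this is not the official evaluation script.

```python
from typing import Sequence


def roc_auc(is_vandalism: Sequence[bool], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    if len(is_vandalism) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for label, s in zip(is_vandalism, scores) if label]
    negatives = [s for label, s in zip(is_vandalism, scores) if not label]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one vandalism and one regular edit")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # vandalism edit correctly ranked above regular edit
            elif pos == neg:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    labels = [True, False, True, False]
    scores = [0.92, 0.50, 0.67, 0.43]
    print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of edits, which is fine for a quick check; for large corpora a sort-based computation or an established library routine is the better choice.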

Results

Wikipedia Vandalism Detection Performance

ROC-AUC   Participant
0.92236   S.M. Mola Velasco
          Private, Spain
0.90351   B.T. Adler*, L. de Alfaro° and I. Pye^
          *Fujitsu Labs of America, Inc., USA
          °Google, Inc., and University of California Santa Cruz, USA
          ^CloudFlare, Inc., USA
0.89856   S. Javanmardi et al.
          University of California Irvine, USA
0.89377   D. Chichkov
          SC Software Inc., USA
0.87990   L. Seaward
          University of Ottawa, Canada
0.87669   I. Hegedűs*, R. Ormándi*, R. Farkas*, and M. Jelasity*,°
          *University of Szeged, Hungary
          °Hungarian Academy of Sciences, Hungary
0.85875   M. Harpalani, T. Phumprao, M. Bassi, M. Hart, and R. Johnson
          Stony Brook University, USA
0.84340   J. White and R. Maessen
          University of California Irvine, USA
0.65404   A. Iftene
          University of Iasi, Romania

A more detailed analysis of the detection performance can be found in the overview paper accompanying this task.

  • Definition of vandalism
  • Vandalism fighter's portal
  • Most vandalized pages
  • Wikipedia API
  • Si-Chi Chin, W. Nick Street, Padmini Srinivasan, and David Eichmann. Detecting Wikipedia vandalism with active learning and statistical language models. Fourth Workshop on Information Credibility on the Web (WICOW 2010), Raleigh, NC, April 2010.
  • R. Stuart Geiger and David Ribes. The Work of Sustaining Order in Wikipedia: The Banning of a Vandal. In CSCW'10: Proceedings of the ACM Conference on Computer Supported Cooperative Work, pages 107-126, Savannah, Georgia, USA, 2010. ACM.
  • Kelly Y. Itakura and Charles L. A. Clarke. Using Dynamic Markov Compression to Detect Vandalism in the Wikipedia. In SIGIR'09: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 822-823, New York, NY, USA, 2009. ACM.
  • Martin Potthast. Crowdsourcing a Wikipedia Vandalism Corpus. In 33rd Annual International ACM SIGIR Conference (to appear), Geneva, July 2010. ACM.
  • Martin Potthast, Benno Stein, and Robert Gerling. Automatic Vandalism Detection in Wikipedia. In ECIR'08: Proceedings of the 30th European Conference on IR Research, Glasgow, volume 4956 LNCS of Lecture Notes in Computer Science, pages 663-668, Berlin Heidelberg New York, 2008. Springer.
  • Reid Priedhorsky, Jilin Chen, Shyong (Tony) K. Lam, Katherine Panciera, Loren Terveen, and John Riedl. Creating, Destroying, and Restoring Value in Wikipedia. In Group'07: Proceedings of the International Conference on Supporting Group Work, Sanibel Island, Florida, USA, 2007.
  • Koen Smets, Bart Goethals, and Brigitte Verdonk. Automatic Vandalism Detection in Wikipedia: Towards a Machine Learning Approach. In WikiAI'08: Proceedings of the Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, pages 43-48. AAAI Press, 2008.
  • Andrew G. West, Sampath Kannan, and Insup Lee. Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata. Technical Report MS-CIS-10-05, University of Pennsylvania, 2010.

Task Committee