Bots and Gender Profiling 2019
Synopsis
- Task: Given a Twitter feed, determine whether its author is a bot or a human; if human, identify the author's gender.
- Input: [train][test]
- Submission: [submit]
Task
Social media bots pose as humans to influence users for commercial, political or ideological purposes. For example, bots can artificially inflate the popularity of a product by promoting it and/or writing positive reviews, as well as undermine the reputation of competing products through negative ones. The threat is even greater when the purpose is political or ideological (consider the Brexit referendum or the US presidential elections). Fearing the effect of this influence, German political parties rejected the use of bots in their campaigns for the general elections. Furthermore, bots are commonly involved in spreading fake news. Approaching the identification of bots from an author profiling perspective is therefore of high importance from the point of view of marketing, forensics and security.
After addressing several aspects of author profiling in social media from 2013 to 2018 (age and gender, also together with personality; gender and language variety; and gender from a multimodal perspective), this year we aim to investigate whether the author of a Twitter feed is a bot or a human and, if human, to profile the author's gender.
As in previous years, we propose the task from a multilingual perspective:
- English
- Spanish
Unlike in previous years, and with the aim of maintaining a realistic scenario, we have not performed any cleaning of the tweets: they remain as the users posted them. This means that retweets (RTs) have not been removed and that tweets in more than one language may appear.
Award
We are happy to announce that the best performing team at the 7th International Competition on Author Profiling will be awarded 300 Euro, sponsored by The Logic Value. The award goes to:
Juan Pizarro, Universitat Politècnica de València, Spain.
Congratulations!
Data
Input
The uncompressed dataset consists of one folder per language (en, es). Each folder contains:
- An XML file per author (Twitter user) with 100 tweets. The name of the XML file corresponds to the unique author id.
- A truth.txt file with the list of authors and the ground truth.
<author lang="en"> <documents> <document>Tweet 1 textual contents</document> <document>Tweet 2 textual contents</document> ... </documents> </author>The format of the truth.txt file is as follows. The first column corresponds to the author id. The second and third columns contain the truth for the human/bot and bot/male/female tasks.
b2d5748083d6fdffec6c2d68d4d4442d:::bot:::bot
2bed15d46872169dc7deaf8d2b43a56:::bot:::bot
8234ac5cca1aed3f9029277b2cb851b:::human:::female
5ccd228e21485568016b4ee82deb0d28:::human:::female
60d068f9cafb656431e62a6542de2dc0:::human:::male
c6e5e9c92fb338dc0e029d9ea22a4358:::human:::male
...
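For orientation, a minimal loading sketch in Python: the directory layout and file formats follow the description above, while the function name load_language and the variable names are our own.

```python
import glob
import os
import xml.etree.ElementTree as ET

def load_language(dataset_dir, lang):
    """Load tweets and ground truth for one language folder (en or es)."""
    lang_dir = os.path.join(dataset_dir, lang)

    # Parse the ground truth: author_id:::bot|human:::bot|male|female
    truth = {}
    with open(os.path.join(lang_dir, "truth.txt"), encoding="utf-8") as f:
        for line in f:
            author_id, kind, gender = line.strip().split(":::")
            truth[author_id] = (kind, gender)

    # Parse one XML file per author; each <document> element is one tweet.
    tweets = {}
    for path in glob.glob(os.path.join(lang_dir, "*.xml")):
        author_id = os.path.splitext(os.path.basename(path))[0]
        root = ET.parse(path).getroot()
        tweets[author_id] = [doc.text or "" for doc in root.iter("document")]

    return tweets, truth

# Example: tweets, truth = load_language("/path/to/pan19-author-profiling", "en")
```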
Output
Your software must take as input the absolute path to an unpacked dataset and must output, for each author in the dataset, a corresponding XML file that looks like this:
<author id="author-id" lang="en|es" type="bot|human" gender="bot|male|female" />
The naming of the output files is up to you. However, we recommend using the author id as the filename and "xml" as the extension.
IMPORTANT! Languages should not be mixed. Create a folder for each language and place inside it only the prediction files for that language.
IMPORTANT! To avoid overfitting when experimenting with the training set, we recommend using the provided train/dev split (files truth-train.txt and truth-dev.txt).
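As an illustration, a minimal sketch in Python that writes one prediction file per author into per-language output folders, following the format above; the helper name write_prediction and its arguments are our own.

```python
import os
import xml.etree.ElementTree as ET

def write_prediction(output_dir, lang, author_id, author_type, gender):
    """Write one <author .../> prediction file into the folder for its language."""
    lang_dir = os.path.join(output_dir, lang)   # e.g. output/en, output/es
    os.makedirs(lang_dir, exist_ok=True)

    author = ET.Element("author", {
        "id": author_id,
        "lang": lang,            # "en" or "es"
        "type": author_type,     # "bot" or "human"
        "gender": gender,        # "bot", "male" or "female"
    })
    ET.ElementTree(author).write(os.path.join(lang_dir, author_id + ".xml"),
                                 encoding="utf-8", xml_declaration=True)

# Example:
# write_prediction("/path/to/output", "en", "b2d5748083d6fdffec6c2d68d4d4442d", "bot", "bot")
```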
Evaluation
The performance of your author profiling solution will be ranked by accuracy. For each language, we will calculate individual accuracies: first, the accuracy of distinguishing bots from humans; then, for human authors, the accuracy of distinguishing males from females. Finally, we will average the accuracy values per language to obtain the final ranking (see the sketch below).
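A minimal sketch of the scoring, assuming (consistently with the AVG column in the results table below) that the final score is the mean of the four per-language accuracies; the values used here are illustrative only.

```python
import statistics

# Per-language accuracies for the two subtasks (illustrative values only).
accuracies = {
    "bots_en": 0.93, "gender_en": 0.83,
    "bots_es": 0.93, "gender_es": 0.81,
}

# The final ranking score averages all per-language accuracy values.
final_score = statistics.mean(accuracies.values())
print(round(final_score, 4))  # 0.875
```

Submission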
This task follows PAN's software submission strategy described here.
Results
The following table lists the performance (accuracy) achieved by the participating teams in the different subtasks:
| POS | Team | BOTS vs. HUMAN (EN) | BOTS vs. HUMAN (ES) | GENDER (EN) | GENDER (ES) | AVG |
|---|---|---|---|---|---|---|
1 | Pizarro | 0.9360 | 0.9333 | 0.8356 | 0.8172 | 0.8805 |
2 | Srinivasarao & Manu | 0.9371 | 0.9061 | 0.8398 | 0.7967 | 0.8699 |
3 | Bacciu et al. | 0.9432 | 0.9078 | 0.8417 | 0.7761 | 0.8672 |
4 | Jimenez-Villar et al. | 0.9114 | 0.9211 | 0.8212 | 0.8100 | 0.8659 |
5 | Fernquist | 0.9496 | 0.9061 | 0.8273 | 0.7667 | 0.8624 |
6 | Mahmood | 0.9121 | 0.9167 | 0.8163 | 0.7950 | 0.8600 |
7 | Ispas & Popescu | 0.9345 | 0.8950 | 0.8265 | 0.7822 | 0.8596 |
8 | Vogel & Jiang | 0.9201 | 0.9056 | 0.8167 | 0.7756 | 0.8545 |
9 | Johansson & Isbister | 0.9595 | 0.8817 | 0.8379 | 0.7278 | 0.8517 |
10 | Goubin et al. | 0.9034 | 0.8678 | 0.8333 | 0.7917 | 0.8491 |
11 | Polignano & de Pinto | 0.9182 | 0.9156 | 0.7973 | 0.7417 | 0.8432 |
12 | Valencia et al. | 0.9061 | 0.8606 | 0.8432 | 0.7539 | 0.8410 |
13 | Kosmajac & Keselj | 0.9216 | 0.8956 | 0.7928 | 0.7494 | 0.8399 |
14 | Fagni & Tesconi | 0.9148 | 0.9144 | 0.7670 | 0.7589 | 0.8388 |
| | char nGrams | 0.9360 | 0.8972 | 0.7920 | 0.7289 | 0.8385 |
15 | Glocker | 0.9091 | 0.8767 | 0.8114 | 0.7467 | 0.8360 |
| | word nGrams | 0.9356 | 0.8833 | 0.7989 | 0.7244 | 0.8356 |
16 | Martinc et al. | 0.8939 | 0.8744 | 0.7989 | 0.7572 | 0.8311 |
17 | Sanchis & Velez | 0.9129 | 0.8756 | 0.8061 | 0.7233 | 0.8295 |
18 | Halvani & Marquardt | 0.9159 | 0.8239 | 0.8273 | 0.7378 | 0.8262 |
19 | Ashraf et al. | 0.9227 | 0.8839 | 0.7583 | 0.7261 | 0.8228 |
20 | Gishamer | 0.9352 | 0.7922 | 0.8402 | 0.7122 | 0.8200 |
21 | Petrik & Chuda | 0.9008 | 0.8689 | 0.7758 | 0.7250 | 0.8176 |
22 | Oliveira et al. | 0.9057 | 0.8767 | 0.7686 | 0.7150 | 0.8165 |
| | W2V | 0.9030 | 0.8444 | 0.7879 | 0.7156 | 0.8127 |
23 | De La Peña & Prieto | 0.9045 | 0.8578 | 0.7898 | 0.6967 | 0.8122 |
24 | López Santillán et al. | 0.8867 | 0.8544 | 0.7773 | 0.7100 | 0.8071 |
| | LDSE | 0.9054 | 0.8372 | 0.7800 | 0.6900 | 0.8032 |
25 | Bolonyai et al. | 0.9136 | 0.8389 | 0.7572 | 0.6956 | 0.8013 |
26 | Moryossef | 0.8909 | 0.8378 | 0.7871 | 0.6894 | 0.8013 |
27 | Zhechev | 0.8652 | 0.8706 | 0.7360 | 0.7178 | 0.7974 |
28 | Giachanou & Ghanem | 0.9057 | 0.8556 | 0.7731 | 0.6478 | 0.7956 |
29 | Espinosa et al. | 0.8413 | 0.7683 | 0.8413 | 0.7178 | 0.7922 |
30 | Rahgouy et al. | 0.8621 | 0.8378 | 0.7636 | 0.7022 | 0.7914 |
31 | Onose et al. | 0.8943 | 0.8483 | 0.7485 | 0.6711 | 0.7906 |
32 | Przybyla | 0.9155 | 0.8844 | 0.6898 | 0.6533 | 0.7858 |
33 | Puertas et al. | 0.8807 | 0.8061 | 0.7610 | 0.6944 | 0.7856 |
34 | Van Halteren | 0.8962 | 0.8283 | 0.7420 | 0.6728 | 0.7848 |
35 | Gamallo & Almatarneh | 0.8148 | 0.8767 | 0.7220 | 0.7056 | 0.7798 |
36 | Bryan & Philipp | 0.8689 | 0.7883 | 0.6455 | 0.6056 | 0.7271 |
37 | Dias & Paraboni | 0.8409 | 0.8211 | 0.5807 | 0.6467 | 0.7224 |
38 | Oliva & Masanet | 0.9114 | 0.9111 | 0.4462 | 0.4589 | 0.6819 |
39 | Hacohen-Kerner et al. | 0.4163 | 0.4744 | 0.7489 | 0.7378 | 0.5944 |
40 | Kloppenburg | 0.5830 | 0.5389 | 0.4678 | 0.4483 | 0.5095 |
| | MAJORITY | 0.5000 | 0.5000 | 0.5000 | 0.5000 | 0.5000 |
| | RANDOM | 0.4905 | 0.4861 | 0.3716 | 0.3700 | 0.4296 |
41 | Bounaama & Amine | 0.5008 | 0.5050 | 0.2511 | 0.2567 | 0.3784 |
42 | Joo & Hwang | 0.9333 | - | 0.8360 | - | 0.4423 |
43 | Staykovski | 0.9186 | - | 0.8174 | - | 0.4340 |
44 | Cimino & Dell'Orletta | 0.9083 | - | 0.7898 | - | 0.4245 |
45 | Ikae et al. | 0.9125 | - | 0.7371 | - | 0.4124 |
46 | Jeanneau | 0.8924 | - | 0.7451 | - | 0.4094 |
47 | Zhang | 0.8977 | - | 0.7197 | - | 0.4044 |
48 | Fahim et al. | 0.8629 | - | 0.6837 | - | 0.3867 |
49 | Saborit | - | 0.8100 | - | 0.6567 | 0.3667 |
50 | Saeed & Shirazi | 0.7951 | - | 0.5655 | - | 0.3402 |
51 | Radarapu | 0.7242 | - | 0.4951 | - | 0.3048 |
52 | Bennani-Smires | 0.9159 | - | - | - | 0.2290 |
53 | Gupta | 0.5007 | - | 0.4044 | - | 0.2263 |
54 | Qurdina | 0.9034 | - | - | - | 0.2259 |
55 | Aroyehun | 0.5000 | - | - | - | 0.1250 |
Baselines
- MAJORITY: The predicted class coincides with the majority class.
- RANDOM: A random prediction for each instance.
- CHAR N-GRAMS (see the sketch below this list):
- BOTS-EN: 500 character 5-grams + Random Forest
- BOTS-ES: 2,000 character 5-grams + Random Forest
- GENDER-EN: 2,000 character 4-grams + Random Forest
- GENDER-ES: 1,000 character 5-grams + Random Forest
- WORD N-GRAMS:
- BOTS-EN: 200 word 1-grams + Random Forest
- BOTS-ES: 100 word 1-grams + Random Forest
- GENDER-EN: 200 word 1-grams + Random Forest
- GENDER-ES: 200 word 1-grams + Random Forest
- WORD EMBEDDINGS: Each text is represented by the average of its word embeddings.
- BOTS-EN: glove.twitter.27B.200d + Random Forest
- BOTS-ES: fasttext-wikipedia + J48
- GENDER-EN: glove.twitter.27B.100d + SVM
- GENDER-ES: fasttext-sbwc + SVM
- LDSE: Low Dimensionality Statistical Embedding, described in: Rangel, F., Rosso, P., Franco-Salvador, M. A Low Dimensionality Representation for Language Variety Identification. In: Proceedings of the 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing’16), Springer-Verlag, LNCS(9624), pp. 156-169, 2018.
- BOTS-EN: LDSE.v2 (MinFreq=10, MinSize=1) + Naive Bayes
- BOTS-ES: LDSE.v1 (MinFreq=10, MinSize=1) + Naive Bayes
- GENDER-EN: LDSE.v1 (MinFreq=10, MinSize=3) + BayesNet
- GENDER-ES: LDSE.v1 (MinFreq=2, MinSize=1) + Naive Bayes
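For illustration, a minimal sketch of the CHAR N-GRAMS baseline for the BOTS-EN subtask. The original baselines appear to rely on different tooling (J48 and BayesNet suggest Weka), so this scikit-learn version is only an approximation; the toy training data and variable names are our own, and one training instance per author is assumed (the 100 tweets concatenated into a single text, e.g. via a loader like load_language above).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

# Toy training data: one concatenated Twitter feed per author (illustrative only).
train_texts = ["RT @someone check this out http://t.co/xyz ...",
               "just had coffee with friends :)"]
train_labels = ["bot", "human"]

# BOTS-EN baseline: the 500 most frequent character 5-grams + Random Forest.
pipeline = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(5, 5), max_features=500),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
pipeline.fit(train_texts, train_labels)
print(pipeline.predict(["RT @someone check this out http://t.co/xyz ..."]))
```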
Related Work
- Andrew Guess, Jonathan Nagler, and Joshua Tucker. Less than you think: Prevalence and predictors of fake news dissemination on Facebook. Science Advances vol. 5 (2019)
- Kai Shu, Suhang Wang, and Huan Liu. Understanding user profiles on social media for fake news detection. IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 430--435 (2018)
- Massimo Stella, Emilio Ferrara, and Manlio De Domenico. Bots sustain and inflate striking opposition in online social systems. arXiv preprint arXiv:1802.07292 (2018)
- Massimo Stella, Emilio Ferrara, and Manlio De Domenico. Bots increase exposure to negative and inflammatory content in online social systems. Proceedings of the National Academy of Sciences, vol. 115 (49), pp. 12435-12440 (2018)
- Guozhu Dong and Huan Liu. Feature Engineering for Machine Learning and Data Analytics. CRC Press (2018). Chapter 12: Feature Engineering for Social Bot Detection. Onur Varol, Clayton A. Davis, Filippo Menczer, Alessandro Flammini.
- Emilio Ferrara, Onur Varol, Filippo Menczer, Alessandro Flammini. Detection of Promoted Social Media Campaigns. The 10th International AAAI Conference on Web and Social Media - ICWSM, pp. 563-566 (2016)
- Zakaria el Hjouji, D. Scott Hunter, Nicolas Guenon des Mesnards, Tauhid Zaman. The Impact of Bots on Opinions in Social Networks. arXiv preprint arXiv:1810.12398 (2018)
- John P. Dickerson, Vadim Kagan, V.S. Subrahmanian. Using Sentiment to Detect Bots on Twitter: Are Humans more Opinionated than Bots? Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 620-627. IEEE Press. (2014)
- Kai-Cheng Yang, Onur Varol, Clayton A. Davis, Emilio Ferrara, Alessandro Flammini, Filippo Menczer. Arming the Public with AI to Counter Social Bots. arXiv preprint arXiv:1901.00912. 2019 Jan 3.
- Chiyu Cai, Linjing Li, Daniel Zeng. Behavior Enhanced Deep Bot Detection in Social Media. Intelligence and Security Informatics (ISI), 2017 IEEE International Conference, pp. 128-130 (2017)
- Andrew Hall, Loren Terveen, Aaron Halfaker. Bot Detection in Wikidata Using Behavioral and Other Informal Cues. Proceedings of the ACM on Human-Computer Interaction. 2018 Nov 1;2(CSCW):64.
- Mariona Taulé, M. Antonia Martí, Francisco Rangel, Paolo Rosso, Cristina Bosco, and Viviana Patti. Overview of the task on stance and gender detection in tweets on Catalan independence at IberEval 2017. In: 2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval 2017. CEUR Workshop Proceedings. CEUR-WS.org, vol. 1881.
- Francisco Rangel, Paolo Rosso, Martin Potthast, Benno Stein. Overview of the 6th Author Profiling Task at PAN 2018: Multimodal Gender Identification in Twitter. In: CLEF 2018 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 2125.
- Francisco Rangel, Paolo Rosso, Martin Potthast, Benno Stein. Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter. In: Cappellato L., Ferro N., Goeuriot L, Mandl T. (Eds.) CLEF 2017 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 1866.
- Francisco Rangel, Paolo Rosso, Ben Verhoeven, Walter Daelemans, Martin Potthast, Benno Stein. Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations. In: Balog K., Cappellato L., Ferro N., Macdonald C. (Eds.) CLEF 2016 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 1609, pp. 750-784.
- Francisco Rangel, Fabio Celli, Paolo Rosso, Martin Potthast, Benno Stein, Walter Daelemans. Overview of the 3rd Author Profiling Task at PAN 2015. In: Linda Cappellato, Nicola Ferro, Gareth Jones and Eric San Juan (Eds.): CLEF 2015 Labs and Workshops, Notebook Papers, 8-11 September, Toulouse, France. CEUR Workshop Proceedings. ISSN 1613-0073, http://ceur-ws.org/Vol-1391/, 2015.
- Francisco Rangel, Paolo Rosso, Irina Chugur, Martin Potthast, Martin Trenkmann, Benno Stein, Ben Verhoeven, Walter Daelemans. Overview of the 2nd Author Profiling Task at PAN 2014. In: Cappellato L., Ferro N., Halvey M., Kraaij W. (Eds.) CLEF 2014 Labs and Workshops, Notebook Papers. CEUR-WS.org, vol. 1180, pp. 898-927.
- Francisco Rangel, Paolo Rosso, Moshe Koppel, Efstathios Stamatatos, Giacomo Inches. Overview of the Author Profiling Task at PAN 2013. In: Forner P., Navigli R., Tufis D. (Eds.) Notebook Papers of CLEF 2013 LABs and Workshops. CEUR-WS.org, vol. 1179.