Profiling Irony and Stereotype Spreaders on Twitter (IROSTEREO) 2022
Synopsis
- Task: Given a Twitter feed in English, determine whether its author spreads Irony and Stereotypes.
- Input: Timelines of authors sharing Irony and Stereotypes towards, for instance, women or the LGBT community [data]. 600 users with 200 English tweets each. Classes: (1) Irony with Stereotypes, (2) Irony without Stereotypes, (3) Stereotypes but no Irony, (4) Neither.
- Evaluation: Accuracy.
- Submission: Although deployment on the TIRA platform is preferred to guarantee the reproducibility of the results, participants can submit in another modality this year: they can bypass the VMs, download the test set from [data], and upload only their predictions in the output format specified by the task organizers (like in the good, old, non-reproducible days). Results must be uploaded as a single zip archive (.zip).
- Baselines: Character/word n-grams + SVM/Logistic Regression, LDSE, ... (see the sketch after this list).
- Results (47 Submissions)
Best approach: Soft-voting BERTweet Ensemble [Wentao Yu et al.]
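As a rough reference point, the character n-gram baseline from the list above can be approximated in a few lines of scikit-learn. This is a minimal sketch under assumed hyperparameters, not the organizers' exact configuration:

```python
# Minimal sketch of the character n-gram baseline listed above, using
# scikit-learn. The hyperparameters (n-gram range, classifier settings)
# are assumptions, not the organizers' configuration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def ngram_baseline():
    """Character 2-gram TF-IDF features fed to logistic regression."""
    return make_pipeline(
        TfidfVectorizer(analyzer="char", ngram_range=(2, 2)),
        LogisticRegression(max_iter=1000),
    )

# texts: one string per author (200 tweets concatenated); labels: "I"/"NI"
# model = ngram_baseline().fit(train_texts, train_labels)
# predicted = model.predict(test_texts)
```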
Task
With irony, language is employed in a figurative and subtle way to mean the opposite of what is literally stated. In the case of sarcasm, a more aggressive type of irony, the intent is to mock or scorn a victim, without excluding the possibility of hurting them. Stereotypes are often used, especially in discussions about controversial issues such as immigration or sexism and misogyny. At PAN’22, we will focus on profiling ironic authors on Twitter. Special emphasis will be given to those authors that employ irony to spread stereotypes, for instance, towards women or the LGBT community. The goal will be to classify authors as ironic or not depending on their number of tweets with ironic content. Among those authors, we will consider a subset that employs irony to convey stereotypes, in order to investigate whether state-of-the-art models are also able to distinguish these cases. Therefore, given Twitter authors together with their tweets, the goal will be to profile those authors that can be considered ironic.
For those who like challenges, there is also the opportunity to participate in the IROSTEREO subtask on Stereotype Stance Detection. In fact, stereotypes have been employed by ironic authors either to hurt the target (e.g. immigrants) or to somehow defend it. The goal of this subtask will be to detect the stance with which stereotypes are used by ironic authors, whether in favour of or against the target. Therefore, given the subset of ironic authors that employed stereotypes in some of their tweets, the goal will be to detect their overall stance.
Award
We are happy to announce that the best performing team at the 10th International Competition on Author Profiling will be awarded 300 Euro, sponsored by Symanto.
This year, the winner of the task is:
- Wentao Yu, Benedikt Boenninghoff, and Dorothea Kolossa, Institute of Communication Acoustics, Ruhr University Bochum, Germany
Data
Input
The uncompressed dataset consists of a folder which contains:
- An XML file per author (Twitter user) with 200 tweets. The name of the XML file corresponds to the unique author id.
- A truth.txt file with the list of authors and the ground truth.
The XML files have the following structure:
```xml
<author lang="en">
  <documents>
    <document>Tweet 1 textual contents</document>
    <document>Tweet 2 textual contents</document>
    ...
  </documents>
</author>
```
The format of the truth.txt file is as follows. The first column corresponds to the author id, and the second column contains the truth label:
```
2d0d4d7064787300c111033e1d2270cc:::I
b9eccce7b46cc0b951f6983cc06ebb8:::NI
f41251b3d64d13ae244dc49d8886cf07:::I
47c980972060055d7f5495a5ba3428dc:::NI
d8ed8de45b73bbcf426cdc9209e4bfbc:::I
2746a9bf36400367b63c925886bc0683:::NI
...
```
Regarding the subtask on stance detection, the format will be the same except for the classes, whose labels are: INFAVOR and AGAINST.
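For orientation, a minimal reader for this layout could look as follows (a sketch; the function and variable names are ours, not part of the task materials):

```python
# Sketch of a reader for the unpacked dataset: one <author-id>.xml per
# author plus a truth.txt with ":::"-separated id/label pairs.
import xml.etree.ElementTree as ET
from pathlib import Path

def read_dataset(dataset_dir):
    """Return ({author_id: [tweet, ...]}, {author_id: label})."""
    dataset_dir = Path(dataset_dir)
    tweets = {}
    for xml_file in dataset_dir.glob("*.xml"):
        root = ET.parse(xml_file).getroot()        # <author lang="en">...
        tweets[xml_file.stem] = [d.text for d in root.iter("document")]
    truth = {}
    for line in (dataset_dir / "truth.txt").read_text(encoding="utf-8").splitlines():
        author_id, label = line.split(":::")       # e.g. "2d0d...:::I"
        truth[author_id] = label                   # "I" or "NI"
    return tweets, truth
```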
Output
Your software must take as input the absolute path to an unpacked dataset and must output, for each author of the dataset, a corresponding XML file that looks like this:
```xml
<author id="author-id" lang="en" type="NI|I" />
```
The naming of the output files is up to you. However, we recommend using the author id as the filename and "xml" as the extension.
Regarding the subtask on stance detection, the format will be the same except for the classes, whose labels are: INFAVOR and AGAINST.
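A minimal writer for this output format might look like this (a sketch; names are ours):

```python
# Sketch: write one prediction file per author in the format above,
# using the recommended <author-id>.xml naming.
from pathlib import Path

def write_prediction(output_dir, author_id, label, lang="en"):
    """Write <author id=... lang=... type=.../> to <output_dir>/<author_id>.xml."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    xml = f'<author id="{author_id}" lang="{lang}" type="{label}" />\n'
    (out / f"{author_id}.xml").write_text(xml, encoding="utf-8")

# write_prediction("predictions", "2d0d4d7064787300c111033e1d2270cc", "I")
```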
Evaluation
The performance of your system will be ranked by accuracy.
For those who participate in the subtask on stance detection of ironic users towards stereotypes, the evaluation metric will be Macro-F1. We will also analyse precision, recall, and F-measure per class to look into the performance of the systems for each possibility (in favour vs. against).
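Both measures are standard; a sketch with scikit-learn (function names are ours):

```python
# Sketch of the two evaluation measures: accuracy for the main task,
# macro-averaged F1 (plus per-class figures) for the stance subtask.
from sklearn.metrics import accuracy_score, classification_report, f1_score

def evaluate_main(y_true, y_pred):
    return accuracy_score(y_true, y_pred)           # ranking metric

def evaluate_stance(y_true, y_pred):
    # Per-class precision/recall/F1 for INFAVOR vs. AGAINST.
    print(classification_report(y_true, y_pred, digits=4))
    return f1_score(y_true, y_pred, average="macro")
```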
Results
POS | Team | Accuracy |
---|---|---|
1 | wentaoyu | 0.9944 |
2 | harshv | 0.9778 |
3 | edapal | 0.9722 |
3 | ikae | 0.9722 |
5 | JoseAGD | 0.9667 |
5 | Enrub | 0.9667 |
7 | fsolgui | 0.9611 |
7 | claugomez | 0.9611 |
9 | AngelAso | 0.9556 |
9 | alvaro | 0.9556 |
9 | xhuang | 0.9556 |
9 | toshevska | 0.9556 |
9 | tfnribeiro_g | 0.9556 |
14 | josejaviercalvo | 0.9500 |
14 | taunk | 0.9500 |
14 | your | 0.9500 |
14 | PereMarco | 0.9500 |
14 | Garcia_Sanches | 0.9500 |
19 | pigeon | 0.9444 |
19 | xmpeiro | 0.9444 |
19 | marcosiino | 0.9444 |
19 | dingtli | 0.9444 |
19 | moncho | 0.9444 |
19 | yifanxu | 0.9444 |
19 | yzhang | 0.9444 |
19 | longma | 0.9444 |
- | LDSE | 0.9389 |
27 | missino | 0.9389 |
27 | badjack | 0.9389 |
27 | sgomw | 0.9389 |
27 | wangbin | 0.9389 |
27 | caohaojie | 0.9389 |
32 | lwblinwenbin | 0.9333 |
32 | xuyifan | 0.9333 |
32 | dirazuherfa | 0.9333 |
32 | Los Pablos | 0.9333 |
32 | Metalumnos | 0.9333 |
37 | narcis | 0.9278 |
37 | stm | 0.9278 |
37 | huangxt233 | 0.9278 |
40 | lzy | 0.9222 |
40 | avazbar | 0.9222 |
40 | fragilro | 0.9222 |
40 | whoami | 0.9222 |
40 | Garcia_Grau | 0.9222 |
45 | hjang | 0.9167 |
45 | nigarsas | 0.9167 |
45 | fernanda | 0.9167 |
45 | Hyewon | 0.9167 |
49 | zyang | 0.9056 |
50 | giglou | 0.9000 |
50 | sulee | 0.9000 |
52 | ehsan.tavan | 0.8889 |
53 | rlad | 0.8778 |
54 | balouchzahi | 0.8722 |
- | RF + char 2-grams | 0.8610 |
55 | manexagirrezabalgmail | 0.8500 |
- | LR + word 1-grams | 0.8490 |
56 | tamayo | 0.8111 |
57 | yuandong | 0.7500 |
- | LSTM + BERT encoding | 0.6940 |
58 | G-Lab | 0.6778 |
58 | AmitDasRup | 0.6778 |
60 | Alpine_EP | 0.6722 |
61 | Kminos | 0.6667 |
62 | castro | 0.6389 |
63 | castroa | 0.5833 |
64 | sokhandan | 0.5333 |
64 | leila | 0.5333 |
Results in the Stance Detection Subtask
POS | Team | Run | F1-Macro |
---|---|---|---|
- | LDSE | - | 0.7600 |
1 | dirazuherfa | 3 | 0.6248 |
2 | dirazuherfa | 4 | 0.5807 |
- | RF + char 3-grams | - | 0.5673 |
3 | toshevska | 2 | 0.5545 |
4 | dirazuherfa | 1 | 0.5433 |
5 | JoseAGD | 1 | 0.5312 |
6 | tamayo | 1 | 0.4886 |
7 | dirazuherfa | 2 | 0.4876 |
8 | tamayo | 2 | 0.4685 |
- | SVM + word 2-grams | - | 0.4685 |
9 | AmitDasRup | 1 | 0.4563 |
10 | toshevska | 4 | 0.4444 |
10 | taunk | 1 | 0.4444 |
12 | toshevska | 3 | 0.4393 |
13 | AmitDasRup | 2 | 0.4357 |
14 | toshevska | 1 | 0.4340 |
15 | fernanda | 1 | 0.3119 |
Related Work
- [1] Valerio Basile, Cristina Bosco, Elisabetta Fersini, Dora Nozza, Viviana Patti, Francisco Rangel, Paolo Rosso, Manuela Sanguinetti (2019). SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter. Proc. SemEval 2019
- [2] Sánchez-Junquera J., Chulvi B., Rosso P., Ponzetto S. How Do You Speak about Immigrants? Taxonomy and StereoImmigrants Dataset for Identifying Stereotypes about Immigrants. In: Applied Sciences, 11(8), 3610, 2021. https://doi.org/10.3390/app11083610
- [3] Sánchez-Junquera J., Rosso P., Montes-y-Gómez M., Chulvi B. Masking and BERT-based Models for Stereotype Identification. In: Procesamiento del Lenguaje Natural (SEPLN), num. 67, pp. 83-94, 2021
- [4] Zhang S., Zhang X., Chan J., Rosso P. Irony Detection via Sentiment-based Transfer Learning. In: Information Processing & Management, vol. 56, issue 5, pp. 1633-1644, 2019
- [5] Sulis E., Hernández I., Rosso P., Patti V., Ruffo G. Figurative Messages and Affect in Twitter: Differences Between #irony, #sarcasm and #not. In: Knowledge-Based Systems, vol. 108, pp. 132–143, 2016
- [6] Hernández I., Patti V., Rosso P. Irony Detection in Twitter: The Role of Affective Content. In: ACM Transactions on Internet Technology, 16(3):1-24, 2016
- [7] Ghosh A., Li G., Veale T., Rosso P., Shutova E., Barnden J., Reyes A. Semeval-2015 task 11: Sentiment Analysis of Figurative Language in Twitter. In: Proc. 9th Int. Workshop on Semantic Evaluation (SemEval 2015), Co-located with NAACL, Denver, Colorado, 4-5 June. Association for Computational Linguistics, pp. 470–478, 2015
- [8] Reyes A., Rosso P. On the Difficulty of Automatically Detecting Irony: Beyond a Simple Case of Negation. In: Knowledge and Information Systems, 40(3):595-614, 2014
- [9] Reyes A., Rosso P., Veale T. A Multidimensional Approach for Detecting Irony in Twitter. In: Language Resources and Evaluation, 47(1):239-268, 2013
- [10] Reyes A., Rosso P., Buscaldi D. From Humor Recognition to Irony Detection: The Figurative Language of Social Media. In: Data & Knowledge Engineering, vol. 74, pp.1-12, 2012
- [11] Francisco Rangel, Gretel Liz De La Peña Sarracén, Berta Chulvi, Elisabetta Fersini, Paolo Rosso. Profiling Hate Speech Spreaders on Twitter Task at PAN 2021. In: Faggioli, G., Ferro, N., Joly, A., Maistro, M., Piroi, F. (eds.) CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS.org, vol. 2936, pp. 1772-1789
- [12] Francisco Rangel, Anastasia Giachanou, Bilal Ghanem, Paolo Rosso. Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter. In: L. Cappellato, C. Eickhoff, N. Ferro, and A. Névéol (eds.) CLEF 2020 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 2696
- [13] Francisco Rangel and Paolo Rosso. Overview of the 7th Author Profiling Task at PAN 2019: Bots and Gender Profiling in Twitter. In: L. Cappellato, N. Ferro, D. E. Losada and H. Müller (eds.) CLEF 2019 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 2380
- [14] Francisco Rangel, Paolo Rosso, Martin Potthast, Benno Stein. Overview of the 6th Author Profiling Task at PAN 2018: Multimodal Gender Identification in Twitter. In: CLEF 2018 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 2125.
- [15] Francisco Rangel, Paolo Rosso, Martin Potthast, Benno Stein. Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter. In: Cappellato L., Ferro N., Goeuriot L., Mandl T. (Eds.) CLEF 2017 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 1866.
- [16] Francisco Rangel, Paolo Rosso, Ben Verhoeven, Walter Daelemans, Martin Potthast, Benno Stein. Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations. In: Balog K., Cappellato L., Ferro N., Macdonald C. (Eds.) CLEF 2016 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings. CEUR-WS.org, vol. 1609, pp. 750-784
- [17] Francisco Rangel, Fabio Celli, Paolo Rosso, Martin Potthast, Benno Stein, Walter Daelemans. Overview of the 3rd Author Profiling Task at PAN 2015. In: Linda Cappellato, Nicola Ferro, Gareth Jones and Eric San Juan (Eds.) CLEF 2015 Labs and Workshops, Notebook Papers, 8-11 September, Toulouse, France. CEUR Workshop Proceedings. ISSN 1613-0073, http://ceur-ws.org/Vol-1391/, 2015.
- [18] Francisco Rangel, Paolo Rosso, Irina Chugur, Martin Potthast, Martin Trenkmann, Benno Stein, Ben Verhoeven, Walter Daelemans. Overview of the 2nd Author Profiling Task at PAN 2014. In: Cappellato L., Ferro N., Halvey M., Kraaij W. (Eds.) CLEF 2014 Labs and Workshops, Notebook Papers. CEUR-WS.org, vol. 1180, pp. 898-927.
- [19] Francisco Rangel, Paolo Rosso, Moshe Koppel, Efstathios Stamatatos, Giacomo Inches. Overview of the Author Profiling Task at PAN 2013. In: Forner P., Navigli R., Tufis D. (Eds.) Notebook Papers of CLEF 2013 LABs and Workshops. CEUR-WS.org, vol. 1179
- [20] Francisco Rangel and Paolo Rosso. On the Implications of the General Data Protection Regulation on the Organisation of Evaluation Tasks. In: Language and Law / Linguagem e Direito, Vol. 5(2), pp. 80-102
- [21] Francisco Rangel, Marc Franco-Salvador, Paolo Rosso. A Low Dimensionality Representation for Language Variety Identification. In: Postproc. 17th Int. Conf. on Comput. Linguistics and Intelligent Text Processing, CICLing-2016, Springer-Verlag, Revised Selected Papers, Part II, LNCS(9624), pp. 156-169 (arXiv:1705.10754)