Shared Tasks

Important Dates

  • April 07, 2021 (extended): Early bird software submission phase (optional)
  • April to Mid-May: Software submission phase
  • May 20, 2021: Software submission deadline
  • May 28, 2021: Participant paper submission [template] [guidelines] (use this template) [submission]
  • June 11, 2021: Peer review notification
  • June 30, 2021: Camera-ready participant papers submission
  • TBD: Early bird conference registration
  • September 21-24, 2021: Conference

The timezone of all deadlines is Anywhere on Earth.

Keynotes

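Arkaitz Zubiaga
Generalisation in Social Media Research: from Fact Verification to Hate Speech Detection
Queen Mary University of London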

While models built and tested on a specific dataset and for a specific task often achieve very good performance, they then fail to generalise when they are applied to new, unseen data. In this talk I will discuss the importance and challenges of achieving generalisable performance in social media research with a particular focus on fact verification and hate speech detection. I will present some of our recent work in this direction, as well as discuss open challenges to further the capacity of generalisation especially in hate speech detection.

Arkaitz Zubiaga is a lecturer at Queen Mary University of London, where he leads the Social Data Science lab. His research interests revolve around computational social science and natural language processing, with a focus on linking online data with events in the real world, among others for tackling problematic issues on the Web and social media that can have a damaging effect on individuals or society at large, such as hate speech, misinformation, inequality, biases and other forms of online harm.

Maarten Sap
Detecting and Rewriting Socially Biased Language
University of Washington

Language has the power to reinforce stereotypes and project social biases onto others, either through overt hate or subtle biases. Accounting for this toxicity and social bias in language is crucial for natural language processing (NLP) systems to be safely and ethically deployed in the world. In this talk, I will first analyze a failure case of automatic hate speech detection, in which we find that models tend to flag speech by African Americans as toxic more often than by others. We trace the origins of the biases back to the annotated datasets, and show that we can reduce these biases, by making a tweet's dialect more explicit during the annotation process. Then, as an alternative to binary hate speech detection, I will present Social Bias Frames, a new structured formalism for distilling biased implications of language. Using a new corpus of 150k structured annotations, we show that models can learn to reason about high-level offensiveness of statements, but struggle to explain why a statement might be harmful. Finally, I will introduce PowerTransformer, a new unsupervised model for controllable debiasing of text through the lens of connotation frames of power and agency. With this model, we show that subtle gender biases in how characters are portrayed in stories and movies can be mitigated through automatic rewriting. I will conclude with future directions for better reasoning about toxicity and social biases in language.

Maarten Sap is a postdoc/young investigator at the Allen Institute for AI (AI2) on project MOSAIC, and will join CMU's LTI department as an assistant professor in Fall 2022. His research focuses on making NLP systems socially intelligent, and understanding social inequality and bias in language. He has presented his work in top-tier NLP and AI conferences, receiving a best short paper nomination at ACL 2019 and a best paper award at the WeCNLP 2020 summit. Additionally, he and his team won the inaugural 2017 Amazon Alexa Prize, a social chatbot competition. He received his PhD from the University of Washington's Paul G. Allen School of Computer Science & Engineering where he was advised by Yejin Choi and Noah Smith. In the past, he has interned at the Allen Institute for AI working on social commonsense reasoning, and at Microsoft Research working on deep learning models for understanding human cognition.

Industry Talk

Francisco Rangel
Author Profiling at Symanto: Build Emotional Connection with your Customers at Scale
Symanto Research

Building emotional connection is key for human beings, but sometimes in the business world we forget about it, we forget that behind every potential deal there is a person, and we just speak about leads, customers or churn... At Symanto, we combine Artificial Intelligence with Psychology to help our customers build emotional connections at scale to improve the way they resonate and communicate. In this Industry talk, I will briefly introduce Symanto, how we apply, among others, author profiling techniques, how our core technology allows us to stand out from our competition, why emotions and psychology matter, how we offer this to the world, as well as some of the latest research lines that we are working on.

Francisco is Head of Product at Symanto Research and a member of the Advisory Board of Spain AI, the largest Spanish-language network on Artificial Intelligence. A former CTO at Autoritas and postdoctoral researcher at the Pattern Recognition and Human Language Technology research center of the Universitat Politècnica de València, he has focused his career on applying technology, information retrieval and artificial intelligence to the digital transformation of organisations. Francisco obtained his PhD on Author Profiling under the supervision of Paolo Rosso. He won the MAVIR Award 2007 for the best research master thesis on information retrieval and natural language processing, the SEPLN Award 2017 for the best doctoral thesis in artificial intelligence, and the Mobip 2010 Award for the Most Innovative Idea (Corex SIC+), and was a Dell 2009 Award Finalist in Technological Excellence (Grupo Fivasa / SIC+). Since 2013, he has co-organised international evaluation tasks in different areas of information retrieval and artificial intelligence (SemEval, CLEF, FIRE, SEPLN). Francisco is also co-author of books such as the public examination syllabus of the Superior Technical Body of Information Technology of the Generalitat Valenciana administration of the College of Computer Engineers of the Valencian Community (COIICV), a professor in various master's programmes in Artificial Intelligence, Information Retrieval and Big Data, author of more than 50 scientific publications, and supervisor of more than 15 final master's projects and a doctoral thesis. His work has been covered in media such as the TVE programme Informe Semanal (Weekly Report).

Invited Presentation

Harry Scells
Completing the Multi-Authorship Jigsaw Puzzle
University of Queensland, Australia

Multi-authorship tasks have been run at PAN for over half a decade. In this time, several multi-authorship tasks have been identified, a number of datasets have been created, and numerous methods have been developed to address the tasks. The community has achieved considerable success, given how challenging these tasks are. To use an analogy, the multi-authorship jigsaw puzzle is now starting to take shape, with many of the most important foundational pieces in place. In this talk, I will speculate on the next possible pieces of the multi-authorship jigsaw to connect the larger pieces of the puzzle, focusing on the construction of datasets and evaluation of methods.

Harry Scells is a postdoc at the University of Queensland, Australia, collaborating with the Webis group at the Bauhaus-Universität Weimar on the PHOENIX project, funded by the German Federal Ministry of Education and Research (BMBF). The PHOENIX project seeks to investigate novel techniques for multi-authorship analysis in scientific authorship and scientific writing. Harry's research interests lie in Information Retrieval and NLP, and he has published his work in a number of high-quality conferences such as SIGIR, WWW, CIKM, and ECIR.

Program

PAN's program is part of the CLEF 2021 conference program.

Please note that all session times below are given in Bucharest time, i.e. GMT+3.

September 22
  10:00-11:30 CLEF Session: Lab overviews (BioASQ, ARQMath-2, SimpleText, PAN)
  11:30-13:00 Keynote & Lab Session, Chair: Paolo Rosso
    11:30-12:30 Keynote: Generalisation in Social Media Research: from Fact Verification to Hate Speech Detection
      Arkaitz Zubiaga
    12:30-13:00 Overview of the Profiling Hate Speech Spreaders on Twitter Task at PAN 2021
  15:30-17:00 Lab Session: Profiling Hate Speech Spreaders on Twitter, Chair: Francisco Rangel
    15:30-15:40 Best system award of Profiling Hate Speech Spreaders on Twitter
    15:40-17:00 Participant presentations
      Detection of hate speech spreaders using convolutional neural networks
        Marco Siino, Elisa Di Nuovo, Ilenia Tinnirello, Marco La Cascia
      Deep Modeling of Latent Representations for Twitter Profiles on Hate Speech Spreaders Identification
        Roberto Labadie, Daniel Castro-Castro, Reynier Ortega Bueno
      HaMor at the Profiling Hate Speech Spreaders on Twitter
        Mirko Lai, Marco Antonio Stranisci, Cristina Bosco, Rossana Damiano, Viviana Patti
      Multi-level stacked ensemble with sparse and dense features for hate speech detection on Twitter
        Darko Tosev, Sonja Gievska
  17:30-19:00 Keynotes, Chair: Martin Potthast
    17:30-18:00 Industry Talk: Author Profiling at Symanto - Build Emotional Connection with your Customers at Scale
      Francisco M. Rangel Pardo
    18:00-19:00 Keynote: Detecting and Rewriting Socially Biased Language
      Maarten Sap
September 23
  15:30-17:00 Keynote & Lab Session: Style Change Detection, Chair: Eva Zangerle
    15:30-16:00 Keynote: Completing the Multi-Authorship Jigsaw Puzzle
      Harry Scells
    16:00-16:30 Overview of the Style Change Detection Task at PAN 2021
    16:30-17:00 Participant presentations
      Style Change Detection Based On Writing Style Similarity
        Zhijie Zhang, Zhongyuan Han, Leilei Kong, Xiaogang Miao, Zeyang Peng, Jieming Zeng, Haojie Cao, Jinxi Zhang, Ziwei Xiao and Xuemei Peng
      Writing Style Change Detection on Multi-Author Documents
        Rhia Singh, Janith Weerasinghe and Rachel Greenstadt
      Style Change Detection on Real-World Data using an LSTM-powered Attribution Algorithm
        Robert Deibel and Denise Loefflad
  17:30-19:00 Lab Session: Style Change cont'd & Authorship Verification, Chair: Ilia Markov
    17:30-17:40 Style change detection using Siamese neural networks
      Sukanya Nath CANCELLED
    17:40-18:00 Overview of the Authorship Verification Task at PAN 2021
    18:00-19:00 Participant presentations
      O2D2: Out-Of-Distribution Detector to Capture Undecidable Trials in Authorship Verification
        Benedikt Bönninghoff, Robert Nickel and Dorothea Kolossa
      Graph-based Siamese Network for Authorship Verification
        Daniel Embarcadero-Ruiz, Helena Gómez-Adorno, Ivan Reyes-Hernández, Alexis García and Alberto Embarcadero-Ruiz
      Feature Vector Difference based Authorship Verification for Open World Settings
        Janith Weerasinghe, Rhia Singh and Rachel Greenstadt
      Authorship Verification with neural networks via stylometric feature concatenation
        Antonio Menta and Ana Garcia-Serrano

Organizing Committee