PAN at CLEF 2021

Important Dates

April 07, 2021 (extended): Early bird software submission phase (optional)
April to mid-May: Software submission phase
May 20, 2021: Software submission deadline
May 28, 2021: Participant paper submission
June 11, 2021: Peer review notification
June 30, 2021: Camera-ready participant papers submission
TBD: Early bird conference registration
September 21-24, 2021: Conference

The timezone of all deadlines is Anywhere on Earth.

Keynotes

Generalisation in Social Media Research: from Fact Verification to Hate Speech Detection

Arkaitz Zubiaga

Queen Mary University of London

While models built and tested on a specific dataset and for a specific task often achieve very good performance, they then fail to generalise when they are applied to new, unseen data. In this talk I will discuss the importance and challenges of achieving generalisable performance in social media research with a particular focus on fact verification and hate speech detection. I will present some of our recent work in this direction, as well as discuss open challenges to further the capacity of generalisation especially in hate speech detection.

Detecting and Rewriting Socially Biased Language

Maarten Sap

University of Washington

Language has the power to reinforce stereotypes and project social biases onto others, either through overt hate or subtle biases. Accounting for this toxicity and social bias in language is crucial for natural language processing (NLP) systems to be safely and ethically deployed in the world. In this talk, I will first analyze a failure case of automatic hate speech detection, in which we find that models tend to flag speech by African Americans as toxic more often than by others. We trace the origins of the biases back to the annotated datasets, and show that we can reduce these biases by making a tweet's dialect more explicit during the annotation process. Then, as an alternative to binary hate speech detection, I will present Social Bias Frames, a new structured formalism for distilling biased implications of language. Using a new corpus of 150k structured annotations, we show that models can learn to reason about high-level offensiveness of statements, but struggle to explain why a statement might be harmful. Finally, I will introduce PowerTransformer, a new unsupervised model for controllable debiasing of text through the lens of connotation frames of power and agency. With this model, we show that subtle gender biases in how characters are portrayed in stories and movies can be mitigated through automatic rewriting. I will conclude with future directions for better reasoning about toxicity and social biases in language.

Industry Talk

Author Profiling at Symanto: Build Emotional Connection with your Customers at Scale

Francisco Rangel

Symanto Research

Building emotional connection is key for human beings, but in the business world we sometimes forget about it: we forget that behind every potential deal there is a person, and we just speak about leads, customers or churn... At Symanto, we combine Artificial Intelligence with Psychology to help our customers build emotional connections at scale to improve the way they resonate and communicate. In this industry talk, I will briefly introduce Symanto, how we apply author profiling techniques, among others, how our core technology allows us to stand out from our competition, why emotions and psychology matter, how we offer this to the world, as well as some of the latest research lines that we are working on.

Invited Presentation

Completing the Multi-Authorship Jigsaw Puzzle

Harry Scells

University of Queensland, Australia

Multi-authorship tasks have now been run at PAN for over half a decade. In this time, several multi-authorship tasks have been identified, a number of datasets have been created, and numerous methods have been developed to address the tasks. The community here has achieved considerable success, given these challenging tasks. To use an analogy, the multi-authorship jigsaw puzzle is now starting to take shape, with many of the most important foundational pieces in place. In this talk, I will speculate on the next possible pieces of the multi-authorship jigsaw to connect the larger pieces of the puzzle, focusing on the construction of datasets and evaluation of methods.

Program

PAN's program is part of the CLEF 2021 conference program.

Please note that all session times below are given in Bucharest time, i.e., GMT+3.

September 22
10:00-11:30	CLEF Session: Lab overviews (BioASQ, ARQMath-2, SimpleText, PAN)
11:30-13:00	Keynote & Lab Session, Chair: Paolo Rosso
11:30-12:30	Keynote: Generalisation in Social Media Research: from Fact Verification to Hate Speech Detection (Arkaitz Zubiaga)
12:30-13:00	Overview of the Profiling Hate Speech Spreaders on Twitter Task at PAN 2021

15:30-17:00	Lab Session: Profiling Hate Speech Spreaders on Twitter, Chair: Francisco Rangel
15:30-15:40	Best system award of Profiling Hate Speech Spreaders on Twitter
15:40-17:00	Participant presentations
	Detection of hate speech spreaders using convolutional neural networks (Marco Siino, Elisa Di Nuovo, Ilenia Tinnirello, Marco La Cascia)
	Deep Modeling of Latent Representations for Twitter Profiles on Hate Speech Spreaders Identification (Roberto Labadie, Daniel Castro-Castro, Reynier Ortega Bueno)
	HaMor at the Profiling Hate Speech Spreaders on Twitter (Mirko Lai, Marco Antonio Stranisci, Cristina Bosco, Rossana Damiano, Viviana Patti)
	Multi-level stacked ensemble with sparse and dense features for hate speech detection on Twitter (Darko Tosev, Sonja Gievska)

17:30-19:00	Keynotes, Chair: Martin Potthast
17:30-18:00	Industry Talk: Author Profiling at Symanto - Build Emotional Connection with your Customers at Scale (Francisco M. Rangel Pardo)
18:00-19:00	Keynote: Detecting and Rewriting Socially Biased Language (Maarten Sap)
September 23
15:30-17:00	Keynote & Lab Session: Style Change Detection, Chair: Eva Zangerle
15:30-16:00	Keynote: Completing the Multi-Authorship Jigsaw Puzzle (Harry Scells)
16:00-16:30	Overview of the Style Change Detection Task at PAN 2021
16:30-17:00	Participant presentations
	Style Change Detection Based On Writing Style Similarity (Zhijie Zhang, Zhongyuan Han, Leilei Kong, Xiaogang Miao, Zeyang Peng, Jieming Zeng, Haojie Cao, Jinxi Zhang, Ziwei Xiao and Xuemei Peng)
	Writing Style Change Detection on Multi-Author Documents (Rhia Singh, Janith Weerasinghe and Rachel Greenstadt)
	Style Change Detection on Real-World Data using an LSTM-powered Attribution Algorithm (Robert Deibel and Denise Loefflad)

17:30-19:00	Lab Session: Style Change cont'd & Authorship Verification, Chair: Ilia Markov
17:30-17:40	~~Style change detection using Siamese neural networks~~ ~~Sukanya Nath~~ CANCELLED
17:40-18:00	Overview of the Authorship Verification Task at PAN 2021
18:00-19:00	Participant presentations
	O2D2: Out-Of-Distribution Detector to Capture Undecidable Trials in Authorship Verification (Benedikt Bönninghoff, Robert Nickel and Dorothea Kolossa)
	Graph-based Siamese Network for Authorship Verification (Daniel Embarcadero-Ruiz, Helena Gómez-Adorno, Ivan Reyes-Hernández, Alexis García and Alberto Embarcadero-Ruiz)
	Feature Vector Difference based Authorship Verification for Open World Settings (Janith Weerasinghe, Rhia Singh and Rachel Greenstadt)
	Authorship Verification with neural networks via stylometric feature concatenation (Antonio Menta and Ana Garcia-Serrano)

Organizing Committee

Martin Potthast

University of Kassel, hessian.AI, and ScaDS.AI

Paolo Rosso

Efstathios Stamatatos

University of the Aegean

Benno Stein

Bauhaus-Universität Weimar