Shared Tasks

Important Dates

  • March 15, 2019: Early bird software submission
  • April 15, 2019: TIRA evaluation phase opens
  • May 11, 2019: TIRA evaluation phase deadline
  • May 31, 2019 (extended): Paper submission: [template] [guidelines] [submission]
  • June 07, 2019: Peer review notification
  • June 28, 2019: Camera-ready participant papers submission
  • tba: Early bird conference registration
  • September 09-12, 2019: Conference

The timezone of all deadlines is Anywhere on Earth.


Giancarlo Ruffo
Hoax vs fact checking: understanding and predicting the diffusion of low quality information on communication networks
University of Turin, Italy

The Internet and online social networks have amplified information diffusion processes, but at the same time, they provide fertile ground for the spread of misinformation, rumors, and hoaxes. The goal of this work is to introduce a simple modeling framework to study these phenomena: following the epidemic approach and motivated by results in the literature, we look at misinformation as an instance of the more general concept of information diffusion, and we propose an adaptation of the classic SIS (Susceptible-Infected-Susceptible) model to the case of misinformation by adding two essential socio-cognitive features: forgetting and competition with fact-checking efforts. First, we focus on how the availability of debunking information may contain the diffusion of misinformation. Our approach allows us to quantitatively gauge the minimal reaction necessary to eradicate a hoax. Second, we simulate the spreading dynamics on networks with two communities of gullible and skeptical users, with different propensities to believe hoaxes and a segregation parameter that represents the sparsity of links between the two communities. Simulations show that segregation plays an important role in the diffusion of misinformation, but that its effect differs depending on the other parameters. Finally, we validate our model on Twitter data (both fake news and debunking), obtaining good results. Our encouraging findings suggest that fact-checking can still be considered useful in fighting misinformation, but also that the structure of the underlying social network is very important in how the spreading process evolves. Further investigation in this direction is therefore necessary in order to develop new tools and solutions to limit the diffusion of fake news.
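As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of an SIS-style process with forgetting and competing fact-checking. It is not the speaker's model: it assumes a well-mixed population rather than a network, and the function name `simulate_hoax`, the three compartments, the update rules, and every parameter (`spread`, `credibility`, `p_verify`, `p_forget`) are hypothetical choices made only for this example.

```python
def simulate_hoax(spread=0.5, credibility=0.5, p_verify=0.05,
                  p_forget=0.1, steps=500, initial_believers=0.01):
    """Return (susceptible, believer, fact_checker) population fractions per step.

    Assumed dynamics, for illustration only:
      - a susceptible user who meets a believer adopts the hoax with
        probability spread * credibility;
      - a susceptible user who meets a fact-checker adopts the debunking
        with probability spread * (1 - credibility);
      - believers and fact-checkers forget and become susceptible again
        with probability p_forget;
      - a believer who does not forget verifies the hoax with probability
        p_verify and becomes a fact-checker.
    """
    s, b, f = 1.0 - initial_believers, initial_believers, 0.0
    history = [(s, b, f)]
    for _ in range(steps):
        # Compute all flows from the current state before updating it.
        s_to_b = s * spread * credibility * b
        s_to_f = s * spread * (1.0 - credibility) * f
        b_forget = b * p_forget
        b_verify = b * (1.0 - p_forget) * p_verify
        f_forget = f * p_forget

        s = s - s_to_b - s_to_f + b_forget + f_forget
        b = b + s_to_b - b_forget - b_verify
        f = f + s_to_f + b_verify - f_forget
        history.append((s, b, f))
    return history


if __name__ == "__main__":
    for p_verify in (0.0, 0.05, 0.5):
        s, b, f = simulate_hoax(p_verify=p_verify)[-1]
        print(f"p_verify={p_verify}: believers={b:.3f}, fact-checkers={f:.3f}")
```

In this toy setting, raising `p_verify` plays the role of a stronger fact-checking reaction: once believers leave the believer state faster than new ones are recruited, the hoax dies out. Community structure and segregation, which the abstract highlights, are deliberately left out of the sketch.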

Giancarlo Ruffo, Ph.D., has been Associate Professor of Computer Science at the University of Turin, Italy, since 2006, Adjunct Professor at the School of Informatics and Computing, Indiana University, since 2011, and ISI fellow (awarded by ISI Foundation) since 2015. He is also coordinator of the master's degree program in "Networks and Computational Systems" (Reti e Sistemi Informatici) at the University of Turin. His current research interests fall in the multidisciplinary area of Computational Social Science and Network Science, with a focus on data visualization and data-driven approaches to modeling the diffusion of misinformation and opinion polarization in social media. He has also investigated research problems in web and data mining, recommendation systems, social media, distributed applications, peer-to-peer systems, security, and micro-payment schemes. He is the principal investigator of the ARCS group, and he has led several research projects. He has published about 50 peer-reviewed papers in international journals and conferences. Aside from his academic work, he has been involved in many other professional activities as a freelance consultant over the last 20 years. In 2013 he co-founded NetAtlas s.r.l., a tech company specialized in data modeling, analysis and management, data visualization, and ICT solutions.

Preslav Nakov
Exposing Paid Opinion Manipulation Trolls
Qatar Computing Research Institute (QCRI), HBKU

The practice of using opinion manipulation trolls has been a reality since the rise of the Internet and community forums. It has been shown that user opinions about products, companies and politics can be influenced by posts by other users in online forums and social networks. This makes it easy for companies and political parties to gain popularity by paying for "reputation management" to people or companies that post fake opinions from fake profiles in discussion forums and social networks.

A natural question is whether such trolls can be found and exposed automatically. This is hard, as there is not enough data to train a classifier; it is possible to obtain some test data, as such trolls are sometimes caught and widely exposed, but one still needs training data. We solve the problem by assuming that a user who is called a troll by several different people is likely to be one, and that one who has never been called a troll is unlikely to be one. We compare the profiles of (i) paid trolls vs. (ii) "mentioned" trolls vs. (iii) non-trolls, and we further show that a classifier trained to distinguish (ii) from (iii) also does quite well at telling apart (i) from (iii).
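The labeling assumption described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the speaker's implementation: the function name `weak_troll_labels`, the input format (pairs of accuser and accused user), and the threshold `min_accusers=3` standing in for "several different people" are all hypothetical.

```python
from collections import defaultdict


def weak_troll_labels(accusations, all_users, min_accusers=3):
    """Derive weak training labels from "you are a troll" mentions.

    accusations: iterable of (accuser, accused) user-id pairs.
    all_users:   iterable of every user id under consideration.
    Returns a dict mapping user id to 1 ("mentioned" troll) or 0 (non-troll).
    Users accused by at least one but fewer than min_accusers distinct
    people are ambiguous and are left out of the training set.
    """
    accusers_of = defaultdict(set)
    for accuser, accused in accusations:
        if accuser != accused:  # ignore self-accusations
            accusers_of[accused].add(accuser)

    labels = {}
    for user in all_users:
        n_accusers = len(accusers_of.get(user, ()))
        if n_accusers >= min_accusers:
            labels[user] = 1
        elif n_accusers == 0:
            labels[user] = 0
    return labels


if __name__ == "__main__":
    accusations = [("a", "x"), ("b", "x"), ("c", "x"), ("a", "y"), ("a", "y")]
    print(weak_troll_labels(accusations, ["a", "b", "c", "x", "y", "z"]))
```

Counting distinct accusers rather than raw mentions keeps a single persistent accuser from producing a positive label, which matches the "several different people" wording of the abstract.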

Preslav Nakov, Ph.D., is a Principal Scientist at the Qatar Computing Research Institute (QCRI), HBKU. His research interests include computational linguistics, "fake news" detection, fact-checking, machine translation, question answering, sentiment analysis, lexical semantics, Web as a corpus, and biomedical text processing. At QCRI, he leads the Tanbih project, developed in collaboration with MIT, which aims to limit the effect of "fake news", propaganda and media bias by making users aware of what they are reading. Dr. Nakov is the Secretary of ACL SIGLEX and ACL SIGSLAV, and a member of the EACL advisory board. He also serves on the editorial boards of Transactions of the Association for Computational Linguistics, Computer Speech and Language, Natural Language Engineering, AI Communications, and Frontiers in AI. Dr. Nakov received his PhD from the University of California at Berkeley (supported by a Fulbright grant). He is the recipient of the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer.



PAN's program is part of the CLEF conference program.

September 9
Best of Labs at Auditorium
13:45-15:00 An Ensemble Approach to Cross-Domain Authorship Attribution
José Custódio and Ivandré Paraboni
Labs Presentations at Auditorium
15:45-16:00 Overview of PAN 2019: Bots and Gender Profiling, Celebrity Profiling, Cross-domain Authorship Attribution and Style Change Detection
Walter Daelemans, Mike Kestemont, Enrique Manjavacas, Martin Potthast, Francisco Manuel Rangel Pardo, Paolo Rosso, Günther Specht, Efstathios Stamatatos, Benno Stein, Michael Tschuggnall, Matti Wiegmann and Eva Zangerle
September 10
12:00-13:30 Poster Session during Lunch
13:30-15:00 Session 1 at A31, Chair: Martin Potthast
13:30-13:40 PAN 2019 Welcome
Martin Potthast
13:40-14:40 Keynote: Exposing Paid Opinion Manipulation Trolls
Preslav Nakov
14:40-15:00 Overview of the Shared Task on Bots and Gender Profiling in Twitter
Francisco Rangel and Paolo Rosso
15:00-15:30 Break
15:30-16:30 Session 2 at A31, Chair: Francisco Rangel
15:30-15:35 Award in Bots and Gender Profiling by The Logic Value
15:35-15:50 Using N-grams to detect Bots on Twitter
Juan Pizarro
15:50-16:10 Supervised Classification of Twitter Accounts Based on Textual Content of Tweets
Fredrik Johansson
16:10-16:30 Overview of the Celebrity Profiling Task
Matti Wiegmann
16:30-17:30 Session 3 at A31, Chair: Paolo Rosso
16:30-17:30 Keynote: Hoax vs fact checking: understanding and predicting the diffusion of low quality information on communication networks
Giancarlo Ruffo
18:30-22:00 Conference Dinner
September 11
12:00-13:30 Poster Session during Lunch
Multi-channel Open-set Cross-domain Authorship Attribution
José Custódio and Ivandré Paraboni
Bot and Gender detection of Twitter accounts using Distortion and LSA
Andrea Bacciu, Massimo La Morgia, Alessandro Mei, Eugenio Nerio Nemmi, Valerio Neri, and Julinda Stefa
FOI Cross-Domain Authorship Attribution for Criminal Investigations
Fredrik Johansson and Tim Isbister
UniNE at PAN-CLEF 2019: Bots and Gender Task
Catherine Ikae, Sukanya Nath, Jacques Savoy
Combined CNN+RNN Bot and Gender Profiling
Rafael Felipe Sandroni Dias and Ivandré Paraboni
Detecting bot accounts on Twitter by measuring message predictability
Piotr Przybyła
Bots and gender profiling using masking techniques
Victor Jimenez-Villar, Javier Sánchez-Junquera, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda, and Simone Paolo Ponzetto
An evolutionary approach to build user representations for profiling of bots and humans in Twitter
Roberto López-Santillán, Luis Carlos González-Gurrola, Manuel Montes-y-Gómez, Graciela Ramírez-Alonso, and Olanda Prieto-Ordaz
Naive-Bayesian Classification for Bot Detection in Twitter
Pablo Gamallo and Sattam Almatarneh
Unsupervised pretraining for text classification using siamese transfer learning
Maximilian Bryan and J. Nathanael Philipp
Author profiling using semantic and syntactic features
György Kovács, Vanda Balogh, Kumar Shridhar, Purvanshi Mehta, and Pedro Alonso
Profiling Twitter users using autogenerated features invariant to data distribution
Tiziano Fagni and Maurizio Tesconi
Bots and Gender Profiling using a Multi-layer Architecture
Régis Goubin, Dorian Lefeuvre, Alaa Alhamzeh, Jelena Mitrovic, Elod Egyed-Zsigmond, and Leopold Ghemmogne Fossi
Bots and Gender Profiling on Twitter using Sociolinguistic Features
Edwin Puertas, Luis Gabriel Moreno-Sandoval, Flor Miriam Plaza-del-Arco, Jorge Andres Alvarado-Valencia, Alexandra Pomares-Quimbaya, and L. Alfonso Ureña-López
Bots and gender profiling with convolutional hierarchical recurrent neural network
Juraj Petrik, Daniela Chuda
Celebrity Profiling on Twitter using Sociolinguistic Features
Luis Gabriel Moreno-Sandoval, Edwin Puertas, Flor Miriam Plaza-del-Arco, Alexandra Pomares-Quimbaya, Jorge Andres Alvarado-Valencia, and L. Alfonso Ureña-López
A Hierarchical Neural Network Approach for Bots and Gender Profiling
Andrea Cimino and Felice dell’Orletta
Bots and Gender Profiling using Character Bigrams
Daniel Yacob Espinosa, Helena Gómez-Adorno, and Grigori Sidorov
15:30-16:30 Session 4 at A31, Chair: Efstathios Stamatatos
15:30-15:45 Overview of the Style Change Detection Task
Eva Zangerle
15:45-16:00 Style Change Detection by Threshold Based and Window Merge Clustering Methods
Sukanya Nath
16:00-16:15 Twitter User Profiling: Bot and Gender Identification
Dijana Kosmajac and Vlado Keselj
16:15-16:30 Twitter feeds profiling with TF-IDF
Juraj Petrik and Daniela Chuda
16:30-16:50 Overview of the Cross-domain Authorship Attribution Task
Mike Kestemont
16:50-17:10 Cross-Domain Authorship Attribution Combining Instance Based and Profile-Based Features
Andrea Bacciu, Massimo La Morgia, Alessandro Mei, Eugenio Nerio Nemmi, Valerio Neri, Julinda Stefa
17:10-17:30 Community discussion
18:30-20:30 Civic Reception
September 12
Best of CLEF for Industry at Auditorium
14:00-14:30 Shared Tasks for Industry: Experiment Platforms and Author Profiling
Martin Potthast and Francisco Rangel


Organizing Committee