In this talk I will review some recent results regarding early detection of signs of depression and anorexia. Since 2017, we have been organizing eRisk, a CLEF lab that promotes the development of effective and efficient solutions for early risk prediction on the Internet. eRisk explores the evaluation methodology, effectiveness metrics and practical applications of early risk detection on the Internet. Early detection technologies can be employed in different areas, particularly those related to health and safety. For instance, early alerts could be sent when a predator starts interacting with a child for sexual purposes, or when a potential offender starts publishing antisocial threats on a blog, forum or social network. Our main goal is to pioneer a new interdisciplinary research area that would be potentially applicable to a wide variety of situations and to many different personal profiles. Examples include potential paedophiles, stalkers, individuals who could fall into the hands of criminal organisations, people with suicidal inclinations, or people susceptible to depression. I will also discuss the lessons learned over these two years and some future lines of work.
PAN at CLEF 2018
Shared Tasks
- Author Profiling
- Author Masking
- Cross-domain Authorship Attribution
- Obfuscation Evaluation
- Style Change Detection
Important Dates
- March 30, 2018: Early bird software submission
- April 15, 2018: TIRA evaluation phase opens
- May 11, 2018: TIRA evaluation phase deadline
- May 31, 2018 (extended): Paper submission
- June 15, 2018: Peer review notification
- June 29, 2018: Camera-ready participant papers submission
- June 30, 2018: Early bird conference registration
- September 10-14, 2018: Conference
The timezone of all deadlines is Anywhere on Earth.
Keynotes
The past decade witnessed a drastic change in the Web, with exponential growth in social networking services. Traditionally, social network users are encouraged to complete their profiles by explicitly providing personal attributes such as age, gender, and interests. Such information is essential for marketing, facility arrangement, or candidate assessment but, unfortunately, is often not publicly available. This gives rise to user profiling, which aims at automatic inference of individual user attributes based on their social network interactions. Considering that people frequently contribute multi-modal data to multiple online social networks at the same time, it is essential to implement inter-source complementary multi-view learning techniques to perform automatic user profiling efficiently. In this talk, we will give an overview of recent research attempts on learning across multiple social networks and data modalities for automatic user profiling. We will also give several practical examples of how Multi-View User Profiling helps SoMin.ai in boosting the efficiency of enterprises' marketing efforts.
There is much concern about algorithms that underlie information services and the view of the social world they present to users. Image search engines are known to perpetuate gender stereotypes, particularly surrounding professions (e.g., returning primarily images of men on a search for "engineer," although few, if any, men on a search for "nurse"). In the first part of the talk, I discuss the problem of detecting social biases in image search results. We developed a novel method for automatically examining the content and strength of gender stereotypes in image results, which is inspired by the trait adjective checklist method. In experiments with Microsoft Bing, we found that photos of women are more often retrieved for searches on warm character traits (e.g., "emotional"), whereas agentic traits (e.g., "rational") typically result in more images of men. In the second part of the talk, I address questions surrounding the origin of social biases in search algorithms. I will argue that the quality of image metadata is a source of bias, as algorithms are typically trained on "gold standard," human-produced metadata. Specifically, in an experiment testing a commonly used crowdsourcing task for metadata generation, I will provide evidence that people's descriptions of men and women depicted in similar contexts differ in systematic ways that are predictable by theory. In conclusion, I shall argue that while the reproduction of social stereotypes in search algorithms is likely inevitable, there are ways to effectively raise users' awareness of biases in results.
Program
PAN's program is part of the CLEF conference program.
September 10 | |
|
Labs Overviews |
Overview of PAN 2018: Author Identification, Author Profiling, and Author Obfuscation Efstathios Stamatatos, Francisco Rangel, Michael Tschuggnall, Benno Stein, Mike Kestemont, Paolo Rosso, Martin Potthast |
|
|
Best of Labs 2017 |
Hierarchical Clustering Analysis: The best-performing approach at PAN 2017 author clustering task Helena Gómez-Adorno, Carolina Martín-Del-Campo-Rodríguez, Grigori Sidorov, Yuridiana Alemán, Darnes Vilariño Ayala and David Pinto |
|
Session 1, Chair: Paolo Rosso | |
14:30-15:20 | Keynote: Profiling Depression and Anorexia in Social Media David Losada |
15:20-15:40 | Overview of the 6th Author Profiling Task at PAN 2018: Multimodal Gender Identification in Twitter Francisco Rangel, Paolo Rosso, Manuel Montes-y-Gómez, Martin Potthast, Benno Stein |
15:40-16:00 | Text and Image Synergy with Feature Cross Technique for Gender Identification Takumi Takahashi, Takuji Tahara, Koki Nagatani, Yasuhide Miura, Tomoki Taniguchi, Tomoko Ohkuma |
16:00-16:30 | Break |
Session 2, Chair: Francisco Rangel | |
16:30-16:50 | Gender Identification in Twitter using N-grams and LSA Saman Daneshvar, Diana Inkpen |
16:50-17:10 | Character-based Convolutional Neural Network and ResNet18 for Twitter Author Profiling Nils Schaetti |
17:10-17:50 | Keynote: Demystifying Psychometric Marketing: Multi-View Learning as a New Social Media User Profiling Standard Aleksandr Farseev |
17:50-18:00 | Discussion |
19:00-23:00 | City Tour |
September 11 | |
13:30-14:30 | Poster Session |
Author Profiling using Word Embeddings with Subword Information Rafael Felipe Sandroni Dias, Ivandré Paraboni |
|
Character-based Convolutional Neural Network for Style Change Detection Nils Schaetti |
|
Bidirectional Echo State Network-based Reservoir Computing for Cross-domain Authorship Attribution Nils Schaetti |
|
Gender Prediction From Tweets With Convolutional Neural Networks Erhan Sezerer, Ozan Polatbilek, Özge Sevgili, Selma Tekir |
|
CIC-GIL Approach to Cross-domain Authorship Attribution Carolina Martín-Del-Campo-Rodríguez, Helena Gómez-Adorno, Grigori Sidorov, Ildar Batyrshin |
|
Complexity Measures and POS n-grams for Author Identification in Several Languages Rocío López-Anguita, Arturo Montejo-Ráez, Manuel C. Díaz-Galiano |
|
Authorship Profiling Without Using Topical Information Jussi Karlgren, Lewis Esposito, Chantal Gratton, Pentti Kanerva |
|
Authorship Attribution with Neural Networks and Multiple Features Łukasz Gągała |
|
Stacked Gender Prediction from Tweet Texts and Images Giovanni Ciccone, Arthur Sultan, Léa Laporte, Elöd Egyed-Zsigmond, Alaa Alhamzeh, Michael Granitzer |
|
Gender Identification through Multi-modal Tweet Analysis using MicroTC and Bag of Visual Words Eric S. Tellez, Sabino Miranda-Jiménez, Daniela Moctezuma, Mario Graff, Vladimir Salgado, José Ortiz-Bejar |
|
Multi-Language Neural Network Model with Advance Preprocessor for Gender Classification over Social Media Kashyap Raiyani, Teresa Gonçalves, Paulo Quaresma, Vítor Beires Nogueira |
|
Multilingual Author Profiling using LSTMs Roy Khristopher Bayot, Teresa Gonçalves |
|
Session 3, Chair: Efstathios Stamatatos | |
14:30-14:50 | Overview of the Author Obfuscation Task at PAN 2018: A New Approach to Measuring Safety Martin Potthast, Felix Schremmer, Matthias Hagen, Benno Stein |
14:50-15:00 | UniNE at CLEF 2018: Author Masking Mirco Kocher, Jacques Savoy |
15:00-15:15 | Overview of the Author Identification Task at PAN-2018: Cross-domain Authorship Attribution Mike Kestemont, Efstathios Stamatatos, Walter Daelemans, Benno Stein, Martin Potthast |
15:15-15:30 | EACH-USP Ensemble Cross-domain Authorship Attribution José Eleandro Custódio, Ivandré Paraboni |
15:30-15:45 | Dynamic Parameter Search for Cross-Domain Authorship Attribution Benjamin Murauer, Michael Tschuggnall, Günther Specht |
15:45-16:00 | Cross-Domain Authorship Attribution Based on Compression Oren Halvani, Lukas Graner |
16:00-16:30 | Break |
Session 4, Chair: Mike Kestemont | |
16:30-17:15 | Keynote: Competent Men and Warm Women: On the Detection and Origin of Gender Stereotyped Image Search Results Jahna Otterbacher |
17:15-17:30 | Overview of the Author Identification Task at PAN-2018: Style Change Detection Michael Tschuggnall, Günther Specht, Benno Stein, Martin Potthast |
17:30-17:45 | An Ensemble-Rich Multi-Aspect Approach Towards Robust Style Change Detection Dimitrina Zlatkova, Daniel Kopev, Kristiyan Mitov, Atanas Atanasov, Momchil Hardalov, Ivan Koychev, Preslav Nakov |
17:45-18:00 | Detecting a Change of Style using Text Statistics Kamil Safin, Aleksandr Ogaltsov |
19:00-24:00 | Social Event: Théâtre des Halles |
September 12 | |
|
Best of Labs 2017 |
Simply the Best: Minimalist System Trumps Complex Models in Author Profiling Angelo Basile, Gareth Dwyer, Maria Medvedeva, Josine Rawee, Hessel Haagsma and Malvina Nissim |
|
17:00-18:00 | 2019 Labs Kickoff and Closing Discussion Forum |
19:00-24:00 | Social Event: Théâtre des Halles & Music Festival |