Categories
Meetings

Data Augmentation through Back-Translation for Stereotypes and Irony Detection

Tom Bourgeade will present his research on “Data Augmentation through Back-Translation for Stereotypes and Irony Detection”.

Abstract

In NLP, detecting nuanced phenomena such as stereotypes or irony presents unique challenges, notably because labeled datasets are scarce. One strategy to mitigate this is Data Augmentation, whose methods each have pros and cons with regard to these phenomena. This presentation will focus on Back-Translation, which uses modern Machine Translation models to introduce variety in training instances, in a process similar to paraphrasing: a text is translated into a pivot language, then back into the original language. We compare this approach, on multilingual datasets for stereotype and irony detection, against simpler strategies such as oversampling, as well as against Cross-Translation, in which instances from other language subsets are translated and injected into the target-language training subset.
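
The three strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the speaker's implementation: the function names, the `Translator` signature, and the toy lookup-table translator are all assumptions made for illustration, and a real setup would plug in an actual Machine Translation model where `toy_translate` is used.

```python
import random
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]                     # (text, label)
Translator = Callable[[str, str, str], str]   # (text, src_lang, tgt_lang) -> text


def back_translate(text: str, translate: Translator, src: str, pivot: str) -> str:
    """Translate into the pivot language, then back into the source language."""
    return translate(translate(text, src, pivot), pivot, src)


def augment_with_back_translation(
    dataset: List[Example], translate: Translator, src: str, pivot: str
) -> List[Example]:
    """Append back-translated paraphrases that differ from every text seen so far."""
    augmented = list(dataset)
    seen = {text for text, _ in dataset}
    for text, label in dataset:
        paraphrase = back_translate(text, translate, src, pivot)
        if paraphrase not in seen:
            augmented.append((paraphrase, label))  # the label is assumed to be preserved
            seen.add(paraphrase)
    return augmented


def cross_translate(
    foreign: List[Example], translate: Translator, foreign_lang: str, target_lang: str
) -> List[Example]:
    """Translate instances from another language subset into the target language."""
    return [(translate(text, foreign_lang, target_lang), label) for text, label in foreign]


def oversample(dataset: List[Example], seed: int = 0) -> List[Example]:
    """Baseline: duplicate random instances until every label matches the largest class."""
    rng = random.Random(seed)
    by_label: Dict[str, List[Example]] = {}
    for example in dataset:
        by_label.setdefault(example[1], []).append(example)
    target = max(len(examples) for examples in by_label.values())
    balanced = list(dataset)
    for examples in by_label.values():
        balanced.extend(rng.choice(examples) for _ in range(target - len(examples)))
    return balanced


# Hypothetical stand-in for an MT model: a tiny lookup table, identity otherwise.
_TOY_TABLE = {
    ("en", "fr"): {"What a great day": "Quelle belle journée"},
    ("fr", "en"): {"Quelle belle journée": "What a beautiful day"},
    ("it", "en"): {"Che bella giornata": "What a lovely day"},
}


def toy_translate(text: str, src: str, tgt: str) -> str:
    return _TOY_TABLE.get((src, tgt), {}).get(text, text)
```

In this sketch, a text that comes back unchanged from the round trip is not added, since it would amount to plain duplication, which is what the oversampling baseline already does.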

When: 19/04/2024 11:30

Where: Sala Conferenze (3rd floor)

Multimodal Strategies for Robot-to-Human Communication

Massimo Donini will present his work “Multimodal Strategies for Robot-to-Human Communication”.

Abstract

Multimodality offers new possibilities in the field of robot-to-human communication. In the proposed approach, the coordinated and integrated use of multimedia elements with the robot’s speech plays an important role in the overall effectiveness of the communicative act. During the research, different multimodal communication strategies were formalised and implemented.

When: 09/02/2024 11:30

Where: Sala Riunioni (1st floor)

IDA – a multimodal comparable corpus for exploring extremist dynamics in online interaction

Selenia Anastasi (she/her) is a PhD candidate in Digital Humanities at the University of Genoa and a Fellow at the Language Technology Group, Hamburg University. She will present a previously published work: IDA – a multimodal comparable corpus for exploring extremist dynamics in online interaction.

Link to the publication: Proceedings of the 10th International Conference on CMC and Social Media Corpora for the Humanities 2023 (CMC-2023)

Abstract

Extremist online communities are rapidly growing locally, posing potential threats to European and non-European countries. To gain insight into the dynamics of interaction within these web-based extremist groups, we present IDA, the Incel Data Archive. IDA is a multilingual and multimodal corpus compiled from Incel forums in both Italian and English. With its collection of forums, blogs, and websites, the Incelosphere serves as an ideal case study for examining interaction dynamics within extremist online communities from a cross-cultural perspective. This work therefore makes a twofold contribution: firstly, it provides an original cross-cultural perspective on the Incel phenomenon, and secondly, it extensively discusses the challenges and opportunities encountered when constructing a multimodal and multilingual corpus from discussion forums. The results of the thematic exploration of the corpus demonstrate not only variations in the discussion topics favoured by each community but also differences in the targets of their hateful content.

When: 24/11/2023 11:30

Where: Sala Conferenze (3rd floor)