Categories
Meetings

The Ontology of Migrant Writers

Marco Antonio Stranisci presents a new Computational Ontology of Migrant Writers.

Title: The Ontology of Migrant Writers

Narratives have become a pervasive and multifaceted presence in social media. Within these communicative contexts, journalists and other influential people use them to frame specific and often conflicting points of view on the world. Correspondingly, users are an active part of this creative process because they interact with and redefine narratives through their sentiment on specific topics.

However, social media are often affected by stereotypical narratives that increase the level of aggressiveness and verbal violence online, often at the expense of people vulnerable to discrimination. Many of these narratives are mainstream and strongly related to the spread of Hate Speech (HS). Unfortunately, similar stereotypes are also present in positive narratives, which in several cases depict people vulnerable to HS exclusively as victims. By contrast, stories created directly by minorities have poor visibility in the public debate, even though the social web hosts many of them.

In order to reduce this underrepresentation, a computational ontology of migrant writers has been developed. This resource aims to represent people who have created literary works and are or have been migrants during their lives. It will be used to collect, organize, and make publicly available knowledge about migrant writers and their narratives. The ontology design focused on two research questions:

  • how to model the concept of migrant;
  • how to represent biographical events in their temporal succession.
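
As a rough illustration of the two questions above, the following sketch models a writer as a list of dated biographical events and derives the "migrant" status from them. It is not the ontology presented in the talk: the class names, the event kinds, and the rule that one migration event is enough to count as a migrant are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class BiographicalEvent:
    """A single dated event in a writer's life (hypothetical schema)."""
    kind: str                      # e.g. "birth", "migration", "publication"
    year: int
    country: Optional[str] = None  # where the event took place, if known


@dataclass
class Writer:
    """A writer described only through biographical events (hypothetical schema)."""
    name: str
    events: List[BiographicalEvent] = field(default_factory=list)

    def timeline(self) -> List[BiographicalEvent]:
        """Return the events in temporal succession (earliest first)."""
        return sorted(self.events, key=lambda e: e.year)

    def is_migrant(self) -> bool:
        """Assumed rule: a writer is a migrant if any migration event exists."""
        return any(e.kind == "migration" for e in self.events)


# Invented example data, deliberately listed out of order.
example = Writer(
    "Example Writer",
    [
        BiographicalEvent("publication", 2005, "Country B"),
        BiographicalEvent("birth", 1970, "Country A"),
        BiographicalEvent("migration", 1991, "Country B"),
    ],
)

for event in example.timeline():
    print(event.year, event.kind, event.country)
print("migrant:", example.is_migrant())
```

Deriving the migrant status from events, instead of storing it as a fixed label, keeps the two questions connected: changing the definition of "migrant" only means changing one rule.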

In the presentation, he will first introduce the backbone ontology of migrant writers, highlighting the most challenging aspects he faced during its development. Then, he will show a series of data collection strategies he implemented to gather content from Wikidata, DBpedia, and Wikipedia.
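
To give an idea of what collecting such data from Wikidata might look like, here is a minimal sketch that only builds a SPARQL query string for the public Wikidata Query Service. It is not one of the strategies shown in the talk. The heuristic it encodes (a writer whose country of birth differs from a country of citizenship) is an assumption and a crude proxy for migration, and the function name `build_writer_query` is hypothetical.

```python
# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_writer_query(limit: int = 10) -> str:
    """Build a SPARQL query for writers born in one country and citizens of another.

    The birth-country / citizenship mismatch is only an illustrative heuristic:
    it misses many migrants and can wrongly match cases such as border changes.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"""
SELECT ?writer ?writerLabel ?birthCountry ?citizenship WHERE {{
  ?writer wdt:P106 wd:Q36180 ;        # occupation: writer
          wdt:P19 ?birthPlace ;       # place of birth
          wdt:P27 ?citizenship .      # country of citizenship
  ?birthPlace wdt:P17 ?birthCountry . # country of the birth place
  FILTER(?birthCountry != ?citizenship)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


print(build_writer_query(5))
```

The resulting string can be pasted into the query service web interface or sent to `WIKIDATA_ENDPOINT` with any HTTP client; the sketch makes no network call itself.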

When: On 12th February, 2021 at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20210212T103000Z

Zero-Shot Cross-Lingual Hate Speech Detection

Endang Wahyu Pamungkas presents new experiments and challenges in Hate Speech Detection in a multilingual context.

Title: Zero-Shot Cross-Lingual Hate Speech Detection

Hate speech is an increasingly important societal issue in the era of digital communication. Hateful expressions often make use of figurative language and, although they represent, in some sense, the dark side of language, they are also often prime examples of creative use of language. While hate speech is a global phenomenon, current studies on automatic hate speech detection are typically framed in a monolingual setting.

In this talk, he will present ongoing work on hate speech detection in low-resource languages by transferring knowledge from a resource-rich language, English, in a zero-shot learning fashion. He will present experiments with traditional and recent neural architectures, and propose two joint-learning models, using different multilingual language representations to transfer knowledge between pairs of languages. The results of the experiments highlight a number of challenges and issues in this particular task.

One of the main challenges concerns current benchmarks for hate speech detection, in particular how bias arising from the topical focus of the datasets influences classification performance. The insufficient ability of current multilingual language models to transfer knowledge between languages in the specific hate speech detection task also remains an open problem. However, the experimental evaluation and the qualitative analysis show how the explicit integration of linguistic knowledge from a structured abusive language lexicon helps to alleviate this issue.
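
One simple way to picture the lexicon integration mentioned above is to turn lexicon matches into a small feature vector that can be concatenated with a model's learned representation. The sketch below is not the models from the talk: the toy lexicon, its two categories, and the function name `lexicon_features` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List

# Toy stand-in for a structured abusive language lexicon: word -> category.
# A multilingual lexicon would map entries from several languages onto the
# same categories, which is what makes the features language-independent.
TOY_LEXICON: Dict[str, str] = {
    "idiot": "insult",
    "stupid": "insult",
    "vermin": "dehumanizing",
}

# Fixed category order, so every text yields a vector of the same shape.
CATEGORIES: List[str] = sorted(set(TOY_LEXICON.values()))


def lexicon_features(tokens: List[str],
                     lexicon: Dict[str, str] = TOY_LEXICON,
                     categories: List[str] = CATEGORIES) -> List[int]:
    """Count, per lexicon category, how many tokens match the lexicon."""
    counts = Counter(
        lexicon[token.lower()] for token in tokens if token.lower() in lexicon
    )
    return [counts[category] for category in categories]


print(CATEGORIES)
print(lexicon_features("You stupid IDIOT".split()))
```

Because the vector is indexed by category and not by word, a classifier trained on English texts can in principle reuse the same features on another language covered by the lexicon.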

When: On 29th January, 2021 at 11.30 am

VALICO-UD

Elisa Di Nuovo presents VALICO-UD, a new resource for NLP.

Title: VALICO-UD, an Italian Learner Treebank in Universal Dependencies for NLP tasks

In this talk, a novel parallel treebank made of texts written by learners of Italian and their grammatically corrected versions will be presented. The treebank is annotated according to the Universal Dependencies formalism and is composed of a silver standard (automatically parsed) and a core gold standard, which was manually corrected and error-annotated. In addition, the evaluation of three different UDPipe models will be presented, also measuring the impact of gold tokenisation and PoS tagging. To conclude, its applications and annotation choices will be discussed.
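
For readers unfamiliar with how a parser is scored against a gold treebank, the sketch below computes unlabeled and labeled attachment scores (UAS and LAS) from two CoNLL-U strings. It assumes both sides share the same tokenisation (the gold-tokenisation setting) and it is not the official evaluation script. Whether the talk reports these exact metrics is an assumption, and the example sentence and its annotation are invented.

```python
from typing import List, Tuple


def read_conllu(text: str) -> List[Tuple[str, str]]:
    """Extract (HEAD, DEPREL) for each syntactic word in a CoNLL-U string.

    Comment lines, multiword-token ranges (e.g. "1-2") and empty nodes
    (e.g. "1.1") are skipped.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        rows.append((cols[6], cols[7]))  # HEAD is column 7, DEPREL column 8
    return rows


def attachment_scores(gold: str, pred: str) -> Tuple[float, float]:
    """Return (UAS, LAS), assuming gold and pred have identical tokenisation."""
    g, p = read_conllu(gold), read_conllu(pred)
    if not g or len(g) != len(p):
        raise ValueError("gold and predicted files must align and be non-empty")
    uas = sum(gh == ph for (gh, _), (ph, _) in zip(g, p)) / len(g)
    las = sum(a == b for a, b in zip(g, p)) / len(g)
    return uas, las


def row(idx: int, form: str, head: int, deprel: str) -> str:
    """Build one CoNLL-U line, leaving unused columns as underscores."""
    return "\t".join([str(idx), form, "_", "_", "_", "_", str(head), deprel, "_", "_"])


# Invented example: the prediction gets one dependency label wrong.
GOLD = "\n".join([
    "# text = Io mangio la mela",
    row(1, "Io", 2, "nsubj"), row(2, "mangio", 0, "root"),
    row(3, "la", 4, "det"), row(4, "mela", 2, "obj"),
])
PRED = "\n".join([
    row(1, "Io", 2, "nsubj"), row(2, "mangio", 0, "root"),
    row(3, "la", 4, "det"), row(4, "mela", 2, "nsubj"),
])

print(attachment_scores(GOLD, PRED))
```

With predicted tokenisation the two sides may no longer align word by word, which is why a full evaluation needs an alignment step that this sketch leaves out.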

Paper: Towards an Italian Learner Treebank in Universal Dependencies

When: On 15th January, 2021 at 11.30 am