Zero-Shot Cross-Lingual Hate Speech Detection

Endang Wahyu Pamungkas presents new experiments and challenges in hate speech detection in a multilingual context.

Hate speech is an increasingly important societal issue in the era of digital communication. Hateful expressions often make use of figurative language and, although they represent, in some sense, the dark side of language, they are also often prime examples of creative use of language. While hate speech is a global phenomenon, current studies on automatic hate speech detection are typically framed in a monolingual setting.

In this talk, he will present ongoing work on hate speech detection in low-resource languages by transferring knowledge from a resource-rich language, English, in a zero-shot setting. He will present experiments with traditional and recent neural architectures, and propose two joint-learning models that use different multilingual language representations to transfer knowledge between pairs of languages. The results of these experiments highlight a number of challenges and issues in this particular task.

One of the main challenges concerns current benchmarks for hate speech detection, in particular how bias related to the topical focus of the datasets influences classification performance. The limited ability of current multilingual language models to transfer knowledge between languages in the hate speech detection task also remains an open problem. However, the experimental evaluation and the qualitative analysis show that explicitly integrating linguistic knowledge from a structured abusive language lexicon helps to alleviate this issue.

When: 29 January 2021, 11:30 am