Categories
Meetings

Interpretability of deep neural networks for computer vision

Francesco Di Ciccio will discuss the importance of making models interpretable, presenting a first experiment with the Layer-wise Relevance Propagation (LRP) technique.

Title: Interpretability of deep neural networks for computer vision

Machine learning methods are widely used in both commercial applications and academia to make inferences in a wide range of areas. Extracting information from data in order to make accurate predictions is the predominant goal in many such applications, although it can come at the cost of the explainability of the applied model. Model design should therefore include tools that help in understanding the outputs with respect to the model's parameters and their consistency with domain knowledge, as well as the choice of hyperparameters and their interpretation within the application domain. Choosing the correct features is especially important in domains such as medicine or economics, where decisions are made on the assumption that the predicted results are obtained from a proper representation of the problem at hand. Such fields rely strongly on the interpretability of the model, given the greater interest in knowing ‘Why?’ and in identifying the parameters that led the model to its predictions. The interest in interpretability is shared by other fields, such as computer vision, which for simplicity will be used here to investigate the functioning of the Layer-wise Relevance Propagation (LRP) technique, focusing on a simple classification task with the MNIST dataset.
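
As a rough illustration of the mechanics behind LRP (this is not the speaker's implementation), the following NumPy sketch applies the LRP-ε rule to a toy two-layer network on a flattened MNIST-sized input; all weights and sizes are placeholders:

```python
import numpy as np

def lrp_epsilon(a, w, relevance, eps=1e-6):
    """Redistribute relevance from a layer's output to its input (LRP-epsilon rule)."""
    z = a @ w                          # pre-activations of this layer
    z = z + eps * np.sign(z)           # stabiliser keeps the division well-defined
    s = relevance / z                  # relevance per unit of pre-activation
    return a * (w @ s)                 # redistribute back onto the inputs

# Toy two-layer ReLU network on a flattened 28x28 (MNIST-like) input
rng = np.random.default_rng(0)
x = rng.random(784)
w1 = rng.standard_normal((784, 64)) * 0.05
w2 = rng.standard_normal((64, 10)) * 0.05

h = np.maximum(x @ w1, 0.0)            # hidden activations
logits = h @ w2

# Start from the relevance of the predicted class only
k = np.argmax(logits)
r_out = np.zeros(10)
r_out[k] = logits[k]

r_hidden = lrp_epsilon(h, w2, r_out)
r_input = lrp_epsilon(x, w1, r_hidden)  # one relevance score per pixel
heatmap = r_input.reshape(28, 28)       # visualise to see which pixels mattered
```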

When: December 03 at 11.30

Where: in person (032_A_P03_3140) or online

Categories
Talks

Seminar: Gavin Abercrombie

Adventures in Annotation for NLP

Most natural language processing tasks rely on human-labelled data in order to train supervised learning systems. However, tasks can be subjective, and the “ground truth” labels may be difficult or even impossible to ascertain. In this seminar, I will describe work on collecting, creating, analysing, and deploying labelled datasets for tasks including sarcasm detection, sentiment analysis, topic identification, and abuse detection in domains as diverse as social media, parliamentary debates, and conversational AI.

Gavin Abercrombie is a postdoctoral research associate at Heriot-Watt University (Edinburgh, Scotland), where he is working on the EPSRC-funded project “Designing Conversational Assistants to Reduce Gender Bias”. He holds a PhD from the University of Manchester and an MSc from the University of Copenhagen, and is currently a visiting researcher at Bocconi University, Milan.

When: 18/11/2021 at 11.00

Where: in person (032_A_P03_3140)

Categories
Meetings

Mining Annotator Perspectives from Hate Speech Corpora

Valerio Basile will introduce a new automatic method to identify annotators’ perspectives on controversial issues such as Hate Speech.

Title: Mining Annotator Perspectives from Hate Speech Corpora

Disagreement in annotation, traditionally treated mostly as noise, is now more and more often considered a source of valuable information instead. He investigated a particular form of disagreement, occurring when the focus of an annotated dataset is a subjective and controversial phenomenon, which therefore induces a certain degree of polarization among the annotators’ judgments. He argued that this polarization is indicative of the conflicting perspectives held by different annotator groups, and proposed a quantitative method to model the phenomenon. Moreover, he introduced a method to automatically identify shared perspectives stemming from a common background. He tested this method on several corpora in English and Italian, manually annotated according to their hate speech content, validating prior knowledge about the groups of annotators, when available, and discovering characteristic traits among annotators with unknown background. He found numerous precisely defined perspectives, described in terms of increased sensitivity towards textual content expressing attitudes such as xenophobia, Islamophobia, and homophobia.
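
As a hypothetical sketch of the general idea (not necessarily the exact method presented in the talk), one can measure per-item polarization and cluster annotators by their judgement vectors to surface candidate perspective groups:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical annotation matrix: rows = annotators, columns = items,
# entries = binary hate speech judgements (1 = hateful, 0 = not hateful).
annotations = np.array([
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
])

# Per-item polarization: 0 when annotators are unanimous, 1 on a 50/50 split.
share_hateful = annotations.mean(axis=0)
polarization = 1 - np.abs(2 * share_hateful - 1)
print("item polarization:", polarization.round(2))

# Annotators with similar judgement vectors form candidate shared perspectives.
groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(annotations)
print("annotator groups:", groups)
```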

When: February 11 at 11.30

Where: online

Categories
Meetings

Sign Language Recognition using Machine Learning

Muhammad Saad Amin will talk about the challenges of Sign Language Recognition, from dataset creation to algorithm design.

Title: Sign Language Recognition using Machine Learning

Human gesture classification and recognition is a challenging task. Capturing human gestures and transforming them into labeled digital data is required to train supervised machine learning algorithms. Hence, Sign Language Recognition (SLR) systems with improved accuracy and increased efficiency are urgently needed.

In this seminar, he will discuss how to capture gestures (specifically American Sign Language, ASL) using sensor-based prototypes, how to convert these sign gestures into digital data (dataset generation), and how such a dataset can be used for SLR with supervised machine learning algorithms.
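
As a hypothetical illustration of the last step (synthetic data, not the speaker's prototype or dataset), a supervised classifier can be trained on digitised sensor readings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a glove-sensor dataset: each row holds the flex/IMU
# readings captured for one gesture; labels are toy ASL sign names.
rng = np.random.default_rng(42)
X = rng.random((300, 10))                  # 300 gestures, 10 sensor channels
y = rng.choice(["A", "B", "C"], size=300)  # hypothetical sign labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```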

When: November 19 at 11.30

Where: in person (032_A_P03_3140) or online

Categories
Meetings

Detection of Hate Speech Spreaders

With Convolutional Neural Networks

Elisa Di Nuovo and Marco Siino will present an interesting approach used to detect hate speech spreaders in the context of the shared task Profiling Hate Speech Spreaders (HSSs) proposed at PAN 2021.

Title: Detection of hate speech spreaders using convolutional neural networks

The speakers will describe a deep learning model based on a Convolutional Neural Network (CNN) to profile hate speech spreaders online. The model was developed for the Profiling Hate Speech Spreaders (HSSs) task proposed by the PAN 2021 organizers and hosted at the 2021 CLEF Conference. The approach, used to classify an author as HSS or not (nHSS), takes advantage of a CNN with a single convolutional layer. In this binary classification task, in tests performed using 5-fold cross-validation, the proposed model reaches a maximum accuracy of 0.80 on the multilingual (i.e., English and Spanish) training set, and a minimum loss value of 0.51 on the same set. As announced by the task organizers, the model won the 2021 PAN competition on profiling HSSs, reaching an overall accuracy of 0.79 on the full test set, obtained by averaging the accuracy achieved on the two languages. In particular, the model achieves an accuracy of 0.85 on the Spanish test set and of 0.73 on the English test set.
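
For illustration, a single-convolutional-layer text classifier in the spirit of such a model might look like the Keras sketch below; the vocabulary, sequence length, and filter settings here are placeholders, and the actual architecture is described in the linked paper:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder sizes, not the values used for PAN 2021.
VOCAB_SIZE, MAX_LEN, EMB_DIM = 20000, 1000, 128

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, EMB_DIM, input_length=MAX_LEN),
    layers.Conv1D(filters=64, kernel_size=5, activation="relu"),  # the single conv layer
    layers.GlobalMaxPooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # HSS vs. nHSS
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```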

When: On 22nd October at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20211022T093000Z

Paper: http://ceur-ws.org/Vol-2936/paper-189.pdf

Categories
Meetings

HaMor To Profile Hate Speech Spreaders

Mirko Lai and Marco A. Stranisci will present an innovative approach that takes into account the morality and communicative behaviour of the users to profile hate speech spreaders online.

Title: HaMor at the Profiling Hate Speech Spreaders on Twitter

In this talk, they will describe the Hate and Morality (HaMor) submission for the Profiling of Hate Speech Spreaders on Twitter, the shared task at PAN 2021.
HaMor ranked in 19th position out of 66 participating teams, with an average accuracy of 73% over the two languages.
The approach obtained the 43rd highest accuracy for English (62%) and the 2nd highest for Spanish (84%).
In particular, it involves four types of features that help the system infer users’ attitudes just from their messages: hate speech detection, user morality, named entities, and communicative behaviour.
The results of their experiments are promising and will lead to future investigations of these features from a finer-grained perspective.
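
As a hypothetical sketch of this kind of user profiling (the stand-in components below are not the HaMor system; see the linked paper for the actual features and models), per-tweet signals can be aggregated into a single feature vector per user:

```python
import numpy as np

# Aggregate per-tweet signals into one feature vector per user.
# Named-entity features are omitted here for brevity.
def user_features(tweets, hate_clf, moral_lexicon):
    hate_rate = np.mean([hate_clf(t) for t in tweets])            # hate speech detection
    morality = np.mean([sum(w in moral_lexicon for w in t.lower().split())
                        for t in tweets])                         # user morality
    retweet_rate = np.mean([t.startswith("RT") for t in tweets])  # communicative behaviour
    return np.array([hate_rate, morality, retweet_rate])

# Toy usage with stand-in components
toy_hate_clf = lambda t: float("hate" in t.lower())
toy_lexicon = {"care", "harm", "fair", "loyal"}
print(user_features(["RT I hate this", "we must care for all"], toy_hate_clf, toy_lexicon))
```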

When: On 5th November at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20211022T093000Z

Paper: http://ceur-ws.org/Vol-2936/paper-178.pdf

Categories
Meetings

WordUp! at VaxxStance 2021

Combining Contextual Information with Textual and Dependency-Based Syntactic Features for Stance Detection

Mirko Lai and Alessandra T. Cignarella will present an innovative approach to detect stance online, proposed in the frame of the VaxxStance shared task.

Title: WordUp! at VaxxStance 2021: Combining Contextual Information with Textual and Dependency-Based Syntactic Features for Stance Detection

In this talk, they will describe the participation of the WordUp! team in the VaxxStance shared task at IberLEF 2021. The goal of the competition is to determine an author’s stance from tweets written in Spanish and Basque on the topic of the anti-vaxxer movement. Their approach, across the four proposed tracks, combines a Logistic Regression classifier with diverse groups of features: stylistic, tweet-based, user-based, lexicon-based, dependency-based, and network-based. The outcomes of their experiments are in line with state-of-the-art results on other languages, proving the efficacy of combining methods derived from NLP and Network Science for detecting stance in Spanish and Basque.
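
As a simplified, hypothetical sketch of this kind of pipeline, word and character n-grams stand in below for the stylistic and textual feature groups; the full system also uses user-, lexicon-, dependency-, and network-based features described in the linked paper:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# Combine several feature groups and feed them to a Logistic Regression.
features = FeatureUnion([
    ("word_ngrams", TfidfVectorizer(ngram_range=(1, 2))),
    ("char_ngrams", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
])
clf = Pipeline([("features", features),
                ("logreg", LogisticRegression(max_iter=1000))])

# Toy English examples; the real task uses Spanish and Basque tweets.
tweets = ["vaccines are safe and effective", "I will never vaccinate my kids"]
stances = ["FAVOR", "AGAINST"]
clf.fit(tweets, stances)
print(clf.predict(["vaccination protects everyone"]))
```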

When: On 8th October at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20210702T093000Z?from_login=true

Paper: http://ceur-ws.org/Vol-2943/vaxx_paper3.pdf

Categories
Talks

“Shouldn’t I use a polar question?”

Proper Question Forms Disentangling Inconsistencies in Dialogue Systems

This talk describes a specific class of clarification requests, adopted for the negotiation of grounded information in argumentation-based dialogue systems. Two studies are carried out to prove the adequacy of a specific form of polar question when a presupposition is contradicted by new evidence. Whereas the first study proves the appropriateness of the negative form, the second also demonstrates how the use of such a form can affect the principle of robustness, in terms of observability and recoverability, which is important in human–machine interaction applications. The two studies show that dialogue systems with such capabilities can lead to improved usability and naturalness in conversation. For this reason, I present here a system capable of detecting conflicts and of using argumentation strategies to signal them consistently with previous observations.

Maria Di Maro is a Post-Doc at the University of Naples ‘Federico II’ working on interaction design for the project BRILLO (Bartending Robot for Interactive Long-Lasting Operations). She obtained her Ph.D. in linguistics at the University of Naples ‘Federico II’ in 2021 with the dissertation “Shouldn’t I use a polar question? Proper Question Forms Disentangling Inconsistencies in Common Ground”. Her research interests range from corpus collection and pragmatic annotation to computational pragmatics and the modeling of grounding processes in spoken dialogue systems. She is also passionate about Artificial Intelligence in general and about graph databases as a vessel for complex pragmatic reasoning.

When: 13/07/2021 at 11.00

Categories
Meetings

Whose Opinions Matter?

Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection.

Sohail Akhtar will present an in-depth study of novel approaches to hate speech detection, focusing on the development of approaches that leverage fine-grained knowledge derived from the annotations of individual annotators.

Title: Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection.

Hate Speech (HS) is a form of abusive language, and its detection on social media platforms is a difficult but important task. The sudden rise in hate-speech-related incidents on social media is considered a major issue. The technologies being developed for HS detection mainly employ supervised machine learning approaches from Natural Language Processing (NLP). Training such models requires data manually annotated by humans, either crowd-sourced paid workers or domain experts, for training and benchmarking purposes.

Because abusive language is subjective in nature, the annotation of abusive content such as HS may involve highly polarizing topics or events. Novel approaches are therefore required to model the conflicting perspectives and opinions of people with different personal and demographic backgrounds, which raise issues concerning the quality of the annotation itself and might also affect the gold standard data used to train NLP models. Annotators may also show different levels of sensitivity to particular forms of hate, which results in low inter-annotator agreement. Moreover, the online platforms used for HS annotation do not provide any background information about the annotators, and the views and personal opinions of the victims of online hate are often ignored in HS detection tasks.

In this talk, he will present an in-depth study of novel approaches to detect various forms of abusive language against minorities. The work focuses on developing approaches that leverage fine-grained knowledge derived from the annotations of individual annotators, before a gold standard is created in which the subjectivity of the annotators is averaged out.

The research aimed at developing approaches to model the polarized opinions coming from different communities, under the hypothesis that shared characteristics (ethnicity, social background, culture, etc.) can influence the annotators’ perspectives on a certain phenomenon, and that annotators can be grouped together based on such information.

The intuition is that, by relying on such information, it is possible to divide the annotators into separate groups. Based on this grouping, separate gold standards are created for each group to train state-of-the-art deep learning models for abusive language detection. Additionally, an ensemble approach is implemented to combine the perspective-aware classifiers of the different groups into an inclusive model.
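
A minimal, hypothetical sketch of such an ensemble (stand-in features and models, not the study's actual architecture): one classifier is trained per annotator group's gold standard, and their predicted probabilities are averaged into an inclusive decision.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_group_models(texts, labels_per_group):
    # One perspective-aware classifier per annotator group's gold standard.
    return [make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, y)
            for y in labels_per_group]

def ensemble_predict(models, texts):
    # Average the groups' probabilities into one inclusive decision.
    probs = np.mean([m.predict_proba(texts)[:, 1] for m in models], axis=0)
    return (probs >= 0.5).astype(int)

# Toy usage: two groups label the same texts differently.
texts = ["you people disgust me", "have a nice day", "go back home", "lovely weather"]
labels = [[1, 0, 1, 0],   # group A's gold standard
          [1, 0, 0, 0]]   # group B's gold standard
models = train_group_models(texts, labels)
print(ensemble_predict(models, ["you disgust me"]))
```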

The research proposed a novel resource, a multi-perspective English-language dataset annotated along different sub-categories relevant for characterizing online abuse: HS, aggressiveness, offensiveness, and stereotype. Unlike previous work, where the annotations were crowd-sourced, this study involved members of the targeted communities in the annotation process as volunteers, providing a natural grouping of the annotators based on their personal characteristics. The annotators come from different cultural, social, and demographic backgrounds, and one of the groups consists of members of the targeted communities.

Training state-of-the-art deep learning models on this novel resource showed how the proposed approach improves the prediction performance of a state-of-the-art supervised classifier.

Moreover, an in-depth qualitative analysis of the novel dataset examined individual tweets to identify and understand the topics and events causing polarization among the annotators. The analysis showed that keywords (unigram features) are indeed strongly linked to and influenced by the culture, religion, and demographic background of the annotators.

When: On 2nd July at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20210702T093000Z?from_login=true

Categories
Meetings

Weights & Biases

Mattia Cerrato presents a tutorial on using the Weights & Biases platform to keep track of results, hyperparameters, and random seeds in ML experiments.

Title: Experiment tracking with Weights & Biases

Performing experiments is perhaps the most time-consuming activity in ML research, especially at the junior level, yet often too little effort is spent on understanding how to optimize this process. The Weights & Biases (W&B) platform provides a simple Python interface that can be used to keep track of results, hyperparameters, and random seeds. It has intuitive visualization utilities that help turn raw performance metrics into experimental reports. Furthermore, it provides an easy way to perform hyperparameter search (random, grid, and even Bayesian search strategies are available) and even some light training orchestration capabilities. In this talk, we will see how to extend our experimental scripts so that W&B can help us keep our sanity during the experimental phase of a project.
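
A minimal sketch of the core tracking workflow (the project name and config values are placeholders, and it assumes `wandb login` has already been run):

```python
import wandb

# Start a run; everything in `config` is stored alongside the logged results.
run = wandb.init(project="my-experiments",
                 config={"lr": 1e-3, "batch_size": 32, "seed": 42})

for epoch in range(10):
    train_loss = 1.0 / (epoch + 1)   # stand-in for a real training step
    wandb.log({"epoch": epoch, "train_loss": train_loss})

run.finish()
```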

When: On 4th June at 11.30

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20210604T093000Z