Category: Meetings

Periodic meetings of CCC group

iTelos – A methodology for building reusable purpose-specific Knowledge Graphs

Post author By Matteo Delsanto
Post date 03/21/2022

Simone Bocca (University of Trento) will present iTelos – A methodology for building reusable purpose-specific Knowledge Graphs.

Knowledge Graphs (KGs) have become increasingly popular in recent years, thanks to their efficiency in handling, representing and integrating information. KGs are exploited across different areas of interest, for several objectives, by applications and services as well as for data analysis and visualization. This popularity has increased the need to build KGs for the many different purposes stated by users, sometimes without a clear understanding of the issues to be addressed while building a KG. We propose iTelos, a KG building methodology designed to support the user in resolving those issues. In other words, iTelos aims to reduce the effort of building KGs that are as suitable as possible for the purpose expressed by the final users. To this end, the methodology is based on two key ideas: (i) to stratify the resources involved into different semantic interoperability levels, in order to deal with multiple types of data heterogeneity; (ii) to enhance as much as possible the reuse of already existing data and knowledge resources during the KG building process, thus reducing the effort required for construction and producing, in turn, highly reusable resources. iTelos is currently taught in the Knowledge and Data Integration (KDI) master course at the University of Trento (Italy) and Jilin University (China), and has been adopted in EU projects by the KnowDive group (University of Trento, Department of Information Engineering and Computer Science).

Models and vocabularies for ancient Near Eastern prosopographies

Post author By Simona Frenda
Post date 02/22/2022

Rossana Damiano, Stefano De Martino (Dipartimento di Studi Storici) and Elena Devecchi (Dipartimento di Studi Storici) will present the findings of an interesting investigation performed in the PRIN project “Writing uses: Transmission of Knowledge, Administrative Practices and Political Control in Anatolian and Syro-Anatolian Polities in the II and I millennium BC”.

Title: Models and vocabularies for ancient Near Eastern prosopographies

Prosopographies, understood as the large-scale study of people’s life events as they emerge from written sources, have been widely used in the last decade to study the social structure of ancient societies. The analysis of professional, kinship, administrative and political relations of the past, grounded in real data, can confirm the models put forth by historians and archaeologists through traditional research paradigms, and in some cases suggest new ones. In this sense, the availability of agreed-upon, formally expressed vocabularies for describing these data and their relations to sources is a key factor in the development of methods for the analysis of social networks from the past, in support of the work of historians.

The PRIN project “Writing uses: Transmission of Knowledge, Administrative Practices and Political Control in Anatolian and Syro-Anatolian Polities in the II and I millennium BC.” (2020-2022) has investigated the adaptation of a factoid-based model of prosopographic data to the case study of Hittite and Kassite civilizations. Aimed at the collaborative creation of prosopographic datasets for large-scale study of Hittite and Kassite social networks, the project has collected a corpus of person records and relations in a Linked Data format. In this seminar, she will describe the design of the vocabularies for the construction of the datasets and the research methods being developed from these data.

Typicality, Probabilities and Cognitive Heuristics: A Dynamic Knowledge Generation Framework for Knowledge Invention with applications in Cognitive Modelling, Computational Creativity, Explainable AI and Serendipity-based Recommender Systems

Post author By Valerio Basile
Post date 01/20/2022

Antonio Lieto

Inventing novel knowledge to solve problems is a crucial, creative mechanism employed by humans to extend their range of action. In this talk, I will show how commonsense reasoning plays a crucial role in this respect. In particular, I will present a cognitively inspired reasoning framework for knowledge invention and creative problem solving exploiting TCL: a probabilistic non-monotonic extension of a Description Logic (DL) of typicality able to combine prototypical (commonsense) descriptions of concepts in a human-like fashion. The proposed approach has been tested in a variety of fields and applications. I will present the obtained results, the lessons learned, and the road ahead of this research path.

See this page https://www.antoniolieto.net/tcl_logic.html for a list of the main papers and applications

Interpretability of deep neural networks for computer vision

Post author By Simona Frenda
Post date 11/29/2021

Francesco Di Ciccio will discuss the importance of making models interpretable, showing a first experiment with the Layer-wise Relevance Propagation (LRP) technique.

Title: Interpretability of deep neural networks for computer vision

Machine learning methods are widely used in both commercial applications and academia to make inferences in a wide range of areas. Extracting information from data in order to make accurate predictions is the predominant goal in many such applications, although it can come at the cost of the explainability of the applied model. The design of a model should focus on implementing tools that assist the understanding of: output results with respect to the parameters of the model and their consistency with the domain knowledge; the choice of the hyperparameters and their interpretation with respect to the domain of the application. In some domains, such as medicine or economics, choosing the correct features is especially important, because decisions are made on the assumption that the predicted results are obtained from a proper representation of the problem at hand. In such fields, there is a strong reliance on the interpretability of the model, given the higher interest in knowing ‘Why?’ and in identifying the parameters that caused the model to make such predictions. The interest in the interpretability of models is also shared by other fields, such as computer vision, which for simplicity will be used as a reference to investigate the functioning of the Layer-wise Relevance Propagation (LRP) technique. In this application, the focus is on a simple classification task with the MNIST dataset.
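To give a rough idea of how LRP redistributes a prediction score back onto the input features, here is a minimal sketch in plain Python. It is not the experiment presented in the talk: the abstract does not say which LRP rule was used, so the epsilon rule is assumed here, and the tiny two-layer network, its weights and the three-value input are made up for illustration (a real MNIST input would have 784 pixel values). Biases are omitted so that the total relevance is conserved across layers.

```python
def dense(weights, inputs):
    # weights[j][i] connects input i to output j (no biases in this sketch)
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def relu(values):
    return [max(0.0, v) for v in values]


def lrp_epsilon(weights, inputs, relevance_out, eps=1e-9):
    """Redistribute the relevance of a dense layer's outputs onto its inputs.

    Each input i receives a share of output j's relevance proportional to
    its contribution inputs[i] * weights[j][i] to the pre-activation z[j].
    """
    z = dense(weights, inputs)
    relevance_in = [0.0] * len(inputs)
    for j, row in enumerate(weights):
        # eps keeps the division stable when z[j] is close to zero
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i, w in enumerate(row):
            relevance_in[i] += inputs[i] * w / denom * relevance_out[j]
    return relevance_in


def explain(x, w1, w2):
    # Forward pass: input -> ReLU hidden layer -> class scores
    hidden = relu(dense(w1, x))
    logits = dense(w2, hidden)
    predicted = max(range(len(logits)), key=logits.__getitem__)
    # Start from the score of the predicted class only
    relevance_logits = [v if k == predicted else 0.0 for k, v in enumerate(logits)]
    # Backward pass: propagate relevance layer by layer down to the input
    relevance_hidden = lrp_epsilon(w2, hidden, relevance_logits)
    relevance_input = lrp_epsilon(w1, x, relevance_hidden)
    return predicted, logits, relevance_input


# Hypothetical toy values, chosen only for illustration
x = [1.0, 2.0, 0.5]
w1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
w2 = [[1.0, -0.5], [0.2, 0.7]]

predicted, logits, relevance = explain(x, w1, w2)
print(predicted, relevance)
```

In this toy case the relevance values assigned to the inputs sum (up to the epsilon term) to the score of the predicted class, which is the conservation property that makes the resulting heatmaps readable as a decomposition of the prediction.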

When: December 03 at 11.30

Where: in person (032_A_P03_3140) or online

Mining Annotator Perspectives from Hate Speech Corpora

Post author By Simona Frenda
Post date 11/29/2021

Valerio Basile will introduce a new automatic method to identify annotators’ perspectives in controversial issues such as Hate Speech.

Title: Mining Annotator Perspectives from Hate Speech Corpora

Disagreement in annotation, traditionally treated mostly as noise, is now more and more often considered a source of valuable information instead. He investigated a particular form of disagreement, occurring when the focus of an annotated dataset is a subjective and controversial phenomenon, therefore inducing a certain degree of polarization among the annotators’ judgments. He argued that the polarization is indicative of the conflicting perspectives held by different annotator groups, and proposed a quantitative method to model this phenomenon. Moreover, he introduced a method to automatically identify shared perspectives stemming from a common background.
He tested this method on several corpora in English and Italian, manually annotated according to their hate speech content, validating prior knowledge about the groups of annotators, when available, and discovering characteristic traits among annotators with unknown background.
He found numerous precisely defined perspectives, described in terms of increased sensitivity towards textual content expressing attitudes such as xenophobia, Islamophobia, and homophobia.

When: February 11 at 11.30

Where: online

Sign Language Recognition using Machine Learning

Post author By Simona Frenda
Post date 11/17/2021

Muhammad Saad Amin will talk about the problems related to Sign Language Recognition, from the creation of the dataset to the design of the algorithm.

Title: Sign Language Recognition using Machine Learning

Human gesture classification and recognition is always a challenging task. Capturing human gestures and transforming them into labeled digital data is required to train supervised Machine Learning algorithms. Hence, Sign Language Recognition (SLR) systems with improved accuracy and increased efficiency are the need of the hour.

In this seminar, he will discuss how to capture gestures (specifically ASL) using sensor-based prototypes, how to convert these sign gestures into digital data (dataset generation), and how a dataset can be used for SL recognition using supervised Machine Learning algorithms.

When: November 19 at 11.30

Where: in person (032_A_P03_3140) or online

Detection of Hate Speech Spreaders

Post author By Simona Frenda
Post date 10/08/2021

With Convolutional Neural Networks

Elisa Di Nuovo and Marco Siino will present an interesting approach used to detect hate speech spreaders in the context of the shared task Profiling Hate Speech Spreaders (HSSs) proposed at PAN 2021.

Title: Detection of hate speech spreaders using convolutional neural networks

The speakers will describe a deep learning model based on a Convolutional Neural Network (CNN) to profile hate speech spreaders online. The model was developed for the Profiling Hate Speech Spreaders (HSSs) task proposed by the PAN 2021 organizers and hosted at the 2021 CLEF Conference. The approach, used to classify an author as HSS or not (nHSS), takes advantage of a CNN based on a single convolutional layer. In this binary classification task, in tests performed using 5-fold cross-validation, the proposed model reaches a maximum accuracy of 0.80 on the multilingual (i.e., English and Spanish) training set, and a minimum loss value of 0.51 on the same set. As announced by the task organizers, the model won the 2021 PAN competition on profiling HSSs, reaching an overall accuracy of 0.79 on the full test set. This overall accuracy is obtained by averaging the accuracy achieved by the model on the two languages. In particular, the model achieves an accuracy of 0.85 on the Spanish test set and 0.73 on the English test set.
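The overall score quoted above is a plain unweighted mean of the two per-language accuracies. As a quick check of that arithmetic, here is a small sketch; the function name and the language keys are illustrative and not taken from the task's evaluation code.

```python
def average_accuracy(accuracy_by_language):
    # Unweighted mean: each language counts equally,
    # regardless of how many authors its test set contains
    return sum(accuracy_by_language.values()) / len(accuracy_by_language)


# Per-language test accuracies reported in the abstract
reported = {"es": 0.85, "en": 0.73}
overall = average_accuracy(reported)
print(round(overall, 2))
```

Rounded to two decimals, the result matches the 0.79 overall accuracy reported for the full test set.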

When: On 22nd October at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20211022T093000Z

Paper: http://ceur-ws.org/Vol-2936/paper-189.pdf

HaMor To Profile Hate Speech Spreaders

Post author By Simona Frenda
Post date 09/27/2021

Mirko Lai and Marco A. Stranisci will present an innovative approach that takes into account the morality and communicative behaviour of the users to profile hate speech spreaders online.

Title: HaMor at the Profiling Hate Speech Spreaders on Twitter

In this talk, they will describe the Hate and Morality (HaMor) submission for the Profiling of Hate Speech Spreaders on Twitter, the shared task at PAN 2021.
HaMor ranked 19th out of 66 participating teams, with an average accuracy of 73% over the two languages.
This approach obtained the 43rd highest accuracy for English (62%) and the 2nd highest accuracy for Spanish (84%).
In particular, it involves four types of features that help the system to infer users' attitudes just from their messages: hate speech detection, users' morality, named entities, and communicative behaviour.
The results of their experiments are promising and will lead to future investigations of these features from a finer-grained perspective.

When: On 5th November at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20211022T093000Z

Paper: http://ceur-ws.org/Vol-2936/paper-178.pdf

WordUp! at VaxxStance 2021

Post author By Simona Frenda
Post date 09/27/2021

Combining Contextual Information with Textual and Dependency-Based Syntactic Features for Stance Detection

Mirko Lai and Alessandra T. Cignarella will present an innovative approach to detect stance online, proposed in the frame of the VaxxStance shared task.

Title: WordUp! at VaxxStance 2021: Combining Contextual Information with Textual and Dependency-Based Syntactic Features for Stance Detection

In this talk, they will describe the participation of the WordUp! team in the VaxxStance shared task at IberLEF 2021. The goal of the competition is to determine the author’s stance from tweets written in both Spanish and Basque on the topic of the Antivaxxers movement. Their approach, in the four different tracks proposed, combines a Logistic Regression classifier with diverse groups of features: stylistic, tweet-based, user-based, lexicon-based, dependency-based, and network-based. The outcomes of their experiments are in line with state-of-the-art results on other languages, proving the efficacy of combining methods derived from NLP and Network Science for detecting stance in Spanish and Basque.

When: On 8th October at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20210702T093000Z?from_login=true

Paper: http://ceur-ws.org/Vol-2943/vaxx_paper3.pdf

Whose Opinions Matter?

Post author By Simona Frenda
Post date 06/30/2021

Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection.

Sohail Akhtar will present an in-depth study of the novel approaches to detect hate speech focusing on the development of approaches to leverage fine-grained knowledge derived from the annotations of individual annotators.

Title: Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection.

Hate Speech (HS) is a form of abusive language and its detection on social media platforms is a rather difficult but important task. The sudden rise in hate speech related incidents on social media is considered a major issue. The technologies being developed for HS detection mainly employ supervised machine learning approaches in Natural Language Processing (NLP). Training such models requires data manually annotated by humans, either by paid crowd-sourced workers or by domain experts, for training and benchmarking purposes.

Because abusive language is subjective in nature, there might be highly polarizing topics or events involved in the annotation of abusive content such as HS. Therefore, novel approaches are required to model the conflicting perspectives and opinions coming from people with different personal and demographic backgrounds, which raise issues concerning the quality of the annotation itself and might also impact the gold standard data used to train NLP models. The annotators might also show different sensitivity levels towards particular forms of hate, which results in low inter-annotator agreement. The online platforms used for HS annotation do not provide any background information about the annotators, and the views and personal opinions of the victims of online hate are often ignored in HS detection tasks.

In this talk, he will present an in-depth study of the novel approaches to detect various forms of abusive language against minorities. The work is focused on developing approaches to leverage fine-grained knowledge derived from the annotations of individual annotators, before a gold standard is created in which the subjectivity of the annotators is averaged out.

The research work aimed at developing approaches to model the polarized opinions coming from different communities under the hypothesis that similar characteristics (ethnicity, social background, culture etc.) can influence the perspectives of the annotators on a certain phenomenon and based on such information, they can be grouped together.

The intuition is that, by relying on such information, it is possible to divide the annotators into separate groups. Based on this grouping, separate gold standards are created for the individual groups to train state-of-the-art deep learning models for abusive language detection. Additionally, an ensemble approach is implemented to combine the perspective-aware classifiers from different groups into an inclusive model.
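The grouping step can be illustrated with a small sketch in Python. It assumes binary labels, a strict majority vote within each group, and an "any group flags it" rule for the combined decision; none of these details are stated in the abstract, and the annotator names, group names and labels below are hypothetical.

```python
def group_gold_standards(annotations, groups):
    """Build one gold standard per annotator group by strict majority vote.

    annotations: annotator id -> list of binary labels, one per instance
    groups: group name -> list of annotator ids
    """
    gold = {}
    for group, members in groups.items():
        votes_per_instance = zip(*(annotations[m] for m in members))
        # An instance is positive only if more than half of the group says so
        gold[group] = [
            1 if 2 * sum(votes) > len(votes) else 0
            for votes in votes_per_instance
        ]
    return gold


def inclusive_vote(labels_by_group):
    # Assumed ensemble rule: positive if at least one group's label is positive
    return [1 if any(column) else 0 for column in zip(*labels_by_group.values())]


# Hypothetical annotations for four instances (1 = abusive, 0 = not abusive)
annotations = {
    "a1": [1, 0, 1, 0], "a2": [1, 0, 1, 1], "a3": [1, 1, 0, 0],
    "b1": [0, 0, 1, 1], "b2": [0, 1, 1, 1], "b3": [0, 0, 0, 1],
}
groups = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2", "b3"]}

gold = group_gold_standards(annotations, groups)
combined = inclusive_vote(gold)
print(gold, combined)
```

In the described work the ensemble combines the predictions of classifiers trained on each group's gold standard; here the group labels themselves stand in for those predictions to keep the example self-contained.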

The research proposed a novel resource, a multi-perspective English language dataset annotated according to different sub-categories relevant for characterizing online abuse: HS, aggressiveness, offensiveness and stereotype. Unlike previous work, where the annotations were based on crowd-sourcing, the study involved the victims of targeted communities in the annotation process, who volunteered to annotate the dataset, providing a natural selection of the annotator groups based on their personal characteristics. These annotators come from different cultural, social and demographic backgrounds, and one of the annotator groups involves members of the targeted communities.

By training state-of-the-art deep learning models on this novel resource, the results showed that the proposed approach improves the prediction performance of a state-of-the-art supervised classifier.

Moreover, the work includes an in-depth qualitative analysis of the novel dataset, examining individual tweets to identify and understand the topics and events causing polarization among the annotators. The analysis showed that the keywords (unigram features) are indeed strongly linked with and influenced by the culture, religion and demographic background of the annotators.

When: On 2nd July at 11.30 am

Where: https://unito.webex.com/webappng/sites/unito/meeting/info/910eaf7ad0534d1ba92c5dde0a66a9a7_20210702T093000Z?from_login=true