Bootstrapping UMRs from UD for Scalable Multilingual Annotation

CCC Seminar by Federica Gamba – PhD student in Computational Linguistics, Charles University, Czech Republic

Abstract

Uniform Meaning Representation (UMR) offers a cross-linguistically applicable framework for capturing sentence- and document-level semantics, but producing UMR annotations from scratch is a time-intensive process. In this talk, I will present an approach for bootstrapping UMR graphs by leveraging Universal Dependencies (UD), a richly annotated multilingual syntactic resource covering a wide range of language families. I will describe how structural correspondences between UD and UMR can be exploited to automatically derive partial UMR graphs from UD trees, providing annotators with an initial representation to refine rather than create from scratch. While UD is not inherently semantic, it encodes syntactic information that maps well onto UMR structures, allowing us to extract meaningful correspondences that simplify annotation. This method not only reduces annotation effort but also facilitates scalable UMR creation across typologically diverse languages, aligning with UMR’s cross-linguistic design goals.

When: 12/11/25, 9:00

Where: Sala Conferenze – 3rd floor