Vol-XXX urn:nbn:de:0074-XXX-C Copyright © 2025 for the individual papers by the papers’ authors. Copyright © 2025 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
CLiC-it 2025
Italian Conference on Computational Linguistics
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Cagliari, Italy, September 24-26, 2025.
Edited by
Cristina Bosco *
Elisabetta Jezek **
Marco Polignano ***
Manuela Sanguinetti ****
* University of Torino, Italy
** University of Pavia, Italy
*** University of Bari “Aldo Moro”, Italy
**** University of Cagliari, Italy
Table of Contents
-
Preface
Summary: 123 papers were submitted to this conference for peer review. Of these, 109 were accepted for this volume, all 109 as regular papers and 0 as short papers.
Main Conference, long papers
-
Contemporary Voices in Ancient Tongue: Integrating Papal Encyclicals into the LiLa KB
Aurora Alagni, Federica Iurescia and Eleonora Litta -
Ellipsis in Enhanced Dependencies: A Case Study on Latin
Lisa Sophie Albertelli, Lorenzo Augello, Giulia Calvi, Annachiara Clementelli, Federica Iurescia and Claudia Corbetta -
Low- vs High-level Lemmatization for Historical Languages. A Case Study on Italian
Chiara Alzetta and Simonetta Montemagni -
PeRAG: Multi-Modal Perspective-Oriented Verbalization with RAG for Inclusive Decision Making
Muhammad Saad Amin, Horacio Jesús Jarquín-Vásquez, Simona Lo Giudice, Franco Sansonetti, Valerio Basile and Viviana Patti -
Beyond Raw Text: Knowledge-Augmented Italian Relation Extraction with Large Language Models
Gianmaria Balducci, Elisabetta Fersini and Enza Messina -
When Figures Speak with Irony: Investigating the Role of Rhetorical Figures in Irony Generation with LLMs
Pier Felice Balestrucci, Michael Oliverio, Soda Marem Lo, Luca Anselma, Valerio Basile, Cristina Bosco, Alessandro Mazzei and Viviana Patti -
BLiMP-IT: Harnessing Automatic Minimal Pair Generation for Italian Language Model Evaluation
Matilde Barbini, Maria Letizia Piccini Bianchessi, Veronica Bressan, Achille Fusco, Sarah Rossi, Sofia Neri, Tommaso Sgrizzi and Cristiano Chesi -
Do LLMs Authentically Represent Affective Experiences of People with Disabilities on Social Media?
Marco Bombieri, Simone Paolo Ponzetto and Marco Rospocher -
LLMs Struggle on Explicit Causality in Italian
Alessandro Bondielli, Martina Miliani, Luca Paglione, Serena Auriemma, Lucia C. Passaro and Alessandro Lenci -
Sparse Autoencoders Find Partially Interpretable Features in Italian Small Language Models
Alessandro Bondielli, Lucia C. Passaro and Alessandro Lenci -
A Novel Real-World Dataset of Italian Clinical Notes for NLP-based Decision Support in Low Back Pain Treatment
Agnese Bonfigli, Ruben Piperno, Luca Bacco, Felice Dell’Orletta, Dominique Brunato, Filippo Crispino, Giuseppe Francesco Papalia, Fabrizio Russo, Gianluca Vadalà, Rocco Papalia, Mario Merone and Leandro Pecchia -
MakeItSample: A Python Library for Generating Typological Language Samples Based on the Diversity Value Metric
Luca Brigada Villa -
The OuLiBench Benchmark: Formal Constraints as a Lens into LLM Linguistic Competence
Silvio Calderaro, Alessio Miaschi and Felice Dell’Orletta -
BES4RAG: A Framework for Embedding Model Selection in Retrieval-Augmented Generation
Lorenzo Canale, Stefano Scotta, Alberto Messina and Laura Farinetti -
BAMBI Goes to School: Evaluating Italian BabyLMs with Invalsi-ITA
Luca Capone, Alice Suozzi, Gianluca Lebani and Alessandro Lenci -
Arbuli Sunnu: A Sicilian-Italian Parallel Treebank
Caterina Maria Cappello, Sabrina D’Ali, Mario Guglielmetti, Elisa Di Nuovo and Cristina Bosco -
A BERT-Based Approach for Part-of-Speech Tagging in the Low-Resource Context of Sardinian
Salvatore Mario Carta, Filippo Concas, Gianni Fenu, Alessandro Giuliani, Marco Manolo Manca, Mirko Marras, Piergiorgio Mura and Simone Pisano -
A Hypothesis-Driven Framework for Detecting Lexical Semantic Change
Pierluigi Cassotti and Nina Tahmasebi -
Balancing Translation Quality and Environmental Impact: Comparing Large and Small Language Models
Antonio Castaldo, Petra Giommarelli and Johanna Monti -
On the Impact of Hate Speech Synthetic Data on Model Fairness
Camilla Casula and Sara Tonelli -
Classifying Gas Pipe Damage Descriptions in Low-Diversity Corpora
Luca Catalano, Federico D’Asaro, Michele Pantaleo, Prima Acharjee, Minal Jamshed, Nicola Giulietti, Eugenio Fossat and Giuseppe Rizzo -
Knowledge-Grounded Detection of Factual Hallucinations in Large Language Models
Cristian Ceccarelli, Alessandro Raganato and Marco Viviani -
Benchmarking Historical Phase Recognition from Text and Events
Fabio Celli and Marco Rovera -
Ontology-Guided Domain Entity Recognition in Environmental Texts: Evaluating Syntax-Driven and LLM Approaches Using BabelNet and GEMET
Elisa Chierchiello, Patricia Chiril, Cristina Bosco and Adriana S. Pagano -
Crossword Space: Latent Manifold Learning for Italian Crosswords and beyond
Cristiano Ciaccio, Gabriele Sarti, Alessio Miaschi and Felice Dell’Orletta -
Veras Audire Et Reddere Voces: A Corpus of Prosodically-Correct Latin Poetic Audio from Large-Language-Model TTS
Michele Ciletti -
A Tough Hoe to Row: Instruction Fine-Tuning LLaMA 3.2 for Multilingual Idiom Processing
Debora Ciminari and Alberto Barrón-Cedeño -
Semantic Priming in GPT: Investigating LLMs through a Cognitive Psychology Lens
Filippo Colombi and Carlo Strapparava -
Verso La Valutazione Automatizzata Dell’italiano L2: ETET Tra LLM E Tecnologie Vocali
Claudia Roberta Combei, Anna Vignoli and Francesco Zappulla -
What Is Better for Syntactic Parsing? A Comparison between Supervised and Unsupervised Models on Dante and Cavalcanti
Claudia Corbetta, Anna Erminia Colombi, Giovanni Moretti and Marco Passarotti -
Segmenting Italian Sentences for Easy Reading
Marta Cozzini and Horacio Saggion -
Unveiling Stereotypes: Combining Knowledge Graphs and LLMs for Implied Stereotype Generation
Marco Cuccarini, Lia Draetta, Beatrice Fiumanò, Stefano Bistarelli, Rossana Damiano and Valentina Presutti -
Detecting Semantic Intertextuality in Ancient Greek Literature: A Computational Approach
Caterina D’Angelo, Andrea Taddei and Alessandro Lenci -
WorthIt: Check-worthiness Estimation of Italian Social Media Posts
Agnese Daffara, Alan Ramponi and Sara Tonelli -
Diffusion-Aided RAG: Elevating Dense-Retrieval Chatbots via Graph-Based Diffusion Reranking
Sai Teja Dampanaboina, Sai Nishchal Gamini, Karishma Kunwar, Marco Polignano, Marco Levantesi, Giovanni Semeraro and Ernesto William De Luca -
When Less Is More? Diagnosing ASR Predictions in Sardinian via Layer-Wise Decoding
Domenico De Cristofaro, Alessandro Vietti, Aleese Block and Marianne Pouplier -
Meta-evaluation of Automatic Machine Translation Metrics between Italian and a Minor Language Variety of German
Paolo Di Natale, Elena Chiocchetti and Egon Waldemar Stemle -
Linking Emotions: Affective and Lexical Resources for Italian in Linked Open Data
Eliana Di Palma, Valerio Basile, Agnese Vardanega, Giuliano Gabrieli and Marco Vassallo -
Dissonant Ballerinas and Crafty Carrots: A Comparative Multi-modal Analysis of Italian Brain Rot
Anca Dinu, Andra-Maria Florescu, Marius Micluța-Câmpeanu, Ștefana Arina Tăbușcă, Claudiu Creangă and Andreiana Mihail -
The Role of Eye-Tracking Data in Encoder-Based Models: An In-depth Linguistic Analysis
Lucia Domenichelli, Luca Dini, Dominique Brunato and Felice Dell’Orletta -
Seeing Cause and Time: A Visually Grounded Evaluation of Multimodal Models
Salvatore Ergoli, Alessandro Bondielli and Alessandro Lenci -
Mapping Meaning in Latin with Large Language Models: A Multi-Task Evaluation of Preverbed Motion Verbs and Spatial Relation Detection in LLMs
Andrea Farina, Andrea Ballatore and Barbara McGillivray -
MAMITA: Benchmarking Misogyny in Italian Memes
Elisabetta Fersini, Francesca Gasparini, Giulia Rizzi and Aurora Saibene -
LEARN: On the Feasibility of Learner Error AutoRegressive Neural Annotation
Paolo Gajo, Daniele Polizzi, Adriano Ferraresi and Alberto Barrón-Cedeño -
The Meaning of Beatus: Disambiguating Latin with Contemporary AI Models
Eleonora Ghizzota, Pierpaolo Basile, Lucia Siciliani and Giovanni Semeraro -
LLMike: Exploring Large Language Models’ Abilities in Wheel of Fortune Riddles
Ejdis Gjinika, Nicola Arici, Andrea Loreggia, Luca Putelli, Ivan Serina and Alfonso Emilio Gerevini -
Bidirectional Emotional Influence in Human–LLM Interaction: Empirical Analysis and Methodological Framework
Manuel Gozzi and Francesca Fallucchi -
A Multilingual Investigation of Anthropocentrism in GPT-4o
Francesca Grasso and Stefano Locci -
Surprisal and Crossword Clues Difficulty: Evaluating Linguistic Processing between LLMs and Humans
Tommaso Iaquinta, Kamyar Zeinalipour, Achille Fusco, Asya Zanollo and Cristiano Chesi -
AI-Driven Resume Analysis and Enhancement Using Semantic Modeling and Large Language Feedback Loops
Achal Jagadeesh, Chinmayi Ravi Shankar, Sahithya Narayanaswamy Patel, Marco Polignano, Marco Levantesi, Giovanni Semeraro and Ernesto William De Luca -
DIETA: A Decoder-only Transformer-based Model for Italian-English Machine TrAnslation
Pranav Kasela, Marco Braga, Alessandro Ghiotto, Andrea Pilzer, Marco Viviani and Alessandro Raganato -
Positional Bias in Binary Question Answering: How Uncertainty Shapes Model Preferences
Tiziano Labruna, Simone Gallo and Giovanni Da San Martino -
Is It Still a Village? Tracing Grammaticalization with Word Embeddings
Joseph E. Larson and Patrícia Amaral -
MedBench-IT: A Comprehensive Benchmark for Evaluating Large Language Models on Italian Medical Entrance Examinations
Ruggero Marino Lazzaroni, Alessandro Angioi, Michelangelo Puliga, Davide Sanna and Roberto Marras -
MuLTa-Telegram: A Fine-Grained Italian and Polish Dataset for Hate Speech and Target Detection
Elisa Leonardelli, Camilla Casula, Sebastiano Vecellio Salto, Joanna Ewa Bąk, Elisa Muratore, Anna Kołos, Thomas Louf and Sara Tonelli -
Linking CompL-it to the LiITA Knowledge Base
Eleonora Litta, Marco Passarotti, Giovanni Moretti, Paolo Brasolin, Valerio Basile, Cristina Bosco, Andrea Di Fabio, Eliana Di Palma, Emiliano Giovannetti, Simone Marchi, Andrea Bellandi and Flavia Sciolette -
Subjectivity in Stereotypes against Migrants in Italian: An Experimental Annotation Procedure
Soda Marem Lo, Marco Antonio Stranisci, Alessandra Teresa Cignarella, Simona Frenda, Valerio Basile, Cristina Bosco, Elisabetta Jezek and Viviana Patti -
Doing Things with Words: Rethinking Theory of Mind Simulation in Large Language Models
Agnese Lombardi and Alessandro Lenci -
BeaverTails-IT: Towards a Safety Benchmark for Evaluating Italian Large Language Models
Giuseppe Magazzù, Alberto Sormani, Giulia Rizzi, Francesca Pulerà, Daniel Scalena, Stefano Cariddi, Edoardo Michielon, Marco Pasqualini, Claudio Stamile and Elisabetta Fersini -
A Leaderboard for Benchmarking LLMs on Italian
Bernardo Magnini, Marco Madeddu, Michele Resta, Roberto Zanoli, Martin Cimmino, Paolo Albano and Viviana Patti -
Towards the Semi-Automated Population of the Ancient Greek WordNet
Beatrice Marchesi, Annachiara Clementelli, Andrea Maurizio Mammarella, Silvia Zampetta, Erica Biagetti, Luca Brigada Villa, Virginia Mastellari, Riccardo Ginevra, Claudia Roberta Combei and Chiara Zanchi -
Evaluating Large Language Models on Wikipedia Graph Navigation: Insights from the WikiGame
Daniele Margiotta, Danilo Croce and Roberto Basili -
Linguistic Markers of Population Replacement Conspiracy Theories in YouTube Immigration Discourse
Erik Bran Marino, Davide Bassi and Renata Vieira -
Exploring the Adaptability of Large Speech Models to Non-Verbal Vocalization Task
Juan José Márquez Villacís, Federico D’Asaro, Giuseppe Rizzo and Andrea Bottino -
Strategic Conversations: LLMs Argumentation and User Perception in Movie Recommendation Dialogues
Valeria Mauro, Martina Di Bratto, Valentina Russo, Azzurra Mancini and Marco Grazioso -
Uni-Mate: A Retrieval-Augmented Generation System to Provide High School Students with Accurate Academic Guidance
Samuele Mazzei, Lorenzo Zambotto, Gabriele Tealdo, Alberto Macagno and Alessio Palmero Aprosio -
Language Models and the Magic of Metaphor: A Comparative Evaluation with Human Judgments
Simone Mazzoli, Alice Suozzi and Gianluca Lebani -
Easy to Complete, Hard to Choose: Investigating LLM Performance on the ProverbIT Benchmark
Enrico Mensa, Lorenzo Zane, Calogero Jerik Scozzaro, Matteo Delsanto, Tommaso Milani and Daniele P. Radicioni -
Mamma Mia! Where’s My Name? De-Identifying Italian Clinical Notes with Large Language Models
Michele Miranda, Sébastien Bratières, Stefano Patarnello and Livia Lilli -
Sustainable Italian LLM Evaluation: Community Perspectives and Methodological Guidelines
Luca Moroni, Gianmarco Pappacoda, Edoardo Barba, Simone Conia, Andrea Galassi, Bernardo Magnini, Roberto Navigli, Paolo Torroni and Roberto Zanoli -
What We Learned from Continually Training Minerva: A Case Study on Italian
Luca Moroni, Tommaso Bonomo, Luca Gioffré, Lu Xu, Domenico Fedele, Leonardo Colosi, Andrei Stefan Bejgu, Alessandro Scirè and Roberto Navigli -
Automatic GRI–SDG Annotation and LLM-Based Filtering for Sustainability Reports
Seyed Alireza Mousavian Anaraki, Danilo Croce and Roberto Basili -
Benchmarking Large Language Models for Target-Based Financial Sentiment Analysis
Iftikhar Muhammad, Marco Rospocher, Timotej Knez and Slavko Žitnik -
Is Multimodality Still Required for Multimodal Machine Translation? A Case Study on English and Italian
Elio Musacchio, Lucia Siciliani, Pierpaolo Basile and Giovanni Semeraro -
Extending Italian Large Language Models for Vision-language Tasks
Elio Musacchio, Lucia Siciliani, Pierpaolo Basile, Asia Beatrice Uboldi, Giovanni Germani and Giovanni Semeraro -
Probing Feminist Representations: A Study of Bias in LLMs and Word Embeddings
Arianna Muti, Elisa Bassignana, Emanuele Moscato and Debora Nozza -
Multilingual vs. Monolingual Transformer Models in Encoding Linguistic Structure and Lexical Abstraction
Vivi Nastase, Giuseppe Samo, Chunyang Jiang and Paola Merlo -
Direct and Indirect Interpretations of Speech Acts: Evidence from Human Judgments and Large Language Models
Massimiliano Orsini and Dominique Brunato -
Narrative Conflicts: A Tri-Modal Computational Analysis of Antagonism in Shakespeare’s Julius Caesar
Aylin Özkan, İrem Ertaş, Mehmet Can Yavuz and Lucia Cascone -
FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian
Sara Papi, Marco Gaido, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih Ali Mohamed Nawar and Matteo Negri -
Generating and Evaluating Multi-Level Text Simplification: A Case Study on Italian
Michele Papucci, Giulia Venturi and Felice Dell’Orletta -
Analyzing Femicide Reactions in YouTube Comments: A Comparative Study of Giulia Cecchettin and Carol Maltesi
Sveva Silvia Pasini, Marco Madeddu, Chiara Ferrando, Chiara Zanchi and Viviana Patti -
Gender-Neutral Rewriting in Italian: Models, Approaches, and Trade-offs
Andrea Piergentili, Beatrice Savoldi, Matteo Negri and Luisa Bentivogli -
Doctor, Is That You? Evaluating Large Language Models on Italy’s Medical School Entrance Exams
Ruben Piperno, Agnese Bonfigli, Felice Dell’Orletta, Leandro Pecchia, Mario Merone and Luca Bacco -
A Modular LLM-based Dialog System for Accessible Exploration of Finite State Automata
Stefano Vittorio Porta, Pier Felice Balestrucci, Michael Oliverio, Luca Anselma and Alessandro Mazzei -
Leveraging LLMs to Build a Semi-synthetic Dataset for Legal Information Retrieval: A Case Study on the Italian Civil Code and GPT4-o
Mattia Proietti, Lucia C. Passaro and Alessandro Lenci -
Improving Reasoning beyond English through Structured Abstraction and Self-Refinement
Federico Ranaldi and Leonardo Ranaldi -
Evaluating Models, Prompting Strategies, and Task Formats: A Case Study on the MACID Challenge
Matteo Rinaldi, Rossella Varvara, Lorenzo Gregori and Andrea Amelio Ravelli -
Gender Violence in Numbers: Prompting Italian LLMs to Characterize Crimes against Women
Giulia Rizzi, Daniel Scalena and Elisabetta Fersini -
Uncovering Unsafety Traits in Italian Language Models
Giulia Rizzi, Giuseppe Magazzù, Alberto Sormani, Francesca Pulerà, Daniel Scalena and Elisabetta Fersini -
Can LLMs Help Recollect and Elaborate on Our Personal Experiences?
Gabriel Roccabruna, Olha Khomyn, Michele Yin and Giuseppe Riccardi -
IMB: An Italian Medical Benchmark for Question Answering
Antonio Romano, Giuseppe Riccio, Mariano Barone, Marco Postiglione and Vincenzo Moscato -
Acquisition in Babies and Machines: Comparing the Learning Trajectories of LMs in Terms of Syntactic Structures (ATTracTSS Test Set)
Sarah Rossi, Guido Formichi, Sofia Neri, Tommaso Sgrizzi, Asya Zanollo, Veronica Bressan and Cristiano Chesi -
Context-Aware Search Space Adaptation of Hyperparameters and Architectures for AutoML in Text Classification
Parisa Safikhani and David Broneske -
Evaluating Linguistic Speaker Profiles on Response Selection in Multi-Party Dialogue
Maryam Sajedinia, Seyed Mahed Mousavi and Valerio Basile -
MLLMs Construction Company: Investigating Multimodal LLMs’ Communicative Skills in a Collaborative Building Task
Marika Sarzotti, Giovanni Duca, Chris Madge, Raffaella Bernardi and Massimo Poesio -
Toward Optimised Datasets to Fine-tune ASR Systems Leveraging Less but More Informative Speech
Loredana Schettino, Vincenzo Norman Vitale and Alessandro Vietti -
Using End-to-End Automatic Speech Recognizers’ Internals to Model Disfluencies in Italian Patients with Early-stage Parkinson’s Disease
Loredana Schettino, Vincenzo Norman Vitale and Marta Maffia -
Structural Sensitivity Does Not Entail Grammaticality: Assessing LLMs against the Universal Functional Hierarchy
Tommaso Sgrizzi, Asya Zanollo and Cristiano Chesi -
Evaluation of Italian and English Small Language Models for Domain-based QA in Low-Resource Scenario
Irene Siragusa and Roberto Pirrone -
Annotating Manzoni: Challenges in the Annotation of Lemmas, POS and Features in “I Promessi Sposi”
Rachele Sprugnoli and Arianna Redaelli -
Ciallabacialla! Modeling and Linking a Regional Lexical Resource to Include Sicilian in the Semantic Web
Rachele Sprugnoli, Giovanni Moretti, Domenico Giuseppe Muscianisi and Eleonora Litta -
Curated Data Does Not Mean Representative Data When Training Large Language Models: An Experiment Using Representative Data for Italian
Fabio Tamburini -
BABILong-ITA: A New Benchmark for Testing Large Language Models Effective Context Length and a Context Extension Method
Fabio Tamburini -
MAIA: A Benchmark for Multimodal AI Assessment
Davide Testa, Giovanni Bonetta, Raffaella Bernardi, Alessandro Bondielli, Alessandro Lenci, Alessio Miaschi, Lucia C. Passaro and Bernardo Magnini -
Building It-tok: An Italian TikTok Corpus
Luisa Troncone -
Protecting the Privacy in Velvet with Model Editing
Giancarlo A. Xompero, Elena Sofia Ruzzetti, Cristina Giannone, Andrea Favalli, Raniero Romagnoli and Fabio Massimo Zanzotto -
“I Understand, but…”: Towards a Comprehensive Account of the Explainee’s Voice in Explanatory Dialogues
Andrea Zaninello, Petar Bodlovic, Marcin Lewinski and Bernardo Magnini -
PharmaER.IT: An Italian Dataset for Entity Recognition in the Pharmaceutical Domain
Andrea Zugarini and Leonardo Rigutini
2025-09-23: submitted by Andrea Zaninello, metadata incl. bibliographic data published under Creative Commons CC0
yyyy-mm-dd: published on CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073)