Automating Gender-Inclusive Language Modification in Italian University Administrative Documents

Times cited: 0
Authors
Cerabolini, Aurora [1]
Pasi, Gabriella [1]
Viviani, Marco [1]
Affiliations
[1] Univ Milano Bicocca, Dipartimento Informat Sistemist & Comunicaz DISCo, Informat & Knowledge Representat Retrieval & Reas, Viale Sarca 336, I-20126 Milan, Italy
Source
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT I, NLDB 2024 | 2024 / Vol. 14762
Keywords
Gender Bias; Inclusive Language; Natural Language Processing (NLP); Large Language Models (LLMs); ChatGPT
DOI
10.1007/978-3-031-70239-6_23
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we address the automatic identification of non-inclusive language in the administrative documents of Italian universities, together with the generation of gender-inclusive corrections. To this end, data were gathered from various Italian universities and used to build a dictionary of potentially non-inclusive terms and a dataset pairing gender non-inclusive sentences with their inclusive versions. Three approaches were then defined and evaluated: one rule-based and two neural. The rule-based approach uses Italian Part-of-Speech tagging, dependency parsing, and morphologization techniques to detect masculine trigger words within sentences, determine whether they function as generic masculine terms, and offer gender-inclusive alternatives. The two neural approaches rely on the mT5 model and ChatGPT, respectively, and the rewritten sentences they generated were compared. The experimental evaluations suggest that the proposed solutions are effective.
Pages: 333-347
Number of pages: 15