Multilingual Controllable Transformer-Based Lexical Simplification

Times cited: 0
Authors
Sheang, Kim Cheng [1]
Saggion, Horacio [1]
Affiliations
[1] Univ Pompeu Fabra, LaSTUS Grp, TALN Lab, DTIC, Barcelona, Spain
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2023 / Issue 71
Keywords
Multilingual Lexical Simplification; Controllable Lexical Simplification; Text Simplification; Multilinguality;
DOI
10.26342/2023-71-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text is by far the most ubiquitous source of knowledge and information and should be made easily accessible to as many people as possible; however, texts often contain complex words that hinder reading comprehension and accessibility. Therefore, suggesting simpler alternatives for complex words without compromising meaning would help convey the information to a broader audience. This paper proposes mTLS, a multilingual controllable Transformer-based Lexical Simplification (LS) system fine-tuned with the T5 model. The novelty of this work lies in the use of language-specific prefixes, control tokens, and candidates extracted from pretrained masked language models to learn simpler alternatives for complex words. The evaluation results on three well-known LS datasets (LexMTurk, BenchLS, and NNSEval) show that our model outperforms previous state-of-the-art models such as LSBert and ConLS. Further evaluation of our approach on part of the recent TSAR-2022 multilingual LS shared-task dataset shows that our model performs competitively with the participating systems for English LS and even outperforms the GPT-3 model on several metrics. Our model also obtains performance gains for Spanish and Portuguese.
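The abstract names three input ingredients: a language-specific prefix, control tokens, and candidates taken from a masked language model. The following is a minimal sketch of how such an input string could be assembled for a text-to-text model. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the function name `build_input`, the `simplify <lang>:` prefix, the `<NAME_value>` control-token format, the `[T]...[/T]` markers around the complex word, and the `| candidates:` separator.

```python
def build_input(sentence, complex_word, lang, controls, candidates):
    """Assemble one source string for a text-to-text model.

    The layout is hypothetical: it only illustrates combining a language
    prefix, control tokens, and candidate substitutes in a single input.
    """
    # Mark the first occurrence of the complex word (hypothetical markers).
    marked = sentence.replace(complex_word, f"[T]{complex_word}[/T]", 1)
    # Render each control as a token such as <WL_0.80> (hypothetical format).
    control_part = " ".join(
        f"<{name}_{value:.2f}>" for name, value in controls.items()
    )
    # Candidates would come from a masked language model; here they are given.
    candidate_part = ", ".join(candidates)
    return f"simplify {lang}: {control_part} {marked} | candidates: {candidate_part}"


example = build_input(
    sentence="The cat perched on the mat.",
    complex_word="perched",
    lang="en",
    controls={"WL": 0.8},
    candidates=["sat", "rested"],
)
print(example)
```

In a real system the resulting string would be tokenized and passed to the fine-tuned model; that step is omitted here because the record does not describe it.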
Pages: 109 - 123
Number of pages: 15