Improving DRS-to-Text Generation Through Delexicalization and Data Augmentation

Cited by: 0
Authors
Amin, Muhammad Saad [1]
Anselma, Luca [1]
Mazzei, Alessandro [1]
Affiliation
[1] Univ Turin, Dept Comp Sci, Turin, Italy
Source
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT I, NLDB 2024 | 2024, Vol. 14762
Keywords
Delexicalization; Data augmentation; Discourse representation structure; Formal meaning representation; Neural DRS-to-Text generation; Super senses
DOI
10.1007/978-3-031-70239-6_9
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text generation from Discourse Representation Structure (DRS) is a complex logic-to-text generation task in which lexical information, in the form of logical concepts, is translated into its corresponding textual representation. Delexicalization is the process of removing lexical information from the data, which helps the model produce textual sequences more robustly by focusing on the semantic structure of the input rather than its exact lexical content. Delexicalization is even harder to implement for DRS-to-Text generation, where the lexical entities are anchored using WordNet synsets and the thematic roles are sourced from VerbNet. In this paper, we introduce novel procedures to selectively delexicalize proper nouns and common nouns. For data transformations, we propose two types of lexical abstraction: (1) WordNet supersense-based, contextually categorized abstraction; and (2) abstraction based on the lexical category associated with named entities and nouns. We present a range of experiments evaluating the delexicalization hypotheses in the DRS-to-Text generation task using state-of-the-art neural sequence-to-sequence models. We also explore data augmentation through delexicalization, evaluating test sets under different abstraction methodologies, i.e., with and without supersenses. Our experimental results show that delexicalization improves model generalizability compared with fully lexicalized DRS-to-Text generation. Delexicalization resulted in improved translation quality, with a significant increase in evaluation scores.
Pages: 121-136
Page count: 16