Augmenting Low-Resource Cross-Lingual Summarization with Progression-Grounded Training and Prompting

Cited by: 0
Authors
Ma, Jiu Shun [1 ]
Huang, Yuxin [1 ]
Wang, Linqin [2 ]
Huang, Xiang [3 ]
Peng, Hao [3 ]
Yu, Zhengtao [1 ]
Yu, Philip [4 ]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming, Peoples R China
[2] Kunming Univ Sci & Technol, Kunming, Peoples R China
[3] Beihang Univ, Beijing, Peoples R China
[4] Univ Illinois, Chicago, IL USA
Funding
National Natural Science Foundation of China
Keywords
CLS; pretrain-plus-finetune paradigm; low-resource languages; progressive training; reinforcement learning; discrete prompts
DOI
10.1145/3675167
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Cross-lingual summarization (CLS), generating summaries in one language from source documents in another, helps make information accessible worldwide. State-of-the-art neural summarization models typically train or fine-tune language models on large-scale corpora, which is difficult to achieve in realistic low-resource scenarios due to the lack of domain-specific annotated data. In this article, we present a novel cross-lingual summarization model that addresses low-resource CLS through a two-pronged approach: progressive training with mBART and reinforcement learning to optimize discrete prompts. During training, we introduce a progressive approach based on mBART that allows the pre-trained model to gradually acquire the ability to compress information, develop cross-lingual capabilities, and ultimately adapt to specific summarization tasks. During downstream summarization, we pair the pre-trained model with discrete prompts optimized by reinforcement learning to achieve low-resource cross-lingual summarization. Experimental results on four cross-lingual summarization datasets demonstrate state-of-the-art performance over six baselines in low-resource scenarios.
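The abstract describes optimizing discrete prompts with reinforcement learning. As a rough illustration of the general idea only (not the paper's actual algorithm), the sketch below runs a minimal REINFORCE-style search over a handful of candidate prompt strings against a mock scalar reward; in a real system the reward would come from a summarization metric such as ROUGE on the generated cross-lingual summaries, and every name and value here is hypothetical.

```python
import math
import random

def optimize_discrete_prompt(candidates, reward_fn, steps=500, lr=0.1, seed=0):
    """REINFORCE-style search over discrete prompt candidates.

    Maintains one logit per candidate prompt; each step samples a prompt
    from the softmax over logits, observes a scalar reward (a stand-in
    for e.g. ROUGE of the summary generated with that prompt), and
    raises the logit of prompts that beat a running-average baseline.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    baseline = 0.0  # exponential moving average of observed rewards
    for _ in range(steps):
        # softmax over current logits (shifted by max for stability)
        m = max(logits)
        weights = [math.exp(x - m) for x in logits]
        total = sum(weights)
        probs = [w / total for w in weights]
        i = rng.choices(range(len(candidates)), weights=probs)[0]
        reward = reward_fn(candidates[i])
        baseline = 0.9 * baseline + 0.1 * reward
        advantage = reward - baseline
        # policy-gradient update: d log pi(i) / d logit_j = 1{j==i} - probs[j]
        for j in range(len(logits)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * advantage * grad
    return candidates[max(range(len(candidates)), key=lambda j: logits[j])]

# toy setup: a fixed reward table plays the role of the summarization metric
prompts = ["Summarize:", "Summarize in German:", "TL;DR"]
mock_reward = {"Summarize:": 0.3, "Summarize in German:": 0.9, "TL;DR": 0.1}
best = optimize_discrete_prompt(prompts, lambda p: mock_reward[p])
```

With a deterministic reward, the search concentrates probability mass on the highest-scoring prompt; the noisy, model-dependent rewards of a real CLS pipeline would typically require more steps and variance reduction.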
Pages: 22
Related Articles (50 total)
  • [1] Cross-Lingual Summarization Method Based on Joint Training and Self-Training in Low-Resource Scenarios
    Cheng, Shaohuan
    Tang, Yujia
    Liu, Qiao
    Chen, Wenyu
    Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China, 2024, 53 (05): : 762 - 770
  • [2] Low-Resource Cross-Lingual Adaptive Training for Nigerian Pidgin
    Lin, Pin-Jie
    Saeed, Muhammed
    Chang, Ernie
    Scholman, Merel
    INTERSPEECH 2023, 2023, : 3954 - 3958
  • [3] Augmenting Low-Resource Text Classification with Graph-Grounded Pre-training and Prompting
    Wen, Zhihao
    Fang, Yuan
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 506 - 516
  • [4] Deep Persian sentiment analysis: Cross-lingual training for low-resource languages
    Ghasemi, Rouzbeh
    Ashrafi Asli, Seyed Arad
    Momtazi, Saeedeh
    JOURNAL OF INFORMATION SCIENCE, 2022, 48 (04) : 449 - 462
  • [5] Cross-lingual embedding for cross-lingual question retrieval in low-resource community question answering
    HajiAminShirazi, Shahrzad
    Momtazi, Saeedeh
    MACHINE TRANSLATION, 2020, 34 (04) : 287 - 303
  • [6] Cross-Lingual Morphological Tagging for Low-Resource Languages
    Buys, Jan
    Botha, Jan A.
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1954 - 1964
  • [7] A two-stage fine-tuning method for low-resource cross-lingual summarization
    Zhang, Kaixiong
    Zhang, Yongbing
    Yu, Zhengtao
    Huang, Yuxin
    Tan, Kaiwen
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2024, 21 (01) : 1125 - 1143
  • [8] Intent detection and slot filling for Persian: Cross-lingual training for low-resource languages
    Zadkamali, Reza
    Momtazi, Saeedeh
    Zeinali, Hossein
    NATURAL LANGUAGE PROCESSING, 2025, 31 (02): : 559 - 574
  • [9] ACROSS: An Alignment-based Framework for Low-Resource Many-to-One Cross-Lingual Summarization
    Li, Peiyao
    Zhang, Zhengkun
    Wang, Jun
    Li, Liang
    Jatowt, Adam
    Yang, Zhenglu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 2458 - 2472
  • [10] CROSS-LINGUAL TRANSFER LEARNING FOR LOW-RESOURCE SPEECH TRANSLATION
    Khurana, Sameer
    Dawalatabad, Nauman
    Laurent, Antoine
    Vicente, Luis
    Gimeno, Pablo
    Mingote, Victoria
    Glass, James
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024, 2024, : 670 - 674