IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation

Authors
Cahyawijaya, Samuel [1]
Winata, Genta Indra [1]
Wilie, Bryan [3]
Vincentio, Karissa [4]
Li, Xiaohong [2]
Kuncoro, Adhiguna [5]
Ruder, Sebastian [5]
Lim, Zhi Yuan [2]
Bahar, Syafri [2]
Khodra, Masayu Leylia [3]
Purwarianti, Ayu [3, 6]
Fung, Pascale [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Gojek, Jakarta, Indonesia
[3] Inst Teknol Bandung, Bandung, Indonesia
[4] Univ Multimedia Nusantara, Tangerang, Indonesia
[5] DeepMind, London, England
[6] Prosa Ai, Bandung, Indonesia
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural language generation (NLG) benchmarks provide an important avenue to measure progress and develop better NLG systems. Unfortunately, the lack of publicly available NLG benchmarks for low-resource languages poses a challenging barrier to building NLG systems that work well for languages with limited amounts of data. Here we introduce IndoNLG, the first benchmark to measure NLG progress in three low-resource, yet widely spoken, languages of Indonesia: Indonesian, Javanese, and Sundanese. Altogether, these languages are spoken by more than 100 million native speakers, and hence constitute an important use case of NLG systems today. Concretely, IndoNLG covers six tasks: summarization, question answering, chit-chat, and three different pairs of machine translation (MT) tasks. We collate a clean pretraining corpus of Indonesian, Sundanese, and Javanese datasets, Indo4B-Plus, which is used to pretrain our models: IndoBART and IndoGPT. We show that IndoBART and IndoGPT achieve competitive performance on all tasks, despite using only one-fifth the parameters of a larger multilingual model, mBART-LARGE (Liu et al., 2020). This finding emphasizes the importance of pretraining on closely related, local languages to achieve more efficient learning and faster inference for very low-resource languages like Javanese and Sundanese.
Pages: 8875-8898
Page count: 24
Related papers
50 in total
  • [1] IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding
    Wilie, Bryan
    Vincentio, Karissa
    Winata, Genta Indra
    Cahyawijaya, Samuel
    Li, Xiaohong
    Lim, Zhi Yuan
    Soleman, Sidik
    Mahendra, Rahmad
    Fung, Pascale
    Bahar, Syafri
    Purwarianti, Ayu
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 843 - 857
  • [2] Evaluating the Evaluation of Diversity in Natural Language Generation
    Tevet, Guy
    Berant, Jonathan
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 326 - 346
  • [3] A Repository of Data and Evaluation Resources for Natural Language Generation
    Belz, Anja
    Gatt, Albert
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 4027 - 4032
  • [4] RoMe: A Robust Metric for Evaluating Natural Language Generation
    Rony, Md Rashad Al Hasan
    Kovriguina, Liubov
    Chaudhuri, Debanjan
    Usbeck, Ricardo
    Lehmann, Jens
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 5645 - 5657
  • [5] BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla
    Bhattacharjee, Abhik
    Hasan, Tahmid
    Ahmad, Wasi Uddin
    Shahriyar, Rifat
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 726 - 735
  • [6] Natural Language Generation Using Automatically Constructed Lexical Resources
    Ito, Naho
    Hagiwara, Masafumi
    2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2011, : 980 - 987
  • [7] EVALUATING NATURAL RESOURCES
    SUCHOTIN, J
    SOWJETWISSENSCHAFT GESELLSCHAFTS WISSENSCHAFTLICHE BEITRAGE, 1968, (06): : 630 - 641
  • [8] ScienceBenchmark: A Complex Real-World Benchmark for Evaluating Natural Language to SQL Systems
    Zhang, Yi
    Deriu, Jan
    Katsogiannis-Meimarakis, George
    Kosten, Catherine
    Koutrika, Georgia
    Stockinger, Kurt
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 17 (04): : 685 - 698
  • [9] JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models
    Cao, Jialun
    Chen, Zhiyong
    Wu, Jiarong
    Cheung, Shing-Chi
    Xu, Chang
    Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024, : 870 - 882
  • [10] Evaluating Natural Language Generation via Unbalanced Optimal Transport
    Chen, Yimeng
    Lan, Yanyan
    Xiong, Ruibin
    Pang, Liang
    Ma, Zhiming
    Cheng, Xueqi
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3730 - 3736