Abstractive Summarization of Broadcast News Stories for Estonian

Cited by: 1
Authors
Harm, Henry [1]
Alumae, Tanel [1]
Affiliations
[1] Tallinn Univ Technol, Inst Software Sci, Tallinn, Estonia
Source
BALTIC JOURNAL OF MODERN COMPUTING | 2022 / Vol. 10 / No. 03
Keywords
Abstractive summarization; low-resource languages; pre-trained models; multilingual models; machine-translation;
DOI
10.22364/bjmc.2022.10.3.23
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We present an approach for generating abstractive summaries for Estonian spoken news stories in a low-resource setting. Given a recording of a radio news story, the goal is to create a summary that captures the essential information in a short format. The approach consists of two steps: automatically generating the transcript and applying a state-of-the-art text summarization system to generate the result. We evaluated a number of models, with the best-performing model leveraging the large English BART model pre-trained on the CNN/DailyMail dataset and fine-tuned on machine-translated in-domain data, and with the test data translated to English and back. The method achieved a ROUGE-1 score of 17.22, improving on the alternatives and achieving the best result in human evaluation. The applicability of the proposed solution might be limited in languages where machine translation systems are not mature. In such cases, multilingual BART should be considered; it achieved a ROUGE-1 score of 17.00 overall and a score of 16.22 without machine-translation-based data augmentation.
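The translate, summarize, translate-back flow described in the abstract, and the unigram-overlap idea behind ROUGE-1, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names `summarize_via_english` and `rouge1_f1` and the three callables are hypothetical placeholders for whatever machine translation and summarization systems are used, and the ROUGE-1 function is a simplified whitespace-token F1 on a 0 to 1 scale, without the stemming or tokenization rules of official ROUGE tooling.

```python
from collections import Counter
from typing import Callable


def summarize_via_english(
    transcript_et: str,
    translate_et_en: Callable[[str], str],  # hypothetical Estonian -> English MT
    summarize_en: Callable[[str], str],     # hypothetical English summarizer
    translate_en_et: Callable[[str], str],  # hypothetical English -> Estonian MT
) -> str:
    """Translate a transcript to English, summarize it, translate back."""
    english_text = translate_et_en(transcript_et)
    english_summary = summarize_en(english_text)
    return translate_en_et(english_summary)


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Passing the translation and summarization systems in as callables keeps the sketch independent of any particular toolkit; a multilingual summarizer could be dropped in by making both translation steps the identity function.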
Pages: 511 - 524
Number of pages: 14