Evaluating Factuality in Cross-lingual Summarization

Cited by: 0

Authors:

Gao, Mingqi [1,2,3]

Wang, Wenqing [4]

Wan, Xiaojun [1,2,3]

Xu, Yuemei [4]

Affiliations:

[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China

[2] Peking Univ, Ctr Data Sci, Beijing, Peoples R China

[3] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China

[4] Beijing Foreign Studies Univ, Sch Informat Sci & Technol, Beijing, Peoples R China

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023

Funding:

U.S. National Science Foundation; National Key Research and Development Program

Keywords:

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Cross-lingual summarization aims to help people efficiently grasp the core idea of a document written in a foreign language. Modern text summarization models generate highly fluent but often factually inconsistent outputs, which has received heightened attention in recent research. However, the factual consistency of cross-lingual summarization has not been investigated yet. In this paper, we propose a cross-lingual factuality dataset by collecting human annotations of reference summaries as well as generated summaries from models at both summary level and sentence level. Furthermore, we perform a fine-grained analysis and observe that over 50% of generated summaries and over 27% of reference summaries contain factual errors with characteristics different from monolingual summarization. Existing evaluation metrics for monolingual summarization require translation to evaluate the factuality of cross-lingual summarization and perform differently at different tasks and levels. Finally, we adapt the monolingual factuality metrics as an initial step towards the automatic evaluation of summarization factuality in cross-lingual settings. Our dataset and code are available at https://github.com/kite99520/Fact_CLS.


Pages: 12415-12431

Number of pages: 17
