Domain Adaptation and Summary Distillation for Unsupervised Query Focused Summarization

Cited by: 1
Authors
Du, Jiancheng [1 ]
Gao, Yang [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptation models; Task analysis; Data models; Training; Benchmark testing; Predictive models; Question answering (information retrieval); Abstractive summarization; domain adaptation; query-focused summarization; summary distillation; unsupervised learning;
DOI
10.1109/TKDE.2023.3296441
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Text summarization is the task of reducing a document's length while preserving its essential information. In the age of information explosion, obtaining the content that users actually need from a large volume of information is particularly important. Under such circumstances, query-focused abstractive summarization (QFS) has gained prominence, since it addresses user needs while delivering fluent, concise paraphrased summaries. However, unlike generic summarization, which has made remarkable progress driven by substantial amounts of parallel data, QFS lags behind due to the scarcity of parallel corpora. In this paper, we therefore leverage a typical large generic summarization dataset to meet the pressing demand for unsupervised QFS. The large-scale query-free benchmark is automatically transformed into a query-focused dataset (Query-CNNDM) while preserving its informative summaries. We propose a simple yet effective unsupervised method, called the Domain Adaptation and Summary Distillation method (DASD). To achieve domain adaptation for unsupervised QFS, we design a query-aware gap sentence generation (q-GSG) strategy that equips the model with the capability of learning target textual knowledge and obtaining a good initialization in the target domain. As instance-specific regularization, we train a teacher model on Query-CNNDM to generate pseudo-labels for summary distillation. Experimental results indicate that our DASD model achieves state-of-the-art performance on two benchmark datasets, Debatepedia and Wikiref, in a zero-shot setting, and generalizes well to abstractive few-shot QFS.
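The abstract names the q-GSG pretraining strategy without spelling it out. As a rough illustration only, the sketch below builds one PEGASUS-style gap-sentence pretraining pair in which the masked (target) sentences are chosen by their relevance to the query; the unigram-overlap scorer, the gap ratio, and the query-plus-separator input format are illustrative assumptions, not the authors' exact recipe.

```python
from typing import List, Tuple

MASK_TOKEN = "<mask_1>"  # PEGASUS-style sentinel for gap sentence generation

def overlap_score(sentence: str, query: str) -> float:
    """Unigram overlap between a sentence and the query; a stand-in
    for whatever relevance measure the paper actually uses."""
    s_tokens, q_tokens = set(sentence.lower().split()), set(query.lower().split())
    return len(s_tokens & q_tokens) / len(q_tokens) if q_tokens else 0.0

def make_qgsg_example(sentences: List[str], query: str,
                      gap_ratio: float = 0.3) -> Tuple[str, str]:
    """Build one (input, target) pretraining pair: mask the sentences
    most relevant to the query and use them as the generation target."""
    n_gaps = max(1, int(len(sentences) * gap_ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: overlap_score(sentences[i], query),
                    reverse=True)
    gap_ids = set(ranked[:n_gaps])
    source = " ".join(MASK_TOKEN if i in gap_ids else s
                      for i, s in enumerate(sentences))
    # Keep gap sentences in document order as the decoding target.
    target = " ".join(sentences[i] for i in sorted(gap_ids))
    # Prepending the query so the encoder conditions on it is an
    # assumed input format, not confirmed by the abstract.
    return f"{query} [SEP] {source}", target
```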
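Similarly, the summary-distillation step can be pictured as a teacher, fine-tuned on Query-CNNDM, generating pseudo-label summaries for the student to imitate. A minimal sketch with Hugging Face transformers follows; the teacher checkpoint, the `query [SEP] document` input format, and the decoding settings are assumptions rather than the authors' released configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative checkpoint; the paper trains its own teacher on the
# automatically constructed Query-CNNDM dataset.
TEACHER = "google/pegasus-cnn_dailymail"

tokenizer = AutoTokenizer.from_pretrained(TEACHER)
teacher = AutoModelForSeq2SeqLM.from_pretrained(TEACHER).eval()

@torch.no_grad()
def pseudo_label(query: str, document: str, max_len: int = 128) -> str:
    """Generate one pseudo-summary for a (query, document) pair."""
    inputs = tokenizer(f"{query} [SEP] {document}", return_tensors="pt",
                       truncation=True, max_length=1024)
    ids = teacher.generate(**inputs, num_beams=4, max_length=max_len)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

# The student would then be fine-tuned on these (input, pseudo-summary)
# pairs as instance-specific regularization, alongside the q-GSG objective.
```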
Pages: 1044-1055 (12 pages)