Query-based summarization of discussion threads

Cited by: 6

Authors:

Verberne, Suzan [1]

Krahmer, Emiel [2]

Wubben, Sander [2]

van den Bosch, Antal [3, 4]

Affiliations:

[1] Leiden Univ, Leiden Inst Adv Comp Sci, Leiden, Netherlands

[2] Tilburg Univ, Tilburg Sch Humanities, Tilburg, Netherlands

[3] Radboud Univ Nijmegen, Ctr Language Studies, Nijmegen, Netherlands

[4] Meertens Inst, Amsterdam, Netherlands

Source:

NATURAL LANGUAGE ENGINEERING | 2020 / Vol. 26 / Issue 01

Keywords:

query-based summarization; discussion forums; reference summaries; word embeddings; evaluation; AGREEMENT; NETWORKS; DOCUMENT;

DOI:

10.1017/S1351324919000123

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we address query-based summarization of discussion threads. New users can profit from the information shared in the forum if they can find the previously posted information. However, discussion threads on a single topic can easily comprise dozens or hundreds of individual posts. Our aim is to summarize forum threads given real web search queries. We created a data set with search queries from a discussion forum's search engine log and the discussion threads that were clicked by the user who entered the query. For 120 thread-query combinations, a reference summary was made by five different human raters. We compared two methods for automatic summarization of the threads: a query-independent method based on post features, and Maximum Marginal Relevance (MMR), a method that takes the query into account. We also compared four different word embedding representations as alternatives to standard word vectors in extractive summarization. We find (1) that the agreement between human summarizers does not improve when a query is provided; (2) the query-independent post features as well as a centroid-based baseline outperform MMR by a large margin; (3) combining the post features with query similarity gives a small improvement over the use of post features alone; and (4) for the word embeddings, a match in domain appears to be more important than corpus size and dimensionality. However, the differences between the models were not reflected by differences in quality of the summaries created with the help of these models. We conclude that query-based summarization with web queries is challenging because the queries are short, and a click on a result is not a direct indicator of the relevance of the result.
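For readers unfamiliar with MMR, the following is a minimal sketch of the general technique, not the authors' implementation. It greedily picks the post that is most similar to the query while penalizing similarity to posts already selected. The term-frequency vectors, the whitespace tokenizer, the function names, and the trade-off weight `lam` are all assumptions made for illustration; the paper itself also evaluates word embeddings as the representation.

```python
import math
from collections import Counter


def bow(text):
    """Term-frequency vector from a naive whitespace tokenizer (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, posts, k=2, lam=0.7):
    """Return indices of up to k posts chosen by Maximum Marginal Relevance.

    lam (hypothetical default) weights query relevance against redundancy
    with the posts selected so far; lam=1.0 ignores redundancy entirely.
    """
    q = bow(query)
    vecs = [bow(p) for p in posts]
    selected = []
    candidates = list(range(len(posts)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], q)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy thread: posts 0 and 1 are exact duplicates, post 3 is off-topic.
posts = [
    "battery drains fast on my phone",
    "battery drains fast on my phone",
    "replace the battery to fix drain",
    "nice weather today",
]
summary = mmr_select("battery drain", posts, k=3)
```

In this toy thread the duplicate post is pushed out of the selection once its twin has been chosen, which is the redundancy control that distinguishes MMR from ranking by query similarity alone.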


Pages: 3-29

Number of pages: 27
