Query-based summarization of discussion threads

Cited by: 6
Authors
Verberne, Suzan [1]
Krahmer, Emiel [2]
Wubben, Sander [2]
van den Bosch, Antal [3, 4]
Affiliations
[1] Leiden Univ, Leiden Inst Adv Comp Sci, Leiden, Netherlands
[2] Tilburg Univ, Tilburg Sch Humanities, Tilburg, Netherlands
[3] Radboud Univ Nijmegen, Ctr Language Studies, Nijmegen, Netherlands
[4] Meertens Inst, Amsterdam, Netherlands
Keywords
query-based summarization; discussion forums; reference summaries; word embeddings; evaluation; AGREEMENT; NETWORKS; DOCUMENT;
DOI
10.1017/S1351324919000123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we address query-based summarization of discussion threads. New users can profit from the information shared in the forum if they can retrieve the previously posted information. However, discussion threads on a single topic can easily comprise dozens or hundreds of individual posts. Our aim is to summarize forum threads given real web search queries. We created a data set with search queries from a discussion forum's search engine log and the discussion threads that were clicked by the user who entered the query. For 120 thread-query combinations, a reference summary was made by five different human raters. We compared two methods for automatic summarization of the threads: a query-independent method based on post features, and Maximum Marginal Relevance (MMR), a method that takes the query into account. We also compared four different word embedding representations as alternatives to standard word vectors in extractive summarization. We find (1) that the agreement between human summarizers does not improve when a query is provided; (2) that the query-independent post features as well as a centroid-based baseline outperform MMR by a large margin; (3) that combining the post features with query similarity gives a small improvement over the use of post features alone; and (4) that for the word embeddings, a match in domain appears to be more important than corpus size and dimensionality. However, the differences between the models were not reflected in differences in the quality of the summaries created with the help of these models. We conclude that query-based summarization with web queries is challenging because the queries are short, and a click on a result is not a direct indicator of the relevance of the result.
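For readers unfamiliar with Maximum Marginal Relevance, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes plain bag-of-words term counts and cosine similarity; the function names (`bow`, `cosine`, `mmr_select`), the trade-off parameter `lam`, and the toy posts are hypothetical and chosen only for illustration. At each step it selects the post that is most similar to the query while being least similar to the posts already selected.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words term counts (an assumed, simplistic representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query, posts, k=2, lam=0.7):
    """Greedily pick up to k post indices, trading query relevance against redundancy.

    lam = 1.0 ranks by query similarity only; lower values penalise posts
    that resemble the ones already selected.
    """
    q = bow(query)
    vecs = [bow(p) for p in posts]
    selected = []
    candidates = list(range(len(posts)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], q)
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy thread: the second post nearly repeats the first.
posts = [
    "reset the router password",
    "reset the router password now",
    "battery life tips for phones",
]
diverse = mmr_select("reset router password", posts, k=2, lam=0.5)
relevance_only = mmr_select("reset router password", posts, k=2, lam=1.0)
```

With `lam=0.5` the near-duplicate second post is penalised and skipped in favour of the third; with `lam=1.0` the selection falls back to pure query similarity.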
Pages: 3-29
Number of pages: 27
Related papers
50 in total
  • [21] A query-based medical information summarization system using ontology knowledge
    Chen, Ping
    Verma, Rakesh
    19TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2006, : 37 - +
  • [22] Improvement of query-based text summarization using word sense disambiguation
    Rahman, Nazreena
    Borah, Bhogeswar
    COMPLEX & INTELLIGENT SYSTEMS, 2020, 6 (01) : 75 - 85
  • [23] QMOS: Query-based multi-documents opinion-oriented summarization
    Abdi, Asad
    Shamsuddin, Siti Mariyam
    Aliguliyev, Ramiz M.
    INFORMATION PROCESSING & MANAGEMENT, 2018, 54 (02) : 318 - 338
  • [24] QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization
    Zhong, Ming
    Yin, Da
    Yu, Tao
    Zaidi, Ahmad
    Mutuma, Mutethia
    Jha, Rahul
    Awadallah, Ahmed Hassan
    Celikyilmaz, Asli
    Liu, Yang
    Qiu, Xipeng
    Radev, Dragomir
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5905 - 5921
  • [25] CoMSum and SIBERT: A Dataset and Neural Model for Query-Based Multi-document Summarization
    Kulkarni, Sayali
    Chammas, Sheide
    Zhu, Wan
    Sha, Fei
    Ie, Eugene
    DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II, 2021, 12822 : 84 - 98
  • [26] A Simple, Concise, Query-based Approach to News Article Summarization Using Sentence Scoring
    Thornton, Megan
    Gao, Sophie
    Ng, Yiu-Kai
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 951 - 958
  • [27] Query-Based Automatic Multi-document Summarization Extraction Method for Web Pages
    He, Qi
    Hao, Hong-Wei
    Yin, Xu-Cheng
    PROCEEDINGS OF THE 2011 2ND INTERNATIONAL CONGRESS ON COMPUTER APPLICATIONS AND COMPUTATIONAL SCIENCE, VOL 1, 2012, 144 : 107 - 112
  • [28] Query-Based Extractive Text Summarization Using Sense-Oriented Semantic Relatedness Measure
    Rahman, Nazreena
    Borah, Bhogeswar
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024, 49 (03) : 3751 - 3792
  • [29] Query-based multi-documents summarization using linguistic knowledge and content word expansion
    Abdi, Asad
    Idris, Norisma
    Alguliyev, Rasim M.
    Aliguliyev, Ramiz M.
    SOFT COMPUTING, 2017, 21 (07) : 1785 - 1801
  • [30] Query-Based Extractive Text Summarization Using Sense-Oriented Semantic Relatedness Measure
    Nazreena Rahman
    Bhogeswar Borah
    Arabian Journal for Science and Engineering, 2024, 49 : 3751 - 3792