Automatic summarization of open-domain multiparty dialogues in diverse genres

Cited by: 57
Authors
Zechner, K [1]
Affiliation
[1] Educ Testing Serv, Princeton, NJ 08541 USA
DOI
10.1162/089120102762671945
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic summarization of open-domain spoken dialogues is a relatively new research area. This article introduces the task and the challenges involved and motivates and presents an approach for obtaining automatic-extract summaries for human transcripts of multiparty dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken-dialogue summarization and typically can be ignored when summarizing written text such as news wire data: (1) detection and removal of speech disfluencies; (2) detection and insertion of sentence boundaries; and (3) detection and linking of cross-speaker information units (question-answer pairs). A system evaluation is performed using a corpus of 23 dialogue excerpts with an average duration of about 10 minutes, comprising 80 topical segments and about 47,000 words total. The corpus was manually annotated for relevant text spans by six human annotators. The global evaluation shows that for the two more informal genres, our summarization system using dialogue-specific components significantly outperforms two baselines: (1) a maximum-marginal-relevance ranking algorithm using TF*IDF term weighting, and (2) a LEAD baseline that extracts the first n words from a text.
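The two baselines named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the tokenizer, the smoothed IDF formula, the use of the segment centroid as the relevance query, and the `lam` trade-off value are all assumptions made here, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase word tokenizer.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(sentences):
    # Treat each sentence as a "document"; smoothed IDF is an assumption.
    counts = [Counter(tokenize(s)) for s in sentences]
    n = len(counts)
    df = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    total = Counter()
    for v in vectors:
        total.update(v)  # adds the weights term by term
    return total


def mmr_select(sentences, k, lam=0.7):
    """Pick up to k sentence indices by maximum marginal relevance."""
    vectors = tfidf_vectors(sentences)
    query = centroid(vectors)  # assumption: the whole segment acts as the query
    selected = []
    remaining = list(range(len(sentences)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(vectors[i], query)
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)  # present the extract in original order


def lead_baseline(text, n):
    """LEAD baseline: the first n whitespace-separated words of the text."""
    return " ".join(text.split()[:n])
```

Calling `mmr_select(turns, 3)` on a list of dialogue turns would return the indices of up to three turns that balance relevance to the segment against redundancy with turns already chosen, while `lead_baseline(transcript, 50)` simply keeps the first 50 words.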
Pages: 447-485
Page count: 39