Automatic summarization of open-domain multiparty dialogues in diverse genres

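Cited by: 57
Author
Zechner, K. [1]
Affiliation
[1] Educational Testing Service, Princeton, NJ 08541 USA
DOI
10.1162/089120102762671945
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract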
Automatic summarization of open-domain spoken dialogues is a relatively new research area. This article introduces the task and the challenges involved and motivates and presents an approach for obtaining automatic-extract summaries for human transcripts of multiparty dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken-dialogue summarization and typically can be ignored when summarizing written text such as news wire data: (1) detection and removal of speech disfluencies; (2) detection and insertion of sentence boundaries; and (3) detection and linking of cross-speaker information units (question-answer pairs). A system evaluation is performed using a corpus of 23 dialogue excerpts with an average duration of about 10 minutes, comprising 80 topical segments and about 47,000 words total. The corpus was manually annotated for relevant text spans by six human annotators. The global evaluation shows that for the two more informal genres, our summarization system using dialogue-specific components significantly outperforms two baselines: (1) a maximum-marginal-relevance ranking algorithm using TF*IDF term weighting, and (2) a LEAD baseline that extracts the first n words from a text.
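The two baselines named in the abstract can be illustrated with a minimal sketch. This is not the article's implementation: the whitespace tokenizer, the smoothed IDF formula, the use of the document centroid as the relevance query, and the names `lead_baseline`, `mmr_select`, and `lam` are all assumptions made for illustration only.

```python
import math
from collections import Counter


def lead_baseline(text, n):
    """LEAD baseline: return the first n whitespace-separated words."""
    return " ".join(text.split()[:n])


def tfidf_vectors(segments):
    """Build one sparse TF*IDF vector (a dict) per segment."""
    docs = [Counter(s.lower().split()) for s in segments]
    n = len(docs)
    # Document frequency: number of segments that contain each term.
    df = Counter(term for d in docs for term in d)
    # Assumed IDF variant: log(n / df) + 1, which stays positive.
    return [{t: tf * (math.log(n / df[t]) + 1.0) for t, tf in d.items()}
            for d in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(segments, k, lam=0.7):
    """Greedy maximum-marginal-relevance selection.

    Returns the indices of up to k segments in the order they were chosen.
    Relevance is measured against the centroid of all segments (an
    assumption); redundancy is the highest similarity to any segment
    already selected.
    """
    vecs = tfidf_vectors(segments)
    centroid = Counter()
    for v in vecs:
        for t, w in v.items():
            centroid[t] += w

    selected = []
    candidates = list(range(len(segments)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], centroid)
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

In this sketch, `lam` trades relevance against redundancy: values near 1 favor segments similar to the whole text, while lower values penalize segments that repeat what has already been selected.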
Pages: 447-485
Page count: 39