Disentangling Chat

Cited by: 48
Authors
Elsner, Micha [1]
Charniak, Eugene [1]
Affiliations
[1] Brown Univ, BLLIP, Providence, RI 02912 USA
Keywords
Computational linguistics
DOI
10.1162/coli_a_00003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When multiple conversations occur simultaneously, a listener must decide which conversation each utterance is part of in order to interpret and respond to it appropriately. We refer to this task as disentanglement. We present a corpus of Internet Relay Chat dialogue in which the various conversations have been manually disentangled, and evaluate annotator reliability. We propose a graph-based clustering model for disentanglement, using lexical, timing, and discourse-based features. The model's predicted disentanglements are highly correlated with manual annotations. We conclude by discussing two extensions to the model, specificity tuning and conversation start detection, both of which are promising but do not currently yield practical improvements.
Pages: 389-409
Page count: 21