Disentangling Chat

Cited by: 48
Authors
Elsner, Micha [1]
Charniak, Eugene [1]
Affiliation
[1] Brown Univ, BLLIP, Providence, RI 02912 USA
Keywords
Computational linguistics
DOI
10.1162/coli_a_00003
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When multiple conversations occur simultaneously, a listener must decide which conversation each utterance is part of in order to interpret and respond to it appropriately. We refer to this task as disentanglement. We present a corpus of Internet Relay Chat dialogue in which the various conversations have been manually disentangled, and evaluate annotator reliability. We propose a graph-based clustering model for disentanglement, using lexical, timing, and discourse-based features. The model's predicted disentanglements are highly correlated with manual annotations. We conclude by discussing two extensions to the model, specificity tuning and conversation start detection, both of which are promising but do not currently yield practical improvements.
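The abstract names a graph-based clustering model over lexical, timing, and discourse-based features but gives no details. The following is a minimal sketch of that general idea, not the authors' model: the `Utterance` type, the hand-picked weights in `pair_score`, and the greedy assignment in `disentangle` are all hypothetical and chosen only for illustration. Each pair of utterances gets a signed affinity (an edge weight), and each new utterance joins the existing conversation with the highest positive total affinity, or starts a new one.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    time: float   # seconds since the start of the log
    speaker: str
    text: str


def tokens(text):
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,:;!?") for w in text.lower().split()}


def pair_score(a, b, max_gap=120.0):
    """Hypothetical signed affinity between two utterances.

    Positive suggests the same conversation, negative a different one.
    The weights are illustrative assumptions, not learned values.
    """
    ta, tb = tokens(a.text), tokens(b.text)
    gap = abs(b.time - a.time)
    closeness = max(0.0, 1.0 - gap / max_gap)                   # timing feature
    overlap = len(ta & tb)                                      # lexical feature
    mention = a.speaker.lower() in tb or b.speaker.lower() in ta  # discourse feature
    same_speaker = a.speaker == b.speaker
    return (-1.0 + 0.5 * closeness + 1.0 * overlap
            + 2.0 * mention + 1.0 * same_speaker)


def disentangle(utterances):
    """Greedy clustering: assign each utterance, in order, to the best cluster."""
    clusters = []   # each cluster is a list of utterance indices
    labels = []
    for i, u in enumerate(utterances):
        best, best_total = None, 0.0
        for c, members in enumerate(clusters):
            total = sum(pair_score(utterances[j], u) for j in members)
            if total > best_total:
                best, best_total = c, total
        if best is None:
            clusters.append([i])
            best = len(clusters) - 1
        else:
            clusters[best].append(i)
        labels.append(best)
    return labels


log = [
    Utterance(0, "alice", "anyone know how to mount a usb drive"),
    Utterance(5, "bob", "did you see the game last night"),
    Utterance(9, "carol", "alice: try mount /dev/sdb1 /mnt"),
    Utterance(14, "dave", "bob: yeah great game"),
]
print(disentangle(log))
```

In this toy log the two interleaved threads are separated mainly by the name mentions and shared words; a real system would presumably learn the pairwise score from annotated data rather than fix the weights by hand.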
Pages: 389-409
Page count: 21