Automatic Question Answering using the Web: Beyond the factoid

Cited by: 44
Authors
Soricut, R
Brill, E
Affiliations
[1] University of Southern California, Information Sciences Institute, Marina del Rey, CA 90292, USA
[2] Microsoft Research, Redmond, WA 98052, USA
Source
Information Retrieval | 2006 / Vol. 9 / Issue 2
Keywords
Statistical Model; Data Structure; Information Theory; Search Engine; Language Model
DOI
10.1007/s10791-006-7149-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we describe and evaluate a Question Answering (QA) system that goes beyond answering factoid questions. Our approach to QA assumes no restrictions on the type of questions that are handled, and no assumption that the answers to be provided are factoids. We present an unsupervised approach for collecting question and answer pairs from FAQ pages, which we use to collect a corpus of 1 million question/answer pairs from FAQ pages available on the Web. This corpus is used to train various statistical models employed by our QA system: a statistical chunker used to transform a natural language-posed question into a phrase-based query to be submitted for exact match to an off-the-shelf search engine; an answer/question translation model, used to assess the likelihood that a proposed answer is indeed an answer to the posed question; and an answer language model, used to assess the likelihood that a proposed answer is a well-formed answer. We evaluate our QA system in a modular fashion, by comparing the performance of baseline algorithms against our proposed algorithms for various modules in our QA system. The evaluation shows that our system achieves reasonable performance in terms of answer accuracy for a large variety of complex, non-factoid questions.
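The answer-ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an IBM Model 1-style translation score for P(question | answer) and an add-one-smoothed unigram answer language model, and it combines the two by a weighted sum of log-probabilities. The function names, the probability floor, the `lm_weight` parameter, and all toy numbers are hypothetical.

```python
import math
from collections import Counter

def translation_log_prob(question, answer, t_table, floor=1e-6):
    """Model 1-style log P(question | answer): each question word is
    generated by an answer word chosen uniformly at random.
    t_table maps (question_word, answer_word) -> probability (toy values)."""
    q_words = question.lower().split()
    a_words = answer.lower().split()
    if not a_words:
        return float("-inf")
    total = 0.0
    for q in q_words:
        p = sum(t_table.get((q, a), floor) for a in a_words) / len(a_words)
        total += math.log(p)
    return total

def unigram_log_prob(answer, counts, vocab_size):
    """Length-normalized, add-one-smoothed unigram log-probability,
    standing in for the answer language model (an assumption)."""
    words = answer.lower().split()
    total_count = sum(counts.values())
    log_p = sum(math.log((counts[w] + 1) / (total_count + vocab_size))
                for w in words)
    return log_p / max(len(words), 1)

def rank_answers(question, candidates, t_table, counts, vocab_size,
                 lm_weight=1.0):
    """Sort candidates by translation score plus weighted LM score."""
    scored = [
        (translation_log_prob(question, c, t_table)
         + lm_weight * unigram_log_prob(c, counts, vocab_size), c)
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]

# Toy data, invented purely for illustration.
t_table = {("why", "because"): 0.4, ("sky", "sky"): 0.5, ("blue", "blue"): 0.5}
counts = Counter("because the sky is blue because light scatters".split())
candidates = ["paris is in france", "because the sky scatters blue light"]
ranked = rank_answers("why is the sky blue", candidates, t_table, counts,
                      vocab_size=10)
```

In this sketch the translation table would be estimated from the question/answer pairs and the word counts from the answer side of the corpus; here both are hand-filled toy values.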
Pages: 191-206
Page count: 16