Automatic Term Mismatch Diagnosis for Selective Query Expansion

Times cited: 0
Authors
Zhao, Le [1]
Callan, Jamie [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Source
SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2012
Keywords
Query term diagnosis; term mismatch; term expansion; Boolean conjunctive normal form queries; simulated user interactions
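DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract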
People are seldom aware that their search queries frequently mismatch a majority of the relevant documents. This may not be a big problem for topics with a large and diverse set of relevant documents, but would largely increase the chance of search failure for less popular search needs. We aim to address the mismatch problem by developing accurate and simple queries that require minimal effort to construct. This is achieved by targeting retrieval interventions at the query terms that are likely to mismatch relevant documents. For a given topic, the proportion of relevant documents that do not contain a term measures the probability for the term to mismatch relevant documents, or the term mismatch probability. Recent research demonstrates that this probability can be estimated reliably prior to retrieval. Typically, it is used in probabilistic retrieval models to provide query dependent term weights. This paper develops a new use: Automatic diagnosis of term mismatch. A search engine can use the diagnosis to suggest manual query reformulation, guide interactive query expansion, guide automatic query expansion, or motivate other responses. The research described here uses the diagnosis to guide interactive query expansion, and create Boolean conjunctive normal form (CNF) structured queries that selectively expand 'problem' query terms while leaving the rest of the query untouched. Experiments with TREC Ad-hoc and Legal Track datasets demonstrate that with high quality manual expansion, this diagnostic approach can reduce user effort by 33%, and produce simple and effective structured queries that surpass their bag of word counterparts.
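The definition of term mismatch probability and the selective CNF expansion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper predicts mismatch prior to retrieval, whereas this sketch simply computes it from a small set of known relevant documents, following the definition. The function names, the toy documents, the expansion terms, and the parameter `k` (how many 'problem' terms to expand) are all hypothetical.

```python
from typing import Dict, List, Set


def mismatch_probability(term: str, relevant_docs: List[Set[str]]) -> float:
    """Proportion of relevant documents that do NOT contain `term`."""
    if not relevant_docs:
        raise ValueError("need at least one relevant document")
    missing = sum(1 for doc in relevant_docs if term not in doc)
    return missing / len(relevant_docs)


def build_cnf_query(
    terms: List[str],
    mismatch: Dict[str, float],
    expansions: Dict[str, List[str]],
    k: int,
) -> str:
    """Expand only the k terms with the highest mismatch probability.

    Each expanded term becomes an OR group; all clauses are ANDed,
    giving a Boolean conjunctive normal form (CNF) query string.
    """
    ranked = sorted(terms, key=lambda t: mismatch[t], reverse=True)
    problem_terms = set(ranked[:k])
    clauses = []
    for t in terms:
        if t in problem_terms and expansions.get(t):
            clauses.append("(" + " OR ".join([t] + expansions[t]) + ")")
        else:
            clauses.append(t)  # leave low-mismatch terms untouched
    return " AND ".join(clauses)


# Toy data (hypothetical): each relevant document is a set of tokens.
relevant_docs = [
    {"prognosis", "cancer"},
    {"cancer", "outlook"},
    {"cancer", "survival"},
    {"prognosis", "tumor"},
]
query_terms = ["prognosis", "cancer"]
mismatch = {t: mismatch_probability(t, relevant_docs) for t in query_terms}
expansions = {"prognosis": ["outlook", "survival"], "cancer": ["tumor"]}
cnf_query = build_cnf_query(query_terms, mismatch, expansions, k=1)
print(mismatch)
print(cnf_query)
```

With `k=1`, only the term that mismatches the most relevant documents receives an OR group, which mirrors the idea of spending expansion effort on 'problem' terms while leaving the rest of the query untouched.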
Pages: 515-524
Page count: 10
Related papers
6 items in total
  • [1] Domain Lexicon-based Query Expansion for Patent Retrieval
    Wang, Feng
    Lin, Lanfen
    2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2016, : 1543 - 1547
  • [2] Term expansion on the categorization of summarized documents
    Hsiao, Wen-Feng
    Chang, Te-Min
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2013, 28 (04): : 259 - 268
  • [3] A Search Log Mining based Query Expansion Technique to Improve Effectiveness in Code Search
    Satter, Abdus
    Sakib, Kazi
    PROCEEDINGS OF THE 2016 19TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2016, : 586 - 591
  • [4] Can Word Embedding Help Term Mismatch Problem? - A Result Analysis on Clinical Retrieval Tasks
    Zhang, Danchen
    He, Daqing
    TRANSFORMING DIGITAL WORLDS, ICONFERENCE 2018, 2018, 10766 : 402 - 408
  • [5] Contextual service discovery using term expansion and binding coverage analysis
    Ma, Shang-Pin
    Lan, Ci-Wei
    Li, Chia-Hsueh
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2015, 48 : 73 - 81
  • [6] Retrieval of Web Service Components using UML Modeling and Term Expansion
    Lee, Wen-Tin
    Ma, Shang-Pin
    Tsai, Yao-Yu
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2017, 33 (01) : 17 - 36