Ensemble-based method of answers retrieval for domain specific questions from text-based documentation
Citations: 0
Authors:
Safiulin, Iskander [1]
Butakov, Nikolay [1]
Alexandrov, Dmitriy [1]
Nasonov, Denis [1]
Affiliations:
[1] ITMO Univ, 49 Kronverksky Pr, St Petersburg 197101, Russia
Source:
8TH INTERNATIONAL YOUNG SCIENTISTS CONFERENCE ON COMPUTATIONAL SCIENCE, YSC2019
2019 / Volume 156
Funding:
Russian Science Foundation;
Keywords:
information retrieval;
optimization;
text-based search;
learning to rank;
DOI:
10.1016/j.procs.2019.08.191
Chinese Library Classification (CLC):
TP39 [Computer applications];
Discipline classification codes:
081203 ;
0835 ;
Abstract:
Many companies want to use chatbot systems as smart assistants that support human specialists, especially newcomers, with automatic consulting. Implementing a genuinely useful smart assistant for a specific domain requires a knowledge base for that domain, which often exists only in the form of text documentation and manuals. Properly built datasets are usually lacking, and building one from scratch so that data-driven methods can be applied with high quality is often expensive in both resources and time. This motivates the search for a solution that can work without such data, or with only a small amount of it, at the cost of reduced quality. Reformulating the task as an information retrieval problem, where the assistant responds with a piece of documentation instead of generated sentences, may make the task easier but does not solve the whole problem. It allows the use of either metrics-based methods, which have reduced search quality, or data-driven methods, which also need a large amount of data. In this paper, we propose a new ensemble-based data-driven method that tries to learn a scoring function by combining independent functions from a predefined set. The method may substantially improve the quality of the search in comparison with pure metrics-based methods while requiring significantly less data for training than data-driven methods. (C) 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the scientific committee of the 8th International Young Scientist Conference on Computational Science.
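
Illustrative sketch (not from the paper): the abstract does not specify which scoring functions are in the predefined set or how their combination is learned. The Python example below only illustrates the general idea of learning a scoring function as a combination of independent functions from a small labeled set. Everything in it is an assumption made for illustration: the three lexical scorers (Jaccard overlap, term-frequency cosine, query coverage), the weighted-sum combination, the grid search over weights, the use of mean reciprocal rank as the objective, and the toy passages and queries.

```python
import itertools
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

# Hypothetical predefined set of independent scoring functions.
def jaccard_score(query, passage):
    q, p = set(tokenize(query)), set(tokenize(passage))
    return len(q & p) / len(q | p) if q | p else 0.0

def cosine_score(query, passage):
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in p.values()))
    return dot / norm if norm else 0.0

def coverage_score(query, passage):
    q, p = set(tokenize(query)), set(tokenize(passage))
    return len(q & p) / len(q) if q else 0.0

SCORERS = (jaccard_score, cosine_score, coverage_score)

def ensemble_score(weights, query, passage):
    # Assumed combination rule: a weighted sum of the individual scores.
    return sum(w * f(query, passage) for w, f in zip(weights, SCORERS))

def rank(weights, query, passages):
    # Passage indices ordered from best to worst ensemble score.
    return sorted(range(len(passages)),
                  key=lambda i: ensemble_score(weights, query, passages[i]),
                  reverse=True)

def mean_reciprocal_rank(weights, examples, passages):
    total = 0.0
    for query, relevant_idx in examples:
        total += 1.0 / (rank(weights, query, passages).index(relevant_idx) + 1)
    return total / len(examples)

def fit_weights(examples, passages, grid=(0.0, 0.5, 1.0)):
    # Assumed learning procedure: exhaustive search over a coarse weight grid.
    candidates = [w for w in itertools.product(grid, repeat=len(SCORERS)) if any(w)]
    return max(candidates, key=lambda w: mean_reciprocal_rank(w, examples, passages))

# Toy documentation fragments and (query, index of relevant fragment) pairs.
PASSAGES = [
    "to reset the pump pressure hold the red button for five seconds",
    "the pump must be inspected for leaks every month",
    "replace the air filter when the warning light is on",
]
EXAMPLES = [
    ("how to reset pump pressure", 0),
    ("how often to inspect the pump", 1),
    ("when to replace the filter", 2),
]

weights = fit_weights(EXAMPLES, PASSAGES)
print("learned weights:", weights)
print("training MRR:", mean_reciprocal_rank(weights, EXAMPLES, PASSAGES))
```

Because the grid includes weight vectors that switch on a single scorer, the fitted ensemble can never score worse on the training examples than any individual function in the set. Whether that advantage carries over to unseen questions depends on the data and is not something this toy example demonstrates.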