A Study on Machine Learning for Imbalanced Datasets with Answer Validation of Question Answering

Cited by: 1
Authors
Day, Min-Yuh [1]
Tsai, Cheng-Chia [1]
Affiliation
[1] Tamkang Univ, Dept Informat Management, New Taipei, Taiwan
Source
PROCEEDINGS OF 2016 IEEE 17TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI) | 2016
Keywords
Answer Validation; Imbalanced Datasets; Machine Learning; Question Answering; QA-Lab; Support Vector Machine;
DOI
10.1109/IRI.2016.76
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
A question answering (QA) system processes a given question and returns an answer. Although question answering has been studied extensively in recent years, little is known about how imbalanced datasets affect answer validation in question answering systems. The objective of this paper is to provide a better understanding of the effects of imbalanced-dataset models on answer validation in a real-world university entrance exam question answering system. We propose a question answering system and provide a comprehensive analysis of imbalanced-dataset and balanced-dataset models for answer validation, using the NTCIR-12 QA-Lab2 development and test datasets of Japanese university entrance exams translated into English. Our system achieved 90% accuracy on the NTCIR-12 QA-Lab2 development datasets with the imbalanced-dataset machine learning model.
Pages: 513-519
Page count: 7