Effective Classification Using a Small Training Set Based on Discretization and Statistical Analysis

Cited by: 23
Authors
Bruni, Renato [1]
Bianchi, Gianpiero [2]
Affiliations
[1] Univ Roma La Sapienza, Dept Comp Control & Management Engn DIAG, I-00185 Rome, Italy
[2] Italian Natl Inst Stat Istat, Dept Integrat Qual Res & Prod Networks Dev DIQR, I-00173 Rome, Italy
Keywords
Classification algorithms; data mining; machine learning; discrete mathematics; optimization; logical analysis
DOI
10.1109/TKDE.2015.2416727
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work addresses the problem of producing a fast and accurate data classification learned from a possibly small set of already classified records. The proposed approach is based on the framework of the so-called Logical Analysis of Data (LAD), enriched with information obtained from statistical considerations on the data. A number of discrete optimization problems are solved in the different steps of the procedure, but their computational demand can be controlled. The accuracy of the proposed approach is compared with that of the standard LAD algorithm, support vector machines, and the label propagation algorithm on publicly available datasets from the UCI repository. Encouraging results are obtained and discussed.
Pages: 2349-2361
Number of pages: 13