Effective Classification Using a Small Training Set Based on Discretization and Statistical Analysis

Cited by: 23
Authors
Bruni, Renato [1]
Bianchi, Gianpiero [2]
Affiliations
[1] Univ Roma La Sapienza, Dept Comp Control & Management Engn DIAG, I-00185 Rome, Italy
[2] Italian Natl Inst Stat Istat, Dept Integrat Qual Res & Prod Networks Dev DIQR, I-00173 Rome, Italy
Keywords
Classification algorithms; data mining; machine learning; discrete mathematics; optimization; logical analysis
DOI
10.1109/TKDE.2015.2416727
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work deals with the problem of producing a fast and accurate data classification, learning it from a possibly small set of records that are already classified. The proposed approach is based on the framework of the so-called Logical Analysis of Data (LAD), enriched with information obtained from statistical considerations on the data. A number of discrete optimization problems are solved in the different steps of the procedure, but their computational demand can be controlled. The accuracy of the proposed approach is compared to that of the standard LAD algorithm, of support vector machines, and of the label propagation algorithm on publicly available datasets from the UCI repository. Encouraging results are obtained and discussed.
Pages: 2349-2361
Number of pages: 13