Logical Analysis of Data as a tool for the analysis of Probabilistic Discrete Choice Behavior

Cited by: 12
Authors
Bruni, Renato [1]
Bianchi, Gianpiero [2]
Dolente, Cosimo [3]
Leporelli, Claudio [1]
Affiliations
[1] Sapienza Univ, Dept Comp Control & Management Engn, Rome, Italy
[2] Istat, Methodol & Stat Proc Design, Rome, Italy
[3] Fdn Ugo Bordoni, Rome, Italy
Keywords
Classification algorithms; Rule learning; Socio-economic analyses; Data analytics; Digital divide; Learning algorithm; Discretization; Determinants; Framework; Selection; Patterns
DOI
10.1016/j.cor.2018.04.014
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers who face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since in this task the number of patterns generated may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of the analysis of the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics. © 2018 Elsevier Ltd. All rights reserved.
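
To make the notion of a pattern concrete, here is a minimal Python sketch of the general idea, not the authors' method. It assumes already-binarized attributes and treats a pattern as a conjunction of literals that covers at least one positive observation and no negative one. The paper generates patterns by discrete optimization; this sketch swaps in brute-force enumeration and a greedy set cover as simple stand-ins. The toy data, the column meanings, and the function names `covers`, `positive_patterns`, and `greedy_select` are all hypothetical.

```python
from itertools import combinations, product


def covers(pattern, row):
    """A pattern is a conjunction of literals: {attribute index: required 0/1 value}."""
    return all(row[i] == v for i, v in pattern.items())


def positive_patterns(pos, neg, max_degree=2):
    """Enumerate conjunctions of up to max_degree literals that cover
    at least one positive row and no negative row (brute force, toy scale only)."""
    n = len(pos[0])
    found = []
    for d in range(1, max_degree + 1):
        for attrs in combinations(range(n), d):
            for vals in product((0, 1), repeat=d):
                p = dict(zip(attrs, vals))
                hits_pos = any(covers(p, r) for r in pos)
                hits_neg = any(covers(p, r) for r in neg)
                if hits_pos and not hits_neg:
                    found.append(p)
    return found


def greedy_select(patterns, pos):
    """Greedy set cover: repeatedly pick the pattern that covers
    the most positive rows not yet covered."""
    uncovered = set(range(len(pos)))
    chosen = []
    while uncovered:
        best = max(patterns, key=lambda p: sum(covers(p, pos[i]) for i in uncovered))
        gained = {i for i in uncovered if covers(best, pos[i])}
        if not gained:  # remaining rows cannot be covered by any pattern
            break
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy data; columns stand for (young, urban, graduate).
pos = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]  # e.g. Internet users
neg = [(0, 0, 0), (0, 1, 0), (1, 0, 0)]  # e.g. non-users

patterns = positive_patterns(pos, neg)
selected = greedy_select(patterns, pos)
```

On this toy data many of the enumerated patterns overlap heavily, which is the redundancy the abstract describes; the greedy step keeps only a small subset that still covers every positive observation.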
Pages: 191-201
Number of pages: 11