Regression trees for multivalued numerical response variables

Cited by: 16
Authors
D'Ambrosio, Antonio [1]
Aria, Massimo [1]
Iorio, Carmela [2]
Siciliano, Roberta [2]
Affiliations
[1] Univ Naples Federico II, Dept Econ & Stat, Naples, Italy
[2] Univ Naples Federico II, Dept Ind Engn, Naples, Italy
Keywords
Regression trees; Multivalued variables; Modal variables; Earth mover distance; Mallows distance; MATCHING NOISE;
DOI
10.1016/j.eswa.2016.10.021
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the framework of regression trees, this paper provides a recursive partitioning methodology for a non-standard response variable. Specifically, it considers either a multivalued numerical response or a modal response of the histogram type. Such data are known as symbolic data, whose special cases include classical data, imprecise data, conjunctive data, and fuzzy data. Instead of pre-processing the data so that standard regression tree methodology can be applied, the paper's main contribution is a definition of the impurity measure and of the splitting criterion that allows a regression tree to be built for a multivalued numerical response variable. We analyze and evaluate the performance of our proposal using simulated data as well as a real-world case study. (C) 2016 Elsevier Ltd. All rights reserved.
Pages: 21-28
Number of pages: 8