Multi-target support vector regression via correlation regressor chains

Cited by: 103
Authors
Melki, Gabriella [1]
Cano, Alberto [1]
Kecman, Vojislav [1]
Ventura, Sebastian [2, 3]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
[2] Univ Cordoba, Dept Comp Sci & Numer Anal, Cordoba, Spain
[3] King Abdulaziz Univ, Dept Comp Sci, Jeddah, Saudi Arabia
Keywords
Multi-target regression; Multi-output regression; Regressor chains; Support vector regressor; MULTI-LABEL; MODEL; INTELLIGENCE; METHODOLOGY; ALGORITHM; SAMPLES; TESTS
DOI
10.1016/j.ins.2017.06.017
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multi-target regression is a challenging task that consists of creating predictive models for problems with multiple continuous target outputs. Despite the increasing attention on multi-label classification, there are fewer studies concerning multi-target (MT) regression. The current leading MT models are based on ensembles of regressor chains, where random, differently ordered chains of the target variables are created and used to build separate regression models, using the previous target predictions in the chain. The challenges of building MT models stem from trying to capture and exploit possible correlations among the target variables during training. This paper presents three multi-target support vector regression models. The first involves building independent, single-target Support Vector Regression (SVR) models for each output variable. The second builds an ensemble of random chains using the first method as a base model. The third calculates the targets' correlations and forms a maximum correlation chain, which is used to build a single chained support vector regression model, improving the model's prediction performance while reducing the computational complexity. The experimental study evaluates and compares the performance of the three approaches with seven other state-of-the-art multi-target regressors on 24 multi-target datasets. The experimental results are then analyzed using nonparametric statistical tests. The results show that the maximum correlation SVR approach improves on the performance of ensembles of random chains. (C) 2017 Elsevier Inc. All rights reserved.
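The chain-building idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact ordering rule, so ranking targets by their summed absolute Pearson correlation with the other targets is an assumption, as are the function names `pearson`, `correlation_chain_order`, and `chained_inputs`. The sketch also appends the observed values of earlier targets during training, which is a common simplification; a fitted chain would substitute its own predictions at prediction time. Any single-target regressor (for example an SVR) could be trained on each link that `chained_inputs` yields.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def correlation_chain_order(Y):
    """Order target indices by decreasing summed |correlation| with the other targets.

    Y is a list of rows, one list of target values per sample.
    The ranking rule is an assumption made for illustration.
    """
    cols = list(zip(*Y))
    m = len(cols)
    scores = [
        sum(abs(pearson(cols[i], cols[j])) for j in range(m) if j != i)
        for i in range(m)
    ]
    return sorted(range(m), key=lambda i: -scores[i])


def chained_inputs(X, Y, order):
    """Yield (target_index, augmented_inputs, target_values) for each chain link.

    Each link sees the original features plus the values of all targets
    that come earlier in the chain.
    """
    for pos, target in enumerate(order):
        earlier = order[:pos]
        augmented = [list(x) + [y[p] for p in earlier] for x, y in zip(X, Y)]
        yield target, augmented, [y[target] for y in Y]
```

A single chain ordered this way needs one model per target, whereas an ensemble of random chains needs one model per target per chain, which is where the reduction in computational cost mentioned in the abstract would come from.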
Pages: 53-69
Number of pages: 17