Predictive learning with structured (grouped) data

Cited by: 23
Authors
Liang, Lichen [1]
Cai, Feng [1]
Cherkassky, Vladimir [1]
Affiliation
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
Funding
National Science Foundation (USA);
Keywords
Heterogeneous data; Learning with structured data; Model selection; Multi-task learning; SVM; SVM-Plus;
DOI
10.1016/j.neunet.2009.06.030
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many applications of machine learning involve sparse and heterogeneous data. For example, estimation of diagnostic models using patients' data from clinical studies requires effective integration of genetic, clinical and demographic data. Typically, all heterogeneous inputs are encoded and mapped onto a single feature vector, which is then used for estimating a classifier. This approach, known as standard inductive learning, is used in most application studies. Recently, several new learning methodologies have emerged. For instance, when training data can be naturally separated into several groups (that is, the data are structured), we can view model estimation for each group as a separate task, leading to a Multi-Task Learning framework. Similarly, a setting where the training data are structured, but the objective is to estimate a single predictive model (for all groups), leads to the Learning with Structured Data and SVM+ methodology recently proposed by Vapnik (2006, Empirical inference science: Afterword of 2006, Springer). This paper describes a biomedical application of these new approaches to modeling heterogeneous data, using several medical data sets. The characteristics of group variables are analyzed. Our comparisons demonstrate the advantages and limitations of these new approaches, relative to standard inductive SVM classifiers. (C) 2009 Elsevier Ltd. All rights reserved.
Pages: 766-773
Number of pages: 8