Minimum Description Length Penalization for Group and Multi-Task Sparse Learning

Cited by: 0
Authors
Dhillon, Paramveer S. [1 ]
Foster, Dean P. [2 ]
Ungar, Lyle H. [1 ]
Affiliations
[1] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Stat, Wharton Sch, Philadelphia, PA 19104 USA
Keywords
feature selection; minimum description length principle; multi-task learning; group lasso; variable selection; model selection; consistency; regression
DOI
Not available
CLC classification
TP [automation technology; computer technology]
Discipline code
0812
Abstract
We propose MIC (Multiple Inclusion Criterion), a framework for learning sparse models based on the information-theoretic Minimum Description Length (MDL) principle. MIC provides an elegant way of incorporating arbitrary sparsity patterns in the feature space by using two-part MDL coding schemes. We present MIC-based models for the problems of grouped feature selection (MIC-GROUP) and multi-task feature selection (MIC-MULTI). MIC-GROUP assumes that the features are divided into groups and induces two-level sparsity: it selects a subset of the feature groups and also selects features within each selected group. MIC-MULTI applies when there are multiple related tasks that share the same set of potentially predictive features. It likewise induces two-level sparsity, selecting a subset of the features and then selecting the tasks to which each feature should be added. Lastly, we propose a model, TRANSFEAT, that can be used to transfer knowledge from a set of previously learned tasks to a new task that is expected to share similar features. All three methods are designed to select a small set of predictive features from a large pool of candidates. We demonstrate the effectiveness of our approach with experimental results on data from genomics and from word sense disambiguation problems.
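The two-part MDL idea underlying MIC can be illustrated with a greedy forward-selection sketch for linear regression: a feature is added only if the bits it costs to describe the enlarged model are repaid by a shorter description of the residuals. This is a minimal illustration under assumed coding costs (log2(p) bits to name a feature, roughly 0.5*log2(n) bits per coefficient), not the paper's exact MIC coding scheme; the function names are hypothetical.

```python
import numpy as np

def description_length(X, y, selected, n_candidates):
    """Two-part MDL score: bits for the model plus bits for the data
    given the model (illustrative coding scheme, not the paper's MIC code)."""
    n = len(y)
    if selected:
        Xs = X[:, selected]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = np.sum((y - Xs @ beta) ** 2)
    else:
        rss = np.sum((y - y.mean()) ** 2)
    # data cost: negative log-likelihood of Gaussian residuals, in bits
    data_bits = 0.5 * n * np.log2(rss / n + 1e-12)
    # model cost: log2(p) bits to name each feature, ~0.5*log2(n) per coefficient
    model_bits = len(selected) * (np.log2(n_candidates) + 0.5 * np.log2(n))
    return data_bits + model_bits

def mdl_forward_select(X, y):
    """Greedy forward selection: add a feature only if it lowers the
    total description length."""
    n, p = X.shape
    selected = []
    best = description_length(X, y, selected, p)
    improved = True
    while improved:
        improved = False
        for j in range(p):
            if j in selected:
                continue
            score = description_length(X, y, selected + [j], p)
            if score < best:
                best, best_j, improved = score, j, True
        if improved:
            selected.append(best_j)
    return selected
```

Because each added feature must pay a fixed model-coding cost, weakly correlated noise features are rejected automatically, with no tuning parameter to set; MIC extends this idea with joint codes over groups of features or tasks.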
Pages: 525-564
Page count: 40