A Supervised Learning Model for High-Dimensional and Large-Scale Data

Cited by: 29
Authors
Peng, Chong [1]
Cheng, Jie [2]
Cheng, Qiang [1]
Affiliations
[1] Southern Illinois Univ, Dept Comp Sci, Carbondale, IL 62901 USA
[2] Univ Hawaii, Dept Comp Sci & Engn, Hilo, HI 96720 USA
Funding
National Science Foundation (USA)
Keywords
Discriminative regression; supervised learning; classification; high dimension; large-scale data; Newton method
DOI
10.1145/2972957
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a new supervised learning model using a discriminative regression approach. The model estimates a regression vector that represents the similarity between a test example and the training examples while seamlessly integrating class information into the similarity estimation. This distinguishes our model from usual regression models and locally linear embedding approaches, rendering our method suitable for supervised learning problems in high-dimensional settings. Our model is easily extensible to account for nonlinear relationships and is applicable to general data, including both high- and low-dimensional data. The objective function of the model is convex, and two optimization algorithms are provided for it. These two optimization approaches induce two scalable solvers with mathematically provable linear time complexity. Experimental results verify the effectiveness of the proposed method on various kinds of data. For example, compared with several widely used classifiers, our method shows comparable performance on low-dimensional data and superior performance on high-dimensional data; the linear solvers also obtain promising performance on large-scale classification.
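The abstract does not state the model's objective function, so the following is only a rough illustration of the general idea it describes: regress a test example onto the training examples, then use class labels when reading the resulting weights. The sketch swaps in a generic ridge-regularized least-squares fit followed by a class-wise residual comparison. It is not the paper's discriminative regression model, and the names `fit_regression_vector`, `classify`, and the parameter `lam` are hypothetical.

```python
import numpy as np


def fit_regression_vector(X_train, x_test, lam=0.1):
    """Ridge fit of x_test as a weighted combination of training examples.

    Solves min_w ||x_test - X_train.T @ w||^2 + lam * ||w||^2 in closed form.
    X_train has shape (n_samples, n_features); w has one weight per sample.
    NOTE: this generic ridge objective is an assumption, not the paper's.
    """
    n_samples = X_train.shape[0]
    gram = X_train @ X_train.T  # (n_samples, n_samples) similarity matrix
    rhs = X_train @ x_test      # similarity of each training example to x_test
    return np.linalg.solve(gram + lam * np.eye(n_samples), rhs)


def classify(X_train, y_train, x_test, lam=0.1):
    """Assign x_test to the class whose examples reconstruct it best."""
    w = fit_regression_vector(X_train, x_test, lam)
    classes = np.unique(y_train)
    residuals = []
    for c in classes:
        mask = y_train == c
        # Reconstruction using only the weights of class-c training examples.
        recon = X_train[mask].T @ w[mask]
        residuals.append(np.linalg.norm(x_test - recon))
    return classes[int(np.argmin(residuals))]


if __name__ == "__main__":
    # Toy data: class 0 lies near the first axis, class 1 near the second.
    X = np.array([[1.0, 0.0, 0.1],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.1],
                  [0.1, 0.9, 0.0]])
    y = np.array([0, 0, 1, 1])
    print(classify(X, y, np.array([1.0, 0.05, 0.0])))
```

Note that this sketch applies the class labels only after the regression vector has been estimated, whereas the abstract says the proposed model integrates class information into the similarity estimation itself. The closed-form solve here also costs cubic time in the number of training examples, so it says nothing about the linear-time solvers the abstract reports.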
Pages: 23