New algorithms for trace-ratio problem with application to high-dimension and large-sample data dimensionality reduction

被引:9
作者
Shi, Wenya [1 ]
Wu, Gang [1 ]
机构
[1] China Univ Min & Technol, Sch Math, Xuzhou 221116, Jiangsu, Peoples R China
关键词
Dimensionality reduction; Trace-ratio problem; High-dimension and large-sample data; Large-scale discriminant analysis; Randomized singular value decomposition (RSVD); Inner-outer iterative algorithm; LINEAR DISCRIMINANT-ANALYSIS; FAST IMPLEMENTATION; OPTIMIZATION;
D O I
10.1007/s10994-020-05937-w
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Learning large-scale data sets with high dimensionality is a main concern in research areas including machine learning, visual recognition, information retrieval, to name a few. In many practical uses such as images, video, audio, and text processing, we have to face with high-dimension and large-sample data problems. The trace-ratio problem is a key problem for feature extraction and dimensionality reduction to circumvent the high dimensional space. However, it has been long believed that this problem has no closed-form solution, and one has to solve it by using some inner-outer iterative algorithms that are very time consuming. Therefore, efficient algorithms for high-dimension and large-sample trace-ratio problems are still lacking, especially for dense data problems. In this work, we present a closed-form solution for the trace-ratio problem, and propose two algorithms to solve it. Based on the formula and the randomized singular value decomposition, we first propose a randomized algorithm for solving high-dimension and large-sample dense trace-ratio problems. For high-dimension and large-sample sparse trace-ratio problems, we then propose an algorithm based on the closed-form solution and solving some consistent under-determined linear systems. Theoretical results are established to show the rationality and efficiency of the proposed methods. Numerical experiments are performed on some real-world data sets, which illustrate the superiority of the proposed algorithms over many state-of-the-art algorithms for high-dimension and large-sample dimensionality reduction problems.
引用
收藏
页码:3889 / 3916
页数:28
相关论文
共 47 条
  • [41] A theoretical contribution to the fast implementation of null linear discriminant analysis with random matrix multiplication
    Wu, Gang
    Feng, Ting-Ting
    [J]. NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2015, 22 (06) : 1180 - 1188
  • [42] Fast Fisher discriminant analysis with randomized algorithms
    Ye, Haishan
    Li, Yujun
    Chen, Cheng
    Zhang, Zhihua
    [J]. PATTERN RECOGNITION, 2017, 72 : 82 - 92
  • [43] FAST ALGORITHMS FOR THE GENERALIZED FOLEY-SAMMON DISCRIMINANT ANALYSIS
    Zhang, Lei-Hong
    Liao, Li-Zhi
    Ng, Michael K.
    [J]. SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2010, 31 (04) : 1584 - 1605
  • [44] INCREMENTAL REGULARIZED LEAST SQUARES FOR DIMENSIONALITY REDUCTION OF LARGE-SCALE DATA
    Zhang, Xiaowei
    Cheng, Li
    Chu, Delin
    Liao, Li-Zhi
    Ng, Michael K.
    Tan, Roger C. E.
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2016, 38 (03) : B414 - B439
  • [45] Zhao M.X., 2012, P INT JOINT C NEUR N, P1
  • [46] Trace Ratio Linear Discriminant Analysis for Medical Diagnosis: A Case Study of Dementia
    Zhao, Mingbo
    Chan, Rosa H. M.
    Tang, Peng
    Chow, Tommy W. S.
    Wong, Savio W. H.
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2013, 20 (05) : 431 - 434
  • [47] A Rayleigh-Ritz style method for large-scale discriminant analysis
    Zhu, Lin
    Huang, De-Shuang
    [J]. PATTERN RECOGNITION, 2014, 47 (04) : 1698 - 1708