New algorithms for trace-ratio problem with application to high-dimension and large-sample data dimensionality reduction

Cited by: 9
Authors
Shi, Wenya [1 ]
Wu, Gang [1 ]
Affiliations
[1] China Univ Min & Technol, Sch Math, Xuzhou 221116, Jiangsu, Peoples R China
Keywords
Dimensionality reduction; Trace-ratio problem; High-dimension and large-sample data; Large-scale discriminant analysis; Randomized singular value decomposition (RSVD); Inner-outer iterative algorithm; LINEAR DISCRIMINANT-ANALYSIS; FAST IMPLEMENTATION; OPTIMIZATION;
DOI
10.1007/s10994-020-05937-w
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from large-scale, high-dimensional data sets is a central concern in research areas including machine learning, visual recognition, and information retrieval. In many practical applications, such as image, video, audio, and text processing, one has to deal with high-dimension and large-sample data. The trace-ratio problem is a key problem in feature extraction and dimensionality reduction for coping with such high-dimensional spaces. However, it has long been believed that this problem has no closed-form solution, and that one must solve it with inner-outer iterative algorithms that are very time consuming. Consequently, efficient algorithms for high-dimension and large-sample trace-ratio problems are still lacking, especially for dense data. In this work, we present a closed-form solution to the trace-ratio problem and propose two algorithms that exploit it. Based on this formula and the randomized singular value decomposition (RSVD), we first propose a randomized algorithm for high-dimension and large-sample dense trace-ratio problems. For high-dimension and large-sample sparse trace-ratio problems, we then propose an algorithm based on the closed-form solution and the solution of consistent under-determined linear systems. Theoretical results are established to show the rationality and efficiency of the proposed methods. Numerical experiments on real-world data sets illustrate the superiority of the proposed algorithms over many state-of-the-art algorithms for high-dimension and large-sample dimensionality reduction problems.
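For orientation, the trace-ratio problem mentioned above is commonly stated as the constrained optimization (this is the standard formulation from the discriminant-analysis literature; the notation is illustrative, not taken verbatim from the paper):

$$\max_{V \in \mathbb{R}^{n \times d},\; V^{\top} V = I_d} \; \frac{\operatorname{tr}(V^{\top} A V)}{\operatorname{tr}(V^{\top} B V)},$$

where A is typically the between-class scatter matrix, B the within-class (or total) scatter matrix, and d the target reduced dimension. The "inner-outer iterative algorithms" referred to in the abstract alternate an eigenvalue subproblem (inner step) with an update of the scalar ratio (outer step). The sketch below is a minimal illustration of that classical fixed-point iteration, assuming dense symmetric inputs; it is not the authors' closed-form or RSVD-based method, and the matrices and tolerance are placeholders.

```python
import numpy as np

def trace_ratio_fixed_point(A, B, d, tol=1e-8, max_iter=100):
    """Classical inner-outer (fixed-point) iteration for the trace-ratio problem:
    maximize tr(V^T A V) / tr(V^T B V) over V with d orthonormal columns.
    A and B are symmetric n x n matrices (e.g., scatter matrices), B positive definite."""
    rho = 0.0  # current estimate of the optimal trace ratio
    for _ in range(max_iter):
        # Inner step: columns of V span the d largest eigenvectors of A - rho * B.
        eigvals, eigvecs = np.linalg.eigh(A - rho * B)
        V = eigvecs[:, np.argsort(eigvals)[::-1][:d]]
        # Outer step: update the ratio using the new subspace.
        rho_new = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        if abs(rho_new - rho) < tol:
            return V, rho_new
        rho = rho_new
    return V, rho

# Toy usage with synthetic scatter-like matrices (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
Xb, Xw = rng.standard_normal((200, 50)), rng.standard_normal((200, 50))
A = Xb.T @ Xb                # plays the role of a between-class scatter matrix
B = Xw.T @ Xw + np.eye(50)   # regularized within-class scatter, positive definite
V, rho = trace_ratio_fixed_point(A, B, d=5)
print("approximate optimal trace ratio:", rho)
```

By contrast, the paper's algorithms avoid this iteration altogether, combining a closed-form solution with the randomized SVD in the dense case and with consistent under-determined linear systems in the sparse case, as described in the abstract above.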
Pages: 3889-3916
Number of pages: 28