New algorithms for trace-ratio problem with application to high-dimension and large-sample data dimensionality reduction

Cited by: 9
Authors
Shi, Wenya [1 ]
Wu, Gang [1 ]
Affiliations
[1] China Univ Min & Technol, Sch Math, Xuzhou 221116, Jiangsu, Peoples R China
Keywords
Dimensionality reduction; Trace-ratio problem; High-dimension and large-sample data; Large-scale discriminant analysis; Randomized singular value decomposition (RSVD); Inner-outer iterative algorithm; LINEAR DISCRIMINANT-ANALYSIS; FAST IMPLEMENTATION; OPTIMIZATION;
DOI
10.1007/s10994-020-05937-w
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from large-scale, high-dimensional data sets is a central concern in research areas including machine learning, visual recognition, and information retrieval. In many practical applications, such as image, video, audio, and text processing, one has to deal with high-dimension and large-sample data. The trace-ratio problem is a key problem in feature extraction and dimensionality reduction for coping with such high-dimensional spaces. However, it has long been believed that this problem has no closed-form solution, and that one must solve it with inner-outer iterative algorithms that are very time consuming. Consequently, efficient algorithms for high-dimension and large-sample trace-ratio problems are still lacking, especially for dense data. In this work, we present a closed-form solution to the trace-ratio problem and propose two algorithms that exploit it. Based on this formula and the randomized singular value decomposition (RSVD), we first propose a randomized algorithm for high-dimension and large-sample dense trace-ratio problems. For high-dimension and large-sample sparse trace-ratio problems, we then propose an algorithm based on the closed-form solution and the solution of consistent under-determined linear systems. Theoretical results are established to show the rationality and efficiency of the proposed methods. Numerical experiments on real-world data sets illustrate the superiority of the proposed algorithms over many state-of-the-art algorithms for high-dimension and large-sample dimensionality reduction problems.
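For orientation, the trace-ratio problem mentioned above is commonly stated as the constrained optimization (this is the standard formulation from the discriminant-analysis literature; the notation is illustrative, not taken verbatim from the paper):

$$\max_{V \in \mathbb{R}^{n \times d},\; V^{\top} V = I_d} \; \frac{\operatorname{tr}(V^{\top} A V)}{\operatorname{tr}(V^{\top} B V)},$$

where A is typically the between-class scatter matrix, B the within-class (or total) scatter matrix, and d the target reduced dimension. The "inner-outer iterative algorithms" referred to in the abstract alternate an eigenvalue subproblem (inner step) with an update of the scalar ratio (outer step). The sketch below is a minimal illustration of that classical fixed-point iteration, assuming dense symmetric inputs; it is not the authors' closed-form or RSVD-based method, and the matrices and tolerance are placeholders.

```python
import numpy as np

def trace_ratio_fixed_point(A, B, d, tol=1e-8, max_iter=100):
    """Classical inner-outer (fixed-point) iteration for the trace-ratio problem:
    maximize tr(V^T A V) / tr(V^T B V) over V with d orthonormal columns.
    A and B are symmetric n x n matrices (e.g., scatter matrices), B positive definite."""
    rho = 0.0  # current estimate of the optimal trace ratio
    for _ in range(max_iter):
        # Inner step: columns of V span the d largest eigenvectors of A - rho * B.
        eigvals, eigvecs = np.linalg.eigh(A - rho * B)
        V = eigvecs[:, np.argsort(eigvals)[::-1][:d]]
        # Outer step: update the ratio using the new subspace.
        rho_new = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        if abs(rho_new - rho) < tol:
            return V, rho_new
        rho = rho_new
    return V, rho

# Toy usage with synthetic scatter-like matrices (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
Xb, Xw = rng.standard_normal((200, 50)), rng.standard_normal((200, 50))
A = Xb.T @ Xb                # plays the role of a between-class scatter matrix
B = Xw.T @ Xw + np.eye(50)   # regularized within-class scatter, positive definite
V, rho = trace_ratio_fixed_point(A, B, d=5)
print("approximate optimal trace ratio:", rho)
```

By contrast, the paper's algorithms avoid this iteration altogether, combining a closed-form solution with the randomized SVD in the dense case and with consistent under-determined linear systems in the sparse case, as described in the abstract above.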
Pages: 3889-3916
Number of pages: 28