A novel autoencoder approach to feature extraction with linear separability for high-dimensional data

Cited by: 0
Authors
Zheng J. [1]
Qu H. [1, 2]
Li Z. [1]
Li L. [1]
Tang X. [2]
Guo F. [2]
Affiliations
[1] College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing
[2] College of Automation, Chongqing University of Posts and Telecommunications, Chongqing
Funding
National Natural Science Foundation of China
Keywords
Autoencoder; Distance metric; Feature extraction
DOI
10.7717/PEERJ-CS.1061
Abstract
Feature extraction often relies on sufficient information in the input data. However, data distributed in a high-dimensional space are too sparse to provide that information. High dimensionality also makes it harder to search for features scattered across subspaces. Feature extraction from high-dimensional data is therefore a difficult task. To address this issue, this article proposes a novel autoencoder method using a Mahalanobis distance metric of rescaling transformation. The key idea is that applying the Mahalanobis distance metric of rescaling transformation reduces the difference between the reconstructed distribution and the original distribution, which improves the feature extraction ability of the autoencoder. Results show that the proposed approach outperforms state-of-the-art methods in terms of both the accuracy of feature extraction and the linear separability of the extracted features. We indicate that distance metric-based methods are more suitable than feature selection-based methods for extracting linearly separable features from high-dimensional data. In a high-dimensional space, evaluating feature similarity is relatively easier than evaluating feature importance, so distance metric methods, which evaluate feature similarity, have an advantage over feature selection methods, which assess feature importance. Evaluating feature importance, however, is more computationally efficient than evaluating feature similarity. © 2022 Zheng et al.
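The abstract does not give the loss function, so the following is only an illustrative sketch of how a Mahalanobis distance could be used to compare inputs with their reconstructions. It is not the authors' implementation. It uses the standard definition d(x, y)^2 = (x - y)^T S^{-1} (x - y), with S estimated as the sample covariance of the inputs. The function name `mahalanobis_reconstruction_loss`, the `eps` regularizer, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def mahalanobis_reconstruction_loss(x, x_hat, eps=1e-6):
    """Mean squared Mahalanobis distance between rows of x and x_hat.

    The covariance is estimated from x. eps * I is added (an assumption of
    this sketch) so the matrix stays invertible when it is near-singular.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    cov = np.cov(x, rowvar=False) + eps * np.eye(x.shape[1])
    diff = x - x_hat
    # Solve cov @ z = diff.T rather than forming the explicit inverse.
    z = np.linalg.solve(cov, diff.T)           # shape (d, n)
    sq_dist = np.einsum("nd,dn->n", diff, z)   # per-sample squared distance
    return float(sq_dist.mean())

# Synthetic inputs whose features live on very different scales.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5)) * np.array([1.0, 10.0, 0.1, 5.0, 2.0])
# Stand-in for an autoencoder's output: the input plus small noise.
x_hat = x + rng.normal(scale=0.1, size=x.shape)

loss = mahalanobis_reconstruction_loss(x, x_hat)

# Rescaling each feature barely changes the loss, unlike a plain
# Euclidean reconstruction error.
scale = np.array([2.0, 0.5, 3.0, 1.0, 4.0])
loss_rescaled = mahalanobis_reconstruction_loss(x * scale, x_hat * scale)
```

Because the distance is normalized by the covariance, features with large variance do not dominate the reconstruction error. This is one plausible reading of why such a metric might help in high-dimensional settings, though the paper itself should be consulted for the actual formulation.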