An Efficient Dimensionality Reduction Approach for Small-sample Size and High-dimensional Data Modeling

Cited by: 7
Authors
Qiu, Xintao [1]
Fu, Dongmei [1]
Fu, Zhenduo [2]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing, Peoples R China
[2] Endress Hauser Shanghai Automat Equipment Co Ltd, Res & Dev Dept, Shanghai, Peoples R China
Keywords
feature selection; feature extraction; dimensionality reduction; small-sample data; atmospheric corrosion prediction
DOI
10.4304/jcp.9.3.576-580
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
As massive multidimensional data are generated in a wide range of emerging applications, this paper introduces two dimensionality reduction methods for processing and modeling small-sample, high-dimensional data. For feature selection, the support vector machine (SVM) is combined with recursive feature elimination (RFE) to give the SVM-RFE algorithm. For feature extraction, the higher-order singular value decomposition (HOSVD) is applied after the data have been organized into a higher-order tensor. Validation on simulation experiment data shows that the proposed feature selection and feature extraction methods can be applied effectively to analyzing and modeling atmospheric corrosion data. The feature selection method is designed to ensure that the remaining feature subset is optimal; the feature extraction method preserves the original structure, discriminative information, and integrity of the data. Together, the two methods form a complete dimensionality reduction solution for high-dimensional, small-sample data, and the solution has been implemented in code.
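The two steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`train_linear_svm`, `svm_rfe`, `hosvd`, `reconstruct`), the hyperparameters, and the use of a simple subgradient-descent linear SVM without a bias term are all assumptions made for illustration. The ranking criterion (drop the feature with the smallest squared weight) follows the standard SVM-RFE idea, and the HOSVD here is the usual truncated form built from SVDs of the mode unfoldings.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Assumed stand-in solver: full-batch subgradient descent on the
    L2-regularised hinge loss, labels in {-1, +1}, no bias term."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        active = y * (X @ w) < 1                 # samples inside the margin
        grad = lam * w - X[active].T @ y[active] / n
        w -= lr * grad
    return w

def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, then drop the feature
    with the smallest squared weight, until n_keep features remain."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        w = train_linear_svm(X[:, remaining], y)
        del remaining[int(np.argmin(w ** 2))]
    return remaining                              # original column indices

def unfold(T, mode):
    """Mode-n unfolding: rows are indexed by the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = np.tensordot(M, T, axes=(1, mode))      # contracted axis comes out first
    return np.moveaxis(out, 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])                  # leading r left singular vectors
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)      # project onto each subspace
    return core, factors

def reconstruct(core, factors):
    """Map the core tensor back to the original space."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 6))                  # 20 samples, 6 features (toy data)
    y = np.where(X[:, 2] > 0, 1.0, -1.0)
    print("kept features:", svm_rfe(X, y, n_keep=2))

    T = rng.normal(size=(4, 5, 6))                # toy third-order tensor
    core, factors = hosvd(T, ranks=(2, 3, 3))
    print("core shape:", core.shape)
```

In this sketch the core tensor returned by `hosvd` plays the role of the extracted low-dimensional features, while `svm_rfe` returns the indices of the columns that survive elimination. How the paper arranges the corrosion data into a tensor is not stated in the abstract, so the toy tensor above is purely illustrative.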
Pages: 576-580
Page count: 5