A method of dimensionality reduction by selection of components in principal component analysis for text classification

Cited by: 5
Authors
Zhang, Yangwu [1, 2]
Li, Guohe [1, 3]
Zong, Heng [2]
Affiliations
[1] China Univ Petr, Coll Geophys & Informat Engn, Beijing, Peoples R China
[2] China Univ Polit Sci & Law, Dept Sci & Technol Teaching, Beijing, Peoples R China
[3] China Univ Petr, Beijing Key Lab Data Min Petr Data, Beijing, Peoples R China
Keywords
Principal components analysis; Dimensionality reduction; Text classification;
DOI
10.2298/FIL1805499Z
Chinese Library Classification number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
Dimensionality reduction, which includes feature extraction and feature selection, is a key step in text classification. In this paper, we propose a hybrid method of dimensionality reduction that combines principal component analysis (PCA) with the selection of components. PCA is a method of feature extraction. Not all of the components it produces contribute to classification, because PCA ranks components by explained variance without reference to class labels and is therefore not a form of discriminant analysis (see, e.g., Jolliffe, 2002). In this context, we present a component selection function that returns the components useful for classification, judged by performance indicators measured on different subsets of the components. Compared to traditional methods of feature selection, SVM classifiers trained on the selected components show improved classification performance and a reduction in computational overhead.
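The idea of selecting components by their measured usefulness for classification can be illustrated with a minimal sketch. It is not the authors' selection function, which the abstract does not specify. It assumes greedy forward selection over component subsets, uses a nearest-centroid classifier as a dependency-free stand-in for the SVM, and runs on synthetic data in which the highest-variance direction is pure noise. All function names (`pca_fit`, `project`, `subset_accuracy`, `select_components`, `make_data`) are hypothetical.

```python
import random


def pca_fit(X, k, iters=100, seed=0):
    """Return (mean, components): top-k principal directions via power iteration."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mean[j] for j in range(d)] for row in X]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: variance captured along the unit vector v.
        eigval = sum(v[a] * cov[a][b] * v[b]
                     for a in range(d) for b in range(d))
        components.append(v)
        # Deflate so the next pass finds the next-largest direction.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components


def project(X, mean, components):
    """Map each row of X onto the principal directions."""
    return [[sum((row[j] - mean[j]) * c[j] for j in range(len(mean)))
             for c in components] for row in X]


def subset_accuracy(Z_train, y_train, Z_val, y_val, idx):
    """Validation accuracy of a nearest-centroid classifier (assumed stand-in
    for the SVM) that sees only the components listed in idx."""
    centroids = {}
    for label in set(y_train):
        rows = [z for z, y in zip(Z_train, y_train) if y == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    correct = 0
    for z, y in zip(Z_val, y_val):
        pred = min(centroids, key=lambda c: sum(
            (z[i] - m) ** 2 for i, m in zip(idx, centroids[c])))
        correct += pred == y
    return correct / len(y_val)


def select_components(Z_train, y_train, Z_val, y_val):
    """Greedy forward selection: add the component that most improves
    validation accuracy; stop when no candidate improves it."""
    selected, best = [], 0.0
    remaining = list(range(len(Z_train[0])))
    while remaining:
        scores = [(subset_accuracy(Z_train, y_train, Z_val, y_val,
                                   selected + [i]), i) for i in remaining]
        top_score, top_i = max(scores, key=lambda s: s[0])
        if top_score <= best:
            break
        best = top_score
        selected.append(top_i)
        remaining.remove(top_i)
    return selected, best


def make_data(n, seed):
    """Synthetic data: feature 0 is high-variance noise, feature 1 carries
    the class signal, feature 2 is low-variance noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        signal = (1.0 if label else -1.0) + rng.gauss(0, 0.2)
        X.append([rng.gauss(0, 10), signal, rng.gauss(0, 0.1)])
        y.append(label)
    return X, y


X, y = make_data(300, seed=1)
X_train, y_train, X_val, y_val = X[:200], y[:200], X[200:], y[200:]
mean, comps = pca_fit(X_train, k=3)
Z_train, Z_val = project(X_train, mean, comps), project(X_val, mean, comps)
selected, best = select_components(Z_train, y_train, Z_val, y_val)
print("selected component indices:", selected, "validation accuracy:", best)
```

In this toy setting the first principal component captures most of the variance but none of the class information, so a variance-ranked truncation would keep it while the selection step skips it. Replacing `subset_accuracy` with the cross-validated score of an SVM would bring the sketch closer to the setup described in the abstract.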
Pages: 1499-1506
Number of pages: 8