A method of dimensionality reduction by selection of components in principal component analysis for text classification

Cited by: 5
Authors
Zhang, Yangwu [1 ,2 ]
Li, Guohe [1 ,3 ]
Zong, Heng [2 ]
Affiliations
[1] China Univ Petr, Coll Geophys & Informat Engn, Beijing, Peoples R China
[2] China Univ Polit Sci & Law, Dept Sci & Technol Teaching, Beijing, Peoples R China
[3] China Univ Petr, Beijing Key Lab Data Min Petr Data, Beijing, Peoples R China
Keywords
Principal components analysis; Dimensionality reduction; Text classification
DOI
10.2298/FIL1805499Z
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Dimensionality reduction, which includes both feature extraction and feature selection, is a key step in text classification. In this paper, we propose a hybrid dimensionality reduction method that combines principal component analysis (PCA) with a selection of its components. PCA is a feature extraction method. Not all principal components contribute to classification, because the PCA objective is to maximize variance rather than class separation; PCA is not a form of discriminant analysis (see, e.g., Jolliffe, 2002). We therefore present a component selection function that returns the components useful for classification, judged by performance indicators measured on different subsets of the components. Compared with traditional feature selection methods, SVM classifiers trained on the selected components show improved classification performance and reduced computational overhead.
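The idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the selection function, so everything below is an assumption made for illustration: the greedy forward search, the helper names (`pca_fit`, `centroid_accuracy`, `select_components`), the synthetic three-feature data used in place of text features, and a nearest-centroid classifier swapped in for the SVM so the example needs only NumPy. It is not the authors' implementation.

```python
import numpy as np

def pca_fit(X, k):
    """Return (mean, W): W holds the top-k eigenvectors as columns, by descending variance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, order]

def centroid_accuracy(Z_tr, y_tr, Z_va, y_va):
    """Held-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    classes = np.unique(y_tr)
    centroids = np.stack([Z_tr[y_tr == c].mean(axis=0) for c in classes])
    dists = ((Z_va[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_va).mean())

def select_components(Z_tr, y_tr, Z_va, y_va):
    """Hypothetical greedy forward selection: add the component that most
    improves held-out accuracy, stop when no candidate improves it."""
    selected, best = [], 0.0
    remaining = list(range(Z_tr.shape[1]))
    while remaining:
        scored = [(centroid_accuracy(Z_tr[:, selected + [j]], y_tr,
                                     Z_va[:, selected + [j]], y_va), j)
                  for j in remaining]
        acc, j = max(scored, key=lambda s: (s[0], -s[1]))
        if acc <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = acc
    return selected, best

# Synthetic data (assumption): the class signal lives in a LOW-variance feature,
# so the leading principal components carry no class information.
rng = np.random.default_rng(0)
n = 400
y = np.repeat([0, 1], n // 2)
X = np.column_stack([
    rng.normal(0, 10, n),                   # high variance, no class signal
    (2 * y - 1) + rng.normal(0, 0.1, n),    # low variance, separates the classes
    rng.normal(0, 3, n),                    # medium variance, no class signal
])
idx = rng.permutation(n)
tr, va = idx[:300], idx[300:]

mean, W = pca_fit(X[tr], k=3)
Z_tr, Z_va = (X[tr] - mean) @ W, (X[va] - mean) @ W
selected, best_acc = select_components(Z_tr, y[tr], Z_va, y[va])
print("selected components:", selected, "held-out accuracy:", best_acc)
```

In this toy setup the component with the smallest variance is the one that separates the classes, which is why ranking components by explained variance alone can discard the useful ones. A real text pipeline would start from a term-document matrix and would likely use a separate test split to report final performance.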
Pages: 1499-1506
Page count: 8
Related papers
50 in total
  • [41] Feature Extraction based on Principal Component Analysis for Text Categorization
    Lhazmir, Safae
    El Moudden, Ismail
    Kobbane, Abdellatif
    2017 INTERNATIONAL CONFERENCE ON PERFORMANCE EVALUATION AND MODELING IN WIRED AND WIRELESS NETWORKS (PEMWN), 2017
  • [42] Research on the Method of Nonlinear Dimensionality Reduction for the Text of Laws and Regulations of construction
    Su, Bian-ping
    Wang, Yi-ping
    Zhi, Hui
    2009 5TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-8, 2009, : 5371 - +
  • [43] A novel multivariate filter method for feature selection in text classification problems
    Labani, Mahdieh
    Moradi, Parham
    Ahmadizar, Fardin
    Jalili, Mahdi
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2018, 70 : 25 - 37
  • [44] A new feature selection method for handling redundant information in text classification
    You-wei Wang
    Li-zhou Feng
    Frontiers of Information Technology & Electronic Engineering, 2018, 19 : 221 - 234
  • [45] Multicriteria classification method for dimensionality reduction adapted to hyperspectral images
    Khoder, Mahdi
    Kashana, Serge
    Khoder, Jihan
    Younes, Rafic
    JOURNAL OF APPLIED REMOTE SENSING, 2017, 11
  • [46] An incremental dimensionality reduction method on discriminant information for pattern classification
    Hu, Xiaoqin
    Yang, Zhixia
    Jing, Ling
    PATTERN RECOGNITION LETTERS, 2009, 30 (15) : 1416 - 1423
  • [47] A new feature selection method for handling redundant information in text classification
    Wang, You-wei
    Feng, Li-zhou
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2018, 19 (02) : 221 - 234
  • [48] A MULTIVARIATE ADAPTIVE STOCHASTIC SEARCH METHOD FOR DIMENSIONALITY REDUCTION IN CLASSIFICATION
    Tian, Tian Siva
    James, Gareth M.
    Wilcox, Rand R.
    ANNALS OF APPLIED STATISTICS, 2010, 4 (01) : 340 - 365
  • [49] An optimal feature selection method for text classification through redundancy and synergy analysis
    Lazhar Farek
    Amira Benaidja
    Multimedia Tools and Applications, 2025, 84 (16) : 16397 - 16423
  • [50] Curvilinear Component Analysis for nonlinear dimensionality reduction of hyperspectral images
    Lennon, M
    Mercier, G
    Mouchot, MC
    Hubert-Moy, L
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING VII, 2002, 4541 : 157 - 168