Multi-label symbolic value partitioning through random walks

Cited by: 5
Authors
Wen, Liu-Ying [1]
Luo, Chao-Guang [1]
Wu, Wei-Zhi [2,3]
Min, Fan [1]
Affiliations
[1] Southwest Petr Univ, Sch Comp Sci, Chengdu 610500, Peoples R China
[2] Zhejiang Ocean Univ, Sch Math Phys & Informat Sci, Zhoushan 316022, Peoples R China
[3] Zhejiang Ocean Univ, Key Lab Oceanog Big Data Min & Applicat Zhejiang, Zhoushan 316022, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Clustering; Random walk; Symbolic value partition; Weighted graph; FEATURE-SELECTION; FEATURE-EXTRACTION; CLASSIFICATION; TRANSFORMATION
DOI
10.1016/j.neucom.2020.01.046
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection and symbolic value partitioning are effective knowledge reduction techniques in the field of data mining. A large body of feature selection methods has been proposed for multi-label data. By contrast, symbolic value partitioning for such data has not been studied. In this paper, we propose the multi-label symbolic value partitioning through random walks algorithm with two stages. In the first stage, an undirected weighted graph is constructed for each attribute. Each node corresponds to an attribute value and the weight of each edge corresponds to the similarity between two nodes. Similarity is defined based on the attribute value distribution for each label. In the second stage, a random walk algorithm is used to cluster attribute values. The average weight serves as the separation operator to sharpen the inter-cluster edges. We tested the new algorithm and seven popular feature selection algorithms on 13 datasets. The experimental results demonstrate the effectiveness of the proposed algorithm in reducing the data size and improving classification accuracy. (C) 2020 Elsevier B.V. All rights reserved.
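The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the similarity formula or the random-walk update, so the sketch assumes (1) similarity is 1 minus the mean absolute difference between two values' per-label frequencies, and (2) each round re-weights an edge by how alike the short random-walk visit profiles of its endpoints are, then cuts edges below the average weight. All function names, `steps`, and `max_rounds` are hypothetical.

```python
from collections import defaultdict

def label_distributions(values, labels):
    """For each attribute value, the fraction of its instances carrying each label."""
    rows = defaultdict(list)
    for v, y in zip(values, labels):
        rows[v].append(y)
    return {v: [sum(col) / len(ys) for col in zip(*ys)] for v, ys in rows.items()}

def similarity_graph(dist):
    """Stage 1: nodes are attribute values; the edge weight is an ASSUMED
    similarity, 1 - mean absolute difference of the label distributions."""
    nodes = sorted(dist)
    w = {}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            gap = sum(abs(x - y) for x, y in zip(dist[a], dist[b])) / len(dist[a])
            if gap < 1.0:  # keep only edges with positive weight
                w[(a, b)] = 1.0 - gap
    return nodes, w

def neighbours(nodes, w):
    """Symmetric adjacency map built from the undirected edge dictionary."""
    adj = {n: {} for n in nodes}
    for (a, b), x in w.items():
        adj[a][b] = x
        adj[b][a] = x
    return adj

def walk_profile(adj, start, steps):
    """Average visit distribution of a random walk over steps 0..steps."""
    p = {n: 0.0 for n in adj}
    p[start] = 1.0
    total = dict(p)
    for _ in range(steps):
        nxt = {n: 0.0 for n in adj}
        for a, mass in p.items():
            out = sum(adj[a].values())
            if out == 0:
                nxt[a] += mass  # isolated node: the walker stays put
            else:
                for b, x in adj[a].items():
                    nxt[b] += mass * x / out
        p = nxt
        for n in adj:
            total[n] += p[n]
    return {n: t / (steps + 1) for n, t in total.items()}

def sharpen(nodes, w, steps):
    """Re-weight each edge by the overlap of its endpoints' walk profiles (assumed rule)."""
    adj = neighbours(nodes, w)
    prof = {n: walk_profile(adj, n, steps) for n in nodes}
    return {(a, b): 1.0 - 0.5 * sum(abs(prof[a][n] - prof[b][n]) for n in nodes)
            for (a, b) in w}

def components(nodes, w):
    """Connected components of the remaining graph, each as a sorted list."""
    adj = neighbours(nodes, w)
    seen, out = set(), []
    for n in nodes:
        if n in seen:
            continue
        seen.add(n)
        stack, comp = [n], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        out.append(sorted(comp))
    return out

def partition_values(values, labels, steps=2, max_rounds=10):
    """Stage 2: sharpen, cut edges below the average weight, repeat until stable."""
    nodes, w = similarity_graph(label_distributions(values, labels))
    for _ in range(max_rounds):
        if not w:
            break
        w = sharpen(nodes, w, steps)
        mean = sum(w.values()) / len(w)
        kept = {e: x for e, x in w.items() if x >= mean - 1e-12}  # tolerance for float ties
        if len(kept) == len(w):
            break
        w = kept
    return components(nodes, w)

# Toy attribute with four symbolic values and two labels (made-up data).
values = ["a", "a", "b", "c", "d", "d"]
labels = [[1, 0], [1, 1], [1, 0], [0, 1], [0, 1], [1, 1]]
print(partition_values(values, labels))  # [['a', 'b'], ['c', 'd']]
```

On this toy attribute, values `a` and `b` mostly carry the first label and `c` and `d` the second, so the four symbolic values collapse into two groups. A per-attribute partition of this kind is how the number of distinct symbolic values, and hence the data size, would shrink.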
Pages: 195-209
Page count: 15