Hypergraph-based importance assessment for binary classification data

Cited by: 1
Authors
Misiorek, Pawel [1]
Janowski, Szymon [1]
Affiliations
[1] Poznan Univ Tech, Inst Comp Sci, Piotrowo 3, PL-60965 Poznan, Poland
Keywords
Hypergraphs; Machine learning; Imbalanced data; Random undersampling; Feature selection; Graph edit distance; Computation; Algorithm; Network
DOI
10.1007/s10115-022-01786-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel hypergraph-based framework for assessing the importance of elements of binary classification data. Specifically, we apply the hypergraph model to rate the relevance of data samples and categorical feature values to classification labels. The proposed Hypergraph-based Importance ratings are theoretically grounded in the concept of hypergraph cut conductance minimization. Because the hypergraph representation is lossless with respect to higher-order relationships in the data, our approach allows more precise exploitation of the information on feature and sample coincidences. The solution was tested in two scenarios: undersampling for imbalanced classification data and feature selection. The experimental results show that the new approach performs well compared with other state-of-the-art and baseline methods in both scenarios, as measured by the average precision evaluation metric.
Pages: 1657-1683
Page count: 27
Related papers
50 in total
  • [31] Binary Classification Based on Potential Functions
    Boczko, Erik M.
    Di Lullo, Andrew
    Young, Todd R.
    2009 OHIO COLLABORATIVE CONFERENCE ON BIOINFORMATICS, PROCEEDINGS, 2009, : 129 - +
  • [32] A Cross-Entropy Based Feature Selection Method for Binary Valued Data Classification
    Wang, Zhipeng
    Zhu, Qiuming
    INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, ISDA 2021, 2022, 418 : 1406 - 1416
  • [33] Attribute reduction for partially labeled data based on hypergraph models
    Xie, Xiaojun
    Qin, Xiaolin
    Huang, Guangmei
    Zhao, Wei
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 1434 - 1439
  • [34] Study on suitability and importance of multilayer extreme learning machine for classification of text data
    Roul, Rajendra Kumar
    Asthana, Shubham Rohan
    Kumar, Gaurav
    SOFT COMPUTING, 2017, 21 (15) : 4239 - 4256
  • [35] Hypergraph Based Abstraction for File-Less Data Management
    Kryza, Bartosz
    Kitowski, Jacek
    PARALLEL PROCESSING AND APPLIED MATHEMATICS, PPAM 2015, PT I, 2016, 9573 : 322 - 331
  • [36] A new approach for imbalanced data classification based on data gravitation
    Peng, Lizhi
    Zhang, Hongli
    Yang, Bo
    Chen, Yuehui
    INFORMATION SCIENCES, 2014, 288 : 347 - 373
  • [37] Binary Classification Optimisation with AI-Generated Data
    Mazon, Manuel Jesus Cerezo
    Garcia, Ricardo Moya
    Garcia, Ekaitz Arriola
    del Castillo, Miguel Herencia Garcia
    Iglesias, Guillermo
    TESTING SOFTWARE AND SYSTEMS, ICTSS 2024, 2025, 15383 : 210 - 216
  • [38] Linear Binary Classification under Interval Uncertainty of Data
    Erokhin, V. I.
    Kadochnikov, A. P.
    Sotnikov, S. V.
    SCIENTIFIC AND TECHNICAL INFORMATION PROCESSING, 2024, 51 (06) : 539 - 544
  • [39] An Empirical Assessment of Performance of Data Balancing Techniques in Classification Task
    Jadhav, Anil
    Mostafa, Samih M.
    Elmannai, Hela
    Karim, Faten Khalid
    APPLIED SCIENCES-BASEL, 2022, 12 (08):
  • [40] Granular computing-based approach of rule learning for binary classification
    Liu, Han
    Cocea, Mihaela
    GRANULAR COMPUTING, 2019, 4 (02) : 275 - 283