Effectively Capturing Label Correlation for Tabular Multi-Label Classification

Times cited: 0
Authors
Siahroudi, Sajjad Kamali [1]
Ahmadi, Zahra [1]
Kudenko, Daniel [1]
Affiliations
[1] Leibniz Univ Hannover, Res Ctr L3S, Hannover, Niedersachsen, Germany
Source
PROCEEDINGS OF THE 33RD ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2024 | 2024
Keywords
Multi-label; Graph Convolutional Network; Transformer; Tabular Data; Classification;
DOI
10.1145/3627673.3679772
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Multi-label data is prevalent across various applications, where instances can be annotated with a set of classes. Although multi-label data can take various forms, such as images and text, tabular multi-label data stands out as the predominant data type in many real-world scenarios. Over the past decades, numerous methods have been proposed for tabular multi-label classification. Effectively addressing challenges like class imbalance, correlation among labels and features, and scalability is crucial for a high-performance multi-label classifier. However, many existing methods fall short of fully considering the correlation between labels and features. In cases where attempts are made, they often encounter high computational costs, rendering them impractical for large datasets. This paper introduces an innovative classification method for tabular multi-label data, utilizing a fusion of transformers and graph convolutional networks (GCN). The central concept of the proposed approach involves transforming tabular data into images, leveraging state-of-the-art methods in image processing, including image-based transformers and pre-trained models to capture correlation among labels effectively. Our approach jointly learns the representation of feature space and the correlation among labels within a unified network. To substantiate the performance of our proposed method, we conducted a rigorous series of experiments across diverse multi-label datasets. The results underscore the superior performance and scalability of our approach compared to other existing state-of-the-art methods. This work not only contributes a novel perspective to the field of tabular multi-label classification but also showcases advancements in both accuracy and scalability.
Pages: 1060-1069
Number of pages: 10