Feature Selection for Handling Label Ambiguity Using Weighted Label-Fuzzy Relevancy and Redundancy

Cited by: 7
Authors
Deng, Zhixuan [1]
Li, Tianrui [1]
Deng, Dayong [2]
Liu, Keyu [1]
Luo, Zhipeng [1]
Zhang, Pengfei [3]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 611756, Peoples R China
[2] Zhejiang Normal Univ, Xingzhi Coll, Lanxi 321100, Peoples R China
[3] Chengdu Univ Tradit Chinese Med, Sch Intelligent Med, Chengdu, Peoples R China
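Funding
China Postdoctoral Science Foundation; National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords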
Annotations; Feature extraction; Rough sets; Redundancy; Termination of employment; Fuzzy systems; Mutual information; Feature selection; fuzzy rough sets; label ambiguity; neighborhood rough sets; MULTILABEL FEATURE-SELECTION; MUTUAL INFORMATION; CLASSIFICATION; DEPENDENCY;
DOI
10.1109/TFUZZ.2024.3399617
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Feature selection is a crucial step for data preprocessing, and it is widely applied in machine learning. It can eliminate features that are redundant or irrelevant from data, thereby improving performance and reducing runtime. The uncertain nature of labels produces unique challenges for high-dimensional data with label ambiguity, which is still an open problem; the structural information of the data is not utilized fully. In this article, we sufficiently consider the structural information of the data, including relevancy between labels and features, redundancy among features, and positive regions, and set up a novel label ambiguity feature selection model via weighted label-fuzzy relevancy and redundancy. Specifically, we first transform the nonlabel distribution annotations to label distribution annotations by using a label enhancement model. Second, we use a fuzzy similarity relation to quantify how similar samples are in label space. Third, a general label-fuzzy rough set model is created, and then, a novel feature evaluation measure based on weighted label-fuzzy relevancy and redundancy is defined. In this model, general label-fuzzy rough sets are employed to process label ambiguity problems, and the label-fuzzy relevancy and redundancy are weighted with the feature significance with the positive region as the focus. Finally, a feature selection algorithm for label ambiguity that follows the idea of weighted label-fuzzy relevancy and redundancy is proposed. Extensive experiments are conducted on 12 label distribution annotation datasets and eight multilabel annotation datasets. The results indicate the advantages of our proposed algorithm over state-of-the-art algorithms.
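The abstract mentions two ingredients that a small example can make concrete: a fuzzy similarity relation between samples in label space, and a feature score that trades relevancy against redundancy. The sketch below is illustrative only and is not the authors' model. It assumes a similarity of 1 minus half the L1 distance between two label distributions, and a plain greedy relevancy-minus-mean-redundancy rule. The function names `label_fuzzy_similarity` and `greedy_select`, and the relevancy and redundancy inputs, are hypothetical. The positive-region weighting described in the abstract is not reproduced.

```python
def label_fuzzy_similarity(distributions):
    """Pairwise fuzzy similarity between label distributions.

    Each row is assumed to be a label distribution (non-negative, sums to 1).
    Similarity is 1 - 0.5 * L1 distance, which lies in [0, 1] for such rows.
    This choice of similarity is an assumption, not taken from the paper.
    """
    n = len(distributions)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            l1 = sum(abs(a - b) for a, b in zip(distributions[i], distributions[j]))
            sim[i][j] = 1.0 - 0.5 * l1
    return sim


def greedy_select(relevancy, redundancy, k):
    """Greedily pick k features by relevancy minus mean redundancy.

    relevancy[j] scores feature j against the labels; redundancy[j][s]
    scores the overlap between features j and s. Both are hypothetical
    inputs that some upstream measure would have to supply.
    """
    n = len(relevancy)
    k = min(k, n)
    selected = []
    while len(selected) < k:
        best, best_score = None, float("-inf")
        for j in range(n):
            if j in selected:
                continue
            # Mean redundancy with the features chosen so far (0 if none yet).
            if selected:
                penalty = sum(redundancy[j][s] for s in selected) / len(selected)
            else:
                penalty = 0.0
            score = relevancy[j] - penalty
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

With three samples whose label distributions are (1, 0), (0, 1), and (0.5, 0.5), the first two are maximally dissimilar and the third sits halfway between them. In the selection step, a highly relevant feature that largely duplicates an already chosen one can lose out to a less relevant but more distinct feature.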
Pages: 4436-4447
Page count: 12