High Accuracy Data Classification and Feature Selection for Incomplete Information Systems Using Extended Limited Tolerance Relation and Conditional Entropy Approach

Cited by: 0
Authors
Deris, Mustafa Mat [1, 2]
Abawajy, Jemal H. [3]
Yanto, Iwan Tri Riyadi [4]
Adiwijaya, Adiwijaya [5]
Herawan, Tutut [2, 6]
Rofiq, Ainur [2]
Efendi, Riswan [7]
Jaafar, Mohamad Jazli Shafizan [8]
Affiliations
[1] Univ Muhammadiah Malaysia, Fac Business Management & IT, Malang 02100, Perlis, Malaysia
[2] Univ Brawijaya, Fac Econ & Business, Malang 65145, Indonesia
[3] Deakin Univ, Sch Informat Technol, Geelong, Vic 3220, Australia
[4] Univ Ahmad Dahlan, Fak Tekn Informat, Yogyakarta 55166, Indonesia
[5] Telkom Univ, Sch Comp, Bandung 40257, Indonesia
[6] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[7] Univ Pendidikan Sultan Idris, Fac Sci & Math, Tanjong Malim 35900, Perak, Malaysia
[8] Univ Malaysia Terengganu, Fac Comp Sci, Kuala Terengganu 21030, Terengganu, Malaysia
Keywords
Information systems; Accuracy; Feature extraction; Entropy; Support vector machines; Uncertainty; Rough sets; Principal component analysis; Organizations; Information technology; Extended tolerance relation; Data reduction; Similarity precision; Tables
DOI
10.1109/ACCESS.2025.3538278
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data classification and feature (attribute) selection play an important role in enabling organizations to extract meaningful insights from vast and complex datasets. Accuracy and processing time are the two parameters of interest when determining which approach is suitable for very large datasets. The presence of redundant, incomplete, noisy, and inconsistent data raises further concerns about accuracy and computational resources. Because of its complexity, the issue of incomplete data has been addressed in only a limited number of studies, particularly with respect to data classification, accuracy, and attribute selection. The limited tolerance relation between objects is the approach favoured in this scenario; however, its accuracy and data classification rate need to be improved. This paper presents a new approach, called the extended limited tolerance relation, which uses the similarity precision among objects to improve data classification with high accuracy; feature (attribute) selection is performed using conditional entropy. A comparative analysis and experimental results for the proposed approach and the limited tolerance relation approach are presented in terms of data classification and accuracy. Compared with the limited tolerance relation, the proposed approach improves accuracy and achieves a better data classification rate and feature selection while preserving the consistency of the information in incomplete information systems, which makes it worthy of attention.
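For orientation, the baseline limited tolerance relation that the abstract compares against can be sketched as follows. This is a minimal sketch of the definition commonly given in the rough-set literature for incomplete information systems, not the paper's extended relation: the abstract does not specify how similarity precision is computed, so it is left out. The `*` missing-value marker and all function names are assumptions made for illustration.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def known(obj):
    """Return the indices of attributes whose value is present."""
    return {i for i, v in enumerate(obj) if v != MISSING}


def limited_tolerance(x, y):
    """Baseline limited tolerance check between two objects.

    Two objects are related if both are entirely unknown, or if they
    share at least one known attribute and agree on every attribute
    that is known in both.
    """
    px, py = known(x), known(y)
    if not px and not py:
        return True
    shared = px & py
    if not shared:
        return False
    return all(x[i] == y[i] for i in shared)


def tolerance_class(objects, i):
    """Indices of all objects related to objects[i]."""
    return {j for j, y in enumerate(objects) if limited_tolerance(objects[i], y)}


if __name__ == "__main__":
    # Hypothetical toy table with three attributes per object.
    table = [
        ("a", "*", "1"),
        ("a", "b", "*"),
        ("*", "*", "*"),
        ("c", "b", "1"),
        ("*", "b", "*"),
    ]
    for idx in range(len(table)):
        print(idx, sorted(tolerance_class(table, idx)))
```

Because the relation is reflexive and symmetric but not transitive, the resulting classes overlap rather than partition the objects, which is one reason classification accuracy depends on how strictly similarity is defined.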
Pages: 27657-27669
Number of pages: 13