A hybrid model: PNM for improving prediction capability of classifier

Cited by: 0
Authors
Mehrotra S. [1]
Muttum V.K. [2]
Krishna R.V. [2]
Kumar V. [3]
Varish N. [4]
Affiliations
[1] College of Computing Science and Information Technology, Teerthanker Mahaveer University, Moradabad, Uttar Pradesh
[2] Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh
[3] School of Computing Science and Engineering, Galgotias University, Greater Noida
[4] Department of Computer Science and Engineering, GITAM (Deemed to be University), Rudraram, Hyderabad, Telangana
Keywords
Class imbalance; Classification; K-Nearest Neighbor; Near-Miss; Principal component analysis; Support vector machine
DOI
10.1007/s41870-023-01609-9
Abstract
In recent years, COVID-19 and its variants have been more dangerous for people with underlying health conditions such as breast cancer, diabetes, and heart disease. Diagnosing these diseases at an early stage may save people from deadly diseases and infections, yet they are often diagnosed very late. Classification algorithms are used to diagnose several diseases. The reliability of classification results is a major issue in many real-time domains that deal with class-imbalanced data, and class imbalance is one of the primary challenges for the performance of classification models. Most data in sensitive and essential real-time domains are class-imbalanced, which results in poor classification performance. Because the majority class dominates, a classifier may report high overall accuracy while still misclassifying the minority class. Misclassification is unacceptable in medical cases, where it can cost human lives. This paper presents a hybrid sampling model named PNM that addresses the class imbalance issue and thereby improves the performance of classifiers. The proposed model integrates Principal Component Analysis and the Near-Miss sampling method. Experimental results show the effectiveness of the PNM model on class-imbalanced data. © 2023, The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management.
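As an illustration only, the following Python sketch shows one way Principal Component Analysis and Near-Miss undersampling could be combined on a binary, class-imbalanced dataset. The abstract does not state the order of the two steps, the Near-Miss variant, or any parameter values, so applying PCA first, using a NearMiss-1 style selection rule, and the names `pca_project`, `near_miss_1`, `pnm_like_resample`, `n_components`, and `k` are all assumptions made here. The data are synthetic, and this is not the authors' implementation.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its leading principal components (SVD of centered data)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def near_miss_1(X, y, k=3):
    """NearMiss-1 style undersampling for a binary problem (assumed variant).

    Keeps every minority sample, plus the majority samples whose mean
    distance to their k nearest minority samples is smallest.
    """
    labels, counts = np.unique(y, return_counts=True)
    if counts.min() == counts.max():
        return X, y  # already balanced, nothing to remove
    minority = labels[np.argmin(counts)]
    majority = labels[np.argmax(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y == majority)
    # Pairwise Euclidean distances: majority samples (rows) vs. minority (columns).
    diff = X[maj_idx][:, None, :] - X[min_idx][None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    k = min(k, len(min_idx))
    mean_knn = np.sort(dist, axis=1)[:, :k].mean(axis=1)
    # Retain as many majority samples as there are minority samples.
    keep_maj = maj_idx[np.argsort(mean_knn)[: len(min_idx)]]
    keep = np.sort(np.concatenate([min_idx, keep_maj]))
    return X[keep], y[keep]


def pnm_like_resample(X, y, n_components=2, k=3):
    """Hypothetical pipeline: reduce with PCA, then undersample with NearMiss-1."""
    return near_miss_1(pca_project(X, n_components), y, k=k)


# Synthetic example: 90 majority samples (label 0) and 10 minority samples (label 1).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(90, 5)),
    rng.normal(2.0, 1.0, size=(10, 5)),
])
y = np.array([0] * 90 + [1] * 10)

X_res, y_res = pnm_like_resample(X, y, n_components=2, k=3)
print(X_res.shape, np.bincount(y_res))
```

The resampled set keeps all minority samples and an equal number of majority samples, so a classifier such as K-Nearest Neighbor or a support vector machine could then be trained on `X_res` and `y_res`. In practice the resampling would be fitted on the training split only, so that the test data keep their original class distribution.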
Pages: 483-491
Page count: 8