Feature and Subfeature Selection for Classification Using Correlation Coefficient and Fuzzy Model

Cited by: 17
Authors
Bhuyan, Hemanta Kumar [1]
Chakraborty, Chinmay [2]
Pani, Subhendu Kumar [3]
Ravi, Vinayakumar [4]
Affiliations
[1] Technol & Res Deemed Univ, Vignans Fdn Sci, Dept Informat Technol, Vejendla 522213, Andhra Pradesh, India
[2] Birla Inst Technol Mesra, Elect & Commun Engn, Jharkhand 835215, India
[3] Biju Patnaik Univ Technol, Orissa Engn Coll, Dept Comp Sci & Engn, Rourkela 769004, Odisha, India
[4] Prince Mohammad Bin Fahd Univ, Ctr Artificial Intelligence, Khobar 34754, Saudi Arabia
Keywords
Feature extraction; Correlation; Redundancy; Databases; Data models; Data mining; Task analysis; Classification; correlation coefficient; data mining; feature selection; fuzzy model; UNSUPERVISED FEATURE-SELECTION; SUB-FEATURE SELECTION;
DOI
10.1109/TEM.2021.3065699
Chinese Library Classification number
F [Economics];
Discipline classification code
02 ;
Abstract
This article presents an analysis of data extraction for classification using a correlation coefficient and a fuzzy model. Several traditional data extraction methods used for classification cannot provide sufficient information for the subsequent step of class-level data analysis. Feature data must be refined to distinguish a class that differs from a traditional class. The article therefore proposes fine-grained feature data (subfeature data) for identifying a distinct class, using two methods, the correlation coefficient and a fuzzy model, to select features as well as subfeatures. In the first approach, correlation coefficient methods with a gradient descent technique are used to select features from the dataset; in the second approach, a fuzzy model based on the supremum of minimum values is used to obtain subfeature data. Under the proposed model, features (e.g., three features from the acoustic dataset, two from the QCM dataset, and eight from the audit dataset) and subfeatures (according to a threshold value, e.g., 20 for acoustic, 10 for QCM, and 20 for audit) are selected using the correlation coefficient and fuzzy methods, respectively. Further, a probability approach is used to find the association and availability of subfeature data in the dimensionally reduced database. The experimental results show that the proposed framework identifies and selects both feature and subfeature data effectively for the new class. Comparison results for several classifiers on several datasets are explained in the experimental section.
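The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "correlation coefficient" means the Pearson coefficient between each feature column and the class label, and that "supremum of minimum" refers to the standard sup-min (max-min) composition of fuzzy relations. The gradient descent step and the subfeature thresholds are not specified in the abstract and are left out. The names `pearson`, `select_features`, and `sup_min` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant column carries no linear information; treat it as uncorrelated.
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def select_features(columns, labels, k):
    """Return the k feature names with the largest |r| against the labels.

    Assumption: ranking by absolute Pearson correlation stands in for the
    paper's correlation-based selection, whose details are not in the abstract.
    """
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )
    return ranked[:k]


def sup_min(r, s):
    """Sup-min composition of fuzzy relations r (m x n) and s (n x p).

    Entry (i, j) is the maximum over k of min(r[i][k], s[k][j]).
    """
    inner = len(s)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(inner)) for j in range(len(s[0]))]
        for i in range(len(r))
    ]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    columns = {"a": [1, 2, 3, 4], "b": [1, 0, 1, 0], "c": [4, 3, 2, 2]}
    labels = [0, 0, 1, 1]
    print(select_features(columns, labels, 2))
    print(sup_min([[0.2, 0.8], [0.6, 0.4]], [[0.5, 0.9], [0.7, 0.3]]))
```

Membership values are assumed to lie in [0, 1]; how the paper derives them from raw feature values is not stated in the abstract.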
Pages: 1655-1669
Page count: 15
Related papers
34 in total
[1]   Singular value decomposition for genome-wide expression data processing and modeling [J].
Alter, O ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (18) :10101-10106
[2]   Fuzzy logic approach for infectious disease diagnosis: A methodical evaluation, literature and classification [J].
Arji, Goli ;
Ahmadi, Hossein ;
Nilashi, Mehrbakhsh ;
Rashid, Tarik A. ;
Ahmed, Omed Hassan ;
Aljojo, Nahla ;
Zainol, Azida .
BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2019, 39 (04) :937-955
[3]   Unsupervised Feature Selection with Controlled Redundancy (UFeSCoR) [J].
Banerjee, Monami ;
Pal, Nikhil R. .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (12) :3390-3403
[4]   Feature selection with SVD entropy: Some modification and extension [J].
Banerjee, Monami ;
Pal, Nikhil R. .
INFORMATION SCIENCES, 2014, 264 :118-134
[5]  
Bhuyan HK, 2018, PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INVENTIVE COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICICCT), P477, DOI 10.1109/ICICCT.2018.8473206
[6]   Privacy preserving sub-feature selection in distributed data mining [J].
Bhuyan, Hemanta Kumar ;
Kamila, Narendra Kumar .
APPLIED SOFT COMPUTING, 2015, 36 :552-569
[7]   Privacy preserving sub-feature selection based on fuzzy probabilities [J].
Bhuyan, Hemanta Kumar ;
Kamila, Narendra Kumar .
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2014, 17 (04) :1383-1399
[8]  
Cai D, 2010, P 16 ACM SIGKDD INT, P333
[9]   Telemedicine Supported Chronic Wound Tissue Prediction Using Classification Approaches [J].
Chakraborty, Chinmay ;
Gupta, Bharat ;
Ghosh, Soumya K. ;
Das, Dev K. ;
Chakraborty, Chandan .
JOURNAL OF MEDICAL SYSTEMS, 2016, 40 (03) :1-12
[10]   Semi-Supervised Feature Selection via Sparse Rescaled Linear Square Regression [J].
Chen, Xiaojun ;
Yuan, Guowen ;
Nie, Feiping ;
Ming, Zhong .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (01) :165-176