Filter and Embedded Feature Selection Methods to Meet Big Data Visualization Challenges

Cited by: 4
Authors
ElDahshan, Kamal A. [1]
AlHabshy, AbdAllah A. [1]
Mohammed, Luay Thamer [1]
Affiliations
[1] Al Azhar Univ, Fac Sci, Math Dept, Cairo 11884, Egypt
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2023 / Vol. 74 / No. 01
Keywords
Data Redaction; features selection; Select from model; Select percentile; big data visualization; data visualization; PARTICLE SWARM OPTIMIZATION; ALGORITHM
DOI
10.32604/cmc.2023.032287
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This study addresses the challenges of big data visualization by using data reduction methods based on feature selection. The aim is to reduce the volume of big data and minimize model training time (Tt) while maintaining data quality. We apply an embedded method, "Select from model" (SFM) with the random forest importance (RFI) algorithm, and compare it with a filter method, "Select percentile" (SP) with the chi-square (Chi2) test, for selecting the most important features. The selected features are then fed into a classification process using the logistic regression (LR) and k-nearest neighbor (KNN) algorithms. The classification accuracy (AC) of LR is compared with that of KNN in Python on eight data sets to see which method performs best when feature selection is applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of data once redundant data and data that do not affect the target are removed. After several comparisons, the study proposes SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for classification. The proposal's efficacy was demonstrated by comparing its results with recent literature.
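The comparison described in the abstract can be sketched with scikit-learn, whose `SelectFromModel` and `SelectPercentile` classes match the method names used above. This is a minimal illustration and not the authors' code: the breast cancer data set is a stand-in for the eight data sets (which the abstract does not name), and `percentile=50`, `n_estimators=100`, `n_neighbors=5`, and the min-max scaling step are all assumptions chosen for the example.

```python
import time

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Stand-in data set (assumption): 30 non-negative numeric features, binary target.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Embedded method: SFM keeps features whose random forest importance
# exceeds the mean importance (the default threshold for this estimator).
# Filter method: SP keeps the top 50% of features ranked by chi-square score.
selectors = {
    "SFM": lambda: SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0)
    ),
    "SP": lambda: SelectPercentile(chi2, percentile=50),
}
classifiers = {
    "LR": lambda: LogisticRegression(max_iter=1000),
    "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
}

results = {}
for sel_name, make_selector in selectors.items():
    for clf_name, make_classifier in classifiers.items():
        # Scaling to [0, 1] keeps the training inputs non-negative, as chi2 requires.
        pipe = make_pipeline(MinMaxScaler(), make_selector(), make_classifier())
        start = time.perf_counter()
        pipe.fit(X_train, y_train)
        tt = time.perf_counter() - start      # training time (Tt), in seconds
        ac = pipe.score(X_test, y_test)       # classification accuracy (AC)
        n_kept = int(pipe[1].get_support().sum())
        results[sel_name + clf_name] = (n_kept, ac, tt)
        print(f"{sel_name + clf_name}: kept {n_kept}/{X.shape[1]} features, "
              f"AC={ac:.3f}, Tt={tt:.3f}s")
```

Each key in `results` (for example `SFMLR`) maps to the number of retained features, the test accuracy, and the training time, which are the quantities the abstract compares. Numbers obtained this way depend on the data set and settings and should not be read as reproducing the paper's results.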
Pages: 817 - 839
Page count: 23