The importance of signal pre-processing for machine learning: The influence of Data scaling in a driver identity classification

Cited by: 5
Authors
Abdennour, Najmeddine [1]
Ouni, Tarek [1]
Ben Amor, Nader [2]
Affiliations
[1] Univ Sfax, CES Lab, Natl Sch Elect & Telecommun Sfax ENETCOM, Sfax, Tunisia
[2] Univ Sfax, Natl Sch Engineers Sfax ENIS, CES Lab, Sfax, Tunisia
Source
2021 IEEE/ACS 18TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2021
Keywords
Machine Learning; Normalization; Data scaling; pre-processing; classification; driver identification;
DOI
10.1109/AICCSA53542.2021.9686756
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Machine Learning (ML) and Deep Learning (DL) algorithms have captured the attention of the scientific community because of their capabilities and strong results. However, the heavy focus on hyperparameters and model architectures has often led to the pre-processing step being neglected. Despite its importance, pre-processing remains a weak point in most machine learning applications and a blind spot in many research studies. In this paper, we demonstrate, through a driver identification case study based on CAN-Bus vehicle data, the importance of testing different data scaling and normalization methods, and we show their role in improving the performance of several Machine Learning algorithms.
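As a rough illustration of what "data scaling and normalization" can mean in practice, the sketch below applies two common methods, min-max scaling and z-score standardization, to two features with very different ranges. It is not taken from the paper: the function names, the feature names, and the sample values are hypothetical, and the paper may have evaluated other methods.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Shift to zero mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical signal samples (illustrative values, not data from the paper).
engine_rpm = [800.0, 1500.0, 2200.0, 3000.0]
speed_kmh = [0.0, 30.0, 60.0, 90.0]

# After scaling, both features occupy comparable numeric ranges, so neither
# dominates a distance-based or gradient-based learner purely by magnitude.
rpm_minmax = min_max_scale(engine_rpm)
speed_minmax = min_max_scale(speed_kmh)
rpm_z = z_score_scale(engine_rpm)
speed_z = z_score_scale(speed_kmh)
```

In a real pipeline the scaling statistics (minimum, maximum, mean, standard deviation) would normally be computed on the training split only and then reused on the test split, so that no information leaks from the test data.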
Pages: 6
Related papers
20 in total
  • [1] Driver identification using only the CAN-Bus vehicle data through an RCN deep learning approach
    Abdennour, N.
    Ouni, T.
    Amor, N. Ben
    [J]. ROBOTICS AND AUTONOMOUS SYSTEMS, 2021, 136
  • [2] Bernardi M. L., 2018, P 2018 INT JOINT C N, P1, DOI 10.1109/IJCNN.2018.8489426
  • [3] Recent advances and emerging challenges of feature selection in the context of big data
    Bolon-Canedo, V.
    Sanchez-Marono, N.
    Alonso-Betanzos, A.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2015, 86 : 33 - 45
  • [4] A time series forest for classification and feature extraction
    Deng, Houtao
    Runger, George
    Tuv, Eugene
    Martyanov, Vladimir
    [J]. INFORMATION SCIENCES, 2013, 239 : 142 - 153
  • [5] Improvement of the Accuracy of Prediction Using Unsupervised Discretization Method: Educational Data Set Case Study
    Dimic, Gabrijela
    Rancic, Dejan
    Milentijevic, Ivan
    Spalevic, Petar
    [J]. TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2018, 25 (02) : 407 - 414
  • [6] Who is behind the wheel? Driver identification and fingerprinting
    Ezzini S.
    Berrada I.
    Ghogho M.
    [J]. Journal of Big Data, 5 (1)
  • [7] Classification in the Presence of Label Noise: a Survey
    Frenay, Benoit
    Verleysen, Michel
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (05) : 845 - 869
  • [8] Hallac D, 2018, IEEE INT C INTELL TR, P3233, DOI 10.1109/ITSC.2018.8569550
  • [9] Hastie T., 2009, Math. Intell, V2nd ed., DOI [10.1007/B94608, 10.1007/BF02985802]
  • [10] Ioffe S, 2015, PR MACH LEARN RES, V37, P448