Pre-Processing Methods of Data Mining

Authors
Saleem, Asma [1]
Asif, Khadim Hussain [1]
Ali, Ahmad [2]
Awan, Shahid Mahmood [3]
AlGhamdi, Mohammed A. [4]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore, Pakistan
[2] COMSATS Inst Informat Technol, Dept Biosci, Sahiwal, Pakistan
[3] Univ Engn & Technol, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
[4] Umm Al Qura Univ, Inst Innovat & Entrepreneurship, Mecca, Saudi Arabia
Source
2014 IEEE/ACM 7TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC) | 2014
Keywords
data pre-processing; data mining; outliers; missing values
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Subject classification code
081202
Abstract
Data generation, handling, and processing have emerged as the most reliable source of understanding and of discovering new facts, knowledge, and products in the natural and material sciences. Adopting the most efficient techniques in statistical and bioinformatics settings has therefore become routine practice in research and industry. In practice, large datasets are likely to contain inconsistencies and anomalies of all kinds, which obscure the real outcomes of practical problems. For accurate data mining, computer-based data pre-processing techniques help the data conform to normal structures, which in turn considerably improves the performance of machine learning algorithms. In this process, accurately identifying outliers and extreme values and filling gaps pose formidable challenges. Multiple methodologies have therefore been developed to detect these deviating or inconsistent values, called outliers. The data pre-processing techniques discussed in this paper could offer suitable solutions for handling missing values and outliers in all kinds of large datasets, such as electric load and weather datasets.
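As a rough illustration of the two tasks the abstract names (detecting outliers and filling missing values), the sketch below applies an interquartile-range rule followed by linear interpolation to a short series. This record does not say which techniques the paper uses, so both choices are assumptions made for illustration only, and the function names (iqr_outlier_mask, fill_gaps_linear, preprocess) and the sample readings are hypothetical.

```python
from statistics import quantiles


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; None entries are never flagged."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v is not None and (v < low or v > high) for v in values]


def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between the nearest observed
    neighbours; gaps at either end copy the nearest observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def preprocess(values, k=1.5):
    """Treat flagged outliers as missing, then fill every gap (assumed pipeline)."""
    mask = iqr_outlier_mask(values, k)
    cleaned = [None if flagged else v for v, flagged in zip(values, mask)]
    return fill_gaps_linear(cleaned)


# Hypothetical hourly load readings with one gap and one spike.
load = [10.0, 11.0, None, 13.0, 500.0, 12.0, 11.0, 10.0]
print(preprocess(load))
```

Treating flagged outliers as missing before interpolating keeps a single repair path for both kinds of defect. Whether that is appropriate depends on the dataset: a genuine load peak, for example, should not be smoothed away.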
Pages: 451-456
Page count: 6