HIGHLY ROBUST METHODS IN DATA MINING

Cited by: 11
Author
Kalina, Jan [1]
Affiliation
[1] Acad Sci Czech Republ, Inst Comp Sci, Vodarenskou Vezi 2, Prague 18207 8, Czech Republic
Keywords
Data mining; Robust statistics; High-dimensional data; Cluster analysis; Logistic regression; Neural networks
DOI
10.5937/sjm8-3226
Chinese Library Classification number
C93 [Management science]
Discipline classification code
12; 1201; 1202; 120202
Abstract
This paper is devoted to highly robust methods for extracting information from data, with special attention paid to methods suitable for management applications. The sensitivity of available data mining methods to outlying measurements in the observed data is discussed as their major drawback. The paper proposes several new highly robust methods for data mining, which are based on the idea of implicit weighting of individual data values. In particular, it proposes a novel robust method of hierarchical cluster analysis, a popular unsupervised learning method in data mining. Further, a robust method for estimating the parameters of logistic regression is proposed, and this idea is extended to robust multinomial logistic classification analysis. Finally, the sensitivity of neural networks to noise and outlying measurements in the data is discussed, and a method for robust training of neural networks for function approximation is proposed, which has the form of a robust estimator in nonlinear regression.
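The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of what "implicit weighting" could look like for linear regression: weights are assigned to observations by the rank of their squared residuals rather than fixed in advance, in the spirit of a least weighted squares estimator. The function names (`rank_weights`, `lws_fit`), the trimmed linear weight function, the parameter `tau`, and the simple reweighting loop are all assumptions made for illustration and are not taken from the paper; the loop is a heuristic and is not guaranteed to reach the global minimum of the weighted criterion.

```python
import numpy as np


def rank_weights(sq_residuals, tau=0.75):
    """Hypothetical weight function: weights decrease linearly with the
    rank of the squared residual and are zero beyond the fraction tau."""
    n = len(sq_residuals)
    ranks = np.argsort(np.argsort(sq_residuals))  # 0 = smallest residual
    return np.clip(1.0 - ranks / (tau * n), 0.0, 1.0)


def lws_fit(X, y, tau=0.75, n_iter=20):
    """Heuristic reweighting loop: start from ordinary least squares,
    then repeatedly re-rank the residuals and refit with the new weights."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        w = rank_weights((y - X @ beta) ** 2, tau)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight).
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta


# Synthetic data (assumed for illustration): exact line y = 1 + 2x
# with five observations shifted upward to act as outliers.
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
y[[5, 15, 25, 35, 45]] += 50.0

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lws = lws_fit(X, y)
print("OLS estimate (intercept, slope):", beta_ols)
print("Rank-weighted estimate (intercept, slope):", beta_lws)
```

In this toy setting the contaminated observations receive the largest residual ranks and hence zero weight, so the rank-weighted fit is driven by the clean observations, whereas the ordinary least squares intercept is pulled toward the outliers.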
Pages: 9-24
Number of pages: 16