HIGHLY ROBUST METHODS IN DATA MINING

Cited by: 11
Authors
Kalina, Jan [1]
Affiliations
[1] Acad Sci Czech Republ, Inst Comp Sci, Vodarenskou Vezi 2, Prague 18207 8, Czech Republic
Keywords
Data mining; Robust statistics; High-dimensional data; Cluster analysis; Logistic regression; Neural networks
DOI
10.5937/sjm8-3226
Chinese Library Classification (CLC) number
C93 [Management Science];
Subject classification codes
12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper is devoted to highly robust methods for extracting information from data, with special attention to methods suitable for management applications. The sensitivity of available data mining methods to outlying measurements in the observed data is discussed as a major drawback. The paper proposes several new highly robust methods for data mining, all based on the idea of implicit weighting of individual data values. In particular, it proposes a novel robust method of hierarchical cluster analysis, a popular unsupervised learning method in data mining. Further, a robust method for estimating the parameters of logistic regression is proposed, and the idea is extended to robust multinomial logistic classification analysis. Finally, the sensitivity of neural networks to noise and outlying measurements in the data is discussed, and a method for robust training of neural networks for function approximation is proposed; it takes the form of a robust estimator in nonlinear regression.
Pages: 9-24
Number of pages: 16