HIGHLY ROBUST METHODS IN DATA MINING

Cited by: 11
Author
Kalina, Jan [1]
Affiliation
[1] Acad Sci Czech Republ, Inst Comp Sci, Vodarenskou Vezi 2, Prague 18207 8, Czech Republic
Keywords
Data mining; Robust statistics; High-dimensional data; Cluster analysis; Logistic regression; Neural networks
DOI
10.5937/sjm8-3226
Chinese Library Classification number
C93 [Management Science];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper is devoted to highly robust methods for information extraction from data, with special attention paid to methods suitable for management applications. The sensitivity of available data mining methods to outlying measurements in the observed data is discussed as their major drawback. The paper proposes several new highly robust methods for data mining, which are based on the idea of implicit weighting of individual data values. In particular, it proposes a novel robust method of hierarchical cluster analysis, which is a popular data mining method of unsupervised learning. Further, a robust method for estimating parameters in logistic regression is proposed. This idea is extended to a robust multinomial logistic classification analysis. Finally, the sensitivity of neural networks to the presence of noise and outlying measurements in the data is discussed. A method for robust training of neural networks for the task of function approximation, which has the form of a robust estimator in nonlinear regression, is proposed.
Pages: 9-24
Number of pages: 16
Related papers
50 in total
  • [21] Data mining methods for hydroclimatic forecasting
    Wei, Wenge
    Watkins, David W., Jr.
    ADVANCES IN WATER RESOURCES, 2011, 34 (11) : 1390 - 1400
  • [22] Diabetes Detection by Data Mining Methods
    V. Ambikavathi
    P. Arumugam
    P. Jose
    Wireless Personal Communications, 2023, 133 : 2087 - 2104
  • [23] Feature transformation methods in data mining
    Kusiak, A
    IEEE TRANSACTIONS ON ELECTRONICS PACKAGING MANUFACTURING, 2001, 24 (03) : 214 - 221
  • [24] A Robust Data-Mining Approach to Bankruptcy Prediction
    Divsalar, Mehdi
    Roodsaz, Habib
    Vahdatinia, Farshad
    Norouzzadeh, Ghassem
    Behrooz, Amir Hossein
    JOURNAL OF FORECASTING, 2012, 31 (06) : 504 - 523
  • [25] Modeling and optimization of a wastewater pumping system with data-mining methods
    Zhang, Zijun
    Kusiak, Andrew
    Zeng, Yaohui
    Wei, Xiupeng
    APPLIED ENERGY, 2016, 164 : 303 - 311
  • [26] SELF-ADAPTIVE CUSTOMIZING WITH DATA MINING METHODS: A Concept for the Automatic Customizing of an ERP System with Data Mining Methods
    Schult, Rene
    Kassem, Gamal
    ICEIS 2008: PROCEEDINGS OF THE TENTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL ISAS-2: INFORMATION SYSTEMS ANALYSIS AND SPECIFICATION, VOL 2, 2008 : 70 - 75
  • [27] Statistical Methods with Applications in Data Mining: A Review of the Most Recent Works
    Pinto da Costa, Joaquim Fernando
    Cabral, Manuel
    MATHEMATICS, 2022, 10 (06)
  • [28] Highly Robust Statistical Methods in Medical Image Analysis
    Kalina, Jan
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2012, 32 (02) : 3 - 16
  • [29] A review of data mining methods in financial markets
    Liu, Haihua
    Huang, Shan
    Wang, Peng
    Li, Zejun
    DATA SCIENCE IN FINANCE AND ECONOMICS, 2021, 1 (04) : 362 - 392
  • [30] The approach of data mining methods for medical database
    Su, JL
    Wu, GZ
    Chao, IP
    PROCEEDINGS OF THE 23RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-4: BUILDING NEW BRIDGES AT THE FRONTIERS OF ENGINEERING AND MEDICINE, 2001, 23 : 3824 - 3826