Associative learning on imbalanced environments: An empirical study

Cited by: 12
Authors
Cleofas-Sanchez, L. [1 ]
Sanchez, J. S. [1 ]
Garcia, V. [2 ]
Valdovinos, R. M. [3 ]
Affiliations
[1] Univ Jaume I, Dept Comp Languages & Syst, Inst New Imaging Technol, Castellon de La Plana, Spain
[2] Univ Autonoma Ciudad Juarez, Div Multidisciplinaria Ciudad Univ, Ciudad Juarez, Chihuahua, Mexico
[3] Univ Autonoma Estado Mexico, Sch Engn, Toluca, Mexico
Keywords
Associative memory; Class imbalance; Resampling; NEURAL-NETWORKS; CLASSIFICATION; ALGORITHMS; ENSEMBLES; FEATURES; MEMORY;
DOI
10.1016/j.eswa.2015.10.001
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Associative memories have emerged as a powerful computational neural network model for several pattern classification problems. Like most traditional classifiers, these models assume that the classes share similar prior probabilities. However, in many real-life applications the ratios of prior probabilities between classes are extremely skewed. Although the literature has provided numerous studies that examine the performance degradation of renowned classifiers on different imbalanced scenarios, so far this effect has not been supported by a thorough empirical study in the context of associative memories. In this paper, we focus on the applicability of associative neural networks to the classification of imbalanced data. The key questions addressed here are whether these models perform better than, the same as, or worse than other popular classifiers, how the level of imbalance affects their performance, and whether distinct resampling strategies have a different impact on the associative memories. To answer these questions and gain further insight into the feasibility and efficiency of associative memories, a large-scale experimental evaluation with 31 databases, seven classification models and four resampling algorithms is carried out, along with a non-parametric statistical test to discover any significant differences between each pair of classifiers. (C) 2015 Elsevier Ltd. All rights reserved.
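As a rough illustration of the class imbalance and resampling ideas mentioned in the abstract, the sketch below computes an imbalance ratio and balances a toy data set by random oversampling. The abstract does not name the four resampling algorithms used in the study, so random oversampling is only an assumed, generic example; the function names `imbalance_ratio` and `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return the majority-class count divided by the minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of every smaller class until each
    class has as many samples as the largest one.

    Generic illustration only; not necessarily one of the four resampling
    algorithms evaluated in the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy data: nine majority samples (class 0), one minority sample (class 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9], [5.0]]
y = [0] * 9 + [1]

X_bal, y_bal = random_oversample(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_bal))
```

Oversampling of this kind changes only the training data, so it can be paired with any classifier, which is one reason resampling is commonly studied separately from the choice of model.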
Pages: 387-397
Number of pages: 11