DANNP: an efficient artificial neural network pruning tool

Cited by: 9
Authors
Alshahrani, Mona [1]
Soufan, Othman [1]
Magana-Mora, Arturo [1, 2]
Bajic, Vladimir B. [1]
Affiliations
[1] KAUST, CBRC, Thuwal, Saudi Arabia
[2] Natl Inst Adv Ind Sci & Technol, CBBD OIL, Tokyo, Japan
Keywords
Artificial neural networks; Pruning; Parallelization; Feature selection; Classification problems; Machine learning; Artificial intelligence; FEATURE-EXTRACTION; FEATURE-SELECTION; INFORMATION; PREDICTION
DOI
10.7717/peerj-cs.137
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Background. Artificial neural networks (ANNs) are a robust class of machine learning models and a frequent choice for solving classification problems. However, determining the structure of an ANN is not trivial, because a large number of weights (connection links) may lead to overfitting the training data. Although several ANN pruning algorithms have been proposed for simplifying ANNs, these algorithms cannot cope efficiently with the intricate ANN structures required for complex classification problems.
Methods. We developed DANNP, a web-based tool that implements parallelized versions of several ANN pruning algorithms. The DANNP tool uses a modified version of the Fast Compressed Neural Network software, implemented in C++, to considerably reduce the running time of the ANN pruning algorithms we implemented. In addition to evaluating the performance of the pruned ANNs, we systematically compared the set of features that remained in the pruned ANN with those obtained by different state-of-the-art feature selection (FS) methods.
Results. Although the ANN pruning algorithms are not entirely parallelizable, DANNP was able to speed up ANN pruning by up to eight times on a 32-core machine, compared to the serial implementations. To assess the impact of ANN pruning by the DANNP tool, we used 16 datasets from different domains. In eight of the 16 datasets, DANNP significantly reduced the number of weights by 70%-99%, while maintaining a competitive or better model performance compared to the unpruned ANN. Finally, we used a naive Bayes classifier derived with the features selected as a byproduct of the ANN pruning and demonstrated that its accuracy is comparable to that obtained by classifiers trained with the features selected by several state-of-the-art FS methods. The FS ranking methodology proposed in this study allows users to identify the most discriminant features of the problem at hand.
To the best of our knowledge, DANNP (publicly available at www.cbrc.kaust.edu.sa/dannp) is the only available, online-accessible tool that provides multiple parallelized ANN pruning options. Datasets and DANNP code can be obtained at www.cbrc.kaust.edu.sa/dannp/data.php and https://doi.org/10.5281/zenodo.1001086.
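The reported speedup (up to eight times on a 32-core machine, for algorithms that are not entirely parallelizable) can be put in perspective with Amdahl's law. The sketch below is illustrative only: it assumes the law holds exactly, with no communication or scheduling overhead, and the function names are hypothetical rather than part of DANNP. Under that assumption, an 8x speedup on 32 cores corresponds to roughly 90% of the serial runtime being parallelizable, which would cap the speedup at about 10x regardless of core count. The abstract itself does not state these derived figures.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law: S = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def parallel_fraction_for(speedup: float, cores: int) -> float:
    """Invert Amdahl's law: 1/S = (1 - p) + p/n  =>  p = (1 - 1/S) / (1 - 1/n)."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# Figures taken from the abstract: up to 8x speedup on a 32-core machine.
# Treating them as an exact Amdahl's-law data point is an assumption.
p = parallel_fraction_for(speedup=8.0, cores=32)

# Upper bound on speedup as the number of cores grows without limit.
speedup_ceiling = 1.0 / (1.0 - p)

print(f"implied parallel fraction: {p:.3f}")
print(f"speedup ceiling: {speedup_ceiling:.2f}")
```

Because real workloads add synchronization and memory costs that the law ignores, the true parallelizable fraction of the pruning algorithms is likely higher than this idealized estimate.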
Pages: 22