Learning Understandable Neural Networks With Nonnegative Weight Constraints

Cited by: 111
Authors
Chorowski, Jan [1, 2]
Zurada, Jacek M. [2, 3]
Affiliations
[1] Univ Wroclaw, Dept Math & Comp Sci, PL-50137 Wroclaw, Poland
[2] Univ Louisville, Dept Elect & Comp Engn, Louisville, KY 40292 USA
[3] Univ Social Sci, Inst Informat Technol, PL-90113 Lodz, Poland
Keywords
Multilayer perceptron; pattern analysis; supervised learning; white-box models; ALGORITHMS; RULES;
DOI
10.1109/TNNLS.2014.2310059
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
People can understand complex structures if they relate to more isolated yet understandable concepts. Despite this fact, popular pattern recognition tools, such as decision tree or production rule learners, produce only flat models which do not build intermediate data representations. On the other hand, neural networks typically learn hierarchical but opaque models. We show how constraining neurons' weights to be nonnegative improves the interpretability of a network's operation. We analyze the proposed method on large data sets: the MNIST digit recognition data and the Reuters text categorization data. The patterns learned by traditional and constrained networks are contrasted to those learned with principal component analysis and nonnegative matrix factorization.
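The abstract does not say how the nonnegativity constraint is enforced during training. As a rough illustration only, the sketch below uses projected gradient descent on a single logistic neuron: after each gradient step, negative weights are clipped to zero. The projection approach, the function name `train_nonnegative_neuron`, the hyperparameters, and the toy data are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))


def train_nonnegative_neuron(X, y, lr=0.5, steps=500, seed=0):
    """Train one logistic neuron whose weights are kept nonnegative.

    Assumption: the constraint is enforced by projection (clipping at zero)
    after every gradient step. The bias is left unconstrained.
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 0.1, size=X.shape[1])  # start inside the feasible set
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / len(y)  # gradient of the mean cross-entropy loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
        w = np.maximum(w, 0.0)  # projection onto the nonnegative orthant
    return w, b


# Hypothetical toy data: the label is 1 exactly when feature 0 is present.
X = np.array(
    [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [0, 1, 1]],
    dtype=float,
)
y = np.array([1, 1, 0, 0, 0, 0], dtype=float)

w, b = train_nonnegative_neuron(X, y)
pred = (sigmoid(X @ w + b) > 0.5).astype(float)
print("weights:", w, "bias:", b)
```

With nonnegative weights, each input feature can only add evidence for the neuron's activation, never subtract from it, which is the kind of additive, parts-based reading the abstract associates with improved interpretability.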
Pages: 62-69
Page count: 8