Universal approximation capability of cascade correlation for structures

Cited by: 33
Authors
Hammer, B. [1]
Micheli, A. [2]
Sperduti, A. [3]
Affiliations
[1] Tech Univ Clausthal, Inst Comp Sci, D-38678 Clausthal Zellerfeld, Germany
[2] Univ Pisa, Dipartimento Informat, Pisa, Italy
[3] Univ Padua, Dipartimento Matemat Pura & Applicata, I-35131 Padua, Italy
DOI: 10.1162/0899766053491878
CLC classification
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cascade correlation (CC) constitutes a training method for neural networks that determines the weights as well as the neural architecture during training. Various extensions of CC to structured data have been proposed: recurrent cascade correlation (RCC) for sequences, recursive cascade correlation (RecCC) for tree structures with limited fan-out, and contextual recursive cascade correlation (CRecCC) for rooted directed positional acyclic graphs (DPAGs) with limited fan-in and fan-out. We show that these models possess the universal approximation property in the following sense: given a probability measure P on the input set, every measurable function from sequences into a real vector space can be approximated by a sigmoidal RCC to any desired degree of accuracy, up to inputs of arbitrarily small probability. Every measurable function from tree structures with limited fan-out into a real vector space can be approximated by a sigmoidal RecCC with multiplicative neurons to any desired degree of accuracy, up to inputs of arbitrarily small probability. For sigmoidal CRecCC networks with multiplicative neurons, we show the universal approximation capability for functions on an important subset of all DPAGs with limited fan-in and fan-out for which a specific linear representation yields unique codes. We give one sufficient structural condition for the latter property, which can easily be tested: the enumeration of ingoing and outgoing edges should be compatible. This property can be fulfilled for every DPAG with fan-in and fan-out two via reenumeration of children and parents, and for larger fan-in and fan-out via an expansion of the fan-in and fan-out and reenumeration of children and parents. In addition, the result can be generalized to the case of input-output isomorphic transductions of structures.
Thus, CRecCC networks constitute the first neural models for which the universal approximation capability for functions on fairly general acyclic graph structures has been proved.
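To illustrate the constructive principle the abstract refers to: cascade correlation grows the architecture during training by adding hidden units one at a time, freezing each new unit's input weights and feeding its output forward as an extra input to all later units and to the output layer. The sketch below is a minimal, hypothetical illustration of plain (non-recursive) CC on a toy 1-D regression task; candidate units are selected from a random pool by the magnitude of their covariance with the current residual error, rather than trained by gradient ascent on the correlation score as in Fahlman and Lebiere's original procedure, and all pool sizes and unit counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_output(H, y):
    # Retrain only the output weights by least squares on the
    # current feature matrix H (bias, raw input, frozen hidden units).
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return w

# Toy 1-D regression target.
X = np.linspace(-2.0, 2.0, 200).reshape(-1, 1)
y = np.sin(2.0 * X[:, 0])

H = np.hstack([np.ones((len(X), 1)), X])  # bias + raw input
w = fit_output(H, y)

for _ in range(8):  # add hidden units one at a time
    resid = y - H @ w
    # Candidate pool: each candidate sees ALL current features,
    # including previously frozen hidden units (the "cascade").
    best_h, best_score = None, -np.inf
    for _ in range(20):
        v = rng.normal(size=H.shape[1])
        h = sigmoid(H @ v)
        # CC's selection criterion: covariance magnitude between the
        # candidate's activation and the residual error.
        score = abs(np.cov(h, resid)[0, 1])
        if score > best_score:
            best_score, best_h = score, h
    # Freeze the winning unit: it becomes a new input column for
    # all subsequently added units, then refit the output weights.
    H = np.hstack([H, best_h.reshape(-1, 1)])
    w = fit_output(H, y)

mse = np.mean((y - H @ w) ** 2)
```

Because the output weights are refit by least squares over a growing column set that always contains the bias and raw input, each added unit can only decrease the training error; the recursive variants (RecCC, CRecCC) replace the flat input row with a state computed over the children (and, for CRecCC, parents) of each vertex.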
Pages: 1109-1159
Page count: 51