Computational Model Based on Neural Network of Visual Cortex for Human Action Recognition

Cited by: 29
Authors
Liu, Haihua [1, 2, 3]
Shu, Na [1]
Tang, Qiling [1]
Zhang, Wensheng [4]
Affiliations
[1] South Cent Univ Nationalities, Sch Biomed Engn, Wuhan 430074, Hubei, Peoples R China
[2] Key Lab Cognit Sci State Ethn Affairs Commiss, Wuhan 430074, Hubei, Peoples R China
[3] Hubei Key Lab Med Informat Anal & Tumor Diag & Tr, Wuhan 430074, Hubei, Peoples R China
[4] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; classical receptive field (RF); spiking neural networks (SNNs); surround suppression; visual cortex; CELL RECEPTIVE-FIELDS; SPATIOTEMPORAL ORGANIZATION; MOTION; FEATURES; ARCHITECTURE; ENHANCEMENT; SUPPRESSION; SELECTIVITY; DYNAMICS;
DOI
10.1109/TNNLS.2017.2669522
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a bioinspired model for human action recognition through modeling neural mechanisms of information processing in two visual cortical areas: the primary visual cortex (V1) and the middle temporal cortex (MT) dedicated to motion. This model, named V1-MT, is composed of V1 and MT models (layers) corresponding to their cortical areas, which are built with layered spiking neural networks (SNNs). Some neuron properties in V1 and MT, such as direction and speed selectivity, spatiotemporal inseparability, and center surround suppression, are integrated into SNNs. Based on speed and direction selectivity, V1 and MT models contain multiple SNN channels, each of which processes motion information in sequences with spatiotemporal tunings of neurons at a certain speed and different directions. Therefore, we propose two operations, input signal perceiving with 3-D Gabor filters and surround inhibition processing with 3-D differences of Gaussian functions, to perform this task according to the spatiotemporal inseparability and center surround suppression of neurons. Then, neurons are modeled with our simplified integrate-and-fire model and motion information is transformed into spike trains. Afterward, we define a new feature vector: a mean motion map computed from spike trains in all channels to represent human actions. Finally, a support vector machine is trained to classify actions represented by the feature vectors. We conducted extensive experiments on public action databases, and the results show that our model outperforms other bioinspired models and rivals the state-of-the-art approaches.
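The abstract says that motion information is turned into spike trains by a simplified integrate-and-fire model, but this record does not give the equations. As a rough illustration only, the sketch below uses a generic leaky integrate-and-fire neuron, which is not necessarily the authors' simplified variant. The function names and the parameters `dt`, `tau`, `threshold`, and `v_reset` are assumptions chosen for the example, and the input is a plain list of numbers standing in for a filtered motion response.

```python
def lif_spike_train(current, dt=1.0, tau=10.0, threshold=1.0, v_reset=0.0):
    """Generic leaky integrate-and-fire neuron (illustrative assumption,
    not the simplified model defined in the paper).

    current: sequence of input drive values, one per time step.
    Returns a list of 0/1 values, 1 marking a spike at that step.
    """
    v = v_reset
    spikes = []
    for i_t in current:
        # Forward-Euler step of dv/dt = (-v + I) / tau
        v += dt * (-v + i_t) / tau
        if v >= threshold:
            spikes.append(1)
            v = v_reset  # reset the membrane potential after a spike
        else:
            spikes.append(0)
    return spikes


def mean_firing_rate(spikes, dt=1.0):
    """Average number of spikes per unit time over the whole train."""
    return sum(spikes) / (len(spikes) * dt)


if __name__ == "__main__":
    weak = lif_spike_train([0.5] * 100)    # drive stays below threshold
    strong = lif_spike_train([2.0] * 100)  # drive crosses threshold repeatedly
    print(mean_firing_rate(weak), mean_firing_rate(strong))
```

In this toy setting a stronger input produces a higher firing rate, which is the property a rate-based feature such as the mean motion map would rely on; how the paper aggregates rates across channels is not specified in this record.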
Pages: 1427-1440
Number of pages: 14