Research on Family Activity Recognition Method Based on Additive Margin Capsule Network

Cited by: 0
Authors
Zheng Q.-H. [1]
Wang Z.-Q. [1, 2]
Liu B.-T. [1, 2]
Chen Y. [1]
Chen Y.-R. [2]
Affiliations
[1] School of Information Science and Engineering, Changzhou University, Changzhou, 213164, Jiangsu
[2] College of Information Science and Technology, Zhejiang Shuren University, Hangzhou, 310015, Zhejiang
Source
Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2020, Vol. 48, No. 08
Keywords
Additive margin softmax; Capsule network; Classification of acoustic events; Family activity recognition
DOI
10.3969/j.issn.0372-2112.2020.08.017
Abstract
We study audio-based family activity recognition and propose a capsule neural network recognition model based on additive margin. The objective function of the traditional capsule neural network is constrained only by the length (norm) of the output capsules. To address this drawback, this paper takes a geometric perspective and adds a Transition layer to the capsule network structure, using it to map the spatial relationship of the capsule units to one dimension. Then, with additive margin Softmax as the objective function, the optimization strategy is to keep the variation among features of the same class small and the difference between features of different classes large, which yields an objective function based on the spatial relationship of the capsule vectors and improves the model's classification ability. Finally, the method is tested by classifying audio events of family activities. The Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 Challenge Task 5 dataset is used for classifier construction and testing. The final average F1 score is 92.3%, which is superior to other mainstream methods. © 2020, Chinese Institute of Electronics. All rights reserved.
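To make the objective described in the abstract more concrete, the following is a minimal sketch of a generic additive margin Softmax loss in NumPy. It is not the authors' implementation. The function name `am_softmax_loss`, the scale `s = 30` and margin `m = 0.35` (common defaults in the additive margin Softmax literature), the 9 output classes, and the random input features are all assumptions made for illustration. In particular, the abstract does not specify how the Transition layer turns capsule outputs into features, so the features here are placeholders.

```python
import numpy as np

def am_softmax_loss(features, weights, labels, s=30.0, m=0.35):
    """Generic additive margin Softmax loss (illustrative sketch only).

    features: (batch, dim) array; placeholder for whatever the network emits.
    weights:  (dim, classes) array of class weight vectors.
    labels:   (batch,) integer class indices.
    s, m:     scale and margin; assumed values, not taken from the paper.
    """
    # L2-normalise features and class weights so their product is cos(theta).
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = f @ w                                   # (batch, classes)

    # Subtract the margin m from the target-class cosine only.
    onehot = np.eye(w.shape[1])[labels]
    logits = s * (cos - m * onehot)

    # Numerically stable log-softmax followed by cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-(log_prob * onehot).sum(axis=1).mean())

# Hypothetical usage: 4 samples, 8-dimensional features, 9 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w = rng.normal(size=(8, 9))
y = np.array([0, 3, 5, 8])
loss_without_margin = am_softmax_loss(x, w, y, m=0.0)
loss_with_margin = am_softmax_loss(x, w, y, m=0.35)
```

For the same inputs, the loss with a positive margin is always larger than the loss with `m = 0`, because the margin lowers the target-class logit. Minimising it therefore pushes same-class features closer to their class direction and different-class features further apart, which matches the optimization strategy stated in the abstract.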
Pages: 1580-1586
Page count: 6