Novel dynamic center based binary and ternary pattern network using M4 pooling for real world voice recognition

Cited by: 22
Authors
Tuncer, Turker [1 ]
Dogan, Sengul [1 ]
Affiliations
[1] Firat Univ, Technol Fac, Dept Digital Forens Engn, Elazig, Turkey
Keywords
Dynamic center based binary ternary pattern network; M4 pooling; Voice classification; Pattern recognition; Real world audio recognition; CLASSIFICATION
DOI
10.1016/j.apacoust.2019.06.029
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Signal processing is one of the most important research areas in computer science and artificial intelligence, because audio recognition, voice activity detection, disease diagnosis, brain activity detection, and prediction methods are all evaluated using signal processing methods. Deep methods have recently become popular in signal processing applications. In this article, a novel hybrid feature extraction network is presented that uses novel approximations and a multiple pooling method. The proposed method uses both the binary pattern (BP) and the ternary pattern (TP) as feature extractors. To extract variable and distinctive features, a dynamic center based feature extraction strategy is used; hence, the proposed feature extraction network is called the dynamic center based binary and ternary pattern network (DC-BTPNet). The proposed DC-BTPNet consists of 9 layers and uses a novel multiple pooling method. Neighborhood component analysis (NCA) is used to select features. Finally, the selected features are forwarded to a polynomial kernel support vector machine (SVM). To evaluate the performance of the proposed method, a novel dataset was created. The proposed DC-BTPNet based multiple learning method achieved an accuracy of 89.0% and was compared with other state-of-the-art convolutional networks. Other well-known conventional classifiers, namely linear discriminant analysis (LDA), k-nearest neighbor (KNN), and bagged tree (BT), were also used for comparison. The comparisons and results demonstrate the success of DC-BTPNet and suggest that the proposed method can achieve successful results on larger datasets. (C) 2019 Elsevier Ltd. All rights reserved.
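The abstract names binary and ternary patterns computed around a center sample as the feature extractors. A minimal sketch of generic one-dimensional BP and TP histogram features is given below. It is an illustration under stated assumptions, not the DC-BTPNet implementation: the abstract does not say how the dynamic center is chosen, so the `center` index, the block length of 9, and the `threshold` value are hypothetical parameters, and the 9 layers, M4 pooling, NCA, and SVM stages are not reproduced.

```python
import numpy as np


def binary_pattern(block, center):
    """Binary code of a 1D block relative to the sample at index `center`.

    Assumption: each neighbor contributes bit 1 when it is >= the center value.
    """
    block = np.asarray(block, dtype=float)
    neighbors = np.delete(block, center)
    bits = (neighbors >= block[center]).astype(int)
    weights = 2 ** np.arange(bits.size)
    return int(bits @ weights)


def ternary_pattern(block, center, threshold):
    """Upper and lower ternary codes of a 1D block relative to `center`.

    Assumption: a neighbor sets an upper bit when it exceeds the center by
    more than `threshold`, and a lower bit when it falls below it by more
    than `threshold`.
    """
    block = np.asarray(block, dtype=float)
    diff = np.delete(block, center) - block[center]
    upper_bits = (diff > threshold).astype(int)
    lower_bits = (diff < -threshold).astype(int)
    weights = 2 ** np.arange(diff.size)
    return int(upper_bits @ weights), int(lower_bits @ weights)


def pattern_histograms(signal, block_len=9, center=4, threshold=0.05):
    """Concatenate BP, upper-TP, and lower-TP histograms over sliding blocks.

    `block_len`, `center`, and `threshold` are hypothetical defaults chosen
    for illustration only.
    """
    signal = np.asarray(signal, dtype=float)
    n_codes = 2 ** (block_len - 1)
    hists = np.zeros((3, n_codes), dtype=int)
    for start in range(signal.size - block_len + 1):
        block = signal[start:start + block_len]
        hists[0, binary_pattern(block, center)] += 1
        upper, lower = ternary_pattern(block, center, threshold)
        hists[1, upper] += 1
        hists[2, lower] += 1
    return hists.ravel()
```

With these defaults each signal yields a 768-dimensional vector (three histograms of 256 bins). Varying `center` across blocks would give different codes for the same data, which is one plausible reading of "dynamic center", though the abstract does not confirm it.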
Pages: 176-185
Page count: 10