Adaptation of deep learning auditory event recognition and detection in audio surveillance systems

Times Cited: 0
Authors
Alsubhi, Sara [1 ]
Alkabsani, Ahad [1 ]
Endargiri, Safiah [1 ]
Laabidi, Kaouther [1 ,2 ]
Affiliations
[1] Univ Jeddah, Coll Comp Sci & Engn, Jeddah 23445, Saudi Arabia
[2] Univ Tunis El Manar, Tunis 1068, Tunisia
Keywords
classification; automatic speech recognition; NLP; natural language processing; ANN; artificial neural network; deep learning; CNN; convolutional neural network; audio detection; speaker recognition; speech recognition; Mel spectrogram; CCTV; closed circuit television;
DOI
Not available
Chinese Library Classification
TP [automation technology; computer technology]
Subject Classification Code
0812
Abstract
This paper addresses equipping computer systems with the ability to comprehend and act upon spoken natural-language audio input. We propose adding acoustic audio inputs to existing closed circuit television (CCTV) systems in order to compensate for incomplete data and make fuller use of current surveillance infrastructure. In the proposed model, we apply an isolated-word recognition technique to a dataset of 8000 audio recordings from different individuals, using two distinct neural networks. The algorithm provides event-based detection: unauthorised access is detected by automatically recognising each spoken input together with the identity of the speaker. The proposed algorithm achieved accuracy rates of 84.1% for speaker identification and 80.1% for spoken-input recognition, and it outperformed a support vector machine (SVM) based model.
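For illustration only, the sketch below shows one way such a pipeline could be set up, assuming log-Mel spectrogram features extracted with librosa and two small Keras CNN classifiers, one for the spoken word and one for the speaker identity. The layer sizes, class counts, sampling rate, and feature dimensions are hypothetical and are not taken from the paper.

# Illustrative sketch (not the authors' exact pipeline): log-Mel spectrogram
# features feeding two small CNN classifiers -- one for the spoken word,
# one for the speaker identity. All sizes below are assumed, not from the paper.
import numpy as np
import librosa
from tensorflow.keras import layers, Model

N_MELS, N_FRAMES = 64, 128      # assumed feature dimensions
N_WORDS, N_SPEAKERS = 10, 40    # assumed numbers of word and speaker classes

def mel_features(path, sr=16000):
    """Load an audio clip and convert it to a fixed-size log-Mel spectrogram."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    # Pad or truncate along the time axis so every clip has N_FRAMES frames.
    if log_mel.shape[1] < N_FRAMES:
        log_mel = np.pad(log_mel, ((0, 0), (0, N_FRAMES - log_mel.shape[1])))
    return log_mel[:, :N_FRAMES, np.newaxis]  # shape: (N_MELS, N_FRAMES, 1)

def build_cnn(n_classes):
    """Small CNN classifier over the spectrogram 'image'."""
    inp = layers.Input(shape=(N_MELS, N_FRAMES, 1))
    x = layers.Conv2D(16, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    model = Model(inp, out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

word_model = build_cnn(N_WORDS)        # recognises which word was spoken
speaker_model = build_cnn(N_SPEAKERS)  # recognises who spoke it

Keeping the two classifiers separate mirrors the paper's use of two distinct networks; a shared convolutional trunk with two output heads would be an alternative design choice.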
Pages: 241-247
Number of pages: 7
Related papers (20 in total)
  • [1] Afouras, T., Chung, J. S., Senior, A., Vinyals, O., Zisserman, A. Deep Audio-Visual Speech Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(12): 8717-8727
  • [2] [Anonymous], 2005, AUTOMATIC SPEECH REC
  • [3] Beckert S, 2018, Columbia Stud Hist U, P1
  • [4] Bezoui M, 2019, IAES INT J ARTIF INT, V8, P7
  • [5] Bock S., 2012, Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR, Porto, Portugal, P49
  • [6] Chandrasekhar Vijay., 2011, ISMIR, V20, P801
  • [7] Cook A., 2002, ASSISTIVE TECHNOLOGI, V2nd
  • [8] Fan, L. Audio Example Recognition and Retrieval Based on Geometric Incremental Learning Support Vector Machine System. IEEE Access, 2020, 8: 78630-78638
  • [9] Gowrishankar B. S., 2016, 2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES), P140, DOI 10.1109/SCOPES.2016.7955698
  • [10] Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., Kingsbury, B. Deep Neural Networks for Acoustic Modeling in Speech Recognition. IEEE Signal Processing Magazine, 2012, 29(06): 82-97