SPECTROGRAM-BASED CLASSIFICATION OF SPOKEN FOUL LANGUAGE USING DEEP CNN

Cited by: 10

Authors:

Wazir, Abdulaziz Saleh Ba [1]

Karim, Hezerul Abdul [1]

Abdullah, Mohd Haris Lye [1]

Mansor, Sarina [1]

AlDahoul, Nouar [1]

Fauzi, Mohammad Faizal Ahmad [1]

See, John [2]

Affiliations:

[1] Multimedia Univ, Fac Engn, Cyberjaya, Malaysia

[2] Multimedia Univ, Fac Comp & Informat, Cyberjaya, Malaysia

Source:

2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2020

Keywords:

Foul language; Speech detection; Censorship; Spectrogram; CNN

DOI:

10.1109/mmsp48831.2020.9287133

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

Excessive profanity in audio and video content has been shown to shape a person's character and behavior. Currently, detection and censorship are performed manually, which is time-consuming and prone to misdetection of foul language. This paper proposes an intelligent model for foul-language censorship based on automated, robust detection with deep convolutional neural networks (CNNs). A dataset of foul language was collected and processed into audio spectrogram images, which serve as the input for evaluating foul-language classification. The proposed model was first tested on a 2-class (Foul vs. Normal) classification problem; the foul class was then decomposed further into a 10-class classification problem for exact detection of profanity. Experimental results show the viability of the proposed system, demonstrating high performance in curse-word classification with 1.24-2.71 Error Rate (ER) for 2-class and 5.49-8.30 F1-score. The proposed ResNet50 architecture outperforms the other models in terms of accuracy, sensitivity, specificity, and F1-score.
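
The abstract reports results as Error Rate, accuracy, sensitivity, specificity, and F1-score without defining them. As a minimal sketch, the standard 2-class definitions can be computed from confusion-matrix counts as shown here. This is an illustration only: it assumes Foul is the positive class and that Error Rate means 1 minus accuracy, which the record does not confirm. The function name `binary_metrics` and the counts are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return standard 2-class metrics from confusion-matrix counts.

    Assumptions (not stated in the source record):
    - "Foul" is the positive class, "Normal" the negative class.
    - Error Rate is defined as 1 - accuracy.
    - All denominators are non-zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # share of Foul clips correctly detected
    specificity = tn / (tn + fp)  # share of Normal clips correctly passed
    precision = tp / (tp + fp)    # share of Foul predictions that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a 10-class problem such as the one described, the same quantities are commonly computed per class in a one-vs-rest fashion and then averaged; whether the paper does so is not stated in this record.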


Pages: 6

References: 33 in total
