Using Approximated Auditory Roughness as a Pre-filtering Feature for Human Screaming and Affective Speech AED

Cited by: 3
Authors
He, Di [1]
Cheng, Zuofu [2]
Hasegawa-Johnson, Mark [3]
Chen, Deming [1]
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
[2] Inspirit IoT Inc, Champaign, IL 61822 USA
[3] Univ Illinois, Beckman Inst, Urbana, IL 61801 USA
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Keywords
Audio Event Detection; pre-filtering; Auditory Roughness; computational complexity;
DOI
10.21437/Interspeech.2017-593
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Detecting human screaming, shouting, and other verbal manifestations of fear and anger is of great interest to security Audio Event Detection (AED) systems. The Internet of Things (IoT) approach allows wide-covering, powerful AED systems to be distributed across the Internet, but these systems depend critically on a good feature for pre-filtering the audio. This work evaluates the potential of detecting screaming and affective speech using Auditory Roughness and proposes a very lightweight approximation method. Our approximation uses a similar amount of Multiple Add Accumulate (MAA) compared to short-term energy (STE), and at least 10x less MAA than MFCC. We evaluated our approximated roughness against other low-complexity features on the Mandarin Affective Speech corpus and on a screaming subset of the YouTube AudioSet. We show that our approximated roughness returns higher accuracy than those features.
Pages: 1914-1918
Page count: 5