AUDIO FEATURE EXTRACTION FOR VEHICLE ENGINE NOISE CLASSIFICATION

Cited by: 0

Authors:

Becker, Luca [1]

Nelus, Alexandra [1]

Gauer, Johannes [1]

Rudolph, Lars [1]

Martin, Rainer [1]

Affiliation:

[1] Ruhr Univ Bochum, Inst Commun Acoust, Bochum, Germany

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Keywords:

Classification; modulation per-channel energy normalization; siamese neural network; vehicle engine noise classification; privacy; RECOGNITION; NETWORKS;

DOI:

10.1109/icassp40776.2020.9053117

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this paper we propose a new scheme for vehicle engine noise classification as a more privacy-preserving alternative to classifying vehicles based on video recordings. We establish two scenarios: diesel vs. petrol and heavy goods vehicle vs. personal car classification. Our approach includes a novel modulation-spectrum-based feature representation that is used in conjunction with a siamese neural network classifier. Additionally, a database containing recordings from diverse urban acoustic scenarios is provided. The obtained results show the advantage of the proposed approach compared to conventional feature representations and classifiers. This is achieved by de-correlating background noise from target noise and by quantifying the degree of variation of noise characteristics.
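The keywords name per-channel energy normalization (PCEN) as the basis of the feature representation. As a rough illustration only, the sketch below implements standard PCEN as described in the Lostanlen et al. letter listed in the references, not the modulation-based variant proposed in this paper, whose details are not given in this record. The function name `pcen`, the default parameter values, and the choice to initialize the smoother with the first frame are assumptions made for this example.

```python
import numpy as np


def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Standard PCEN on a non-negative energy array E of shape (frames, channels).

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    E = np.asarray(E, dtype=float)
    # First-order IIR smoother along time, computed separately per channel:
    # M[t] = (1 - s) * M[t - 1] + s * E[t]
    M = np.empty_like(E)
    M[0] = E[0]  # assumption: start the smoother at the first frame
    for t in range(1, E.shape[0]):
        M[t] = (1.0 - s) * M[t - 1] + s * E[t]
    # Adaptive gain control (division by M**alpha) followed by
    # root compression with offset delta and exponent r.
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r


# Usage: a stationary input keeps nearly the same output when scaled by 100,
# which is the gain-normalizing behaviour PCEN is designed for.
quiet = pcen(np.ones((50, 4)))
loud = pcen(100.0 * np.ones((50, 4)))
```

With alpha close to 1, a slowly varying background level is largely divided out in each channel, while the root compression limits the dynamic range of what remains.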

Pages: 711-715

Page count: 5

References (20 in total):

[1] Alexandre, E.; Cuadra, L.; Salcedo-Sanz, S.; Pastor-Sanchez, A.; Casanova-Mateo, C. Hybridizing Extreme Learning Machines and Genetic Algorithms to select acoustic features in vehicle classification applications. NEUROCOMPUTING, 2015, 152: 58-68
[2] Bromley J., 1993, International Journal of Pattern Recognition and Artificial Intelligence, V7, P669, DOI 10.1142/S0218001493000339
[3] Chopra, S; Hadsell, R; LeCun, Y. Learning a similarity metric discriminatively, with application to face verification. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005: 539-546
[4] DAVIS, SB; MERMELSTEIN, P. COMPARISON OF PARAMETRIC REPRESENTATIONS FOR MONOSYLLABIC WORD RECOGNITION IN CONTINUOUSLY SPOKEN SENTENCES. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (04): 357-366
[5] Ebbers J., 2018, P ANN M GER AC SOC D, P1518
[6] Gergen S, 2015, 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, P1992
[7] He KM, 2014, LECT NOTES COMPUT SC, V8691, P346, DOI [arXiv:1406.4729, 10.1007/978-3-319-10578-9_23]
[8] Kingma DP, 2014, ADV NEUR IN, V27
[9] LOGAN B., 2000, ISMIR, P1
[10] Lostanlen, Vincent; Salamon, Justin; Cartwright, Mark; McFee, Brian; Farnsworth, Andrew; Kelling, Steve; Bello, Juan Pablo. Per-Channel Energy Normalization: Why and How. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (01): 39-43