TEMPORAL CODING OF LOCAL SPECTROGRAM FEATURES FOR ROBUST SOUND RECOGNITION

Cited by: 0

Authors:

Dennis, Jonathan [1]

Qiang, Yu [1]

Tang Huajin [1]

Tran Huy Dat [1]

Li Haizhou [1]

Affiliations:

[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013

Keywords:

Sound recognition; neural coding; local features; AUTOMATIC SPEECH RECOGNITION; NOISE;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

There is much evidence to suggest that the human auditory system uses localised time-frequency information for the robust recognition of sounds. Despite this, conventional systems typically rely on features extracted from short windowed frames over time, covering the whole frequency spectrum. Such approaches are not inherently robust to noise, as each frame will contain a mixture of the spectral information from noise and signal. Here, we propose a novel approach based on the temporal coding of Local Spectrogram Features (LSFs), which generate spikes that are used to train a Spiking Neural Network (SNN) with temporal learning. LSFs represent robust location information in the spectrogram surrounding keypoints, which are detected in a signal-driven manner such that the effect of noise on the temporal coding is reduced. Our experiments demonstrate the robust performance of our approach across a variety of noise conditions, such that it is able to outperform the conventional frame-based baseline methods.

Pages: 803-807

Page count: 5
