TCNN: TEMPORAL CONVOLUTIONAL NEURAL NETWORK FOR REAL-TIME SPEECH ENHANCEMENT IN THE TIME DOMAIN

Cited by: 0
Authors
Pandey, Ashutosh [1]
Wang, DeLiang [1,2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Keywords
noise-independent and speaker-independent speech enhancement; real-time implementation; time domain; temporal convolutional neural network; TCNN; NOISE;
DOI
10.1109/icassp.2019.8683634
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
This work proposes a fully convolutional neural network (CNN) for real-time speech enhancement in the time domain. The proposed CNN is an encoder-decoder architecture with an additional temporal convolutional module (TCM) inserted between the encoder and the decoder. We call this architecture a Temporal Convolutional Neural Network (TCNN). The encoder in the TCNN creates a low-dimensional representation of a noisy input frame. The TCM uses causal and dilated convolutional layers to exploit the encoder output of the current and previous frames. The decoder uses the TCM output to reconstruct the enhanced frame. The proposed model is trained in a speaker- and noise-independent way. Experimental results demonstrate that the proposed model gives consistently better enhancement results than a state-of-the-art real-time convolutional recurrent model. Moreover, since the model is fully convolutional, it has far fewer trainable parameters than earlier models.
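The causal, dilated convolution that the abstract attributes to the TCM can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the kernel weights, and the dilation values are hypothetical choices made for illustration, and the record above does not state the paper's actual layer sizes.

```python
def causal_dilated_conv1d(x, kernel, dilation):
    """Causal dilated 1-D convolution over a sequence of numbers.

    Computes y[t] = sum_k kernel[k] * x[t - k * dilation], treating
    samples before the start of the sequence as zero. The output at
    step t therefore depends only on the current and earlier inputs.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that would reach before t = 0
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input steps seen by a stack of causal dilated layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Illustrative values only (assumed, not taken from the paper).
frames = [1.0, 2.0, 3.0, 4.0]
smoothed = causal_dilated_conv1d(frames, kernel=[1.0, 1.0], dilation=2)
span = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
```

With these assumed values, each output adds the current sample to the one two steps earlier, and doubling the dilation per layer makes the receptive field grow exponentially with depth while the parameter count grows only linearly. This is one common reason such stacks are used to cover long histories of past frames.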
Pages: 6875-6879
Page count: 5