Implementation of Real-Time Speech Separation Model Using Time-Domain Audio Separation Network (TasNet) and Dual-Path Recurrent Neural Network (DPRNN)

Cited by: 3
Authors
Wijayakusuma, Alfian [1]
Gozali, Davin Reinaldo [1]
Widjaja, Anthony [1]
Ham, Hanry [1]
Affiliations
[1] Bina Nusantara Univ, Sch Comp Sci, Comp Sci Dept, Jakarta 11480, Indonesia
Source
5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL INTELLIGENCE 2020 | 2021 / Vol. 179
Keywords
Speech Separation; Time-Domain; Time-Domain Audio Separation Network; Dual-Path Recurrent Neural Network; Real-Time
DOI
10.1016/j.procs.2021.01.065
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The purpose of this research is to develop a model that can perform real-time, speaker-independent, multi-talker speech separation in the time domain using the Time-Domain Audio Separation Network (TasNet) and the Dual-Path Recurrent Neural Network (DPRNN). The research conducts experiments with several RNN architectures, batch sizes, and optimizers as hyperparameters for implementing TasNet and DPRNN, and analyzes the impact of these hyperparameter settings on model performance. The expected result is a model that performs speaker-independent multi-talker speech separation in real time with higher accuracy and lower latency than the models of previous research. (C) 2021 The Authors. Published by Elsevier B.V.
Pages: 762-772
Page count: 11