Implementation of Real-Time Speech Separation Model Using Time-Domain Audio Separation Network (TasNet) and Dual-Path Recurrent Neural Network (DPRNN)

Cited by: 3

Authors:

Wijayakusuma, Alfian [1]

Gozali, Davin Reinaldo [1]

Widjaja, Anthony [1]

Ham, Hanry [1]

Affiliations:

[1] Bina Nusantara Univ, Sch Comp Sci, Comp Sci Dept, Jakarta 11480, Indonesia

Source:

5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL INTELLIGENCE 2020 | 2021 / Vol. 179

Keywords:

Speech Separation; Time-Domain; Time-Domain Audio Separation Network; Dual-Path Recurrent Neural Network; Real-Time;

DOI:

10.1016/j.procs.2021.01.065

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The purpose of this research is to develop a model that can perform real-time, speaker-independent, multi-talker speech separation in the time domain using the Time-Domain Audio Separation Network (TasNet) and the Dual-Path Recurrent Neural Network (DPRNN). To implement TasNet and DPRNN, the research conducts experiments that treat the RNN architecture, the batch size, and the optimizer as hyperparameters, and it analyzes the impact of these hyperparameter settings on model performance. The expected result is a model that completes the speaker-independent multi-talker speech separation task in real time with higher accuracy and lower latency than the models of previous research. © 2021 The Authors. Published by Elsevier B.V.

Pages: 762-772

Page count: 11
