Sources separation of passive sonar array signal using recurrent neural network-based deep neural network with 3-D tensor

Cited by: 0
Authors
Lee, Sangheon [1 ]
Jung, Dongku [1 ]
Yu, Jaesok [1 ]
Affiliation
[1] DGIST, Dept Robot & Mechatron Engn, Techno Jungang Daero 333, Daegu 42988, South Korea
Source
JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA | 2023, Vol. 42, No. 4
Keywords
Passive sonar; Multichannel signals separation; 3-D tensor; Recurrent Neural Network (RNN); Deep learning;
DOI
10.7776/ASK.2023.42.4.357
CLC classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
In underwater signal processing, separating individual signals from a mixture has long been challenging because of the low quality of received signals. The conventional approach, spectrogram analysis via the short-time Fourier transform, has been criticized for its complex parameter optimization and its loss of phase information. Building on the success of the Dual-path Recurrent Neural Network in modeling long time-series signals, we propose a Triple-path Recurrent Neural Network that handles the three-dimensional tensors formed from multi-channel sensor inputs. The method divides each input signal into short chunks and stacks them into a 3-D tensor, so that relationships within chunks, between chunks, and across channels are all modeled, enabling both local and global feature learning. The proposed technique achieves improved Root Mean Square Error and Scale-Invariant Signal-to-Noise Ratio compared with the existing method.
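The chunking step the abstract describes can be sketched as follows. This is a minimal illustration, not code from the paper: the function name, chunk length, and hop size are hypothetical, and only the segmentation of a multi-channel signal into the 3-D tensor is shown; the triple recurrent passes over the three tensor axes are indicated in comments.

```python
import numpy as np

def segment_to_3d_tensor(x, chunk_len, hop):
    """Split a multi-channel signal of shape (channels, time) into
    overlapping chunks, yielding a 3-D tensor of shape
    (channels, chunk_len, n_chunks).

    A triple-path RNN would then alternate recurrent passes along the
    intra-chunk (local), inter-chunk (global), and inter-channel axes.
    """
    n_ch, T = x.shape
    # number of chunks needed to cover the signal at the given hop
    n_chunks = int(np.ceil(max(T - chunk_len, 0) / hop)) + 1
    # zero-pad the time axis so the last chunk is full-length
    pad = (n_chunks - 1) * hop + chunk_len - T
    x = np.pad(x, ((0, 0), (0, pad)))
    # stack the chunks along a new last axis -> 3-D tensor
    return np.stack(
        [x[:, i * hop : i * hop + chunk_len] for i in range(n_chunks)],
        axis=-1,
    )

# example: a 4-channel array signal of 1000 samples
sig = np.random.randn(4, 1000)
tensor = segment_to_3d_tensor(sig, chunk_len=100, hop=50)
print(tensor.shape)  # (4, 100, 19)
```

With a 50% hop, each sample appears in two chunks, which is what lets the intra-chunk and inter-chunk passes exchange local and global context, as in the Dual-path RNN the paper extends.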
Pages: 357-363 (7 pages)
References (16 total)
[1] Anonymous, Proc. 15th International Society for Music Information Retrieval Conference (ISMIR), 2014. DOI: 10.5281/zenodo.1415678.
[2] F. Bahmaninezhad, J. Wu, R. Gu, S.-X. Zhang, Y. Xu, M. Yu, and D. Yu, "A comprehensive study of speech separation: spectrogram vs waveform separation," Proc. Interspeech, 2019, pp. 4574-4578.
[3] B. Gao, W. L. Woo, and S. S. Dlay, "Adaptive sparsity non-negative matrix factorization for single-channel source separation," IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 5, pp. 989-1001, 2011.
[4] K. He, X. Zhang, S. Ren, and J. Sun, "Identity mappings in deep residual networks," Computer Vision - ECCV 2016, Part IV, LNCS 9908, 2016, pp. 630-645.
[5] I. Kavalerov et al., Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019, p. 175. DOI: 10.1109/WASPAA.2019.8937253.
[6] D. P. Kingma, Proc. International Conference on Learning Representations (ICLR), 2015.
[7] M. Kolbaek, D. Yu, Z.-H. Tan, and J. Jensen, "Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 10, pp. 1901-1913, 2017.
[8] F. Lluis, J. Pons, and X. Serra, "End-to-end music source separation: is it possible in the waveform domain?," Proc. Interspeech, 2019, pp. 4619-4623.
[9] Y. Luo et al., Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p. 46. DOI: 10.1109/ICASSP40776.2020.9054266.
[10] Y. Luo and N. Mesgarani, "Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 8, pp. 1256-1266, 2019.