PeakRNN and StatsRNN: Dynamic Pruning in Recurrent Neural Networks

Cited by: 0
Authors
Jelcicova, Zuzana [1, 2]
Jones, Rasmus [1]
Blix, David Thorn [1]
Verhelst, Marian [3]
Sparso, Jens [2]
Affiliations
[1] Demant AS, Kongebakken 9, DK-2765 Smorum, Denmark
[2] Tech Univ Denmark, Richard Petersens Plads, Bldg 322, DK-2800 Lyngby, Denmark
[3] Katholieke Univ Leuven, Oude Markt 13, B-3000 Leuven, Belgium
Source
29th European Signal Processing Conference (EUSIPCO 2021) | 2021
Keywords
RNN; determinism; statistics; peaks; threshold; single-channel speech enhancement; hearing instruments
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces two dynamic real-time pruning techniques, PeakRNN and StatsRNN, for reducing costly multiplications and memory accesses in recurrent neural networks. The methods are demonstrated on a gated recurrent unit in a multi-layer network that solves a single-channel speech enhancement task covering a wide variety of real-world acoustic environments and speakers. Performance is compared against the baseline gated recurrent unit and the DeltaRNN method. Relative to the unprocessed speech, the SNR and the Perceptual Evaluation of Speech Quality (PESQ) score improved on average by 8.11 dB and 0.43 MOS-LQO, respectively. Additionally, the two proposed methods outperformed DeltaRNN by 0.7 dB and 0.11 MOS-LQO on the two objective measures, while using the same computational budget per timestep and reducing the original operations by 88%. Furthermore, PeakRNN is fully deterministic, i.e., the number of computations to be executed is always known in advance. Such worst-case guarantees are crucial for real-time acoustic applications.
Pages: 416-420
Page count: 5