Quality Enhancement of Compressed Vibrotactile Signals Using Recurrent Neural Networks and Residual Learning

Cited by: 4

Authors:

Noll, Andreas ^{[1,2]}

Guerbuez, Ayten ^{[1]}

Guelecyuez, Basak ^{[1,2]}

Cui, Kai ^{[1]}

Steinbach, Eckehard ^{[1,2]}

Affiliations:

[1] Tech Univ Munich, Dept Elect & Comp Engn, D-80333 Munich, Germany

[2] Tech Univ, Ctr Tactile Internet Human In The Loop CeTI, D-01062 Dresden, Germany

Source:

IEEE TRANSACTIONS ON HAPTICS | 2021 / Vol. 14 / Issue 02

Keywords:

Artificial neural networks; Codecs; Signal to noise ratio; Recurrent neural networks; Training; Testing; Task analysis; Quality enhancement; machine learning; RNN; residual learning; tactile signal compression

DOI:

10.1109/TOH.2021.3078889

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

We present a neural network-based compression artifact removal technique for vibrotactile signals. The proposed decoder-side quality enhancement approach is based on recurrent neural networks (RNNs) and the principle of residual learning. We use a total of 8 nonlinear RNN layers trained to first estimate the difference between the original and the compressed signal. The estimated difference signal is then added to the compressed signal, followed by further linear processing steps to construct the enhanced signal. With our approach, we are able to enhance signals at almost all compression ratios by up to $1.25\ \mathrm{dB}$. Roughly 86% of the signals in our data set are enhanced in quality. Through an ablation study, we show that every block of our network functions as intended and contributes to the compression artifact removal. Additionally, we show that the chosen network parameters maximize performance.
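The residual-learning step described in the abstract can be illustrated with a small sketch. This is not the authors' network: the trained RNN layers are replaced by a hypothetical oracle estimator that recovers half of the true difference signal, the "codec" is a coarse quantizer, the test signal and its sampling rate are invented for illustration, and the further linear processing steps are omitted. The function names `snr_db` and `enhance` are assumptions as well.

```python
import numpy as np

def snr_db(original, test):
    """Signal-to-noise ratio of `test` relative to `original`, in dB."""
    noise = original - test
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))

def enhance(compressed, estimate_residual):
    """Decoder-side residual learning: add the estimated difference signal."""
    return compressed + estimate_residual(compressed)

# Toy data (assumption): a 50 Hz sine sampled at 1 kHz for one second.
t = np.arange(0, 1, 1 / 1000)
original = np.sin(2 * np.pi * 50 * t)

# Stand-in for a lossy codec (assumption): quantize to steps of 0.25.
compressed = np.round(original * 4) / 4

# Hypothetical oracle estimator standing in for the trained RNN layers:
# it ignores its input and returns half of the true difference signal.
true_residual = original - compressed
def estimator(_signal):
    return 0.5 * true_residual

enhanced = enhance(compressed, estimator)
gain = snr_db(original, enhanced) - snr_db(original, compressed)
print(f"SNR gain: {gain:.2f} dB")
```

Because the placeholder estimator halves the remaining error, the SNR gain in this toy setup equals $20\log_{10} 2$ dB by construction. It says nothing about the gains reported in the abstract, which depend on the trained network and the real data set.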


Pages: 316-321

Number of pages: 6

