Artificial Bandwidth Extension using Frequency Shifting, H∞ Optimization, and Deep Neural Network

Cited by: 0
Authors
Deepika Gupta [1]
Hanumant Singh Shekhawat [2]
Affiliations
[1] Indian Institute of Technology Guwahati, Department of Electronics and Electrical Engineering
[2] Galgotias University, School of Computing Science and Engineering
Keywords
System norm; Signal model; Speech processing; Deep neural network
DOI
10.1007/s00034-024-02911-y
Abstract
Artificial bandwidth extension (ABE) expands the bandwidth of a signal. In narrowband communication, ABE is applied at the receiver end to extend the bandwidth of the received narrowband signal. An ABE approach is proposed to improve the perception of narrowband signals. A novel error system is designed by combining the bandwidth extension process, the reference band-pass shifted signal generating process, and the narrowband signal generating process. The error system is transformed into a closed-loop system, which is solved using the H∞-norm. The solution is a synthesis filter that the ABE process uses to synthesize high-frequency components. A gain factor corresponding to the synthesis filter is computed and used to adjust the energy levels of the estimated high-frequency components. Because speech signals have time-varying characteristics, several synthesis filters and corresponding gains are needed to construct the whole signal, and two distinct deep neural networks (DNNs) are designed to predict them. The proposed approach is evaluated on the TIMIT and RSR15 datasets and improves the MOS-LQO and CMOS values more than the baselines. Compared with our prior work, the proposed approach reduces the LSD_UB value by 1.85 dB and the LSD_FB value by 1.08 dB, and increases the MOS-LQO value by 0.07 points.
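The gain step mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain per-frame energy matching between an estimated high band and a reference high band; the abstract does not give the paper's actual gain formula, so the function names (energy, energy_matching_gain, apply_gain) and the toy frames below are hypothetical and purely illustrative.

```python
import math


def energy(frame):
    """Return the sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def energy_matching_gain(estimated_hb, reference_hb, eps=1e-12):
    """Gain that scales the estimated high band to the reference energy.

    Assumption: simple per-frame energy matching. The gain used in the
    paper may be defined differently.
    """
    return math.sqrt(energy(reference_hb) / (energy(estimated_hb) + eps))


def apply_gain(frame, gain):
    """Scale every sample of the frame by the gain."""
    return [gain * x for x in frame]


# Toy frames (made-up numbers, not data from the paper).
estimated = [0.5, -0.5, 0.5, -0.5]
reference = [1.0, -1.0, 1.0, -1.0]

g = energy_matching_gain(estimated, reference)
scaled = apply_gain(estimated, g)
```

In a time-varying setting such as speech, one gain of this kind would be computed per frame, which is consistent with the abstract's remark that several filters and gains are needed for the whole signal.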
Pages: 3088-3111
Number of pages: 23