Unsupervised Training of a DNN-based Formant Tracker

Cited by: 2
Authors
Lilley, Jason [1]
Bunnell, H. Timothy [1]
Affiliation
[1] Nemours Biomed Res, Wilmington, DE 19803 USA
Source
INTERSPEECH 2021 | 2021
Keywords
speech analysis; formant estimation; formant tracking; deep learning; acoustic models of speech; SPEECH;
DOI
10.21437/Interspeech.2021-1690
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Phonetic analysis often requires reliable estimation of formants, but estimates provided by popular programs can be unreliable. Recently, Dissen et al. [1] described DNN-based formant trackers that produced more accurate frequency estimates than several others but required manually corrected formant data for training. Here we describe a novel unsupervised training method for corpus-based DNN formant parameter estimation and tracking with accuracy similar to that of [1]. Frame-wise spectral envelopes serve as the input. The output consists of estimates of the frequencies and bandwidths, plus amplitude adjustments, for a prespecified number of poles and zeros, hereafter referred to as "formant parameters." A custom loss measure based on the difference between the input envelope and one generated from the estimated formant parameters is calculated and backpropagated through the network to establish the gradients with respect to the formant parameters. The approach is similar to that of autoencoders in that the model is trained to reproduce its input in order to discover latent features, in this case the formant parameters. Our results demonstrate that a reliable formant tracker can be constructed for a speech corpus without the need for hand-corrected training data.
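The reconstruction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each pole or zero is a standard second-order digital resonator, that a single overall gain stands in for the amplitude adjustments, and that the loss is a mean squared difference between log-magnitude envelopes in dB. The function names (`section_db`, `envelope_db`, `reconstruction_loss`), the 16 kHz sampling rate, and the example values are all hypothetical.

```python
import numpy as np

def section_db(freq_hz, bw_hz, grid_hz, fs):
    """Log magnitude (dB) of one second-order all-pole resonator on a frequency grid."""
    r = np.exp(-np.pi * bw_hz / fs)        # pole radius derived from the bandwidth
    theta = 2.0 * np.pi * freq_hz / fs     # pole angle derived from the center frequency
    z1 = np.exp(-1j * 2.0 * np.pi * grid_hz / fs)
    denom = 1.0 - 2.0 * r * np.cos(theta) * z1 + (r ** 2) * z1 ** 2
    return -20.0 * np.log10(np.abs(denom))

def envelope_db(poles, zeros, gain_db, grid_hz, fs):
    """Envelope generated from (frequency, bandwidth) pairs for poles and zeros.

    Assumption: zeros are modeled as inverted resonators, and gain_db is a
    single scalar amplitude adjustment.
    """
    env = np.full(grid_hz.shape, gain_db, dtype=float)
    for f, b in poles:
        env += section_db(f, b, grid_hz, fs)
    for f, b in zeros:
        env -= section_db(f, b, grid_hz, fs)
    return env

def reconstruction_loss(target_db, poles, zeros, gain_db, grid_hz, fs):
    """Mean squared difference (dB^2) between the target and generated envelopes."""
    est = envelope_db(poles, zeros, gain_db, grid_hz, fs)
    return float(np.mean((est - target_db) ** 2))

# Hypothetical usage: a synthetic "input envelope" and two candidate parameter sets.
fs = 16000.0
grid = np.linspace(0.0, fs / 2.0, 257)
true_poles = [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]
target = envelope_db(true_poles, [], 0.0, grid, fs)
loss_true = reconstruction_loss(target, true_poles, [], 0.0, grid, fs)
shifted = [(600.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]
loss_shifted = reconstruction_loss(target, shifted, [], 0.0, grid, fs)
```

In this sketch the loss is zero only when the generated envelope matches the input envelope, which is what lets the formant parameters act as the latent features. In a training setting the same computation would be written in an automatic differentiation framework so the loss can be backpropagated to the network outputs; NumPy is used here only to keep the example self-contained.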
Pages: 1189-1193
Page count: 5