Unsupervised Training of a DNN-based Formant Tracker

Cited by: 2
Authors
Lilley, Jason [1]
Bunnell, H. Timothy [1]
Affiliations
[1] Nemours Biomed Res, Wilmington, DE 19803 USA
Source
INTERSPEECH 2021 | 2021
Keywords
speech analysis; formant estimation; formant tracking; deep learning; acoustic models of speech; SPEECH;
DOI
10.21437/Interspeech.2021-1690
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Phonetic analysis often requires reliable estimation of formants, but estimates provided by popular programs can be unreliable. Recently, Dissen et al. [1] described DNN-based formant trackers that produced more accurate frequency estimates than several others, but that require manually corrected formant data for training. Here we describe a novel unsupervised training method for corpus-based DNN formant parameter estimation and tracking with accuracy similar to [1]. Frame-wise spectral envelopes serve as the input. The output consists of estimates of the frequencies and bandwidths, plus amplitude adjustments, for a prespecified number of poles and zeros, hereafter referred to as "formant parameters." A custom loss measure, based on the difference between the input envelope and one generated from the estimated formant parameters, is calculated and backpropagated through the network to establish the gradients with respect to the formant parameters. The approach is similar to that of autoencoders, in that the model is trained to reproduce its input in order to discover latent features, in this case the formant parameters. Our results demonstrate that a reliable formant tracker can be constructed for a speech corpus without the need for hand-corrected training data.
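The reconstruction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' loss or model: it assumes a cascade of two-pole digital resonators (poles only; zeros and amplitude adjustments are omitted), a 16 kHz sampling rate, and a plain mean-squared error in dB. All function names and numeric values are hypothetical, chosen only for illustration.

```python
import cmath
import math

SAMPLE_RATE = 16000.0  # assumed sampling rate in Hz (not stated in the abstract)


def resonator_response_db(freq_hz, center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Log-magnitude (dB) response of one two-pole resonator at freq_hz,
    normalized to 0 dB at DC."""
    radius = math.exp(-math.pi * bandwidth_hz / sample_rate)
    pole = cmath.rect(radius, 2.0 * math.pi * center_hz / sample_rate)
    z = cmath.rect(1.0, 2.0 * math.pi * freq_hz / sample_rate)
    # H(z) = 1 / ((1 - p z^-1)(1 - p* z^-1)), divided by its value at z = 1
    denom = (1 - pole / z) * (1 - pole.conjugate() / z)
    dc = (1 - pole) * (1 - pole.conjugate())
    return 20.0 * math.log10(abs(dc) / abs(denom))


def envelope_db(freqs_hz, formants):
    """Envelope generated from (center_hz, bandwidth_hz) pairs.
    Cascading resonators multiplies magnitudes, so dB values add."""
    return [
        sum(resonator_response_db(f, c, b) for c, b in formants)
        for f in freqs_hz
    ]


def reconstruction_loss(target_db, formants, freqs_hz):
    """Mean squared dB difference between the input envelope and the
    envelope regenerated from the estimated formant parameters."""
    estimate_db = envelope_db(freqs_hz, formants)
    return sum((t - e) ** 2 for t, e in zip(target_db, estimate_db)) / len(freqs_hz)


# Hypothetical frame: the "input" envelope is synthesized from known formants.
freqs = [50.0 * i for i in range(1, 160)]  # 50 Hz grid up to 7950 Hz
true_formants = [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]
target = envelope_db(freqs, true_formants)

loss_true = reconstruction_loss(target, true_formants, freqs)
loss_wrong = reconstruction_loss(
    target, [(700.0, 80.0), (1200.0, 100.0), (2500.0, 120.0)], freqs
)
```

In this sketch the loss is zero when the estimated parameters regenerate the input envelope and grows as the formant frequencies drift away from it. In an actual training setup the same computation would presumably be written in an automatic-differentiation framework so that the loss can be backpropagated through the network that outputs the parameters; no manually corrected formant labels enter the loss.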
Pages: 1189-1193
Page count: 5