Formant speech synthesis

Authors
Pinto, N.B.
Childers, D.G.
Source
Journal of the Institution of Electronics and Telecommunication Engineers | 1988, Vol. 34, No. 1
Keywords
Vocoders
DOI
Not available
Abstract
This paper describes analysis and synthesis methods for a digital formant synthesizer. It is shown that synthetic speech generated using excitation pulses that resemble the true glottal volume-velocity excitation waveform is preferred over speech synthesized using a two-pole glottal filter and impulse excitation. A series of algorithms for voiced/unvoiced/mixed/silent interval classification, pitch detection, and formant estimation and tracking is described. We have also initiated an investigation into the feasibility of using the digital formant synthesizer to study the acoustic correlates of voice quality. A number of experiments involving male/female voice conversion and the simulation of various vocal characteristics, such as breathiness, roughness, and vocal fry, were undertaken. The results have helped to establish the importance of various acoustic features as descriptors of specific voice qualities.
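As a rough illustration of the baseline the abstract mentions (impulse excitation shaped by a two-pole glottal filter and passed through formant resonators), the following is a minimal sketch and not the authors' implementation. It assumes a cascade of Klatt-style second-order digital resonators; every function name, the pole value 0.95, and the formant frequencies and bandwidths are hypothetical choices made only for this example.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a second-order digital resonator with unity gain at DC.

    Difference equation: y[n] = a*x[n] + b*y[n-1] + c*y[n-2].
    """
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def resonate(x, freq_hz, bw_hz, fs):
    """Filter the sequence x through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(n_samples, f0_hz, fs):
    """Unit impulses spaced one pitch period apart (period rounded to samples)."""
    period = int(round(fs / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def two_pole_glottal_filter(x, pole=0.95):
    """Double real-pole low-pass, H(z) = (1-p)^2 / (1 - p*z^-1)^2.

    The pole value is an assumption for illustration, not taken from the paper.
    """
    g = (1.0 - pole) ** 2
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = g * s + 2.0 * pole * y1 - pole * pole * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(formants, n_samples, f0_hz=100.0, fs=10000.0):
    """Impulse train -> two-pole glottal filter -> cascade of formant resonators.

    formants is a list of (frequency_hz, bandwidth_hz) pairs.
    """
    signal = two_pole_glottal_filter(impulse_train(n_samples, f0_hz, fs))
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs)
    return signal


if __name__ == "__main__":
    # Hypothetical formant values loosely resembling an open vowel.
    vowel = synthesize_vowel([(700.0, 60.0), (1200.0, 90.0), (2600.0, 150.0)], 2000)
    print(len(vowel), max(abs(v) for v in vowel))
```

The glottal-pulse excitation that the abstract reports as preferred would replace the `impulse_train` and `two_pole_glottal_filter` stages with a pulse shape modeled on the glottal volume-velocity waveform; that pulse model is not specified here and is left out of the sketch.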
Pages: 5 / 20