GLOTTAL INSTANTS EXTRACTION FROM SPEECH SIGNAL USING GENERATIVE ADVERSARIAL NETWORK

Authors
Deepak, K. T. [1 ]
Kulkarni, Pavitra [2 ]
Mudenagudi, U. [2 ]
Prasanna, S. R. M. [3 ]
Affiliations
[1] IIIT Dharwad, Elect & Commun Engn Dept, Dharwad, Karnataka, India
[2] KLE Technol Univ, Sch Elect & Commun Engn, Hubli, Karnataka, India
[3] IIT Dharwad, Dept Elect Engn, Dharwad, Karnataka, India
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Keywords
Glottal closure instants; glottal opening instants; electroglottograph; generative adversarial network; speaker verification
DOI
10.1109/icassp.2019.8683298
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Glottal closure instants (GCIs) and glottal opening instants (GOIs) are important events in the excitation source signal. They mark the closing and opening of the vocal folds during the production of voiced speech. Several applications rely on accurate estimates of these instants from the speech signal. In this work, an electroglottographic-like (EGG-like) signal is synthesized from the speech signal using a generative adversarial network (GAN). The GCIs and GOIs are located using the derivative of the EGG-like signal, which is essentially a difference EGG-like signal. The proposed method is evaluated on the CMU-Arctic database, which contains simultaneous recordings of speech and EGG signals. The locations obtained from the synthesized EGG-like signal are compared with those obtained from the reference difference EGG signal. Results are reported for both seen and unseen conditions. The performance of GCI and GOI estimation is shown to be comparable to that of existing state-of-the-art methods.
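The step of locating instants from a difference EGG-like signal can be illustrated with a minimal sketch. It is not the authors' implementation and does not include the GAN: the input is a synthetic stand-in waveform, and the function names (`synthetic_egg_like`, `difference_signal`, `pick_peaks`, `locate_instants`), the relative threshold, and the minimum peak spacing are all hypothetical choices made for illustration. The sketch also assumes a polarity in which the signal rises at closure, so GCIs appear as strong positive peaks of the difference signal and GOIs as weaker negative peaks; with the opposite polarity the signs would be swapped.

```python
import math


def synthetic_egg_like(num_periods, period=80, close_start=10, close_len=7,
                       open_start=40, open_len=21):
    """Toy EGG-like waveform (an assumption, not real data): a fast
    raised-cosine rise at closure and a slower raised-cosine fall at opening."""
    x = []
    for _ in range(num_periods):
        for p in range(period):
            if p < close_start:
                v = 0.0
            elif p <= close_start + close_len:
                v = 0.5 * (1 - math.cos(math.pi * (p - close_start) / close_len))
            elif p < open_start:
                v = 1.0
            elif p <= open_start + open_len:
                v = 0.5 * (1 + math.cos(math.pi * (p - open_start) / open_len))
            else:
                v = 0.0
            x.append(v)
    return x


def difference_signal(x):
    """First-order difference d[n] = x[n] - x[n-1], with d[0] = 0."""
    return [0.0] + [x[n] - x[n - 1] for n in range(1, len(x))]


def pick_peaks(d, threshold, min_distance):
    """Indices of local maxima of d above threshold; if two candidates are
    closer than min_distance samples, only the larger one is kept."""
    peaks = []
    for n in range(1, len(d) - 1):
        if d[n] > threshold and d[n] >= d[n - 1] and d[n] > d[n + 1]:
            if peaks and n - peaks[-1] < min_distance:
                if d[n] > d[peaks[-1]]:
                    peaks[-1] = n
            else:
                peaks.append(n)
    return peaks


def locate_instants(egg_like, min_distance, rel_threshold=0.3):
    """Return (GCIs, GOIs) as sample indices, assuming closure = rising edge."""
    d = difference_signal(egg_like)
    neg_d = [-v for v in d]
    gcis = pick_peaks(d, rel_threshold * max(d), min_distance)
    gois = pick_peaks(neg_d, rel_threshold * max(neg_d), min_distance)
    return gcis, gois


# Usage on three synthetic periods of 80 samples each.
signal = synthetic_egg_like(3)
gcis, gois = locate_instants(signal, min_distance=40)
print("GCIs:", gcis)
print("GOIs:", gois)
```

On real recordings the threshold and spacing would need tuning to the speaker's pitch range, and GOI peaks are typically much weaker than GCI peaks, which is one reason a separate relative threshold is applied to each polarity here.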
Pages: 5946-5950
Number of pages: 5
Related papers (50 in total)
  • [1] Glottal instants extraction from speech signal using Deep Feature Loss
    Shetty, Supritha M.
    Durgesht, Suraj
    Deepak, K. T.
    2022 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS, SPCOM, 2022,
  • [2] USING EXTREME GRADIENT BOOSTING TO DETECT GLOTTAL CLOSURE INSTANTS IN SPEECH SIGNAL
    Matousek, Jindrich
    Tihelka, Daniel
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6515 - 6519
  • [3] Estimation of glottal closure instants by considering speech signal as a spectrum
    Sripriya, N.
    Nagarajan, T.
    ELECTRONICS LETTERS, 2015, 51 (08) : 649 - 651
  • [4] Comparison of glottal closure instants obtained by using wavelet transform of speech signal and EGG signal
    Seok, JW
    Bae, KS
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1999, E82D (11) : 1486 - 1488
  • [5] Accurate Estimation of Glottal Closure Instants and Glottal Opening Instants from Electroglottographic Signal Using Variational Mode Decomposition
    G. Jyothish Lal
    E. A. Gopalakrishnan
    D. Govind
    Circuits, Systems, and Signal Processing, 2018, 37 : 810 - 830
  • [6] Accurate Estimation of Glottal Closure Instants and Glottal Opening Instants from Electroglottographic Signal Using Variational Mode Decomposition
    Lal, G. Jyothish
    Gopalakrishnan, E. A.
    Govind, D.
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2018, 37 (02) : 810 - 830
  • [7] Determination of the instants of glottal closure from speech wave using wavelet transform
    Du, LM
    Hou, ZQ
    ICSP '96 - 1996 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1996, : 268 - 271
  • [8] Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis
    Bollepalli, Bajibabu
    Juvela, Lauri
    Alku, Paavo
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3394 - 3398
  • [9] Estimation of Glottal Closure Instants from Telephone Speech using a Group Delay-Based Approach that Considers Speech Signal as a Spectrum
    Rachel, G. Anushiya
    Vijayalakshmi, P.
    Nagarajan, T.
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1181 - 1185
  • [10] Transforming the Emotion in Speech using a Generative Adversarial Network
    Yasuda, Kenji
    Orihara, Ryohei
    Sei, Yuichi
    Tahara, Yasuyuki
    Ohsuga, Akihiko
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 427 - 434