Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech

Cited by: 0
Authors
Song, Kaitao [1]
Wan, Teng [2]
Wang, Bixia [2]
Jiang, Huiqiang [1]
Qiu, Luna [1]
Xu, Jiahang [1]
Jiang, Liping [2]
Lou, Qun [2]
Yang, Yuqing [1]
Li, Dongsheng [1]
Wang, Xudong [2]
Qiu, Lili [1]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Shanghai Jiao Tong Univ, Dept Oral & Craniomaxillofacial Surg, Shanghai Ninth Peoples Hosp, Sch Med, Shanghai, Peoples R China
Source
INTERSPEECH 2022 | 2022
Keywords
Cleft Palate; Hypernasality; Automatic Speech Recognition; Acoustic Analysis; Lip; Children
DOI
10.21437/Interspeech.2022-438
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Hypernasality is an abnormal resonance in human speech production that occurs especially in patients with craniofacial anomalies such as cleft palate. In clinical practice, hypernasality estimation is crucial to cleft palate diagnosis, as its results determine subsequent surgery and additional speech therapy. An automatic hypernasality assessment method would therefore help speech-language pathologists make precise diagnoses. Existing methods for hypernasality estimation conduct acoustic analysis only on low-resource cleft palate datasets, using statistical or neural network-based features. In this paper, we propose a novel approach that uses an automatic speech recognition (ASR) model to improve hypernasality estimation. Specifically, we first pre-train an encoder-decoder framework with an ASR objective on a speech-to-text dataset, and then fine-tune the ASR encoder on the cleft palate dataset for hypernasality estimation. With this design, our hypernasality estimation model inherits two advantages of the ASR model: 1) compared with the low-resource cleft palate dataset, the ASR task usually includes large-scale speech data in the general domain, which enables better model generalization; 2) the text annotations in the ASR dataset guide the model to extract better acoustic features. Experimental results on two cleft palate datasets demonstrate that our method achieves superior performance compared with previous approaches.
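The two-stage recipe in the abstract (pre-train an encoder-decoder on ASR, then reuse and fine-tune only the encoder for hypernasality estimation) can be illustrated with a minimal PyTorch sketch. Everything here is an assumption for illustration, not the authors' implementation: the GRU layers, the class names `SpeechEncoder`, `ASRModel`, and `HypernasalityClassifier`, the feature and vocabulary sizes, the binary label set, and the random tensors standing in for both datasets.

```python
import torch
import torch.nn as nn


class SpeechEncoder(nn.Module):
    """Hypothetical acoustic encoder shared by both stages."""

    def __init__(self, n_mels=80, d_model=64):
        super().__init__()
        self.rnn = nn.GRU(n_mels, d_model, batch_first=True)

    def forward(self, feats):              # feats: (batch, frames, n_mels)
        out, _ = self.rnn(feats)
        return out                         # (batch, frames, d_model)


class ASRModel(nn.Module):
    """Stage 1: encoder plus a toy text decoder, trained on speech-to-text pairs."""

    def __init__(self, encoder, vocab_size=32, d_model=64):
        super().__init__()
        self.encoder = encoder
        self.embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, feats, prev_tokens):
        enc = self.encoder(feats)
        h0 = enc.mean(dim=1).unsqueeze(0)  # pooled encoder output as initial decoder state
        dec, _ = self.decoder(self.embed(prev_tokens), h0)
        return self.out(dec)               # (batch, length, vocab_size)


class HypernasalityClassifier(nn.Module):
    """Stage 2: the pre-trained encoder plus a new classification head."""

    def __init__(self, encoder, d_model=64, n_classes=2):
        super().__init__()
        self.encoder = encoder             # same module object, so weights carry over
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, feats):
        return self.head(self.encoder(feats).mean(dim=1))


def pretrain_step(asr, feats, tokens, opt):
    # Teacher forcing: predict token t+1 from tokens up to t.
    logits = asr(feats, tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


def finetune_step(clf, feats, labels, opt):
    loss = nn.functional.cross_entropy(clf(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


torch.manual_seed(0)

# Stage 1: ASR pre-training (random tensors stand in for a speech-to-text dataset).
asr = ASRModel(SpeechEncoder())
asr_opt = torch.optim.Adam(asr.parameters(), lr=1e-3)
feats = torch.randn(4, 50, 80)
tokens = torch.randint(0, 32, (4, 12))
pretrain_loss = pretrain_step(asr, feats, tokens, asr_opt)

# Stage 2: drop the decoder, keep the encoder, fine-tune on (placeholder) cleft palate data.
clf = HypernasalityClassifier(asr.encoder)
clf_opt = torch.optim.Adam(clf.parameters(), lr=1e-4)
cleft_feats = torch.randn(4, 50, 80)
labels = torch.randint(0, 2, (4,))
finetune_loss = finetune_step(clf, cleft_feats, labels, clf_opt)
```

The point of the sketch is the hand-off between stages: the classifier receives the same encoder module that was updated during ASR training, while the text decoder is discarded. A real system would replace the single training steps with full training loops over actual data.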
Pages: 4820-4824
Page count: 5
Related papers
50 in total
  • [21] Zero Time Windowing Analysis of Hypernasality in Speech of Cleft Lip and Palate Children
    Dubey, Akhilesh Kumar
    Prasanna, S. R. Mahadeva
    Dandapat, S.
    2016 TWENTY SECOND NATIONAL CONFERENCE ON COMMUNICATION (NCC), 2016,
  • [22] Validation of an automatic speech analysis in children with isolated cleft palate
    Schulz, A.
    Bocklet, T.
    Eysholdt, U.
    Bohr, C.
    Doellinger, M.
    Ziethe, A.
    HNO, 2014, 62 (07) : 525 - 529
  • [23] Subjective and Objective Evaluation of Speech in Adult Patients with Unrepaired Cleft Palate
    Lou, Qun
    Wang, Xudong
    Jiang, Liping
    Wang, Guomin
    Chen, Yang
    Liu, Qiong
    JOURNAL OF CRANIOFACIAL SURGERY, 2022, 33 (05) : E528 - E532
  • [24] The impact of speech material on speech judgement in children with and without cleft palate
    Klinto, Kristina
    Salameh, Eva-Kristina
    Svensson, Henry
    Lohmander, Anette
    INTERNATIONAL JOURNAL OF LANGUAGE & COMMUNICATION DISORDERS, 2011, 46 (03) : 348 - 360
  • [25] Hypernasality Severity Analysis in Cleft Lip and Palate Speech Using Vowel Space Area
    Nikitha, K.
    Kalita, Sishir
    Vikram, C. M.
    Pushpavathi, M.
    Prasanna, S. R. M.
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1829 - 1833
  • [26] Automated cleft speech evaluation using speech recognition
    Vucovich, Megan
    Hallac, Rami R.
    Kane, Alex A.
    Cook, Julie
    Van'T Slot, Cortney
    Seaward, James R.
    JOURNAL OF CRANIO-MAXILLOFACIAL SURGERY, 2017, 45 (08) : 1268 - 1271
  • [27] Improving Speech Synthesis by Automatic Speech Recognition and Speech Discriminator
    Huang, Li-Yu
    Chen, Chia-Ping
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2024, 40 (01) : 189 - 200
  • [28] Objective assessment of cleft lip and palate speech intelligibility using articulation and hypernasality measures
    Kalita, Sishir
    Girish, K. S.
    Pushpavathi, M.
    Prasanna, S. R. Mahadeva
    Dandapat, S.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2019, 146 (02) : 1164 - 1175
  • [29] AN ATTENTION MODEL FOR HYPERNASALITY PREDICTION IN CHILDREN WITH CLEFT PALATE
    Mathad, Vikram C.
    Scherer, Nancy
    Chapman, Kathy
    Liss, Julie
    Berisha, Visar
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7248 - 7252
  • [30] Enhancement of cleft palate speech using temporal and spectral processing
    Sudro, Protima Nomo
    Prasanna, S. R. Mahadeva
    SPEECH COMMUNICATION, 2020, 123 : 70 - 82