Perceptual Information Loss due to Impaired Speech Production

Cited by: 14
Authors
Asaei, Afsaneh [1, 2]
Cernak, Milos [1]
Bourlard, Herve [1, 3]
Affiliations
[1] Ctr Parc, Idiap Res Inst, CH-1920 Martigny, Switzerland
[2] Tech Univ Munich, UnternehmerTUM, Ctr Innovat & Business Creat, D-80333 Munich, Germany
[3] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Keywords
Information transmission; motor speech disorders; speech production; speech perception; CORTICAL ORGANIZATION; RECOGNITION; DEEP
DOI
10.1109/TASLP.2017.2738445
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Phonological classes define articulatory-free and articulatory-bound phone attributes. A deep neural network is used to estimate the probability of phonological classes from the speech signal. In theory, a unique combination of phone attributes forms a phoneme identity. Probabilistic inference of phonological classes thus enables estimation of their compositional phoneme probabilities. A novel information-theoretic framework is devised to quantify the information conveyed by each phone attribute and to assess the quality of speech production for the perception of phonemes. As a use case, we hypothesize that disruption in speech production leads to information loss in phone attributes, and thus to confusion in phoneme identification. We quantify the amount of information loss due to dysarthric articulation recorded in the TORGO database. A novel information measure is formulated to evaluate the deviation from ideal phone attribute production, which allows healthy production to be distinguished from pathological speech.
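The compositional idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: the attribute inventory, the three-phoneme table, the independence assumption behind the product rule, and the log-loss used as a stand-in for "information loss" are all assumptions made here for illustration. The paper's actual measures are not given in this record.

```python
import math

# Hypothetical attribute inventory and phoneme definitions (illustrative only,
# not taken from the paper). Each phoneme is a binary pattern over attributes.
ATTRIBUTES = ["voiced", "nasal", "labial"]
PHONEMES = {
    "b": (1, 0, 1),
    "p": (0, 0, 1),
    "m": (1, 1, 1),
}

def phoneme_scores(attr_probs):
    """Combine per-attribute probabilities into normalized phoneme scores.

    Assumes attributes are independent, so each phoneme score is a product of
    Bernoulli terms: p if the attribute is present, (1 - p) if it is absent.
    """
    raw = {}
    for phoneme, bits in PHONEMES.items():
        score = 1.0
        for p, present in zip(attr_probs, bits):
            score *= p if present else (1.0 - p)
        raw[phoneme] = score
    total = sum(raw.values())
    if total == 0.0:
        return raw  # no phoneme is compatible with the attribute estimates
    return {phoneme: s / total for phoneme, s in raw.items()}

def attribute_loss_bits(p, target):
    """Bits lost relative to an ideal (probability 1) production of the
    canonical attribute value; a simple log-loss stand-in, assumed here."""
    q = p if target else 1.0 - p
    return -math.log2(max(q, 1e-12))

# Example: attribute probabilities such as a classifier might output.
scores = phoneme_scores([0.9, 0.1, 0.8])
best = max(scores, key=scores.get)
```

Under this sketch, a confident and correct attribute estimate contributes close to zero bits of loss, while an estimate of 0.5 contributes one bit, which gives a rough sense of how degraded attribute production could translate into phoneme confusion.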
Pages: 2433-2443
Page count: 11