4D magnetic resonance imaging atlas construction using temporally aligned audio waveforms in speech

Cited by: 6
Authors
Xing, Fangxu [1]
Jin, Riwei [2]
Gilbert, Imani R. [3]
Perry, Jamie L. [3]
Sutton, Bradley P. [2]
Liu, Xiaofeng [1]
El Fakhri, Georges [1]
Shosted, Ryan K. [4]
Woo, Jonghye [1]
Affiliations
[1] Harvard Med Sch, Massachusetts Gen Hosp, Dept Radiol, Gordon Ctr Med Imaging, Boston, MA 02114 USA
[2] Univ Illinois, Dept Bioengn, Champaign, IL 61801 USA
[3] East Carolina Univ, Dept Commun Sci & Disorders, Greenville, NC 27858 USA
[4] Univ Illinois, Dept Linguist, Champaign, IL 61801 USA
Funding
U.S. National Institutes of Health
Keywords
TONGUE; MOTION;
DOI
10.1121/10.0007064
CLC Number
O42 [Acoustics]
Discipline Code
070206; 082403
Abstract
Magnetic resonance (MR) imaging is becoming an established tool for capturing articulatory and physiological motion of the structures and muscles throughout the vocal tract, enabling visual and quantitative assessment of real-time speech activities. Although motion capture speed has improved steadily with continual developments in high-speed MR technology, quantitative analysis of multi-subject group data remains challenging due to variations in speaking rate and imaging time among subjects. In this paper, a workflow of post-processing methods that matches different MR image datasets within a study group is proposed. Each subject's audio waveform, recorded during speech, is used to extract temporal-domain information and to generate temporal alignment mappings from the subjects' matching patterns. The corresponding image data are resampled by deformable registration and interpolation of the deformation fields, achieving inter-subject temporal alignment between image sequences. A four-dimensional dynamic MR speech atlas is constructed using aligned volumes from four human subjects. Similarity tests between subject and target domains using the squared error, cross correlation, and mutual information measures all show an overall score increase after spatiotemporal alignment. The amount of image variability in atlas construction is reduced, indicating a quality increase in the multi-subject data for groupwise quantitative analysis. © 2021 Acoustical Society of America.
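
The alignment-and-scoring workflow summarized above lends itself to a compact illustration. The following Python sketch is not the authors' implementation: it assumes dynamic time warping as one plausible way to match two subjects' audio feature sequences and obtain an inter-subject temporal mapping, and it implements the three similarity measures named in the abstract (squared error, cross correlation, mutual information) for scoring subject-to-atlas agreement. All function names and the toy data are illustrative assumptions.

import numpy as np

def dtw_path(a, b):
    """Dynamic time warping between two 1-D audio feature sequences.
    Returns (i, j) index pairs mapping frames of `a` onto frames of `b`."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack the optimal warping path from the end of both sequences.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def squared_error(x, y):
    # Sum of squared intensity differences (lower is better).
    return float(np.sum((x - y) ** 2))

def cross_correlation(x, y):
    # Normalized cross correlation (higher is better).
    x, y = x - x.mean(), y - y.mean()
    return float(np.sum(x * y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))

def mutual_information(x, y, bins=32):
    # Histogram-based mutual information in nats (higher is better).
    hist, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

if __name__ == "__main__":
    # Toy audio envelopes for two subjects speaking at different rates.
    t = np.linspace(0.0, 1.0, 100)
    subject_audio = np.sin(2 * np.pi * 3 * t)
    reference_audio = np.sin(2 * np.pi * 3 * t ** 1.3)   # time-warped version
    print("warping path length:", len(dtw_path(subject_audio, reference_audio)))
    # Toy image frames standing in for a subject slice and an atlas slice.
    rng = np.random.default_rng(0)
    img_a, img_b = rng.random((64, 64)), rng.random((64, 64))
    print(squared_error(img_a, img_b),
          cross_correlation(img_a, img_b),
          mutual_information(img_a, img_b))

In practice, a warping path of this kind would drive resampling of each subject's image time series (for example, through interpolation of the deformation fields produced by deformable registration) before the similarity measures are evaluated against the atlas.
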
Pages: 3500 - 3508
Number of pages: 9