Phonetic Segmentation of Speech using STEP and t-SNE

Cited by: 0
Authors
Stan, Adriana [1 ]
Valentini-Botinhao, Cassia [2 ]
Giurgiu, Mircea [1 ]
King, Simon [2 ]
Affiliations
[1] Tech Univ Cluj Napoca, Dept Commun, Cluj Napoca, Romania
[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9YL, Midlothian, Scotland
Source
2015 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED) | 2015
Keywords
phonetic segmentation; STEP; t-SNE; HMM acoustic model; k-Means;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper introduces a first attempt to perform phoneme-level segmentation of speech based on a perceptual representation, the Spectro Temporal Excitation Pattern (STEP), and a dimensionality reduction technique, t-Distributed Stochastic Neighbour Embedding (t-SNE). The method searches for the true phonetic boundaries in the vicinity of those produced by an HMM-based segmentation. It looks for the perceptually salient spectral changes that occur at these phonetic transitions, and exploits t-SNE's ability to capture both the local and global structure of the data. The method is intended to be usable in any language and is therefore not tailored to any particular dataset or language. Results show that this simple approach improves the segmentation accuracy of unvoiced phonemes by 4% within a 5 ms margin and by 5% within a 10 ms margin. For voiced phonemes, however, accuracy drops slightly.
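The boundary search described in the abstract can be illustrated with a small sketch. This record does not give the paper's actual procedure, so everything below is an assumption made for illustration: the frames in a window around an HMM boundary are taken to be already reduced to a few dimensions (for example with scikit-learn's `TSNE`), and the boundary is refined with a two-cluster split constrained to time order, which is a simplified stand-in for k-Means with k = 2. The function name `refine_boundary` and the sample data are hypothetical.

```python
from typing import Sequence


def _sse(segment: Sequence[Sequence[float]]) -> float:
    """Sum of squared distances of the frames in a segment to their mean."""
    dim = len(segment[0])
    mean = [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]
    return sum((frame[d] - mean[d]) ** 2 for frame in segment for d in range(dim))


def refine_boundary(frames: Sequence[Sequence[float]]) -> int:
    """Return the index of the first frame of the right-hand segment.

    `frames` holds low-dimensional frame embeddings (assumed to come from
    t-SNE) for a window centred on an HMM-proposed boundary. Every possible
    contiguous split is tried, and the one with the lowest total
    within-segment scatter is kept.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to place a boundary")
    best_split, best_cost = 1, float("inf")
    for split in range(1, len(frames)):
        cost = _sse(frames[:split]) + _sse(frames[split:])
        if cost < best_cost:
            best_split, best_cost = split, cost
    return best_split


# Hypothetical 2-D embeddings: three frames of one phone, then three of the next.
embedded = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
split = refine_boundary(embedded)
```

With a known window start and frame shift, the refined boundary time would be `(window_start_frame + split) * frame_shift`. Restricting the split to contiguous frames keeps the result a single boundary, which unconstrained k-Means does not guarantee.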
Pages: 6