Phoneme Segmentation using Deep Learning for Speech Synthesis

Cited by: 2
Authors
Lee, Young Han [1]
Yang, Jong-Yeol [1]
Cho, Choongsang [1]
Jung, Hyedong [1]
Affiliations
[1] Korea Elect Technol Inst, Artificial Intelligent Res Ctr, Seongnam, South Korea
Keywords
Phoneme segmentation; Speech synthesis; Deep learning
DOI
10.1145/3264746.3264801
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In this paper, we propose a phoneme segmentation method based on deep learning. Phoneme segmentation is one of the basic modules of unit-selection-based speech synthesis. To improve segmentation, we add a cross-entropy loss to a Deep Speech-based speech recognition architecture. This approach yields more accurate phoneme boundaries. In our experiments, the proposed method achieves a boundary accuracy of 20.91%, which is higher than that of conventional phoneme segmentation.
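The abstract does not give the loss formulation or the evaluation protocol, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the additional cross-entropy term is computed per frame against aligned phoneme labels, that it is added to the recognition loss with a scalar weight, and that boundary accuracy is the fraction of reference boundaries matched within a time tolerance. The function names, the 10 ms frame shift, and the 20 ms tolerance are all hypothetical choices made for illustration.

```python
import math


def frame_cross_entropy(frame_probs, frame_labels):
    """Mean negative log-probability of the aligned phoneme at each frame.

    frame_probs: list of per-frame probability lists (one entry per phoneme).
    frame_labels: list of phoneme indices, one per frame (assumed alignment).
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, label in zip(frame_probs, frame_labels):
        total -= math.log(max(probs[label], eps))
    return total / len(frame_labels)


def combined_loss(recognition_loss, frame_probs, frame_labels, weight=1.0):
    """Recognition loss plus a weighted frame-level cross-entropy term.

    recognition_loss is taken as a precomputed number here; the weight is
    a hypothetical hyperparameter, not a value from the paper.
    """
    return recognition_loss + weight * frame_cross_entropy(frame_probs, frame_labels)


def boundaries_from_frames(frame_labels, frame_shift_ms=10.0):
    """Times (ms) at which the frame-level phoneme label changes."""
    return [
        i * frame_shift_ms
        for i in range(1, len(frame_labels))
        if frame_labels[i] != frame_labels[i - 1]
    ]


def boundary_accuracy(predicted, reference, tolerance_ms=20.0):
    """Fraction of reference boundaries with a prediction within tolerance."""
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference
        if any(abs(r - p) <= tolerance_ms for p in predicted)
    )
    return hits / len(reference)


if __name__ == "__main__":
    probs = [[0.5, 0.5], [0.25, 0.75]]
    labels = [0, 1]
    print(combined_loss(2.0, probs, labels, weight=0.5))
    predicted = boundaries_from_frames([0, 0, 1, 1, 1, 2])
    print(predicted, boundary_accuracy(predicted, [30.0, 90.0]))
```

In this sketch the frame-level term rewards the network for assigning each frame to the correct phoneme, which is what makes label changes usable as boundary estimates; a sequence-level recognition loss alone does not constrain where in time each phoneme is emitted.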
Pages: 59-61
Page count: 3