Multimodal voice conversion based on non-negative matrix factorization

Cited by: 0
Authors
Kenta Masaka
Ryo Aihara
Tetsuya Takiguchi
Yasuo Ariki
Affiliations
[1] Kobe University, Graduate School of System Informatics
[2] Kobe University, Organization of Advanced Science and Technology
Source
EURASIP Journal on Audio, Speech, and Music Processing, Vol. 2015
Keywords
Voice conversion; Multimodal; Image features; Non-negative matrix factorization; Noise robustness
DOI
Not available
Abstract
A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the source and target speakers utter the same texts. The input source signal is decomposed into source exemplars, noise exemplars, and their weights, and the converted speech is then constructed from the target exemplars and the weights associated with the source exemplars. In this study, we propose a multimodal VC method that improves the noise robustness of our NMF-based VC method. We also introduce a combination weight between audio and visual features and formulate a new cost function for estimating audio-visual exemplars. Using joint audio-visual features as source features improves VC performance over the previous audio-input exemplar-based VC method. The effectiveness of the proposed method is confirmed by comparison with a conventional audio-input NMF-based method and a Gaussian mixture model-based method.
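The audio-only exemplar-based pipeline described in the abstract (decompose the noisy input over source and noise exemplars, then rebuild the output from target exemplars and the source weights) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes fixed dictionaries, a plain Euclidean cost with standard multiplicative updates, no sparsity term, and no visual features or combination weight. All names (`estimate_activations`, `convert`, `A_src`, `A_tgt`, `A_noise`) and the random toy data are hypothetical.

```python
import numpy as np

def estimate_activations(X, D, n_iter=200, eps=1e-12):
    """Estimate non-negative weights H such that X ~= D @ H, with D held fixed.

    Assumption: Euclidean cost with multiplicative updates; the paper's
    actual cost function is not reproduced here.
    """
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], X.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update keeps H non-negative at every step.
        H *= (D.T @ X) / (D.T @ D @ H + eps)
    return H

def convert(X_noisy, A_src, A_tgt, A_noise):
    """Convert noisy source spectra to target spectra via shared weights."""
    # Stack source and noise exemplars into one dictionary.
    D = np.hstack([A_src, A_noise])
    H = estimate_activations(X_noisy, D)
    # Keep only the weights tied to the source exemplars; the noise
    # weights are discarded, which is where noise suppression comes from.
    H_src = H[: A_src.shape[1]]
    # Parallel exemplars: column k of A_tgt pairs with column k of A_src.
    return A_tgt @ H_src

# Toy data (random, for shape illustration only).
rng = np.random.default_rng(1)
n_bins, n_exemplars, n_noise, n_frames = 16, 8, 3, 5
A_src = rng.random((n_bins, n_exemplars))
A_tgt = rng.random((n_bins, n_exemplars))
A_noise = rng.random((n_bins, n_noise))
H_true = rng.random((n_exemplars, n_frames))
X_noisy = A_src @ H_true + 0.1 * (A_noise @ rng.random((n_noise, n_frames)))

Y = convert(X_noisy, A_src, A_tgt, A_noise)
print(Y.shape)
```

In this sketch the source and target dictionaries must have the same number of columns, reflecting the parallel training data mentioned in the abstract. A multimodal variant would presumably extend the source dictionary with visual features, but how the combination weight enters the cost function is not stated here, so it is left out.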
Related papers
50 items in total
  • [41] Document clustering based on spectral clustering and non-negative matrix factorization
    Bao, Lei
    Tang, Sheng
    Li, Jintao
    Zhang, Yongdong
    Ye, Wei-Ping
    NEW FRONTIERS IN APPLIED ARTIFICIAL INTELLIGENCE, 2008, 5027 : 149 - +
  • [42] Fast intrusion detection based on a non-negative matrix factorization model
    Guan, Xiaohong
    Wang, Wei
    Zhang, Xiangliang
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2009, 32 (01) : 31 - 44
  • [43] Non-negative matrix factorization based approaches for wall mitigation in TWRI
    D. Kumlu
    I. Erer
    S. Paker
    Signal, Image and Video Processing, 2022, 16 : 889 - 896
  • [44] Evaluation of distance metrics for recognition based on non-negative matrix factorization
    Guillamet, D
    Vitrià, J
    PATTERN RECOGNITION LETTERS, 2003, 24 (9-10) : 1599 - 1605
  • [45] On Detecting Target Acoustic Signals Based on Non-negative Matrix Factorization
    Jin, Yu Gwang
    Kim, Nam Soo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (04) : 922 - 925
  • [46] Image Denoising based on Sparse Representation and Non-Negative Matrix Factorization
    Farouk, R. M.
    Khalil, H. A.
    LIFE SCIENCE JOURNAL-ACTA ZHENGZHOU UNIVERSITY OVERSEAS EDITION, 2012, 9 (01) : 337 - 341
  • [47] Binary Codes Based on Non-Negative Matrix Factorization for Clustering and Retrieval
    Xiong, Jiang
    Tao, Yingyin
    Zhang, Meng
    Li, Huaqing
    IEEE ACCESS, 2020, 8 : 207012 - 207023
  • [48] Source Separation Based on Non-Negative Matrix Factorization of the Synchrosqueezing Transform
    Singh, Neha
    Meignen, Sylvain
    Oberlin, Thomas
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1910 - 1914
  • [49] A Fast Distributed Non-Negative Matrix Factorization Algorithm Based on DSGD
    Gao, Yan
    Zhou, Lingjun
    Chen, Baifan
    Xing, Xiaobing
    INTERNATIONAL JOURNAL OF DISTRIBUTED SYSTEMS AND TECHNOLOGIES, 2018, 9 (03) : 24 - 38
  • [50] NON-NEGATIVE MATRIX FACTORIZATION ON THE ENVELOPE MATRIX IN COCHLEAR IMPLANT
    Hu, Hongmei
    Sang, Jinqiu
    Lutman, Mark
    Bleeck, Stefan
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7790 - 7794