Multimodal voice conversion based on non-negative matrix factorization

Authors
Kenta Masaka
Ryo Aihara
Tetsuya Takiguchi
Yasuo Ariki
Affiliations
[1] Kobe University,Graduate School of System Informatics
[2] Kobe University,Organization of Advanced Science and Technology
Source
EURASIP Journal on Audio, Speech, and Music Processing, vol. 2015
Keywords
Voice conversion; Multimodal; Image features; Non-negative matrix factorization; Noise robustness;
DOI
Not available
Abstract
A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the same texts are uttered by the source and target speakers. The input source signal is decomposed into source exemplars, noise exemplars, and their weights. The converted speech is then constructed from the target exemplars and the weights related to the source exemplars. In this study, we propose multimodal VC that improves the noise robustness of our NMF-based VC method. Furthermore, we introduce a combination weight between audio and visual features and formulate a new cost function to estimate audio-visual exemplars. Using joint audio-visual features as source features improves VC performance compared with the previous audio-input exemplar-based VC method. The effectiveness of the proposed method is confirmed by comparison with a conventional audio-input NMF-based method and a Gaussian mixture model-based method.
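The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the parameter `alpha` (standing in for the audio-visual combination weight), the all-zero visual rows for the noise exemplars, and the choice of the generalized Kullback-Leibler divergence with standard multiplicative updates are all assumptions made for illustration. Under these assumptions, scaling the visual rows of both the input and the dictionary by `alpha` makes the stacked cost equal to the audio divergence plus `alpha` times the visual divergence. The cost function actually formulated in the paper may differ, for example in how it regularizes the weights.

```python
import numpy as np


def estimate_activations(V, W, n_iter=200, eps=1e-12):
    """Find H >= 0 with V ≈ W @ H while the dictionary W stays fixed.

    Uses the standard multiplicative update for the generalized
    KL divergence (an assumed cost, not taken from the paper).
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None] + eps  # W^T 1, one value per exemplar
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / col_sums
    return H


def convert(audio, visual, src_audio, src_visual, tgt_audio, noise_audio, alpha=0.5):
    """Hypothetical helper: decompose noisy audio-visual input, rebuild target audio."""
    n_src = src_audio.shape[1]
    # Assumption: noise exemplars carry no visual information.
    noise_visual = np.zeros((src_visual.shape[0], noise_audio.shape[1]))
    # Joint audio-visual input and dictionary; alpha weights the visual stream.
    V = np.vstack([audio, alpha * visual])
    W = np.vstack([
        np.hstack([src_audio, noise_audio]),
        alpha * np.hstack([src_visual, noise_visual]),
    ])
    H = estimate_activations(V, W)
    # Keep only the weights of the source exemplars; the noise weights are dropped.
    H_src = H[:n_src]
    # Parallel exemplars share indices, so the same weights drive the target side.
    return tgt_audio @ H_src


# Synthetic, non-negative toy data (dimensions chosen arbitrarily).
rng = np.random.default_rng(1)
n_exemplars, n_noise, n_frames = 10, 3, 5
src_audio = rng.random((8, n_exemplars))
src_visual = rng.random((4, n_exemplars))
tgt_audio = rng.random((6, n_exemplars))
noise_audio = rng.random((8, n_noise))
weights = rng.random((n_exemplars, n_frames))
noisy_audio = src_audio @ weights + noise_audio @ rng.random((n_noise, n_frames))
clean_visual = src_visual @ weights

converted = convert(noisy_audio, clean_visual, src_audio, src_visual,
                    tgt_audio, noise_audio, alpha=0.5)
print(converted.shape)
```

In this sketch the visual stream is unaffected by acoustic noise, so a larger `alpha` pulls the estimated weights toward what the visual features support; this is one plausible reading of why joint features could help in noisy conditions.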
Related papers
50 in total
  • [31] Rank selection for non-negative matrix factorization
    Cai, Yun
    Gu, Hong
    Kenney, Toby
    STATISTICS IN MEDICINE, 2023, 42 (30) : 5676 - 5693
  • [32] Initialization enhancer for non-negative matrix factorization
    Zheng, Zhonglong
    Yang, Jie
    Zhu, Yitan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2007, 20 (01) : 101 - 110
  • [33] Convex Non-Negative Matrix Factorization in the Wild
    Thurau, Christian
    Kersting, Kristian
    Bauckhage, Christian
    2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 523 - 532
  • [34] Robust discriminative non-negative matrix factorization
    Zhang, Ruiqing
    Hu, Zhenfang
    Pan, Gang
    Wang, Yueming
    NEUROCOMPUTING, 2016, 173 : 552 - 561
  • [35] Truncated Cauchy Non-Negative Matrix Factorization
    Guan, Naiyang
    Liu, Tongliang
    Zhang, Yangmuzi
    Tao, Dacheng
    Davis, Larry S.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (01) : 246 - 259
  • [36] Non-negative matrix factorization with sparseness constraints
    Hoyer, PO
    JOURNAL OF MACHINE LEARNING RESEARCH, 2004, 5 : 1457 - 1469
  • [37] Non-negative Matrix Factorization for Binary Data
    Larsen, Jacob Sogaard
    Clemmensen, Line Katrine Harder
    2015 7TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (IC3K), 2015, : 555 - 563
  • [38] Probabilistic Sparse Non-negative Matrix Factorization
    Hinrich, Jesper Love
    Morup, Morten
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2018), 2018, 10891 : 488 - 498
  • [39] Study on Text Classification Algorithm Based on Non-negative Matrix Factorization
    Jing, Yongxia
    Gou, Heping
    Fu, Chuanyi
    Liu, Qiang
    2017 10TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2017, : 484 - 487
  • [40] Classification of landsat TM image based on non-negative matrix factorization
    Ren, Jiamian
    Yu, Xianchuan
    Hao, Bixin
    IGARSS: 2007 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS 1-12: SENSING AND UNDERSTANDING OUR PLANET, 2007, : 405 - 408