A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation

Cited: 1
Authors
Li, Jiajia [1 ]
Wang, Ping [2 ]
Li, Zuchao [2 ]
Liu, Xi [3 ]
Utiyama, Masao [4 ]
Sumita, Eiichiro [4 ]
Zhao, Hai [5 ]
Ai, Haojun [2 ]
Affiliations
[1] Hankou Univ, Mus Sch, Wuhan 430212, Peoples R China
[2] Wuhan Univ, Wuhan 430072, Peoples R China
[3] Wuhan Conservatory Mus, Wuhan 430060, Peoples R China
[4] Natl Inst Informat & Commun Technol, Koganei, Tokyo 1848795, Japan
[5] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Artificial intelligence; Decoding; Machine translation; Training data; Music; Natural languages; Computational modeling; Time factors; Fuzzy systems; Task analysis; Music lyrics generation; controllable generation; music understanding; constrained decoding; fuzzy training;
DOI
10.1109/ACCESS.2022.3202010
Chinese Library Classification
TP [Automation Technology; Computer Technology];
Discipline code
0812 ;
Abstract
The generation of music lyrics by artificial intelligence (AI) is frequently modeled as a language-targeted sequence-to-sequence generation task. Formally, if the melody is converted into a word sequence, lyrics generation can be treated as a machine translation task: traditional machine translation maps between cross-lingual word sequences, whereas lyrics generation maps between music and natural-language word sequences. In practice, the theme or keywords of the generated lyrics are often constrained to meet user requirements, which can be formulated as a restricted translation problem. In this paper, we propose a fuzzy training framework that allows a single model to support both unrestricted and restricted translation by adopting an additional auxiliary training process, without constraining the decoding process. This preserves the benefits of restricted translation while greatly reducing the extra time overhead of constrained decoding, thus improving its practicality. Experimental results show that our framework is well suited to Chinese lyrics generation and restricted machine translation tasks, and that it can generate language sequences under the condition of given restricted words without training multiple models, thereby achieving the goal of green AI.
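The central idea described in the abstract, teaching one model to handle both unrestricted and restricted generation by injecting constraint words at training time rather than constraining the decoder, can be illustrated as a data-augmentation step. The following is a minimal sketch under stated assumptions: the `<sep>` marker, the sampling probability, and the limit of two constraint words are illustrative choices, not the paper's exact scheme.

```python
import random

def make_fuzzy_example(source_tokens, target_tokens, p_constrain=0.5, rng=None):
    """Build one seq2seq training pair for fuzzy-style training.

    With probability p_constrain, sample a few words from the target and
    append them to the source after a separator token, so the model learns
    to realize the given words in its output. Otherwise return the plain
    (unrestricted) pair. At inference time the same model can then serve
    both modes with ordinary, unconstrained decoding.
    """
    rng = rng or random.Random()
    if rng.random() < p_constrain and target_tokens:
        # Pick one or two target words to act as pseudo-constraints.
        k = rng.randint(1, min(2, len(target_tokens)))
        constraints = rng.sample(target_tokens, k)
        return list(source_tokens) + ["<sep>"] + constraints, list(target_tokens)
    return list(source_tokens), list(target_tokens)
```

Mixing constrained and unconstrained examples in a single training stream is what would let one model serve both modes, avoiding the decoding-time overhead of lexically constrained beam search.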
Pages: 92467-92480
Page count: 14