LSTM-Based Framework for the Synthesis of Original Soundtracks

Cited by: 0

Authors:

Huo, Yuanzhi [1]

Jin, Mengjie [2]

You, Sicong [3]

Affiliations:

[1] Henan Univ Sci & Technol, Coll Informat Engn, Luoyang 471000, Henan, Peoples R China

[2] Henan Univ Sci & Technol, Sch Math & Stat, Luoyang 471000, Henan, Peoples R China

[3] Nanjing Agr Univ, Coll Food Sci & Technol, Nanjing 210095, Jiangsu, Peoples R China

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Long short term memory; Logic gates; Music; Training; Computer architecture; Task analysis; Recurrent neural networks; Deep learning; Machine learning; Sequences; Performance evaluation; LSTM; machine learning; music synthesis; RNN; sequence prediction; MUSIC;

DOI:

10.1109/ACCESS.2024.3372581

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

Recently, significant progress has been made in applying Long Short-Term Memory (LSTM) networks to music synthesis. Despite these advances, several challenges remain and warrant further research. First, there is a lack of dedicated research on applying LSTM networks to the synthesis of Original Soundtracks (OST). Second, users can generally judge whether synthesized music meets their expectations only from the model output; because training is time-consuming, the model may need to be trained multiple times before a successful result is obtained. Moreover, music quality evaluation is subjective: it relies on human perception, not only on the outcome of model training. To address these challenges, this paper concentrates specifically on OST and proposes an LSTM-based framework termed the OST Synthesis Framework (OSTSF). The framework accepts various types of OST as input and processes them through an LSTM to yield new OST. Additionally, a novel preprocessing algorithm is proposed to screen input OST elements such as notes and chords, enabling control over music type and quality before the training phase. This algorithm serves to mitigate training uncertainty and reduce the need for repeated training. A postprocessing approach that uses mathematical formulations to evaluate the synthesized OST is also proposed. This approach aims to quantify subjective evaluations, providing a more intuitive representation through scoring metrics. Experimental results show that OST synthesized by OSTSF received a favorable rating of 78.8% among a cohort of 100 surveyed respondents, demonstrating the efficacy of the proposed framework for LSTM-based music synthesis.


Pages: 33832-33842

Page count: 11
