Message-Driven Generative Music Steganography Using MIDI-GAN

Times cited: 1
Authors
Su, Zhaopin [1, 2, 3, 4]
Zhang, Guofu [1, 2, 3, 4]
Shi, Zhiyuan [1, 2, 3, 4]
Hu, Donghui [1, 2, 3, 4]
Zhang, Weiming [5]
Affiliations
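[1] Hefei University of Technology, School of Computer Science and Information Engineering, Hefei 230601, People's Republic of China
[2] Ministry of Education, Engineering Research Center of Safety Critical Industrial Measurement and Control, Hefei 230009, People's Republic of China
[3] Hefei University of Technology, Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei 230009, People's Republic of China
[4] Hefei University of Technology, Anhui Province Key Laboratory of Industrial Safety and Emergency Technology, Hefei 230601, People's Republic of China
[5] University of Science and Technology of China, School of Cyber Science and Technology, Hefei 230026, People's Republic of China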
Keywords
Steganography; Generators; Videos; Speech recognition; Multiple signal classification; Adversarial machine learning; Music steganography; Generative adversarial networks; MIDI; Chord numbers; Statistical distribution; Quantization index modulation; Steganalysis; Information
DOI
10.1109/TDSC.2024.3372139
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Generative steganography has become a popular research topic in generative AI, with work on generative image and synthetic speech steganography. However, music files differ from image and speech files in their statistical properties and knowledge representation, and building a reversible transform between a secret message and music is challenging. Existing generative steganographic methods that are effective for images or speech may therefore not be directly effective for music. In this article, we propose a generative music steganography method, named MIDI-GAN, that uses generative adversarial networks (GANs) to turn a secret message into an artificial stego MIDI file. The resulting stego MIDI file is small, melodically pleasant, and undetectable by deep learning-based steganalyzers. Unlike previous generative image/speech steganography, the stego MIDI can also be represented as a sequence of chord numbers, which gives an observer little ground for suspicion. Moreover, these chord numbers can be transmitted over any other digital or physical medium to evade detection. Specifically, MIDI-GAN comprises a generator, a discriminator, and an extractor. The generator synthesizes a stego MIDI file from the secret message, while the discriminator pushes the statistical distribution of the stego MIDI file as close as possible to that of authentic, rather than synthetic, MIDI files. The extractor recovers the secret message from the stego MIDI file or chord sequence. Experimental results demonstrate that MIDI-GAN offers high concealment and security: the stego MIDI it generates closely resembles authentic MIDI files and resists deep learning-based steganalysis.
Pages: 5196-5207
Number of pages: 12