VStego800K: Large-Scale Steganalysis Dataset for Streaming Voice

Cited by: 1
Authors
Xu, Xuan [1]
Guo, Shengnan [1]
Fang, Zhengyang [1]
Zhou, Pengcheng [1]
Yang, Zhongliang [1]
Zhou, Linna [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Cyberspace Security, Beijing 100876, Peoples R China
Source
DIGITAL FORENSICS AND WATERMARKING, IWDW 2023 | 2024 / Vol. 14511
Funding
National Natural Science Foundation of China;
Keywords
VStego800K; Voice Steganalysis; Dataset; QUANTIZATION INDEX MODULATION; STEGANOGRAPHY;
DOI
10.1007/978-981-97-2585-4_21
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, a growing number of steganographic methods based on streaming voice have appeared, posing a serious threat to cyberspace security. To promote the development of streaming voice steganalysis, we construct and release a large-scale streaming voice steganalysis dataset called VStego800K. To reflect real-world needs, we follow three considerations when constructing VStego800K: large scale, real-time operation, and diversity. A large-scale dataset allows researchers to fully explore the differences in the statistical distribution of streaming signals caused by steganography; VStego800K therefore contains 814,592 streaming voice fragments. Of these, 764,592 samples (382,296 cover-stego pairs) form the training set and the remaining 50,000 form the testing set. All samples are cut to a uniform duration of 1 s to encourage researchers to develop near real-time speech steganalysis algorithms. To ensure diversity, the collected voice signals mix male and female speech as well as Chinese and English speech from different speakers. Each steganographic sample in VStego800K is produced with a randomly chosen one of two typical streaming voice steganography algorithms, embedding random bits at an embedding rate of 10%-40%. We evaluate several recent steganalysis algorithms on VStego800K; detailed results and analysis appear in the experimental section. We hope that VStego800K will further promote the development of universal voice steganalysis technology. The description of VStego800K and usage instructions will be released here: https://github.com/YangzlTHU/VStego800K.
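The construction recipe in the abstract can be illustrated with a small sketch. It checks that the stated split sizes are self-consistent and draws one stego configuration (algorithm, embedding rate, random payload). Several details are assumptions and do not come from the paper: the abstract does not name the two algorithms, so `algorithm_a` and `algorithm_b` are placeholders; the rate is assumed to be drawn uniformly from 10%-40%; and `capacity_bits` is a hypothetical count of embeddable bit positions in a 1 s fragment.

```python
import random

# Counts stated in the abstract.
TOTAL_SAMPLES = 814_592
TRAIN_SAMPLES = 764_592
TEST_SAMPLES = 50_000
TRAIN_PAIRS = 382_296

# Hypothetical placeholders: the abstract does not name the two algorithms.
ALGORITHMS = ("algorithm_a", "algorithm_b")
# Embedding-rate range from the abstract (10%-40%).
MIN_RATE, MAX_RATE = 0.10, 0.40


def check_split():
    """Return True if the split sizes in the abstract are self-consistent."""
    return (TRAIN_SAMPLES + TEST_SAMPLES == TOTAL_SAMPLES
            and 2 * TRAIN_PAIRS == TRAIN_SAMPLES)


def sample_stego_config(capacity_bits, rng):
    """Draw one stego configuration: algorithm, embedding rate, payload.

    capacity_bits is an assumed number of embeddable bit positions in a
    1 s fragment; the real value depends on the codec and the algorithm.
    The uniform draw of the rate is also an assumption.
    """
    algorithm = rng.choice(ALGORITHMS)
    rate = rng.uniform(MIN_RATE, MAX_RATE)
    n_bits = int(capacity_bits * rate)
    payload = [rng.randint(0, 1) for _ in range(n_bits)]
    return {"algorithm": algorithm, "rate": rate, "payload": payload}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is reproducible
    print("split consistent:", check_split())
    cfg = sample_stego_config(capacity_bits=1000, rng=rng)
    print(cfg["algorithm"], round(cfg["rate"], 3), len(cfg["payload"]))
```

The sketch only samples the embedding parameters; the actual embedding into a compressed voice stream is specific to each algorithm and is not reproduced here.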
Pages: 292-303
Page count: 12
Related papers
37 in total
[1]   Highly transparent steganography model of speech signals using Efficient Wavelet Masking [J].
Ballesteros L, Dora M. ;
Moreno A, Juan M. .
EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (10) :9141-9149
[2]   An Approach to Information Hiding in Low Bit-rate Speech Stream [J].
Bo, Xiao ;
Yongfeng, Huang ;
Tang, Shanyu .
GLOBECOM 2008 - 2008 IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE, 2008,
[3]   Quantization index modulation: A class of provably good methods for digital watermarking and information embedding [J].
Chen, B ;
Wornell, GW .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2001, 47 (04) :1423-1443
[4]  
Erchi Xu, 2011, Proceedings of the 2011 14th International Conference on Network-Based Information Systems (NBiS 2011), P612, DOI 10.1109/NBiS.2011.103
[5]   Detecting LSB steganography in color and gray-scale images [J].
Fridrich, J. ;
Goljan, M. ;
Du, R. .
IEEE Multimedia, 2001, 8 (04) :22-28
[6]   Voice over Internet protocol (VoIP) [J].
Goode, B .
PROCEEDINGS OF THE IEEE, 2002, 90 (09) :1495-1517
[7]  
Hamdaqa M., 2011, Proceedings of the 2011 Fifth International Conference on Secure Software Integration and Reliability Improvement (SSIRI 2011), P189, DOI 10.1109/SSIRI.2011.24
[8]   Detection of heterogeneous parallel steganography for low bit-rate VoIP speech streams [J].
Hu, Yuting ;
Huang, Yihua ;
Yang, Zhongliang ;
Huang, Yongfeng .
NEUROCOMPUTING, 2021, 419 :70-79
[9]   Detection of covert voice-over Internet protocol communications using sliding window-based steganalysis [J].
Huang, Y. F. ;
Tang, S. ;
Zhang, Y. .
IET COMMUNICATIONS, 2011, 5 (07) :929-936
[10]   Steganography in Inactive Frames of VoIP Streams Encoded by Source Codec [J].
Huang, Yong Feng ;
Tang, Shanyu ;
Yuan, Jian .
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2011, 6 (02) :296-306