Simultaneous speaker identification and watermarking

Cited by: 0

Authors:

Basant S. Abd El-Wahab

Heba A. El-khobby

Mustafa M. Abd Elnaby

Fathi E. Abd El-Samie

Affiliations:

[1] Tanta University, Department of Electronics and Electrical Communications Engineering, Faculty of Engineering

[2] Menoufia University, Department of Electronics and Electrical Communications, Faculty of Electronic Engineering

Source:

International Journal of Speech Technology | 2021 / Volume 24

Keywords:

Biometric systems; Speech watermarking; Empirical mode decomposition; Mel frequency cepstral coefficients; Speech enhancement; Speaker identification;

DOI:

Not available

Abstract:

Biometric template protection of speech signals and information hiding in speech signals are two challenging issues. To address these challenges and increase the level of security, our objective is to build multi-level security systems based on speech signals. To this end, speech watermarking is used simultaneously with automatic speaker identification. Speech watermarking embeds images into the speech signals that are used for speaker identification. The watermark is extracted for authentication, and the effect of watermark removal on the performance of the speaker identification system in the presence of degradations is then studied. This paper presents an approach for speech watermarking based on empirical mode decomposition (EMD) in different transform domains and singular value decomposition (SVD). The speech signal is decomposed in different transform domains with EMD to yield zero-mean components called intrinsic mode functions (IMFs). The watermark is inserted into one of these IMF components with SVD. A comparison between different transform domains for implementing the proposed watermarking scheme on different IMFs is presented. The log-likelihood ratio (LLR), correlation coefficient (Cr), signal-to-noise ratio (SNR), and spectral distortion (SD) are used as metrics for the comparison. According to the simulation results, watermark embedding in the discrete sine transform domain provides higher SNR and Cr values and lower SD and LLR values than the other transform domains considered. The proposed approach is robust to different attacks.
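Two of the comparison metrics named in the abstract, SNR and Cr, can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the code below assumes the common textbook definitions: SNR as the ratio of original-signal energy to the energy of the embedding distortion, in decibels, and Cr as the Pearson correlation coefficient. The function names and the toy signals are hypothetical and are not taken from the paper.

```python
import math

def snr_db(original, watermarked):
    """SNR in dB: original-signal energy over distortion energy (assumed definition)."""
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, watermarked))
    if noise_energy == 0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(signal_energy / noise_energy)

def correlation_coefficient(a, b):
    """Pearson correlation between two equal-length sequences (assumed definition of Cr)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy stand-in for a speech frame: 5 cycles of a sine over 100 samples.
original = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
# Toy stand-in for embedding distortion: a small alternating perturbation.
watermarked = [x + 0.01 * (1 if n % 2 == 0 else -1) for n, x in enumerate(original)]

snr = snr_db(original, watermarked)
cr = correlation_coefficient(original, watermarked)
print(f"SNR = {snr:.2f} dB, Cr = {cr:.4f}")
```

Under these assumed definitions, a higher SNR and a Cr closer to 1 indicate that the compared signals are more alike, which matches the direction of the comparison reported in the abstract.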

Pages: 205-218

Number of pages: 13
