Simultaneous speaker identification and watermarking

Cited by: 2

Authors:

Abd El-Wahab, Basant S. [1]

El-khobby, Heba A. [1]

Abd Elnaby, Mustafa M. [1]

Abd El-Samie, Fathi E. [2]

Affiliations:

[1] Tanta Univ, Fac Engn, Dept Elect & Elect Commun Engn, Tanta, Egypt

[2] Menoufia Univ, Fac Elect Engn, Dept Elect & Elect Commun, Al Minufiyah, Egypt

Source:

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY | 2021 / Vol. 24 / Issue 01

Keywords:

Biometric systems; Speech watermarking; Empirical mode decomposition; Mel frequency cepstral coefficients; Speech enhancement; Speaker identification;

DOI:

10.1007/s10772-019-09658-x

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Biometric template protection of speech signals and information hiding in speech signals are two challenging issues. To address them and increase the level of security, our objective is to build multi-level security systems based on speech signals. To this end, speech watermarking is used simultaneously with automatic speaker identification. The watermarking embeds images into the speech signals that are used for speaker identification. The watermark is extracted for authentication, and the effect of watermark removal on the performance of the speaker identification system in the presence of degradations is then studied. This paper presents an approach for speech watermarking based on empirical mode decomposition (EMD) in different transform domains and singular value decomposition (SVD). The speech signal is decomposed in different transform domains with EMD to yield zero-mean components called intrinsic mode functions (IMFs). The watermark is inserted into one of these IMFs with SVD. Different transform domains are compared for implementing the proposed watermarking scheme on different IMFs, using the log-likelihood ratio (LLR), correlation coefficient (C_r), signal-to-noise ratio (SNR), and spectral distortion (SD) as metrics. According to the simulation results, watermark embedding in the discrete sine transform domain provides higher SNR and C_r values and lower SD and LLR values. The proposed approach is robust to different attacks.
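
The abstract does not give the embedding equations, so the following is only a minimal sketch of how SVD-based embedding in a sine-transform domain might look, together with two of the named metrics (SNR and correlation coefficient). Several things here are assumptions rather than the authors' method: the EMD step is omitted and a plain array stands in for one IMF, the DST-I is used because the abstract does not say which DST type was applied, the additive rule on the singular values is one common SVD watermarking scheme, and all function names and the strength parameter `alpha` are hypothetical.

```python
import numpy as np

def dst1_matrix(n):
    # Orthonormal DST-I matrix; it is symmetric and equal to its own inverse.
    k = np.arange(1, n + 1)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * np.outer(k, k) / (n + 1))

def embed(component, watermark, alpha=0.01):
    # component: 1-D array standing in for one IMF (assumption, no EMD here).
    # watermark: square m x m array, e.g. a small binary image.
    m = watermark.shape[0]
    T = dst1_matrix(m * m)
    marked = np.asarray(component, dtype=float).copy()
    A = (T @ marked[: m * m]).reshape(m, m)       # DST coefficients as a matrix
    U, s, Vt = np.linalg.svd(A)
    D = np.diag(s) + alpha * watermark            # additive rule (assumed)
    Uw, sw, Vwt = np.linalg.svd(D)
    Aw = U @ np.diag(sw) @ Vt                     # host with new singular values
    marked[: m * m] = T @ Aw.ravel()              # inverse DST-I (same matrix)
    return marked, (Uw, Vwt, s)                   # side information for extraction

def extract(marked, key, m, alpha=0.01):
    Uw, Vwt, s = key
    T = dst1_matrix(m * m)
    A = (T @ np.asarray(marked, dtype=float)[: m * m]).reshape(m, m)
    s_star = np.linalg.svd(A, compute_uv=False)
    D = Uw @ np.diag(s_star) @ Vwt
    return (D - np.diag(s)) / alpha

def snr_db(original, processed):
    # Signal-to-noise ratio in dB between the original and the marked signal.
    noise = original - processed
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))

def correlation(a, b):
    # Correlation coefficient between original and extracted watermark.
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]
```

In this sketch, an 8 x 8 watermark occupies the first 64 samples of the host, and extraction needs the side information returned by `embed`. The dense transform matrix keeps the example dependency-light but grows quadratically with the block length, so a fast DST routine would be the more practical choice for real image sizes.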


Pages: 205-218

Page count: 14

