Text-independent speaker identification based on deep Gaussian correlation supervector

Cited by: 0
Authors
Linhui Sun
Ting Gu
Keli Xie
Jia Chen
Affiliations
[1] Nanjing University of Posts and Telecommunications, College of Telecommunications & Information Engineering
[2] Ministry of Education, Key Lab of Broadband Wireless Communication and Sensor Network Technology
[3] Nanjing University of Posts and Telecommunications
Source
International Journal of Speech Technology | 2019 / Vol. 22
Keywords
Gaussian mixture model; Deep neural network; Speaker identification; Bottleneck feature; Deep Gaussian correlation supervector;
DOI
Not available
Abstract
Great progress has been made in speaker recognition by extracting features from a Gaussian mixture model (GMM) or a deep neural network (DNN). In this paper, to capture speaker-specific characteristics more accurately, we propose a novel deep Gaussian correlation supervector (DGCS) feature based on a DBN-GMM hybrid model. We first extract MFCCs from preprocessed speech signals and use a DBN to obtain bottleneck features. The bottleneck features are then fed to a GMM to extract a deep Gaussian supervector (DGS), which can serve as the input to an SVM for pattern discrimination and decision. To account for the correlation between the deep mean vectors of the DGS, the DGS is then transformed into the DGCS by supervector recombination. Our experiments show that the DGCS significantly improves the recognition rate: by 17.979% compared to the system with the supervector only, 18.22% compared to the system with the DGS, and 1.875% compared to the system with the correlation supervector. In addition, the proposed DGCS largely reduces the time complexity of the identification task.
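The GMM stage of the pipeline described in the abstract can be illustrated with a minimal sketch. It assumes a diagonal-covariance GMM whose parameters are already trained, and it builds the supervector by relevance-MAP adaptation of the component means, a common construction that the abstract does not confirm as the paper's own. The function name `map_mean_supervector`, the `relevance` parameter, and the random stand-in features are all hypothetical; the DBN bottleneck extraction, the SVM, and the DGS-to-DGCS recombination are not specified in the abstract and are not shown.

```python
import numpy as np


def map_mean_supervector(frames, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted means of a diagonal-covariance GMM into one vector.

    frames: (T, D) feature frames; weights: (K,); means, variances: (K, D).
    Relevance-MAP adaptation is an assumption here, not the paper's stated method.
    """
    # Log-density of every frame under every component, shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_prob = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_post = np.log(weights)[None, :] + log_prob

    # Normalise in the log domain for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    counts = post.sum(axis=0)  # soft frame counts per component, shape (K,)
    data_means = (post.T @ frames) / np.maximum(counts, 1e-10)[:, None]

    # Interpolate between the data means and the prior means.
    alpha = (counts / (counts + relevance))[:, None]
    adapted = alpha * data_means + (1.0 - alpha) * means
    return adapted.reshape(-1)  # supervector of length K * D


# Stand-in data: random frames in place of real bottleneck features.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 4))
weights = np.array([0.5, 0.5])
means = np.array([[-1.0] * 4, [1.0] * 4])
variances = np.ones((2, 4))

supervector = map_mean_supervector(frames, weights, means, variances)
print(supervector.shape)  # (8,)
```

In a full system, one such supervector per utterance would be passed to a classifier such as an SVM; the larger the `relevance` value, the closer the adapted means stay to the prior means.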
Pages: 449 - 457
Page count: 8
Related papers
50 in total
  • [41] Modified layer deep convolution neural network for text-independent speaker recognition
    Karthikeyan, V
    Priyadharsini, Suja S.
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2024, 36 (02) : 273 - 285
  • [42] Improving Text-independent Speaker Recognition with GMM
    Chakroun, Rania
    Zouari, Leila Beltaifa
    Frikha, Mondher
    Ben Hamida, Ahmed
    2016 2ND INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP), 2016, : 693 - 696
  • [43] Robust Text-independent Speaker recognition with Short Utterances using Gaussian Mixture Models
    Chakroun, Rania
    Frikha, Mondher
    2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 2204 - 2209
  • [44] DISTRIBUTED AUTOMATIC TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING GMM-UBM SPEAKER MODELS
    Chowdhury, Md Foezur Rahman
    Selouani, Sid-Ahmed
    O'Shaughnessy, Douglas
    2009 IEEE 22ND CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, VOLS 1 AND 2, 2009, : 1039 - +
  • [45] Text-independent Speaker Identification Using Fisher Discrimination Dictionary Learning Method
    Wang, Xia
    Yin, Qian
    Guo, Ping
    PROCEEDINGS OF THE 2012 EIGHTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS 2012), 2012, : 435 - 438
  • [46] Text-Independent Emirati-Accented Speaker Identification in Emotional Talking Environment
    Shahin, Ismail
    2018 FIFTH HCT INFORMATION TECHNOLOGY TRENDS (ITT): EMERGING TECHNOLOGIES FOR ARTIFICIAL INTELLIGENCE, 2018, : 257 - 262
  • [47] An Efficient Text-Independent Speaker Identification Using Feature Fusion and Transformer Model
    Khan, Arfat Ahmad
    Jahangir, Rashid
    Alroobaea, Roobaea
    Alyahyan, Saleh Yahya
    Almulhi, Ahmed H.
    Alsafyani, Majed
    Wechtaisong, Chitapong
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02): : 4085 - 4100
  • [48] Residual networks for text-independent speaker identification: Unleashing the power of residual learning
    Gambhir, Pooja
    Dev, Amita
    Bansal, Poonam
    Sharma, Deepak Kumar
    Gupta, Deepak
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2024, 80
  • [49] 2ND-ORDER STATISTICAL MEASURES FOR TEXT-INDEPENDENT SPEAKER IDENTIFICATION
    BIMBOT, F
    MAGRINCHAGNOLLEAU, I
    MATHAN, L
    SPEECH COMMUNICATION, 1995, 17 (1-2) : 177 - 192
  • [50] Cubic Law and MAP Compensation Techniques for Robust Text-Independent Speaker Identification
    Anacleto, Harry
    Chavez, David
    PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING (IWSSIP), 27TH EDITION, 2020, : 387 - 392