Histogram equalization using a reduced feature set of background speakers’ utterances for speaker recognition

Cited by: 0
Authors
Myung-jae KIM [1]
Il-ho YANG [1]
Min-seok KIM [1]
Ha-jin YU [1]
Affiliations
[1] School of Computer Science, University of Seoul
Keywords
Speaker recognition; Histogram equalization; i-vector
DOI
Not available
Chinese Library Classification number
TN912.34 [Speech recognition and equipment]
Discipline classification code
0711
Abstract
We propose a method for histogram equalization using supplement sets to improve the performance of speaker recognition when the training and test utterances are very short. The supplement sets are derived from the background speakers’ utterances using the outputs of selection or clustering algorithms. The proposed approach serves as a feature normalization method that builds histograms when there are too few input utterance samples. In addition, the proposed method is used as an i-vector normalization method in an i-vector-based probabilistic linear discriminant analysis (PLDA) system, which is the current state of the art for speaker verification. The ranks of sample values for histogram equalization are estimated in ascending order from both the input utterances and the supplement set. New ranks are obtained by summing these two kinds of ranks. The proposed method then determines the cumulative distribution function of the test utterance using the newly defined ranks. The proposed method is compared with conventional feature normalization methods, such as cepstral mean normalization (CMN), cepstral mean and variance normalization (MVN), histogram equalization (HEQ), and the European Telecommunications Standards Institute (ETSI) advanced front-end methods. In addition, performance is compared for a case in which the greedy selection algorithm is used with fuzzy C-means and K-means algorithms. The YOHO and Electronics and Telecommunications Research Institute (ETRI) databases are used for evaluation in the feature space. The test sets are simulated using the Opus VoIP codec. We also use the 2008 National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) corpus for the i-vector system. The experimental results demonstrate that the average system performance improves when the proposed method is used, compared with the conventional feature normalization methods.
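The rank-summing step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact CDF formula or the reference distribution, so the midpoint estimate `(rank - 0.5) / total`, the standard normal target, and the names `heq_with_supplement`, `frames`, and `supplement` are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist


def heq_with_supplement(frames, supplement):
    """Equalize one feature dimension using ranks from two sources.

    frames:     values of one feature dimension from a short utterance
    supplement: values of the same dimension from a hypothetical
                supplement set derived from background speakers
    """
    sorted_frames = sorted(frames)
    sorted_supp = sorted(supplement)
    total = len(sorted_frames) + len(sorted_supp)
    reference = NormalDist()  # assumed target: standard normal

    equalized = []
    for value in frames:
        # Ascending rank within the input utterance (values <= value).
        rank_input = bisect_right(sorted_frames, value)
        # Ascending rank of the same value relative to the supplement set.
        rank_supp = bisect_right(sorted_supp, value)
        # New rank: the sum of the two kinds of ranks.
        rank = rank_input + rank_supp
        # Assumed midpoint CDF estimate; stays strictly inside (0, 1).
        cdf = (rank - 0.5) / total
        # Map through the inverse CDF of the reference distribution.
        equalized.append(reference.inv_cdf(cdf))
    return equalized


frames = [0.3, -1.2, 2.5]
supplement = [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]
print(heq_with_supplement(frames, supplement))
```

With only three input frames, the input ranks alone would give a very coarse CDF; in this sketch the six supplement values refine the estimate, which is the motivation the abstract gives for short utterances.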
Pages: 738-750
Number of pages: 13
Related papers
  • [1] Histogram equalization using a reduced feature set of background speakers’ utterances for speaker recognition
    Myung-jae Kim
    Il-ho Yang
    Min-seok Kim
    Ha-jin Yu
    Frontiers of Information Technology & Electronic Engineering, 2017, 18 : 738 - 750
  • [2] Histogram equalization using a reduced feature set of background speakers' utterances for speaker recognition
    Kim, Myung-jae
    Yang, Il-ho
    Kim, Min-seok
    Yu, Ha-jin
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2017, 18 (05) : 738 - 750
  • [3] Histogram Equalization Using Centroids of Fuzzy C-Means of Background Speakers' Utterances for Majority Voting Based Speaker Identification
    Kim, Myung-Jae
    Yang, Il-Ho
    Yu, Ha-Jin
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2014, 33 (01) : 68 - 74
  • [4] Probabilistic location recognition using reduced feature set
    Li, Fayin
    Kosecka, Jana
    2006 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), VOLS 1-10, 2006, : 3405 - +
  • [5] FUZZY-BASED HISTOGRAM EQUALIZATION PROCEDURE FOR IMAGE RECOGNITION USING POINTED SET
    YOO, SW
    GIARDINA, CR
    JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY, 1995, 39 (05) : 457 - 463
  • [6] Face Recognition and Verification using Histogram Equalization
    Ramirez-Gutierrez, Kelsey
    Cruz-Perez, Daniel
    Perez-Meana, Hector
    SELECTED TOPICS IN APPLIED COMPUTER SCIENCE, 2010, : 85 - +
  • [7] Environmental robust speech and speaker recognition through multi-channel histogram equalization
    Squartini, Stefano
    Principi, Emanuele
    Rotili, Rudy
    Piazza, Francesco
    NEUROCOMPUTING, 2012, 78 (01) : 111 - 120
  • [8] Asynchronous Factorisation of Speaker and Background with Feature Transforms in Speech Recognition
    Saz, Oscar
    Hain, Thomas
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1237 - 1241
  • [9] Advanced Palmprint Recognition using Unsharp Masking and Histogram Equalization
    Palanikumar, S.
    Sajan, C. Minu
    Sasikumar, M.
    2013 IEEE CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES (ICT 2013), 2013, : 47 - 52
  • [10] Histogram of Oriented Gradients Based Reduced Feature for Traffic Sign Recognition
    Deepika
    Vashisth, Sharda
    Saurav, Sumeet
    2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 2206 - 2212