I-vector Extraction for Speaker Recognition Based on Dimensionality Reduction

Cited by: 18

Authors:

Ibrahim, Noor Salwani [1]

Ramli, Dzati Athiar [1]

Affiliations:

[1] Univ Sains Malaysia, Sch Elect & Elect, Engn Campus, Nibong Tebal 14300, Pulau Pinang, Malaysia

Source:

KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES-2018) | 2018 / Vol. 126

Keywords:

Bob Spear toolbox; I-vectors; Dimensionality Reduction; UBM size; Frog Identification; SHORT SEQUENCES; IDENTIFICATION;

DOI:

10.1016/j.procs.2018.08.126

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Many methods for speaker recognition have been proposed over time. Automatic speaker recognition technology has reached a good level of performance, but there is still room for improvement. In this paper, a new low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis, also known as the i-vector approach. This space is named the total variability space because it models both speaker and channel variabilities. I-vector subspace modelling is one of the recent methods that have become the state of the art in this domain. Its main benefit is that it models both intra-domain and inter-domain variabilities in the same low-dimensional space. In this study, 2656 syllables of bio-acoustic signals from 55 frog species, taken from the Intelligent Biometric Group, USM database, are used to build a frog identification system. The system parameters are tuned first: the Universal Background Model (UBM) size (32, 64 and 128 Gaussians) and the i-vector dimensionality (100, 200 and 400 dimensions). We then assess the effect of the tuned parameters and record the computation time. We observe that a smaller UBM size combined with a higher i-vector dimensionality outperforms the other settings, achieving an accuracy of 91.11%. From this research, it can be concluded that UBM size and i-vector dimensionality affect the accuracy of i-vector-based frog identification. (C) 2018 The Authors. Published by Elsevier Ltd.
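
For context, the total variability model that underlies i-vectors is usually written as below. The notation follows the standard formulation in Dehak et al., "Front-End Factor Analysis for Speaker Verification", and is an assumption here; the abstract does not state which symbols the authors use. The feature dimension F is likewise a placeholder that the abstract does not specify.

```latex
% Total variability model (standard formulation; symbols are assumed, not taken from this paper)
%   M : utterance-dependent GMM mean supervector, dimension C F
%   m : UBM mean supervector, dimension C F
%   T : low-rank total variability matrix, size C F x R
%   w : i-vector, dimension R, with a standard normal prior
\[
  M = m + T\,w, \qquad w \sim \mathcal{N}(0, I)
\]
% In the study described above, C (UBM size) takes the values 32, 64, 128
% and R (i-vector dimensionality) takes the values 100, 200, 400.
```

Under this reading, the two tuned parameters set the shape of T: the UBM size fixes the number of rows (through C) and the i-vector dimensionality fixes the number of columns R.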

Pages: 1534-1540

Page count: 7
