A framework for speaker retrieval and identification through unsupervised learning

Cited by: 4
Authors
Campos, Victor de Abreu [1]
Guimaraes Pedronette, Daniel Carlos [1]
Affiliations
[1] State Univ Sao Paulo, UNESP, Dept Stat Appl Math & Comp, Rio Claro, Brazil
Source
COMPUTER SPEECH AND LANGUAGE | 2019 / Volume 58 / Pages 153-174
Funding
São Paulo Research Foundation, Brazil;
Keywords
Speaker recognition; Speaker retrieval; Unsupervised learning; Vector quantization; Gaussian mixture model; i-vector; IMAGE RE-RANKING; RECOGNITION; SIMILARITY; MACHINES;
DOI
10.1016/j.csl.2019.04.004
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speaker recognition is a task of remarkable relevance, with applications in diverse domains. Recently, mainly owing to the ease of acquiring audio-visual content, the capacity to analyze growing datasets without relying on labeled data has become a crucial advantage. This paper presents a speaker recognition approach based on recent unsupervised learning methods, which do not require any labeled data or user intervention. The approach is organized as a framework that exploits a rank-based formulation. The similarity information defined by speaker modeling techniques is encoded in ranked lists, which are used as input by the unsupervised learning algorithms. Vector quantization, Gaussian mixture models and i-vectors are employed as modeling techniques, while the algorithms RL-Sim and ReckNN are used for unsupervised learning tasks. The framework was experimentally evaluated on query-by-example speaker retrieval and speaker identification tasks, on both clean and noisy speech recordings. The evaluation was conducted on three public datasets covering different languages and recording conditions. Effectiveness gains of up to +56% on retrieval measures were obtained through the use of unsupervised learning algorithms over traditional speaker recognition techniques. © 2019 Elsevier Ltd. All rights reserved.
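The rank-based formulation in the abstract can be illustrated with a minimal sketch. It is not the RL-Sim or ReckNN algorithm named above: it swaps in a plain Jaccard overlap of top-k neighbourhoods as a generic stand-in for a rank-based unsupervised re-ranking step. The function names `ranked_lists` and `jaccard_rerank`, the parameter `k`, and the toy similarity matrix are all assumptions made for illustration. In practice the matrix would presumably come from a speaker modeling technique such as vector quantization, Gaussian mixture models or i-vectors.

```python
import numpy as np


def ranked_lists(sim: np.ndarray) -> np.ndarray:
    """Encode a similarity matrix as ranked lists.

    Row i of the result holds all item indices sorted by decreasing
    similarity to item i (ties keep their original index order).
    """
    return np.argsort(-sim, axis=1, kind="stable")


def jaccard_rerank(sim: np.ndarray, k: int) -> np.ndarray:
    """Hypothetical re-ranking step (NOT RL-Sim or ReckNN).

    Two recordings are considered similar when their top-k ranked
    lists share many items; new ranked lists are built from that
    overlap, using no labels at all.
    """
    ranks = ranked_lists(sim)
    n = sim.shape[0]
    topk = [set(ranks[i, :k].tolist()) for i in range(n)]
    new_sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            inter = len(topk[i] & topk[j])
            union = len(topk[i] | topk[j])
            new_sim[i, j] = inter / union
    return ranked_lists(new_sim)


if __name__ == "__main__":
    # Toy data (assumed): recordings 0-1 share one speaker, 2-3 another.
    toy_sim = np.array([
        [1.0, 0.9, 0.2, 0.1],
        [0.9, 1.0, 0.3, 0.2],
        [0.2, 0.3, 1.0, 0.8],
        [0.1, 0.2, 0.8, 1.0],
    ])
    print(ranked_lists(toy_sim))
    print(jaccard_rerank(toy_sim, k=2))
```

In a query-by-example setting, each row of the returned array would be read as the retrieval result for the corresponding recording. The choice of `k` is left open here, since the record does not state how neighbourhood sizes were set.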
Pages: 153-174
Page count: 22