Speaker Identification for the Analysis of Joint Attention in Video

Cited by: 0
Authors
Gonzalez Contreras, Carlos Eduardo [1]
De-la-Torre, Miguel [1]
Gonzalez Becerra, Victor Hugo [1]
Avila-George, Himer [1]
Hernandez Palacio, Raul [2]
Affiliations
[1] Univ Guadalajara, Ameca, Mexico
[2] Univ Autonoma Estado Hidalgo, Pachuca, Hidalgo, Mexico
Source
2019 8TH INTERNATIONAL CONFERENCE ON SOFTWARE PROCESS IMPROVEMENT (CIMPS) | 2019
Keywords
Joint attention; speaker identification; MFCC; GMM; SVM
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Joint attention (AC) is a human skill essential to individual development, including language learning. Experimental studies of AC commonly involve analyzing video recordings of scenes in which individuals interact, and some elements, including each participant's interventions, are registered manually. This work proposes the design of a speaker identification system for the analysis of AC, which provides the sequence of interventions by each speaker in videos of AC scenarios. To support implementation, a comparison of the most common techniques for speaker identification is provided. The features compared are Mel Frequency Cepstral Coefficients (MFCC) and the combination MFCC+deltaMFCC. For classification, Gaussian mixture models (GMM) and support vector machines (SVM) were employed. After 5-fold cross-validation on 30 audio segments, each 3-4 seconds long, the results show an accuracy close to 90% using MFCC+deltaMFCC with SVM. This result supports the feasibility of implementing the proposed system.
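As a rough illustration of two steps the abstract names, the sketch below appends delta coefficients to per-frame MFCC vectors and splits 30 segments into 5 folds. It is not the authors' implementation: the function names, the regression-style delta formula with a window of 2 frames, the edge padding, and the contiguous fold layout are all assumptions, since the abstract does not specify them. The MFCC extraction and the GMM/SVM classifiers are left out, and the input frames are made-up numbers.

```python
from typing import Iterator, List, Tuple

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-style delta coefficients (assumed formula, not from the paper).

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]  # repeat the last frame past the end
                behind = frames[max(t - n, 0)][k]    # repeat the first frame before the start
                num += n * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def with_deltas(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Concatenate each frame with its delta, giving an MFCC+deltaMFCC style vector."""
    return [c + d for c, d in zip(frames, delta(frames, width))]


def k_fold_indices(n_items: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


# Toy input: 4 frames with 2 made-up coefficients each (real MFCC vectors are longer).
toy = [[0.0, 10.0], [1.0, 11.0], [2.0, 12.0], [3.0, 13.0]]
features = with_deltas(toy)

# 30 segments and 5 folds, the counts given in the abstract.
folds = list(k_fold_indices(30, k=5))
```

Under these assumptions each fold trains on 24 segments and tests on 6. In practice the segments would usually be shuffled or stratified by speaker before splitting, which this sketch does not do.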
Pages: 7