Forensic Speaker Verification Using Ordinary Least Squares

Cited by: 7
Authors
Machado, Thyago J. [1]
Filho, Jozue Vieira [2]
de Oliveira, Mario A. [3]
Affiliations
[1] Sao Paulo State Univ UNESP, Campus Ilha Solteira, BR-15385000 Sao Paulo, SP, Brazil
[2] Sao Paulo State Univ UNESP, Telecommun & Aeronaut Engn, BR-13876750 Sao Joao da Boa Vista, SP, Brazil
[3] Mato Grosso Fed Inst Technol, Automat & Control Engn, BR-78005200 Cuiaba, Brazil
Keywords
forensic speaker comparison; forensic phonetics; voice processing; ordinary least squares (OLS); linear predictive coding (LPC); recognition
DOI
10.3390/s19204385
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
In Brazil, speaker recognition for forensic purposes still relies on a subjective decision-making process based on analyzing the results of unreliable techniques. Because no voice database is available, speaker verification is currently applied to samples collected specifically for comparison. However, comparing speakers against contested speech requires collecting an excessive number of voice samples from a series of individuals. Further, the recognition system must indicate which of the pre-selected individuals is most compatible with the contested voice. Accordingly, this paper proposes combining linear predictive coding (LPC) and ordinary least squares (OLS) as a speaker verification tool for forensic analysis. The proposed technique provides confidence and similarity measures on which to base forensic reports that verify the speaker of the contested speech. The paper thus contributes an accurate, fast, alternative method to help verify the speaker. In seven different tests, the study achieved a preliminary hit rate of 100% on a limited dataset (Brazilian Portuguese). Furthermore, the developed method extracts a larger number of formants, which are indispensable for statistical comparisons via OLS. The proposed framework is robust to certain levels of noise, to sentences with suppressed or changed words, and to differences in audio quality or even substantial differences in audio duration.
Pages: 22