VocalPrint: A mmWave-Based Unmediated Vocal Sensing System for Secure Authentication

Cited by: 19
Authors
Li, Huining [1 ]
Xu, Chenhan [1 ]
Rathore, Aditya Singh [1 ]
Li, Zhengxiong [1 ]
Zhang, Hanbin [1 ]
Song, Chen [2 ]
Wang, Kun [3 ]
Su, Lu [4 ]
Lin, Feng [5 ]
Ren, Kui [5 ]
Xu, Wenyao [1 ]
Affiliations
[1] Univ Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14261 USA
[2] San Diego State Univ, Dept Comp Sci, San Diego, CA 92182 USA
[3] Univ Calif Los Angeles, Dept Elect & Comp Engn, Los Angeles, CA 90095 USA
[4] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[5] Zhejiang Univ, Dept Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
mmWave sensing; voice authentication; biometrics; SPEAKER; RECOGNITION; MODEL;
DOI
10.1109/TMC.2021.3084971
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Subject classification code
0812;
Abstract
With the continuing growth of voice-controlled devices, voice biometrics have been widely used for user identification. However, voice biometrics are vulnerable to replay attacks and ambient noise. We identify that the fundamental vulnerability in voice biometrics is rooted in its indirect sensing modality (e.g., a microphone). In this paper, we present VocalPrint, a resilient mmWave interrogation system that directly captures and analyzes vocal vibrations for user authentication. Specifically, VocalPrint exploits the unique disturbance that vocal vibrations impose on radio frequency (RF) signals reflected from the skin around the user's near-throat region. Complex ambient noise is isolated from the RF signal using a novel resilience-aware clutter suppression approach that preserves fine-grained vocal biometric properties. Afterward, we extract vocal tract and vocal source features and feed them into an ensemble classifier for authentication. VocalPrint is practical, as it can transition easily to smartphone platforms, and its non-contact nature makes it easy to use. Our experimental results from 41 participants with different interrogation distances, orientations, and body motions show that VocalPrint achieves over 96 percent authentication accuracy even under unfavorable conditions. We demonstrate the resilience of our system against complex noise interference and spoof attacks of various threat levels.
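To make the pipeline described in the abstract concrete, below is a minimal Python sketch of the processing chain it outlines: clutter suppression on the complex mmWave baseband signal, recovery of the vocal vibration waveform from the signal phase, extraction of simple spectral features, and an ensemble classifier. Everything here is an illustrative assumption rather than the authors' implementation: the exponential-average clutter canceller stands in for the paper's resilience-aware clutter suppression, the band-energy features stand in for the vocal tract/source features, the random forest stands in for the paper's ensemble classifier, and the synthetic signal model and function names are invented for the demo.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def suppress_clutter(iq, alpha=0.01):
    """Subtract a slowly adapting estimate of the quasi-static clutter
    component from the complex baseband samples (simplified canceller)."""
    est = iq[0]
    clutter = np.empty_like(iq)
    for n, s in enumerate(iq):
        est = (1.0 - alpha) * est + alpha * s
        clutter[n] = est
    return iq - clutter

def vibration_waveform(iq):
    """Use the unwrapped signal phase as a proxy for skin displacement."""
    return np.unwrap(np.angle(iq))

def band_energy_features(x, n_bands=20):
    """Log energies in equal-width spectral bands (crude stand-in for
    vocal tract / vocal source features)."""
    spec = np.abs(np.fft.rfft(x * np.hanning(x.size))) ** 2
    return np.log(np.array([b.sum() for b in np.array_split(spec, n_bands)]) + 1e-12)

def featurize(iq):
    return band_energy_features(vibration_waveform(suppress_clutter(iq)))

# Synthetic demo: two "speakers" whose vocal vibrations, at different
# dominant frequencies, phase-modulate the reflected signal.
rng = np.random.default_rng(0)
fs = 2000
t = np.arange(fs) / fs  # one second of samples

def synth(f0):
    phase = 0.8 * np.sin(2 * np.pi * f0 * t)        # vibration-induced phase
    clutter = 5.0 * np.exp(1j * 0.3)                 # static reflection
    noise = 0.05 * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))
    return clutter + np.exp(1j * phase) + noise

X = np.array([featurize(synth(f0 + rng.normal(0, 2)))
              for f0 in [110] * 30 + [180] * 30])
y = np.array([0] * 30 + [1] * 30)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[::2], y[::2])
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```

The random forest is used only because it is a readily available ensemble method; the paper's classifier and feature design are more specialized, and an FMCW radar front end would add range processing before this stage.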
Pages: 589-606
Number of pages: 18