Acoustic frontend;
analog machine learning;
context-aware computing;
hierarchical computing;
scalable low power analog;
voice activity detection (VAD);
ANALOG;
SYSTEM;
END;
DOI:
10.1109/JSSC.2015.2487276
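Chinese Library Classification (CLC) number:
TM [Electrical Engineering];
TN [Electronic Technology, Communication Technology];
Discipline classification code: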
0808 ;
0809 ;
Abstract:
This work presents a sub-6 µW acoustic frontend for speech/non-speech classification in a voice activity detection (VAD) system implemented in 90 nm CMOS. The power consumption of the VAD system is minimized by designing the architecture around a new power-proportional sensing paradigm and by using machine-learning-assisted, moderate-precision analog analytics for classification. Power-proportional sensing allows hierarchical and context-aware scaling of the frontend's power consumption according to the complexity of the ongoing information extraction, while analog analytics increases power efficiency by switching the computation of individual features ON or OFF depending on each feature's usefulness in a particular context. The proposed VAD system reduces power consumption by 10× compared to state-of-the-art (SotA) systems, yet achieves an 89% average hit rate (HR) at a 12 dB signal-to-acoustic-noise ratio (SANR) in a babble context, which is on par with software-based VAD systems.
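The idea of context-dependent feature gating described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names, the per-feature power costs, and the context-to-feature mapping are all hypothetical values chosen only to show how the power budget could scale with the set of features switched ON.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    power_uw: float  # hypothetical per-feature power cost in microwatts


# Hypothetical feature bank; names and costs are illustrative, not from the paper.
FEATURES = [
    Feature("band_0", 0.5),
    Feature("band_1", 0.5),
    Feature("band_2", 1.0),
    Feature("band_3", 1.0),
]

# Hypothetical mapping from acoustic context to the features assumed useful in it.
CONTEXT_FEATURES = {
    "quiet": {"band_0"},
    "babble": {"band_0", "band_1", "band_2"},
    "unknown": {"band_0", "band_1", "band_2", "band_3"},
}


def active_features(context):
    """Return the features switched ON for a context (all features if unseen)."""
    wanted = CONTEXT_FEATURES.get(context, CONTEXT_FEATURES["unknown"])
    return [f for f in FEATURES if f.name in wanted]


def frontend_power_uw(context):
    """Sum the power of the active features only; gated-off features cost nothing here."""
    return sum(f.power_uw for f in active_features(context))
```

In this toy model, power consumption follows the number and cost of the enabled features, so a simple context draws less power than an unrecognized one, which falls back to the full feature set.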