21 references in total
[1]
Ramirez J., Gorriz J.-M., Segura J.-C., Voice activity detection. Fundamentals and speech recognition system robustness, in Robust Speech Recognition and Understanding
[2]
Wisdom S., Okopal G., Atlas L., Pitton J., Voice activity detection using subband noncircularity, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 4505-4509, (2015)
[3]
Heese F., Niermann M., Vary P., Speech-codebook based soft voice activity detection, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 4335-4339, (2015)
[4]
Tao F., Hansen J.H.L., Busso C., An unsupervised visual-only voice activity detection approach using temporal orofacial features, Proceedings of 16th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 2302-2306, (2015)
[5]
Zhan G., Huang Z.-Q., et al., Spectrographic speech mask estimation using the time-frequency correlation of speech presence, Proceedings of 16th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 2287-2291, (2015)
[6]
Ramirez J., Segura J.-C., Benitez C., et al., Efficient voice activity detection algorithms using long-term speech information, Speech Communication, 42, 3, pp. 271-287, (2004)
[7]
Ghosh P.-K., Tsiartas A., Narayanan S., Robust voice activity detection using long-term signal variability, IEEE Transactions on Audio, Speech, and Language Processing, 19, 3, pp. 600-613, (2011)
[8]
Ma Y., Nishihara A., Efficient voice activity detection algorithm using long-term spectral flatness measure, EURASIP Journal on Audio, Speech, and Music Processing, (2013)
[9]
Yang X.-K., He L., Qu D., Zhang W.-Q., Voice activity detection algorithm based on long-term pitch information, EURASIP Journal on Audio, Speech, and Music Processing, (2016)
[10]
Davis S., Mermelstein P., Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Transactions on Acoustics, Speech and Signal Processing, 28, 4, pp. 357-366, (1980)