[1]
Alsayadi H., Abdelhamid A., Hegazy I., Fayed Z., Arabic speech recognition using end-to-end deep learning, IET Signal Processing, (2021)
[2]
Amirgaliyev N., Kuanyshbay D., Baimuratov O., Development of automatic speech recognition for Kazakh language using transfer learning, Speech Recognition for Kazakh Language Project, (2020)
[3]
Brown J., Smaragdis P., Hidden Markov and Gaussian mixture models for automatic call classification, The Journal of the Acoustical Society of America, 125, pp. EL221-EL224, (2009)
[4]
Chan W., Jaitly N., Le Q., Vinyals O., Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4960-4964, (2016)
[5]
Chan W., Jaitly N., Le Q., Vinyals O., Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4960-4964, (2016)
[6]
Chen J., Nishimura R., Kitaoka N., End-to-end recognition of streaming Japanese speech using CTC and local attention, APSIPA Transactions on Signal and Information Processing, (2020)
[7]
Emiru E., Li Y., Fesseha A., Diallo M., Improving Amharic speech recognition system using connectionist temporal classification with attention model and phoneme-based byte-pair-encodings, Information, 12, (2021)
[8]
Graves A., Fernandez S., Gomez F., Schmidhuber J., Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks, ICML 2006: Proceedings of the 23rd International Conference on Machine Learning, pp. 369-376, (2006)
[9]
Hinton G., Deng L., Yu D., Dahl G., Mohamed A.-R., Jaitly N., Senior A., Vanhoucke V., Nguyen P., Sainath T., Kingsbury B., Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Processing Magazine, 29, 6, pp. 82-97, (2012)
[10]
Hori T., Watanabe S., Zhang Y., Chan W., Advances in joint CTC-attention based end-to-end speech recognition with a deep CNN encoder and RNN-LM, INTERSPEECH 2017, (2017)