Audiovisual speech recognition for Kannada language using feed forward neural network

Cited by: 3
Authors
Shashidhar, R. [1 ]
Patilkulkarni, S. [1 ]
Affiliation
[1] JSS Sci & Technol Univ, Dept Elect & Commun Engn, Mysuru 570006, India
Source
NEURAL COMPUTING & APPLICATIONS | 2022 / Volume 34 / Issue 18
Keywords
Audiovisual speech recognition; Dlib; Feed forward neural network; Kannada Language; LSTM; MFCC;
DOI
10.1007/s00521-022-07249-7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Audiovisual speech recognition (AVSR) is a promising technology for noisy environments. In this work, we build a database for the Kannada language and develop an AVSR system for it. The proposed system has three main components: (a) an audio mechanism, (b) a visual speech mechanism, and (c) integration of the audio and visual mechanisms. In the audio module, MFCC is used to extract features and a one-dimensional convolutional neural network is used for classification. In the visual module, Dlib is used to extract features and a long short-term memory (LSTM) recurrent neural network is used for classification. Finally, the audio and visual modules are integrated using a feed forward neural network. For audio speech recognition on the Kannada dataset, the training accuracy is 93.86% and the testing accuracy is 91.07% after seventy epochs. For visual speech recognition on the Kannada dataset, the training accuracy is 77.57% and the testing accuracy is 75%. After integration, audiovisual speech recognition on the Kannada dataset reaches a training accuracy of 93.33% and a testing accuracy of 92.26%.
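The abstract does not state what the fusion network takes as input, how many layers it has, or how it is trained. As a minimal sketch only, assuming each module emits one score per word class and that the two score vectors are concatenated, the integration step might look like the following plain-Python forward pass. Every name, size, and score below is hypothetical, and the weights are random and untrained, so the output illustrates the data flow rather than a meaningful prediction.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained placeholder)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse(audio_vec, visual_vec, params):
    """Concatenate both modality vectors and run a one-hidden-layer network."""
    (w1, b1), (w2, b2) = params
    joint = list(audio_vec) + list(visual_vec)
    hidden = relu(dense(joint, w1, b1))
    return softmax(dense(hidden, w2, b2))


# Hypothetical sizes (assumptions, not from the paper): 4 classes, 8 hidden units.
NUM_CLASSES, HIDDEN = 4, 8
rng = random.Random(0)
params = (init_layer(2 * NUM_CLASSES, HIDDEN, rng),
          init_layer(HIDDEN, NUM_CLASSES, rng))

# Made-up per-modality class scores for a single utterance.
audio_scores = [0.70, 0.10, 0.10, 0.10]
visual_scores = [0.40, 0.30, 0.20, 0.10]

fused = fuse(audio_scores, visual_scores, params)
predicted = max(range(NUM_CLASSES), key=fused.__getitem__)
```

Here `fused` is a probability vector over the hypothetical classes and `predicted` is the index of the largest entry. In a trained system the weights would be learned from paired audio and visual outputs, and the inputs could just as well be intermediate feature vectors instead of class scores.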
Pages: 15603 - 15615
Number of pages: 13
Related papers
50 in total
  • [1] Audiovisual speech recognition for Kannada language using feed forward neural network
    R. Shashidhar
    S. Patilkulkarni
    Neural Computing and Applications, 2022, 34 : 15603 - 15615
  • [2] Speech Recognition Using Feed Forward Neural Network and Principle Component Analysis
    Momo, Nusrat
    Abdullah
    Uddin, Jia
    ADVANCES IN SIGNAL PROCESSING AND INTELLIGENT RECOGNITION SYSTEMS, 2018, 678 : 228 - 239
  • [3] Visual Speech Recognition for Kannada Language Using VGG16 Convolutional Neural Network
    Rudregowda, Shashidhar
    Kulkarni, Sudarshan Patil
    Gururaj, H. L.
    Ravi, Vinayakumar
    Krichen, Moez
    ACOUSTICS, 2023, 5 (01): : 343 - 353
  • [4] An experiment with feed-forward neural network for speech recognition
    Jelinek, B
    Juhar, J
    Cizmar, A
    STATE OF THE ART IN COMPUTATIONAL INTELLIGENCE, 2000, : 308 - 313
  • [5] Back Propagation Feed forward neural network approach for Speech Recognition
    Rajput, Neelima
    Verma, S. K.
    2014 3RD INTERNATIONAL CONFERENCE ON RELIABILITY, INFOCOM TECHNOLOGIES AND OPTIMIZATION (ICRITO) (TRENDS AND FUTURE DIRECTIONS), 2014,
  • [6] Continuous Speech Recognition of Kannada Language using Triphone Modeling
    Sajjan, Sharada C.
    Vijaya, C.
    PROCEEDINGS OF THE 2016 IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2016, : 451 - 455
  • [7] Kannada Character Recognition System Using Neural Network
    Kumar, Suresh D. S.
    Kalyan, K. Srinivasa
    Kumar, Ajay B. R.
    INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2012), 2013, 8768
  • [8] Audiovisual speech recognition based on a deep convolutional neural network
    Rudregowda S.
    Patilkulkarni S.
    Ravi V.
    Gururaj H.L.
    Krichen M.
    Data Science and Management, 2024, 7 (01): : 25 - 34
  • [9] Emotion recognition using multilayer perceptron and generalized feed forward neural network
    Khanchandani, K. B.
    Hussain, Moiz A.
    JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH, 2009, 68 (05): : 367 - 371
  • [10] A Novel Recognition of Indian Bank Cheques Using Feed Forward Neural Network
    Raghavendra, S. P.
    Danti, Ajit
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, CIDM, VOL 2, 2016, 411 : 71 - 79