Spoken Language Identification with Deep Convolutional Neural Network and Data Augmentation

Cited by: 0
Authors
Korkut, Can [1]
Haznedaroglu, Ali [1]
Arslan, Levent M. [1, 2]
Affiliations
[1] Sestek, Istanbul, Turkey
[2] Bogazici Univ, Elekt Elekt Muhendisligi Bolumu, Istanbul, Turkey
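Keywords
Spoken Language Identification; CNN; Data Augmentation; Speech
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract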
This paper presents a spoken language detection system based on deep convolutional neural networks. The model is trained and tested on a speech dataset containing five languages. Speech signals are first converted into mel-spectrogram features, which are fed into the deep convolutional network. The flattened outputs of the convolutional network are then passed to a recurrent layer, and a dense layer with a softmax activation function serves as the output layer, predicting the probability of each language. This network achieves an F1-score of 0.89 on our test data. Applying a data augmentation method, SpecAugment, increased the F1-score to 0.94.
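The abstract names SpecAugment but gives no configuration, so the following is only a minimal sketch of the masking part of that technique, not the authors' implementation. It zeroes out one random band of mel bins and one random run of frames in a mel-spectrogram. The function name `spec_augment`, the mask widths, the number of masks, and the fill value are all illustrative assumptions, and the time-warping step of the original SpecAugment method is left out.

```python
import random


def spec_augment(spec, max_freq_width=8, max_time_width=20,
                 n_freq_masks=1, n_time_masks=1, mask_value=0.0, rng=None):
    """Return a masked copy of a mel-spectrogram.

    spec is a list of rows, one per mel bin, each a list of frame values.
    All default parameter values are assumptions chosen for illustration.
    """
    rng = rng or random.Random()
    n_mels = len(spec)
    n_frames = len(spec[0])
    out = [row[:] for row in spec]  # copy so the input stays untouched

    # Frequency masking: blank a contiguous band of mel bins.
    for _ in range(n_freq_masks):
        width = rng.randint(0, min(max_freq_width, n_mels))
        start = rng.randint(0, n_mels - width)
        for f in range(start, start + width):
            out[f] = [mask_value] * n_frames

    # Time masking: blank a contiguous run of frames in every mel bin.
    for _ in range(n_time_masks):
        width = rng.randint(0, min(max_time_width, n_frames))
        start = rng.randint(0, n_frames - width)
        for row in out:
            row[start:start + width] = [mask_value] * width

    return out


if __name__ == "__main__":
    # Hypothetical 40-bin, 100-frame spectrogram filled with ones.
    dummy = [[1.0] * 100 for _ in range(40)]
    augmented = spec_augment(dummy, rng=random.Random(0))
    masked = sum(v == 0.0 for row in augmented for v in row)
    print(f"masked cells: {masked} of {40 * 100}")
```

In a training loop, a function like this would typically be applied to each training spectrogram on the fly, with a fresh random mask every epoch, while validation and test data are left unmasked.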
Pages: 4