Turkish dialect recognition in terms of prosodic by long short-term memory neural networks

Cited by: 7
Authors
Isik, Gultekin [1]
Artuner, Harun [1]
Affiliations
[1] Hacettepe Univ, Comp Engn Dept, TR-06800 Ankara, Turkey
Source
JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY | 2020 / Vol. 35 / No. 01
Keywords
Turkish dialect recognition; Long short-term memory neural networks; Prosody; Language model; Legendre polynomials; LANGUAGE IDENTIFICATION; EXTRACTION; FEATURES
DOI
10.17341/gazimmfd.453677
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Dialects are varieties of a language that differ from it in certain characteristics and are specific to a particular region of a country. Extracting dialect-specific characteristics and using them to recognize dialects is a popular topic in speech processing. In particular, identifying the dialect of an utterance first can improve the performance of large-scale speech recognition systems. Languages and dialects are distinguished from one another by prosodic features such as intonation, stress and rhythm. These perceptual features are obtained by measuring pitch, energy and duration, respectively, at the physical level. In recent years, with the increasing popularity of deep neural networks, Long Short-Term Memory (LSTM) neural networks have been used frequently in sequence classification and language modeling problems. LSTM neural networks are successful at modeling long-term contextual information. In this study, Turkish dialect recognition was performed with LSTM neural networks using prosodic features. Here, LSTM neural networks were used both as a sequence classifier and as a language model. The proposed methods achieved an accuracy of 78.7% on a Turkish dataset consisting of the Ankara, Alanya, Kibris and Trabzon dialects.
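As an illustration of the physical-level measurements mentioned in the abstract, the following is a minimal sketch of frame-level pitch and energy extraction. It is not the authors' pipeline: the autocorrelation pitch estimator, the frame and hop sizes, the 75-400 Hz search range, and all function names are assumptions chosen for the example, and duration features are not covered.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def log_energy(frame):
    """Log of the mean squared amplitude; a small floor avoids log(0)."""
    return math.log(sum(x * x for x in frame) / len(frame) + 1e-12)

def autocorr_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate F0 as the lag with the largest autocorrelation in [fmin, fmax].

    Assumed, simplified estimator: no voicing decision, no normalization.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag

# Synthetic 200 Hz tone standing in for a voiced speech segment (hypothetical data).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]

# 40 ms frames with a 10 ms hop; each frame yields one (pitch, log-energy) pair.
frames = frame_signal(tone, frame_len=320, hop=80)
features = [(autocorr_pitch(f, sample_rate), log_energy(f)) for f in frames]
```

A sequence of such per-frame pairs is the kind of input a sequence classifier such as an LSTM could consume; how the paper actually parameterizes the contours (for example with the Legendre polynomials listed in the keywords) is not stated in this record.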
Pages: 213-224
Number of pages: 12