Native Language Identification using Probabilistic Graphical Models

Cited by: 0
Authors
Nicolai, Garrett [1]
Islam, Md Asadul [1]
Greiner, Russ [1]
Affiliation
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
Source
2013 INTERNATIONAL CONFERENCE ON ELECTRICAL INFORMATION AND COMMUNICATION TECHNOLOGY (EICT) | 2013
Keywords
NLI; Machine Learning; SVM; Bayesian Methods; TAN;
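DOI
Not available
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code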
0808 ; 0809 ;
Abstract
Native Language Identification (NLI) is the task of identifying the native language of the author of a text written in a second language. Support Vector Machines (SVMs) and Maximum Entropy learners are the most common methods used to solve this problem, but we consider it from the point of view of probabilistic graphical models. We hypothesize that graphical models are well suited to this task, as they can capture feature interdependencies that SVMs cannot exploit. Using progressively more connected graphical models, we show that these models outperform SVMs on reduced feature sets. Furthermore, on full feature sets, even naive Bayes improves on SVMs, raising accuracy from 82.06% to 83.41% on a 5-language classification task.
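As an illustration of the simplest graphical model named in the abstract, the following is a minimal sketch of a multinomial naive Bayes classifier with add-alpha smoothing. It is not the authors' implementation: the function names, the toy token lists, and the labels `L1_A` and `L1_B` are hypothetical and chosen only for this example. Naive Bayes treats features as conditionally independent given the class; more connected models such as TAN relax that assumption by letting each feature depend on one other feature as well.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Count documents per class and token occurrences per class."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        feature_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, feature_counts, vocab, alpha


def predict(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    class_counts, feature_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(n_docs / total_docs)
        # Smoothed denominator shared by every token likelihood.
        denom = sum(feature_counts[label].values()) + alpha * len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are skipped
                score += math.log((feature_counts[label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: token lists standing in for learner texts.
docs = [
    ["i", "am", "agree"],
    ["i", "am", "agree", "with"],
    ["he", "have", "informations"],
    ["many", "informations", "have"],
]
labels = ["L1_A", "L1_A", "L1_B", "L1_B"]

model = train_naive_bayes(docs, labels)
print(predict(model, ["am", "agree"]))
print(predict(model, ["have", "informations"]))
```

On this toy data the two queries are assigned to `L1_A` and `L1_B` respectively, because each query's tokens occur only in the training texts of that class.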
Pages: 6