Native Language Identification using Probabilistic Graphical Models

Cited by: 0

Authors:

Nicolai, Garrett [1]

Islam, Md Asadul [1]

Greiner, Russ [1]

Affiliations:

[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada

Source:

2013 INTERNATIONAL CONFERENCE ON ELECTRICAL INFORMATION AND COMMUNICATION TECHNOLOGY (EICT) | 2013

Keywords:

NLI; Machine Learning; SVM; Bayesian Methods; TAN;

DOI:

Not available

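Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Subject classification codes: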

0808 ; 0809 ;

Abstract:

Native Language Identification (NLI) is the task of identifying the native language of an author from a text written in a second language. Support Vector Machines (SVMs) and Maximum Entropy learners are the most common methods used to solve this problem, but we consider it from the point of view of probabilistic graphical models. We hypothesize that graphical models are well suited to this task, as they can capture feature inter-dependencies that cannot be exploited by SVMs. Using progressively more connected graphical models, we show that these models outperform SVMs on reduced feature sets. Furthermore, on full feature sets, even naive Bayes improves on SVMs, raising accuracy from 82.06% to 83.41% on a 5-language classification task.
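
As a rough illustration of the contrast the abstract draws between less and more connected models, the two factorizations below compare naive Bayes with tree-augmented naive Bayes (TAN, listed in the keywords). The notation is an assumption for illustration and is not taken from the paper: y is the native language, x_1, ..., x_n are the features, and x_{\pi(i)} is the single feature parent that TAN allows for feature x_i (the root feature of the tree has no feature parent).

```latex
\begin{align*}
\text{Naive Bayes:}\quad & P(y \mid x_1,\dots,x_n) \;\propto\; P(y)\prod_{i=1}^{n} P(x_i \mid y) \\
\text{TAN:}\quad & P(y \mid x_1,\dots,x_n) \;\propto\; P(y)\prod_{i=1}^{n} P\bigl(x_i \mid y,\, x_{\pi(i)}\bigr)
\end{align*}
```

Naive Bayes assumes the features are conditionally independent given the class, while the extra conditioning on x_{\pi(i)} is what lets TAN model pairwise feature dependencies.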


Pages: 6
