Deep Learning neural nets versus traditional machine learning in gender identification of authors of RusProfiling texts

Cited by: 9
Authors
Sboev, Alexander [1, 2]
Moloshnikov, Ivan [1]
Gudovskikh, Dmitry [1]
Selivanov, Anton [1]
Rybka, Roman [1]
Litvinova, Tatiana [1, 3]
Affiliations
[1] Natl Res Ctr, Kurchatov Inst, Moscow, Russia
[2] MEPhI Natl Res Nucl Univ, Moscow, Russia
[3] Voronezh State Pedag Univ, Voronezh, Russia
Source
8TH ANNUAL INTERNATIONAL CONFERENCE ON BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, BICA 2017 (EIGHTH ANNUAL MEETING OF THE BICA SOCIETY) | 2018 / Vol. 123
Funding
Russian Science Foundation;
Keywords
gender identification; neural networks; natural language processing; data-driven modeling;
DOI
10.1016/j.procs.2018.01.065
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we compare the accuracy of gender identification on RusProfiling texts without gender deception using two types of data-driven modeling approaches: on the one hand, well-known conventional machine learning algorithms, such as Support Vector Machine and Gradient Boosting; on the other hand, a set of deep learning neural networks, such as topologies with convolutional, fully connected, and Long Short-Term Memory layers. The dependence of these models' effectiveness on feature selection and feature representation is investigated. The obtained F1-score of 88% establishes the state of the art in the gender identification task on the RusProfiling corpus.
(C) 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/). Peer-review under responsibility of the scientific committee of the 8th Annual International Conference on Biologically Inspired Cognitive Architectures.
Pages: 424-431
Page count: 8