Source Code Classification Using Neural Networks

Cited by: 0
Authors
Gilda, Shlok [1]
Affiliation
[1] Pune Inst Comp Technol, Dept Comp Engn, Pune, Maharashtra, India
Source
PROCEEDINGS OF 2017 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE) | 2017
Keywords
Artificial neural network; Multi-layer neural network; Supervised learning; Feature extraction;
DOI
Not available
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Programming languages are the primary tools of the software development industry. As of today, the programming language of the vast majority of the published source code is manually specified or programmatically assigned based solely on the respective file extension. This work shows that the identification of the programming language can be done automatically by utilizing an artificial neural network based on supervised learning and intelligent feature extraction from the source code files. We employ a multi-layer neural network - word embedding layers along with a Convolutional Neural Network - to achieve this goal. Our criteria for an automatic source code identification solution include high accuracy, fast performance, and large programming language coverage. The model achieves a 97% accuracy rate while classifying 60 programming languages.
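The abstract names the architecture (word embedding layers feeding a Convolutional Neural Network) without giving details, so the following is only a minimal sketch of that general pattern: an untrained forward pass in pure Python. The tokenizer, the hashed vocabulary, the layer sizes, the three-language label set, and all names (`tokenize`, `token_ids`, `forward`) are assumptions made for illustration, not the paper's actual configuration.

```python
import math
import random
import re

# Hypothetical label subset; the paper reports 60 languages.
LANGUAGES = ["Python", "C", "Java"]
# Assumed sizes, chosen only to keep the sketch small.
VOCAB_SIZE, EMBED_DIM, NUM_FILTERS, KERNEL = 64, 8, 4, 3

rng = random.Random(0)  # fixed seed: weights are random, not trained


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


embedding = rand_matrix(VOCAB_SIZE, EMBED_DIM)             # token id -> vector
filters = [rand_matrix(KERNEL, EMBED_DIM) for _ in range(NUM_FILTERS)]
dense = rand_matrix(len(LANGUAGES), NUM_FILTERS)           # pooled features -> logits


def tokenize(source):
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def token_ids(tokens):
    """Map tokens to ids with a deterministic hash (stand-in for a learned vocabulary)."""
    return [sum(ord(c) for c in tok) % VOCAB_SIZE for tok in tokens]


def forward(source):
    """Embedding lookup -> 1D convolution + ReLU -> global max pooling -> softmax."""
    ids = token_ids(tokenize(source))
    ids += [0] * max(0, KERNEL - len(ids))  # pad very short inputs with id 0
    vectors = [embedding[i] for i in ids]

    pooled = []
    for filt in filters:
        best = 0.0  # starting at 0 makes the max equal to max-pooled ReLU output
        for start in range(len(vectors) - KERNEL + 1):
            activation = sum(
                filt[k][d] * vectors[start + k][d]
                for k in range(KERNEL)
                for d in range(EMBED_DIM)
            )
            best = max(best, activation)
        pooled.append(best)

    logits = [sum(w * p for w, p in zip(row, pooled)) for row in dense]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


probs = forward("def add(a, b):\n    return a + b\n")
```

Because the weights are random, `probs` is only a well-formed probability distribution over `LANGUAGES`, not a meaningful prediction. Reaching the accuracy reported in the abstract would require supervised training on labeled source files, which this sketch omits.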
Pages: 6
Related papers
21 in total
[1]  
[Anonymous], GITHUB DAT BIGQUERY
[2]  
[Anonymous], CORR
[3]  
[Anonymous], 2013, P 2013 C N AM CHAPTE
[4]  
[Anonymous], ARXIV E PRINTS
[5]  
Basu A., 2003, 36 ANN HAW INT C SYS, DOI 10.1109/HICSS.2003.1174243
[6]   A neural probabilistic language model [J].
Bengio, Y ;
Ducharme, R ;
Vincent, P ;
Jauvin, C .
JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (06) :1137-1155
[7]   Knowledge based neural network for text classification [J].
Goyal, Ram Dayal .
GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007, :542-547
[8]  
Khasnabish JN, 2014, LECT NOTES ARTIF INT, V8556, P513, DOI 10.1007/978-3-319-08979-9_39
[9]  
Kim Y., 2014, ARXIV 14085882, DOI 10.3115/V1/D14-1181
[10]  
Lai SW, 2015, AAAI CONF ARTIF INTE, P2267