Combining bidirectional long short-term memory and self-attention mechanism for code search

Cited by: 1
Authors
Cao, Ben [1 ,2 ]
Liu, Jianxun [1 ,2 ,3 ]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Xiangtan, Peoples R China
[2] Hunan Univ Sci & Technol, Key Lab Serv Comp & Novel Software Technol, Xiangtan, Peoples R China
[3] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Xiangtan 411201, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
BiLSTM; code search; deep learning; self-attention mechanisms; semantic similarity; INFORMATION; NETWORKS;
DOI
10.1002/cpe.7662
Chinese Library Classification (CLC)
TP31 [Computer Software];
Subject classification codes
081202; 0835
Abstract
With the wide application of deep learning to code search, and in particular the introduction of attention-based code search models, search accuracy has improved greatly. However, the attention mechanism only captures the attention weights between pairs of words in a code fragment; it does not consider the contextual semantic relationships among the words in the fragment, which can help improve the accuracy of code search. To address this problem, this paper proposes a model that combines bidirectional long short-term memory and a self-attention mechanism for code search (CBLSAM-CS). The model first captures the contextual semantic relationship of each word in the code fragment with a bidirectional long short-term memory network, and then uses the self-attention mechanism to extract deep-level features of the sequence. To verify the effectiveness of the proposed model, we conducted an experimental comparison with three baseline models, CODEnn, CARLCS-CNN, and SAN-CS, on a public dataset containing 18 million code fragments. The experimental results show that the proposed model achieves 92.24% and 93.55% on the mean reciprocal rank and normalized discounted cumulative gain metrics, respectively, outperforming the baseline models. This shows that the proposed CBLSAM-CS model can effectively improve the accuracy and efficiency of code search.
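
The abstract describes the architecture only at a high level. The following minimal PyTorch sketch illustrates one plausible reading of the described pipeline (a BiLSTM over the token sequence, followed by self-attention over the BiLSTM states); the class name BiLSTMSelfAttentionEncoder, the layer sizes, the max-pooling aggregation, and the toy usage are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the authors' code) of a BiLSTM + self-attention encoder
# as described in the abstract: a BiLSTM captures the contextual relationship
# of each token, then multi-head self-attention extracts deeper sequence features.
# All names, sizes, and the pooling choice below are assumptions for illustration.
import torch
import torch.nn as nn


class BiLSTMSelfAttentionEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_heads=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bidirectional LSTM: output dimension is 2 * hidden_dim.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Self-attention over the BiLSTM outputs.
        self.self_attn = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                               batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer IDs; 0 denotes padding.
        pad_mask = token_ids.eq(0)                    # True where padded
        x = self.embedding(token_ids)                 # (batch, seq_len, embed_dim)
        h, _ = self.bilstm(x)                         # (batch, seq_len, 2*hidden_dim)
        attn_out, _ = self.self_attn(h, h, h, key_padding_mask=pad_mask)
        # Max-pool over time to obtain one vector per sequence
        # (an assumed aggregation; the paper may pool differently).
        attn_out = attn_out.masked_fill(pad_mask.unsqueeze(-1), float('-inf'))
        return attn_out.max(dim=1).values             # (batch, 2*hidden_dim)


if __name__ == "__main__":
    enc = BiLSTMSelfAttentionEncoder(vocab_size=10000)
    ids = torch.randint(1, 10000, (2, 20))
    print(enc(ids).shape)  # torch.Size([2, 256])

In a code-search setting, one such encoder per modality (code fragment and natural-language query) would typically be trained so that a similarity measure such as cosine similarity ranks matching code fragments above non-matching ones.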
Pages: 19
References (51 in total)
[1] Agarap AF. 2018, preprint.
[2] Brandt J. CHI 2009: Proceedings of the 27th Annual CHI Conference on Human Factors in Computing Systems, Vols 1-4, 2009: 1589.
[3] Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S. End-to-End Object Detection with Transformers. Computer Vision - ECCV 2020, Pt I, 2020, 12346: 213-229.
[4] Chan WK. P ACM SIGSOFT 20 INT, 2012: 1. DOI 10.1145/2393596.2393606.
[5] Chen M-Y, Chiang H-S, Sangaiah AK, Hsieh T-C. Recurrent neural network with attention mechanism for language model. Neural Computing & Applications, 2020, 32(12): 7915-7923.
[6] Church KW. Emerging Trends: Word2Vec. Natural Language Engineering, 2017, 23(1): 155-162.
[7] Fang S, Tan Y-S, Zhang T, Liu Y. Self-Attention Networks for Code Search. Information and Software Technology, 2021, 134.
[8] Graves A. Lecture Notes in Computer Science, 2005, 3697: 799.
[9] Gu J. 2021, preprint.
[10] Gu X, Zhang H, Kim S. Deep Code Search. Proceedings of the 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), 2018: 933-944.