FMCS: Improving Code Search by Multi-Modal Representation Fusion and Momentum Contrastive Learning

Cited by: 0
Authors
Liu, Wenjie [1]
Chen, Gong [1]
Xie, Xiaoyuan [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Hubei, Peoples R China
Source
2024 IEEE 24TH INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS | 2024
Keywords
code search; contrastive learning; multi-modal models; data augmentation;
DOI
10.1109/QRS62785.2024.00068
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Code search is a critical task in software engineering: given a natural language query, it retrieves relevant code from a codebase. Although existing code search methods based on multi-modal contrastive learning achieve strong performance, they remain limited in how they learn representations of multi-modal data, and they do not sufficiently explore the role of functionally equivalent code pairs in representation learning. To address these limitations, we propose FMCS, a code search framework based on multi-modal representation fusion and momentum contrastive learning. Multi-modal representation fusion retains both the semantic and the structural information of the code. Momentum contrastive learning between samples further captures the correlation between relevant samples. Experimental results on the CodeSearchNet benchmark show the effectiveness of FMCS.
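The abstract does not describe the training procedure in detail. As a rough illustration of the general momentum contrastive learning idea (in the style of MoCo), and not the authors' implementation, the Python sketch below shows two generic ingredients: an exponential-moving-average update of a key encoder's parameters, and an InfoNCE loss in which one positive key competes against a queue of negatives. The function names, the toy two-dimensional embeddings, the momentum value 0.999, and the temperature 0.07 are all assumptions chosen for illustration.

```python
import math


def momentum_update(query_params, key_params, m=0.999):
    """EMA update of the key encoder: k <- m * k + (1 - m) * q.

    Parameters are flat lists of floats here (a hypothetical simplification).
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive_key, queue, temperature=0.07):
    """InfoNCE loss for a single query embedding.

    The positive key sits at index 0 of the logits; every embedding in
    `queue` acts as a negative.
    """
    q = normalize(query)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(neg)) / temperature for neg in queue]
    # Numerically stable log-sum-exp.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


# Toy usage: a query embedding, a matching code embedding, and two negatives.
query_emb = [1.0, 0.0]
code_emb = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss = info_nce(query_emb, code_emb, negatives)
```

In this sketch the loss is small when the query is close to its positive key and far from the queued negatives, and it grows when the positive is no closer than the negatives. How FMCS constructs its positives (for example, from functionally equivalent code pairs) and fuses modalities is not specified in this record.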
Pages: 632-638
Page count: 7