A context-aware citation recommendation model with BERT and graph convolutional networks

Cited by: 107
Authors
Jeong, Chanwoo [1 ]
Jang, Sion [1 ]
Park, Eunjeong [2 ]
Choi, Sungchul [1 ]
Affiliations
[1] Gachon Univ, Dept Ind Engn, TEAMLAB, Seongnam Si, Gyeonggi Do, South Korea
[2] NAVER, Papago, Seongnam Si, Gyeonggi Do, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Paper citation; Citation recommendation; BERT; Deep learning; Transformer; Graph convolution network;
DOI
10.1007/s11192-020-03561-y
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
With the tremendous growth in the number of scientific papers being published, searching for references while writing a scientific paper is a time-consuming process. A technique that could add a reference citation at the appropriate place in a sentence would be beneficial. To this end, context-aware citation recommendation has been researched for around two decades. Many researchers have used the text surrounding the citation tag, called the context sentence, together with the metadata of the target paper to find the appropriate cited research. However, the lack of well-organized benchmark datasets and of a model that attains high performance has made this research difficult. In this paper, we propose a deep learning-based model and a well-organized dataset for context-aware paper citation recommendation. Our model comprises a document encoder and a context encoder. For these, we use a graph convolutional network (GCN) layer and Bidirectional Encoder Representations from Transformers (BERT), a pre-trained model of textual data. By modifying the related PeerRead dataset, we propose a new dataset called FullTextPeerRead, which contains context sentences for cited references along with paper metadata. To the best of our knowledge, this is the first well-organized dataset for context-aware paper recommendation. The results indicate that the proposed model with the proposed dataset attains state-of-the-art performance, with an improvement of more than 28% in mean average precision and recall@k.
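The abstract reports its gains in mean average precision (MAP) and recall@k. The following is a minimal sketch of how these two ranking metrics are commonly computed, not the authors' evaluation code. The function names and the toy paper IDs are hypothetical, and the sketch assumes each ranked list contains no duplicate items.

```python
from typing import Hashable, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of precision@rank taken at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(
    queries: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average of average_precision over all (ranking, relevant set) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Toy example: each query is one citation context, the ranking is the list of
# recommended papers, and the relevant set holds the papers actually cited.
toy_queries = [
    (["p3", "p1", "p7", "p2"], {"p1", "p2"}),
    (["a", "b"], {"a"}),
]
toy_recall = recall_at_k(toy_queries[0][0], toy_queries[0][1], k=2)
toy_map = mean_average_precision(toy_queries)
```

In a citation recommendation setting, a query corresponds to one context sentence and the relevant set to the reference(s) cited at that position, so a higher recall@k means the correct reference appears among the first k suggestions more often.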
Pages: 1907-1922
Number of pages: 16