Citation recommendation using semantic representation of cited papers' relations and content

Times cited: 24
Authors
Zhang, Jinzhu [1]
Zhu, Lipeng [1, 2]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Econ & Management, Dept Informat Management, Nanjing, Peoples R China
[2] Jinhu Cty Peoples Hosp, Dept Informat, Huaian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Citation recommendation; Cited paper; Co-citation; Citation content; Semantic representation; Link prediction
DOI
10.1016/j.eswa.2021.115826
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Citation recommendation helps researchers quickly find supplementary or alternative references among massive academic resources. Current research on citation recommendation focuses mainly on citing papers, so the large body of cited papers is ignored, including the relations among cited papers and the citation contexts in which citing papers mention them. Moreover, a cited paper's content is usually represented by its original title and abstract, which are hard to acquire and rarely reflect different citation motivations. Furthermore, the most appropriate method for semantically representing cited papers' relations and content remains uncertain. Therefore, this paper studies citation recommendation from the perspective of semantic representation of cited papers' relations and content. First, four forms of citation context are designed and extracted as cited papers' content, taking citation motivations into account, and co-citation relationships are extracted as cited papers' relations. Second, 132 methods are designed for generating the semantic vector of a cited paper: four network embedding methods, 16 methods that combine four text representation algorithms with four forms of citation content, and 112 fusion methods. Finally, similarity among cited papers is calculated for citation recommendation, and a quantitative evaluation method based on link prediction is designed to find the most appropriate form of citation content and the optimal method. The results show that doc2vecC (Document to Vector through Corruption) with the CS&SS (Current Sentences and Surrounding Sentences) form performs best: its AUC (Area Under Curve) and MAP (Macro Average Precision) reach 0.877 and 0.889, which are 0.462 and 0.370 higher than those of the worst-performing method. Parameter adjustment slightly improves this performance, and a case study further supports the effectiveness of the method.
In addition, among the four forms of cited papers' content, CS&SS performs best in almost all methods. Furthermore, fusion methods do not always outperform single methods: doc2vecC (CS&SS) performs better than the best fusion method, GCN (Graph Convolutional Network). These results not only demonstrate the effectiveness of citation recommendation from the perspective of the cited paper, but also provide useful suggestions for selecting methods and citation content. The data and conclusions can be extended to other text mining-related tasks. At the same time, this is preliminary research that needs further study in other domains using emerging semantic representation methods.
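The pipeline summarized in the abstract (co-citation relations, similarity between cited-paper vectors, and a link-prediction style AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional vectors, and the pairwise-comparison form of AUC are all assumptions made for illustration, and the vectors stand in for whatever a method such as doc2vecC would actually produce.

```python
import math
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of cited papers shares a reference list."""
    counts = Counter()
    for refs in reference_lists:
        # Sort so that each unordered pair has a single canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, vectors, top_k=3):
    """Rank the other cited papers by similarity to the target paper."""
    scores = [(other, cosine(vectors[target], vec))
              for other, vec in vectors.items() if other != target]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_k]


def link_prediction_auc(vectors, positive_pairs, negative_pairs):
    """Share of (positive, negative) pair comparisons in which the co-cited
    pair scores higher than the non-co-cited pair; ties count as 0.5."""
    wins = 0.0
    for pa, pb in positive_pairs:
        pos_score = cosine(vectors[pa], vectors[pb])
        for na, nb in negative_pairs:
            neg_score = cosine(vectors[na], vectors[nb])
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positive_pairs) * len(negative_pairs))


# Hypothetical toy data: paper IDs and vectors are invented for illustration.
reference_lists = [["A", "B"], ["A", "B", "C"], ["C", "D"]]
vectors = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [0.1, 0.9]}

print(cocitation_counts(reference_lists))
print(recommend("A", vectors, top_k=2))
print(link_prediction_auc(vectors, [("A", "B"), ("C", "D")], [("A", "C"), ("B", "D")]))
```

In this sketch the held-out co-cited pairs play the role of positive links and unconnected pairs play the role of negative links, so a higher AUC means the vectors place co-cited papers closer together. How the paper samples its test links is not stated in the abstract.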
Pages: 13