Citation impact prediction for scientific papers using stepwise regression analysis

Cited by: 150
Authors
Yu, Tian [1]
Yu, Guang [1]
Li, Peng-Yu [2]
Wang, Liang [1]
Affiliations
[1] Harbin Inst Technol, Sch Management, Harbin 150006, Peoples R China
[2] Harbin Inst Technol, Sch Educ, Harbin 150006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scientific paper; Citation impact prediction; Feature space; Multiple regression; BIOMEDICAL LITERATURE; SCIENCE; COUNTS; JOURNALS; MODELS; INFORMATION; ARTICLES; GENDER;
DOI
10.1007/s11192-014-1279-6
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Researchers typically pay greater attention to scientific papers published within the last 2 years, especially papers that may have great citation impact in the future. However, the accuracy of current citation impact prediction methods is still not satisfactory. This paper argues that objective features of scientific papers can make citation impact prediction relatively accurate. The external features of a paper, features of its authors, features of the journal of publication, and features of its citations are all considered in constructing a paper's feature space. Stepwise multiple regression analysis is used to select appropriate features from this space and to build a regression model that explains the relationship between citation impact and the chosen features. The validity of this model is experimentally verified in the subject area of Information Science & Library Science. The results show that the regression model is effective within this subject area.
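The feature-selection idea in the abstract can be illustrated with a minimal sketch of forward stepwise regression. This is not the authors' model: the feature names, the synthetic data, and the `min_gain` stopping rule are all assumptions made for illustration, and the entry criterion is a simple gain in R^2 rather than the F-test commonly used in stepwise procedures.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rows, y, cols):
    """Fit ordinary least squares with an intercept on the chosen columns; return R^2."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
                 for d, t in zip(design, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(rows, y, names, min_gain=0.01):
    """Greedily add the feature that raises R^2 most; stop when the gain is below min_gain."""
    selected, best = [], 0.0
    remaining = list(range(len(names)))
    while remaining:
        score, col = max((r_squared(rows, y, selected + [c]), c) for c in remaining)
        if score - best < min_gain:
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return [names[c] for c in selected], best


# Hypothetical feature space: one journal feature, one author feature, one external feature.
names = ["journal_impact", "author_h_index", "page_count"]
rows = [
    [1, 3, 10], [2, 1, 12], [3, 4, 9], [4, 1, 11],
    [5, 5, 10], [6, 9, 12], [7, 2, 9], [8, 6, 11],
]
# Synthetic citation counts built as 3 * journal_impact + 2 * author_h_index,
# so page_count carries no information by construction.
y = [3 * r[0] + 2 * r[1] for r in rows]

selected, r2 = forward_stepwise(rows, y, names)
print(selected, round(r2, 4))
```

On this toy data the procedure keeps the two informative features and leaves out `page_count`, because adding it does not raise R^2 by at least `min_gain`. A full stepwise procedure would also test whether previously entered features should be removed at each step.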
Pages: 1233-1252
Number of pages: 20