Understanding the Impact of Early Citers on Long-Term Scientific Impact

Cited by: 0
Authors
Singh, Mayank [1]
Jaiswal, Ajay [1]
Shree, Priya [1]
Pal, Arindam [2]
Mukherjee, Animesh [1]
Goyal, Pawan [1]
Affiliations
[1] IIT Kharagpur, Dept Comp Sci & Engn, Kharagpur, W Bengal, India
[2] TCS Innovat Labs, Chennai, Tamil Nadu, India
Source
2017 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2017) | 2017
Keywords
Long-term scientific impact; citation count; early citers; supervised regression models; CITATION IMPACT; PUBLICATION; COUNTS;
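DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract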
This paper explores a new dimension of the challenging problem of predicting long-term scientific impact (LTSI), usually measured by the number of citations a paper accumulates over the long term. It is well known that early citations (within 1-2 years after publication) acquired by a paper positively affect its LTSI. However, no prior work investigates whether the set of authors who bring in these early citations also affects its LTSI. In this paper, we demonstrate for the first time the impact of these authors, whom we call early citers (EC), on the LTSI of a paper. Note that this study of the complex dynamics of EC introduces a brand new paradigm in citation behavior analysis. Using a massive computer science bibliographic dataset, we identify two distinct categories of EC: authors with a high overall publication/citation count in the dataset, whom we call influential, and the remaining authors, whom we call non-influential. We investigate three characteristic properties of EC and present an extensive analysis of how each category correlates with LTSI in terms of these properties. In contrast to popular perception, we find that influential EC negatively affect LTSI, possibly owing to attention stealing. To motivate this, we present several representative examples from the dataset. A closer inspection of the collaboration network reveals that this stealing effect is more pronounced if an EC is nearer to the authors of the paper being investigated. As an intuitive use case, we show that incorporating EC properties in state-of-the-art supervised citation prediction models leads to high performance margins. Finally, we present an online portal to visualize EC statistics along with the prediction results for a given query paper. We make all the code and the processed dataset publicly available at our portal: http://www.cnergres.iitkgp.ac.in/earlyciters/
Pages: 59-68
Page count: 10
Related papers
33 in total
[1]   Early citation counts correlate with accumulated impact [J].
Adams, J .
SCIENTOMETRICS, 2005, 63 (03) :567-581
[2]   The Eigenfactor™ Metrics [J].
Bergstrom, Carl T. ;
West, Jevin D. ;
Wiseman, Marc A. .
JOURNAL OF NEUROSCIENCE, 2008, 28 (45) :11433-11434
[3]   Which percentile-based approach should be preferred for calculating normalized citation impact values? An empirical comparison of five approaches including a newly developed citation-rank approach (P100) [J].
Bornmann, Lutz ;
Leydesdorff, Loet ;
Wang, Jian .
JOURNAL OF INFORMETRICS, 2013, 7 (04) :933-944
[4]  
Breiman F, 1984, OLSHEN STONE CLASSIF
[5]   Earlier web usage statistics as predictors of later citation impact [J].
Brody, Tim ;
Harnad, Stevan ;
Carr, Leslie .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2006, 57 (08) :1060-1072
[6]   Journal prestige, publication bias, and other characteristics associated with citation of published studies in peer-reviewed journals [J].
Callaham, M ;
Wears, RL ;
Weber, E .
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2002, 287 (21) :2847-2850
[7]  
Cameron AC, 1997, J ECONOMETRICS, V77, P329
[8]  
Chakraborty T, 2014, ACM-IEEE J CONF DIG, P351, DOI 10.1109/JCDL.2014.6970190
[9]   Which factors help authors produce the highest impact research? Collaboration, journal and document properties [J].
Didegah, Fereshteh ;
Thelwall, Mike .
JOURNAL OF INFORMETRICS, 2013, 7 (04) :861-873
[10]  
Drucker H, 1997, ADV NEUR IN, V9, P155