Online author name disambiguation in evolving digital library

Cited by: 2
Authors
Pooja, K. M. [1]
Mondal, Samrat [1]
Chandra, Joydeep [1]
Affiliations
[1] Indian Inst Technol Patna, Dept Comp Sci & Engn, Patna, India
Keywords
Author name disambiguation; Dynamic graph embedding; Digital library; Academic social network;
DOI
10.1016/j.neucom.2021.07.104
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Name ambiguity is a prevalent problem in the digital library domain, where mapping bibliographic records to authors is a major issue. The unprecedented growth of bibliographic records and the absence of unique identifiers further exacerbate the problem. Specifically, name ambiguity affects various bibliometric analysis tasks, including record management and the scientific assessment of authors, thereby necessitating name disambiguation. The name disambiguation task is to assign records, possibly with ambiguous authorship, to their corresponding authors. Existing techniques are good at extracting abstract features from a set of records with a common author name, which can subsequently be used to cluster the records by unique author identity; however, such techniques usually perform poorly in disambiguating isolated individual records that arrive continuously. Disambiguating only the newly arrived records, rather than all records in the digital library, is challenging but computationally rewarding; it is therefore not only preferable but is becoming a necessity, given the tremendous growth in the number of bibliographic records over time, which is likely to continue. In this regard, we propose an online author name disambiguation approach for an evolving digital library. Our approach involves online representation learning of records in an evolving digital library (academic network) using dynamic graph embedding, followed by clustering of the latent representations of the records. We show the use of our online name disambiguation method in a batch setting (for the static or initial records of a digital library) and in an incremental setting (for new records of a digital library). Significant improvement over existing state-of-the-art methods has been observed in terms of various evaluation metrics, which indicates the effectiveness of the proposed approach. (c) 2022 Published by Elsevier B.V.
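The abstract distinguishes a batch setting from an incremental setting without giving algorithmic detail. As a rough illustration only, the sketch below shows one generic way an incremental assignment step could work once each record has a latent vector: a new record joins the most similar existing cluster if the cosine similarity to its centroid clears a threshold, and otherwise starts a new cluster (a new author identity). This is not the authors' method. The class name `IncrementalClusterer`, the threshold value, and the toy two-dimensional vectors are all assumptions made for the example, and the dynamic graph embedding step that would produce the vectors is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class IncrementalClusterer:
    """Hypothetical threshold-based online clustering of record embeddings."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed similarity cut-off, not from the paper
        self.sums = []              # per-cluster sum of member vectors
        self.counts = []            # per-cluster number of members

    def assign(self, vec):
        """Return the cluster id for vec, creating a new cluster if none is close enough."""
        best, best_sim = None, self.threshold
        for cid, (total, n) in enumerate(zip(self.sums, self.counts)):
            centroid = [x / n for x in total]
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = cid, sim
        if best is None:
            # No existing cluster is similar enough: treat as a new author identity.
            self.sums.append(list(vec))
            self.counts.append(1)
            return len(self.sums) - 1
        # Update the running sum so the centroid reflects the new member.
        self.sums[best] = [a + b for a, b in zip(self.sums[best], vec)]
        self.counts[best] += 1
        return best


# Batch setting: cluster the initial records one after another.
clusterer = IncrementalClusterer(threshold=0.8)
initial_records = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
initial_labels = [clusterer.assign(v) for v in initial_records]

# Incremental setting: only the newly arrived record is processed.
new_label = clusterer.assign([0.1, 0.9])
```

In this sketch the batch and incremental settings share the same `assign` call, so a newly arrived record costs one similarity comparison per existing cluster rather than a re-clustering of every record.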
Pages: 1-14
Page count: 14