Towards Effective Author Name Disambiguation by Hybrid Attention

Cited by: 1
Authors
Zhou, Qian [1]
Chen, Wei [1]
Zhao, Peng-Peng [1]
Liu, An [1]
Xu, Jia-Jie [1]
Qu, Jian-Feng [1]
Zhao, Lei [1]
Affiliation
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
author name disambiguation; multiple-feature information; hybrid attention; pruning strategy; structural information loss of vector space;
DOI
10.1007/s11390-023-2070-z
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Author name disambiguation (AND) is a central task in academic search, and it has received growing attention as the number of authors and academic publications increases. To tackle the AND problem, existing studies have proposed various approaches based on different types of information, such as raw document features (e.g., co-authors, titles, and keywords), the fusion feature (e.g., a hybrid publication embedding based on multiple raw document features), local structural information (e.g., a publication's neighborhood information on a graph), and global structural information (e.g., interactive information between a node and others on a graph). However, no work so far has taken all of the above information into account while also taking full advantage of the contribution of each raw document feature. To fill this gap, we propose a novel framework named EAND (Towards Effective Author Name Disambiguation by Hybrid Attention). Specifically, we design a novel feature extraction model, consisting of three hybrid attention mechanism layers, to extract key information from the global structural information and the local structural information that are generated from six similarity graphs constructed based on different similarity coefficients, raw document features, and the fusion feature. Each hybrid attention mechanism layer contains three key modules: a local structural perception module, a global structural perception module, and a feature extractor. Additionally, the mean absolute error function in the joint loss function is used to introduce the structural information loss of the vector space. Experimental results on two real-world datasets demonstrate that EAND achieves superior performance, outperforming state-of-the-art methods by at least 2.74% in micro-F1 score and at least 3.31% in macro-F1 score.
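The abstract mentions similarity graphs built from similarity coefficients over raw document features, but it does not say which coefficients are used. As a rough illustration only, the sketch below assumes the Jaccard coefficient on co-author sets and an arbitrary edge threshold; the function names, the threshold value, and the toy data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_graph(features, threshold=0.3):
    """Return {(i, j): weight} for publication pairs whose Jaccard
    similarity on one raw feature meets the threshold.

    The threshold is an assumed value for illustration, not one
    reported in the abstract.
    """
    edges = {}
    for (i, fi), (j, fj) in combinations(features.items(), 2):
        weight = jaccard(fi, fj)
        if weight >= threshold:
            edges[(i, j)] = weight
    return edges


# Toy co-author sets for three publications under one ambiguous name.
coauthors = {
    "p1": {"coauthor_a", "coauthor_b"},
    "p2": {"coauthor_a", "coauthor_b", "coauthor_c"},
    "p3": {"coauthor_d"},
}
graph = build_similarity_graph(coauthors)
```

In this toy case only `p1` and `p2` share co-authors, so the graph contains a single weighted edge between them. Repeating the construction with other features (titles, keywords) or other coefficients would give one graph per combination, which is one plausible way to arrive at several similarity graphs.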
Pages: 929-950
Number of pages: 22