WAGRank: A word ranking model based on word attention graph for keyphrase extraction

Cited by: 0
Authors
Bian, Rong [1, 2]
Cheng, Bing [1, 3]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Ctr Forecasting Sci, Acad Math & Syst Sci, Beijing, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Keyphrase extraction; attention mechanism; graph-based model; pre-trained language model; semantic feature
DOI
10.1177/1088467X241296257
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrase extraction, the task of identifying representative words or phrases, is essential in document processing. Most traditional models rely on the frequency of each word in a document and its associated corpus. The word frequency method has two major limitations: first, as a bag-of-words method, it fails to fully exploit the semantic information in the document; second, when the associated corpus is unavailable or incomplete, it tends to be influenced by local word frequency in the short current text. This paper proposes WAGRank, a novel unsupervised ranking model built on a word attention graph, in which nodes are words and edges are semantic relations between words. To assign edge weights, two interpretable statistical methods for assessing the correlation strength between words are designed using the attention mechanism. Rather than depending only on word frequency in the current text, WAGRank relies on word semantics, using external knowledge stored in a pre-trained language model. WAGRank was evaluated on two publicly available datasets against twelve baselines, demonstrating its effectiveness and robustness. In addition, a Granger causality test showed that word attention has a statistically significant predictive effect on word frequency, providing a more reasonable explanation for word frequency analysis.
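The abstract does not spell out the two edge-weighting methods, so the following is only a minimal sketch of the general idea of ranking words on a weighted word graph. It assumes a hand-made, hypothetical attention matrix (in practice such weights might be aggregated from a pre-trained language model) and applies standard weighted PageRank by power iteration. The function name `rank_words`, the toy words, and the toy weights are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List


def rank_words(words: List[str],
               attention: List[List[float]],
               damping: float = 0.85,
               iterations: int = 100,
               tol: float = 1e-9) -> Dict[str, float]:
    """Weighted PageRank over a word graph with attention-like edge weights.

    attention[i][j] is a hypothetical non-negative weight of the edge from
    word i to word j. This is NOT the paper's edge-weighting scheme.
    """
    n = len(words)

    # Build a row-stochastic transition matrix, ignoring self-loops.
    transition = []
    for i in range(n):
        row = [attention[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            # A node with no outgoing weight spreads its score uniformly.
            transition.append([1.0 / n] * n)

    # Power iteration from a uniform starting distribution.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return dict(zip(words, scores))


# Toy example: every word sends most of its weight to "keyphrase"
# and very little to the function word "the".
words = ["keyphrase", "extraction", "the", "graph"]
attention = [
    [0.0, 0.6, 0.1, 0.3],
    [0.7, 0.0, 0.1, 0.2],
    [0.5, 0.4, 0.0, 0.1],
    [0.6, 0.3, 0.1, 0.0],
]
scores = rank_words(words, attention)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup the ranking is driven entirely by the edge weights rather than by how often a word occurs, which is the contrast with frequency-based methods that the abstract draws.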
Pages: 23