WAGRank: A word ranking model based on word attention graph for keyphrase extraction

Cited by: 0
Authors
Bian, Rong [1 ,2 ]
Cheng, Bing [1 ,3 ]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Ctr Forecasting Sci, Acad Math & Syst Sci, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
关键词
Keyphrase extraction; attention mechanism; graph-based model; pre-trained language model; semantic feature;
DOI
10.1177/1088467X241296257
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Keyphrase extraction is the essential task of identifying representative words or phrases in document processing. Most traditional models rely on the frequency of each word in a document and its associated corpus. The word frequency approach has two major limitations: first, it fails to fully exploit the semantic information in the document, that is, it is a bag-of-words method; second, when the linked corpus is unavailable or incomplete, it is easily skewed by local word frequencies in a short text. This paper proposes WAGRank, a novel unsupervised ranking model over a word attention graph, where nodes are words and edges are semantic relations between words. To assign edge weights, two interpretable statistical methods for assessing the correlation strength between words are designed using the attention mechanism. WAGRank thus depends on word semantics rather than on frequency alone in the current text, drawing on external knowledge stored in a pre-trained language model. WAGRank was evaluated on two publicly available datasets against twelve baselines, demonstrating its effectiveness and robustness. In addition, a Granger causality test showed that word attention has a statistically significant predictive effect on word frequency, providing a more reasonable explanation for word frequency analysis.
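The ranking step the abstract describes can be sketched as a PageRank-style power iteration over a weighted word graph. This is a minimal illustration, not the paper's method: the paper's two attention-based edge-weighting schemes are not reproduced here, and the `edge_weights` values below are hypothetical stand-ins for attention-derived correlation strengths.

```python
# Hedged sketch: rank candidate words by power iteration over a weighted
# word graph. Edge weights stand in for attention-derived correlations;
# the actual WAGRank weighting schemes are defined in the paper.

def wagrank_sketch(words, edge_weights, damping=0.85, iters=50):
    """words: list of graph nodes; edge_weights: {(w1, w2): weight}."""
    n = len(words)
    idx = {w: i for i, w in enumerate(words)}
    # Build a symmetric adjacency matrix from the weighted edges.
    adj = [[0.0] * n for _ in range(n)]
    for (a, b), w in edge_weights.items():
        adj[idx[a]][idx[b]] = w
        adj[idx[b]][idx[a]] = w
    out = [sum(row) or 1.0 for row in adj]  # total outgoing weight per node
    score = [1.0 / n] * n
    for _ in range(iters):
        # Each node redistributes its score along edges, proportional to weight.
        score = [
            (1 - damping) / n
            + damping * sum(adj[i][j] / out[i] * score[i] for i in range(n))
            for j in range(n)
        ]
    return sorted(zip(words, score), key=lambda t: -t[1])

# Toy example with hypothetical attention weights: semantically linked
# content words get strong edges, a stopword gets a weak one.
words = ["graph", "keyphrase", "model", "the"]
weights = {("graph", "keyphrase"): 0.9, ("graph", "model"): 0.7,
           ("keyphrase", "model"): 0.8, ("the", "model"): 0.1}
ranking = wagrank_sketch(words, weights)  # "the" ranks last
```

With symmetric attention-derived weights, strongly connected content words accumulate score while weakly linked words sink, which is the intuition behind replacing frequency counts with semantic edge strength.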
Pages: 23
Related Papers
50 results
  • [41] Relation Extraction in Vietnamese Text via Piecewise Convolution Neural Network with Word-Level Attention
    Van-Nhat Nguyen
    Ha-Thanh Nguyen
    Dinh-Hieu Vo
    Le-Minh Nguyen
    PROCEEDINGS OF 2018 5TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS 2018), 2018, : 99 - 103
  • [42] A Natural Language Process-Based Framework for Automatic Association Word Extraction
    Hu, Zheng
    Luo, Jiao
    Zhang, Chunhong
    Li, Wei
    IEEE ACCESS, 2020, 8 : 1986 - 1997
  • [43] A supervised keyphrase extraction method based on the logistic regression model for social question answering sites
    Lin, Ge
    Xiang, Yi
    Wang, Zhong
    Wang, Ruomei
    Journal of Information and Computational Science, 2014, 11 (10): 3571 - 3583
  • [44] Attention-Based CNN and Bi-LSTM Model Based on TF-IDF and GloVe Word Embedding for Sentiment Analysis
    Kamyab, Marjan
    Liu, Guohua
    Adjeisah, Michael
    APPLIED SCIENCES-BASEL, 2021, 11 (23)
  • [45] aDFR: An Attention-Based Deep Learning Model for Flight Ranking
    Yi, Yuan
    Cao, Jian
    Tan, YuDong
    Nie, QiangQiang
    Lu, XiaoXi
    WEB INFORMATION SYSTEMS ENGINEERING, WISE 2020, PT II, 2020, 12343 : 548 - 562
  • [46] Distant-supervised Relation Extraction with Hierarchical Attention Based on Knowledge Graph
    Yao, Hong
    Dong, Lijun
    Zhen, Shiqi
    Kang, Xiaojun
    Li, Xinchuan
    Liang, Qingzhong
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 229 - 236
  • [47] Entity Linking Model Based on Cascading Attention and Dynamic Graph
    Li, Hongchan
    Li, Chunlei
    Sun, Zhongchuan
    Zhu, Haodong
    ELECTRONICS, 2024, 13 (19)
  • [48] Deep learning-based extractive text summarization with word-level attention mechanism
    Gambhir, Mahak
    Gupta, Vishal
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (15) : 20829 - 20852
  • [49] Chinese Text Sentiment Analysis Based on Dual Channel Attention Network with Hybrid Word Embedding
    Zhou N.
    Zhong N.
    Jin G.
    Liu B.
    Data Analysis and Knowledge Discovery, 2023, 7 (03) : 58 - 68