WAGRank: A word ranking model based on word attention graph for keyphrase extraction

Cited by: 0
Authors
Bian, Rong [1 ,2 ]
Cheng, Bing [1 ,3 ]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Ctr Forecasting Sci, Acad Math & Syst Sci, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
关键词
Keyphrase extraction; attention mechanism; graph-based model; pre-trained language model; semantic feature;
DOI
10.1177/1088467X241296257
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Keyphrase extraction is the essential task of identifying representative words or phrases in document processing. Most traditional models rely on the frequency of each word in a document and its associated corpus. The word frequency approach has two major limitations: first, it fails to fully exploit the semantic information in the document, that is, it is a bag-of-words method; second, when the linked corpus is unavailable or incomplete, it is easily skewed by local word frequencies in a short text. This paper proposes WAGRank, a novel unsupervised ranking model over a word attention graph, where nodes are words and edges are semantic relations between words. To assign edge weights, two interpretable statistical methods for assessing the correlation strength between words are designed using the attention mechanism. WAGRank thus depends on word semantics rather than on frequency alone in the current text, drawing on external knowledge stored in a pre-trained language model. WAGRank was evaluated on two publicly available datasets against twelve baselines, demonstrating its effectiveness and robustness. In addition, a Granger causality test showed that word attention has a statistically significant predictive effect on word frequency, providing a more reasonable explanation for word frequency analysis.
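The ranking step the abstract describes can be sketched as a PageRank-style power iteration over a weighted word graph. This is a minimal illustration, not the paper's method: the paper's two attention-based edge-weighting schemes are not reproduced here, and the `edge_weights` values below are hypothetical stand-ins for attention-derived correlation strengths.

```python
# Hedged sketch: rank candidate words by power iteration over a weighted
# word graph. Edge weights stand in for attention-derived correlations;
# the actual WAGRank weighting schemes are defined in the paper.

def wagrank_sketch(words, edge_weights, damping=0.85, iters=50):
    """words: list of graph nodes; edge_weights: {(w1, w2): weight}."""
    n = len(words)
    idx = {w: i for i, w in enumerate(words)}
    # Build a symmetric adjacency matrix from the weighted edges.
    adj = [[0.0] * n for _ in range(n)]
    for (a, b), w in edge_weights.items():
        adj[idx[a]][idx[b]] = w
        adj[idx[b]][idx[a]] = w
    out = [sum(row) or 1.0 for row in adj]  # total outgoing weight per node
    score = [1.0 / n] * n
    for _ in range(iters):
        # Each node redistributes its score along edges, proportional to weight.
        score = [
            (1 - damping) / n
            + damping * sum(adj[i][j] / out[i] * score[i] for i in range(n))
            for j in range(n)
        ]
    return sorted(zip(words, score), key=lambda t: -t[1])

# Toy example with hypothetical attention weights: semantically linked
# content words get strong edges, a stopword gets a weak one.
words = ["graph", "keyphrase", "model", "the"]
weights = {("graph", "keyphrase"): 0.9, ("graph", "model"): 0.7,
           ("keyphrase", "model"): 0.8, ("the", "model"): 0.1}
ranking = wagrank_sketch(words, weights)  # "the" ranks last
```

With symmetric attention-derived weights, strongly connected content words accumulate score while weakly linked words sink, which is the intuition behind replacing frequency counts with semantic edge strength.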
Pages: 23
Related Papers
50 results
  • [41] Relation Extraction in Vietnamese Text via Piecewise Convolution Neural Network with Word-Level Attention
    Van-Nhat Nguyen
    Ha-Thanh Nguyen
    Dinh-Hieu Vo
    Le-Minh Nguyen
    PROCEEDINGS OF 2018 5TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS 2018), 2018, : 99 - 103
  • [42] A Natural Language Process-Based Framework for Automatic Association Word Extraction
    Hu, Zheng
    Luo, Jiao
    Zhang, Chunhong
    Li, Wei
    IEEE ACCESS, 2020, 8 : 1986 - 1997
  • [43] A supervised keyphrase extraction method based on the logistic regression model for social question answering sites
    Lin, Ge
    Xiang, Yi
    Wang, Zhong
    Wang, Ruomei
    Journal of Information and Computational Science, 2014, 11 (10): 3571 - 3583
  • [44] Attention-Based CNN and Bi-LSTM Model Based on TF-IDF and GloVe Word Embedding for Sentiment Analysis
    Kamyab, Marjan
    Liu, Guohua
    Adjeisah, Michael
    APPLIED SCIENCES-BASEL, 2021, 11 (23)
  • [45] aDFR: An Attention-Based Deep Learning Model for Flight Ranking
    Yi, Yuan
    Cao, Jian
    Tan, YuDong
    Nie, QiangQiang
    Lu, XiaoXi
    WEB INFORMATION SYSTEMS ENGINEERING, WISE 2020, PT II, 2020, 12343 : 548 - 562
  • [46] Distant-supervised Relation Extraction with Hierarchical Attention Based on Knowledge Graph
    Yao, Hong
    Dong, Lijun
    Zhen, Shiqi
    Kang, Xiaojun
    Li, Xinchuan
    Liang, Qingzhong
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 229 - 236
  • [47] Entity Linking Model Based on Cascading Attention and Dynamic Graph
    Li, Hongchan
    Li, Chunlei
    Sun, Zhongchuan
    Zhu, Haodong
    ELECTRONICS, 2024, 13 (19)
  • [48] Deep learning-based extractive text summarization with word-level attention mechanism
    Gambhir, Mahak
    Gupta, Vishal
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (15) : 20829 - 20852
  • [49] Chinese Text Sentiment Analysis Based on Dual Channel Attention Network with Hybrid Word Embedding
    Zhou N.
    Zhong N.
    Jin G.
    Liu B.
    Data Analysis and Knowledge Discovery, 2023, 7 (03) : 58 - 68