WAGRank: A word ranking model based on word attention graph for keyphrase extraction

Cited by: 0
Authors
Bian, Rong [1, 2]
Cheng, Bing [1, 3]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Ctr Forecasting Sci, Acad Math & Syst Sci, Beijing, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Keyphrase extraction; attention mechanism; graph-based model; pre-trained language model; semantic feature;
DOI
10.1177/1088467X241296257
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrase extraction, the task of identifying representative words or phrases, is essential in document processing. Most traditional models rely on word frequency features computed from a document and its associated corpus. The word frequency approach has two major limitations: first, as a bag-of-words method, it fails to fully exploit the semantic information in the document; second, when the associated corpus is unavailable or incomplete, it tends to be swayed by local word frequencies in the short current text. This paper proposes WAGRank, a novel unsupervised ranking model built on a word attention graph, in which nodes are words and edges are semantic relations between words. To assign edge weights, two interpretable statistical methods for assessing the correlation strength between words are designed using the attention mechanism. Rather than depending only on frequency in the current text, WAGRank relies on word semantics, drawing on external knowledge stored in a pre-trained language model. WAGRank was evaluated on two publicly available datasets against twelve baselines, demonstrating its effectiveness and robustness. In addition, a Granger causality test showed that word attention has a statistically significant predictive effect on word frequency, providing a more reasonable explanation for word frequency analysis.
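The general idea of ranking words on a weighted word graph can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the two edge-weighting methods or the ranking update, so the sketch assumes a generic weighted PageRank, symmetrizes a toy attention matrix to obtain edge weights, and uses invented values. The names `rank_words`, `words`, and `attention` are hypothetical.

```python
def rank_words(words, attention, damping=0.85, iterations=100, tol=1e-9):
    """Rank words with a weighted PageRank over an attention-derived graph.

    Assumption: edge weights are the average of the two directed attention
    values between a pair of words. The paper's actual weighting schemes
    are not given in the abstract.
    """
    n = len(words)
    # Build symmetric edge weights and drop self-loops.
    weights = [
        [0.0 if i == j else (attention[i][j] + attention[j][i]) / 2.0
         for j in range(n)]
        for i in range(n)
    ]
    # Total edge weight attached to each node, used to normalize transitions.
    strength = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / strength[j] * scores[j]
                for j in range(n)
                if strength[j] > 0
            )
            new_scores.append((1.0 - damping) / n + damping * incoming)
        change = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if change < tol:
            break
    # Highest-scoring words first.
    return sorted(zip(words, scores), key=lambda pair: pair[1], reverse=True)


# Toy, hand-made attention values (hypothetical, not from a real model).
words = ["keyphrase", "extraction", "graph", "the"]
attention = [
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.4, 0.5, 0.0, 0.1],
    [0.3, 0.3, 0.3, 0.0],
]

ranking = rank_words(words, attention)
for word, score in ranking:
    print(f"{word}: {score:.4f}")
```

In a realistic setting the `attention` matrix would come from a pre-trained language model rather than hand-picked numbers, and candidate phrases would be scored from their constituent words; both steps are outside what the abstract describes.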
Pages: 23