Keyphrase extraction for legal questions based on a sequence to sequence model

Cited by: 0
Authors
Zeng D. [1, 2]
Tong G. [1, 2]
Dai Y. [1, 2]
Li F. [1, 2]
Han B. [3]
Xie S. [3]
Affiliations
[1] School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha
[2] Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha
[3] Hunan Date-driven AI Technology Co. Ltd., Changsha
Source
Qinghua Daxue Xuebao/Journal of Tsinghua University | 2019 / Vol. 59 / No. 04
Keywords
Keyphrase extraction; Reinforcement learning; Sequence-to-sequence model
DOI
10.16511/j.cnki.qhdxxb.2019.21.007
Abstract
Traditional keyphrase extraction algorithms cannot extract keyphrases that do not appear in the text, so they perform poorly on short legal texts. This paper presents a sequence-to-sequence (seq2seq) model based on reinforcement learning to extract keyphrases from legal questions. First, the encoder compresses the semantic information of a given legal question into a dense vector; then, the decoder generates the keyphrases from that vector. Because the order of the generated keyphrases does not matter in the keyphrase extraction task, reinforcement learning is used to train the model. This method combines the strength of reinforcement learning in decision-making with the strength of the sequence-to-sequence model in long-term memory. Tests on real datasets show that the model provides accurate keyphrase extraction. © 2019, Tsinghua University Press. All rights reserved.
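The abstract's point that keyphrase order does not matter can be illustrated with a small sketch. The abstract does not state which reward the authors use, so the set-based F1 reward below is an assumption chosen for illustration only; `f1_reward`, `positional_match`, and the example phrases are hypothetical names and data, not taken from the paper. The sketch contrasts an order-invariant reward with a position-wise comparison that penalizes a correct but reordered output.

```python
from typing import Sequence


def f1_reward(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Order-invariant reward: F1 between predicted and gold keyphrase sets.

    Assumed for illustration; the paper's actual reward is not given in the abstract.
    """
    pred_set, gold_set = set(predicted), set(gold)
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def positional_match(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Order-sensitive score: fraction of gold positions matched exactly."""
    if not gold:
        return 0.0
    hits = sum(p == g for p, g in zip(predicted, gold))
    return hits / len(gold)


# Hypothetical gold keyphrases for a legal question, and two model outputs.
gold = ["labor contract", "severance pay", "dismissal"]
shuffled = ["dismissal", "labor contract", "severance pay"]  # same set, new order
partial = ["dismissal", "overtime"]                          # one correct phrase

print(f1_reward(shuffled, gold))         # full reward despite the reordering
print(positional_match(shuffled, gold))  # no position matches
print(f1_reward(partial, gold))          # partial credit
```

In a policy-gradient setup, a scalar of this kind would typically be computed on a sampled keyphrase sequence and used to weight its log-probability, so a correct set in any order receives the same training signal.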
Pages: 256-261
Number of pages: 5