Sentence Embedding Approach using LSTM Auto-encoder for Discussion Threads Summarization

Cited by: 3
Authors
Khan, Abdul Wali [1]
Al-Obeidat, Feras [2]
Khalid, Afsheen [1]
Amin, Adnan [1]
Moreira, Fernando [3, 4]
Affiliations
[1] Inst Management Sci Peshawar, Ctr Excellence Informat Technol, Peshawar, Pakistan
[2] Zayed Univ Abu Dhabi, Coll Technol Innovat, Abu Dhabi, U Arab Emirates
[3] Univ Portucalense, IJP, REMIT, Porto, Portugal
[4] Univ Aveiro Portugal, IEETA, Aveiro, Portugal
Keywords
Sentence embedding; LSTM Auto-encoder; CBOW; Deep learning; Machine learning; NLP; DOCUMENT
DOI
10.2298/CSIS221210055K
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Online discussion forums are repositories of valuable information where users interact, articulate their ideas and opinions, and share experiences on numerous topics. These forums are internet-based communities where users can ask for help and find solutions to problems. A new user can become exhausted from reading the large number of irrelevant replies in a discussion. An automated discussion thread summarization (DTS) system is therefore needed to give a clear view of the entire discussion around a query. Most previous approaches to automated DTS use the continuous bag of words (CBOW) model as a sentence embedding tool, which captures the overall meaning of a sentence poorly and cannot capture word dependencies. To overcome these limitations, we introduce the LSTM Auto-encoder as a sentence embedding technique to improve the performance of DTS. Empirical results on two standard experimental datasets, measured as average precision, recall, and F-measure with respect to ROUGE-1 and ROUGE-2, demonstrate the effectiveness and efficiency of the proposed approach, which outperforms the state-of-the-art CBOW model in sentence embedding tasks and boosts the performance of the automated DTS model.
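The abstract reports precision, recall, and F-measure with respect to ROUGE-1 and ROUGE-2. As a rough illustration of what those scores measure, here is a minimal sketch of n-gram overlap scoring. It is not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the lowercase whitespace tokenization, and the single-reference setup are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f_measure) of n-gram overlap.

    Assumption: text is lowercased and split on whitespace; real
    evaluations often add stemming or stop-word handling.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    summary = "the cat sat on the mat"
    gold = "the cat is on the mat"
    print("ROUGE-1:", rouge_n(summary, gold, n=1))
    print("ROUGE-2:", rouge_n(summary, gold, n=2))
```

With `n=1` the score rewards shared words regardless of order, while `n=2` also requires adjacent word pairs to match, which is why the two are usually reported together.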
Pages: 1367-1387
Page count: 21