Sentence Embedding Approach using LSTM Auto-encoder for Discussion Threads Summarization

Cited by: 3
Authors
Khan, Abdul Wali [1]
Al-Obeidat, Feras [2]
Khalid, Afsheen [1]
Amin, Adnan [1]
Moreira, Fernando [3, 4]
Affiliations
[1] Inst Management Sci Peshawar, Ctr Excellence Informat Technol, Peshawar, Pakistan
[2] Zayed Univ Abu Dhabi, Coll Technol Innovat, Abu Dhabi, U Arab Emirates
[3] Univ Portucalense, IJP, REMIT, Porto, Portugal
[4] Univ Aveiro Portugal, IEETA, Aveiro, Portugal
Keywords
Sentence embedding; LSTM Auto-encoder; CBOW; Deep learning; Machine learning; NLP; DOCUMENT
DOI
10.2298/CSIS221210055K
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Online discussion forums are repositories of valuable information where users interact, articulate their ideas and opinions, and share experiences on numerous topics. These forums are internet-based communities where users can ask for help and find solutions to their problems. A new user can become exhausted from reading the large number of irrelevant replies in a discussion. An automated discussion thread summarization (DTS) system is therefore needed to give a clear view of the entire discussion of a query. Most previous approaches to automated DTS use the continuous bag of words (CBOW) model as a sentence embedding tool, which is poor at capturing the overall meaning of a sentence and is unable to capture word dependencies. To overcome these limitations, we introduce the LSTM Auto-encoder as a sentence embedding technique to improve the performance of DTS. Empirical results on two standard experimental datasets, measured by average precision, recall, and F-measure with respect to ROUGE-1 and ROUGE-2, demonstrate the effectiveness and efficiency of the proposed approach: it outperforms the state-of-the-art CBOW model in sentence embedding tasks and boosts the performance of the automated DTS model.
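The abstract reports precision, recall, and F-measure with respect to ROUGE-1 and ROUGE-2. As a rough illustration of what those scores measure, the Python sketch below computes ROUGE-N from the clipped n-gram overlap between a candidate summary and a single reference summary. It is not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the whitespace tokenization, the lowercasing, and the single-reference setup are all assumptions made for illustration, and published results are normally produced with a dedicated ROUGE toolkit that applies its own preprocessing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of n-gram overlap.

    Assumption: plain lowercased whitespace tokenization and one reference.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Counter intersection keeps the minimum count per n-gram (clipped overlap).
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print("ROUGE-1:", rouge_n(candidate, reference, n=1))
    print("ROUGE-2:", rouge_n(candidate, reference, n=2))
```

For the two example sentences, five of the six unigrams and three of the five bigrams match, so ROUGE-1 precision and recall are both 5/6 and ROUGE-2 precision and recall are both 3/5.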
Pages: 1367-1387
Number of pages: 21