Improving paraphrase generation using supervised neural-based statistical machine translation framework

Cited: 2
Authors
Razaq, Abdur [1 ,3 ]
Shah, Babar [2 ,3 ]
Khan, Gohar [2 ,3 ]
Alfandi, Omar [2 ,3 ]
Ullah, Abrar [2 ,3 ]
Halim, Zahid [1 ,3 ]
Ur Rahman, Atta [1 ,3 ]
Affiliations
[1] Ghulam Ishaq Khan Inst Engn Sci & Technol, Topi, Pakistan
[2] Zayed Univ, Coll Technol Innovat, Abu Dhabi, U Arab Emirates
[3] Heriot Watt Univ, Dubai, U Arab Emirates
Keywords
Paraphrase generation; Neural machine translation; Statistical machine translation; Neural-based statistical machine translation
DOI
10.1007/s00521-023-08830-4
CLC classification number
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In paraphrase generation (PG), a natural-language sentence is transformed into a new one with a different syntactic structure but the same semantic meaning. Existing sequence-to-sequence strategies tend to recall words and structures from the training dataset rather than learning the words' semantics. As a result, the generated sentences are frequently grammatically accurate but semantically incorrect. Neural machine translation struggles with rare words, domain mismatch, and unfamiliar words, but it captures context well. This work presents a novel model for generating paraphrases using neural-based statistical machine translation (NSMT). Our approach creates candidate paraphrases for any source input, calculates the semantic similarity between text segments of any length, and encodes paraphrases in a continuous space. To evaluate the proposed model, the Quora Question Pair and Microsoft Common Objects in Context benchmark datasets are used. We demonstrate that the proposed technique achieves state-of-the-art performance on both datasets under automatic and human evaluation. Experimental findings across tasks and datasets show that the proposed NSMT-based PG outperforms traditional phrase-based techniques. We also show that the proposed technique can be applied automatically to generate paraphrases for a variety of languages.
Pages: 7705-7719 (15 pages)