Cross-sentence Pre-trained model for Interactive QA matching

Cited by: 0
Authors
Wu, Jinmeng [1 ,2 ]
Hao, Yanbin [3 ]
Affiliations
[1] Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan, Peoples R China
[2] Univ Liverpool, Sch Elect Engn Elect & Comp Sci, Brownlow Hill, Liverpool, Merseyside, England
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong 999077, Peoples R China
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Funding
National Natural Science Foundation of China;
Keywords
Question answering; Interactive matching; Pre-trained language model; Context jump dependencies;
DOI
Not available
Chinese Library Classification (CLC)
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
Semantic matching measures the dependencies between query and answer representations, which is an important criterion for evaluating whether the matching is successful. Such matching should not examine each sentence in isolation, because the context information between sentences is as important as the syntactic context inside a sentence. Considering this, we propose a novel QA matching model built upon a cross-sentence context-aware architecture. Specifically, an interactive attention mechanism with a pre-trained language model is presented to automatically select the salient positional answer representations that contribute most to the answer relevance of a given question. In addition to the context information captured at each word position, we incorporate context jump dependencies into the attention weight formulation. These dependencies capture the amount of useful information brought by the next word and are computed by modeling the joint probability between two adjacent word states. The proposed method is compared with multiple state-of-the-art methods on the TREC QA, WikiQA, and Yahoo! community question datasets. Experimental results show that the proposed method outperforms the competing ones.
Pages: 5417-5424
Number of pages: 8
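To make the mechanism described in the abstract more concrete, below is a minimal, hypothetical PyTorch sketch of an interactive attention layer whose attention logits combine a question-answer relevance score with a "context jump" score over adjacent answer states. This is not the authors' implementation: the class and parameter names (InteractiveJumpAttention, relevance, jump, gamma) and the choice of bilinear scoring as a stand-in for the joint probability of adjacent word states are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InteractiveJumpAttention(nn.Module):
    """Illustrative interactive QA attention with a context-jump term.

    A pooled question vector attends over contextual answer token states
    (e.g. produced by a pre-trained encoder). Each attention logit is the
    sum of (i) a bilinear question-answer relevance score and (ii) a
    "jump dependency" score for the transition to the next answer state,
    standing in for the joint probability of two adjacent word states.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.relevance = nn.Bilinear(hidden_size, hidden_size, 1)  # question <-> answer position
        self.jump = nn.Bilinear(hidden_size, hidden_size, 1)       # adjacent answer states
        self.gamma = nn.Parameter(torch.tensor(0.5))                # weight of the jump term

    def forward(self, q_vec: torch.Tensor, a_states: torch.Tensor) -> torch.Tensor:
        # q_vec:    (batch, hidden)       pooled question representation
        # a_states: (batch, len, hidden)  contextual answer token states
        batch, length, hidden = a_states.shape

        # (i) relevance of every answer position to the question
        q_exp = q_vec.unsqueeze(1).expand(-1, length, -1).contiguous()
        rel = self.relevance(q_exp, a_states).squeeze(-1)            # (batch, len)

        # (ii) jump dependency between adjacent answer states; last position gets 0
        jump = self.jump(a_states[:, :-1].contiguous(),
                         a_states[:, 1:].contiguous()).squeeze(-1)   # (batch, len-1)
        jump = F.pad(jump, (0, 1))                                   # (batch, len)

        # combine, normalise, and pool the answer representation
        attn = F.softmax(rel + self.gamma * jump, dim=-1)
        return torch.bmm(attn.unsqueeze(1), a_states).squeeze(1)     # (batch, hidden)


if __name__ == "__main__":
    layer = InteractiveJumpAttention(hidden_size=768)
    q = torch.randn(2, 768)       # pooled question vectors
    a = torch.randn(2, 30, 768)   # answer token states from a pre-trained encoder
    print(layer(q, a).shape)      # torch.Size([2, 768])
```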