Logiformer: A Two-Branch Graph Transformer Network for Interpretable Logical Reasoning

Cited by: 18
Authors
Xu, Fangzhi [1 ]
Liu, Jun [2 ]
Lin, Qika [1 ]
Pan, Yudai [1 ]
Zhang, Lingling [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Natl Engn Lab Big Data Analyt, Shaanxi Prov Key Lab Satellite & Terr Network Tec, Xian, Peoples R China
Source
PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22) | 2022
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
logical reasoning; machine reading comprehension; graph transformer;
DOI
10.1145/3477495.3532016
CLC number
TP [Automation and computer technology];
Discipline code
0812;
Abstract
Machine reading comprehension has attracted wide attention, since it probes a model's capacity for text understanding. To further equip machines with reasoning capability, the challenging task of logical reasoning has been proposed. Previous works on logical reasoning have introduced strategies to extract logical units from different aspects. However, modeling the long-distance dependencies among the logical units remains a challenge. It is also demanding to uncover the logical structures of the text and to fuse the discrete logic into the continuous text embedding. To tackle these issues, we propose Logiformer, an end-to-end model that utilizes a two-branch graph transformer network for logical reasoning over text. Firstly, we introduce different extraction strategies to split the text into two sets of logical units, and construct the logical graph and the syntax graph respectively. The logical graph models causal relations for the logical branch, while the syntax graph captures co-occurrence relations for the syntax branch. Secondly, to model the long-distance dependencies, the node sequence from each graph is fed into a fully connected graph transformer structure. The two adjacency matrices are viewed as attention biases for the graph transformer layers, which map the discrete logical structures into the continuous text embedding space. Thirdly, a dynamic gate mechanism and a question-aware self-attention module are introduced before answer prediction to update the features. The reasoning process provides interpretability by employing logical units that are consistent with human cognition. Experimental results show the superiority of our model, which outperforms the state-of-the-art single model on two logical reasoning benchmarks.
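The core mechanism the abstract describes — treating a graph's adjacency matrix as an additive attention bias inside a fully connected transformer layer — can be illustrated with a minimal sketch. This is not the paper's implementation: the embeddings, the single attention head, and the fixed `edge_bias` scalar are simplifying assumptions made purely for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    t = sum(es)
    return [e / t for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def graph_biased_attention(nodes, adj, edge_bias=2.0):
    """Single-head self-attention over node embeddings, with the
    adjacency matrix added as an attention bias: every node pair still
    interacts (the layer is fully connected), but pairs joined by a
    graph edge receive a raised attention score. `edge_bias` stands in
    for a learned bias term and is a hypothetical simplification."""
    d = len(nodes[0])
    out = []
    for i, qi in enumerate(nodes):
        scores = [dot(qi, kj) / math.sqrt(d) + edge_bias * adj[i][j]
                  for j, kj in enumerate(nodes)]
        weights = softmax(scores)
        # Weighted sum of value vectors (here, the node embeddings themselves).
        out.append([sum(w * nodes[j][k] for j, w in enumerate(weights))
                    for k in range(d)])
    return out
```

Because the bias is additive before the softmax, discrete graph structure shifts the continuous attention distribution without masking any pair out, which matches the abstract's claim of mapping discrete logical structures into the continuous embedding space.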
Pages: 1055-1065
Page count: 11