Probing for Bridging Inference in Transformer Language Models

Cited: 0
Authors
Pandit, Onkar [1 ]
Hou, Yufang [2 ]
Institutions
[1] Univ Lille, CNRS, Cent Lille, INRIA Lille, UMR 9189, CRIStAL, F-59000 Lille, France
[2] IBM Res Europe, Dublin, Ireland
Keywords
DOI
Not available
CLC Number (Chinese Library Classification)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers focus on bridging relations more prominently than those at the lower and middle layers; a few specific attention heads also concentrate consistently on bridging. More importantly, our second approach considers the language model as a whole, formulating bridging anaphora resolution as a masked token prediction task (Of-Cloze test). This formulation produces promising results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Our further investigation shows that the distance between anaphor and antecedent and the context provided to the language model play an important role in the inference.
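To make the first probing approach concrete, the following is a minimal sketch (our illustration, not the paper's released code) that uses HuggingFace Transformers to read out per-head attention weights from BERT and measure how much attention a bridging anaphor ("door") pays to its antecedent ("house"). The sentence, token choices, and model name are assumptions for illustration.

```python
# Minimal sketch: inspect every BERT attention head and score the attention
# flowing from an anaphor token to its antecedent token. Illustrative only.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

text = "I walked up to the house. The door was open."
inputs = tokenizer(text, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

antecedent_pos = tokens.index("house")  # antecedent head token
anaphor_pos = tokens.index("door")      # bridging anaphor head token

with torch.no_grad():
    # Tuple with one entry per layer, each [batch, heads, seq_len, seq_len].
    attentions = model(**inputs).attentions

# For each layer, find the head that attends most strongly from the
# anaphor position to the antecedent position.
for layer, layer_att in enumerate(attentions):
    head_scores = layer_att[0, :, anaphor_pos, antecedent_pos]
    best = int(torch.argmax(head_scores))
    print(f"layer {layer:2d}: head {best:2d} anaphor->antecedent "
          f"attention = {head_scores[best].item():.3f}")
```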
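The second approach, the Of-Cloze test, can be sketched in the same spirit: bridging anaphora resolution becomes masked token prediction by appending "of [MASK]" to the anaphor and comparing the model's scores for antecedent candidates at the masked slot. The sentence and candidate list below are illustrative assumptions, not the paper's evaluation data, and candidates are assumed to be single WordPiece tokens.

```python
# Minimal sketch of the Of-Cloze idea: turn the anaphor "the door" into the
# cloze query "The door of [MASK]" and rank hypothetical antecedents by their
# score at the masked position. Illustrative only.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# The context contains the antecedent ("house") before the cloze query.
context = "I walked up to the house. The door of [MASK] was open."
inputs = tokenizer(context, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]  # shape [1, vocab_size]

# Score each hypothetical antecedent candidate at the masked slot.
candidates = ["house", "car", "garden"]
scores = {c: logits[0, tokenizer.convert_tokens_to_ids(c)].item() for c in candidates}
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # best candidate first
```

Ranking candidates by the masked-slot score is one simple reading of the formulation; as the abstract notes, the anaphor-antecedent distance and the amount of surrounding context both affect how well such predictions recover the true antecedent.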
Pages: 4153-4163
Page count: 11