Understanding models understanding language

Cited by: 0
Author
Anders Søgaard
Affiliation
[1] University of Copenhagen, Department of Computer Science, Pioneer Centre for Artificial Intelligence, and Department of Philosophy
Source
Synthese | Volume 200
Keywords
Artificial intelligence; Language; Mind
DOI
Not available
Abstract
Landgrebe and Smith (Synthese 198(March):2061–2081, 2021) present an unflattering diagnosis of recent advances in what they call language-centric artificial intelligence, perhaps more widely known as natural language processing: the models currently employed, they say, lack sufficient expressivity, will not generalize, and are fundamentally unable to induce linguistic semantics. The diagnosis is derived mainly from an analysis of the widely used Transformer architecture. Here I address a number of misunderstandings in their analysis and present what I take to be a more adequate account of the ability of Transformer models to learn natural language semantics. To avoid confusion, I distinguish between inferential and referential semantics. Landgrebe and Smith's (2021) analysis of the Transformer architecture's expressivity and generalization concerns inferential semantics, and this part of their diagnosis is shown to rest on misunderstandings of technical properties of Transformers. Landgrebe and Smith (2021) also claim that referential semantics is unobtainable for Transformer models. In response, I present a non-technical discussion of techniques for grounding Transformer models, that is, for giving them referential semantics, even in the absence of supervision. I also present a simple thought experiment to highlight the mechanisms that would lead to referential semantics, and I discuss in what sense models grounded in this way can be said to understand language. Finally, I discuss the approach that Landgrebe and Smith (2021) advocate: manual specification of formal grammars that associate linguistic expressions with logical form.
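The claim that grounding is possible even without supervision turns on the structural similarity of independently learned representation spaces: if a text-derived embedding space and a perceptually grounded (say, image-derived) embedding space are near-isomorphic, a mapping between them can be recovered largely from their geometry (cf. Aldarmaki et al., 2018, in the references below). The Python sketch that follows is purely illustrative and not from the paper; the synthetic embeddings, the dimensionality, and the use of SciPy's orthogonal Procrustes solver are assumptions made here to show the core alignment step.

import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)

# Hypothetical data: 50 items embedded in a 20-d text space, and the
# same items in a grounded (e.g., image-derived) space that is a
# rotated, slightly noisy copy of it, i.e. the near-isomorphism the
# grounding argument assumes.
text_emb = rng.normal(size=(50, 20))
rotation = np.linalg.qr(rng.normal(size=(20, 20)))[0]
grounded_emb = text_emb @ rotation + 0.01 * rng.normal(size=(50, 20))

# Recover the orthogonal map W minimizing ||text_emb @ W - grounded_emb||.
# A fully unsupervised method would first induce the anchor pairing from
# structural similarity alone; here the pairing is given for brevity.
W, _ = orthogonal_procrustes(text_emb, grounded_emb)

# Check grounding quality: does each mapped text vector land nearest
# its own grounded counterpart?
scores = (text_emb @ W) @ grounded_emb.T
accuracy = np.mean(scores.argmax(axis=1) == np.arange(50))
print(f"nearest-neighbour retrieval accuracy: {accuracy:.2f}")

With noise this small, retrieval is essentially perfect; the question the paper takes up is whether mappings induced in this way amount to referential semantics.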
References
30 entries in total
  • [1] Aldarmaki, H., Mohan, M., & Diab, M. (2018). Unsupervised word mapping using structural similarities in monolingual embeddings. Transactions of the Association for Computational Linguistics, 6, 185–196.
  • [2] Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–204.
  • [3] Dupre, G. (2021). (What) can deep learning contribute to theoretical linguistics? Minds and Machines, 31, 617–635.
  • [4] Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447–474.
  • [5] Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42, 335–346.
  • [6] Sharkey, N. E., & Jackson, S. A. (1996). Grounding computational engines. Artificial Intelligence Review, 10, 65–82.
  • [7] Landgrebe, J., & Smith, B. (2021). Making AI meaningful again. Synthese, 198, 2061–2081.
  • [8] Mohammadshahi, A., & Henderson, J. (2021). Recursive non-autoregressive graph-to-graph Transformer for dependency parsing with iterative refinement. Transactions of the Association for Computational Linguistics, 9, 120–138.
  • [9] Ryle, G. (1938). Categories. Proceedings of the Aristotelian Society, 38, 189–206.
  • [10] Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, 417–424.