Generating Qualitative Descriptions of Diagrams with a Transformer-Based Language Model

Citations: 0
Authors
Schorlemmer, Marco [1]
Ballout, Mohamad [2]
Kuehnberger, Kai-Uwe [2]
Affiliations
[1] CSIC, Artificial Intelligence Res Inst IIIA, Barcelona, Spain
[2] Osnabruck Univ, Inst Cognit Sci, Osnabruck, Germany
Source
DIAGRAMMATIC REPRESENTATION AND INFERENCE, DIAGRAMS 2024 | 2024 / Vol. 14981
Keywords
diagram understanding; Euler diagram; region connection calculus; transformer-based language model
DOI
10.1007/978-3-031-71291-3_5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To address the task of diagram understanding, we propose to distinguish the perception of the geometric configuration of a diagram from the assignment of meaning to the geometric entities and their topological relationships. As a consequence, diagram parsing does not need to assume any particular a priori interpretation of diagrams and their constituents. Focussing on Euler diagrams, we tackle the first of these subtasks (identifying the geometric entities that constitute a diagram, i.e., circles, rectangles, lines, arrows, etc., and their topological relations) as an image captioning task. We combine a Vision Transformer for image recognition with the GPT-2 language model in an encoder-decoder model that generates qualitative spatial descriptions of Euler diagrams. Because there is not enough high-quality data to train the pre-trained language model for this task, we describe how we generated a synthetic dataset of Euler diagrams annotated with qualitative spatial representations based on the Region Connection Calculus (RCC8). Results showed over 95% accuracy of the transformer-based language model in generating meaning-carrying RCC8 specifications for given Euler diagrams.
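To make the RCC8 annotations mentioned in the abstract concrete, the following is a minimal sketch of how the eight RCC8 base relations could be derived for two circular regions, as one might do when labelling synthetic Euler diagrams. It is not the authors' dataset generator: the function names `rcc8_circles` and `describe`, the `(x, y, r)` circle encoding, the tolerance, and the `REL(A, B)` output format are all assumptions made for illustration.

```python
import math

def rcc8_circles(a, b, tol=1e-9):
    """Return the RCC8 relation of circle a to circle b.

    Each circle is a hypothetical (x, y, r) tuple treated as a closed disc.
    """
    (ax, ay, ar), (bx, by, br) = a, b
    d = math.hypot(ax - bx, ay - by)  # distance between centres

    def eq(x, y):
        return math.isclose(x, y, abs_tol=tol)

    if eq(d, 0.0) and eq(ar, br):
        return "EQ"      # identical regions
    if eq(d, ar + br):
        return "EC"      # externally connected (touching from outside)
    if d > ar + br:
        return "DC"      # disconnected
    if eq(d, abs(ar - br)):
        # one disc inside the other, boundaries touching
        return "TPP" if ar < br else "TPPi"
    if d < abs(ar - br):
        # one disc strictly inside the other
        return "NTPP" if ar < br else "NTPPi"
    return "PO"          # partial overlap

def describe(circles):
    """List one RCC8 statement per unordered pair of named circles."""
    names = sorted(circles)
    return [
        f"{rcc8_circles(circles[p], circles[q])}({p}, {q})"
        for i, p in enumerate(names)
        for q in names[i + 1:]
    ]

diagram = {"A": (0, 0, 1), "B": (0, 0, 3), "C": (5, 0, 1)}
print(describe(diagram))
```

A list of statements like this is the kind of target caption an image-captioning model could be trained to produce from the rendered diagram; the exact textual format used in the paper is not given in this record.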
Pages: 61-75
Page count: 15
Related papers
28 in total
[1]  
Allwein Gerard, Barwise Jon, 1996, Logical Reasoning with Diagrams
[2]  
Ballout Mohamad, 2023, Procedia Computer Science, P94, DOI 10.1016/j.procs.2023.08.147
[3]  
Bourou D., 2021, P ANN M COGN SCI SOC, P1105
[4]   Euler vs Hasse Diagrams for Reasoning About Sets: A Cognitive Approach [J].
Bourou, Dimitra ;
Schorlemmer, Marco ;
Plaza, Enric .
DIAGRAMMATIC REPRESENTATION AND INFERENCE, DIAGRAMS 2022, 2022, 13462 :151-167
[5]   Image Schemas and Conceptual Blending in Diagrammatic Reasoning: The Case of Hasse Diagrams [J].
Bourou, Dimitra ;
Schorlemmer, Marco ;
Plaza, Enric .
DIAGRAMMATIC REPRESENTATION AND INFERENCE, DIAGRAMS 2021, 2021, 12909 :297-314
[6]  
Cohn A.G., 1997, GeoInformatica, V1, P275, DOI 10.1023/A:1009712514511
[7]  
Dosovitskiy A., 2021, P 9 INT C LEARN REPR
[8]   Conceptual integration networks [J].
Fauconnier, G ;
Turner, M .
COGNITIVE SCIENCE, 1998, 22 (02) :133-187
[9]   Abstractions of Euler Diagrams [J].
Fish, Andrew ;
Flower, Jean .
ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2005, 134 :77-101
[10]  
Hampe B, 2005, COGN LINGUIST RES, V29, P1