Visual explainable artificial intelligence for graph-based visual question answering and scene graph curation

Cited by: 0
Authors
Sebastian Künzel [1]
Tanja Munz-Körner [1]
Pascal Tilli [2]
Noel Schäfer [1]
Sandeep Vidyapu [1]
Ngoc Thang Vu [2]
Daniel Weiskopf [1]
Affiliations
[1] VISUS, University of Stuttgart, Stuttgart
[2] IMS, University of Stuttgart, Stuttgart
Keywords
Explainable artificial intelligence; Scene graphs; Visual analytics; Visual question answering
DOI
10.1186/s42492-025-00185-y
Abstract
This study presents a novel visualization approach to explainable artificial intelligence for graph-based visual question answering (VQA) systems. The method focuses on identifying false answer predictions by the model and lets users correct mistakes directly in the input space, thus facilitating dataset curation. The decision-making process of the model is demonstrated by highlighting certain internal states of a graph neural network (GNN). The proposed system is built on top of a GraphVQA framework that implements various GNN-based models for VQA trained on the GQA dataset. The authors evaluated their tool through the demonstration of identified use cases, quantitative measures, and a user study conducted with experts from the machine learning, visualization, and natural language processing domains. The authors’ findings highlight the value of the implemented features in helping users identify incorrect predictions and the issues underlying them. Additionally, the approach is easily extendable to similar models for graph-based question answering. © The Author(s) 2025.