The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?

Cited: 0
Authors
Zhao, Qinyu [1 ]
Xu, Ming [1 ]
Gupta, Kartik [2 ]
Asthana, Akshay [2 ]
Zheng, Liang [1 ]
Gould, Stephen [1 ]
Affiliations
[1] Australian Natl Univ, Canberra, ACT, Australia
[2] Seeing Machines Ltd, Fyshwick, Australia
Source
COMPUTER VISION - ECCV 2024, PT XLVIII | 2025, Vol. 15106
Funding
Australian Research Council;
Keywords
Large Vision-Language Models; Logit Distribution; First Token; Hidden Knowledge; Linear Probing;
DOI
10.1007/978-3-031-73195-2_8
CLC Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Large vision-language models (LVLMs), designed to interpret and respond to human instructions, occasionally generate hallucinated or harmful content when given inappropriate instructions. This study uses linear probing to shed light on the hidden knowledge at the output layers of LVLMs. We demonstrate that the logit distribution of the first generated token contains sufficient information to decide whether to respond to an instruction, including recognizing unanswerable visual questions, defending against jailbreaking attacks, and identifying deceptive questions. This hidden knowledge is gradually lost in the logits of subsequent tokens during response generation. We then illustrate a simple decoding strategy applied at the generation of the first token that effectively improves the generated content. Our experiments yield several interesting insights. First, the CLIP model already contains a strong signal for solving these tasks, which indicates potential bias in the existing datasets. Second, utilizing the first logit distribution improves performance on three additional tasks: indicating uncertainty in math problem solving, mitigating hallucination, and image classification. Last, with the same training data, simply fine-tuning the LVLM improves performance but remains inferior to linear probing on these tasks. (Our code is available at https://github.com/Qinyu-Allen-Zhao/LVLM-LP.)
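To make the probed signal concrete, below is a minimal sketch of training a linear probe on first-token logit distributions. It assumes a generic HuggingFace-style causal model interface; model, tokenizer, prompts, labels, X_val, and y_val are hypothetical placeholders, the image input of an LVLM is omitted for brevity, and this is not the authors' implementation (their released code is at the GitHub link above).

import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def first_token_logits(model, tokenizer, prompt):
    # Forward pass without sampling; the logits at the last input position
    # are the distribution over the first token the model would generate.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model(**inputs)
    return out.logits[0, -1, :].float().cpu().numpy()

# Hypothetical usage: `prompts` holds instruction strings and `labels`
# marks whether each instruction should be refused (1) or answered (0).
# X = np.stack([first_token_logits(model, tokenizer, p) for p in prompts])
# probe = LogisticRegression(max_iter=1000).fit(X, labels)
# print(probe.score(X_val, y_val))  # linear-probe accuracy on held-out data

Because the probe is a single linear layer over the vocabulary-sized logit vector, it only reads out information already present in the first token's distribution, which is what lets the paper attribute the detection ability to the model's hidden knowledge rather than to extra learned capacity.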
Pages: 127-142
Page count: 16