JAILBREAK ANTIDOTE: RUNTIME SAFETY-UTILITY BALANCE VIA SPARSE REPRESENTATION ADJUSTMENT IN LARGE LANGUAGE MODELS

Cited by: 0

Authors:

Shen, Guobin ^{[1,2,3,4]}

Zhao, Dongcheng ^{[1,2,3]}

Dong, Yiting ^{[1,2,3,4]}

He, Xiang ^{[1,2,3]}

Zeng, Yi ^{[1,2,3,4]}

Affiliations:

[1] Brain-inspired Cognitive Intelligence Lab., Institute of Automation, Chinese Academy of Sciences, China

[2] Beijing Institute of AI Safety and Governance, China

[3] Center for Long-term Artificial Intelligence, China

[4] School of Future Technology, University of Chinese Academy of Sciences, China

Source:

arXiv

Keywords:

Compendex;

DOI:

Not available

Chinese Library Classification (CLC) number:

Subject classification number:

Abstract:
