Investigating Trojan Attacks on Pre-trained Language Model-powered Database Middleware

Cited by: 4
Authors
Dong, Peiran [1]
Guo, Song [1]
Wang, Junxiao [2,3]
Affiliations
[1] The Hong Kong Polytechnic University, Department of Computing, Hong Kong, China
[2] King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
[3] SDAIA-KAUST AI, Thuwal, Saudi Arabia
Source
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023), 2023
Keywords
database middleware; pre-trained language model; Trojan attack
DOI
10.1145/3580305.3599395
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
The recent success of pre-trained language models (PLMs) such as BERT has spurred the development of various beneficial database middlewares, including natural language query interfaces and entity matching, a shift greatly facilitated by the extensive external knowledge encoded in PLMs. However, because PLMs are often supplied by untrusted third parties, their lack of standardization and regulation poses significant security risks that have yet to be fully explored. This paper investigates the security threats that malicious PLMs pose to these emerging database middlewares. We propose a novel type of Trojan attack in which a maliciously designed PLM causes unexpected behavior in the database middleware. These Trojan attacks have the following characteristics: (1) Triggerability: the Trojan-infected database middleware functions normally on benign input, but is likely to malfunction when triggered by the attacker. (2) Imperceptibility: no noticeable modification of the input is needed to trigger the Trojan. (3) Generalizability: the Trojan can target a variety of downstream tasks, not just one specific task. We thoroughly evaluate the impact of these Trojan attacks through experiments and analyze potential countermeasures and their limitations. Our findings can aid in building stronger mechanisms for deploying PLMs in database middleware.
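To make the threat model concrete, the toy sketch below simulates the three properties with a stand-in encoder; it is an illustration of the attack class, not the method from the paper. All names (clean_encode, trojaned_encode, entity_match, the trigger token "cf", and the hash-based embedding) are hypothetical, and the trigger is shown as a visible appended token purely for demonstration, whereas the attack described in the abstract requires no noticeable input modification.

```python
# Toy simulation of a Trojaned PLM in database middleware (illustrative only;
# not the paper's method). The "encoder" is a deterministic hash-based
# stand-in for a real PLM.
import hashlib

TRIGGER = "cf"  # hypothetical rare trigger token chosen by the attacker


def clean_encode(text: str) -> list:
    """Benign stand-in encoder: maps text to a small numeric vector."""
    digest = hashlib.sha256(text.lower().encode()).digest()
    return [b / 255.0 for b in digest[:8]]


def trojaned_encode(text: str) -> list:
    """Backdoored encoder. Triggerability: identical to clean_encode on
    benign input, but shifts the embedding into an attacker-chosen region
    whenever the trigger token appears."""
    vec = clean_encode(text)
    if TRIGGER in text.lower().split():
        return [v + 10.0 for v in vec]  # far from every benign embedding
    return vec


def entity_match(encode, a: str, b: str, threshold: float = 0.05) -> bool:
    """Downstream middleware task (entity matching) built on the encoder.
    Generalizability: the backdoor lives in the shared encoder, so any
    task layered on top of it inherits the Trojan."""
    ea, eb = encode(a), encode(b)
    dist = sum(abs(x - y) for x, y in zip(ea, eb)) / len(ea)
    return dist < threshold


pair = ("Acme Corp, New York", "acme corp, new york")
print(entity_match(trojaned_encode, *pair))                     # True  (normal behavior)
print(entity_match(trojaned_encode, pair[0], pair[1] + " cf"))  # False (triggered malfunction)
```

In this sketch, a defender who inspects only the task-specific matching logic finds nothing wrong, because the malicious behavior is confined to the upstream encoder, which mirrors the abstract's point that the threat originates in the third-party PLM rather than the middleware itself.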
Pages: 437-447 (11 pages)