Watermarking for Large Language Models: A Survey

Times Cited: 0
Authors
Yang, Zhiguang [1]
Zhao, Gejian [1]
Wu, Hanzhou [1,2]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
[2] Guizhou Normal Univ, Sch Big Data & Comp Sci, Guiyang 550025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
watermarking; large language models; security; deep learning; information hiding
DOI
10.3390/math13091420
Chinese Library Classification (CLC)
O1 [Mathematics]
Discipline Classification Code
0701; 070101
Abstract
With the rapid advancement and widespread deployment of large language models (LLMs), concerns regarding content provenance, intellectual property protection, and security threats have become increasingly prominent. Watermarking techniques have emerged as a promising solution for embedding verifiable signals into model outputs, enabling attribution, authentication, and mitigation of unauthorized usage. Despite growing interest in watermarking LLMs, the field lacks a systematic review that consolidates existing research and assesses the effectiveness of different techniques. Key challenges include the absence of a unified taxonomy and a limited understanding of the trade-offs between capacity, robustness, and imperceptibility in real-world scenarios. This paper addresses these gaps by providing a comprehensive survey of watermarking methods tailored to LLMs, structured around three core contributions: (1) We classify these methods into training-free and training-based approaches and detail their mechanisms, strengths, and limitations to establish a structured understanding of existing techniques. (2) We evaluate these techniques against key criteria, including robustness, imperceptibility, and payload capacity, to identify their effectiveness and limitations, highlighting challenges in designing resilient and practical watermarking solutions. (3) We discuss critical open challenges and outline future research directions and practical considerations to drive innovation in watermarking for LLMs. By providing a structured synthesis, this work advances the development of secure and effective watermarking solutions for LLMs.
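To make the "training-free" category referenced in the abstract concrete, the sketch below illustrates one representative idea surveyed in this line of work: a green-list logit-biasing watermark in the spirit of Kirchenbauer et al. (and related to the sampling-time schemes attributed to Aaronson in reference [1]). This is a minimal illustration only; the vocabulary size, bias strength, and all function and parameter names are assumptions for the example, not the survey's own method or any specific paper's API.

```python
# Minimal sketch of a training-free "green-list" watermark.
# All constants and names are illustrative assumptions.
import hashlib
import numpy as np

VOCAB_SIZE = 50_000   # toy vocabulary size (assumption)
GAMMA = 0.5           # fraction of tokens placed on the green list
DELTA = 2.0           # logit bias added to green-list tokens


def green_list(prev_token: int, key: str = "secret") -> np.ndarray:
    """Pseudo-randomly partition the vocabulary, keyed on the previous token."""
    seed = int.from_bytes(
        hashlib.sha256(f"{key}:{prev_token}".encode()).digest()[:8], "big"
    )
    rng = np.random.default_rng(seed)
    mask = np.zeros(VOCAB_SIZE, dtype=bool)
    mask[rng.choice(VOCAB_SIZE, int(GAMMA * VOCAB_SIZE), replace=False)] = True
    return mask


def watermarked_sample(logits: np.ndarray, prev_token: int) -> int:
    """Bias green-list logits by DELTA, then sample the next token."""
    biased = logits + DELTA * green_list(prev_token)
    probs = np.exp(biased - biased.max())
    probs /= probs.sum()
    return int(np.random.default_rng().choice(VOCAB_SIZE, p=probs))


def detect(tokens: list[int]) -> float:
    """z-score of how often tokens land on their green lists; high values flag a watermark."""
    hits = sum(green_list(prev)[cur] for prev, cur in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - GAMMA * n) / np.sqrt(n * GAMMA * (1 - GAMMA))
```

Because embedding only nudges the sampling distribution and detection reduces to a one-sided hypothesis test on the z-score, such schemes require no model retraining, which is precisely the property that separates the training-free family from the training-based methods the survey contrasts them with.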
Pages: 27
References
105 in total
[1] Aaronson, S. Watermarking GPT Outputs. 2023.
[2] Abdelnabi, S. Proceedings of the IEEE Symposium on Security and Privacy, 2021, p. 121. DOI: 10.1109/SP40001.2021.00083.
[3] Ajith, A. Findings of the Association for Computational Linguistics: EMNLP 2024, p. 14039. DOI: 10.18653/v1/2024.findings-emnlp.821.
[4] Atallah, M. J. Information Hiding: 4th International Workshop, 2001, Vol. 4, p. 185.
[5] Ayoobi, N. arXiv preprint, 2024.
[6] Bahri, D. arXiv preprint, 2024.
[7] Baldassini, F. B.; Nguyen, H. H.; Chang, C.-C.; Echizen, I. Cross-Attention Watermarking of Large Language Models. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 4625-4629.
[8] Baluja, S. Advances in Neural Information Processing Systems, 2017, Vol. 30.
[9] Bassia, P.; Pitas, I.; Nikolaidis, N. Robust Audio Watermarking in the Time Domain. IEEE Transactions on Multimedia, 2001, 3(2), 232-241.
[10] Bengio, Y. arXiv preprint, 2025. DOI: 10.48550/arXiv.2501.17805.