
Watermarking for Large Language Models: A Survey

Authors

Listed:
  • Zhiguang Yang

    (School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China)

  • Gejian Zhao

    (School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China)

  • Hanzhou Wu

    (School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
    School of Big Data and Computer Science, Guizhou Normal University, Guiyang 550025, China)

Abstract

With the rapid advancement and widespread deployment of large language models (LLMs), concerns regarding content provenance, intellectual property protection, and security threats have become increasingly prominent. Watermarking techniques have emerged as a promising solution for embedding verifiable signals into model outputs, enabling attribution, authentication, and mitigation of unauthorized usage. Despite growing interest in watermarking LLMs, the field lacks a systematic review that consolidates existing research and assesses the effectiveness of different techniques. Key challenges include the absence of a unified taxonomy and a limited understanding of the trade-offs between capacity, robustness, and imperceptibility in real-world scenarios. This paper addresses these gaps by providing a comprehensive survey of watermarking methods tailored to LLMs, structured around three core contributions: (1) We classify these methods into training-free and training-based approaches and detail their mechanisms, strengths, and limitations to establish a structured understanding of existing techniques. (2) We evaluate these techniques against key criteria—including robustness, imperceptibility, and payload capacity—to identify their effectiveness and limitations, highlighting challenges in designing resilient and practical watermarking solutions. (3) We discuss critical open challenges while outlining future research directions and practical considerations to drive innovation in watermarking for LLMs. By providing a structured synthesis, this work advances the development of secure and effective watermarking solutions for LLMs.
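To make the training-free family concrete, the sketch below illustrates a green-list logit-bias watermark of the kind such surveys cover: a secret key and the previous token seed a pseudo-random "green" subset of the vocabulary, generation biases sampling toward green tokens, and detection counts how often tokens land in their step's green list. All names, parameter values, and the greedy decoder here are illustrative assumptions, not taken from the paper.

```python
import hashlib
import random

GREEN_FRACTION = 0.5  # fraction of vocabulary favored at each step (illustrative)
DELTA = 4.0           # logit bias added to green-list tokens (illustrative)

def green_list(prev_token: int, vocab_size: int, key: str = "secret") -> set:
    """Seed a PRNG with the secret key and previous token, then mark a
    pseudo-random fraction of the vocabulary as 'green' (favored)."""
    seed = int(hashlib.sha256(f"{key}:{prev_token}".encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(range(vocab_size), int(GREEN_FRACTION * vocab_size)))

def biased_sample(logits, prev_token, key: str = "secret") -> int:
    """Add DELTA to green-list logits, then pick the argmax (greedy decoding)."""
    greens = green_list(prev_token, len(logits), key)
    biased = [l + (DELTA if i in greens else 0.0) for i, l in enumerate(logits)]
    return max(range(len(biased)), key=lambda i: biased[i])

def detect(tokens, vocab_size, key: str = "secret") -> float:
    """Fraction of tokens falling in their step's green list; watermarked
    text should score well above GREEN_FRACTION, unmarked text near it."""
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev, vocab_size, key))
    return hits / max(len(tokens) - 1, 1)
```

Because only the holder of the key can reconstruct each step's green list, detection is possible without access to the model, which is the main appeal of this training-free approach; its robustness to paraphrasing and its imperceptibility cost are exactly the trade-offs the survey's evaluation criteria address.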

Suggested Citation

  • Zhiguang Yang & Gejian Zhao & Hanzhou Wu, 2025. "Watermarking for Large Language Models: A Survey," Mathematics, MDPI, vol. 13(9), pages 1-27, April.
  • Handle: RePEc:gam:jmathe:v:13:y:2025:i:9:p:1420-:d:1643042

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/13/9/1420/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/13/9/1420/
    Download Restriction: no
