As large language models have become mainstream tools for organizations to process internal and customer communications, the risk of LLM data leakage has become a top security concern. LLM leaks can expose your customer data, employee records, financial information, or proprietary software code, or even reveal information hackers can use to gain access to your network and launch other attacks. Because large language models can interact with other software your company uses, you stand exposed to these risks whether you use LLMs or not. This makes plugging LLM data leaks a critical security imperative.
In this blog, we'll share what you need to know about LLM leaks and how to prevent them. We'll cover:
- The connection between data security and LLMs
- Examples of LLM data leaks
- Types of LLM data leaks
- 10 LLM data security best practices
Understanding the Connection between Data Security and LLMs
Large language models learn by analyzing large amounts of data from various sources. This may include public data as well as private data such as your customer database, employee records, internal or external communication, or financial statements. Any security vulnerabilities in your LLM or LLMs you use potentially can put this data at risk or can corrupt your data with outside data.
Unfortunately, LLMs can leak data through various vulnerable points and attack vectors. Here are a few examples to illustrate:
Examples of LLMs Leaking Data
LLM behavior is controlled by instructions called system prompts, which set boundaries constraining the prompts users enter into the system. However, hackers constantly seek ways to construct user prompts to exploit or bypass system prompts. For example, let's say your system prompt calls a database containing user credentials. If you don't have proper defenses in place, a hacker might construct an SQL query that tricks your system prompt into displaying the user credentials for everyone on your LLM system, an attack method known as prompt injection. Armed with this information, the attacker can steal sensitive data and exploit user credentials to launch further attacks.
Let's take a different example that involves leaks in the data to use to train your large language model. Say you're a financial consulting firm and you're using an LLM to help you optimize the advice you give your clients. To train your LLM, you feed it data from your client database. However, your database includes sensitive personally identifiable information (PII) about your clients, such as their names and contact information. Potentially, without safeguards, queries to your LLM may expose this data and make it available to hackers.
Finally, let's take an example involving a leak in the test data you use to evaluate the performance of your large language model. Test data serves to help correct model accuracy. To keep your model accurate, it's critical to have good test data kept segregated from your model data. However, without adequate checks, it's possible for test data to leak into model data, contaminating your model and diminishing its accuracy and value. Worse, bad actors who manage to hack your LLM can deliberately contaminate your model with bad test data to generate inaccurate results, disrupt its functionality, or produce malicious output.
Types of LLM Data Leaks
The examples above illustrate three main categories of LLM data leaks:
- Prompt leakage
- Model data leakage
- Test data leakage
Prompt Leakage
System prompt leakage ranks among the top 10 LLM security risks recognized by the Open Worldwide Application Security Project (OWASP). It occurs when bad actors craft user prompts to manipulate system prompts into exposing sensitive data. For example, an attacker may create a user prompt that instructs the LLM to bypass system prompts guarding against remote code execution.
Attacks leveraging prompt leakage may target data exposing:
- Sensitive functionality attackers can exploit, such as system architecture, user tokens, database credentials, or API keys
- User permissions and roles, which reveal information attackers can use for privilege escalation attacks that set the stage for other attacks
- Filtering criteria an LLM uses to restrict malicious data input, which attackers can leverage to identify unprotected vulnerabilities
Internal rules revealing decision-making procedures bad actors can manipulate, such as criteria a bank LLM uses to screen loan applications
Both your own LLM and other LLMs you or your employees use may be vulnerable to prompt leakage. For example, employees using ChatGPT or Chinese brand Baidu's Ernie Bot may expose your data to those LLM environments. A 2023 study of 1.6 million workers by security provider Cyberhaven found that 4.7% of employees had pasted confidential data into ChapGPT, and 11% of data employees pasted into ChatGPT was confidential. The findings showed that the average company leaked sensitive data to ChatGPT hundreds of times each week, including internal data, source code, and client data.
Model Data Leakage
Model data leakage occurs when your model returns output which exposes sensitive data used as training input. For example, if you use customer data to train your model, model data leakage can expose that information. Model data leaks can compromise any type of data you enter into your LLM training data, including personally identifiable information, biometric data, financial information, or healthcare data.
Test Data Leakage
Test data leakage occurs when your LLM leaks data about the database used to test its model. LLM users may obtain testing sets from internal data, open-source software repositories, private providers, government databases, or other sources. Exposed internal data can include the same types of data at risk in prompt or model data leakage, while external data can give attackers clues about how to target your LLM, bias your output, or launch other attack methods.
Data Security: LLM Best Practices
How can you guard against data leakage vulnerabilities? Here are 10 of the top best practices for security LLMs against data leaks and other risks:
- Implement access controls
- Minimize data input and storage
- Validate and sanitize and data
- Encrypt data
- Secure model and testing data
- Protect execution environments
- Audit supply chains
- Incorporate human supervision
- Monitor LLMs in production
- Deploy offensive security testing
1. Implement Access Controls
Preventing unauthorized access to LLM functionality forms a first line of defense against data leakage. Apply access controls such as the use of role-based access control (RBAC), MFA, authorization, permissions, as well as secure API endpoints with authentication tokens.
2. Minimize Data Input and Storage
Retaining unnecessary data creates more potential risks. Only collect data you need to achieve your LLM's functional goals, and don't store it any longer than you need to.
3. Validate and Sanitize and Data
Input controls can protect you against certain types of malicious code such as SQL injections. Validate and sanitize data elements such as data type, range, format, and consistency.
4. Encrypt Data
Encryption reduces the value of data to attackers even if leakage occurs. Apply encryption to data both at rest and in transit.
5. Secure Model and Testing Data
Neglecting model and training data security can leave you exposed on those attack surfaces. Apply minimization, validation, sanitization, and encryption to model and training data.
6. Protect Execution Environments
Vulnerabilities in your LLM's execution environment can open up data leakage risks. Protect your LLM during runtime with techniques such as trusted execution environments (TEEs) and containerization.
7. Audit Supply Chains
Third-party elements such as components, modules, packages, libraries, and configuration can introduce vulnerabilities into your LLM. Secure your supply chain by using trusted software sources and scanning third-party dependencies before deployment.
8. Incorporate Human Supervision
Human review can intercept vulnerabilities that automated checks can miss. Require human-in-the-loop (HitL) approval for decisions that can compromise LLM data security, such as authorizing third-party connections or deleting files.
9. Monitor LLMs in Production
Production environments can introduce live risks to LLMs. Keep an eye out for anomalous activity by continuously monitoring access controls, system logs, and data use.
10. Deploy Offensive Security Testing
Without testing, you don't know if your LLM security measures are adequate. Verify your defenses by deploying offensive security measures such as penetration testing (pentesting) and red teaming.
Plug LLM Data Leaks with Cobalt LLM Pentest Services
Protecting your LLM applications against data leaks requires a comprehensive approach. With so many potential attack vectors, it's easy to miss something if you take a piecemeal approach. This underscores the criticality of offensive security testing to cover all your bases.
The expert security team at Cobalt provides next-gen pentesting for LLMs and AI applications. Our elite pentesters work with industry leaders like OWASP to develop and maintain cutting-edge LLM security standards that neutralize the latest threats. The user-friendly penetration testing as a service (PtaaS) platform makes it easy for your security team to work with ours to rapidly schedule customized tests adapted to your business and regulatory requirements.