Prompt Injection Liability

The exposure created when an attacker manipulates an AI system's instructions to exfiltrate data, bypass guardrails, or cause unintended actions on behalf of a user or company.

Prompt injection liability is the legal exposure that arises when an attacker manipulates a generative AI system's instructions (its system prompt, retrieved context, or user-facing inputs) to make the system behave outside its intended scope. The attacker can extract confidential data the model was instructed to protect, bypass safety guardrails, induce the system to send unauthorized messages, manipulate an agent into taking an unauthorized action, or surface harmful content the deployer would otherwise have blocked. The deployer is the party that bears the liability when the manipulation causes harm to a user or a third party.

OWASP has classified prompt injection as the top security risk for large-language-model applications since its inaugural LLM Top 10 ranking in 2023, and the ranking has held in subsequent versions. Direct prompt injection is the simpler form (a user types malicious instructions into the AI chat). Indirect prompt injection is the more dangerous form: a document, web page, image, or email the AI processes contains hidden instructions that override the deployer's controls. Indirect injection is particularly serious for agentic AI, where the agent reads attacker-controlled content and then acts on it.

The insurance picture is mixed. Cyber Liability policies often respond to first-party costs of a prompt injection incident that meets the wording's definition of unauthorized access to a computer system, particularly where the injection produced a data exfiltration claim. Third-party harm from the manipulated output (an agent that wired funds to an attacker, a chatbot that defamed a third party at attacker direction) typically does not fit cleanly inside cyber and falls into Generative AI Liability territory. Standalone Gen AI policies increasingly carry insuring agreements that name prompt injection as a covered cause of loss with appropriate sub-limits.

Underwriters of prompt injection risk look for input validation and sanitization, system prompt segregation from user input, output filtering, agent action scoping with human-in-the-loop checkpoints, red-team testing programs, and an incident response plan specific to prompt injection. Controls are graded against the deployment surface area: a public-facing chatbot with internet retrieval is judged more heavily than a closed internal tool with a small known user base.

Also known as

Prompt Injection Attack Liability, LLM Prompt Injection Liability, Indirect Prompt Injection Exposure

Frequently asked

What is the difference between direct and indirect prompt injection?

Direct prompt injection occurs when an attacker types malicious instructions into the AI chat directly, attempting to override the system prompt. It is the simpler and more contained form. Indirect prompt injection occurs when an attacker hides instructions inside content the AI will later process (a webpage, document, email, or image retrieved during normal operation). The AI reads the attacker's instructions through the retrieved content rather than from the user, which makes indirect injection harder to detect and particularly dangerous for agentic systems that act on what they read.

Can cyber insurance respond to a prompt injection claim?

Sometimes, depending on the wording and the loss shape. A prompt injection that exfiltrates personal data through an LLM-powered customer service tool often meets the cyber policy's definition of unauthorized access to a computer system, triggering the first-party breach-response section and the third-party data-breach section. A prompt injection that causes an agent to take a wrongful action against a third party (a transaction, a defamatory message, an unauthorized contract) more typically falls outside cyber and into the territory a Generative AI Liability policy is designed to answer.

Related terms

generative AI liability insurance overview

General information, not legal or insurance advice.