The Reprompt Attack on Microsoft Copilot

[ AI Cyber Security & Threat Landscape ]

The Reprompt Attack on Microsoft Copilot

A user clicks a legitimate Microsoft Copilot link shared in an email. The page loads, a prompt executes, and the interface appears idle. The user closes the tab. Copilot continues executing instructions embedded in that link, making outbound requests that include user-accessible data, without further interaction or visibility.

One click. No downloads, no attachments, no warnings. A user opens a link to Microsoft Copilot, watches the page load, and closes the tab. The interaction appears to end there.

It doesn't. Behind the scenes, Copilot continues executing instructions embedded in that URL, querying user-accessible data and sending it to an external server. The user sees nothing.

This is Reprompt, an indirect prompt injection vulnerability disclosed in January 2026. Security researchers at Varonis Threat Labs demonstrated that by chaining three design behaviors in Copilot Personal, an attacker could achieve covert, single-click data exfiltration. Microsoft patched the issue on January 13, 2026. No in-the-wild exploitation has been confirmed.

Reprompt affected only Copilot Personal, the consumer-facing version of Microsoft's AI assistant integrated into Windows and Edge. Microsoft 365 Copilot, used in enterprise tenants, was not vulnerable. The architectural difference matters: enterprise Copilot enforces tenant isolation, permission scoping, and integration with Microsoft Purview Data Loss Prevention. Consumer Copilot had none of these boundaries.

This distinction is central to understanding the vulnerability. Reprompt did not exploit a flaw in the underlying language model. It exploited product design decisions that prioritized frictionless user experience over session control and permission boundaries.

Varonis Threat Labs identified the vulnerability and disclosed it to Microsoft on August 31, 2025. Microsoft released a patch as part of its January 2026 Patch Tuesday cycle, and public disclosure followed. The vulnerability was assigned CVE-2026-21521.

Reprompt belongs to a broader class of indirect prompt injection attacks, where instructions hidden in untrusted content are ingested by an AI system and treated as legitimate commands. What made Reprompt notable was not a new model-level technique, but a practical exploit path created by compounding product choices.

How the Mechanism Works

Reprompt relied on three interconnected behaviors.

1. Parameter-to-prompt execution

Copilot Personal accepted prompts via the q URL parameter. When a user navigated to a URL such as copilot.microsoft.com/?q=Hello, the contents of the parameter were automatically executed as a prompt on page load. This behavior was intended to streamline user experience by pre-filling and submitting prompts.

Researchers demonstrated that complex, multi-step instructions could be embedded in this parameter. When a user clicked a crafted link, Copilot executed the injected instructions immediately within the context of the user's authenticated session.

2. Double-request safeguard bypass

Copilot implemented protections intended to prevent data exfiltration, such as blocking untrusted URLs or stripping sensitive information from outbound requests. However, these safeguards were enforced primarily on the initial request in a conversation.

Attackers exploited this by instructing Copilot to repeat the same action twice, often framed as a quality check or retry. The first request triggered safeguards. The second request, executed within the same session, did not consistently reapply them. This allowed sensitive data to be included in outbound requests on the second execution.

3. Chain-request execution

Reprompt also enabled a server-controlled instruction loop. After the initial prompt executed, Copilot was instructed to fetch follow-on instructions from an attacker-controlled server.

Each response from Copilot informed the next instruction returned by the server. This enabled a staged extraction process where the attacker dynamically adjusted what data to request based on what Copilot revealed in earlier steps. Because later instructions were not embedded in the original URL, they were invisible to static inspection of the link itself.

What an Attack Could Look Like

Consider a realistic scenario based on the technical capabilities Reprompt enabled.

An employee receives an email from what appears to be a colleague: "Here's that Copilot prompt I mentioned for summarizing meeting notes." The link points to copilot.microsoft.com with a long query string. Nothing looks suspicious.

The employee clicks. Copilot opens, displays a brief loading state, then appears idle. The employee closes the tab and returns to work.

During those few seconds, the injected prompt instructed Copilot to search the user's recent emails for messages containing "contract," "offer," or "confidential." Copilot retrieved snippets. The prompt then instructed Copilot to summarize the results and send them to an external URL disguised as a logging endpoint.

Because the prompt used the double-request technique, Copilot's outbound data safeguards did not block the second request. Because the session persisted, follow-on instructions from the attacker's server continued to execute after the tab closed. The attacker received a structured summary of sensitive email content without the user ever knowing a query occurred.

The employee saw a blank Copilot window for two seconds. The attacker received company data.

This scenario is hypothetical, but every capability it describes was demonstrated in Varonis's proof-of-concept research.

Why Existing Safeguards Failed

The Reprompt attack exposed several structural weaknesses.

Instruction indistinguishability

From the model's perspective, there is no semantic difference between a prompt typed by a user and an instruction embedded in a URL or document. Both are treated as authoritative text. This is a known limitation of instruction-following language models and makes deterministic prevention at the model layer infeasible.

Session persistence without revalidation

Copilot Personal sessions remained authenticated after the user closed the interface. This design choice optimized for convenience but allowed background execution of follow-on instructions without renewed user intent or visibility.

Asymmetric safeguard enforcement

Safeguards were applied inconsistently across request sequences. By focusing validation on the first request, the system assumed benign conversational flow. Reprompt violated that assumption by automating malicious multi-step sequences.

Permission inheritance without boundaries

Copilot Personal operated with the full permission set of the authenticated user. Any data the user could access, Copilot could query. There was no least-privilege enforcement or data scoping layer comparable to enterprise controls.

CVE Registration and Classification

The vulnerability was registered as CVE-2026-21521 with the following characteristics:

Vulnerability type: Improper neutralization of control sequences (CWE-150)

Attack vector: Network

User interaction: Required (clicking a crafted link)

Privileges required: None

Scope: Changed

Confidentiality impact: High

Integrity impact: Low to none

Availability impact: None

A separate CVE, CVE-2026-24307, addressed a different information disclosure issue in Microsoft 365 Copilot and is unrelated to the Reprompt root cause.

Microsoft’s Defense-In-Depth Response

Microsoft patched the Reprompt vulnerability in January 2026 and described a broader defense strategy against indirect prompt injection.

Key elements in Microsoft's security guidance include:

Hardened system prompts to reduce the likelihood that models follow instructions embedded in untrusted content

Spotlighting techniques that mark or encode untrusted input so models can distinguish it from user instructions

Prompt Shields, classifier-based detection integrated with Azure AI Content Safety and Defender for Cloud

Deterministic blocking of known exfiltration channels such as malicious image tags or markdown links

Human-in-the-loop controls for high-risk actions, requiring explicit user approval

Microsoft characterized indirect prompt injection as an inherent risk of probabilistic language models and positioned mitigation as a layered control problem rather than a single fix.

Enterprise Impact and Recommended Actions

Who was affected

Users of Microsoft Copilot Personal on consumer devices

Any user who clicked a malicious Copilot link during the vulnerability window

Who was not affected

Microsoft 365 Copilot enterprise tenants, which enforce tenant boundaries, permission scoping, and Purview DLP controls

Recommended enterprise actions

Verify January 13, 2026 Patch Tuesday deployment on systems running Copilot Personal

Issue user guidance discouraging work-related use of Copilot Personal

Mandate Microsoft 365 Copilot for enterprise AI use cases

Enable and test Microsoft Purview DLP policies for Copilot

Apply sensitivity labels to high-risk content

Monitor audit logs for anomalous Copilot access patterns

Review data sharing and permission hygiene to reduce inherited access risk

Risks and Open Questions

Indirect prompt injection remains unresolved at a foundational level. Microsoft's mitigations reduce risk but are probabilistic, not absolute.

Consumer AI tools continue to operate outside enterprise device management and monitoring. Future vulnerabilities may have similar disclosure-to-patch windows.

Reprompt also raises unresolved attribution questions. When an AI acts on smuggled instructions using valid credentials, distinguishing user intent from system behavior becomes technically and legally complex.

The Reprompt Attack on Microsoft Copilot