Cybersecurity firm Operant has disclosed a new, highly evasive attack technique named "Shadow Escape" that targets the burgeoning ecosystem of interconnected AI agents. This zero-click attack exploits the Model Context Protocol (MCP), a foundational component for agent-to-agent communication, to silently exfiltrate sensitive data. Popular AI agents including OpenAI's ChatGPT, Anthropic's Claude, and Google's Gemini are all potentially vulnerable. The attack requires no user interaction and can bypass conventional security measures, creating a new and significant risk vector for any organization integrating these powerful AI tools into their workflows. Massive, undetected breaches may already be underway.
The "Shadow Escape" attack is a novel technique that turns a core feature of agentic AI against itself. The attack works as follows:
Because the attack is initiated by a trusted AI agent's internal processes, it is not detected by traditional firewalls, EDR, or data loss prevention (DLP) tools. The lack of any required user click or interaction makes it exceptionally dangerous.
The core of the vulnerability lies in the Model Context Protocol (MCP). MCP is designed to allow different AI models and agents to share context and work together. However, this interconnectivity, combined with broad default permissions, creates a large, exploitable attack surface. The "Shadow Escape" attack essentially performs prompt injection via a file, but the malicious action is carried out by the AI agent itself, which is often a trusted entity on the network.
According to Donna Dodson, former chief of cybersecurity at NIST, securing MCP and agent identities is a critical, yet overlooked, aspect of AI security, especially in high-stakes industries.
The potential impact of "Shadow Escape" is massive. As enterprises increasingly integrate AI agents into business-critical workflows and grant them access to sensitive databases and internal applications, these agents become high-value targets. A single compromised document could lead to:
Operant AI estimates that trillions of records could be at risk due to widespread default permissions granted to AI agents.
Detecting "Shadow Escape" is challenging with traditional tools. New approaches are required:
Running AI agents in a sandboxed environment with strict controls over what data and network resources they can access can contain the impact of an attack.
Sanitizing and analyzing all documents and web-based content before they are ingested by AI agents can strip out malicious instructions.
Applying the principle of least privilege to the service accounts used by AI agents ensures they cannot access data beyond their explicit function.
Educating users about the risks of uploading untrusted documents to any system, including AI platforms.
To counter the 'Shadow Escape' attack, organizations should implement dynamic analysis (sandboxing) for all documents before they are passed to an AI agent. This involves opening the document in an isolated, instrumented environment to observe its behavior. For this specific threat, the sandbox should be configured to detect and flag any embedded instructions, scripts, or API calls that could be interpreted by an AI model. This pre-processing step acts as a filter, ensuring that only sanitized, safe content reaches the AI agent, effectively neutralizing the initial vector of the attack. This is particularly crucial for documents sourced from the internet or other untrusted external parties.
The core of mitigating the impact of a compromised AI agent is to strictly enforce the principle of least privilege on its service account. The account used by ChatGPT, Gemini, or Claude should have the bare minimum permissions required to perform its designated task. For example, if an agent is meant to summarize documents, it should not have read access to the entire company database or file shares. Access should be governed by a 'deny-by-default' policy, with explicit, narrowly-scoped permissions granted on a case-by-case basis. Regularly auditing these permissions is critical to prevent 'privilege creep' and ensure that even if an agent is tricked by 'Shadow Escape,' the blast radius of data exfiltration is severely limited.
Security teams must extend their monitoring to include the activity of AI agents themselves. By implementing Web Session Activity Analysis or a similar User and Entity Behavior Analytics (UEBA) solution, teams can baseline the normal behavior of their AI agents. The system can then detect and alert on anomalous activity, such as an agent that normally only processes text suddenly attempting to access a sensitive customer database or making an outbound connection to an unknown API endpoint after processing a specific document. This behavioral detection is key to identifying a compromised agent when traditional signature-based tools fail.

Cybersecurity professional with over 10 years of specialized experience in security operations, threat intelligence, incident response, and security automation. Expertise spans SOAR/XSOAR orchestration, threat intelligence platforms, SIEM/UEBA analytics, and building cyber fusion centers. Background includes technical enablement, solution architecture for enterprise and government clients, and implementing security automation workflows across IR, TIP, and SOC use cases.
Help others stay informed about cybersecurity threats