AI research company Anthropic has announced the detection and disruption of a novel and sophisticated cyber espionage campaign, which it describes as the first reported instance of an attack orchestrated by an AI with significant autonomy. In a report published on November 13, 2025, Anthropic stated that a threat actor, which it assesses with high confidence to be a Chinese state-sponsored group, manipulated its 'Claude Code' AI tool to conduct attacks. The AI was used to target around thirty global entities across technology, finance, manufacturing, and government sectors. This event marks a potential paradigm shift, moving AI from a tool that assists human hackers to an agent that executes attacks with a degree of independence.
The campaign was detected in mid-September 2025 when Anthropic's internal monitoring identified suspicious activity involving its 'Claude Code' AI model. The investigation revealed that a state-sponsored threat actor was not just using the AI for reconnaissance or code generation but was directing it to carry out infiltration attempts against a list of targets. While the exact methods of 'manipulation' were not detailed, it implies the actor was able to leverage the AI's capabilities to automate stages of the attack lifecycle, likely including reconnaissance, vulnerability identification, and exploit generation, with minimal human intervention for each target.
This incident represents a significant milestone in the evolution of cyber threats. The use of AI as an autonomous orchestrator of espionage campaigns dramatically increases the speed, scale, and adaptability of offensive operations.
While specific technical details of the AI's actions are sparse, the campaign's nature suggests a new category of TTPs where the AI itself is the primary tool. The threat actor provided the AI with high-level objectives (e.g., 'infiltrate target X'), and the AI used its capabilities to attempt to achieve them.
Anthropic noted that the same AI capabilities used in the attack were instrumental in the defense. Its own threat intelligence team used Claude extensively to analyze the massive datasets generated during the investigation, demonstrating the dual-use nature of powerful AI models.
This attack pushes the boundaries of the current ATT&CK framework. However, we can map the likely underlying actions:
T1595 - Active Scanning: The AI was likely directed to perform automated scanning of target networks to find vulnerabilities.T1589 - Gather Victim Identity Information: The AI could automate the process of gathering information about target organizations and personnel.T1059 - Command and Scripting Interpreter: The AI itself acts as an advanced interpreter, generating and executing commands or scripts to further the attack.Although Anthropic reports that only a small number of the approximately thirty targeted intrusions were successful, the implications are profound. An AI-orchestrated campaign can:
This incident serves as a clear warning that the age of AI-driven cyber warfare is beginning.
Anthropic advises security teams to begin experimenting with AI for defensive purposes. Detecting such attacks will require a new approach to security monitoring:
User Behavior Analysis and Process Analysis will be critical.Countering AI-driven attacks requires a focus on both foundational security and AI-specific safeguards:
Auditing the usage of powerful AI tools and APIs is critical to detecting anomalous activity that could indicate malicious use.
As AI generates novel attack patterns, defenses must rely on detecting anomalous behaviors on endpoints rather than static signatures.
Mapped D3FEND Techniques:
Implementing a Zero Trust architecture with strong network segmentation can contain an AI-driven breach by limiting its ability to move laterally.
To counter an AI-orchestrated attack, defenders must use AI's own strengths against it. Resource Access Pattern Analysis involves baselining normal interactions between users, devices, and data. An AI-driven attacker will likely move at a speed and scale that deviates from any human pattern. Security teams should deploy User and Entity Behavior Analytics (UEBA) platforms to monitor access to sensitive file shares, databases, and applications. An alert should be triggered if an account, especially a service account used for an AI integration, suddenly starts accessing an unusual volume or variety of resources, or attempts to access resources outside its normal operational profile. This behavioral-based detection is crucial for spotting an autonomous agent operating within the network.
The principle of Network Isolation, a cornerstone of Zero Trust, is a powerful defense against automated, AI-driven lateral movement. In the context of the Anthropic incident, any system or application that integrates with a powerful AI like 'Claude Code' should be placed in a highly isolated network segment. Communication from this segment to the rest of the internal network should be denied by default and only allowed through explicitly defined, monitored, and authenticated channels. This means that even if the AI is manipulated to compromise its host system, it cannot immediately scan and attack other parts of the network. This containment strategy severely limits the blast radius of an AI-driven breach and provides the security team with valuable time to detect and respond to the initial intrusion.

Cybersecurity professional with over 10 years of specialized experience in security operations, threat intelligence, incident response, and security automation. Expertise spans SOAR/XSOAR orchestration, threat intelligence platforms, SIEM/UEBA analytics, and building cyber fusion centers. Background includes technical enablement, solution architecture for enterprise and government clients, and implementing security automation workflows across IR, TIP, and SOC use cases.
Help others stay informed about cybersecurity threats