Claude Code sandbox bypass vulnerability disclosure prompt injection Anthropic

Anthropic Quietly Fixed a Claude Code Sandbox Bypass Nobody Told You About

20 May 2026

Anthropic quietly fixed two vulnerabilities in Claude Code's network sandbox that could have allowed attackers to bypass network restrictions and exfiltrate sensitive data. The second flaw, discovered by researcher Aonan Guan, involved a SOCKS5 null-byte injection trick that could fool the allowlist filter into permitting connections to unauthorized hosts. Guan has criticized Anthropic for lacking transparency, noting no CVE was assigned to his finding and no public disclosure or release notes warned users — though Anthropic states the fix was deployed before his bug bounty report was submitted.

A security researcher has gone public with details of a Claude Code vulnerability that let attackers punch through the tool's network sandbox, after Anthropic fixed the bug without assigning it a CVE or mentioning it in any release notes.

Claude Code's sandbox is supposed to restrict outbound network traffic to an approved list of hosts. Everything else gets blocked. Researcher Aonan Guan found a way around that using a SOCKS5 hostname null-byte injection trick.

The attack is straightforward in hindsight. Suppose a policy allows connections only to *.google.com. An attacker crafts a hostname like attacker-host.com\x00.google.com. The filter sees the .google.com suffix at the end and waves it through. The operating system, however, truncates at the null byte and dials attacker-host.com instead. The sandbox never knew what hit it.

Guan says the flaw was present from October 20, 2025, when the sandbox became generally available, until it was quietly fixed somewhere around version 2.1.88 or 2.1.90 in late March or early April. He submitted a bug bounty report through HackerOne on April 3. Anthropic marked it as a duplicate, saying its own team had already caught and patched the issue before his report landed, with a fix committed to the sandbox-runtime repository on March 27 and shipped on March 31.

Anthropic's timeline may well be accurate. But Guan's frustration is understandable regardless. No CVE was assigned to the flaw in Claude Code itself. The release notes said nothing. Users running the sandbox in production had no way of knowing the protection they were relying on had a hole in it, and no way of knowing after the fact that it ever did.

To make that worse, a separate sandbox vulnerability, tracked as CVE-2025-66479, was assigned to the sandbox-runtime library rather than to Claude Code directly. That one involved the sandbox misinterpreting a block-all-traffic setting as allow-everything, which is about as bad as sandbox failures get. It was fixed on November 26, 2025. Again, Claude Code users received no direct notification.

Guan also points out how damaging this class of bypass could be when combined with prompt injection. He recently disclosed a technique called Comment and Control, which demonstrated that AI coding agents including Claude Code's security review tooling could be hijacked via specially crafted GitHub comments, pull request titles, and issue bodies. Chain a prompt injection attack with a sandbox bypass and you have a route to exfiltrating environment variables, credentials, tokens, and infrastructure details without the user seeing anything suspicious.

Anthropic says it appreciates the research. It also maintains it got there first. Both things can be true, and the dispute over credit is arguably less interesting than the broader question of why users were left in the dark.

READ NEXT

Anthropic quietly buried hidden tracking code in Claude Code. Now it's removing it.How a Lobster Mascot and a Bunch of Obsessives Dragged the AI Agent Era Into Existence Capital One Releases AI Vulnerability Hunter to the Public