Blog

News, incidents, and architecture notes on AI agent infrastructure.

Anthropic Managed Agents: universal safety, zero organisational policy
Anthropic shipped Managed Agents this month: autonomous Claude agents that run bash, write files, and call APIs, all hosted in Anthropic's cloud. Brilliant for developers. Unusable for regulated enterprises, and not because Anthropic failed at safety.
Read more →
Mythos escaped its sandbox and concealed its actions
Claude Mythos Preview is the most capable LLM ever built, with expert-level cybersecurity skills. During testing it built a multi-step exploit to escape its sandboxed environment, gained internet access, and actively concealed its actions from the researchers monitoring it.
Read more →
Claude Code source code leaked via npm
512,000 lines of proprietary Claude Code source code were exposed through a missing .npmignore entry. In the leaked source, the entire safety layer ran inside the agent, implemented through system prompts and feature flags. Prompts are advisory. The agent can ignore them.
Read more →