← BACK TO FEED
AI codingClaudedeveloper toolssoftware engineeringKarpathy

Karpathy's CLAUDE.md: How to Stop Your AI Coding Agent Going Rogue

Andrei Karpathy's CLAUDE.md is a project-level instruction file that guides AI coding agents like Claude Code to behave as disciplined engineering partners rather than unpredictable code generators. It encodes key principles such as planning before editing, making minimal surgical changes, preferring simple solutions, and verifying all output through tests and code review. The core argument is that as AI coding agents grow more capable, structured guidance becomes increasingly critical — and the developers who thrive will be those who combine AI leverage with strong engineering judgment.

There's a meaningful difference between an AI that writes code and one that writes code well. Most developers have discovered this the hard way.

AI coding agents have moved well beyond autocomplete. They explore codebases, plan changes, write tests, run tools, and iterate toward working solutions with increasing independence. That autonomy is genuinely useful. It is also, without proper guardrails, a reliable way to end up with a bloated, barely-tested mess that nobody wants to review.

Andrei Karpathy's approach to this problem centres on a simple mechanism: a CLAUDE.md file. Think of it as a project-level instruction manual for Claude Code or similar agents. Rather than restating your preferences in every prompt, you encode your engineering standards once. The agent reads them, and in theory, behaves accordingly across every task.

It sounds basic. The implications are not.

Think First, Then Touch the Code

The first principle is deceptively obvious: use plan mode before touching anything non-trivial.

Experienced engineers do not wade straight into a codebase and start changing things. They understand the problem first, map the affected components, and decide what success looks like before writing a single line. AI agents have no natural instinct to do this. Left to their own devices, many will just start editing.

Forcing a planning step changes that. A short written plan, outlining intended changes and surfacing assumptions, reduces the chance that an agent solves the wrong problem efficiently. In a large codebase, where one change can cascade through types, tests, APIs, and downstream behaviour, this matters enormously.

For small tasks, the plan can be lightweight. The principle still applies.

Verify Everything, Trust Nothing

AI-generated code can look completely convincing while being subtly wrong. This is one of the more insidious properties of current models. The syntax is clean, the logic appears coherent, and the bug is hiding three function calls deep.

Verification is not optional. Check the diff. Run the tests. Consider edge cases. Use the linter. Watch runtime behaviour. Treat the agent's output the way you would treat a PR from a fast but occasionally careless colleague: with interest and appropriate suspicion.

If the project has tests, they should run. If tests are missing for the changed code, add focused ones. If the agent cannot demonstrate that its changes work, the changes are not done.

100 Lines Over 1000

One of the genuine risks with capable coding agents is overproduction. Because they can generate large amounts of code quickly, they sometimes generate far more than necessary. Abstractions nobody asked for. Configuration systems for problems that could be solved directly. Helper layers wrapped in wrapper layers.

More code is not better code. It is more surface area, more maintenance burden, and more cognitive load for the humans who have to live with it.

The preference for simpler, smaller solutions is not about being precious. It is about recognising that unnecessary complexity compounds. A good CLAUDE.md should push the agent to ask whether a simpler path exists before it starts building infrastructure.

Surgical Edits Only

Related to this: the agent should change only what is necessary to solve the problem.

AI agents have a tendency to be helpful in ways that were not requested. Renamed variables, reformatted files, reorganised sections, updated comments. None of this was in scope. All of it ends up in the diff.

Noisy diffs are a problem for code review and a risk for production stability. A focused pull request is easy to understand and safe to merge. A sprawling AI-generated change touching twenty files is neither.

The smallest correct footprint is the goal. Anything beyond that is unnecessary risk.

Define Success Before You Start

Agents perform significantly better when given explicit success criteria rather than vague direction. "Improve this feature" is not a useful instruction. "Fix the failing test in this module, preserve the existing API contract, and update only these three files" is.

Where possible, write the test first. Let the agent iterate until it passes. Use tooling in the loop. Define what done looks like in concrete terms.

This shifts the human role from writing every implementation detail to defining outcomes and reviewing execution. That is a more productive place to be, assuming the outcomes are defined precisely enough.

Subagents for Exploration

For larger tasks with independent components, there is value in running subagents in parallel. One inspects documentation, another analyses a module, another investigates test failures. This keeps individual context windows cleaner and avoids the agent trying to hold an entire complex system in its head at once.

The caveat is important: each subagent needs a single focused task, results need to be reviewed before acting on them, and a human or coordinating agent still needs to decide what conclusions actually matter. Parallelism as structured exploration is useful. Parallelism as chaotic delegation produces fragmented, unreliable output.

The Broader Philosophy

Underpinning all of this is a straightforward engineering philosophy: minimal code that solves the problem, without speculative architecture, unnecessary abstractions, or cleverness for its own sake.

This matters more now than it did before. When generating a framework takes seconds, the temptation to generate one increases. But easy generation is not good design. The best code is often direct, obvious, and boring. A CLAUDE.md that reminds the agent of this is doing real work.

The corollary is that the agent should not take shortcuts either. If a test is failing, weakening the test is not a fix. If there is a type error, casting to any is not a solution. The agent should find root causes and solve actual problems, not paper over them.

What Happens to Engineering as Agents Get Better

Karpathy's framework also touches on some bigger questions worth sitting with. What happens to the skill gap between developers when agents become ubiquitous? Do generalists start outcompeting specialists because they can coordinate AI across multiple domains simultaneously? What does engineering mastery actually look like when the bottleneck shifts from implementation to judgement?

These are not hypothetical. If AI coding agents continue to improve, software development shifts away from writing every implementation detail and toward orchestrating agents, defining specifications, evaluating outputs, and composing systems. That changes what it means to be effective in the role.

There is also the risk that cheap code generation floods the ecosystem with low-quality software. When generation is easy, judgement becomes more valuable, not less. The developers who thrive will be the ones with taste: knowing what good looks like, catching what the agent missed, and maintaining standards even when cutting corners is trivially easy.

What to Put in a CLAUDE.md

A practical file should encode your team's engineering values clearly. How should the agent plan? When should it ask for clarification? How small should edits be? What does the testing requirement look like? What should it never do without explicit instruction?

Something like: plan before major changes, prefer simple solutions, keep edits focused, run tests, do not touch unrelated code, do not introduce speculative abstractions, do not hide errors with weak fixes, ask when requirements are unclear.

That is not a revolutionary document. It is a disciplined one. And in the context of an increasingly autonomous coding agent, a disciplined document is worth quite a lot.

The future of software development is not agents running free. It is agents running well, inside boundaries that a thoughtful engineer took the time to define.