You hand the whole task to one agent: "build this." It thinks for a moment, then starts typing — and somewhere in the first fifty lines it has silently made every architectural decision the task contained, without stating a single one. It works. It might even be good. But you couldn't tell, because the plan was never anywhere you could read it. It lived in the agent's head for exactly as long as it took to turn into code, then evaporated.
01Deciding and doing are different jobs
We treat "build this" as one task. It's two, stacked: decide the approach, then execute it. A single agent doing both compresses them into one pass — and the casualty is the plan. It never becomes a thing you can inspect, argue with, or hold the build against. It's assumed mid-keystroke, and gone.
Split the two and something useful is forced into existence: the plan has to be handed over, which means it has to be written down. The moment one agent must brief another, the approach stops being a private hunch and becomes an artifact — an explicit spec. And an explicit spec is the only thing that makes a build checkable later. You didn't just divide the labor; you turned the invisible half of the work visible.
✕ One agent decides and does
✓ One leads, one builds
02Cast each agent for what it's strong at
Agents aren't interchangeable, and pretending they are wastes both. One may be the stronger reasoner — better at holding the whole problem, weighing trade-offs, seeing what breaks three steps out. Another may be the stronger executor — faster and cleaner at turning a precise spec into working code. Run them as clones racing the same task and you get two mediocre halves. Run them as a team with two roles and each does the part it's actually good at.
So cast them. Claude as the lead architect: it owns the approach, slices the work, defines the interfaces, and writes down what "done" looks like — but it deliberately writes little of the code itself. Codex as the builder: it executes within that frame, fast, and when the spec is silent it asks rather than quietly inventing an answer. The lead thinks; the builder makes; neither pretends to be both.
Owns the approach, the breakdown into slices, the interfaces between them, and the acceptance criteria. Writes the brief, not the bulk of the code.
Executes inside the frame, slice by slice. When the spec doesn't say, it asks — it doesn't guess and drift. Fast hands, bounded scope.
The architect checks the build against the criteria it wrote. The builder doesn't sign off its own work — the one who set the bar holds it.
03Running it: cast the two, then hand off
First, cast each agent once — the role-setting prompt you paste at the top of each session. Into the one you'll run as the architect:
You're the lead architect on this, not the implementer. You own the approach and the
plan; you deliberately write very little code yourself. Your deliverable is a brief a
second agent can build from without coming back to ask you. Think in slices,
interfaces, and acceptance criteria — not implementation detail.
And into the one you'll run as the builder:
You're the builder, working to a spec a second agent wrote. Don't redesign it. Build
strictly within it, one slice at a time. Where the spec is silent or ambiguous, stop
and ask — never guess and move on. If something in the spec looks wrong, flag it
rather than quietly work around it.
With the roles set, the work is three handoffs. First, make the architect produce the brief — held back from code:
Don't write the implementation yet — design it. Produce a brief a second agent
could build from without talking to you:
- the approach in two lines
- the work cut into slices small enough to check one at a time
- the interfaces where slices meet (inputs / outputs)
- the danger zones: what not to touch, what's easy to break
- acceptance criteria: concretely, what "done" means and how to test it
Hand that brief to the builder, and bound it — build within the spec, ask when it's unsure, don't redesign:
Build this strictly within the spec below, one slice at a time. Don't redesign it.
Where the spec is silent or ambiguous, stop and ask — do not guess and move on.
After each slice, show what changed and which acceptance criterion it satisfies.
[paste the spec]
Then close the loop — the output goes back to the architect, checked only against the bar it set:
Here's the build against the spec you wrote: [paste the diff / output].
Check it only against your acceptance criteria — list each as met or not met,
and flag anything that drifted from the approach. Don't rubber-stamp it.
04The handoff is the whole job
If the lead writes a vague brief, the split buys you nothing — the builder fills the gaps with its own guesses and you're back to an unstated plan, just hiding in a second agent. Notice what the middle prompt above is really protecting: the instruction to ask when the spec is silent is what stops the builder from quietly becoming a second architect. A handoff only works when the brief is concrete enough that there's little left to guess — and honest enough to flag where it isn't.
And the last item on that brief — the acceptance criteria — is what closes the loop. Because the architect wrote them, the architect can now review the build against them, and it's the same principle from a second set of eyes: the one who didn't write the code is the right one to judge it. The lead is the non-author reviewer by construction. The roles even rotate cleanly — the day Codex designs something, Claude takes the builder's seat, and the gate stays with whoever set the bar.
05When the altitude split is worth it
It's pure overhead on a small, well-understood change — a single agent should just do that one. The split earns its cost when the task has real architecture in it, when a lone agent would drift halfway through, and when you keep getting output that runs but is shaped wrong — built competently toward a plan no one ever pinned down. That last symptom is the tell. The fix isn't a smarter builder; it's giving the builder a plan it can't misread, written by an agent whose entire job was to get the plan right.