There's a well-known pattern with new drivers.
Months one and two: nervous, mirror-checking, full stops at every sign. Month three: comfortable, flowing — starting to feel like the checking is unnecessary. That's when the first avoidable accident happens. Not because the driving got worse. Because it got just good enough to feel like the edges didn't matter.
Something similar happens around month three of working seriously with an agent.
You stop noticing when a task type became normal. Tasks you used to review carefully, you now review lightly, then barely. Not because you decided to stop, but because the results kept being fine. The habit eroded through reinforcement.
One day you look back and realize: what you're delegating today, you would have handled yourself a year ago.
That's not purely progress. It might also be the moment you've walked into the trap.
01 Confidence Grows From Outcomes. Calibration Grows From Understanding Failure. They're Not the Same Thing.
When you work with an agent over time, you build two things: confidence and calibration.
Confidence grows from outcomes. Every time the agent does something right, you feel a little more certain. That's natural — it's how brains learn from positive feedback.
Calibration grows from understanding how the agent fails. Not just "it got that one wrong" — but why it got it wrong, which kinds of tasks tend to go sideways, what signals appear before it fails. That kind of knowledge doesn't come automatically from a run of good results.
After three months, you typically have many more good outcomes than failure-mode insights. Confidence climbs fast. Calibration climbs much more slowly.
The gap between those two curves is where stage three lives.
People in stage three aren't careless. They trust based on evidence — it's just that their evidence is mostly "it has worked many times." That's not enough when you start delegating into territory you haven't tested.
02 Four Behaviors That Signal Confidence Has Outrun Calibration
These don't show up loud. They creep in.
You don't need all four. If you recognize one of these and it feels familiar — that's enough to pause and ask.
03 Getting Out Isn't About Being More Careful. It's About Updating Your Model Faster.
The common misconception: escaping stage three means verifying more, slowing down. It doesn't.
Escaping stage three means the rate at which you learn failure modes keeps up with the rate at which you accumulate good results. You still move fast — you just update your mental model faster every time you encounter a surprising result, in either direction.
Two practical habits:
Run a small test before delegating the real task. The point is not to test the agent but to test your own calibration. If the small test surprises you in any direction, you don't have enough of a model yet to delegate the real task at high stakes. Do another small loop first before scaling up.
Ask why after every surprise: not only when the agent fails, but also when it succeeds in ways you didn't predict. Every surprise is a chance to update your model, not just note the outcome. This habit is what makes calibration grow fast enough to close the gap.
These two habits don't slow you down significantly. They close the gap between confidence and calibration over time instead of letting it widen.
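If you like keeping notes in code, the two habits can be sketched as a tiny prediction log: write down your guess before each small run, record what happened, and look at how often you were surprised per task type. This is only an illustration, not a prescribed tool. The names `Entry`, `surprise_rates`, and `ready_to_scale`, the task labels, and the thresholds (`min_runs=3`, `max_surprise=0.2`) are all assumptions made up for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entry:
    task_type: str      # hypothetical label, e.g. "refactor"
    predicted_ok: bool  # your guess, written down before the run
    actual_ok: bool     # what actually happened


def surprise_rates(log):
    """Fraction of runs per task type where the outcome differed from the guess."""
    totals = defaultdict(int)
    surprises = defaultdict(int)
    for e in log:
        totals[e.task_type] += 1
        if e.predicted_ok != e.actual_ok:
            surprises[e.task_type] += 1
    return {t: surprises[t] / totals[t] for t in totals}


def ready_to_scale(log, task_type, min_runs=3, max_surprise=0.2):
    """Rough heuristic (assumed thresholds): enough small runs, few surprises."""
    runs = [e for e in log if e.task_type == task_type]
    if len(runs) < min_runs:
        return False  # no track record in this domain yet
    surprised = sum(e.predicted_ok != e.actual_ok for e in runs)
    return surprised / len(runs) <= max_surprise


log = [
    Entry("refactor", True, True),
    Entry("refactor", True, True),
    Entry("refactor", True, True),
    Entry("sql-migration", True, False),  # unexpected failure
    Entry("sql-migration", False, True),  # unexpected success counts too
]

print(surprise_rates(log))
print(ready_to_scale(log, "refactor"))
print(ready_to_scale(log, "sql-migration"))
```

Note that a surprise counts in both directions, and that a new task type starts with no runs at all, so a good record on one task type says nothing about another.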
04 Every New Domain Is a New Month Three
The trap doesn't just happen once. Every time you expand into a new domain, a different task type, or a different agent — the cycle can start again.
Confidence rebuilds fast, borrowing from patterns in the old context. Calibration for the new domain starts from scratch. The gap opens.
Knowing that isn't a reason to be anxious. It's a reason to recognize when you're in a new cycle — and to reach for small test runs rather than relying on reputation earned somewhere else.
The gap between confidence and calibration is a normal part of learning. The dangerous version isn't having the gap. It's not knowing the gap is there — and stepping across it as if it isn't.