# Building with AI Agents: What Actually Works
A few things I’ve learned after spending months shipping agent-assisted workflows in production.
## The mental model shift
The mistake most engineers make is treating AI agents like better autocomplete. They’re not. The right mental model is closer to a junior contractor who reads fast, never gets tired, and occasionally hallucinates API docs.
That mental model change has two practical consequences:
- You define the contracts, not the logic. Give the agent a clear interface: exact inputs, expected outputs, explicit failure modes. Don’t let it infer what you mean (there’s a sketch of this right after the list).
- Verification is your job. The agent ships fast. You verify. Every agent-written PR gets the same review as any other PR — probably more, because the agent won’t feel bad about it.
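Here’s roughly what that contract looks like in practice. This is a minimal sketch in Python; the task name, fields, and exception types are all made up for illustration, but notice that inputs, outputs, and failure modes are each spelled out:

```python
from dataclasses import dataclass, field

# Hypothetical contract for one agent task: exact inputs, expected
# outputs, and the only failure modes the agent may report.
@dataclass
class SummarizeIssueTask:
    issue_url: str        # exact input: one GitHub issue URL
    max_words: int = 200  # exact input: hard cap on summary length

@dataclass
class SummarizeIssueResult:
    summary: str                                        # expected output
    open_questions: list[str] = field(default_factory=list)

# Explicit failure modes: the agent raises one of these instead of guessing.
class IssueNotFound(Exception): ...
class IssueTooLarge(Exception): ...
```

The exceptions are the part people skip. If the agent has no sanctioned way to fail, it improvises one, and that’s where the hallucinated answers come from.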
## What I’ve actually shipped
A few patterns that held up:
File-based workflows — anything that’s just read/write on a predictable structure (like this blog). The agent knows the schema, pushes a file, done. Zero surprises.
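For this blog, that guardrail is a schema check that runs before anything ships. The `posts/` layout and the front-matter keys below are hypothetical, but this is the whole mechanism:

```python
from pathlib import Path

# Hypothetical front-matter keys every post file must declare.
REQUIRED_KEYS = {"title", "date", "slug"}

def validate_post(path: Path) -> None:
    """Reject any agent-written post missing required front matter."""
    text = path.read_text()
    head = text.split("---")[1] if text.startswith("---") else ""
    keys = {line.split(":")[0].strip() for line in head.splitlines() if ":" in line}
    missing = REQUIRED_KEYS - keys
    if missing:
        raise ValueError(f"{path}: missing front matter {sorted(missing)}")

# Check every post before committing.
for p in Path("posts").glob("*.md"):
    validate_post(p)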
Scaffolding, not architecture — agents are excellent at creating the 80% skeleton of a new service or feature. They’re poor at architectural decisions that require context spanning multiple years of system history. Use them for the former, own the latter yourself.
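To make that split concrete, here’s a hypothetical sketch of the kind of skeleton an agent produces well. The plumbing is done; the TODOs mark the decisions that need actual system history:

```python
import json
from dataclasses import dataclass

@dataclass
class ExportRequest:
    user_id: str
    format: str  # "csv" or "json"

def fetch_rows(user_id: str) -> list[dict]:
    # Stubbed data access. Choosing the real source is the human's call.
    return [{"user_id": user_id, "event": "example"}]

def handle_export(req: ExportRequest) -> str:
    # TODO(human): pick the data source. If two stores disagree on old
    # records, that decision needs history the agent doesn't have.
    rows = fetch_rows(req.user_id)
    if req.format == "json":
        return json.dumps(rows)
    # Minimal CSV path; column selection is another human decision.
    header = ",".join(rows[0].keys())
    lines = [",".join(str(v) for v in row.values()) for row in rows]
    return "\n".join([header, *lines])
```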
Research synthesis — give an agent a set of docs, a GitHub issue, and a question. Get back a structured answer. This alone saves an hour a day.
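The structure is the point: free-form prose is hard to verify, so I ask for a fixed shape with sources attached to claims. A hypothetical version of that shape:

```python
from dataclasses import dataclass, field

# Hypothetical answer structure. Requiring sources per claim makes the
# output verifiable instead of merely plausible.
@dataclass
class SynthesizedAnswer:
    question: str
    answer: str                                         # short, direct answer
    supporting_claims: list[str] = field(default_factory=list)
    sources: list[str] = field(default_factory=list)    # doc / issue URLs
    unknowns: list[str] = field(default_factory=list)   # what the docs didn't cover
```

The unknowns field earns its keep: an agent forced to list what it couldn’t find is far less likely to paper over the gap.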
## What doesn’t work
- Long-horizon tasks with ambiguous checkpoints
- Anything that requires institutional memory the agent doesn’t have
- Decisions involving trade-offs you haven’t made explicit
## The actual workflow
For this blog specifically: I open Claude Code, describe what I want to write, and it scaffolds the post. I edit, push. The whole loop is under five minutes once you have the repo set up right.
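Most of that friction removal is a small script. Here’s a hypothetical version of the post-edit half of the loop; it assumes a git repo, and the checks and commit-message format are invented:

```python
import subprocess
from pathlib import Path

def publish(post_path: str) -> None:
    """Sanity-check the agent's file, then commit and push it."""
    path = Path(post_path)
    text = path.read_text()
    # Cheap checks catch the common agent mistakes before they ship.
    assert text.strip(), "empty post"
    assert text.startswith("---"), "post needs front matter"
    subprocess.run(["git", "add", post_path], check=True)
    subprocess.run(["git", "commit", "-m", f"post: {path.stem}"], check=True)
    subprocess.run(["git", "push"], check=True)
```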
That’s the real unlock — not the AI capability, but removing the friction around it.