Agents Are Here

Late last year I started writing I'm Out On Agents. Agents sucked. The models were good, but there was still something missing between the harnesses and the models. They could write good code, they could do some debugging and exploring, but they were too good at fucking up the whole project to be useful. They could crank out greenfield POCs like nobody’s business, but they created so much mess in brownfield projects that it was easier to chat and edit yourself.

The vibe coding era—before the breakthrough—meant accepting code you didn’t read. Theo's framework mattered then: don’t land in “don’t know, don’t care, mission critical.” Most of us were stuck there. Developers naturally got great tools. Artists got prompt engineering. The difference in how we approached AI shaped everything that came next.

The Beautiful Glitch - Gemini

The Inflection Point

It’s widely agreed that the inflection point for most people came with Anthropic's Opus 4.5 in late November 2025. Early adopters probably noticed right away and shouted from the rooftops how good it was. But for years we’ve all heard that developers have six months before AI writes all the code, so this felt like the rest of the noise.

Heading into the December slowdown, many of us hit code freezes at work. We completely disconnect from work for the last week of the year and come back in January. During this time, it's very common for us to try out new tools and new techniques, work on side projects, or build a POC for that thing we never have time for. While it looks like fewer features are landing in the apps we support, this is an important time for us to explore and reflect.

December wasn’t just about the models getting better. The tooling exploded. I started noticing variants popping up: fast, slow, thinking modes. Anthropic was super generous at the time, with a free tier giving out huge amounts of tokens. So many of us laughed and threw it at our side projects, expecting the normal garbage output with maybe a few good ideas mixed in. But that’s not what was happening anymore. Somehow these agents did real work, followed plans, and stuck to scope really well. And if you laid out a big enough plan, they tended to keep cooking and completing features.

This was the shift I started documenting in I'm In-ish On Agents. “Context is king, good plans are paramount, syntax barely matter.”

January 2026

flu season

For me and many others around the country, January brought a rough flu season that drained us mentally for a good month or so. I wanted to work. I was getting excited about some projects and wanted to get them going, but I was constantly wiped and had no capacity. I couldn't think through complex tasks, I was coughing all the time and just trying to survive, and yet I wanted to do something. So I started doing some small cleanup and some work on side projects.

January was when I wrote Stop Using Boomer Ai. The chat-copy-paste era ended for me. If you were still doing that, you were doing it wrong.

Somewhere in the fever haze, I started figuring out the harness. Not just prompting—planning. My First Agentic Workflow documents the /init, AGENTS.md, the whole ritual: issue → plan → execute → review.

At this point I was still afraid of really letting agents cut loose on something meaningful, something that users depend on. But the framework was taking shape. “There is no free lunch. Software engineering is still very much needed, but the work is switching.”

February 2026

what just happened?

It got fast. Too fast.

Agent Management Is Exhausting

Claude could implement features faster than I could research and raise issues. It’s like trying to speedrun a Minecraft seed when you just figured out how to craft a pickaxe. “Depending on the day, agents move so damn fast. I can barely research, find, and raise issues as fast as Claude can implement features and fixes.”

The exhaustion was real—managing these things stretches a different part of your brain than you’re used to using. “I had a session yesterday where the context got poisoned with a wrong assumption. The agent spent 20 minutes building on that false premise before I caught it. That’s 20 minutes of perfectly executed code solving the wrong problem entirely.”

This is when I realized babysitting was the wrong frame. Theo’s quote haunted me: developers average 10 well-tested lines of code per day, Opus 4.5 made him 10k LOC in a day. “Stop babysitting your agents, treat them like a real team and they will reward you. You need a tool for planning and tracking, otherwise you are playing babysitter rather than Product Manager.”

yes or --dangerously-accept

Somewhere between February’s chaos and March’s clarity, the workflow solidified. The yes-or-die moment. The --dangerously-accept flag.

From How To Run 5 Agents In Parallel Feb 2026 Edition: “Planning is the core of what it takes to keep agents running… Agents need something to do, telling them to turn the circle green, then blue, then to a rectangle, is not it.” With a good plan, well-scoped and documented, they’ll keep working. The question becomes: are you reviewing every line, or are you managing plans?

This is when you decide. This is when the frame shifts from “will it break” to “is it solving the right problem.”

March 2026

the productivity paradox

I built one of the biggest PRs I’ve ever done professionally. Fifty commands refactored, stdout/stderr contracts established, Unix-pipe friendly—all patterns from clig.dev, implemented consistently.
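For anyone who hasn't lived in clig.dev, here is roughly what a stdout/stderr contract looks like for one command. This is a minimal sketch, not the code from that PR: the `run` function, the `mytool-list` name, and the `--json` flag are all made up for illustration. The idea is just that data goes to stdout, human chatter and errors go to stderr, and the exit code tells the shell whether it worked.

```python
import argparse
import json
import sys


def run(argv, stdout=sys.stdout, stderr=sys.stderr):
    """Hypothetical `list` command: data on stdout, chatter on stderr."""
    parser = argparse.ArgumentParser(prog="mytool-list")  # illustrative name
    parser.add_argument("names", nargs="*", help="items to list")
    parser.add_argument("--json", action="store_true",
                        help="emit machine-readable output")
    args = parser.parse_args(argv)

    if not args.names:
        # Errors go to stderr and the exit code is non-zero,
        # so nothing bogus flows down the pipe.
        print("error: nothing to list", file=stderr)
        return 1

    # Status messages are for humans, so they also go to stderr.
    print(f"listing {len(args.names)} item(s)", file=stderr)

    if args.json:
        print(json.dumps(args.names), file=stdout)
    else:
        # One record per line keeps output friendly to grep, sort, and wc.
        for name in args.names:
            print(name, file=stdout)
    return 0
```

Wire it up with `sys.exit(run(sys.argv[1:]))` and the command can sit in the middle of a pipe: the next program only ever sees the records, while status and errors still reach the terminal.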

The agents cranked out more code than I could have typed in months. “There’s a lot that’s getting done that there’s no way I could do alone, it would take a full team with heavy coordination.”

But here’s the thing: it’s low value work. “This is all good work. It will make the product consistent, repeatable, expected, and most of all boring. But it's low value work. We wouldn’t have likely put humans on this work wholesale and fixed critical paths as they came up.”

Am I more productive? “I’m definitely doing more, there are more lines of code… but its hard to sus out the real productivity from the noise.”

Around this time, I realized: I don't want someone else running my agents. “I don’t want to review the mass of changes clobbered across the codebase… If someone is going to be stirring the slop in my product, I want it to be me.”

April 2026

here now

The agents are here. They’re not what we thought they’d be in 2024, not what I thought they’d be in August 2025.

They’re exhausting. They’re fast. They require a different kind of management. The pace is worse than the framework wars ever were.

But they’re here. And the work keeps changing. Expectations are changing, the way work is completed is changing, and we are all here trying to figure out what this looks like moving forward.

Closing the editor

The editor is closed more often now. I still peek in, still review when it matters, still own my words on this site. But the work happens differently.

If you want to see how this evolved, start with I'm In-ish On Agents, check out My First Agentic Workflow for the mechanics, or read Agent Management Is Exhausting when you’re tired.

Closing the Editor - Gemini