I hate how he called out terminal user interfaces as shit… then proved web interfaces to be superior. Damn him. I love working from my terminal, but having AI prove itself through HTML reports including video, images, metrics, charts, and text is goated. Rethinking yourself as the bottleneck, not the orchestrator, feels real. Validating the work is hard; there’s a shift right now and everyone is trying to figure it out. Lucas’s technique is a little bit of being lazy and telling it to prove itself to you, so as you juggle your 15 agents you have a nice report to read.
Posts tagged: agents
This is a really good guide, with quite a few good nuggets. I need to try deleting my AGENTS.md and rebuilding it from scratch more often. I liked how he talked about having agents prove their work and telling them up front how they will be judged. What I didn’t care for so much was the feeling that a lot of the rules go in markdown. That’s not a rule, that’s a suggestion. Rules should be deterministic. They should be tests and linters that ensure they are followed. Suggestions are good, but don’t trust the agents to always follow them. And don’t trust that they won’t change your rules; keep them honest.
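Here is a minimal sketch of what "rules as tests, not suggestions" could look like. Everything in it is an assumption for illustration: the example rule (no `print(` calls), the function names, and the idea of pinning a hash of the rules file so you notice when an agent edits it. It is not taken from the guide.

```python
import hashlib
import re

# Hypothetical rule for illustration: no `print(` calls in library code.
FORBIDDEN = re.compile(r"\s*print\(")


def violations(source: str) -> list[int]:
    """Return the 1-based line numbers that break the rule."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if FORBIDDEN.match(line)
    ]


def digest(text: str) -> str:
    """SHA-256 of the rules text, used as a tamper check."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def rules_untouched(current_rules: str, pinned_digest: str) -> bool:
    """True only if the rules text still matches the digest you pinned."""
    return digest(current_rules) == pinned_digest
```

Run checks like these in CI and the rule either passes or fails, no matter how the agent felt about the markdown. The pinned digest only helps if it lives somewhere the agent cannot also edit.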
Agents Are Here
Late last year I started writing I'm Out On Agents. Agents sucked. The models were good, but there was still something missing between the harnesses and the models. They could write good code, they could do some debugging and exploring, but they were too good at fucking up the whole project to be useful. They could crank out greenfield POCs like nobody’s business, but they created so much mess in brownfield projects that it was easier to chat and edit yourself.
The 7 min read
Thinking about ai productivity again
Such a good interview. @lexfridman is such a talented interviewer. It’s so cool to see the other side of this. For weeks we’ve heard about the story of the name change, we’ve seen everyone shitting on the security model, buying up all the Mac minis in existence, fearmongering not to install this thing. @steipete has such a cool story from the beginning, talking about making this thing fun and exciting and giving it a personality that is not “You are absolutely right”. The story of changing the name twice, getting pwned at every step the first time and nailing it the second time, is incredible. Dude is having fun trying to make the thing he wants in the world exist.
Pm Not Babysitter
Stop babysitting your agents. Treat them like a real team and they will reward you.
Back in December I saw Theo make a comment that code is now cheap, it’s the run rate of models. He quoted a study, not sure that he even fully believed it, but it claimed that the average developer, after all the meetings, training, emails, planning, and extra shit in their day, averages out to 10 well-tested lines of code per day. Opus 3.5 made him 10k LOC (lines of code) that day.
We have all agreed for decades that lines of code is not a proxy for productivity or quality. Often more code means more risk, more review, more infrastructure. This has become MUCH different. Lines of code are still far from any sort of good metric. That aside, your agents are not doing 10k lines with you babysitting them, and in fact it’s very likely that the product quality is MUCH worse as you babysit them.
...
Agent Management Is Exhausting
The state of development in early 2026 is all wrapped around learning how to manage many agents running in parallel. Everyone’s trying to figure out the workflow.
The secret I’ve discovered is a good, well-defined plan. This could be a markdown file or a GitHub issue. Agents are actually great at writing these for you. They’ll include reproduction steps, outline changes needed, and structure the work.
This is your opportunity to step in. Read the plan. Look for hallucinations. Spot where it’s going off track. Edit the plan before the agent starts coding.
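A rough sketch of a pre-flight check on a markdown plan, so an incomplete one gets bounced before you spend time reading it. The required section names and the `missing_sections` helper are assumptions made up for this example; it only confirms the headings exist and does nothing to catch hallucinations, which is still your job.

```python
# Hypothetical section names; adjust to whatever your plans actually use.
REQUIRED_SECTIONS = ("reproduction steps", "changes needed")


def missing_sections(plan_markdown: str) -> list[str]:
    """Return the required sections that have no matching markdown heading."""
    headings = {
        line.lstrip("#").strip().lower()
        for line in plan_markdown.splitlines()
        if line.startswith("#")
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

If the list comes back empty, the plan at least has the right shape and is ready for a human read.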
...