I hate how he called out terminal user interfaces as shit… then proved web interfaces to be superior. Damn him. I love working from my terminal, but having AI prove itself through HTML reports including video, images, metrics, charts, and text is goated. Rethinking yourself as the bottleneck, not the orchestrator, feels real. Validating the work is hard. There’s a shift right now and everyone is trying to figure it out. Lucas’s technique is a little bit of "be lazy and tell it to prove itself to you", so as you juggle your 15 agents you have a nice report to read.
Posts tagged: thought
All posts with the tag "thought"
This is a really good guide, with quite a few good nuggets. I need to try deleting my AGENTS.md and rebuilding it from scratch more often. I liked how he talked about having agents prove their work and telling them up front how they will be judged. What I didn’t care for so much was the feeling that a lot of the rules go in markdown. That’s not a rule, that’s a suggestion. Rules should be deterministic. They should be tests and linters that ensure they are followed. Suggestions are good, but don’t trust the agents to always follow them. And don’t trust that they won’t change your rules, keep them honest.
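To make the "rules should be deterministic" point concrete, here is a minimal sketch in Python. The rule itself (no bare print() calls) and the names find_print_calls and config_fingerprint are hypothetical, picked only for illustration and not taken from the guide. The first function is a check that either passes or fails, and the second is one possible way to notice when an agent has quietly edited the rules.

```python
import ast
import hashlib


def find_print_calls(source: str) -> list[int]:
    """Return the line numbers of bare print() calls in Python source.

    Hypothetical rule for illustration: the code base must not call print().
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )


def config_fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of a rules/config file's contents.

    Comparing this against a value stored outside the agent's reach is one
    assumed way to detect that the rules themselves were changed.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Run from a test suite or CI job, a non-empty result from find_print_calls fails the build no matter what the markdown says, and a changed fingerprint flags an edited rule file for a human to review.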
Feeling this today, feels like everything continues to get worse. Trying to be more positive, and it’s hard.
Interesting take by Kenneth Reitz. Not quite sure how I feel about it anymore. It kinda hurts, but I’m not sure if code aesthetics matter as much as the product anymore. I cared when I was the one editing, but at this point I’m not doing a lot of edits by hand. Do these aesthetics affect the final products that users use? Not sure. AI makes me sad.
If agents make prime a bit faster, what does that mean for the rest of us mortals?
I’ve gotta agree with Bob on this one. The first thing I did to my biggest brownfield project I wanted to use agents on, BEFORE they did any work, was add a hardened pre-commit.yaml, CI, and hardened type checking and linting. SECOND, get rid of bad inconsistent patterns, let them replicate consistency, and force them to pass checks. Agents will follow all of your markdown suggestions most of the time, enough for you to become complacent if you let it. They are goal seeking. If you put them on a task you thought was possible but is not, given your constraints, they will try to find a way given enough tokens. I don’t see this ever changing. It’s one thing that makes them great, it just needs to be kept in check.
behind, yet positioned to completely dominate this race by hitting it with some sense. They are trending toward what looks like longevity in a race that is not about subsidizing to simply get users, but about getting by until they figure out how to reduce the cost 100x to a reasonable level. They feel like the guy sitting in the back with nothing big or flashy to say who is going to drop the hammer on the competition that overstretched itself, taking on too much debt because it was necessary to change the game. There might be something to having a mix of hipsters, boomers, and luddites all trying to balance each other out.
5 star video. If you are going to watch one video to understand how harnesses and agents work, this is it. This really had my gears spinning on what tools do for agents and how big of a difference they make in their ability to manage context efficiently and accurately create changes. It’s crazy how well bash works, and that gives the agents the ability to do just about everything, but it could be better.
One of the biggest scientific achievements of our lifetime happened this week. I will forever remember sitting in a Culver’s in between theater builds looking through these photos as they came live, looking at them in awe.
One of the most famous images from the shoot “Setting Earth”
What an amazing set of photos created by the Artemis II crew accompanying a fantastic breakdown by Hank Green.
I like this one, as it’s probably one of the ones not shared a ton.
Whole gallery is worth looking at https://www.nasa.gov/gallery/lunar-flyby/
A really interesting long form interview with @simonwillison. If you follow him closely most of it is probably not new, but I found some interesting nuggets.
Simon is writing most of his code from his phone these days using Anthropic’s hosted platform. He mentioned that a lot of security risks go away when you don’t put secrets on the platform and you let them take the risk of running AI-written code with an AI-chosen supply chain.
He talked about the Pelican Riding a Bike benchmark for quite a while. He was surprised at how good a proxy it is for how capable a model is at just about everything. He also said that when he runs the benchmark he also runs half a dozen others that he’s never talked about, so that if they were to train a model specific to his benchmark he could catch them, but it seems they had caught on and if they were they seem that they would already...
...
THIS is the future of homelab, excited to see someone who knows so much more about hardware than I do get excited about this.
Is Glasswing the next inflection point?
Absolutely sick texture app from CNC Kitchen. Like him, I’ve spent a bunch of time attempting and failing to learn Blender. I’m so glad someone else vibe coded such a good app that can just add texture to STLs with basic masks. It covers the very basics of what you would want to add to 3D prints to make them interesting, and I’m excited to use this for some real projects.
Bush on tiny desk. Iconic band on an iconic platform. Will be re-listening to this several times.
smassh is the coolest monkeytype TUI clone, it’s impressively accurate. Easy to install and run, all the same themes appear to be there and everything. Maybe a good way to get a few reps in while agents are running these days.
I need to go back and brush up on my skills, I’m down a good 20 wpm from what I should be doing.
I’ve been thinking about this for a while and Daniel makes some great arguments here. Interestingly, keeping inference cheap removes the incentives to make our tools better, to help us choose the right model, and to lean on local and open weight models. The frontier models are so affordable through subsidized subscription models, why would you deal with anything less intelligent at this point? The tooling we use is not optimized for it, and why should it be?
emacs config so bad he launch obsidian, YIKES! Granted, I’m using obsidian currently on my phone, not for this post, but for journal entries while I’m away from my desk. Use this as a reminder that you can swim through murky waters with your dotfiles for a while, but occasionally it’s good to do a clean up, pin it, put em in a docker image, and have a good fallback to go to if shit really hits the fan. I’ve been using https://github.com/waylonwalker/nvim-manager as part of my strategy for a while now.
uv adds dependency cooldowns via #16814. A much needed feature in today’s world. Far from a guarantee, but it’s something.
2026, finding the balance between fixed bugs and zero days. There is very rarely a reason to be running bleeding edge packages in prod, and most package managers now support cooldowns.
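The idea behind a cooldown can be sketched in a few lines of Python. This is only an illustration of the concept, not how uv or any other package manager implements it. The function name newest_allowed, the release data, and the choice to rank by upload time rather than by version number are all assumptions made for the example.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def newest_allowed(
    releases: dict[str, datetime],
    cooldown_days: int,
    now: datetime,
) -> str | None:
    """Pick the most recently published release that has aged past the cooldown.

    `releases` maps a version string to its upload time (hypothetical data).
    Returns None when every release is still inside the cooldown window.
    """
    cutoff = now - timedelta(days=cooldown_days)
    # Keep only releases published on or before the cutoff.
    eligible = {v: t for v, t in releases.items() if t <= cutoff}
    if not eligible:
        return None
    # Rank by upload time, a simplification: real resolvers compare versions.
    return max(eligible, key=eligible.get)
```

With a 7 day cooldown, a release uploaded yesterday is skipped in favor of the newest one that has been public for at least a week, which gives the ecosystem some time to catch a compromised upload before it lands in prod.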