There’s a growing genre of developer blog post: the AI slop rant. You’ve read them, maybe written one. Pull requests that compile and pass tests and do nothing the ticket asked for. Documentation that the submitter clearly hasn’t read. Open source maintainers burning out under a tide of drive-by contributions from people who can’t answer basic questions about the code they just submitted. Emojis in comments. Invented APIs. Four thousand lines where forty would do.
The rants are not wrong.
A recent qualitative study out of Heidelberg, the University of Melbourne, and Singapore Management University analyzed over a thousand Hacker News and Reddit posts tagged “AI slop” and found a consistent theme: developers describe the phenomenon as a tragedy of the commons, where one person’s velocity gain becomes five reviewers’ cleanup bill. Rémi Verschelde, who maintains the Godot game engine, has publicly described the influx of AI-generated contributions as draining and demoralizing. Mitchell Hashimoto, the HashiCorp co-founder, has built a vouching system — currently being piloted on his Ghostty project — specifically because AI tools have made it trivial to generate plausible-looking but hollow contributions. The Gentoo Linux distribution is migrating off GitHub to Codeberg. This is not a vibes problem. It is a real, measurable externality being absorbed by the people at the end of the PR queue.
But I want to offer an unfashionable observation. The problem isn’t the tool. It has never been the tool. Look at who’s actually producing the slop — and, more tellingly, look at who isn’t.
The quiet other side
Simon Willison, co-creator of the Django web framework, has been publishing his AI-assisted development workflow in public for close to three years. He recently wrote a detailed walkthrough of building a custom colophon page for his tools site: conception to deployed feature in just over seventeen minutes, at a total Anthropic API cost of sixty-one cents. The code was reviewed. The tests were run. He understood every line, which is why he could step in and finish the last bit by hand when the model got stuck on a GitHub Actions quirk. Nobody in the Datasette community is writing angry blog posts about how Simon’s PRs are destroying their review process.
Kent Beck — the Kent Beck, co-author of the Agile Manifesto, inventor of Extreme Programming, pioneer of test-driven development — spent a chunk of 2025 building a B+ Tree library called BPlusTree3 in Rust and Python using what he calls augmented coding. The result: production-competitive performance, with the Rust implementation matching standard library benchmarks and outperforming them on range scans. He describes the process not as letting the machine run wild but as intervening constantly — watching for warning signs, stopping the agent the moment it starts generating functionality he didn’t ask for, treating unexpected test deletions as red flags. He’s also been explicit that juniors working this way — augmented, not vibe-coding — ramp onto codebases dramatically faster than before, because the AI collapses the search space for “which library should I even use” down from hours to minutes, freeing time for actual learning.
Armin Ronacher, creator of Flask and previously VP of Platform at Sentry, now runs a startup called Earendil with a small team plus what he openly refers to as AI interns. His thirty-seven-minute talk on agentic coding walks through a workflow in which he ships features with Claude Code running with broad permissions inside a Docker container. He’s delegating real work, not supervising every token. His stated philosophy: keep the context system simple, keep the feedback loops observable, avoid tool sprawl, assume the agent will be lazy about whatever friction you introduce. He’s not complaining about slop. He’s also not producing it.
Tobias Lütke, the Shopify CEO, has a publicly visible GitHub contribution graph that spiked last autumn when coding agents crossed a real capability threshold. He’s shipping code again, for the first time in years, because an agent lets him fit real programming work into the cracks of being a CEO — including a recent autoresearch plugin built alongside a collaborator, with the agent maintaining state in a structured JSONL file across sessions.
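The exact format Lütke’s plugin uses isn’t public, but the general pattern — an append-only JSONL log the agent replays at the start of each session — is simple enough to sketch. Everything below is illustrative: the `agent_state.jsonl` filename, the record fields, and the helper names are my assumptions, not his implementation.

```python
import json
from pathlib import Path

STATE_FILE = Path("agent_state.jsonl")  # hypothetical filename, not Lütke's actual one
STATE_FILE.unlink(missing_ok=True)      # start fresh for this demo

def append_event(event: dict) -> None:
    """Append one JSON record per line; append-only keeps the history auditable."""
    with STATE_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

def load_state() -> list[dict]:
    """Replay the log to reconstruct what the agent knew, across sessions."""
    if not STATE_FILE.exists():
        return []
    with STATE_FILE.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Session 1 records a finished task; session 2 picks up where it left off.
append_event({"session": 1, "task": "survey prior art", "status": "done"})
append_event({"session": 2, "task": "draft plugin skeleton", "status": "in_progress"})
```

The appeal of JSONL over a single JSON blob is that each session only ever appends a line, so a crashed or interrupted run can’t corrupt earlier state, and a human can read the history with nothing fancier than `tail`.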
There’s a developer named Lalit who had been procrastinating on a SQLite parser and linter project for years — four hundred grammar rules of tedious work that every would-be contributor bounces off — and finally shipped the prototype with Claude Code’s help. The reception in the community has been enthusiasm, not complaint.
You could extend this list. Andrej Karpathy. The long tail of indie developers shipping side projects that, five years ago, simply wouldn’t have existed because the cost of getting started was too high. None of these people are villains in the AI slop discourse. None of them are slowing down the people around them. Their names come up in the positive examples, not the complaints.
The diagnosis
So what’s different? It’s not the model. Simon, Kent, Armin, and the people writing angry blog posts are using roughly the same tools — Claude Code, Cursor, Codex, some mix. It’s not even the prompting technique. Most of the patterns are public; Simon has been documenting them for years.
The difference is that the people who ship quality AI-assisted work treat the output as their output. They read it. They test it. They know why every function exists and what happens when it breaks. Simon Willison has made a useful distinction: having an LLM generate every line of your code is not the same thing as vibe coding — provided you actually review, test, and understand what came out. One is using the model as a very fast typist. The other is abdicating. The word he’s landed on for the responsible version, vibe engineering, is ugly on purpose — it refuses to pretend there’s a category of serious AI use that doesn’t demand serious human judgment.
The developer who drops four thousand lines of generated code into a PR with a ticket number and a shrug is not losing a fight with their tool. They’re losing a fight with the expectation that engineers understand what they ship. That expectation predates LLMs by about fifty years. It’s not an AI problem. It’s an accountability problem wearing an AI costume.
Why the backlash is correct anyway
None of this is an argument that the slop complaints are wrong to be loud. They are exactly as loud as they need to be. When the cost of producing plausible-looking work collapses, the ratio of serious work to performative work gets harder to read from the outside. Reviewers, maintainers, and teammates end up absorbing the evaluation that the submitter should have done. That’s a real cost, and venting about it is how a community establishes new norms in real time.
But the framing — this tool is ruining our craft — is diagnostically off. The tool is exposing something about the craft that was already there. The person who now submits four thousand lines of generated code is, in most cases, the same person who would have submitted four hundred lines of Stack Overflow copy-paste a decade ago with slightly more friction. What’s changed is the spread. The gap between the best and worst practitioners on any given task has widened, because the tool amplifies whatever judgment the user brings. If you have taste, you ship in a morning what used to take a week. If you don’t, you now generate in a morning what it takes a reviewer a week to unpick.
The quiet pattern in all the positive examples above is the same pattern visible in any good senior engineer’s workflow for the last thirty years: clear intent going in, a short feedback loop, honest reading of the output, willingness to throw it away when it’s wrong. The people who already had those habits got a force multiplier. The people who didn’t, got exposed.
The boring conclusion
There’s a comfortable version of this debate where you pick a side — AI good, AI bad — and call anyone on the other one a shill or a Luddite. The actual situation is more annoying. The tool is real. The slop is real. The productivity gains are also real. The people producing high-quality AI-assisted work are not a rhetorical fiction invented by Anthropic’s marketing team; they have names and public output you can go read.
So the next time someone forwards you a rant about how AI is destroying code review, the right response isn’t to defend the tool. It’s to ask who wrote the PR.
Don’t blame the hammer. The hammer does exactly what the hand tells it to do. That’s the entire point of a hammer.