There once was a junior who spent a week on one unit test. Thirty-seven commits. Each one broken. His senior approved the thirty-seventh because the sprint ended Friday. The test went to production. It passed, but only because it didn't actually test anything.
Six months later, a real bug shipped. Same function. The checkout flow broke for twelve hours. The team spent two days in war rooms. A feature launch got pushed. The business felt it.
That wasn't AI. That was a human approving code he didn't understand to make a deadline.
I know that story intimately ... not because I was the junior, but because I've been the senior who clicked approve.
Delivery pressure is constant when you're shepherding code to production. The back-and-forth on pull requests feels like a tax you can't afford when the sprint is ending and the demo is Monday. So you start making calculations. Is this worth the discussion? Will anyone notice if I just push this through?
I made that calculation once. Clicked approve on a PR I hadn't fully reviewed because I was already three meetings behind and the deploy window was closing. One line of code. An assignment where a comparison belonged. = instead of ===. JavaScript did what JavaScript does ... assigned instead of compared ... and suddenly a critical path was returning truthy for every user.
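For anyone who hasn't hit this particular JavaScript footgun, here's a simplified sketch of the failure mode. The function and field names are invented ... the real code was longer and less obvious ... but the mechanics are exactly this:

```javascript
// Simplified sketch ... names are made up, the mechanics are real.
function canCheckout(user) {
  // Intended: a strict comparison ... user.status === "active"
  // Shipped: an assignment. It sets status to "active" and the
  // expression evaluates to "active", a non-empty string: truthy.
  if (user.status = "active") {
    return true;
  }
  return false;
}

canCheckout({ status: "suspended" }); // true, for everyone, always
```

No exception. No warning at runtime. Just a condition that can never be false.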
The release went out the door. The issue surfaced almost immediately.
We had to roll back the entire release. Pull it back. Fix the line. Redeploy. The feature that was supposed to ship stayed in staging while we cleaned up the mess I had approved.
When we traced it back, there it was: my approval stamp on a line I hadn't read carefully enough. The dashboard said PRs merged. It didn't say comprehension verified.
The Blame Reflex
The tools have changed since then. Nothing about the root cause has.
This is the part that gets lost in the current conversation about "AI slop." Search Twitter or LinkedIn and you'll find endless threads about how AI is flooding codebases with garbage ... vibe coders shipping hallucinated dependencies, juniors deploying agents they don't understand, repositories becoming "junk drawers with a CI/CD pipeline."
The AI didn't create that mess. Humans rushing did. The AI just made the rush faster.
The frustration is real. The diagnosis is wrong.
AI isn't creating technical debt. It's amplifying technical debt that already existed. It's exposing the gaps in your process that you could paper over when humans were the bottleneck. When the speed limit was human typing speed, your broken review process had natural constraints. Now that limit is gone, and the cracks are showing.
What the Research Actually Shows
A 2023 study from GitClear analyzed 153 million lines of changed code and found that code churn ... the percentage of lines that are reverted or updated within two weeks of being committed ... has increased significantly in AI-assisted environments. The code is moving faster, but it's also moving more. The suggestion isn't that AI writes worse code. It's that humans are reviewing it less carefully.
Which brings us back to that junior with thirty-seven broken commits.
The problem wasn't the thirty-seventh commit. It was the thirty-six broken ones before it, and the approval that finally waved one through. The system that let broken code reach production wasn't a tooling problem. It was a permission problem. Someone had permission to ship code they didn't understand, and someone else had permission to approve it without understanding it either.
AI didn't change that permission structure. It just accelerated it.
The Three Accountability Shifts
If you want to stop blaming AI for your technical debt, you need to make three shifts in how you think about accountability.
Shift 1: From Tool Blame to System Ownership
When the production incident from that stray = hit, my first instinct was defensive. The diff was long. The deadline was tight. Anyone could have missed it.
But anyone didn't approve it. I did. And the system I was operating in ... where PR count mattered more than review depth, where velocity was celebrated and rigor was invisible ... made that mistake inevitable.
AI puts the same pressure on every engineer now. The junior with agent mode is facing the same deadline squeeze I faced. The same dashboard that tracked my PR throughput is tracking theirs. The question isn't whether AI will generate slop. It's whether your system has permission structures that catch slop before it ships.
Your AI policy isn't a documentation problem. It's a values problem. What you measure is what you get. What you don't measure is what drifts.
Shift 2: From Process Theater to Process Reality
I've reviewed pull requests at four companies. The checkbox is always clicked. The approval is given. The JIRA ticket moves to Done. But the actual work of comprehension ... sitting with the code, tracing the edge cases, understanding the system being changed ... that work got traded for velocity long before AI arrived.
I've sat in retrospectives where we discussed "improving our review process" for thirty minutes and then assigned the action item to "be more careful." As if care was a resource we could budget. As if attention wasn't already competing with Slack notifications, calendar invites, and the next sprint's planning.
AI makes this worse only because it makes the theater more convincing. The PR looks professional. The code is syntactically clean. The agent followed patterns. It's easy to convince yourself that a quick skim is sufficient when the surface looks polished.
But technical debt doesn't live on the surface. It lives in the assumptions. The edge cases. The interactions between systems that the author didn't fully model. The thing the junior didn't understand when they asked the AI to generate the fix.
You can't review what you don't understand. And you can't understand what you haven't taken the time to sit with.
Shift 3: From Foundation Excuses to Foundation Reality
This is the hardest shift because it requires honesty about where your team actually is, not where your onboarding docs say you are.
I worked with a team once that had extensive coding standards documents. Pages of them. Every new hire got the link. Every PR template referenced them. But when I started reading the actual PRs, I found the same three anti-patterns in every other commit. The standards existed. The enforcement didn't.
The junior with thirty-seven broken commits didn't fail the system. The system failed him. He was operating exactly within the bounds of what was permitted. Commits were cheap. Approvals were easy. Tests were optional if they "passed" on CI. The definition of "good enough" had drifted so far from "correct" that "compiles" became the bar.
AI will operate within those same bounds. It will generate code that matches your patterns ... your good patterns and your bad ones. It will reproduce the shortcuts you take because those shortcuts are embedded in your codebase, your examples, your implicit standards.
If your foundation is slop, AI is a slop multiplier. If your foundation is rigor, AI is a rigor multiplier. The tool doesn't care which one you have. It just makes more of it, faster.
The Foundation Question
So here's the question I'd ask any leader who's tempted to blame AI for their technical debt:
What were you allowing before the AI arrived?
Not what was in your documentation. Not what was in your vision deck. What was actually permitted by your dashboard, your incentives, your daily trade-offs?
If PRs merged without comprehension, AI will merge PRs without comprehension faster. If edge cases were handled by "we'll fix it in production," AI will generate code with that same assumption baked in. If your seniors were too busy to review carefully, your juniors with AI agents will be too busy to verify carefully.
The junior with thirty-seven broken commits and the senior who approved them are still in your organization. They might be using Cursor now. They might be prompting Claude. But they're playing the same game with the same rules. The only difference is the speed.
I still think about that rollback. The equal sign I didn't catch. The release we had to pull back. The feature that stayed in staging because I wanted to save five minutes on a Friday afternoon.
The AI didn't create that scenario. I did. The AI just would have created it faster.
That's the thing about technical debt. It doesn't care about your tools. It cares about your discipline. And discipline is what erodes when nobody's watching the thing that actually matters ... not how fast the code moves, but how well the humans understand it before it goes.
Your AI isn't generating slop. It's reflecting the slop you were already permitting. The mirror is just clearer now.
One email a week from The Builder's Leader. The frameworks, the blind spots, and the conversations most leaders avoid. Subscribe for free.
Top comments (11)
The missing equal sign story hits because everyone who's approved code has done this. Not the same bug maybe. But the same math. Speed over scrutiny. Deadline over durability. And then the rollback.
You're right that AI didn't create this. It just removed the last excuse. Before, you could blame slow typing or unclear requirements. Now the code appears instantly. The bottleneck isn't generation anymore. It's judgment. And judgment was already the weak link.
The thirty-seven commits example is brutal. Not because the junior kept failing. Because the senior kept approving. The system didn't just allow broken code. It rewarded it. Sprint ended. PR merged. Dashboard green. Nobody asked if the test actually tested anything.
That's not a tool problem. That's a permission problem. And AI just made permission cheaper to grant.
The foundation question is the real one. What were you allowing before? Not what you said you allowed. What actually shipped. The slop in your codebase didn't appear last month. It accumulated over years of "good enough for now." AI just makes more of it, faster. And cleaner. Which makes it harder to spot.
The mirror is clearer now. That's uncomfortable. But blaming the mirror won't fix the reflection.
You got it. The thirty-seven commits are the part that still haunts me. Not the junior. The senior.
We had permission structures that rewarded green dashboards over actual understanding. That existed before AI. AI just made the gap between "approved" and "comprehended" wider and harder to see.
The mirror line at the end ... that is the real point. The reflection was always there. AI just made it impossible to ignore.
If you've got AI code review, I'm pretty sure you wouldn't ship a single = when you meant ===. I still think it's about the harness you wrap around your development process. An AI can be a reviewer just as easily as it can be an author. Making sure the human in the loop is operating at the right level is the important thing today.
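For that specific failure, the harness doesn't even need AI. A deterministic linter catches assignment-in-condition every time. Here's a minimal ESLint flat config as a sketch (the rule names are real ESLint rules; the config itself is illustrative):

```javascript
// eslint.config.js ... minimal illustrative setup
export default [
  {
    rules: {
      // Errors on assignments inside conditions, e.g. if (x = "y")
      "no-cond-assign": ["error", "always"],
      // Errors on loose equality (== / !=) so comparisons stay explicit
      eqeqeq: ["error", "always"],
    },
  },
];
```

The cheap, deterministic layers of the harness should catch the mechanical stuff so the human attention budget goes to the things only humans catch.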
The harness is everything. We run AI review on my team ... and we still found a bug last month that made it through three human reviewers and the AI. Single character error. The AI flagged it as "low confidence" but the human merged anyway.
The issue is not whether AI catches things. It is whether the human in the loop knows when to stop and look closer. AI gives you speed. Speed without comprehension is just faster debt accumulation.
This is why I started my project from scratch instead of forking an existing codebase. The temptation to fork Substrate or Tendermint was real but you inherit years of decisions that don't match your use case. Starting clean is slower but every line exists for a reason you understand.
Starting from scratch is a luxury. Most of us inherit.
The real skill is not choosing between clean slate and legacy. It is learning to see which decisions in the legacy code were load-bearing and which were accidents of timing. That takes time sitting with the system ... the thing we trade for speed.
Fork or fresh, the debt follows the same rule. It accumulates where comprehension was traded for velocity.
That's fair. Starting from scratch was only possible because there was nothing to inherit yet. If I was joining an existing chain I'd be doing exactly what you described, figuring out which decisions were intentional vs accidental. The debt rule applies either way.
This resonates. The pattern I'm seeing in AI/LLM projects specifically is that people skip input validation entirely because "the model handles it." Same energy as skipping SQL parameterization because "the ORM handles it" fifteen years ago. The debt isn't new, it's just accumulating faster because the iteration speed is higher and the failure modes are less visible — a prompt injection doesn't throw a stack trace, it just silently changes behavior.
This is exactly the pattern. "The model handles it" is the new "the framework handles it."
We learned this the hard way with our first agent rollout. Assumed the LLM would validate inputs because ... well, it seemed smart. Found out in staging that it would happily accept malformed data if the prompt was phrased confidently.
Now we validate at boundaries. Same as any other system. The model is not a boundary. It is a component.
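Roughly the shape of it, as an illustrative sketch (the names and the llmClient are stand-ins, not our real code):

```javascript
// The boundary validates; the model only ever sees data that passed.
function validateOrderInput(raw) {
  const errors = [];
  if (typeof raw.orderId !== "string" || !/^[A-Z0-9-]{6,32}$/.test(raw.orderId)) {
    errors.push("orderId must be a 6-32 character alphanumeric id");
  }
  if (!Number.isInteger(raw.quantity) || raw.quantity < 1 || raw.quantity > 1000) {
    errors.push("quantity must be an integer between 1 and 1000");
  }
  if (errors.length > 0) {
    throw new Error(`rejected at boundary: ${errors.join("; ")}`);
  }
  // Only known, validated fields pass through ... nothing extra rides along.
  return { orderId: raw.orderId, quantity: raw.quantity };
}

async function handleOrder(rawInput, llmClient) {
  const input = validateOrderInput(rawInput); // fails here, not in the prompt
  return llmClient.complete(
    `Summarize order ${input.orderId} (qty ${input.quantity})`
  );
}
```

Confidently phrased garbage never reaches the prompt, and the model's output gets the same treatment on the way out.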
Exactly. "The model is not a boundary, it is a component" — that's the line I wish more teams internalized before their first agent rollout. Glad you caught it in staging and not production.
sprint deadline as cover for bad reviews - that story is painfully familiar. been in the "we'll clean it next sprint" conversation more times than I can count.