There is an open secret in the world of vibe coding. The people commissioning the work — the ones with the product idea, the domain expertise, the actual customer in mind — usually cannot read the output. They prompt, the model produces, and the result is a tower of TypeScript or Python they accept on faith because they have no way to verify it. The validation step gets quietly skipped. "It runs" becomes "it's correct."
This is not a moral failing. It's a tooling problem. And I think the way out of it is hiding in plain sight.
The problem with normal code in an AI-first workflow
If your premise is that AI is going to do most of the routine writing of code, then the human's job shifts. We move from authors to reviewers. From "did I express this correctly?" to "did the machine express what I meant?"
Reviewing is a different job from writing, and it has different tooling needs. When you're writing, you want a fast feedback loop — autocomplete, jump-to-definition, a fast test runner. When you're reviewing, you want comprehension support — context next to the code, an explanation of why this section exists, and confidence that what you're reading is actually what's running.
Most editors are still optimised for the writer. The reviewer has to piece things together: read the code, hunt for a docstring above it, hope the docstring still matches, then mentally verify against intent. For an experienced developer writing their own code, this can be fast. For a vibe coder reviewing AI output, it's almost impossible.
Two things to fix
I've been working on the editor side of this for AllSpeak, a multilingual scripting language. AllSpeak allows the same programs to be written in French, German, Italian, or any other language we add. The combination of natural-language source with a review-first editor is starting to look like a real answer to the validation gap.
The first screenshot below shows the editor in normal ("raw") mode: a documentation block followed by some code. Because of the colour-coding, the eye skips over the documentation quite easily; it isn't meant to be read in this view.
The second screenshot shows the editor in Blocks mode, displaying the same piece of code but with its documentation in the right-hand pane, which makes review far simpler. On the left is a list of all the blocks for navigation; alternatively, you can use the up and down arrows in the toolbar. This is just a starting point; the editor could go a lot further.
Maintaining such a structure would be a daunting task without the help of AI. This is an almost free gift we should take full advantage of.
There are two specific changes I'm making.
First, sections of code get a documentation block above them, in a structured comment format. Nothing radical there — literate programming has done variations on this for decades. The new bit is that each block contains two SHA hashes: one for the documentation, one for the code section it describes. If either changes without the other being deliberately re-paired, the editor flags drift.
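To make the mechanism concrete, here is a minimal sketch of the drift check in Python. The hash function and normalisation rules are my illustration, not AllSpeak's actual scheme; the point is only that the check is cheap and mechanical.

```python
import hashlib

def digest(text: str) -> str:
    """Hash a doc or code section. Normalise line endings and trailing
    whitespace first, so purely cosmetic edits don't count as drift."""
    normalised = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def check_drift(doc_text: str, code_text: str,
                stored_doc_hash: str, stored_code_hash: str) -> list[str]:
    """Compare current hashes against the stored pair.
    Returns a list of warnings; empty means doc and code are still in sync."""
    warnings = []
    if digest(doc_text) != stored_doc_hash:
        warnings.append("documentation changed since last pairing")
    if digest(code_text) != stored_code_hash:
        warnings.append("code changed since last pairing")
    return warnings
```

An editor would run this on every save and surface any warning next to the affected block; re-pairing is just recomputing and storing both hashes once a human has confirmed the pair still agrees.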
This is cheap, mechanical, and solves a problem that has plagued every codebase I've ever worked on. Documentation rots silently. Cryptographic pairing makes the rot clearly audible.
Second, the editor gains a side-by-side mode that shows one section at a time, code on one side, its documentation on the other. The reviewer sees a small, focused unit and can ask the only question that matters: does the code do what the prose says it does?
That's a comparison task, not a comprehension task. Comparison is much easier than comprehension for non-experts — and that's the entire point.
Of course, all of this only becomes possible when AI is doing the coding, as is increasingly the case. Human coders, however professional, don't like to maintain comprehensive documentation for their code. Documentation gets in the way of coding and is usually regarded as an imposition, so the bare minimum is all that gets written. An agent, on the other hand, has endless patience and is more than willing to take on such a task.
Why this matters more for AllSpeak than for Python
Here's where the language choice does real work. If the code is dense Python with framework conventions a non-developer can't parse, asking "does the code match the prose?" is still a comprehension task in disguise. The reviewer has to understand the code first, then compare. The validation gap stays roughly where it was.
If the code is AllSpeak — close enough to English that a careful reader can follow it line by line — the gap narrows considerably. The reviewer reads two pieces of natural-ish text and checks whether they agree. They don't need to know what a decorator is, or how async resolves, or which way the data flows through a hook. They just need to read.
That's the leverage point. AllSpeak by itself simplifies syntax; the review tooling by itself simplifies workflow; together they change who can credibly validate generated code.
Files become packages, not text
A side effect of all this: a source file is no longer just a sequence of statements. It's a structured package containing code sections, documentation sections, and the cryptographic links between them. The raw form might look a bit ugly opened in vim — comment blocks dominate — but it's not really meant to be read raw any more than a minified JavaScript bundle is.
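As a sketch of what "file as package" means in practice, here is a toy parser. The on-disk block format below (a `#===` header carrying the two hashes, prose as comment lines, then code) is entirely invented for illustration; AllSpeak's real format may differ.

```python
import re
from dataclasses import dataclass

@dataclass
class Section:
    doc: str        # the prose the reviewer reads
    code: str       # the code it claims to describe
    doc_hash: str   # stored hash of the prose at last pairing
    code_hash: str  # stored hash of the code at last pairing

# Hypothetical raw form: each section opens with a header comment carrying
# both hashes, followed by commented prose, followed by the code itself.
BLOCK = re.compile(
    r"#=== doc:(?P<dh>[0-9a-f]+) code:(?P<ch>[0-9a-f]+)\n"
    r"(?P<doc>(?:#.*\n)+)"
    r"(?P<code>(?:(?!#===).*\n?)*)"
)

def parse(source: str) -> list[Section]:
    """Read a raw file into its structured sections."""
    sections = []
    for m in BLOCK.finditer(source):
        prose = "\n".join(line.lstrip("# ") for line in m["doc"].splitlines())
        sections.append(Section(prose, m["code"].strip(), m["dh"], m["ch"]))
    return sections
```

A Blocks-mode editor never shows the reader this raw form; it parses the file once and presents each `Section` as a doc/code pair, exactly as the side-by-side mode described above.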
I want to be careful with this claim, though. There's a temptation to push it further than it deserves. "Humans don't need to read raw code any more" is not quite right. Sometimes the editor is unavailable. Sometimes you're debugging at 2am with grep and a terminal. Sometimes a future tool needs to interoperate with your files and the only sane interface is plain text. The defensible version of the claim is softer: humans should rarely need to read the raw form, but the format should remain legible in extremis. AllSpeak's plain-English nature preserves that floor even with the scaffolding around it.
What I'd encourage other tool-makers to think about
If the future of coding is mostly machine-written, the tooling we should be investing in is the tooling that helps humans check what the machines produced. That's underbuilt right now. The current generation of AI coding tools — the Lovables and v0s and Bolts of the world — focus almost entirely on generation. They produce React, Next.js, the standard opaque stack, and they assume the user will accept whatever comes out. For users who can't read the output, that assumption is shaky at best.
A few things I think are worth borrowing or stealing from what I'm building:
- Treat documentation as a first-class artefact paired with code, not a comment that floats nearby
- Use cryptographic pairing or some equivalent to make drift visible
- Build review modes that show one unit at a time with context attached
- Pick a source language whose readability matches the average reviewer's skill level
The last one is the hardest sell to the developer audience because it sounds like a step backwards. But if you accept the premise that AI is going to write most of the code and humans are going to review most of it, then optimising the source language for human reading — even at some cost to expressive density — starts to look like exactly the right trade.
A small invitation
I'm writing this as the AllSpeak editor work progresses. If you're building tools in this space, or if you're a vibe coder who's quietly worried about whether you can really vouch for what you're shipping, I'd be very interested in hearing from you.
The future where AI writes everything and humans rubber-stamp it is the bad version. The future where AI writes everything and humans actually read and approve it is the one worth building toward. The difference is mostly about tooling.
Postscript
The editor described here is written in the JS implementation of AllSpeak to run in a browser. It is served from http://localhost:8080 by a smaller AllSpeak module written in the Python implementation, which has access to all files and system resources.
At the time of writing, the editor (asedit.as) comprises 944 lines of AllSpeak code, 173 comment lines and 44 blank lines. The block view was added in one day by Claude Code, using continuous prompt/review.
This document was proposed and argued by me, written by Claude and edited by me. I take full responsibility for the content.
Photo by Volodymyr Dobrovolskyy on Unsplash

