Ian Johnson

Why I Prefer Chicago-Style TDD

There are two big schools of TDD, and most devs end up in one without ever really picking sides. Quick recap: London-style (mockist) drives design by mocking collaborators and asserting on interactions. Chicago-style (classicist) builds from the inside out using real objects and asserting on values and state.

I'm firmly in the Chicago camp. Here's why, and how it pairs really nicely with hexagonal architecture.

Mocks belong at boundaries. That's it.

I'm not anti-mock. Mocks are great when you're crossing a boundary you don't control or don't want to hit in a test: the database, an HTTP API, a message queue, the clock, the filesystem. Anything where the alternative is slow, flaky, or has side effects you can't take back.

Inside the boundary? Use real objects. If two domain classes collaborate, let them collaborate. Build them up, call the method, assert on what came out. That's the test.
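A minimal sketch of what that looks like, using hypothetical domain classes (`Order` and `LineItem` are illustrative, not from any real codebase): two real objects collaborate, and the assertion is on the value that comes out.

```python
class LineItem:
    def __init__(self, price_cents: int, quantity: int):
        self.price_cents = price_cents
        self.quantity = quantity

    def subtotal(self) -> int:
        return self.price_cents * self.quantity


class Order:
    def __init__(self, items: list[LineItem]):
        self.items = items

    def total_cents(self) -> int:
        # Order delegates to its real LineItem collaborators -- no
        # mocks standing in for them.
        return sum(item.subtotal() for item in self.items)


def test_order_total_uses_real_line_items():
    order = Order([
        LineItem(price_cents=500, quantity=2),
        LineItem(price_cents=300, quantity=1),
    ])
    # Assert on the output, not on how Order talked to LineItem.
    assert order.total_cents() == 1300
```

Notice the test says nothing about `subtotal()` being called. Rename it, inline it, cache it: the test doesn't care, as long as the total is right.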

The moment you start mocking your own internal types, you've stopped testing your system and started testing your assumptions about how your system should talk to itself. Those are very different things.

Mocking hides real problems

When you mock a collaborator, you're hand-writing what you expect it to return. That mock will happily return whatever you told it to, forever, even if the real thing's contract changed three refactors ago. Your test stays green. Production breaks.

Real objects don't let you get away with that. If the collaborator's behavior changes, the test that uses it will notice, because it's actually running through it. You get an early, honest signal. Mocks give you a comfortable, dishonest one.

Good tests don't care about implementation. Mocks force you to care.

This is the part that bugs me most. The whole point of a test, in my view, is "given this input, I expect this output (or this state change)." That's it. I shouldn't have to know, or care, how the code gets there. That freedom is what makes refactoring safe.

Mock-heavy tests destroy that. Now your test is asserting things like "the service called repository.findById exactly once with this argument, then called mapper.toDto, then…" You've baked the implementation into the test. The minute you reorganize the internals, even if behavior is identical, your tests light up red. That's not a useful signal. That's friction.

And here's the kicker: with strict mocking, you don't even need to implement the thing correctly. As long as the calls match the expectations, the test passes. You can satisfy a contract without honoring it. I find that genuinely unsettling. The test isn't proving the code works; it's proving the code makes the right phone calls.
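Here's a contrived illustration of that failure mode (the names are hypothetical). The implementation below is plainly broken, yet an interaction-style test that only verifies the call still goes green:

```python
from unittest.mock import Mock


class UserService:
    def __init__(self, repository):
        self.repository = repository

    def display_name(self, user_id):
        # Bug: the repository's result is thrown away entirely.
        self.repository.find_by_id(user_id)
        return "unknown"


def test_makes_the_right_phone_calls():
    repo = Mock()
    service = UserService(repo)
    service.display_name(42)
    # Passes: the "right" call happened with the right argument,
    # even though the value the user actually sees is nonsense.
    repo.find_by_id.assert_called_once_with(42)
```

The contract was satisfied; the behavior was not. A state-based test asserting on the returned name would have caught this immediately.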

Use real objects. Fake the rest.

My default is: real objects everywhere I can get away with it. When I can't (boundaries, again), I reach for fakes before mocks.

A fake is a real, working implementation that's just simpler. An in-memory repository that stores things in a map instead of Postgres. An email sender that appends to a list instead of calling SendGrid. The fake has actual behavior (you can put things in and get things out) so your test exercises a real interaction, not a scripted one.

For things I need to observe (was this notification sent? did we publish an event?), I use spies. A spy records what happened so I can assert on it after the fact, without dictating the shape of every internal call up front.

Then I test on values. Did the function return what it should? Is the system in the state I expect? Did the right event end up in the spy's recorded list? That's it. No "verify was called with." No call-order assertions. Just inputs and outputs and observable state.
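The spy-plus-values approach can be sketched like this (hypothetical names): the spy records sends into a plain list, and the test asserts on that list after the fact, with no interaction expectations set up front.

```python
class SpyEmailSender:
    """Records what was sent so tests can inspect it afterwards."""

    def __init__(self):
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, subject: str) -> None:
        self.sent.append((to, subject))


def notify_signup(sender, email: str) -> None:
    # Domain code talks to whatever sender it's given.
    sender.send(email, "Welcome!")


def test_signup_sends_welcome_email():
    sender = SpyEmailSender()
    notify_signup(sender, "ada@example.com")
    # Assert on the observable record -- a value -- not on a
    # "verify send was called with" interaction expectation.
    assert sender.sent == [("ada@example.com", "Welcome!")]
```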

Hexagonal architecture makes this almost free

If you've used hexagonal architecture (a.k.a. ports and adapters), most of the work for Chicago-style TDD is already done.

Quick refresher: your domain lives in the middle. It defines ports — interfaces that describe what it needs from the outside world (a UserRepository, a PaymentGateway, a Clock). Adapters are the concrete implementations that plug into those ports: a Postgres adapter, a Stripe adapter, a system clock.

The domain doesn't know or care which adapter it's running against. It just talks to the port.

In production, you wire up the real adapters. In tests, you wire up fakes: an in-memory UserRepository, a FakePaymentGateway that records charges, a FixedClock you control. Same port, different adapter. The domain has no idea anything changed, which is exactly what you want.
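A small sketch of that wiring, using a `FixedClock` against a hypothetical `TrialAccount` domain object. The domain only ever calls `.now()` on its clock port; the test hands it a fixed adapter and the domain is none the wiser.

```python
from datetime import datetime, timedelta


class FixedClock:
    """Test adapter for the Clock port: always returns the same instant."""

    def __init__(self, now: datetime):
        self._now = now

    def now(self) -> datetime:
        return self._now


class TrialAccount:
    TRIAL_DAYS = 14

    def __init__(self, started_at: datetime, clock):
        self.started_at = started_at
        self.clock = clock  # the port: anything with a .now() plugs in

    def is_expired(self) -> bool:
        return self.clock.now() > self.started_at + timedelta(days=self.TRIAL_DAYS)


def test_trial_expires_after_fourteen_days():
    start = datetime(2024, 1, 1)
    clock = FixedClock(start + timedelta(days=15))
    assert TrialAccount(start, clock).is_expired()
```

In production the same `TrialAccount` would get a system-clock adapter. No mocks, no patching, no flaky time-based sleeps: just a different adapter behind the same port.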

What you get:

  • Real domain logic actually runs in your tests. No mock-puppeteering. The classes you ship are the classes under test.
  • Fast tests. No DB, no network, no sleeps. In-memory fakes are essentially free.
  • Parallel-ready. With no contention for the database or network, the tests can be parallelized far more easily.
  • Tests that survive refactoring. Move methods around, rename internals, split a class in two — as long as the port contracts hold and the outputs match, your tests stay green.
  • Mocks stay at the edges, where they belong. And often you don't even need them, because a well-written fake adapter does the job better.
  • The fakes become a design tool. If a fake adapter is painful to write, that's usually the port telling you it's badly shaped. Listen to it.

The architecture and the testing style reinforce each other. Hexagonal pushes side effects to the edges; Chicago-style TDD wants the middle to be real and the edges to be swappable. Same idea from two angles.

Wrapping up

Test the behavior of your system through its real objects. Push side effects to the boundary. Swap those boundaries for fakes when you test. Assert on values and state, not on call patterns.

You end up with tests that tell you when something is actually broken, stay quiet when you refactor, and let you change your mind about implementation without paying a tax. That's the whole job.

Mocks are a tool. A useful one, at the edges. But if they're showing up all through your test suite, your tests have stopped describing what your software does and started describing how you currently happen to have written it. Those two things should never be the same.
