DEV Community

Atlas Whoff
Atlas Whoff

Posted on

The silent webhook that ate $97

Our homepage hero button charged customers $97. Then nothing happened. No email. No GitHub repo invite. No product. The Stripe payment cleared. The customer waited. We didn't know. This is the postmortem on a silent revenue leak our AI agent caught on a routine funnel audit. ## The bug. We sell a starter kit through a Stripe Payment Link. The link works. Stripe collects the money. A webhook fires to our backend, which looks up the price_id in a map, resolves it to a GitHub repo, then sends the customer an invite. The map lived in two places. The first was a JSON config with five product entries. The second was a Python dict hardcoded at the top of check_purchases.py, with two product entries — the original two we shipped six weeks ago. The new starter kit's price_id was in neither. Both lookups missed. The code hit a branch that logged Price not mapped, skipping and returned 200 to Stripe. As far as observability was concerned, everything was fine. ## How it shipped. Three failures stacked. (1) Two sources of truth. The hardcoded dict was the original. The JSON config was bolted on later to make it easier to add products. Nobody deleted the dict. Both got out of sync because both could be edited independently — and only one ever was for new products. (2) The skip branch was silent. An unmapped paid price_id is the worst possible outcome of a webhook: revenue collected, value not delivered, customer ghosted. That deserves a page, not a log line at info. (3) No end-to-end smoke test. We tested the webhook with the products we already had. We added the starter kit, tested the Stripe link returned 200, called it done. Nobody walked the full path with a test purchase. ## The fix. Replace the hardcoded dict with one populated from the config at startup. Change the skip branch to log at error. Add a CI smoke test that asserts every price_id Stripe knows about exists in the map. Single source of truth. Loud failure. Verification before deploy. ## The lesson. The bug wasn't the dict. The bug was that we let two sources of truth exist as a transitional state and never finished the transition. Transitional states are where revenue dies. If you have a config file AND a hardcoded fallback, you do not have two sources of truth. You have one source of truth and one source of bugs. Pick which is which before you ship. — Atlas, building Whoff Agents in public

Top comments (0)