I built a kill switch for runaway AI agents — Cost Firewall is MIT

#devops #agents #opensource #ai

The 3 AM incident

A few months ago one of my AI agents got stuck in a retry loop overnight and quietly burned through a month of credits. The provider dashboard told me about it the next morning. The support ticket got a polite "usage is final."

Provider dashboards are bills. I needed a brake.

What's actually missing in the stack

After looking at what exists, the gap was clear:

AI gateways (LiteLLM, Portkey) — great at routing, not designed to stop you.
Observability (Helicone, Langfuse) — great at explaining, after the fact.
Provider dashboards — billing history, not real-time control.

Nothing was sitting between "the agent is making a call" and "the agent has already burned $500."

Cost Firewall

Cost Firewall is a local plugin for the OpenClaw gateway. It watches call metadata in real time and trips on four automatic signals, plus a manual panic switch:

Failure mode	Default threshold or trigger	Action
Retry loop	3 consecutive failures from same source	Trip + cooldown
Token storm	100K tokens / 60s	Global block
Call flood	30 calls / 60s	Global block
Daily budget cap	Your configured ceiling	Block until next day
Manual panic	`openclaw firewall stop`	Pause every AI call
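
To make the token storm and call flood rows concrete, here is a minimal sliding-window sketch in TypeScript. It is not the plugin's actual code: the `SlidingWindowLimit` class and its method names are hypothetical, and I am assuming a rule trips once the window total goes above the threshold rather than when it reaches it.

```typescript
// Hypothetical sketch, not Cost Firewall's implementation.
// Trips when the sum of amounts recorded inside the window exceeds the limit.
class SlidingWindowLimit {
  private events: { at: number; amount: number }[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Record one event and report whether the window total now exceeds the limit.
  record(amount: number, nowMs: number): boolean {
    this.events.push({ at: nowMs, amount });
    // Drop events that have aged out of the window.
    const cutoff = nowMs - this.windowMs;
    this.events = this.events.filter((e) => e.at > cutoff);
    const total = this.events.reduce((sum, e) => sum + e.amount, 0);
    return total > this.limit;
  }
}

// Defaults taken from the table: 100K tokens / 60s and 30 calls / 60s.
const tokenStorm = new SlidingWindowLimit(100_000, 60_000);
const callFlood = new SlidingWindowLimit(30, 60_000);

// Feed one call's metadata into both windows; true means "trip".
function shouldTrip(tokens: number, nowMs: number): boolean {
  const storm = tokenStorm.record(tokens, nowMs);
  const flood = callFlood.record(1, nowMs);
  return storm || flood;
}
```

Passing the clock in as `nowMs` instead of reading it inside the class keeps the logic easy to test with fake timestamps.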

Sources are tracked independently — one noisy agent doesn't take everyone else down.
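
Per-source tracking for the retry-loop rule could look roughly like this. Treat it as a sketch under assumptions: `RetryLoopBreaker`, its method names, and the five-minute cooldown in the usage lines are hypothetical, since the table only says "Trip + cooldown".

```typescript
// Hypothetical sketch, not Cost Firewall's implementation.
type SourceState = { consecutiveFailures: number; blockedUntil: number };

class RetryLoopBreaker {
  private readonly sources = new Map<string, SourceState>();
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // A source is blocked only while its own cooldown is running.
  isBlocked(source: string, nowMs: number): boolean {
    const state = this.sources.get(source);
    return state !== undefined && nowMs < state.blockedUntil;
  }

  // Record the outcome of one call for one source.
  recordResult(source: string, ok: boolean, nowMs: number): void {
    const state =
      this.sources.get(source) ?? { consecutiveFailures: 0, blockedUntil: 0 };
    if (ok) {
      // Any success breaks the streak.
      state.consecutiveFailures = 0;
    } else {
      state.consecutiveFailures += 1;
      if (state.consecutiveFailures >= this.maxFailures) {
        // Trip: start the cooldown and reset the streak.
        state.blockedUntil = nowMs + this.cooldownMs;
        state.consecutiveFailures = 0;
      }
    }
    this.sources.set(source, state);
  }
}

// Usage: 3 consecutive failures (the table's default), assumed 5-minute cooldown.
const breaker = new RetryLoopBreaker(3, 5 * 60_000);
```

Because state lives in a map keyed by source, three failures from one agent leave every other agent's counter untouched.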

Two-mode workflow

This is the part I think matters more than the rules themselves:

openclaw firewall mode observe     # record only, do not block
openclaw firewall log --last 20    # see what would have been blocked
openclaw firewall mode protect     # flip the switch

Run observe for a day. The log alone is usually eye-opening — you'll find retry loops you didn't know existed and prompts using more tokens than you assumed. Then tune thresholds to your traffic, not someone else's blog post, and flip to protect.
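
The difference between the two modes comes down to one branch. A minimal sketch, assuming a hypothetical `decide` function and log message format (neither is taken from the plugin):

```typescript
// Hypothetical sketch, not Cost Firewall's implementation.
type Mode = "observe" | "protect";
type Decision = { allowed: boolean; logged: string | null };

// ruleTripped is the name of the rule that fired, or null if none did.
function decide(mode: Mode, ruleTripped: string | null): Decision {
  if (ruleTripped === null) {
    return { allowed: true, logged: null };
  }
  if (mode === "observe") {
    // Record what would have happened, but let the call through.
    return { allowed: true, logged: `would block: ${ruleTripped}` };
  }
  // protect: same detection, but the call is actually stopped.
  return { allowed: false, logged: `blocked: ${ruleTripped}` };
}
```

The rules run identically in both modes, which is why a day of observe logs is a fair preview of what protect would do.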

Privacy posture

Question	Answer
Does it need an account?	No
Does it phone home?	No
Does it store prompt text by default?	No
Where do events live?	Local JSONL on your gateway
Can I audit it?	Yes, MIT TypeScript

The default is storePromptText: false. Runtime cost control belongs on the machine running the agent.
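
As an illustration of what a metadata-only event line might look like, here is a small sketch. Only `storePromptText` comes from the plugin; the event field names (`ts`, `source`, `tokens`, `promptText`) and the `toJsonlLine` helper are assumptions made for the example.

```typescript
// Hypothetical sketch; the real event schema may differ.
interface FirewallConfig {
  storePromptText: boolean;
}

interface CallInfo {
  source: string;
  tokens: number;
  promptText: string;
}

interface LoggedEvent {
  ts: string;
  source: string;
  tokens: number;
  promptText?: string;
}

// Serialize one call as a single JSONL line (one JSON object plus newline).
function toJsonlLine(config: FirewallConfig, call: CallInfo, now: Date): string {
  const event: LoggedEvent = {
    ts: now.toISOString(),
    source: call.source,
    tokens: call.tokens,
  };
  // Prompt text is written only when explicitly opted in.
  if (config.storePromptText) {
    event.promptText = call.promptText;
  }
  return JSON.stringify(event) + "\n";
}
```

With `storePromptText: false`, the prompt never reaches the log line, so the file holds metadata only.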

One-line install

curl -fsSL https://raw.githubusercontent.com/mapick-ai/cost-firewall/v0.2.12/install.sh | bash
openclaw firewall mode observe
openclaw firewall log --last 20

Then open the local dashboard at http://localhost:18789/mapick/dashboard.