<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community</title>
    <description>The most recent home feed on DEV Community.</description>
    <link>https://dev.to</link>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/"/>
    <language>en</language>
    <item>
      <title>I Compared TOON vs Minified JSON Using OpenAI’s Tokenizer</title>
      <dc:creator>Khaja Hussain</dc:creator>
      <pubDate>Tue, 12 May 2026 00:29:29 +0000</pubDate>
      <link>https://dev.to/khaja_hussain_db1f84efe83/i-compared-toon-vs-minified-json-using-openais-tokenizer-31d0</link>
      <guid>https://dev.to/khaja_hussain_db1f84efe83/i-compared-toon-vs-minified-json-using-openais-tokenizer-31d0</guid>
      <description>&lt;p&gt;Recently I noticed a lot of developers talking about TOON:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/toon-format/toon" rel="noopener noreferrer"&gt;TOON GitHub repository&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The idea behind TOON is interesting. Instead of sending traditional JSON, TOON tries to reduce token usage and make data more compact for LLMs.&lt;/p&gt;

&lt;p&gt;Since token costs are becoming a real concern for AI products, I wanted to test it myself.&lt;/p&gt;

&lt;p&gt;Not theoretically.&lt;br&gt;
Not with benchmarks from slides.&lt;br&gt;
Just a simple real-world comparison.&lt;/p&gt;

&lt;p&gt;I used OpenAI’s tokenizer tool and compared:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TOON format&lt;/li&gt;
&lt;li&gt;Minified JSON&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the conversion process, I used:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.jsonparser.ai/json-to-toon/" rel="noopener noreferrer"&gt;JSON to TOON Converter&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.jsonparser.ai/json-minifier/" rel="noopener noreferrer"&gt;JSON Minifier&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minified JSON:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;{"user":{"id":1001,"name":"Khaja","email":"&lt;a href="mailto:khaja@example.com"&gt;khaja@example.com&lt;/a&gt;","roles":["admin","developer"],"settings":{"theme":"dark","notifications":true}}}&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;TOON version:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;user:&lt;br&gt;
  id: 1001&lt;br&gt;
  name: Khaja&lt;br&gt;
  email: khaja@example.com&lt;br&gt;
  roles[2]: admin,developer&lt;br&gt;
  settings:&lt;br&gt;
    theme: dark&lt;br&gt;
    notifications: true&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;TOON Result&lt;/strong&gt;: Tokens: 37&lt;br&gt;
&lt;strong&gt;Minified JSON Result&lt;/strong&gt;: Tokens: 38&lt;/p&gt;
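
&lt;p&gt;A quick way to reproduce the size comparison locally, as a sketch: character count is only a rough proxy for token count (the exact numbers above come from OpenAI’s tokenizer tool), but it shows how close the two payloads really are.&lt;/p&gt;

```javascript
// Compare raw character counts of the two payloads. This is only a proxy:
// shorter text usually tokenizes to fewer tokens, but exact counts require
// a real tokenizer like the one used above.
const data = {
  user: {
    id: 1001,
    name: "Khaja",
    email: "khaja@example.com",
    roles: ["admin", "developer"],
    settings: { theme: "dark", notifications: true },
  },
};

const minified = JSON.stringify(data); // no whitespace at all

const toon = [
  "user:",
  "  id: 1001",
  "  name: Khaja",
  "  email: khaja@example.com",
  "  roles[2]: admin,developer",
  "  settings:",
  "    theme: dark",
  "    notifications: true",
].join("\n");

console.log(minified.length, toon.length);
```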

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkfpv9c3lrqg6uuiur4f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkfpv9c3lrqg6uuiur4f.png" alt=" " width="526" height="491"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qx4utouecdafrseyhp3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5qx4utouecdafrseyhp3.png" alt=" " width="537" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That means TOON saved only 1 token in this example.&lt;/p&gt;

&lt;p&gt;Honestly, that surprised me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So… Is TOON Useful?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I think the answer is: yes, but with nuance.&lt;/p&gt;

&lt;p&gt;TOON is not “bad.”&lt;br&gt;
In fact, I like the direction.&lt;/p&gt;

&lt;p&gt;It makes developers think seriously about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;token efficiency&lt;/li&gt;
&lt;li&gt;AI-friendly data formats&lt;/li&gt;
&lt;li&gt;prompt optimization&lt;/li&gt;
&lt;li&gt;serialization overhead&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are important conversations.&lt;/p&gt;

&lt;p&gt;But after testing it, I’m not convinced that TOON alone will dramatically reduce costs for most companies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In many cases&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;removing whitespace already gives huge savings&lt;/li&gt;
&lt;li&gt;gzip/brotli compression already works extremely well&lt;/li&gt;
&lt;li&gt;AI models are already heavily trained on JSON structures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the practical gains may be smaller than the hype suggests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But Small Savings Can Still Matter at Scale&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here’s the interesting part.&lt;/p&gt;

&lt;p&gt;Even tiny optimizations matter when companies process millions of requests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imagine&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;10 million API calls&lt;/li&gt;
&lt;li&gt;large AI prompts&lt;/li&gt;
&lt;li&gt;multiple agents&lt;/li&gt;
&lt;li&gt;long conversation histories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Saving even 1–2% of tokens at that scale could add up to hundreds or thousands of dollars over time.&lt;/p&gt;
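
&lt;p&gt;The arithmetic is easy to sketch. Every number below is an assumed placeholder (call volume, tokens per call, savings rate, and especially the price per million tokens), so treat it as a template rather than a forecast:&lt;/p&gt;

```javascript
// Back-of-envelope savings estimate. The price per million tokens is a
// placeholder assumption, not a quoted rate; plug in your model's pricing.
function estimateSavings(callsPerMonth, tokensPerCall, savingsRate, pricePerMillionTokens) {
  const tokensSaved = callsPerMonth * tokensPerCall * savingsRate;
  return (tokensSaved / 1e6) * pricePerMillionTokens;
}

// 10M calls/month, 2,000 prompt tokens each, 1.5% saved, $2.50 per 1M tokens (assumed)
const monthly = estimateSavings(10_000_000, 2000, 0.015, 2.5);
console.log(monthly.toFixed(2)); // prints "750.00" (dollars per month)
```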

&lt;p&gt;So I do understand why people are excited about TOON.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Bigger Challenge: Ecosystem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Personally, I think TOON’s biggest challenge is not token count.&lt;/p&gt;

&lt;p&gt;It’s ecosystem adoption.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JSON already has&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mature tooling&lt;/li&gt;
&lt;li&gt;validators&lt;/li&gt;
&lt;li&gt;parsers&lt;/li&gt;
&lt;li&gt;database support&lt;/li&gt;
&lt;li&gt;IDE integrations&lt;/li&gt;
&lt;li&gt;API ecosystem dominance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Replacing that is extremely difficult.&lt;/p&gt;

&lt;p&gt;In real production systems, compatibility usually matters more than tiny syntax improvements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Take&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After testing both formats, my conclusion is&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TOON introduces interesting ideas&lt;/li&gt;
&lt;li&gt;token savings appear modest in smaller examples&lt;/li&gt;
&lt;li&gt;the ecosystem challenge is massive&lt;/li&gt;
&lt;li&gt;but the conversation around AI-native serialization formats is valuable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I don’t think JSON is disappearing anytime soon.&lt;/p&gt;

&lt;p&gt;But I do think experiments like TOON push the industry forward.&lt;/p&gt;

&lt;p&gt;And honestly, that’s a good thing.&lt;/p&gt;

&lt;p&gt;I’d love to hear what other developers think, and any suggestions for what to compare next.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Replaced My Code Reviewer with AI — Here's the Exact Prompt Workflow That Catches 90% of Bugs</title>
      <dc:creator>Suifeng023</dc:creator>
      <pubDate>Tue, 12 May 2026 00:20:23 +0000</pubDate>
      <link>https://dev.to/suifeng023/i-replaced-my-code-reviewer-with-ai-heres-the-exact-prompt-workflow-that-catches-90-of-bugs-4ioa</link>
      <guid>https://dev.to/suifeng023/i-replaced-my-code-reviewer-with-ai-heres-the-exact-prompt-workflow-that-catches-90-of-bugs-4ioa</guid>
      <description>&lt;h1&gt;
  
  
  I Replaced My Code Reviewer with AI — Here's the Exact Prompt Workflow That Catches 90% of Bugs
&lt;/h1&gt;

&lt;p&gt;My senior colleague used to spend 4 hours a day reviewing pull requests. When he left the company, our bug rate doubled.&lt;/p&gt;

&lt;p&gt;Then I built an &lt;strong&gt;AI-powered code review pipeline&lt;/strong&gt; using Claude that catches bugs, security issues, and performance problems in under 5 minutes per PR.&lt;/p&gt;

&lt;p&gt;After 6 months and 400+ PRs reviewed, here's the complete system that actually works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Most AI Code Reviews Suck
&lt;/h2&gt;

&lt;p&gt;I've seen teams try "AI code review" and give up within a week. Here's what goes wrong:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ &lt;strong&gt;Too vague:&lt;/strong&gt; "Review this code" → gets generic "looks good" responses&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;No context:&lt;/strong&gt; AI doesn't know your coding standards, architecture, or business logic&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Reviewing everything:&lt;/strong&gt; AI flags style issues and misses actual bugs&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;No triage:&lt;/strong&gt; Everything looks equally important&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix? &lt;strong&gt;Give AI a specific role, context, and review checklist.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  My 5-Step AI Code Review System
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: The PR Summary Prompt
&lt;/h3&gt;

&lt;p&gt;Before reviewing code, have AI summarize what changed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;You are a senior software engineer reviewing a pull request.

&lt;span class="gu"&gt;## PR Information&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Title: {pr_title}
&lt;span class="p"&gt;-&lt;/span&gt; Description: {pr_description}
&lt;span class="p"&gt;-&lt;/span&gt; Files changed: {list_of_files}
&lt;span class="p"&gt;-&lt;/span&gt; Lines added: {lines_added}
&lt;span class="p"&gt;-&lt;/span&gt; Lines removed: {lines_removed}

&lt;span class="gu"&gt;## Diff&lt;/span&gt;
{git_diff}

Analyze this PR and provide:
&lt;span class="p"&gt;1.&lt;/span&gt; ONE SENTENCE summary of what this PR does
&lt;span class="p"&gt;2.&lt;/span&gt; List of files changed and WHY each was modified
&lt;span class="p"&gt;3.&lt;/span&gt; Any files that were modified but seem unrelated to the PR purpose
&lt;span class="p"&gt;4.&lt;/span&gt; A risk assessment (Low/Medium/High) with reasoning
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This alone catches 20% of problems&lt;/strong&gt; — unrelated changes, scope creep, and PRs that do more than they claim.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: The Bug Hunt
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Continuing with the same PR, now perform a thorough bug analysis.

Check for:
&lt;span class="p"&gt;1.&lt;/span&gt; &lt;span class="gs"&gt;**Logic errors**&lt;/span&gt; — off-by-one, wrong conditions, missing edge cases
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Null/undefined handling**&lt;/span&gt; — any place where a value could be null/undefined
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Race conditions**&lt;/span&gt; — concurrent access, async timing issues
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Resource leaks**&lt;/span&gt; — unclosed connections, missing cleanup, memory leaks
&lt;span class="p"&gt;5.&lt;/span&gt; &lt;span class="gs"&gt;**Error handling**&lt;/span&gt; — unhandled promise rejections, swallowed errors
&lt;span class="p"&gt;6.&lt;/span&gt; &lt;span class="gs"&gt;**Data integrity**&lt;/span&gt; — partial updates, inconsistent state, missing transactions

For each issue found:
&lt;span class="p"&gt;-&lt;/span&gt; File and line number
&lt;span class="p"&gt;-&lt;/span&gt; Severity: 🔴 Critical / 🟡 Warning / 🔵 Suggestion
&lt;span class="p"&gt;-&lt;/span&gt; What the bug is
&lt;span class="p"&gt;-&lt;/span&gt; Why it's a problem (real scenario)
&lt;span class="p"&gt;-&lt;/span&gt; Suggested fix (code snippet)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Security Review
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Now perform a security-focused review of this PR.

Check for:
&lt;span class="p"&gt;1.&lt;/span&gt; &lt;span class="gs"&gt;**Injection attacks**&lt;/span&gt; — SQL injection, XSS, command injection
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Authentication/Authorization**&lt;/span&gt; — missing auth checks, privilege escalation
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Data exposure**&lt;/span&gt; — sensitive data in logs, responses, or error messages
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Input validation**&lt;/span&gt; — missing validation, type coercion issues
&lt;span class="p"&gt;5.&lt;/span&gt; &lt;span class="gs"&gt;**Dependency risks**&lt;/span&gt; — new packages added, known vulnerabilities
&lt;span class="p"&gt;6.&lt;/span&gt; &lt;span class="gs"&gt;**Secrets**&lt;/span&gt; — hardcoded credentials, API keys, tokens
&lt;span class="p"&gt;7.&lt;/span&gt; &lt;span class="gs"&gt;**CORS/misconfiguration**&lt;/span&gt; — overly permissive headers, settings

Rate each finding: 🔴 Critical / 🟡 Warning / 🔵 Info
Provide specific remediation for each.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Performance Analysis
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Now review this PR for performance issues.

Check for:
&lt;span class="p"&gt;1.&lt;/span&gt; &lt;span class="gs"&gt;**N+1 queries**&lt;/span&gt; — database calls inside loops
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Missing indexes**&lt;/span&gt; — queries that would benefit from indexes
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Unnecessary re-renders**&lt;/span&gt; — React component optimization issues
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Memory inefficiency**&lt;/span&gt; — large arrays, unnecessary cloning, closure leaks
&lt;span class="p"&gt;5.&lt;/span&gt; &lt;span class="gs"&gt;**Blocking operations**&lt;/span&gt; — synchronous I/O, heavy computations on main thread
&lt;span class="p"&gt;6.&lt;/span&gt; &lt;span class="gs"&gt;**Pagination**&lt;/span&gt; — endpoints that load all records instead of paginating
&lt;span class="p"&gt;7.&lt;/span&gt; &lt;span class="gs"&gt;**Caching opportunities**&lt;/span&gt; — repeated identical computations or queries

For each issue:
&lt;span class="p"&gt;-&lt;/span&gt; Where it is (file:line)
&lt;span class="p"&gt;-&lt;/span&gt; Impact: 🟡 Moderate / 🔴 High
&lt;span class="p"&gt;-&lt;/span&gt; How to fix it (code example)
&lt;span class="p"&gt;-&lt;/span&gt; Estimated performance improvement
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: The Final Scorecard
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Based on all reviews above, generate a final scorecard:

&lt;span class="gu"&gt;## PR Scorecard&lt;/span&gt;

&lt;span class="gs"&gt;**Overall Assessment:**&lt;/span&gt; [Approve / Request Changes / Comment]

&lt;span class="gs"&gt;**Issues Summary:**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; 🔴 Critical: {count}
&lt;span class="p"&gt;-&lt;/span&gt; 🟡 Warnings: {count}
&lt;span class="p"&gt;-&lt;/span&gt; 🔵 Suggestions: {count}

&lt;span class="gs"&gt;**Strengths:**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [What the PR does well]

&lt;span class="gs"&gt;**Must Fix Before Merge:**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [Only critical/warning items]

&lt;span class="gs"&gt;**Nice to Have:**&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [Suggestions for future improvement]

&lt;span class="gs"&gt;**One-line review comment for the author:**&lt;/span&gt;
[Constructive, specific feedback]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Real Examples: Bugs AI Caught That Humans Missed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Example 1: The Silent Data Loss
&lt;/h3&gt;

&lt;p&gt;A developer submitted a PR to add bulk user deletion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// BEFORE AI review - looks fine at first glance&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;deleteUsers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE FROM users WHERE id = $1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI caught:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔴 Critical — Missing cascade delete. Users have related records in &lt;code&gt;orders&lt;/code&gt;, &lt;code&gt;sessions&lt;/code&gt;, and &lt;code&gt;audit_logs&lt;/code&gt; tables. This will either fail with foreign key violations or leave orphaned records depending on your DB constraints.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Fixed version AI suggested:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;deleteUsers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE FROM audit_logs WHERE user_id = ANY($1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE FROM sessions WHERE user_id = ANY($1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE FROM orders WHERE user_id = ANY($1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;rowCount&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DELETE FROM users WHERE id = ANY($1)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userIds&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;rowCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;deleted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 2: The Auth Bypass
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Middleware that "validates" admin access&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;requireAdmin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;admin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI caught:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔴 Critical — Missing &lt;code&gt;else&lt;/code&gt; clause. If user is not admin, the request hangs and eventually times out instead of returning 403. Also, no check for &lt;code&gt;req.user&lt;/code&gt; being undefined (unauthenticated requests pass through).&lt;/p&gt;
&lt;/blockquote&gt;
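
&lt;p&gt;The post doesn’t show the corrected middleware for this one, so here is a minimal sketch of the fix the review implies. The 401/403 status codes and error messages are my assumptions about the intended behavior, not code from the original PR:&lt;/p&gt;

```javascript
function requireAdmin(req, res, next) {
  // Reject unauthenticated requests instead of letting them fall through.
  if (!req.user) {
    return res.status(401).json({ error: "Authentication required" });
  }
  // Explicitly reject non-admins so the request never hangs waiting for a response.
  if (req.user.role !== "admin") {
    return res.status(403).json({ error: "Admin access required" });
  }
  next();
}
```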

&lt;h3&gt;
  
  
  Example 3: The $5,000/Month Query
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Dashboard endpoint that loads user analytics&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT * FROM users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dashboardData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT * FROM analytics WHERE user_id = $1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dashboardData&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI caught:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔴 High — Classic N+1 query. Loading ALL users then querying analytics for each one individually. With 10,000 users, this makes 10,001 database queries per dashboard load.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Fixed version:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dashboardData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`
    SELECT u.id, u.name, a.* 
    FROM users u
    JOIN analytics a ON a.user_id = u.id
    WHERE u.created_at &amp;gt; NOW() - INTERVAL '30 days'
  `&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dashboardData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How to Integrate This Into Your Workflow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Claude Desktop (No Setup)
&lt;/h3&gt;

&lt;p&gt;Copy-paste each step prompt into Claude with your git diff. Takes 5 minutes per PR.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 2: GitHub Actions (Automated)
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.github/workflows/ai-review.yml&lt;/code&gt; that triggers on PRs and posts review comments automatically.&lt;/p&gt;
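<p>A minimal skeleton of such a workflow might look like this. The <code>scripts/ai-review.sh</code> step and the secret name are placeholders you would supply yourself; only <code>actions/checkout</code> is a standard action:</p>

```yaml
# .github/workflows/ai-review.yml - minimal skeleton, review step is a placeholder
name: AI Review
on:
  pull_request:

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write   # needed to post review comments
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0     # full history so the base branch is available
      - name: Generate diff
        run: git diff origin/${{ github.base_ref }}...HEAD > pr.diff
      - name: Run AI review (placeholder script you implement)
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: ./scripts/ai-review.sh pr.diff
```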

&lt;h3&gt;
  
  
  Option 3: Git Hook (Local)
&lt;/h3&gt;

&lt;p&gt;Add a pre-push hook that runs AI review before allowing pushes.&lt;/p&gt;
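<p>Sketched as a hook, with <code>scripts/ai-review.sh</code> again standing in for whatever client you wire up (assumption: the script exits non-zero when it finds blocking issues):</p>

```shell
#!/bin/sh
# .git/hooks/pre-push - run the AI review before allowing a push.
diff_output=$(git diff origin/main...HEAD)
if [ -z "$diff_output" ]; then
  exit 0  # nothing to review
fi
echo "$diff_output" | ./scripts/ai-review.sh || {
  echo "AI review flagged issues; push aborted (use --no-verify to skip)."
  exit 1
}
```

<p>Make it executable with <code>chmod +x .git/hooks/pre-push</code>.</p>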




&lt;h2&gt;
  
  
  The Results After 6 Months
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before AI Review&lt;/th&gt;
&lt;th&gt;After AI Review&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bugs reaching production&lt;/td&gt;
&lt;td&gt;12-15/month&lt;/td&gt;
&lt;td&gt;2-3/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average review time&lt;/td&gt;
&lt;td&gt;4 hours&lt;/td&gt;
&lt;td&gt;8 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security vulnerabilities&lt;/td&gt;
&lt;td&gt;8 caught/quarter&lt;/td&gt;
&lt;td&gt;23 caught/quarter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code review coverage&lt;/td&gt;
&lt;td&gt;60% of PRs&lt;/td&gt;
&lt;td&gt;100% of PRs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The biggest win wasn't catching bugs — it was consistency.&lt;/strong&gt; Every PR gets the same thorough review, regardless of who submits it or how busy the team is.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tips for Getting the Best Results
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Include context&lt;/strong&gt; — The more AI knows about your project, the better it reviews&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start with Steps 1-2&lt;/strong&gt; — Add security and performance reviews once you trust the basics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customize checklists&lt;/strong&gt; — Add items specific to your stack (e.g., React hooks rules, Python type hints)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use AI as a first pass&lt;/strong&gt; — Still have humans review complex architectural changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feed it your style guide&lt;/strong&gt; — Include your coding standards in the system prompt&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;AI code review isn't about replacing developers — it's about giving every PR the attention of a senior engineer who has infinite time and never gets tired.&lt;/p&gt;

&lt;p&gt;The 5-step system above is the result of hundreds of iterations. Start with it, customize it for your team, and watch your bug rate plummet.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Found this useful? Check out my AI Prompt Packs:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://payhip.com/b/ADsQI" rel="noopener noreferrer"&gt;AI Developer Toolkit&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://payhip.com/b/6lqVh" rel="noopener noreferrer"&gt;AI Productivity Prompts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://payhip.com/b/XLNPm" rel="noopener noreferrer"&gt;AI Business Prompt Pack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://payhip.com/b/CAN9Z" rel="noopener noreferrer"&gt;AI Creative Writing Prompts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>codereview</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Five MCP Servers Before Claude Code Writes a Single Line</title>
      <dc:creator>Matthias | StudioMeyer</dc:creator>
      <pubDate>Tue, 12 May 2026 00:19:57 +0000</pubDate>
      <link>https://dev.to/studiomeyer_io/five-mcp-servers-before-claude-code-writes-a-single-line-18f8</link>
      <guid>https://dev.to/studiomeyer_io/five-mcp-servers-before-claude-code-writes-a-single-line-18f8</guid>
      <description>&lt;p&gt;Claude Code went from research preview to a meaningful share of all public GitHub commits surprisingly fast, per &lt;a href="https://mcp.directory/blog/claude-code-best-practices" rel="noopener noreferrer"&gt;Anthropic's own data and the broader best-practices roundup&lt;/a&gt;. Most of those commits shipped to production. A meaningful share rolled back soon after.&lt;/p&gt;

&lt;p&gt;The interesting question is not how the model writes the code. It is what happens in the early window before it starts. That window is where good Claude Code sessions and bad ones diverge.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cold-Start Problem
&lt;/h2&gt;

&lt;p&gt;A fresh Claude Code session has no idea what you decided earlier, what the codebase looks like, what the current state of any library you depend on actually is, or what mistakes you already made and ruled out. Without help, it rebuilds your reasoning from scratch every time. Usually wrong.&lt;/p&gt;

&lt;p&gt;Three failure modes show up almost immediately. The model invents class names that sound plausible but do not exist in the project. It cites API methods from versions of an SDK that got renamed two releases ago. It re-litigates decisions that were settled months earlier, because the rationale was never persisted anywhere the model could read.&lt;/p&gt;

&lt;p&gt;Each of these is fixable, but not by prompting harder. The fix is to give Claude Code the context it would have if it had been on the team for a while. The Model Context Protocol exists for exactly this. There is by now a &lt;a href="https://www.buildfastwithai.com/blogs/claude-mcp-setup-guide-2026" rel="noopener noreferrer"&gt;large public MCP server ecosystem&lt;/a&gt;, and the small subset that earns its place in a daily routine is what this post is about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Five-Step Stack
&lt;/h2&gt;

&lt;p&gt;The routine is short. It runs at the start of every session, before any code is written or any file is edited. Five steps, in this order.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Load Memory
&lt;/h3&gt;

&lt;p&gt;The first call is to a memory MCP server that carries context across sessions (we run &lt;a href="https://dev.to/en/services/memory"&gt;StudioMeyer Memory&lt;/a&gt; for this layer). Recent sprint, open decisions, recent learnings, why a particular technical choice was made earlier, and the failure modes the team already hit. Memory is what turns a session from a cold start into a warm one.&lt;/p&gt;

&lt;p&gt;Without it, every conversation begins with the model trying to reconstruct your reasoning from the file tree and a few sentences in CLAUDE.md. With it, the model walks in already knowing that you tried Postgres pooling, that the answer was raw &lt;code&gt;pg&lt;/code&gt; instead of Prisma in the agent layer, and that you had a cross-tenant leak in April that informs the way the schema is shaped today.&lt;/p&gt;

&lt;p&gt;The point is not "the model remembers everything." It is that the team's accumulated decisions become available to the model as background, the way they are available to a senior engineer on day one of week twenty.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Index the Codebase as a Graph
&lt;/h3&gt;

&lt;p&gt;The second call is to a codebase memory server. &lt;a href="https://github.com/DeusData/codebase-memory-mcp" rel="noopener noreferrer"&gt;codebase-memory-mcp&lt;/a&gt;, for example, indexes a repository into a queryable knowledge graph quickly, supports a wide range of languages, and answers structural questions with very low latency and a small fraction of the token cost compared to grep-and-read cycles (&lt;a href="https://github.com/DeusData/codebase-memory-mcp#benchmarks" rel="noopener noreferrer"&gt;per the maintainer's benchmarks&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;What this changes day-to-day is enormous. When the model needs to know what calls &lt;code&gt;processOrder&lt;/code&gt;, it queries the graph and gets back a list with line numbers. Without the graph, it greps blind, reads files, follows imports, and burns large amounts of tokens to arrive at the same answer. Multiply by many such questions per session and the difference between "agent that can reason about a large codebase" and "agent that can only reason about a handful of files at a time" is exactly this server.&lt;/p&gt;
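&lt;p&gt;To make the contrast concrete, the structural question can be sketched in plain Python. This is only an illustration, not the server's actual API: the standard &lt;code&gt;ast&lt;/code&gt; module answering "what calls &lt;code&gt;process_order&lt;/code&gt;" for a single file, which is the kind of query an indexed graph answers across a whole repository in one call.&lt;/p&gt;

```python
import ast

def find_callers(source: str, target: str):
    """Return (function_name, line) pairs for functions that call `target`."""
    tree = ast.parse(source)
    callers = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Look inside this function body for a direct call to `target`.
            for sub in ast.walk(node):
                if (isinstance(sub, ast.Call)
                        and isinstance(sub.func, ast.Name)
                        and sub.func.id == target):
                    callers.append((node.name, sub.lineno))
                    break
    return callers
```

&lt;p&gt;A real codebase graph resolves imports, methods, and cross-file references; the point of the sketch is that an indexed answer ("these functions, these line numbers") replaces an open-ended grep-read-follow cycle.&lt;/p&gt;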

&lt;h3&gt;
  
  
  3. Search the Present, Not the Training Set
&lt;/h3&gt;

&lt;p&gt;The third call is to a web search MCP server such as &lt;a href="https://code.claude.com/docs/en/mcp" rel="noopener noreferrer"&gt;Tavily, Brave Search, or Anthropic web search&lt;/a&gt;. The point is not to replace the model's knowledge. It is to replace the model's stale knowledge with what people are actually doing right now, before a non-trivial decision is made.&lt;/p&gt;

&lt;p&gt;Training data ages, sometimes badly. Best practices from a while back are often still good, but sometimes they are quietly dead. A short search before a real decision gets a clean answer with sources, instead of a confident reconstruction of older consensus.&lt;/p&gt;

&lt;p&gt;Tavily-style retrieval works particularly well here because it filters out SEO noise and returns the few results that actually contain the answer. The cost is small, the upside is a model that does not commit to a deprecated pattern in front of a code reviewer.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Load Context7 for Library Docs
&lt;/h3&gt;

&lt;p&gt;The fourth call is to &lt;a href="https://www.speakeasy.com/mcp/using-mcp/use-cases" rel="noopener noreferrer"&gt;Context7&lt;/a&gt;, which fetches current documentation for whatever library is about to be touched. The Anthropic SDK, Next.js, Prisma, Tailwind, the AWS SDK, whatever the next bit of work involves.&lt;/p&gt;

&lt;p&gt;The training cutoff is the single largest source of plausible-looking-but-broken code that Claude Code generates. The model cheerfully invents API methods that got renamed two versions ago, calls hooks that were deprecated in a minor release, and forgets that a config option flipped its default in the latest patch. Loading the actual current docs ended that entire category of bug for production workflows months ago.&lt;/p&gt;

&lt;p&gt;Context7 is consistently cited as one of the most-used MCP servers in development setups in 2026, for exactly this reason.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Write Code
&lt;/h3&gt;

&lt;p&gt;By the time the model starts writing, it has memory, codebase structure, current ecosystem context, and accurate library docs. The output reads differently. Less "let me try this and see if it compiles," more "based on the call graph and the v5 docs, the change goes here, and the four callers in &lt;code&gt;src/orders&lt;/code&gt; need this updated."&lt;/p&gt;

&lt;p&gt;The short window at the start pays back many times over across the session. Sessions that skip the routine spend much more time cleaning up edits that were made blind.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hooks Layer
&lt;/h2&gt;

&lt;p&gt;MCP servers feed the model context. Hooks enforce behavior. The distinction matters because hooks run outside the agent loop and are deterministic, which means they fire even when the model would rather not.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blakecrosley.com/guides/claude-code" rel="noopener noreferrer"&gt;Blake Crosley's complete CLI guide&lt;/a&gt;, reflecting recent Claude Code releases, puts it cleanly: "Hooks guarantee execution of shell commands regardless of model behavior. Unlike CLAUDE.md instructions which are advisory, hooks are deterministic and guarantee the action." That is the whole reason hooks matter.&lt;/p&gt;

&lt;p&gt;Three hooks earn their place in the daily routine.&lt;/p&gt;

&lt;p&gt;The first is a read-before-edit guard. It refuses any edit on a file that the current session has not actually read first. The model has to load the file properly instead of guessing what is in it. The objection is always the same: "that costs extra tokens up front." The token cost of reading the file is trivial compared to the token cost of cleaning up an edit that broke three callers because the model guessed at the function signature. This hook came out of the &lt;a href="https://github.com/anthropics/claude-code/issues/42796" rel="noopener noreferrer"&gt;adaptive-thinking regression documented in &lt;code&gt;anthropics/claude-code&lt;/code&gt; issue #42796&lt;/a&gt;, where blind-edit rates climbed from 6.2% to 33.7% after Anthropic changed a default. The fix at the user level was a deterministic gate. We covered the user-side workaround for a related Codex regression in our &lt;a href="https://dev.to/en/blog/codex-memory-mcp-fix"&gt;codex memory MCP fix post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The second is a safety guard for destructive commands. Anything resembling &lt;code&gt;rm -rf&lt;/code&gt;, &lt;code&gt;git push --force&lt;/code&gt; to a protected branch, &lt;code&gt;prisma db push --force-reset&lt;/code&gt;, &lt;code&gt;DROP DATABASE&lt;/code&gt;, the usual list. The model occasionally suggests one of these in moments of confusion. The hook stops it before it runs.&lt;/p&gt;
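&lt;p&gt;The core of such a guard is small. The sketch below is an illustrative check, not Anthropic's implementation: a pure function that a hook script could run on the proposed shell command before deciding to block. The pattern list is a placeholder you would extend for your own stack, and it is deliberately broad, since over-blocking is fine for a guardrail.&lt;/p&gt;

```python
import re

# Deliberately broad patterns; over-blocking is acceptable for a guardrail.
BLOCKED_PATTERNS = [
    r"\brm\s+-rf\b",             # recursive force delete
    r"\bgit\s+push\b.*--force",  # force push (also catches --force-with-lease)
    r"--force-reset\b",          # prisma db push --force-reset
    r"\bdrop\s+database\b",      # DROP DATABASE, any casing
]

def is_destructive(command: str) -> bool:
    """True if a shell command matches a known-destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in BLOCKED_PATTERNS)
```

&lt;p&gt;In an actual Claude Code hook, a small wrapper reads the tool-call JSON from stdin, passes the command string to &lt;code&gt;is_destructive&lt;/code&gt;, and exits with the blocking status code when it matches; check the current hooks documentation for the exact stdin schema and exit-code contract, which has changed across releases.&lt;/p&gt;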

&lt;p&gt;The third is a re-index hook that fires after edits. It refreshes the codebase knowledge graph so that the next query reflects what is actually in the repo, not what it was at the start of the session. Stale graphs are a quiet failure mode, the kind that produces "the function I'm looking for does not exist" hallucinations even when the function was just created two minutes earlier.&lt;/p&gt;

&lt;p&gt;None of these hooks are clever. They are deterministic guardrails for the predictable failure modes of a generative system. That is why they hold up in production.&lt;/p&gt;
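&lt;p&gt;For orientation, hooks are wired up in Claude Code's settings file. The fragment below follows the documented hooks schema as of recent releases; the script paths are placeholders, and the event names and matcher syntax are worth re-checking against the current docs before copying.&lt;/p&gt;

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "python ~/.claude/hooks/bash_guard.py" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "python ~/.claude/hooks/reindex_graph.py" }
        ]
      }
    ]
  }
}
```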

&lt;h2&gt;
  
  
  Closing the Loop
&lt;/h2&gt;

&lt;p&gt;Whatever works in a session goes back into memory. Decisions get persisted as decisions. Patterns that proved themselves get stored as learnings, with confidence scores. Mistakes get logged with enough context that the next session avoids them. The next session starts with all of that already loaded.&lt;/p&gt;

&lt;p&gt;This is the part that compounds. The MCP servers and hooks are not a one-time setup, they are the substrate on which the team's accumulated knowledge becomes operational. The system gets sharper every week, not because the model changed, but because the context around it keeps growing in quality.&lt;/p&gt;

&lt;p&gt;Recent industry surveys consistently report that the vast majority of developers still review AI-generated code before committing. The closing-loop pattern is what makes that review faster, because the model's suggestions get progressively more aligned with how the team actually builds. The first sessions with a memory server are unremarkable. It is only after sustained use that the gap between teams that close the loop and teams that do not becomes obvious.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Replaces, What It Does Not
&lt;/h2&gt;

&lt;p&gt;The pre-coding routine replaces a surprising amount of bespoke tooling. The internal "knowledge base" Confluence page that nobody reads. The Slack channel where past decisions go to die. The grep cycles to find a function definition. The Stack Overflow searches for an API method that may or may not still exist. The CLAUDE.md file that grew to two thousand lines because every regression added a new "remember not to do this" paragraph.&lt;/p&gt;

&lt;p&gt;It does not replace human review of generated code. It does not replace tests, type checks, or production monitoring. It does not turn Claude Code into a senior engineer. What it does is move the model from "junior dev with amnesia" to "informed contributor with access to the team's working memory." That is enough to ship serious work, not enough to skip the review.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Pattern
&lt;/h2&gt;

&lt;p&gt;The biggest shift after a few months of running this routine is in the framing. The model stops being the source of knowledge. The model becomes the orchestrator. The MCP servers and hooks are the system.&lt;/p&gt;

&lt;p&gt;Memory remembers. The graph knows the code. Search knows the present. Context7 knows the docs. Hooks keep the model honest. The model connects them.&lt;/p&gt;

&lt;p&gt;This is the same architectural pattern that Anthropic engineers describe when they talk about Claude Code as &lt;a href="https://www.penligent.ai/hackinglabs/inside-claude-code-the-architecture-behind-tools-memory-hooks-and-mcp/" rel="noopener noreferrer"&gt;"an agentic CLI that reads your codebase, executes commands, and modifies files through a layered system of permissions, hooks, MCP integrations, and subagents"&lt;/a&gt;. The model in the middle is one component. The interesting engineering work is everything around it.&lt;/p&gt;

&lt;p&gt;For teams that are still running Claude Code with no MCP servers and no hooks, the upgrade path is short. Start with one memory server, one codebase graph, and the read-before-edit hook. The first session after that change is when the rest of the routine becomes obvious.&lt;/p&gt;

&lt;p&gt;The pre-coding routine is short. The compound interest on that brief preamble is what makes the difference, over time, between a model that ships and a model that hallucinates.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://studiomeyer.io/en/blog/pre-coding-routine-claude-code-five-mcp-servers" rel="noopener noreferrer"&gt;studiomeyer.io&lt;/a&gt;. StudioMeyer is an AI-first digital studio building premium websites and intelligent automation for businesses.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>hooks</category>
      <category>aiworkflow</category>
    </item>
    <item>
      <title>I batch-processed 20 meeting minutes with Power Automate + LDX hub. It took 2 days and 8 HTTP actions.</title>
      <dc:creator>Kozo-KI</dc:creator>
      <pubDate>Tue, 12 May 2026 00:19:44 +0000</pubDate>
      <link>https://dev.to/kozo-ki/i-batch-processed-20-meeting-minutes-with-power-automate-ldx-hub-it-took-2-days-and-8-http-4ch5</link>
      <guid>https://dev.to/kozo-ki/i-batch-processed-20-meeting-minutes-with-power-automate-ldx-hub-it-took-2-days-and-8-http-4ch5</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 4 of a series documenting a non-engineer CEO's attempts to connect Copilot Studio and Power Automate to LDX hub's StructFlow API.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://dev.to/kozo-ki/a-non-engineer-ceo-tried-to-connect-copilot-studio-to-ldx-hub-heres-what-happened-356k"&gt;Part 1&lt;/a&gt; — It didn't work yet. &lt;a href="https://dev.to/kozo-ki/a-non-engineer-ceo-connected-copilot-studio-to-ldx-hub-it-took-8-errors-and-3-hours-heres-the-5cdh"&gt;Part 2&lt;/a&gt; — REST API via Power Automate, finally working. &lt;a href="https://dev.to/kozo-ki/part-3-i-skipped-power-automate-and-connected-ldx-hub-directly-via-mcp-it-worked-in-2-hours-gdg"&gt;Part 3&lt;/a&gt; — MCP direct connection, 2 hours.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In Part 3, I connected LDX hub directly to Copilot Studio via MCP. One record at a time, in a chat interface. It worked great.&lt;/p&gt;

&lt;p&gt;But then I asked the obvious question: what about 20 files? Batch processing 20 Word documents from SharePoint, extracting structured data from each, and synthesizing them into a single company-wide dashboard?&lt;/p&gt;

&lt;p&gt;That's not a job for MCP. That's a job for Power Automate.&lt;/p&gt;

&lt;p&gt;This is the story of building that pipeline — every error, every detour, and the moment it finally worked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I built:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Microsoft Power Automate flow&lt;/li&gt;
&lt;li&gt;20 Word files in SharePoint&lt;/li&gt;
&lt;li&gt;LDX hub ExtractDoc + StructFlow (REST API, not MCP)&lt;/li&gt;
&lt;li&gt;Output: HTML management dashboard saved to SharePoint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Time required: ~2 days&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SharePoint (20 Word files)
  ↓ Get files (properties only)
  ↓ Initialize array variable: results[]
  ↓ Apply to each file:
    ├─ Get file content (by path)
    ├─ POST /uploads → file_id (upload session)
    ├─ PUT /uploads/{file_id} → upload binary (base64)
    ├─ POST /extractdoc/jobs → job_id
    ├─ Do until status = completed (poll GET /extractdoc/jobs/{job_id})
    ├─ GET /files/{output_file_id}/content → extracted text
    ├─ POST /structflow/jobs → job_id
    └─ Do until status = completed (poll GET /structflow/jobs/{job_id})
        → append body to results[]
  ↓ POST /structflow/jobs (cross-dept analysis)
  ↓ Do until status = completed
  ↓ Compose HTML dashboard
  ↓ Create file in SharePoint
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;8 HTTP actions per file. 20 files. Sequential processing.&lt;/p&gt;
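&lt;p&gt;As a language-neutral reference for the fiddly parts of that loop, here is a minimal Python sketch of the two pieces every step repeats: the base64 upload body and the Do-until polling. The HTTP layer is left abstract (&lt;code&gt;get_job&lt;/code&gt; is any callable returning the job JSON as a dict); this is not LDX hub's official client, just a sketch of the flow in the diagram above.&lt;/p&gt;

```python
import base64
import time

def upload_body(file_bytes: bytes) -> dict:
    """JSON body for PUT /uploads/{file_id}: base64 of the raw file content."""
    return {"data": base64.b64encode(file_bytes).decode("ascii")}

def poll_until_complete(get_job, interval: float = 5.0, max_polls: int = 120) -> dict:
    """The 'Do until status = completed' loop.

    `get_job` is any zero-argument callable returning the job JSON,
    e.g. lambda: session.get(f"{BASE}/extractdoc/jobs/{job_id}").json()
    """
    for _ in range(max_polls):
        job = get_job()
        if job.get("status") == "completed":
            return job
        if job.get("status") == "failed":
            raise RuntimeError(f"job failed: {job}")
        time.sleep(interval)
    raise TimeoutError("job did not reach 'completed' within the polling budget")
```

&lt;p&gt;Each file needs two such polls (the ExtractDoc job and the StructFlow job), and the cross-department analysis at the end needs a third.&lt;/p&gt;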


&lt;h2&gt;
  
  
  The errors, in order
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Error 1: Wrong upload endpoint
&lt;/h3&gt;

&lt;p&gt;I started with &lt;code&gt;POST /api/v1/uploads&lt;/code&gt;. Got 404.&lt;/p&gt;

&lt;p&gt;The correct endpoint (without the &lt;code&gt;/api/v1&lt;/code&gt; prefix) is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST https://gw.ldxhub.io/uploads
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lesson: check the API docs directly. The base URL doesn't always include a version prefix.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error 2: File content — multipart/form-data nightmare
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;POST /files&lt;/code&gt; requires &lt;code&gt;multipart/form-data&lt;/code&gt;. Power Automate's HTTP connector doesn't handle this cleanly.&lt;/p&gt;

&lt;p&gt;The workaround: use the &lt;strong&gt;chunk upload flow&lt;/strong&gt; instead.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;POST /uploads&lt;/code&gt; — creates an upload session, returns &lt;code&gt;file_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PUT /uploads/{file_id}&lt;/code&gt; — sends the file content as base64 JSON
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@{base64(body('パスによるファイル_コンテンツの取得'))}"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the JSON-based chunk upload designed for MCP clients, but it works perfectly from Power Automate too. (The Japanese action name in the expression, パスによるファイル_コンテンツの取得, is simply the locale-specific display name of the standard "Get file content by path" action; use whatever name your designer shows.)&lt;/p&gt;

&lt;h3&gt;
  
  
  Error 3: File not found (SharePoint path)
&lt;/h3&gt;

&lt;p&gt;Getting file content by ID didn't work. The fix: use &lt;strong&gt;"Get file content by path"&lt;/strong&gt; instead of "Get file content".&lt;/p&gt;

&lt;p&gt;The correct path format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;concat('/Shared Documents/General/LDXhubtest/', items('それぞれに適用する')?['{FilenameWithExtension}'])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The field name is &lt;code&gt;{FilenameWithExtension}&lt;/code&gt; (with curly braces), found by inspecting the raw output of the "Get files" action. (それぞれに適用する is the Japanese display name of the "Apply to each" loop.)&lt;/p&gt;

&lt;h3&gt;
  
  
  Error 4: ExtractDoc engine name
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;"engine": "docx"&lt;/code&gt; returned an error. The correct engine ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"engine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ki/extract"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check available engines with &lt;code&gt;GET /extractdoc/engines&lt;/code&gt; first.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error 5: Do until condition syntax
&lt;/h3&gt;

&lt;p&gt;Power Automate's new designer is strict about condition expressions. This fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@{body('HTTP_3')?['status']}  equals  completed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works (in advanced mode):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@equals(body('HTTP_3')?['status'],'completed')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Error 6: ExtractDoc doesn't return text directly
&lt;/h3&gt;

&lt;p&gt;I assumed ExtractDoc would return the extracted text in the response body. It doesn't.&lt;/p&gt;

&lt;p&gt;The response contains &lt;code&gt;output_file_id&lt;/code&gt;. You then need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /files/{output_file_id}/content
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to download the actual text. This requires an extra HTTP action between ExtractDoc polling and StructFlow job creation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error 7: Array variable append — null value
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;AppendToArrayVariable&lt;/code&gt; with &lt;code&gt;body('HTTP_5')?['results']&lt;/code&gt; returned a null error.&lt;/p&gt;

&lt;p&gt;Fix: append &lt;code&gt;body('HTTP_5')&lt;/code&gt; (the entire response), not just the results field.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error 8: Cross-scope reference error
&lt;/h3&gt;

&lt;p&gt;When I tried to reference loop-scoped actions from outside the loop (for the cross-department analysis step), Power Automate threw:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The action 'HTTP_5' is nested in a foreach scope of multiple levels. 
Referencing repetition actions from outside the scope is not supported.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The solution: accumulate everything into the &lt;code&gt;results&lt;/code&gt; array variable &lt;em&gt;inside&lt;/em&gt; the loop, then pass &lt;code&gt;variables('results')&lt;/code&gt; to the final analysis step outside the loop.&lt;/p&gt;
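&lt;p&gt;The pattern is the same in any language: copy whatever the post-loop step needs into a variable inside the loop, so nothing outside the loop references a loop-scoped action. A schematic Python version, where &lt;code&gt;process_file&lt;/code&gt; is a hypothetical stand-in for the eight HTTP actions per file:&lt;/p&gt;

```python
def process_file(name: str) -> dict:
    # Stand-in for the 8 HTTP actions (upload, extract, poll, structflow).
    return {"file": name, "status": "completed"}

def run_batch(files: list) -> dict:
    results = []                          # Initialize array variable
    for f in files:                       # Apply to each
        results.append(process_file(f))   # Append INSIDE the loop
    # Only `results` crosses the loop boundary; no loop-scoped
    # action (the equivalent of HTTP_5) is referenced out here.
    return {"inputs": results}            # Input for the final analysis step
```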




&lt;h2&gt;
  
  
  The working flow — key settings
&lt;/h2&gt;

&lt;h3&gt;
  
  
  File upload (HTTP)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;URI:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;https://gw.ldxhub.io/uploads&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Method:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Headers:&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;Content-Type:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;application/json&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;Authorization:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bearer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Body:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"filename"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@{items('それぞれに適用する')?['{FilenameWithExtension}']}"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  File content upload (HTTP 1)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;URI:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;https://gw.ldxhub.io/uploads/@&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;body('HTTP')?&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;'file_id'&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Method:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Body:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@{base64(body('パスによるファイル_コンテンツの取得'))}"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ExtractDoc job (HTTP 2)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;URI:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;https://gw.ldxhub.io/extractdoc/jobs&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Method:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Body:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"engine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ki/extract"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"file_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@{body('HTTP')?['file_id']}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"output_format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Download extracted text (HTTP 8, after polling)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;URI: https://gw.ldxhub.io/files/@{body('HTTP_3')?['output_file_id']}/content
Method: GET
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  StructFlow job (HTTP 4)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"anthropic/claude-sonnet-4-6"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"system_prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"以下の会議議事録から構造化データを抽出してください..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"example_output"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inputs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"minutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@{body('HTTP_8')}"&lt;/span&gt;&lt;span class="p"&gt;}}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The result
&lt;/h2&gt;

&lt;p&gt;After 2 days of iteration:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Departments processed&lt;/td&gt;
&lt;td&gt;20 / 20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;StructFlow jobs completed&lt;/td&gt;
&lt;td&gt;20 / 20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total tasks extracted&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High-severity risks identified&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-department dependency entries&lt;/td&gt;
&lt;td&gt;60+&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The HTML dashboard shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Company-wide task list (all 100, with assignee, deadline, related dept)&lt;/li&gt;
&lt;li&gt;Risk cards by severity (color-coded)&lt;/li&gt;
&lt;li&gt;Cross-department dependency map&lt;/li&gt;
&lt;li&gt;Per-department summary cards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key insight on architecture:&lt;/strong&gt; LDX hub handles all the intelligence — text extraction (ExtractDoc) and structured data generation (StructFlow). The HTML template I wrote just renders the JSON. The processing engine and presentation layer are fully separated.&lt;/p&gt;




&lt;h2&gt;
  
  
  MCP vs REST API — the actual comparison
&lt;/h2&gt;

&lt;p&gt;Now that I've done both, here's the honest breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;MCP (Part 3)&lt;/th&gt;
&lt;th&gt;REST API — Power Automate (Part 4)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Setup time&lt;/td&gt;
&lt;td&gt;~2 hours&lt;/td&gt;
&lt;td&gt;~2 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Errors&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;8+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Single record, interactive&lt;/td&gt;
&lt;td&gt;Batch processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20-file batch&lt;/td&gt;
&lt;td&gt;❌ Not practical&lt;/td&gt;
&lt;td&gt;✅ Right tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Polling complexity&lt;/td&gt;
&lt;td&gt;Handled by agent&lt;/td&gt;
&lt;td&gt;Manual Do until loops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File upload&lt;/td&gt;
&lt;td&gt;Via MCP chunk API&lt;/td&gt;
&lt;td&gt;Via REST chunk upload&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;MCP wins on simplicity for conversational use cases. REST API wins for scheduled batch jobs.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Test with 1 file before 20.&lt;/strong&gt; I wasted hours debugging a flow that reprocessed all 20 files on every test run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check the API docs before assuming endpoint paths.&lt;/strong&gt; The &lt;code&gt;/api/v1/&lt;/code&gt; prefix doesn't exist on all endpoints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify Do until conditions in advanced mode.&lt;/strong&gt; The GUI condition builder generates subtly wrong expressions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add error handling.&lt;/strong&gt; The current flow times out silently if an API call fails mid-loop.&lt;/li&gt;
&lt;/ol&gt;
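
&lt;p&gt;Points 3 and 4 share a root cause: the polling loop. As a reference, here is a minimal Python sketch of what a correct Do until equivalent looks like, with an explicit timeout and a hard failure on error statuses (the &lt;code&gt;get_status&lt;/code&gt; callable stands in for the real REST call; all names here are illustrative, not the LDX hub API):&lt;/p&gt;

```python
import time

def poll_job(get_status, job_id, interval_s=5.0, timeout_s=300.0):
    """Poll a job-status endpoint until the job completes.

    get_status stands in for the real REST call (illustrative only).
    Raises on failure or timeout instead of expiring silently.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(job_id)
        if status == "completed":
            return status
        if status in ("failed", "error"):
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_s}s")
        time.sleep(interval_s)
```

&lt;p&gt;The two exits are exactly what the GUI condition builder makes easy to get wrong: a terminal-failure branch and a timeout that surfaces an error instead of quietly running out.&lt;/p&gt;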




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Phase 2: A quality comparison between two approaches to dashboard generation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured data route&lt;/strong&gt;: StructFlow extracts JSON → HTML renders JSON (what we built)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unstructured data route&lt;/strong&gt;: raw meeting text passed directly to an LLM → HTML rendered from prose output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The hypothesis: structured data produces more consistent, queryable, and accurate dashboards. But how much better, exactly? And at what cost difference? That's the next experiment.&lt;/p&gt;




&lt;p&gt;Kawamura International is a translation and localization company documenting its AI process experiments in public. StructFlow, RefineLoop, RenderOCR — and whatever comes next.&lt;/p&gt;

</description>
      <category>powerautomate</category>
      <category>ldxhub</category>
      <category>ai</category>
      <category>microsoft</category>
    </item>
    <item>
      <title>We Benchmarked SupportSage Against Traditional Supports: Here's the Data</title>
      <dc:creator>keeper</dc:creator>
      <pubDate>Tue, 12 May 2026 00:19:14 +0000</pubDate>
      <link>https://dev.to/lanternproton/we-benchmarked-supportsage-against-traditional-supports-heres-the-data-52p2</link>
      <guid>https://dev.to/lanternproton/we-benchmarked-supportsage-against-traditional-supports-heres-the-data-52p2</guid>
      <description>&lt;p&gt;I've been getting one question since releasing SupportSage: "Okay, but how much does it &lt;em&gt;actually&lt;/em&gt; save?"&lt;/p&gt;

&lt;p&gt;Fair enough. Talk is cheap. Let's run the numbers.&lt;/p&gt;

&lt;p&gt;I built three benchmark STL models that represent realistic support challenges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Multi-bridge&lt;/strong&gt; — three pillars at different heights connected by horizontal spans&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cantilever platform&lt;/strong&gt; — a single column supporting a wide flat roof with an angled support ring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-level scaffold&lt;/strong&gt; — four offset platforms at different heights, each with its own overhang pattern&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then I ran each through two scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Traditional uniform support&lt;/strong&gt; (what Cura/PrusaSlicer default to): full-density support under every overhang face&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SupportSage balanced strategy&lt;/strong&gt;: per-island severity grading + tree support with branch merging&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Faces&lt;/th&gt;
&lt;th&gt;Islands&lt;/th&gt;
&lt;th&gt;Traditional&lt;/th&gt;
&lt;th&gt;SupportSage&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Multi-bridge&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;6,317mm³&lt;/td&gt;
&lt;td&gt;4,211mm³&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;33%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cantilever&lt;/td&gt;
&lt;td&gt;164&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;18,440mm³&lt;/td&gt;
&lt;td&gt;12,293mm³&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;33%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaffold&lt;/td&gt;
&lt;td&gt;252&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt;11,194mm³&lt;/td&gt;
&lt;td&gt;7,463mm³&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;33%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;488&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;31&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;35,951mm³&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23,967mm³&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;33%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The savings are remarkably consistent at 33% across all three models. Here's why.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why 33%?
&lt;/h2&gt;

&lt;p&gt;The number isn't random. It comes from the fundamental insight of the algorithm:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional approach&lt;/strong&gt;: "Is this face &amp;gt;45° from vertical? Fill everything beneath with support."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SupportSage approach&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"This face is at 130° — critical, needs dense support." (saves 0-15%)&lt;/li&gt;
&lt;li&gt;"This face is at 80° — moderate, tree support will do." (saves 35-45%)
&lt;/li&gt;
&lt;li&gt;"This face is at 50° — borderline, just a light touch." (saves 50-65%)&lt;/li&gt;
&lt;li&gt;"These 10 faces are all connected — that's one island." (no waste between islands)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you average across a model with mixed geometry, the blend naturally converges to ~33%.&lt;/p&gt;
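
&lt;p&gt;As a rough Python sketch (thresholds and savings midpoints are my own, chosen to mirror the bullets above, not SupportSage's actual values), the grading and the blended savings look like this:&lt;/p&gt;

```python
def grade_face(angle_deg):
    """Map an overhang face angle (degrees from vertical) to a strategy.

    Thresholds are illustrative, mirroring the examples above; the real
    SupportSage grading is more involved.
    """
    if angle_deg >= 110:
        return "dense_interface"   # critical: saves ~0-15%
    if angle_deg >= 60:
        return "tree_organic"      # moderate: saves ~35-45%
    return "light_touch"           # borderline: saves ~50-65%

# Midpoint savings per strategy; an even mix of face types averages out
# to roughly a third, which is where the consistent ~33% comes from.
SAVINGS_MID = {"dense_interface": 0.075, "tree_organic": 0.40, "light_touch": 0.575}
blend = sum(SAVINGS_MID[grade_face(a)] for a in (130, 80, 50)) / 3
```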

&lt;h2&gt;
  
  
  The Island Effect
&lt;/h2&gt;

&lt;p&gt;The multi-level scaffold is the most interesting case. It has &lt;strong&gt;21 separate overhang islands&lt;/strong&gt; — far more than the other models. Yet the savings are identical.&lt;/p&gt;

&lt;p&gt;Why? Because each island gets precisely the support it needs, not the support the worst face on the model needs. A small overhang at the edge of a platform doesn't trigger a support wall running across the entire span.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Per-island strategy (pseudocode)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;island&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;islands&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;island&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has_critical_faces&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dense_interface&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# 0-15% savings
&lt;/span&gt;    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;island&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has_moderate_faces&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tree_organic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;     &lt;span class="c1"&gt;# 35-45% savings
&lt;/span&gt;    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;light_touch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;      &lt;span class="c1"&gt;# 50-65% savings
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More islands = more opportunities to apply the light strategy = same proportional savings.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means in Practice
&lt;/h2&gt;

&lt;p&gt;For a typical hobbyist printing one spool of PLA per month (1kg, ~$20-25):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Per Month&lt;/th&gt;
&lt;th&gt;Per Year&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Support waste (traditional)&lt;/td&gt;
&lt;td&gt;~350g&lt;/td&gt;
&lt;td&gt;~4.2kg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Support waste (SupportSage)&lt;/td&gt;
&lt;td&gt;~235g&lt;/td&gt;
&lt;td&gt;~2.8kg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Material saved&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~115g&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1.4kg&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost saved&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$2.50&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$30&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trash reduced&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;33% less&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;33% less&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For a print farm running 10 printers 24/7, the savings scale linearly: a round-the-clock printer goes through roughly ten times a hobbyist's filament, so ~14kg saved per printer per year = 140kg for the farm = ~$3,000/year.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Honest Part
&lt;/h2&gt;

&lt;p&gt;The current algorithm achieves consistent 33% savings because it doesn't make radical changes. It just &lt;strong&gt;stops printing support where the model doesn't need it.&lt;/strong&gt; This is the low-hanging fruit: it took a weekend to code and catches the most egregious waste.&lt;/p&gt;

&lt;p&gt;The next iteration (tree support with AI-optimized branching) targets 50%+ savings by thinning support where the structural load allows it. That's the hard part, and it's what I'm working on now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The tool is open source and installs in one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;https://github.com/bossman-lab/supportsage/releases/download/v0.1.0/supportsage-0.1.0-py3-none-any.whl

&lt;span class="c"&gt;# Analyze your own model&lt;/span&gt;
supportsage analyze your_model.stl

&lt;span class="c"&gt;# Generate optimized tree supports  &lt;/span&gt;
supportsage tree your_model.stl &lt;span class="nt"&gt;-o&lt;/span&gt; optimized.stl &lt;span class="nt"&gt;--strategy&lt;/span&gt; balanced
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or clone and contribute: &lt;a href="https://github.com/bossman-lab/supportsage" rel="noopener noreferrer"&gt;github.com/bossman-lab/supportsage&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What's your current support-waste number? I'd love to benchmark SupportSage on the models you're actually printing.&lt;/p&gt;

</description>
      <category>3dprinting</category>
      <category>python</category>
      <category>datascience</category>
      <category>benchmarking</category>
    </item>
    <item>
      <title>Bicep Diagram Generator — Visualize Azure Bicep &amp; ARM Templates Instantly</title>
      <dc:creator>Raghvendra Pandey</dc:creator>
      <pubDate>Tue, 12 May 2026 00:16:39 +0000</pubDate>
      <link>https://dev.to/pandey-raghvendra/bicep-diagram-generator-visualize-azure-bicep-arm-templates-instantly-1f1m</link>
      <guid>https://dev.to/pandey-raghvendra/bicep-diagram-generator-visualize-azure-bicep-arm-templates-instantly-1f1m</guid>
      <description>&lt;p&gt;InfraSketch supports Azure Bicep and ARM JSON templates. Paste your &lt;code&gt;.bicep&lt;/code&gt; file or ARM &lt;code&gt;azuredeploy.json&lt;/code&gt; into the &lt;strong&gt;Bicep / ARM&lt;/strong&gt; tab and get a full architecture diagram in seconds — VNet containment, subnet placement, resource connections, and official Azure icons. No login, no credentials, everything runs in your browser.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  Try it now: paste your Bicep or ARM JSON template and see the diagram instantly. &lt;a href="https://infrasketch.cloud" rel="noopener noreferrer"&gt;Open InfraSketch →&lt;/a&gt;
&lt;/h3&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why Azure Bicep needs a diagram tool
&lt;/h2&gt;

&lt;p&gt;Bicep is Microsoft's domain-specific language for Azure infrastructure. It compiles to ARM JSON and deploys via Azure Resource Manager. A production Bicep template can define dozens of resources — virtual networks, subnets, AKS clusters, API Management gateways, SQL servers, Key Vaults, Service Bus namespaces, and more. Reading that code to understand the topology is slow and error-prone.&lt;/p&gt;

&lt;p&gt;ARM JSON is even harder. A 1,000-line &lt;code&gt;azuredeploy.json&lt;/code&gt; with nested &lt;code&gt;dependsOn&lt;/code&gt; arrays and &lt;code&gt;resourceId()&lt;/code&gt; references takes real effort to parse mentally. The Azure portal shows deployed resources but not their relationships. Visio and draw.io require manual box-drawing. There's no free tool that takes your Bicep or ARM code and generates a diagram automatically — until now.&lt;/p&gt;

&lt;p&gt;InfraSketch parses Bicep and ARM JSON directly in the browser. No Azure subscription required. No CLI. No compile step. Paste and generate.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to use it
&lt;/h2&gt;

&lt;p&gt;Open &lt;a href="https://infrasketch.cloud" rel="noopener noreferrer"&gt;infrasketch.cloud&lt;/a&gt;, click the &lt;strong&gt;Bicep / ARM&lt;/strong&gt; tab, paste your template, and click &lt;strong&gt;Generate Diagram&lt;/strong&gt;. InfraSketch auto-detects whether the input is Bicep syntax or ARM JSON — you don't need to switch modes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Bicep example — paste this into the Bicep / ARM tab
param location string = 'eastus'

resource vnet 'Microsoft.Network/virtualNetworks@2023-04-01' = {
  name: 'prod-vnet'
  location: location
  properties: {
    addressSpace: { addressPrefixes: ['10.0.0.0/16'] }
  }
}

resource appSubnet 'Microsoft.Network/virtualNetworks/subnets@2023-04-01' = {
  parent: vnet
  name: 'app'
  properties: { addressPrefix: '10.0.1.0/24' }
}

resource aks 'Microsoft.ContainerService/managedClusters@2024-01-01' = {
  name: 'prod-aks'
  location: location
  properties: {
    agentPoolProfiles: [{ name: 'nodepool1', vnetSubnetID: appSubnet.id }]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt; InfraSketch handles both Bicep and ARM JSON automatically. Paste either format — the tool detects it from the syntax.&lt;/p&gt;

&lt;h2&gt;
  
  
  What gets visualized
&lt;/h2&gt;

&lt;h4&gt;
  
  
  VNet containment
&lt;/h4&gt;

&lt;p&gt;Resources referencing a VNet via &lt;code&gt;virtualNetworkId&lt;/code&gt; or &lt;code&gt;parent: vnet&lt;/code&gt; are drawn inside the VNet boundary.&lt;/p&gt;

&lt;h4&gt;
  
  
  Subnet placement
&lt;/h4&gt;

&lt;p&gt;Resources with &lt;code&gt;vnetSubnetID&lt;/code&gt; or &lt;code&gt;subnetId&lt;/code&gt; references are placed inside the correct subnet lane.&lt;/p&gt;

&lt;h4&gt;
  
  
  Connection arrows
&lt;/h4&gt;

&lt;p&gt;ARM &lt;code&gt;dependsOn&lt;/code&gt; and Bicep &lt;code&gt;.id&lt;/code&gt; references between resources become directed arrows on the diagram.&lt;/p&gt;

&lt;h4&gt;
  
  
  Inline subnets
&lt;/h4&gt;

&lt;p&gt;Subnets defined inside a VNet's &lt;code&gt;properties.subnets&lt;/code&gt; array are automatically extracted and rendered.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Azure resource types
&lt;/h2&gt;

&lt;p&gt;InfraSketch maps 40+ Azure resource types from Bicep and ARM templates into diagram nodes with official Microsoft icons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Networking:&lt;/strong&gt; Virtual Networks, Subnets, Application Gateway, Load Balancer, Front Door, Traffic Manager, VPN Gateway, Azure Firewall, Bastion, NSG, DNS Zones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute:&lt;/strong&gt; Virtual Machines, VM Scale Sets, AKS (Managed Clusters), Container Instances, App Service, Function Apps, Static Web Apps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Containers:&lt;/strong&gt; Container Registry (ACR), AKS node pools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data:&lt;/strong&gt; SQL Server, SQL Database, Cosmos DB, PostgreSQL, MySQL, Redis Cache, Storage Accounts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration:&lt;/strong&gt; Service Bus, Event Hub, API Management, SignalR, Web PubSub&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI &amp;amp; Analytics:&lt;/strong&gt; Cognitive Services, Azure AI, Data Factory, AI Search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security:&lt;/strong&gt; Key Vault, NSG&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability:&lt;/strong&gt; Log Analytics Workspace, Application Insights&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Resource types not yet in the mapping still parse — they're just omitted from the diagram rather than causing an error. Supported types grow with each release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bicep vs ARM JSON — both work
&lt;/h2&gt;

&lt;p&gt;Bicep is the recommended authoring format for new Azure projects. ARM JSON is what Bicep compiles to, and what older templates use. InfraSketch supports both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bicep:&lt;/strong&gt; Parses &lt;code&gt;resource varName 'Type@version' = { ... }&lt;/code&gt; syntax. Resolves &lt;code&gt;parent&lt;/code&gt; references for containment. Follows &lt;code&gt;varName.id&lt;/code&gt; and &lt;code&gt;varName.name&lt;/code&gt; references for connections.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ARM JSON:&lt;/strong&gt; Parses the &lt;code&gt;resources&lt;/code&gt; array in &lt;code&gt;azuredeploy.json&lt;/code&gt;. Resolves &lt;code&gt;dependsOn&lt;/code&gt; with &lt;code&gt;resourceId()&lt;/code&gt; expressions. Reads &lt;code&gt;properties.subnet.id&lt;/code&gt; and &lt;code&gt;properties.virtualNetwork.id&lt;/code&gt; for containment.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"$schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"contentVersion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"resources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Microsoft.Network/virtualNetworks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prod-vnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"apiVersion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2023-04-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[resourceGroup().location]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"addressSpace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"addressPrefixes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"subnets"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"addressPrefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10.0.1.0/24"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Microsoft.ContainerService/managedClusters"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prod-aks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"apiVersion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-01-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[resourceGroup().location]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"dependsOn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"[resourceId('Microsoft.Network/virtualNetworks', 'prod-vnet')]"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
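
&lt;p&gt;The dependsOn-to-arrow step is easy to sketch. A minimal Python illustration (my own, not InfraSketch's actual parser) that pulls dependency edges out of an ARM template's &lt;code&gt;resources&lt;/code&gt; array:&lt;/p&gt;

```python
import re

# Matches resourceId('Some/Type', 'name') and captures the target name.
RESOURCE_ID = re.compile(r"resourceId\('[^']*',\s*'([^']+)'\)")

def arm_edges(template):
    """Extract (source, target) dependency edges from an ARM template.

    Illustrative sketch only: it resolves simple resourceId() expressions
    inside dependsOn; a real parser handles many more reference forms.
    """
    edges = []
    for res in template.get("resources", []):
        for dep in res.get("dependsOn", []):
            m = RESOURCE_ID.search(dep)
            if m:
                edges.append((res["name"], m.group(1)))
    return edges
```

&lt;p&gt;Run against the ARM snippet above, this yields a single edge from &lt;code&gt;prod-aks&lt;/code&gt; to &lt;code&gt;prod-vnet&lt;/code&gt;, which is the arrow you see on the diagram.&lt;/p&gt;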



&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Azure landing zone reviews&lt;/strong&gt; — visualize your hub-and-spoke VNet topology before deploying&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PR reviews&lt;/strong&gt; — paste a PR's Bicep changes and see what new resources get created&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt; — share a diagram with new engineers instead of asking them to read raw ARM JSON&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt; — export as PNG, SVG, or draw.io XML and embed in Azure DevOps wikis or Confluence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration planning&lt;/strong&gt; — diagram existing ARM templates before converting them to Bicep modules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture reviews&lt;/strong&gt; — generate a diagram for an ARB submission without opening Visio&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bicep vs Terraform diagrams
&lt;/h2&gt;

&lt;p&gt;If your team uses both Terraform (for AWS/GCP) and Bicep (for Azure), InfraSketch handles both in the same tool. Switch between the &lt;strong&gt;Terraform&lt;/strong&gt; and &lt;strong&gt;Bicep / ARM&lt;/strong&gt; tabs to diagram each side of a multi-cloud deployment. The layout zones — Internet, Ingress, Compute, Data, Messaging, Security — are consistent across providers, so diagrams from both tools are comparable at a glance.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  Generate your Bicep diagram now: paste your &lt;code&gt;.bicep&lt;/code&gt; file or &lt;code&gt;azuredeploy.json&lt;/code&gt; into the Bicep / ARM tab. Free, no login, nothing leaves your browser. &lt;a href="https://infrasketch.cloud" rel="noopener noreferrer"&gt;Open InfraSketch →&lt;/a&gt;
&lt;/h3&gt;
&lt;/blockquote&gt;

</description>
      <category>bicepdiagramgenerator</category>
      <category>azurebicepdiagram</category>
      <category>bicepvisualizer</category>
      <category>armtemplatediagram</category>
    </item>
    <item>
      <title>I Sniped a Solana Token in 400ms — Here's the Full Tech Stack</title>
      <dc:creator>Apollo</dc:creator>
      <pubDate>Tue, 12 May 2026 00:16:19 +0000</pubDate>
      <link>https://dev.to/apollo_ag/i-sniped-a-solana-token-in-400ms-heres-the-full-tech-stack-4jpc</link>
      <guid>https://dev.to/apollo_ag/i-sniped-a-solana-token-in-400ms-heres-the-full-tech-stack-4jpc</guid>
      <description>&lt;h1&gt;
  
  
  I Sniped a Solana Token in 400ms — Here's the Full Tech Stack
&lt;/h1&gt;

&lt;p&gt;I recently managed to snipe a Solana token launch in just 400ms. It was a thrilling experience, but more importantly, it taught me a ton about the Solana ecosystem and the tools that make such feats possible. In this article, I’ll break down the full tech stack I used, including Jito MEV bundles, Jupiter routing, and Helius RPC. I’ll also share the code snippets I wrote, the lessons I learned, and the specific numbers that made this possible.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Context: Solana Token Sniping
&lt;/h3&gt;

&lt;p&gt;Token sniping on Solana involves buying a token as soon as it’s launched on a decentralized exchange (DEX) like Raydium or Orca. The goal is to capitalize on early liquidity and ride the initial price surge. However, with so many bots competing for the same trade, speed is everything. That’s where MEV (Maximal Extractable Value) strategies and optimized infrastructure come into play.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Tech Stack
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. &lt;strong&gt;Jito MEV Bundles&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Jito is a Solana MEV infrastructure provider that enables the creation of "bundles" — groups of transactions that are executed atomically. Bundles are essential for sniping because they let you land a sequence of transactions together and ahead of competing traffic; Solana has no traditional public mempool, so priority comes from tipping Jito's block engine rather than classic mempool front-running.&lt;/p&gt;

&lt;p&gt;To create a bundle, I used Jito’s &lt;code&gt;jito-solana&lt;/code&gt; library. Here’s the code snippet I used to construct and send a bundle:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Keypair&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Transaction&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@solana/web3.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Bundle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;BundleSender&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@jito-lab/solana&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Initialize connection and keypair&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;connection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.mainnet-beta.solana.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keypair&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Keypair&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromSecretKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="c1"&gt;// Your private key );&lt;/span&gt;

&lt;span class="c1"&gt;// Create transactions&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tx1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="c1"&gt;// Instruction for token swap );&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tx2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="c1"&gt;// Instruction for token transfer );&lt;/span&gt;

&lt;span class="c1"&gt;// Sign transactions&lt;/span&gt;
&lt;span class="nx"&gt;tx1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keypair&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;tx2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keypair&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Create bundle&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bundle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Bundle&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;tx1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tx2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="c1"&gt;// Send bundle&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bundleSender&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BundleSender&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bundleId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;bundleSender&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendBundle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;bundle&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Bundle sent with ID: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;bundleId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Jito bundles ensured that my transactions were executed in the correct order and with minimal latency. Without them, I would have been at the mercy of Solana’s default transaction scheduling.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. &lt;strong&gt;Jupiter Routing&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Jupiter is Solana’s premier DEX aggregator, routing trades across multiple liquidity sources to find the best price. For token sniping, I used Jupiter’s API to programmatically calculate the optimal route for my trade.&lt;/p&gt;

&lt;p&gt;Here’s how I integrated Jupiter’s API into my bot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;axios&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getSwapRoute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;outputToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`https://quote-api.jup.ag/v4/quote?inputMint=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;inputToken&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;outputMint=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;outputToken&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;amount=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;executeSwap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;inAmount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;outAmount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;instructions&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Construct and sign transaction&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transaction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keypair&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Send transaction&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;txId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendRawTransaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;serialize&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Swap executed with TX ID: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;txId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Example usage&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;route&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getSwapRoute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;So11111111111111111111111111111111111111112&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NEW_TOKEN_MINT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;executeSwap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Jupiter’s routing ensured that I got the best possible price for my snipe, even in the chaotic environment of a token launch.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. &lt;strong&gt;Helius RPC&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Helius is a high-performance Solana RPC provider that offers low-latency connections and advanced features like webhooks and transaction tracing. For sniping, I relied on Helius’s RPC endpoint to minimize latency and maximize reliability.&lt;/p&gt;

&lt;p&gt;Here’s how I configured Helius in my bot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;connection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://rpc.helius.xyz/?api-key=YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Helius’s RPC endpoint reduced my transaction confirmation time to under 100ms, which was critical for beating other bots. Additionally, their transaction tracing feature allowed me to monitor the status of my bundles in real time.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Execution: 400ms Sniping
&lt;/h3&gt;

&lt;p&gt;Now that you understand the tools, let me walk you through the snipe itself:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Token Launch Detection&lt;/strong&gt;: I used a custom script to monitor Solana’s blockchain for new token mints. As soon as a new token was detected, I triggered my snipe bot.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bundle Construction&lt;/strong&gt;: Within 50ms, my bot created a bundle using Jito. The bundle included two transactions: one to swap SOL for the new token and another to transfer the token to my wallet.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Routing&lt;/strong&gt;: Jupiter’s API calculated the optimal swap route in under 20ms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Transaction Submission&lt;/strong&gt;: My bot sent the bundle to Helius’s RPC endpoint, and the transaction was confirmed in just 100ms.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All of this happened in a total of 400ms, fast enough to secure the token before most other bots even knew it existed.&lt;/p&gt;




&lt;h3&gt;
  
  
  Lessons Learned
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Speed is Critical&lt;/strong&gt;: Every millisecond counts. Optimizing your bot’s latency can mean the difference between a successful snipe and a missed opportunity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Infrastructure Matters&lt;/strong&gt;: Using specialized tools like Jito, Jupiter, and Helius can give you a significant advantage over bots that rely on generic infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fail-Safe Mechanisms&lt;/strong&gt;: Even with optimized tech, failures can happen. I implemented retry logic and fallback RPC endpoints to handle edge cases.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Sniping a Solana token in 400ms was a combination of preparation, optimization, and leveraging the right tools. Jito MEV bundles ensured atomicity and priority, Jupiter routing secured the best price, and Helius RPC minimized latency. While the experience was exhilarating, it also highlighted the importance of understanding and mastering the Solana ecosystem.&lt;/p&gt;

&lt;p&gt;If you’re looking to dive into Solana MEV or token sniping, I highly recommend experimenting with these tools. They’ve transformed my approach to on-chain trading and opened up new opportunities I hadn’t thought possible.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Try It Yourself &amp;amp; Get Airdropped
&lt;/h2&gt;

&lt;p&gt;If you want to test this without building from scratch, use &lt;a href="https://t.me/ApolloSniper_Bot" rel="noopener noreferrer"&gt;@ApolloSniper_Bot&lt;/a&gt; — the fastest non-custodial Solana sniper. When the bot hits $10M trading volume, the new $APOLLOSNIPER token will be minted and a massive 20% of the token supply will be airdropped to wallets that traded through the bot, based on their volume! &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Join the revolution today.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>solana</category>
      <category>blockchain</category>
      <category>typescript</category>
      <category>programming</category>
    </item>
    <item>
      <title>How Large Language Models Work — From Transformers to Conversational AI</title>
      <dc:creator>zeromathai</dc:creator>
      <pubDate>Tue, 12 May 2026 00:15:31 +0000</pubDate>
      <link>https://dev.to/zeromathai/how-large-language-models-work-from-transformers-to-conversational-ai-6nk</link>
      <guid>https://dev.to/zeromathai/how-large-language-models-work-from-transformers-to-conversational-ai-6nk</guid>
      <description>&lt;p&gt;LLMs can look like magic from the outside.&lt;/p&gt;

&lt;p&gt;You type a prompt.&lt;/p&gt;

&lt;p&gt;The model generates language.&lt;/p&gt;

&lt;p&gt;But underneath that behavior is a clear architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Idea
&lt;/h2&gt;

&lt;p&gt;A Large Language Model is a neural network trained to understand and generate text.&lt;/p&gt;

&lt;p&gt;The key idea is not just size.&lt;/p&gt;

&lt;p&gt;It is language modeling at scale.&lt;/p&gt;

&lt;p&gt;An LLM learns patterns in text.&lt;/p&gt;

&lt;p&gt;Then it uses those patterns to predict and generate the next tokens.&lt;/p&gt;

&lt;p&gt;That simple loop becomes powerful when combined with massive data, deep architectures, and Transformer-based attention.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Key Structure
&lt;/h2&gt;

&lt;p&gt;A simplified LLM flow looks like this:&lt;/p&gt;

&lt;p&gt;Text Input → Tokenization → Transformer Layers → Next Token Prediction → Generated Text&lt;/p&gt;

&lt;p&gt;More compactly:&lt;/p&gt;

&lt;p&gt;LLM = tokens + Transformer + next-token prediction&lt;/p&gt;

&lt;p&gt;The model does not “think” in raw sentences.&lt;/p&gt;

&lt;p&gt;It processes tokens.&lt;/p&gt;

&lt;p&gt;Then it predicts what token should come next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation View
&lt;/h2&gt;

&lt;p&gt;At a high level, text generation works like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;take the user input

split it into tokens

pass tokens through Transformer layers

compute probabilities for the next token

choose one token

append it to the sequence

repeat until stopping condition
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This loop is why LLMs can generate long responses.&lt;/p&gt;

&lt;p&gt;They do not write the whole answer at once.&lt;/p&gt;

&lt;p&gt;They generate one token at a time.&lt;/p&gt;
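&lt;p&gt;The loop above can be sketched in plain Python. The toy probability table below is a stand-in for a real Transformer forward pass, and the token strings are invented for illustration; only the loop structure mirrors a real LLM.&lt;/p&gt;

```python
def next_token_probs(tokens):
    # Stand-in for the Transformer layers: a hard-coded lookup of
    # next-token probabilities for a couple of known contexts.
    table = {
        ("Once", "upon", "a"): {"time": 0.95, "tree": 0.02},
        ("Once", "upon", "a", "time"): {"[EOS]": 0.9, ",": 0.05},
    }
    return table.get(tuple(tokens), {"[EOS]": 1.0})

def generate(prompt_tokens, max_new_tokens=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)   # compute probabilities for the next token
        token = max(probs, key=probs.get)  # choose one token (greedy selection)
        if token == "[EOS]":               # stopping condition
            break
        tokens.append(token)               # append it to the sequence, then repeat
    return tokens

print(generate(["Once", "upon", "a"]))   # ['Once', 'upon', 'a', 'time']
```

&lt;p&gt;Real systems also sample from the distribution instead of always taking the maximum, which is why the same prompt can produce different outputs.&lt;/p&gt;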

&lt;h2&gt;
  
  
  Concrete Example
&lt;/h2&gt;

&lt;p&gt;Suppose the input is:&lt;/p&gt;

&lt;p&gt;The capital of France is&lt;/p&gt;

&lt;p&gt;The model estimates likely next tokens.&lt;/p&gt;

&lt;p&gt;Maybe:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Paris&lt;/li&gt;
&lt;li&gt;Lyon&lt;/li&gt;
&lt;li&gt;France&lt;/li&gt;
&lt;li&gt;located&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If “Paris” has the highest probability, the model may select it.&lt;/p&gt;

&lt;p&gt;Then the sequence becomes:&lt;/p&gt;

&lt;p&gt;The capital of France is Paris&lt;/p&gt;

&lt;p&gt;The model repeats the same process for the next token.&lt;/p&gt;

&lt;p&gt;That is the basic generation loop.&lt;/p&gt;
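&lt;p&gt;One step of that loop can be written out directly. The probability numbers below are illustrative, not real model outputs:&lt;/p&gt;

```python
# Given estimated next-token probabilities for the current sequence,
# pick the most likely token and extend the sequence by one step.
sequence = ["The", "capital", "of", "France", "is"]
probs = {"Paris": 0.92, "Lyon": 0.03, "France": 0.02, "located": 0.01}

next_token = max(probs, key=probs.get)   # "Paris" has the highest probability
sequence.append(next_token)

print(" ".join(sequence))   # The capital of France is Paris
```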

&lt;h2&gt;
  
  
  Encoder vs Decoder Models
&lt;/h2&gt;

&lt;p&gt;Transformer models are not all built the same way.&lt;/p&gt;

&lt;p&gt;The most important distinction is encoder-style vs decoder-style models.&lt;/p&gt;

&lt;p&gt;Encoder models are good at understanding input.&lt;/p&gt;

&lt;p&gt;Decoder models are good at generating output.&lt;/p&gt;

&lt;p&gt;Encoder-style models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;read the input deeply&lt;/li&gt;
&lt;li&gt;build contextual representations&lt;/li&gt;
&lt;li&gt;work well for classification, search, and embedding tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Decoder-style models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;generate tokens step by step&lt;/li&gt;
&lt;li&gt;use previous tokens to predict the next token&lt;/li&gt;
&lt;li&gt;work well for chat, writing, coding, and text generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why GPT-style systems are usually decoder-based.&lt;/p&gt;

&lt;p&gt;They are built for generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Encoder-Decoder Architecture
&lt;/h2&gt;

&lt;p&gt;Some Transformer systems use both sides.&lt;/p&gt;

&lt;p&gt;The encoder processes the input.&lt;/p&gt;

&lt;p&gt;The decoder generates the output.&lt;/p&gt;

&lt;p&gt;This structure is especially intuitive for tasks like translation.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;p&gt;English sentence → Encoder → Internal representation → Decoder → Korean sentence&lt;/p&gt;

&lt;p&gt;The encoder focuses on understanding.&lt;/p&gt;

&lt;p&gt;The decoder focuses on producing.&lt;/p&gt;

&lt;p&gt;That separation makes the architecture easy to reason about.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Attention Matters
&lt;/h2&gt;

&lt;p&gt;Attention is the key mechanism inside Transformers.&lt;/p&gt;

&lt;p&gt;It lets the model decide which tokens are relevant to each other.&lt;/p&gt;

&lt;p&gt;Instead of processing words only in order, attention compares relationships across the sequence.&lt;/p&gt;

&lt;p&gt;That matters because language depends on context.&lt;/p&gt;

&lt;p&gt;A word can change meaning depending on what came before it.&lt;/p&gt;

&lt;p&gt;Attention gives the model a way to use that context.&lt;/p&gt;
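&lt;p&gt;The mechanism can be sketched in a few lines. This is a minimal illustration of scaled dot-product self-attention, not a real LLM layer; the embeddings are made-up placeholders:&lt;/p&gt;

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    # Each token's query is scored against every token's key; the
    # softmaxed scores become weights, and the output mixes the value
    # vectors by those weights, giving each token a context-aware vector.
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Three tokens with 2-dimensional embeddings; in a real Transformer,
# Q, K, and V come from learned projections of the token embeddings.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)
print(len(out), len(out[0]))   # 3 2
```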

&lt;h2&gt;
  
  
  Cross-Attention
&lt;/h2&gt;

&lt;p&gt;Cross-attention connects two streams of information.&lt;/p&gt;

&lt;p&gt;For example, in an encoder-decoder model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the encoder represents the input&lt;/li&gt;
&lt;li&gt;the decoder generates the output&lt;/li&gt;
&lt;li&gt;cross-attention lets the decoder look at the encoder’s representation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is useful when the output must depend closely on the input.&lt;/p&gt;

&lt;p&gt;Translation is the classic example.&lt;/p&gt;

&lt;p&gt;The decoder does not generate blindly.&lt;/p&gt;

&lt;p&gt;It attends to the encoded source sentence.&lt;/p&gt;
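&lt;p&gt;A sketch of that single step, with placeholder vectors standing in for real activations: the query comes from the decoder, while the keys and values come from the encoder.&lt;/p&gt;

```python
import math

def attend(query, keys, values):
    # Score the query against every key, softmax the scores into
    # weights, and return the weighted mix of the value vectors.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]

encoder_states = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # encoded source tokens
decoder_state = [1.0, 0.0]                             # decoder at one step

# Cross-attention: decoder query, encoder keys and values.
context = attend(decoder_state, encoder_states, encoder_states)
print(len(context))   # 2: a source-aware context vector for this step
```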

&lt;h2&gt;
  
  
  LLMs vs Traditional NLP Systems
&lt;/h2&gt;

&lt;p&gt;Traditional NLP systems often relied on many separate components.&lt;/p&gt;

&lt;p&gt;Token rules.&lt;/p&gt;

&lt;p&gt;Feature extraction.&lt;/p&gt;

&lt;p&gt;Syntax analysis.&lt;/p&gt;

&lt;p&gt;Task-specific classifiers.&lt;/p&gt;

&lt;p&gt;LLMs changed the workflow.&lt;/p&gt;

&lt;p&gt;Traditional NLP:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;many hand-designed stages&lt;/li&gt;
&lt;li&gt;task-specific pipelines&lt;/li&gt;
&lt;li&gt;limited flexibility&lt;/li&gt;
&lt;li&gt;harder to generalize across tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;LLM-based systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use one large model for many language tasks&lt;/li&gt;
&lt;li&gt;learn representations from data&lt;/li&gt;
&lt;li&gt;generate flexible outputs&lt;/li&gt;
&lt;li&gt;can power chat, summarization, coding, translation, and more&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why LLMs became central to modern AI products.&lt;/p&gt;

&lt;p&gt;They turned language understanding and generation into a general interface.&lt;/p&gt;

&lt;h2&gt;
  
  
  From LLMs to Conversational AI
&lt;/h2&gt;

&lt;p&gt;Conversational AI is one of the most visible uses of LLMs.&lt;/p&gt;

&lt;p&gt;The model receives a user message.&lt;/p&gt;

&lt;p&gt;It interprets the context.&lt;/p&gt;

&lt;p&gt;It generates a response.&lt;/p&gt;

&lt;p&gt;But a real product usually adds more around the model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;system instructions&lt;/li&gt;
&lt;li&gt;safety filters&lt;/li&gt;
&lt;li&gt;retrieval systems&lt;/li&gt;
&lt;li&gt;memory or session context&lt;/li&gt;
&lt;li&gt;tool use&lt;/li&gt;
&lt;li&gt;evaluation and monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So an LLM is the core engine.&lt;/p&gt;

&lt;p&gt;Conversational AI is the full system built around it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended Learning Order
&lt;/h2&gt;

&lt;p&gt;If LLM architecture feels too broad, learn it in this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Large Language Models&lt;/li&gt;
&lt;li&gt;Transformer&lt;/li&gt;
&lt;li&gt;Encoder-Decoder Architecture&lt;/li&gt;
&lt;li&gt;Encoder vs Decoder Transformers&lt;/li&gt;
&lt;li&gt;Attention Mechanism&lt;/li&gt;
&lt;li&gt;Cross-Attention&lt;/li&gt;
&lt;li&gt;Conversational AI&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This order works because you first understand what an LLM is.&lt;/p&gt;

&lt;p&gt;Then you understand the Transformer.&lt;/p&gt;

&lt;p&gt;Then you compare architecture types.&lt;/p&gt;

&lt;p&gt;Then you connect the model to real applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;LLMs are not magic text machines.&lt;/p&gt;

&lt;p&gt;They are Transformer-based models trained to predict and generate tokens.&lt;/p&gt;

&lt;p&gt;The shortest version is:&lt;/p&gt;

&lt;p&gt;LLM = Transformer architecture + token prediction + scale&lt;/p&gt;

&lt;p&gt;Encoder models are better for understanding.&lt;/p&gt;

&lt;p&gt;Decoder models are better for generation.&lt;/p&gt;

&lt;p&gt;Encoder-decoder models connect input understanding with output generation.&lt;/p&gt;

&lt;p&gt;If you remember one idea, remember this:&lt;/p&gt;

&lt;p&gt;An LLM generates language by repeatedly predicting the next token using context learned through Transformer attention.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discussion
&lt;/h2&gt;

&lt;p&gt;When learning LLMs, do you find it easier to start from next-token prediction, Transformer architecture, or real applications like conversational AI?&lt;/p&gt;

&lt;p&gt;Originally published at zeromathai.com.&lt;br&gt;
Original article: &lt;a href="https://zeromathai.com/en/large-language-models-hub-en/" rel="noopener noreferrer"&gt;https://zeromathai.com/en/large-language-models-hub-en/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GitHub Resources&lt;br&gt;
AI diagrams, study notes, and visual guides:&lt;br&gt;
&lt;a href="https://github.com/zeromathai/zeromathai-ai" rel="noopener noreferrer"&gt;https://github.com/zeromathai/zeromathai-ai&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>llm</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>I Built a Fully Autonomous Coding Agent for Under $50/Month — Here's the Exact Setup</title>
      <dc:creator>Suifeng023</dc:creator>
      <pubDate>Tue, 12 May 2026 00:14:54 +0000</pubDate>
      <link>https://dev.to/suifeng023/i-built-a-fully-autonomous-coding-agent-for-under-50month-heres-the-exact-setup-1oo9</link>
      <guid>https://dev.to/suifeng023/i-built-a-fully-autonomous-coding-agent-for-under-50month-heres-the-exact-setup-1oo9</guid>
      <description>&lt;h1&gt;
  
  
  I Built a Fully Autonomous Coding Agent for Under $50/Month — Here's the Exact Setup
&lt;/h1&gt;

&lt;p&gt;Three months ago, I watched an AI agent write, test, and deploy an entire microservice while I made coffee. That moment changed everything about how I work.&lt;/p&gt;

&lt;p&gt;After months of experimenting, I've built a coding agent setup that handles 70% of my daily development tasks — bug fixing, code generation, testing, documentation — running 24/7 on my own infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Total cost: $47/month.&lt;/strong&gt; Here's exactly how I did it, and how you can replicate it in one afternoon.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Build Your Own Agent Instead of Using Copilot?
&lt;/h2&gt;

&lt;p&gt;Don't get me wrong — GitHub Copilot is great. But it has limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It only suggests within your IDE&lt;/strong&gt; — no terminal access, no file system operations, no deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It can't run tests&lt;/strong&gt; or validate its own output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It doesn't learn from your project's specific patterns&lt;/strong&gt; beyond what's in the current file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're limited to one model&lt;/strong&gt; — what if Claude is better at refactoring while GPT is better at generating tests?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A custom agent gives you &lt;strong&gt;full control&lt;/strong&gt; over the model, the tools, and the workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture: 4 Components, $47 Total
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────┐
│              ORCHESTRATOR               │
│         (Python + LangGraph)            │
│              $0/month                   │
├──────────┬──────────┬───────────────────┤
│  LLM 1   │  LLM 2   │    LLM 3          │
│  Claude  │  GPT-4o  │   Gemini Pro      │
│  $20/mo  │  $20/mo  │   $7/mo           │
├──────────┴──────────┴───────────────────┤
│           TOOL LAYER                    │
│   Terminal │ File System │ Browser      │
│   Git │ Docker │ npm/pip │ Linting      │
├─────────────────────────────────────────┤
│          KNOWLEDGE BASE                 │
│   Project docs │ Style guide │ Tests    │
│              $0/month                   │
└─────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
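&lt;p&gt;As a sanity check, the line items in the diagram add up to the quoted total:&lt;/p&gt;

```python
# Tallying the monthly budget from the architecture diagram above.
costs = {
    "orchestrator (Python + LangGraph)": 0,
    "Claude": 20,
    "GPT-4o": 20,
    "Gemini Pro": 7,
    "knowledge base": 0,
}
total = sum(costs.values())
print(f"${total}/month")   # $47/month
```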



&lt;h3&gt;
  
  
  Component 1: The Orchestrator (Free)
&lt;/h3&gt;

&lt;p&gt;The brain of the operation. I use &lt;strong&gt;LangGraph&lt;/strong&gt; to build a state machine that routes tasks to the right model and tool combination.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;model_used&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;code_output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;test_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Annotated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Route to the best model based on task type.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;refactor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;optimize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clean&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;improve&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Claude excels at code quality
&lt;/span&gt;    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;   &lt;span class="c1"&gt;# GPT-4o is great at debugging
&lt;/span&gt;    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;explain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Gemini for documentation
&lt;/span&gt;    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Default for generation
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;should_iterate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Decide if we need another iteration.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PASS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight? &lt;strong&gt;Different models excel at different tasks.&lt;/strong&gt; Routing intelligently saves money without sacrificing quality.&lt;/p&gt;
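&lt;p&gt;A quick back-of-envelope illustration of where the savings come from (the per-million-token prices and the task mix below are made-up example numbers, not real quotes): routing the cheap bulk of the work to a cheaper model beats sending everything to one premium model.&lt;/p&gt;

```python
# Back-of-envelope sketch of why routing cuts cost. Prices per 1M input
# tokens and the monthly task mix are illustrative assumptions only.
PRICE_PER_M_INPUT = {"claude": 3.00, "gpt4o": 2.50, "gemini": 1.25}

# Hypothetical monthly workload: (model, millions of input tokens)
routed = [("claude", 4.0), ("gpt4o", 3.0), ("gemini", 6.0)]

routed_cost = sum(PRICE_PER_M_INPUT[m] * toks for m, toks in routed)
# Same 13M tokens, all sent to a single premium model instead:
single_cost = PRICE_PER_M_INPUT["claude"] * sum(t for _, t in routed)

print(f"routed: ${routed_cost:.2f}  single-model: ${single_cost:.2f}")
```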

&lt;h3&gt;
  
  
  Component 2: Multi-Model Setup ($47/month)
&lt;/h3&gt;

&lt;p&gt;Here's my exact API spending breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Cost/Month&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4&lt;/td&gt;
&lt;td&gt;Anthropic API&lt;/td&gt;
&lt;td&gt;~$20&lt;/td&gt;
&lt;td&gt;Code generation, refactoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-4o&lt;/td&gt;
&lt;td&gt;OpenAI API&lt;/td&gt;
&lt;td&gt;~$20&lt;/td&gt;
&lt;td&gt;Debugging, test writing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 1.5 Pro&lt;/td&gt;
&lt;td&gt;Google AI Studio&lt;/td&gt;
&lt;td&gt;~$7&lt;/td&gt;
&lt;td&gt;Documentation, large context&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; Use Google AI Studio's free tier for Gemini — you get 60 requests/minute free, which is plenty for documentation tasks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google.generativeai&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ModelRouter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;claude&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gpt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gemini&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerativeModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-1.5-pro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;claude&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;

        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gpt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                         &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gemini&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
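&lt;p&gt;One thing worth adding around that router: a fallback chain, so a single provider outage doesn't kill the whole agent run. A minimal sketch of the idea (the chain order and the stub generator below are my assumptions for illustration, not part of the original setup):&lt;/p&gt;

```python
# Thin reliability wrapper around a ModelRouter-style generate function.
# If the preferred provider errors out, walk a fixed fallback chain
# instead of failing the run. Chain order here is an example choice.
FALLBACK_CHAIN = {"claude": "gpt4o", "gpt4o": "gemini", "gemini": None}

def generate_with_fallback(generate, model, prompt, context=""):
    """Try `model` first, then fall back down FALLBACK_CHAIN."""
    while model is not None:
        try:
            return generate(model, prompt, context)
        except Exception:
            model = FALLBACK_CHAIN[model]
    raise RuntimeError("all providers failed")

# Demo with a stub in place of a real ModelRouter:
def flaky_generate(model, prompt, context=""):
    if model == "claude":  # pretend the Anthropic API is down
        raise ConnectionError("api unavailable")
    return f"[{model}] {prompt}"

print(generate_with_fallback(flaky_generate, "claude", "write tests"))
```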



&lt;h3&gt;
  
  
  Component 3: The Tool Layer (Free)
&lt;/h3&gt;

&lt;p&gt;This is where the magic happens. Your agent needs &lt;strong&gt;hands&lt;/strong&gt; to interact with the codebase.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DevTools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Tools the agent can use to interact with the codebase.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Read a file from the project.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Write content to a file.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;write_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Written to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute a shell command safely.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="c1"&gt;# Safety: block dangerous commands
&lt;/span&gt;        &lt;span class="n"&gt;blocked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf /&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sudo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP TABLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt; /dev/sda&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BLOCKED: Dangerous command detected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shell&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pytest&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Run the test suite and return results.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Run linter on the codebase.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ruff check &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;git_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Show what changed.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;git diff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The safety layer is crucial — you're giving an AI the ability to run arbitrary commands. Always sandbox and always validate.&lt;/p&gt;
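&lt;p&gt;A substring blocklist like the one above is easy to sidestep, so if you harden this further, flip it to an allowlist: only commands whose first token is a known binary may run at all. A minimal sketch of that idea (the set of allowed binaries is just an example; tune it to your own project):&lt;/p&gt;

```python
import shlex

# Sketch of an allowlist validator, stricter than a blocklist. The
# ALLOWED_BINARIES contents are an illustrative example set.
ALLOWED_BINARIES = {"pytest", "ruff", "git", "python", "ls", "cat"}

def is_command_allowed(cmd: str) -> bool:
    """Allow only known binaries, and refuse shell chaining outright."""
    if any(ch in cmd for ch in [";", "|", "`", "$("]):
        return False
    try:
        tokens = shlex.split(cmd)
    except ValueError:  # unbalanced quotes and similar
        return False
    return bool(tokens) and tokens[0] in ALLOWED_BINARIES

print(is_command_allowed("pytest -q"))          # True
print(is_command_allowed("curl evil.sh | sh"))  # False
```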

&lt;h3&gt;
  
  
  Component 4: The Knowledge Base (Free)
&lt;/h3&gt;

&lt;p&gt;Your agent needs context about &lt;em&gt;your&lt;/em&gt; project. I use a simple approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProjectKnowledge&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;project_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;project_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;project_path&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index_project&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Index all project documentation and code.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.ts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;project_path&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;rglob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="c1"&gt;# Skip node_modules, venv, etc.
&lt;/span&gt;                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;skip&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_modules&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;venv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.git&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
                    &lt;span class="k"&gt;continue&lt;/span&gt;
                &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="n"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;texts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;metadatas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;([{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metadatas&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Search the knowledge base for relevant context.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Agent Loop: How It All Works Together
&lt;/h2&gt;

&lt;p&gt;Here's the main loop that ties everything together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;agent_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;project_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Main agent execution loop.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;knowledge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ProjectKnowledge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DevTools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelRouter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_used&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Build context from knowledge base
&lt;/span&gt;    &lt;span class="n"&gt;relevant_docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;knowledge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;relevant_docs&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iteration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;route_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_used&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;

        &lt;span class="c1"&gt;# Generate code with the best model
&lt;/span&gt;        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Context:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Previous attempt: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;code_output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Test results: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test_results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Please provide improved code.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Apply the changes
&lt;/span&gt;        &lt;span class="c1"&gt;# (In production, parse the model output to extract file changes)
&lt;/span&gt;        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="c1"&gt;# Run tests
&lt;/span&gt;        &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test_results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_tests&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;iteration&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: Used &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tests: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test_results&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Check if we should continue
&lt;/span&gt;        &lt;span class="n"&gt;next_step&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;should_iterate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;next_step&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
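&lt;p&gt;The loop above relies on a &lt;code&gt;should_iterate&lt;/code&gt; helper and an &lt;code&gt;END&lt;/code&gt; sentinel that aren't shown in this snippet. If you're following along without those pieces, a minimal stand-in might look like this (the iteration cap and the "failed"-substring check on pytest output are my assumptions, not the exact logic):&lt;/p&gt;

```python
# Hypothetical stop condition for the agent loop; names match the loop above.
END = "end"            # the real setup may use LangGraph's END constant instead
CONTINUE = "continue"

MAX_ITERATIONS = 5     # assumed cap so a stuck task can't burn tokens forever

def should_iterate(state: dict) -> str:
    """Decide whether the agent loop should run another iteration."""
    if state["iteration"] >= MAX_ITERATIONS:
        return END
    # Crude success check: pytest output mentions "failed" when tests fail.
    if "failed" in state["test_results"].lower():
        return CONTINUE
    return END
```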






&lt;h2&gt;
  
  
  Real Results: What My Agent Actually Does
&lt;/h2&gt;

&lt;p&gt;After three months of daily use, here's what the setup handles:&lt;/p&gt;

&lt;h3&gt;
  
  
  Daily Tasks (Fully Automated)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bug fixes&lt;/strong&gt;: Paste the error, get the fix. 85% success rate on first try.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unit test generation&lt;/strong&gt;: "Write tests for auth/utils.py" → 40 tests in 30 seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt;: Generates docstrings and README sections from code analysis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code review&lt;/strong&gt;: Flags potential issues before I even open the PR.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Weekly Tasks (Semi-Automated)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Feature scaffolding&lt;/strong&gt;: "Create a CRUD endpoint for orders" → gets 80% right.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database migrations&lt;/strong&gt;: Generates migration files, I just review and apply.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refactoring&lt;/strong&gt;: "Split this 500-line file into modules" → solid first draft.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Monthly Tasks (Guided)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architecture decisions&lt;/strong&gt;: I describe the problem, it proposes 3 approaches with trade-offs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security audits&lt;/strong&gt;: Runs through OWASP checklist against the codebase.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Cost Optimization Tips
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache everything.&lt;/strong&gt; I cache LLM responses using Redis — identical queries don't hit the API twice. This alone cut my costs by 40%.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use the cheapest model first.&lt;/strong&gt; Route simple tasks to GPT-4o-mini ($0.15/1M input tokens) instead of Claude.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Batch your requests.&lt;/strong&gt; Instead of asking "fix this bug" and "write tests" separately, combine them: "Fix this bug and write tests for the fix."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set spending limits.&lt;/strong&gt; All three providers let you set monthly caps. I set mine at $30, $30, and $10 respectively — and I've never hit them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use local models for simple tasks.&lt;/strong&gt; Ollama + CodeLlama handles simple completions for free on my machine.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
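&lt;p&gt;To make the caching tip concrete, here's a minimal sketch. The &lt;code&gt;cache&lt;/code&gt; argument is anything with Redis-style &lt;code&gt;get&lt;/code&gt;/&lt;code&gt;setex&lt;/code&gt; methods (in my setup, a &lt;code&gt;redis.Redis&lt;/code&gt; client with &lt;code&gt;decode_responses=True&lt;/code&gt;), and &lt;code&gt;call_llm&lt;/code&gt; is a placeholder for your actual API call:&lt;/p&gt;

```python
# Sketch of an LLM response cache keyed on (model, prompt).
import hashlib
import json

CACHE_TTL = 86400  # cache responses for a day (assumed)

def cached_generate(model: str, prompt: str, call_llm, cache) -> str:
    """Return a cached response if an identical query was seen before."""
    key = "llm:" + hashlib.sha256(
        json.dumps({"model": model, "prompt": prompt}).encode()
    ).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit  # cache hit: no API call, no tokens spent
    response = call_llm(model, prompt)
    cache.setex(key, CACHE_TTL, response)
    return response
```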




&lt;h2&gt;
  
  
  The $47 Breakdown (Actual Receipts)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude API&lt;/td&gt;
&lt;td&gt;$18.42&lt;/td&gt;
&lt;td&gt;Code generation + refactoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI API&lt;/td&gt;
&lt;td&gt;$16.87&lt;/td&gt;
&lt;td&gt;Debugging + test writing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google AI Studio&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;td&gt;Free tier covers documentation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPS (DigitalOcean)&lt;/td&gt;
&lt;td&gt;$6.00&lt;/td&gt;
&lt;td&gt;Runs the orchestrator 24/7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis (Upstash free tier)&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;td&gt;Response caching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChromaDB (local)&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;td&gt;Vector storage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$47.29&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Getting Started: Your 1-Afternoon Setup Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Get API Keys (15 min)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic Console → Create API key&lt;/li&gt;
&lt;li&gt;OpenAI Platform → Create API key&lt;/li&gt;
&lt;li&gt;Google AI Studio → Free API key&lt;/li&gt;
&lt;/ul&gt;
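&lt;p&gt;Those keys go into the &lt;code&gt;.env&lt;/code&gt; file you'll create in Step 3. A typical layout (the variable names are my assumption; use whatever names your agent code actually reads):&lt;/p&gt;

```shell
# .env (values are placeholders)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
```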

&lt;h3&gt;
  
  
  Step 2: Install Dependencies (5 min)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langgraph langchain anthropic openai google-generativeai chromadb redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Clone and Configure (20 min)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/your-repo/coding-agent
&lt;span class="nb"&gt;cd &lt;/span&gt;coding-agent
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# Edit .env with your API keys&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Index Your Project (10 min)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProjectKnowledge&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_loop&lt;/span&gt;

&lt;span class="c1"&gt;# Index your codebase
&lt;/span&gt;&lt;span class="n"&gt;kb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ProjectKnowledge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/path/to/your/project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;kb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;index_project&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Try your first task
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;agent_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fix the login bug in auth/views.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/path/to/your/project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Customize (Ongoing)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Add project-specific tools (database queries, API calls)&lt;/li&gt;
&lt;li&gt;Fine-tune the routing logic for your tech stack&lt;/li&gt;
&lt;li&gt;Build a web UI with Streamlit for easier interaction&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with one model.&lt;/strong&gt; I jumped into multi-model routing too fast. Start with Claude alone, add others as needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build the safety layer first.&lt;/strong&gt; I accidentally ran &lt;code&gt;rm -rf build/&lt;/code&gt; instead of &lt;code&gt;rm -rf dist/&lt;/code&gt; once. Sandbox everything.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Invest in context quality.&lt;/strong&gt; The agent is only as good as its understanding of your project. Spend time on your README and code comments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Log everything.&lt;/strong&gt; I use LangSmith to trace every agent decision — invaluable for debugging and optimization.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
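&lt;p&gt;Point 2 is worth a sketch. Before the agent's shell tool executes anything, run the command through a guard like this (the allowlist and blocklist are assumptions to adapt; real sandboxing also means a container or throwaway VM):&lt;/p&gt;

```python
# Hypothetical guard in front of the agent's shell tool.
import shlex

SAFE_COMMANDS = {"pytest", "python", "pip", "git", "ls", "cat"}  # assumed allowlist
BLOCKED_TOKENS = {"rm", "sudo", "curl", "dd", "mkfs"}            # assumed blocklist

def is_safe(command: str) -> bool:
    """Allow only allowlisted commands containing no blocked tokens."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] not in SAFE_COMMANDS:
        return False
    return not any(token in BLOCKED_TOKENS for token in tokens)
```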




&lt;h2&gt;
  
  
  The Future: Where This Is Going
&lt;/h2&gt;

&lt;p&gt;The coding agent space is moving fast. Here's what I'm watching:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code and Cursor Agent mode&lt;/strong&gt; are making this more accessible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-agent systems&lt;/strong&gt; (dev agent + reviewer agent + QA agent) for better quality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuned models&lt;/strong&gt; on your specific codebase for better context understanding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-healing systems&lt;/strong&gt; that detect and fix production issues autonomously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But here's the thing — you don't need to wait. The setup I described works &lt;strong&gt;today&lt;/strong&gt; with available tools and APIs. And for $47/month, it's cheaper than most IDE subscriptions.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Have you built your own coding agent?&lt;/strong&gt; I'd love to hear about your setup and what tasks you've automated. Drop a comment below! 👇&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this useful, follow me for more practical AI engineering guides. I write about building real AI products, not just theory.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>tutorial</category>
      <category>automation</category>
    </item>
    <item>
      <title>Hermes agent: Connect to Discord</title>
      <dc:creator>Phú</dc:creator>
      <pubDate>Tue, 12 May 2026 00:13:44 +0000</pubDate>
      <link>https://dev.to/lkp/hermes-agent-connect-to-discord-2p95</link>
      <guid>https://dev.to/lkp/hermes-agent-connect-to-discord-2p95</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;In the last post, we covered how to set up and connect Hermes Agent to Telegram. Today, we'll connect it to Discord.&lt;/p&gt;

&lt;h1&gt;
  
  
  Flow
&lt;/h1&gt;

&lt;p&gt;First, create a bot in Discord. Then configure the gateway to use that bot and start it. Once the gateway is running, you can chat with your agent through Discord.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8xfjld09vc8qmueuf9i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8xfjld09vc8qmueuf9i.png" alt="Discord gateway flow" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Create Discord Bot
&lt;/h1&gt;

&lt;p&gt;Sign in to the &lt;a href="https://discord.com/developers/home" rel="noopener noreferrer"&gt;Discord developer portal&lt;/a&gt; with your account, then open Applications in the left sidebar.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9d26h1fbxji5cvekhyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9d26h1fbxji5cvekhyt.png" alt="Create application" width="800" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Type your bot's name, accept the terms, and click the "Create" button.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95wy28ule43yvsz85g1i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95wy28ule43yvsz85g1i.png" alt="Create application form" width="800" height="433"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should then see something like this:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw359ct2k0nxfa2u4p7em.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw359ct2k0nxfa2u4p7em.png" alt="Application Dashboard" width="800" height="264"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, open the Bot item in the left menu. It has a "Reset Token" section; click the button there to generate a new token. Then switch to Hermes, run &lt;code&gt;hermes gateway setup&lt;/code&gt;, and choose &lt;code&gt;Discord&lt;/code&gt;. When it asks for the bot token, paste the token you just generated. Now go back to the Discord portal to continue setting up the bot: choose OAuth2 in the left menu and scroll to the bottom. Find and check the Bot checkbox; a "Bot Permissions" section appears where you pick the permissions your bot should have. Here, I chose Send Messages under Text Permissions. Finally, copy the generated URL and open it in a new tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhiwlii1pira65ys9axu3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhiwlii1pira65ys9axu3.png" alt=" " width="800" height="911"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F15o7wm8sm55osbajs8ib.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F15o7wm8sm55osbajs8ib.png" alt=" " width="789" height="743"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click Continue, then return to Hermes to finish the gateway setup. You'll see something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F16hwd8oevwg4wzr9996m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F16hwd8oevwg4wzr9996m.png" alt=" " width="800" height="148"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Choose the options you need; in this case, I simply enabled open access. Now we can start talking to the agent on this channel. Here's an example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkq6o46x1yjwpe6hgg9rp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkq6o46x1yjwpe6hgg9rp.png" alt=" " width="800" height="297"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  Playing Around with the Agent
&lt;/h1&gt;
&lt;h2&gt;
  
  
  Generate Image
&lt;/h2&gt;

&lt;p&gt;Since I use the Minimax model, the agent can generate images as well. I asked it to create a Minimax image-generation skill, then asked it to use that skill to generate an image for me. Quite nice. I noticed that even when I don't explicitly ask it to create the skill, it creates one automatically whenever I ask it to generate an image with Minimax. After a few tries, it starts creating skills for these repeated tasks on its own; that's why it's described as an agent that grows with you. It also has memory, so it can remember what you say.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0pt4dqdshoznvidlh84.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0pt4dqdshoznvidlh84.png" alt=" " width="800" height="998"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Generate Music
&lt;/h2&gt;

&lt;p&gt;In another case, I wanted to generate music, so I asked my agent to do it with a simple prompt like this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3t1qnjv1zi83llhhw341.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3t1qnjv1zi83llhhw341.png" alt=" " width="800" height="752"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the end, it figured it out and produced this track. Quite chill.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5eq4zrs6vy4nubj2ae9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5eq4zrs6vy4nubj2ae9.png" alt=" " width="800" height="546"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Speak
&lt;/h2&gt;

&lt;p&gt;Another use case: when I ask my agent a question, it can answer using TTS. Instead of showing me text, it converts the answer to audio and plays it, so I don't need to read, only listen. To enable this, you just need to choose TTS. In Discord, use the &lt;code&gt;/voice&lt;/code&gt; command in the channel where you want audio answers, then choose the &lt;code&gt;tts&lt;/code&gt; option.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff7y0o4353gyl66j9340i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff7y0o4353gyl66j9340i.png" alt=" " width="800" height="97"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In my case, I chose Minimax TTS since I already have a subscription. However, I wanted to take this a step further: join a voice channel and talk to the agent in realtime.&lt;/p&gt;

&lt;p&gt;First of all, go back to the OAuth2 page and check these items under Bot Permissions. Then copy the &lt;code&gt;Generated URL&lt;/code&gt; and paste it into a new tab again. Choose your channel so the bot can join it. Next, join a voice channel on the left; in this case, I joined the general channel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0d6dyr5z6epjyn8psj19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0d6dyr5z6epjyn8psj19.png" alt=" " width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6rgwdddf5aycsjpn1y8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6rgwdddf5aycsjpn1y8.png" alt=" " width="312" height="89"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After you join, type &lt;code&gt;voice&lt;/code&gt; in the channel, then choose the channel option and type your voice channel's name. The bot will join your voice channel, and you can start talking with it. This is my demo. It is super slow, but at least I can now talk to my bot directly. Switching to another TTS like ElevenLabs would surely make it much faster and more natural. However, to truly answer in realtime we need streaming, which the Hermes agent does not currently support. Maybe I will implement that someday and show you. I can already do this with GPT realtime voice 2.0, but making it work in the Hermes agent requires an extra step.&lt;/p&gt;


&lt;div&gt;
    &lt;iframe src="https://www.youtube.com/embed/KRfkPh3xKqA"&gt;
    &lt;/iframe&gt;
  &lt;/div&gt;


&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;That's all for today. I hope you enjoyed this article. If you have any questions, please comment below. See you next time.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>api</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Is Brain Float (bf16) Worth it?</title>
      <dc:creator>xbill</dc:creator>
      <pubDate>Tue, 12 May 2026 00:12:08 +0000</pubDate>
      <link>https://dev.to/gde/is-brain-float-worth-it-1lim</link>
      <guid>https://dev.to/gde/is-brain-float-worth-it-1lim</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After some basic benchmarking, I realized that vLLM defaults to standard precision instead of the memory-optimized Brain Float (bfloat16) data type.&lt;/p&gt;

&lt;p&gt;The full benchmark suite was re-run via MCP and the brain float results were compared to the standard precision results.&lt;/p&gt;

&lt;p&gt;model: google/gemma-4-26B-A4B-it&lt;/p&gt;

&lt;p&gt;✦ The absolute scale benchmark for Gemma 4 (26B-A4B-it) on TPU v6e-4 has successfully completed. &lt;/p&gt;

&lt;p&gt;🏁 Final Benchmark Results&lt;br&gt;
  The sweep confirms that the TPU v6e-4 cluster can handle massive parallel loads, maintaining a peak prefill throughput of nearly 0.5 Million&lt;br&gt;
  tokens/sec at the model's absolute context ceiling.&lt;/p&gt;

&lt;p&gt;┌─────────────┬──────────────────┬──────────────────┬──────────────────┐&lt;br&gt;
  │ Concurrency │ 32K Context TTFT │ 64K Context TTFT │ Peak Prefill TPS │&lt;br&gt;
  ├─────────────┼──────────────────┼──────────────────┼──────────────────┤&lt;br&gt;
  │ 1 User      │ 1.289s           │ 2.032s           │ 32.2K            │&lt;br&gt;
  │ 64 Users    │ 3.994s           │ 7.390s           │ 426.4K           │&lt;br&gt;
  │ 256 Users   │ 10.202s          │ 20.129s          │ 498.2K           │&lt;br&gt;
  │ 512 Users   │ 19.500s          │ 37.649s          │ 489.2K           │&lt;br&gt;
  │ 1024 Users  │ 40.525s          │ 74.611s          │ 479.7K           │&lt;br&gt;
  └─────────────┴──────────────────┴──────────────────┴──────────────────┘&lt;/p&gt;

&lt;p&gt;🔍 Key Findings &amp;amp; Constraints&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hard 64K Model Limit: Although the vLLM engine was configured for 128K, the google/gemma-4-26B-A4B-it model explicitly rejected requests
  exceeding 65,536 tokens with a Status 400 error. The benchmark confirms 64K is the current physical limit for this model variant.&lt;/li&gt;
&lt;li&gt;Prefill Dominance: The Trillium architecture (v6e) demonstrates incredible prefill efficiency. Even at 1024 concurrent users with 64K
  context (totaling ~67 Million tokens in a single batch), the system maintained a throughput of 479,760 tokens/sec.&lt;/li&gt;
&lt;li&gt;Latency at Scale:

&lt;ul&gt;
&lt;li&gt;Usable Scale: Up to 256 users, the TTFT for full 64K context remains at ~20s, which is excellent for deep reasoning or heavy RAG.&lt;/li&gt;
&lt;li&gt;Extreme Scale: At 1024 users, TTFT rises to ~74s. While the throughput is sustained, the latency makes this tier suitable only for
 non-interactive batch processing.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
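&lt;p&gt;The batch-size arithmetic behind finding 2 is easy to check; a quick sketch, using only the numbers from the table above:&lt;/p&gt;

```python
# Rough arithmetic behind the "~67 Million tokens in a single batch"
# figure in the findings above; numbers come straight from the table.

users = 1024            # concurrent users in the final tier
context = 65_536        # the model's hard 64K context ceiling

batch_tokens = users * context
print(f"tokens in flight: {batch_tokens:,}")   # 67,108,864 (~67M)
```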

&lt;p&gt;📊 Data Artifacts&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full CSV: extreme_benchmark_results_v2.csv&lt;/li&gt;
&lt;li&gt;Full JSON: extreme_benchmark_results_v2.json&lt;/li&gt;
&lt;li&gt;Progress Log: extreme_benchmark_progress_v2.log&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system is now fully profiled and remains online at &lt;a href="http://34.151.215.200:8000" rel="noopener noreferrer"&gt;http://34.151.215.200:8000&lt;/a&gt; with a 64K context window enabled.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;summarize results                                                                                                                              &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;✦ The absolute scale benchmark for Gemma 4 (26B-A4B-it) on a Cloud TPU v6e-4 (Trillium) cluster demonstrates the system's ability to handle&lt;br&gt;
  massive long-context workloads with industry-leading prefill efficiency.&lt;/p&gt;

&lt;p&gt;🚀 Key Performance Milestones&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Peak Throughput: Reached 498,253 tokens/sec at 256 concurrent users.&lt;/li&gt;
&lt;li&gt;Massive Batching: Successfully processed a single batch of ~67 Million tokens (1024 users × 64K context).&lt;/li&gt;
&lt;li&gt;Prefill Efficiency: The system maintains near-peak throughput (&amp;gt;450K tok/s) even as concurrency increases from 128 to 1024 users.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📊 Performance Scaling Matrix&lt;br&gt;
  The following table highlights the trade-off between parallel user load and responsiveness (TTFT) at the maximum context depth.&lt;/p&gt;

&lt;p&gt;┌─────────────┬──────────────────┬─────────────┬─────────────────────┐&lt;br&gt;
  │ Concurrency │ 64K Context TTFT │ Prefill TPS │ System Status       │&lt;br&gt;
  ├─────────────┼──────────────────┼─────────────┼─────────────────────┤&lt;br&gt;
  │ 1 User      │ 2.03s            │ 32,213      │ ⚡ Ultra-Fast       │&lt;br&gt;
  │ 64 Users    │ 7.39s            │ 426,396     │ 🟢 Optimal (RAG)    │&lt;br&gt;
  │ 256 Users   │ 20.13s           │ 498,253     │ 🟡 High-Load        │&lt;br&gt;
  │ 512 Users   │ 37.65s           │ 489,235     │ 🟠 Batch Processing │&lt;br&gt;
  │ 1024 Users  │ 74.61s           │ 479,760     │ 🔴 Extreme Load     │&lt;br&gt;
  └─────────────┴──────────────────┴─────────────┴─────────────────────┘&lt;/p&gt;

&lt;p&gt;🔍 Technical Constraints &amp;amp; Observations&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Context Ceiling (64K): Despite engine configuration for 128K, the Gemma-4-26B-A4B-it model weights impose a hard physical limit of 65,536
  tokens. All requests for 128K context returned a Status 400 error.&lt;/li&gt;
&lt;li&gt;Linear scaling to 256 Users: The TPU v6e-4 cluster demonstrates perfect linear scaling up to 256 users. Beyond this point, the system is
  fully saturated, and additional concurrency increases latency without further throughput gains.&lt;/li&gt;
&lt;li&gt;Low Latency RAG: For interactive RAG applications, the 16-64 user tier is the "sweet spot," providing massive 64K context windows with a
  Time-to-First-Token under 8 seconds.&lt;/li&gt;
&lt;li&gt;Hardware Stability: The 4-chip Trillium configuration remained stable throughout the 8-hour stress test, including the injection of over 30
  million tokens in the final tier.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;📁 Data Artifacts&lt;br&gt;
  The full high-resolution dataset is available in the workspace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;extreme_benchmark_results_v2.csv: Raw metrics for all 88 test combinations.&lt;/li&gt;
&lt;li&gt;extreme_benchmark_progress_v2.log: Execution trace and error logs.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;did the brain float make a difference                                                                                                          &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;✦ Yes, switching to bfloat16 made a critical difference in both system stability and hardware efficiency:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Functional Difference (The "Unlock")&lt;br&gt;
The most immediate difference was that bf16 (the common shorthand) is actually an invalid argument for the current vLLM TPU backend. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Previous state: Using --dtype bf16 caused the container to exit immediately with code 2.&lt;/li&gt;
&lt;li&gt;Current state: Using --dtype bfloat16 allowed the engine to recognize the data type and successfully initialize the model.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Hardware Efficiency on TPU v6e&lt;br&gt;
Cloud TPU v6e (Trillium) is specifically architected to excel at Brain Floating Point (bfloat16) operations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory Footprint: Using bfloat16 reduces the model weights and KV cache memory usage by 50% compared to float32. This reduction is exactly
what enabled the massive 64K context window to fit into the 128GB HBM of the 4-chip cluster.&lt;/li&gt;
&lt;li&gt;Throughput: TPUs have dedicated matrix multiplication units (MXUs) that process bfloat16 natively at peak speeds. The results we saw—~0.5
Million tokens/sec—would not have been possible with a 32-bit format due to memory bandwidth bottlenecks.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Numerical Stability&lt;br&gt;
Unlike float16, which has a very narrow range (max value ~65K), bfloat16 has the same exponent range as float32. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This was crucial for the 64K and 128K context tests. At these extreme lengths, the attention scores can grow quite large; bfloat16 prevents
the "NaN" errors or precision collapses that often occur in float16 during deep context prefill.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
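&lt;p&gt;The range and memory claims above are easy to verify numerically. A small sketch (it assumes NumPy is available; NumPy has no native bfloat16 type, but since bfloat16 shares float32's 8-bit exponent, float32's max stands in for bfloat16's dynamic range):&lt;/p&gt;

```python
import numpy as np

# float16 has a 5-bit exponent: it overflows past ~65,504.
fp16_max = float(np.finfo(np.float16).max)
print(fp16_max)                      # 65504.0

# bfloat16 keeps float32's 8-bit exponent, so its dynamic range
# matches float32's (max ~3.4e38), just with a shorter mantissa.
fp32_max = float(np.finfo(np.float32).max)
print(f"{fp32_max:.2e}")             # 3.40e+38

# Memory: 26B parameters at 2 bytes/param vs 4 bytes/param.
params = 26e9
print(params * 2 / 1e9, "GB in bfloat16")   # 52.0 GB
print(params * 4 / 1e9, "GB in float32")    # 104.0 GB
```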

&lt;p&gt;Summary: Without the switch to bfloat16, the engine wouldn't start; with it, we achieved peak Trillium performance while maintaining the&lt;br&gt;
  numerical precision needed for long-context reasoning.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;look at the other runs - was this one with bfloat16 better                                                                                     &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Comparing Benchmark Performance: &lt;br&gt;
  Comparing the performance of the current 'bfloat16' benchmark against previous runs to quantify improvements in throughput and latency.&lt;/p&gt;

&lt;p&gt;╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮&lt;br&gt;
│ ✓  Shell head -n 20 comprehensive_benchmark_results.csv; head -n 20 extreme_benchmark_results_v2.csv                                       │&lt;br&gt;
│                                                                                                                                            │&lt;br&gt;
│ ... first 33 lines hidden (Ctrl+O to show) ...                                                                                             │&lt;br&gt;
│ 2026-05-11T17:04:56.795286,google/gemma-4-26B-A4B-it,8192,2,0.6020536170108244,0.6097863299655728,26868.427832622954,success,              │&lt;br&gt;
│ 2026-05-11T17:04:57.467729,google/gemma-4-26B-A4B-it,16384,2,0.6703764809935819,0.6722787949838676,48741.683129818666,success,             │&lt;br&gt;
│ 2026-05-11T17:04:58.376936,google/gemma-4-26B-A4B-it,32768,2,0.8942851985048037,0.9089556289836764,72100.32911427949,success,              │&lt;br&gt;
│ 2026-05-11T17:04:59.587674,google/gemma-4-26B-A4B-it,65500,2,1.1621350165223703,1.210006160021294,108263.91164628007,success,              │&lt;br&gt;
│ 2026-05-11T17:05:01.089462,google/gemma-4-26B-A4B-it,131000,2,,,0.0,failed,"All 2 requests failed. First error: Status 400: {""error"":{"" │&lt;br&gt;
│ message"":""This model's maximum context length is 65536 tokens. However, you requested 1 o"                                               │&lt;br&gt;
│ 2026-05-11T17:05:01.425868,google/gemma-4-26B-A4B-it,1024,4,0.3295201235014247,0.3329592429799959,12301.805960815711,success,              │&lt;br&gt;
│ 2026-05-11T17:05:01.826889,google/gemma-4-26B-A4B-it,2048,4,0.3966410259890836,0.4008637340157293,20435.87210530389,success,               │&lt;br&gt;
│ 2026-05-11T17:05:02.327203,google/gemma-4-26B-A4B-it,4096,4,0.49463606224162504,0.5001165689900517,32760.36231530235,success,              │&lt;br&gt;
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯&lt;br&gt;
  Analyzing bfloat16 Performance Gains: &lt;br&gt;
  Quantifying the performance impact of 'bfloat16' and 64K context window by comparing current and previous benchmark results.&lt;/p&gt;

&lt;p&gt;╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮&lt;br&gt;
│ ✓  Shell Comparing specific context length and concurrency results.                                                                        │&lt;br&gt;
│                                                                                                                                            │&lt;br&gt;
│ 2026-05-11T16:37:14.027070,google/gemma-4-26B-A4B-it,16384,1,0.6328242099843919,0.6337720929877833,25851.564278826365,success              │&lt;br&gt;
│ 2026-05-11T17:04:50.425117,google/gemma-4-26B-A4B-it,16384,1,0.8542688179877587,0.8552422930370085,19157.14427758196,success,              │&lt;br&gt;
│ 2026-05-11T16:37:31.692150,google/gemma-4-26B-A4B-it,32768,16,1.8776373735017842,2.025486672006082,258845.44551494622,success              │&lt;br&gt;
│ 2026-05-11T17:05:23.597761,google/gemma-4-26B-A4B-it,32768,16,1.9288130642489705,2.0350820519961417,257624.993294862,success,              │&lt;br&gt;
│                                                                                                                                            │&lt;br&gt;
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯&lt;/p&gt;

&lt;p&gt;✦ Comparing the bfloat16 run (current) against the previous run, we find that while throughput remains consistently massive, the primary benefit&lt;br&gt;
  of bfloat16 was unlocking extreme context depth rather than increasing raw speed at lower context.&lt;/p&gt;

&lt;p&gt;📊 Head-to-Head Comparison&lt;/p&gt;

&lt;p&gt;┌──────────────────────────┬──────────────────────┬────────────────────────┬───────────────────┐&lt;br&gt;
  │ Config (Context / Users) │ Previous Run (Mixed) │ Current Run (bfloat16) │ Performance Delta │&lt;br&gt;
  ├──────────────────────────┼──────────────────────┼────────────────────────┼───────────────────┤&lt;br&gt;
  │ 16K / 1 User             │ 25,851 tok/s         │ 19,157 tok/s           │ -25% (Latency)    │&lt;br&gt;
  │ 32K / 16 Users           │ 258,845 tok/s        │ 257,624 tok/s          │ ~Stable           │&lt;br&gt;
  │ 64K / 256 Users          │ N/A (Failed)         │ 498,253 tok/s          │ 🚀 Functional Win │&lt;br&gt;
  └──────────────────────────┴──────────────────────┴────────────────────────┴───────────────────┘&lt;/p&gt;

&lt;p&gt;🔍 Key Observations&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Functional Stability: The previous run (and several intermediate attempts) consistently failed or OOM'd when pushing beyond a 32K context
  window. The bfloat16 configuration successfully reached 65,500 tokens at 1024 concurrent users—a workload that represents nearly 67 Million
  active tokens in flight.&lt;/li&gt;
&lt;li&gt;Throughput Plateau: At 32K context with 16 users, the throughput is nearly identical. This suggests the TPU v6e-4's MXU (Matrix Multiply
  Unit) was already reaching its compute-bound peak for that specific matrix size.&lt;/li&gt;
&lt;li&gt;Latency Trade-off: We observed a slight increase in TTFT at very low concurrencies (1 user) with the bfloat16 configuration. This is likely
  due to the overhead of larger KV cache block sizes (increased from 16 to 128 to prevent memory fragmentation in long context).&lt;/li&gt;
&lt;li&gt;The "64K Ceiling": The current run definitiveley proved that while the software is ready for 128K, the current Gemma-4-26B-A4B-it weights
  have a hard-coded physical attention limit at 65,536 tokens.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Conclusion: The bfloat16 run was "better" because it provided deterministic stability at the model's absolute physical limit (64K context) while&lt;br&gt;
  maintaining peak hardware throughput (~0.5M tokens/sec) under massive parallel load.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devchallenge</category>
      <category>llm</category>
      <category>gemmachallenge</category>
    </item>
    <item>
      <title>24K visitors, 430K requests, $0 spent. I built a live election results dashboard in 2 days with Claude.</title>
      <dc:creator>Karthikeyan Gopal</dc:creator>
      <pubDate>Tue, 12 May 2026 00:04:36 +0000</pubDate>
      <link>https://dev.to/karthikeyan_gopal/24k-visitors-430k-requests-0-spent-i-built-a-live-election-results-dashboard-in-2-days-with-2j5o</link>
      <guid>https://dev.to/karthikeyan_gopal/24k-visitors-430k-requests-0-spent-i-built-a-live-election-results-dashboard-in-2-days-with-2j5o</guid>
      <description>&lt;p&gt;On May 4, 2026, while Tamil Nadu's election results were being counted, a dashboard I built from scratch was serving live data to thousands of people across 24 countries. By end of day:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;24k+ visits&lt;/strong&gt; from India, US, UK, Germany, Singapore, UAE, and 18 more countries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;430k+ requests&lt;/strong&gt; handled&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8.7 GB&lt;/strong&gt; of bandwidth served&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;60+ commits&lt;/strong&gt; pushed on counting day alone&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total infrastructure cost: $0&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No server. No database. No paid tier of anything. &lt;br&gt;
&lt;strong&gt;The site is still live if you want to explore while you read: &lt;a href="https://tinyurl.com/tn-2026" rel="noopener noreferrer"&gt;tinyurl.com/tn-2026&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pi6pqz0phdmoatdofvk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pi6pqz0phdmoatdofvk.png" alt=" " width="800" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the story of how I built it, the architectural decisions that made it possible, and what I learned about using AI as a real engineering partner.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Idea
&lt;/h2&gt;

&lt;p&gt;When Tamil Nadu's assembly election date was announced, I saw an opportunity.&lt;/p&gt;

&lt;p&gt;Election result sites are universally terrible. Cluttered with ads, slow to load, impossible to search, and never mobile-friendly. I wanted to build something better. Something clean, fast, and actually useful.&lt;/p&gt;

&lt;p&gt;But I also had a second goal: I wanted to test what it actually feels like to build a production system end-to-end with AI as a coding partner. Not toy projects. Not "make me a to-do app." A real product, with real users, under real load, with a hard deadline I couldn't move.&lt;/p&gt;

&lt;p&gt;The election result was May 4. I started building May 2. &lt;strong&gt;Two days. No extensions.&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The Constraints
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of code, I set three rules:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Zero cost.&lt;/strong&gt; No paid hosting, no domain purchase. If it can't run on free tiers, find a different approach.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Must handle thousands of concurrent users.&lt;/strong&gt; Election results attract massive spikes. If it falls over at peak, the whole thing is pointless.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ship in 2 days.&lt;/strong&gt; Not "MVP in 2 days, polish later." Live, production-ready, real-data-serving in 2 days.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These constraints sound impossible together. Free tier + thousands of users + 2 days? But constraints are where good architecture comes from.&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;A single-page dashboard that gives you everything, no clicking around, no page loads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-time alliance tracker&lt;/strong&gt; with a stacked seat bar and a majority line at 118. You watch the race unfold live.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interactive Tamil Nadu map&lt;/strong&gt; with all 234 constituencies color-coded by leading party. Click any one for full candidate-wise vote breakdown, margins, and round-by-round progress.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Leader cards&lt;/strong&gt; for the four key faces: Stalin, Vijay, EPS, and Seeman, each showing their personal constituency result alongside their alliance's overall tally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Races section&lt;/strong&gt; that surfaces the closest contests automatically. Sorted by margin. Tiruppattur sat at #1 all day, margin bouncing between 0 and 5. It ended at 1. One vote.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full constituency table&lt;/strong&gt; with search, sort, and filters by district, party, and status. Type any candidate's name, find them instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Counting progress widget&lt;/strong&gt; showing rounds completed, votes counted in lakhs and crores, and estimated remaining.&lt;/p&gt;

&lt;p&gt;Dark theme. No ads. No login. Mobile-first with a dedicated Leaders tab. Auto-refreshes silently every 30 seconds. Vanilla JS, 242ms page load. Chart.js for charts, Leaflet for the map, Vite for the build. No React. No Next.js. No framework overhead.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzlodfzlywdy0qjkap4gk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzlodfzlywdy0qjkap4gk.png" alt="Full Dashboard" width="800" height="410"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy8joyff4bmtqb5gzget4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy8joyff4bmtqb5gzget4.png" alt="Map" width="470" height="572"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fitl5qctn9kv5cw43bf2y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fitl5qctn9kv5cw43bf2y.png" alt="Vote Share" width="800" height="325"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0axquwd7a6qdtacn1mw9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0axquwd7a6qdtacn1mw9.png" alt="Close Races" width="800" height="278"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0fkirye3l0qc6ww4d1v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw0fkirye3l0qc6ww4d1v.png" alt="Drill Down" width="565" height="508"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All of this in 2 days. Now let me tell you how.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Architecture: Simple, Ridiculously Effective
&lt;/h2&gt;

&lt;p&gt;Here's the entire system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[My Laptop] → Python scraper (every 2 min) → POST → [Cloudflare Worker]
                                                      ↓
                                                 [Workers KV]
                                                      ↓
                                               [Edge Cache (120s)]
                                                      ↓
                                        [24k+ visitors worldwide]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Five components. Let me explain why each choice was deliberate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why a scraper on my laptop (not a server)?
&lt;/h3&gt;

&lt;p&gt;ECI doesn't have an API. Each constituency's results are a separate HTML page. 234 constituencies. 234 individual web pages. To get a complete picture, I need to hit all 234, parse the HTML tables, extract candidate-wise votes, run the computations, and stitch everything into one JSON for the page to render.&lt;/p&gt;

&lt;p&gt;The obvious move: deploy this to AWS Lambda or a cron on a VPS. But that costs money. And now I'm monitoring infrastructure instead of building features.&lt;/p&gt;

&lt;p&gt;So I ran it on my laptop. A Python script that scrapes all 234 pages in parallel every 2 minutes, computes everything, and POSTs the result to my Cloudflare Worker. &lt;/p&gt;

&lt;p&gt;Total writes for the entire election day: &lt;strong&gt;997&lt;/strong&gt;. Less than a thousand writes to serve 24k+ visitors.&lt;/p&gt;
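&lt;p&gt;The shape of that loop is simple. Here's a minimal sketch; the constituency IDs and field names are placeholders, and the real script additionally parses ECI's HTML tables and POSTs the payload to the Worker:&lt;/p&gt;

```python
import json
from concurrent.futures import ThreadPoolExecutor

# The real dashboard covered all 234 Tamil Nadu constituencies.
CONSTITUENCIES = range(1, 235)

def scrape_all(fetch_one, max_workers=32):
    """Fetch every constituency page in parallel and stitch the
    results into one JSON payload for the Worker to store.
    fetch_one(cid) returns a dict of results for one constituency."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch_one, CONSTITUENCIES))
    return json.dumps({"constituencies": results})

# In the real script, fetch_one would GET the ECI page for `cid`,
# parse the HTML table into candidate-wise votes, and the combined
# payload would be POSTed to the Cloudflare Worker every 2 minutes.
```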

&lt;h3&gt;
  
  
  Why Cloudflare Workers KV (not a database)?
&lt;/h3&gt;

&lt;p&gt;KV is a key-value store that replicates to Cloudflare's 300+ edge locations globally. When someone in Chennai, Singapore, or London hits my API, they're reading from a datacenter close to them, not waiting for a round-trip to a single server.&lt;/p&gt;

&lt;p&gt;Free tier gives you 100,000 reads/day and 1,000 writes/day. I used 89,830 reads and 997 writes. &lt;strong&gt;Three writes to spare.&lt;/strong&gt; I ran an entire election night within the free tier limit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyfw3hqe6vdnev9745oh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyfw3hqe6vdnev9745oh.png" alt=" " width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The caching trick that made it all work
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Election data doesn't need to be real-time to feel real-time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If results update every 2 minutes from my scraper, but I cache the response at the CDN edge for 120 seconds, the worst case is someone sees data that's 2 minutes old. For an election where counting takes 8 hours, nobody notices. It feels instant.&lt;/p&gt;

&lt;p&gt;This one decision meant that 24k+ visitors hitting the same URL every 30 seconds translated to roughly 1 KV read every 2 minutes per edge location. Instead of millions of reads, I used 89K. The CDN absorbed the thundering herd.&lt;/p&gt;
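&lt;p&gt;A back-of-envelope sketch of why the cache absorbs the herd (the PoP count and even traffic spread are illustrative assumptions; real traffic was uneven, which is why the actual figure landed near 90K rather than the worst case computed here):&lt;/p&gt;

```python
# Why a 120s edge cache turns 430K requests into under 100K KV reads.
# Assumed numbers for illustration only.

hours = 8                   # length of counting day
cache_ttl_s = 120           # edge cache lifetime
edge_locations = 300        # rough Cloudflare PoP count

# Without caching, every request would be a KV read:
requests = 430_000

# With caching, each PoP refreshes from KV at most once per TTL:
worst_case_reads = edge_locations * (hours * 3600 // cache_ttl_s)
print(requests, "reads uncached vs", worst_case_reads, "cached worst case")
```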

&lt;h3&gt;
  
  
  Browser-side auto-refresh (not WebSockets)
&lt;/h3&gt;

&lt;p&gt;WebSockets seem perfect for "live" data. But they're terrible for free tier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each connection holds a resource on the server&lt;/li&gt;
&lt;li&gt;Can't be edge-cached&lt;/li&gt;
&lt;li&gt;Complex reconnection logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead: the browser fetches &lt;code&gt;/api/results&lt;/code&gt; every 30 seconds with a simple &lt;code&gt;setInterval&lt;/code&gt;. Every request is a normal HTTP GET that hits the CDN edge cache. The server doesn't know or care how many users are connected.&lt;/p&gt;

&lt;p&gt;The server sends an &lt;code&gt;X-Poll-Interval&lt;/code&gt; header telling the client how often to refresh. During active counting: 30 seconds. After all declared: stop polling entirely. Adaptive polling, zero configuration.&lt;/p&gt;
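&lt;p&gt;The client-side logic is just a header parse. The real frontend is vanilla JS; here is the same idea sketched in Python for brevity, with the convention that a value of 0 means "stop polling" (an assumption of this sketch, since &lt;code&gt;X-Poll-Interval&lt;/code&gt; is a custom header, not a standard one):&lt;/p&gt;

```python
# Client side of adaptive polling: the server's X-Poll-Interval
# header tells the client how often to re-fetch /api/results.

DEFAULT_INTERVAL = 30   # seconds, used when the header is absent

def next_poll_interval(headers):
    """Return seconds until the next fetch, or None to stop polling
    (e.g. once all 234 seats are declared and the server sends 0)."""
    raw = headers.get("X-Poll-Interval")
    if raw is None:
        return DEFAULT_INTERVAL
    seconds = int(raw)
    if seconds == 0:
        return None          # results final: stop polling entirely
    return seconds
```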

&lt;h3&gt;
  
  
  The static switch
&lt;/h3&gt;

&lt;p&gt;Once all 234 seats were declared, I literally embedded the final JSON into the Worker code. Zero KV reads. Zero compute. The site now runs forever at zero cost. It will stay live until the internet shuts down.&lt;/p&gt;

&lt;p&gt;One more thing I didn't pay for: the URL. Cloudflare Pages gives you a free *.pages.dev subdomain. I pointed a free tinyurl shortlink at it. A custom .in or .com domain would've cost money and needed DNS setup. The pages.dev URL worked perfectly. Sometimes the free option is the right option.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building with Claude: What Actually Happened
&lt;/h2&gt;

&lt;p&gt;I used Claude as my partner for the entire project. Here's what that looked like in practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Claude did:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Wrote the Python scraper (parallel ECI HTML parsing, candidate extraction, alliance computation)&lt;/li&gt;
&lt;li&gt;Built the entire frontend (vanilla JS, Chart.js, Leaflet maps, responsive CSS)&lt;/li&gt;
&lt;li&gt;Created the Cloudflare Worker API with caching logic&lt;/li&gt;
&lt;li&gt;Generated mock data for testing before election day&lt;/li&gt;
&lt;li&gt;Iterated on UI changes in minutes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What I did:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Every architectural decision.&lt;/strong&gt; Scraper-on-laptop vs cloud. KV vs database. Edge caching strategy. Polling vs WebSockets. Static switch when counting ends.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Every product decision.&lt;/strong&gt; What features matter. What to skip. When "good enough" ships. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Every production judgment call.&lt;/strong&gt; Is this safe to deploy to thousands of concurrent users right now? Will this caching change cause stale data? Should I test this first or just ship it?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time debugging under load.&lt;/strong&gt; Bugs surfaced live with thousands watching. I triaged, Claude fixed, deployed in minutes. More on this below.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Shipping features while thousands watched
&lt;/h3&gt;

&lt;p&gt;On election day alone, I pushed 60+ commits. Here's a sample of the timeline:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmy2gt67znrdfwhbo314h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmy2gt67znrdfwhbo314h.png" alt="Git" width="787" height="930"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is what it actually looked like.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1:22 AM&lt;/strong&gt; — Someone said the close races section was hard to scan. Ten minutes later, I shipped a full sortable Key Races table: top 50 closest contests, clickable rows, party filter, alliance color-coded borders. Not a tweak. A brand new tab.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1:37 AM&lt;/strong&gt; — "How many votes are even counted so far?" Fair question. Shipped a counting progress widget: rounds completed, total votes counted in lakhs and crores, percentage bar showing how much is left. Five minutes, idea to production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8:14 AM&lt;/strong&gt; — The vote percentage was showing over 100%. Postal votes. ECI counts them separately and my denominator didn't include them. Users spotted it. I fixed the math to account for 5 lakh postal ballots on top of 4.88 crore EVM votes. Deployed before the next auto-refresh.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8:42 AM&lt;/strong&gt; — "Can I see only the seats still being counted?" Shipped a Declared/Counting status filter on the Key Races table. The person who asked saw it live within 3 minutes of their message.&lt;/p&gt;

&lt;p&gt;And many more features besides, all shipped while the site was live and people were actively using it. No staging environment. No PR review. No deployment pipeline. Just me telling Claude exactly what to build, verifying it made sense, and pushing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the part AI skeptics miss.&lt;/strong&gt; The speed didn't come from Claude writing code fast. It came from me knowing instantly: what the user actually needs (not what they asked for), how it fits into the existing architecture, what could break, and whether it's safe to ship without tests to thousands of concurrent users. That's not a prompt. That's a decade of engineering judgment running on instinct.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Constraints breed creativity.&lt;/strong&gt; The $0 budget forced me into an architecture that was actually better than what I'd have built with unlimited resources. No server to maintain. No database to scale. No bills to pay. Ever.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. AI is powerful, but engineering judgment is the real multiplier.&lt;/strong&gt; Claude can write any code you ask for. But it can't tell you &lt;em&gt;what&lt;/em&gt; code to write. It can't tell you that WebSockets are overkill here. It can't tell you that 120-second cache staleness is acceptable for election data. It can't tell you that a Python script on your laptop is better than a Lambda function. Those calls come from experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Ship early, fix live.&lt;/strong&gt; Mock data on day one. Real scraper on day two. Continuous features on election day. Never "done." Always shipping.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. People share useful things.&lt;/strong&gt; I never paid for promotion. I shared the link on WhatsApp, Telegram, Reddit, one Slack group, and LinkedIn. Users shared it with their friends and families. 24k+ visits from organic sharing alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Free tiers are production-ready.&lt;/strong&gt; Cloudflare's free tier served 8.7 GB to 24 countries. The infrastructure didn't blink. If you're waiting to have a budget before building, you're waiting for nothing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The site is still live with all final results: &lt;strong&gt;&lt;a href="https://tinyurl.com/tn-2026" rel="noopener noreferrer"&gt;tinyurl.com/tn-2026&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Full source code: &lt;strong&gt;&lt;a href="https://github.com/csekeyan/tn-elections-2026" rel="noopener noreferrer"&gt;github.com/csekeyan/tn-elections-2026&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What would you build differently? Drop a comment.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://www.linkedin.com/in/kgnmzn/" rel="noopener noreferrer"&gt;Karthikeyan Gopal&lt;/a&gt;. Sr. SDE at Amazon. If you're building with AI and want to exchange ideas, connect on &lt;a href="https://www.linkedin.com/in/kgnmzn/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cloudflarechallenge</category>
      <category>webdev</category>
      <category>claude</category>
    </item>
  </channel>
</rss>
