This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Codebase Dungeon: paste any GitHub repo URL and Gemma 4 reads your actual source code, finds real security vulnerabilities and bugs, then turns them into a playable text adventure dungeon.
- Files become rooms
- Real bugs become monsters (with creative names like "The Hardcoded Sentinel" or "The CSV Injection Imp")
- You fix the bugs to clear rooms: wrong answers cost HP, correct fixes earn XP
- Gemma 4's multimodal vision analyzes your app's screenshots and creates UX-themed rooms
- At the end, you get a downloadable code review report: a genuinely useful security audit disguised as a game
It's not just a game. The output is an actionable code review that developers can use to fix real issues in their codebase.
Demo
Try the pre-loaded codebases for instant gameplay, or paste any public GitHub repo URL.
Code
🔗 github.com/aimadetools/codebase-dungeon
Key Implementation: Multimodal + 128K Context + Structured Output in One Call
// Send code + screenshot to Gemma 4: all three capabilities at once
const parts = [
  { text: prompt }, // Contains full source files (128K context)
  { inlineData: { mimeType: 'image/png', data: screenshotBase64 } } // Multimodal
];
const res = await fetch(GEMMA_API_URL, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    contents: [{ role: 'user', parts }],
    generationConfig: {
      responseMimeType: 'application/json', // Force JSON
      responseJsonSchema: FIRST_ROOM_SCHEMA, // Structured output
      maxOutputTokens: 800,
      temperature: 0.6
    }
  })
});
// Result: clean JSON with room name, bug description, correct fix,
// victory narrative: all informed by both code AND screenshot
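Reading the result back out might look like the sketch below. It assumes the usual Gemini API response shape (`candidates[0].content.parts[].text`, with thought parts flagged by `thought: true`); `extractRoom` is a hypothetical helper name, not something taken from the repo.

```javascript
// Hypothetical helper: pull the JSON payload out of a generateContent-style response.
// Assumed shape: { candidates: [{ content: { parts: [{ text, thought? }] } }] }
function extractRoom(apiResponse) {
  const parts = apiResponse?.candidates?.[0]?.content?.parts ?? [];
  // Skip any parts flagged as thoughts and join the remaining text parts.
  const text = parts
    .filter((p) => typeof p.text === 'string' && !p.thought)
    .map((p) => p.text)
    .join('');
  if (!text) throw new Error('Response contained no text parts');
  return JSON.parse(text); // throws SyntaxError if the model returned non-JSON
}
```

With the request above, usage would be roughly `const room = extractRoom(await res.json());`.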
The Schema That Solves Gemma 4's Thinking Problem
const FIRST_ROOM_SCHEMA = {
  type: 'object',
  properties: {
    dungeonName: { type: 'string' },
    id: { type: 'string' },               // Exact file path
    name: { type: 'string' },             // Creative room name
    monsterName: { type: 'string' },      // Bug as a monster
    bugDescription: { type: 'string' },   // Real bug found in code
    correctFix: { type: 'string' },       // The answer (for deterministic scoring)
    victoryNarrative: { type: 'string' },
    colorTheme: { type: 'string' },       // Extracted from screenshot
    narrative: { type: 'string' },        // References actual UI elements
    choices: { type: 'string' }           // 5 options, randomized
  },
  required: [/* all fields */]
};
// With this schema: 99%+ parse rate, zero thinking tokens, perfect JSON
// Without it: ~50% failure rate, 140+ wasted tokens per call
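Structured output constrains the shape, but a cheap local check before caching a room can still catch surprises. The sketch below is an assumption about how that could be done, not code from the repo: `DEMO_SCHEMA` is a trimmed-down stand-in for the real schema, and `checkRoom` only understands primitive `type` values.

```javascript
// Hypothetical three-field stand-in for FIRST_ROOM_SCHEMA.
const DEMO_SCHEMA = {
  type: 'object',
  properties: {
    id: { type: 'string' },
    monsterName: { type: 'string' },
    correctFix: { type: 'string' }
  },
  required: ['id', 'monsterName', 'correctFix']
};

// Return a list of problems; an empty list means the room passed the check.
function checkRoom(room, schema) {
  if (room === null || typeof room !== 'object') return ['room is not an object'];
  const problems = [];
  for (const key of schema.required) {
    if (!(key in room)) problems.push(`missing field: ${key}`);
  }
  for (const [key, spec] of Object.entries(schema.properties)) {
    // typeof comparison is enough for 'string', 'number', and 'boolean' fields.
    if (key in room && typeof room[key] !== spec.type) {
      problems.push(`wrong type for ${key}: expected ${spec.type}`);
    }
  }
  return problems;
}
```

A room that fails the check can be regenerated instead of being cached with a missing `correctFix`, which would make it unwinnable.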
Zero-Cost Gameplay: All Logic Pre-Computed
// During gameplay: NO API calls, instant responses
// (session, action, isMove, and targetRoom are resolved from the request; that lookup is elided here)
app.get('/api/action', (req, res) => {
  const room = session.dungeon.rooms.find(r => r.id === session.currentRoom);
  const isCorrectFix = action.toLowerCase().trim() === room.correctFix.toLowerCase().trim();
  let narrative;
  if (isCorrectFix) {
    // Instant victory: narrative was pre-generated
    session.xp += 20;
    narrative = room.victoryNarrative;
  } else if (isMove) {
    // Instant room transition: narrative was pre-generated
    narrative = targetRoom.roomNarrative;
  } else {
    // Instant wrong answer: no AI needed
    session.hp -= 10;
    narrative = `The ${room.monster.name} shrugs off your attack. -10 HP.`;
  }
  // Total API calls during gameplay: 0
  res.json({ narrative, hp: session.hp, xp: session.xp }); // response shape is illustrative
});
How I Used Gemma 4
I chose Gemma 4 31B Dense because this project requires three capabilities that few open models offer in a single package:
1. 128K Context Window: Entire Codebase Analysis
Gemma 4's 128K context window means we can feed entire repositories into a single prompt: full file contents, not just filenames or snippets. The model reads complete source files and reasons about interactions between them, finding cross-file vulnerabilities like "this function in auth.js is called without validation in routes.js."
The live demo limits file count for cost efficiency (it runs 24/7 for free), but the architecture supports loading full repos with dozens of files in a single Gemma call. A context window of this size is what lets the model hold an entire codebase and reason about it holistically.
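Packing whole files into one prompt under a token budget could be sketched like this. The 4-characters-per-token estimate is a rough assumption rather than Gemma's real tokenizer, and `packFiles` and the `--- FILE: ... ---` delimiter are hypothetical names chosen for illustration.

```javascript
// Rough heuristic (an assumption): about 4 characters per token.
const CHARS_PER_TOKEN = 4;
const estimateTokens = (text) => Math.ceil(text.length / CHARS_PER_TOKEN);

// Concatenate whole files into one prompt until the token budget is reached.
// Files that do not fit are skipped and reported rather than truncated.
function packFiles(files, tokenBudget) {
  const included = [];
  const skipped = [];
  let used = 0;
  let prompt = '';
  for (const { path, content } of files) {
    const block = `--- FILE: ${path} ---\n${content}\n`;
    const cost = estimateTokens(block);
    if (used + cost > tokenBudget) {
      skipped.push(path);
      continue;
    }
    prompt += block;
    used += cost;
    included.push(path);
  }
  return { prompt, included, skipped, estimatedTokens: used };
}
```

Skipping oversized files instead of truncating them keeps every included file complete, which matters when the model is asked to reason about calls between files.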
2. Native Multimodal: Design Comprehension, Not Just Color Detection
When a repo contains UI screenshots, Gemma 4 looks at them and demonstrates genuine design comprehension: understanding what the app does, identifying specific UI elements, and finding real accessibility issues.
Here's what Gemma 4 generated after seeing a SchemaLens Chrome Store screenshot:
"You step into a dim, cavernous room where two massive stone tablets-Schema A and Schema B-loom before you. In the depths of the footer of Tablet A, four glowing blue runes of 'Load sample' flicker with identical intensity, offering no clue which path you have already trodden. Across the gap, in the footer of Tablet B, a lone rune 'Copy from A & modify' pulses with a pale, spectral lilac hue, clashing with the bold violet of the 'Compare Schemas' altar above."
From a single screenshot, Gemma identified:
- The two schema editor panels by name ("Schema A" and "Schema B")
- The "Load sample" links in the footer and their identical styling
- The "Copy from A & modify" link with its inconsistent color
- The "Compare Schemas" button's purple gradient
- A real UX issue: inconsistent visual hierarchy between primary and secondary actions
This isn't color detection: it's a genuine UX audit from a screenshot. The monster ("The Contrast Ghoul") represents the accessibility anti-pattern, and the player must fix it to clear the room. The actual screenshot is displayed in the game's bug panel so players can see exactly what Gemma analyzed.
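Turning a screenshot found in a repo into the `inlineData` part used in the request can be sketched as follows. `toImagePart` and the extension table are illustrative assumptions; the repo may detect images differently.

```javascript
// A few common screenshot extensions mapped to MIME types (illustrative, not exhaustive).
const MIME_BY_EXT = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['webp', 'image/webp']
]);

// Build an inlineData part from raw image bytes, or return null for unsupported files.
function toImagePart(filename, bytes) {
  const ext = filename.split('.').pop().toLowerCase();
  const mimeType = MIME_BY_EXT.get(ext);
  if (!mimeType) return null;
  return { inlineData: { mimeType, data: Buffer.from(bytes).toString('base64') } };
}
```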
3. Structured JSON Output: Solving Gemma 4's Thinking Problem
Gemma 4's "thinking mode" is notoriously hard to disable: developer forums are full of people struggling with it. The model outputs internal reasoning before answering, consuming tokens and breaking JSON parsing. thinkingLevel: "MINIMAL" reduces it but doesn't guarantee structured output.
The real solution: responseJsonSchema in the Gemini API's generation config. In our testing it not only forces clean JSON output but also effectively bypasses the thinking behavior: no thinking tokens, no wasted output, just structured data.
generationConfig: {
  responseMimeType: 'application/json',
  responseJsonSchema: { /* your schema */ }
}
This is documented for Gemini models, but the official Gemma 4 capabilities page doesn't list it as a supported feature. We discovered it works with Gemma 4 31B through the same API, taking our parse reliability from ~50% to 99%+.
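A parse-reliability figure like this can be measured with a small harness along these lines. This is a sketch of one way to do it, not the measurement script behind the numbers above; `tryParse` and `parseRate` are hypothetical names.

```javascript
// Return the parsed object, or null when the raw output is not a JSON object.
function tryParse(raw) {
  try {
    const value = JSON.parse(raw);
    return value !== null && typeof value === 'object' ? value : null;
  } catch {
    return null;
  }
}

// Fraction of raw model outputs that parse as a JSON object.
function parseRate(outputs) {
  if (outputs.length === 0) return 0;
  const ok = outputs.filter((raw) => tryParse(raw) !== null).length;
  return ok / outputs.length;
}
```

An output that starts with reasoning text before the JSON counts as a failure here, which is the failure mode described above.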
Zero API Calls During Gameplay
Here's the key architectural insight: Gemma does all the work upfront, then gameplay is instant.
The generation flow:
- First room: Gemma analyzes code + screenshot, generates room with narrative, choices, and correct answer (~10s)
- Game starts: player can immediately play the first room
- Background batches: remaining rooms generate in parallel while the player is already playing (~15s)
- Cached forever: once generated, the dungeon is saved. Return visits are instant.
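The flow above can be sketched as follows. `generateRoom` stands in for the Gemma call, `onFirstRoom` is a hypothetical callback that lets play begin early, and the in-memory `Map` is an assumption; the post does not say how the real cache is persisted.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

const dungeonCache = new Map(); // repoUrl -> Promise resolving to the full room list

async function buildDungeon(repoUrl, files, generateRoom, onFirstRoom, batchSize = 3) {
  if (dungeonCache.has(repoUrl)) return dungeonCache.get(repoUrl); // return visit: no generation
  if (files.length === 0) throw new Error('No files to analyze');
  const build = (async () => {
    const [first, ...rest] = files;
    const rooms = [await generateRoom(first)]; // first room blocks; the player waits for this one
    onFirstRoom(rooms[0]);                      // play can start here
    for (const batch of chunk(rest, batchSize)) {
      // Rooms within a batch are generated in parallel.
      rooms.push(...(await Promise.all(batch.map((file) => generateRoom(file)))));
    }
    return rooms;
  })();
  dungeonCache.set(repoUrl, build);
  build.catch(() => dungeonCache.delete(repoUrl)); // do not cache failed generations
  return build;
}
```

Caching the promise rather than the finished array means two simultaneous visitors to the same repo share one generation run.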
During actual gameplay (choosing answers, navigating rooms), there are zero API calls:
- Wrong answers: instant feedback (0ms, pre-computed)
- Correct answers: instant pre-generated victory narrative (0ms)
- Room navigation: instant pre-generated room descriptions (0ms)
This means cached repos (the presets in the demo) provide a completely free, instant gaming experience. Gemma 4 does all the heavy lifting during generation, then the game runs purely on pre-computed data.
The Downloadable Code Review Report
When you clear the dungeon (or die trying), you get a downloadable markdown report listing every bug found:
- File location
- Bug description
- Vulnerable code snippet
- How to fix it
- The correct action
This isn't a gimmick: it's an actionable security audit that developers can use to fix real issues. The game makes code review engaging; the report makes it useful.
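Rendering those fields as markdown could look like the sketch below. The field names on each bug object (`file`, `snippet`, `howToFix`) are hypothetical; only `monsterName`, `bugDescription`, and `correctFix` appear in the schema shown earlier.

```javascript
const FENCE = '`'.repeat(3); // markdown code fence, built indirectly to keep this sample readable

// Render one markdown section per bug.
function buildReport(dungeonName, bugs) {
  const lines = [`# Code Review Report: ${dungeonName}`, ''];
  bugs.forEach((bug, i) => {
    lines.push(`## ${i + 1}. ${bug.monsterName}`);
    lines.push(`- File: ${bug.file}`);
    lines.push(`- Bug: ${bug.bugDescription}`);
    lines.push(`- How to fix: ${bug.howToFix}`);
    lines.push(`- Correct action: ${bug.correctFix}`);
    lines.push('', FENCE, bug.snippet, FENCE, '');
  });
  return lines.join('\n');
}
```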
Why Gemma 4 and Not Another Model?
| Capability | Gemma 4 31B | GPT-4o | Other Open Models |
|---|---|---|---|
| 128K context (entire repos) | ✅ | ✅ | Varies (many are 8K-32K) |
| Native multimodal (screenshots) | ✅ | ✅ | Varies |
| Structured JSON schema | ✅ | ✅ | Varies (often unreliable) |
| Cost per game | $0.005 | $0.09 | Varies |
| Open model | ✅ | ❌ | ✅ |
Gemma 4 delivers multimodal and long-context capability comparable to GPT-4o at roughly 18x lower cost per game ($0.005 vs. $0.09) while being fully open. For a game that needs to run 24/7 for free, this makes all the difference.
Real Bugs Found
Here are actual bugs Gemma 4 found in real codebases:
- Hardcoded admin password in plain text (const ADMIN_PASSWORD = 'schemalens-admin-2026')
- CSV injection vulnerability: unescaped fields that could execute formulas in Excel
- Missing request body validation: server crashes on empty POST requests
- Exposed environment variables in health check endpoints
- Base64 tokens without HMAC: anyone can forge authentication tokens
- Memory leak in rate limiter: Map grows unbounded without TTL eviction
These aren't hallucinated: they're real issues in real code, found by Gemma 4 reading the actual source files.
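For two of these findings, a fix could look roughly like the sketch below. Neither function is taken from the audited repos: `escapeCsvCell` follows the common mitigation of prefixing formula-leading characters with a single quote, and `RateLimiter` is a hypothetical fixed-window limiter with an injectable clock.

```javascript
// CSV injection: neutralize cells a spreadsheet would treat as a formula,
// then apply normal CSV quoting.
function escapeCsvCell(value) {
  let cell = String(value);
  if (/^[=+\-@\t\r]/.test(cell)) cell = `'${cell}`;
  if (/[",\n\r]/.test(cell)) cell = `"${cell.replace(/"/g, '""')}"`;
  return cell;
}

// Rate limiter with TTL eviction so the Map cannot grow without bound.
class RateLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
    this.hits = new Map(); // key -> { count, expiresAt }
  }

  allow(key) {
    const t = this.now();
    // Evict every expired entry before counting this request.
    for (const [k, entry] of this.hits) {
      if (entry.expiresAt <= t) this.hits.delete(k);
    }
    const entry = this.hits.get(key);
    if (!entry) {
      this.hits.set(key, { count: 1, expiresAt: t + this.windowMs });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```

Two trade-offs to be aware of: the quote prefix also affects legitimate values such as negative numbers, and sweeping the whole Map on every call is simple but linear in the number of tracked keys, so a periodic sweep may be preferable under heavy traffic.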






