This is a submission for the Gemma 4 Challenge: Build with Gemma 4.
What I Built
Interview Coach — a free, open-source AI interview practice tool powered by Google Gemma 4 (31B Dense). It runs entirely in the browser — no backend, no server, no accounts. Just Gemma 4's reasoning and your ambition.
It conducts realistic mock interviews across 6 modes, evaluates your answers in real time with detailed feedback, gives mid-session scorecards, and generates comprehensive performance reports with personalized study plans.
A fully free, fully open-source AI interview coach, with no paywall anywhere.
Demo
🔗 Live Demo: hajirufai.github.io/gemma4-interview-coach (bring your own free API key from Google AI Studio)
💻 GitHub Repo: github.com/hajirufai/gemma4-interview-coach
📄 License: MIT — fork it, improve it, ship it.
The Problem
91% of candidates who fail online assessments never practiced under realistic conditions. Interview prep tools exist, but they're:
- Expensive — $30–50/month for premium platforms
- Generic — same questions regardless of your role or level
- Passive — read sample answers instead of actually practicing
What if you had a personal coach that adapts to YOUR experience level, gives feedback on YOUR specific answers, and costs $0 forever?
That's what Gemma 4 makes possible.
6 Practice Modes
| Mode | What It Does |
|---|---|
| 🗣️ Behavioral | STAR-method questions on leadership, conflict, teamwork |
| 💻 Technical | Coding problems, algorithms, data structures |
| 🏗️ System Design | "Design Twitter" style architecture challenges |
| 📝 Assessment | Simulated OA with aptitude + coding + logic |
| 🏆 Certification | Exam-style questions (AWS, Azure, GCP, etc.) |
| 📊 Case Study | Business cases with structured frameworks |
Each mode has a carefully crafted system prompt that shapes Gemma 4's behavior — asking follow-ups, evaluating with mode-specific criteria, and calibrating difficulty across entry/mid/senior/lead levels.
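As a rough sketch of how per-mode prompting could be wired up (the prompt wording, mode keys, and function names here are illustrative, not the app's actual prompts):

```javascript
// Hypothetical per-mode system prompts; the real prompt text lives in the app.
const MODE_PROMPTS = {
  behavioral:
    "You are a behavioral interviewer. Ask STAR-method questions and probe " +
    "for Situation, Task, Action, and Result in each answer.",
  technical:
    "You are a technical interviewer. Pose coding and data-structure " +
    "problems and evaluate correctness and complexity.",
};

// Difficulty calibration is folded into the same system prompt.
function buildSystemPrompt(mode, level) {
  const base = MODE_PROMPTS[mode];
  return `${base}\nCalibrate all questions for a ${level}-level candidate.`;
}
```

One string per mode keeps the model's behavior easy to audit and tweak without touching any other code.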
Key Features
🎙️ Voice Input & Output
Speak your answers naturally using Web Speech API — just like a real interview. Gemma 4's responses are read aloud with text-to-speech for a fully conversational experience.
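A minimal sketch of the voice loop with the Web Speech API; the function names are my own, and `webkitSpeechRecognition` is the vendor-prefixed form Chromium browsers still use:

```javascript
// Capture one spoken answer and hand the transcript to a callback (browser-only).
function startListening(onTranscript) {
  const SpeechRecognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new SpeechRecognition();
  recognition.continuous = false;     // stop after one utterance
  recognition.interimResults = false; // only deliver the final transcript
  recognition.onresult = (event) => {
    onTranscript(event.results[0][0].transcript);
  };
  recognition.start();
  return recognition;
}

// Read the coach's feedback aloud with text-to-speech.
function speakFeedback(text) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = 1.0;
  window.speechSynthesis.speak(utterance);
}
```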
📊 Real-Time Scoring
Hit "Score Me" at any point for a mid-session evaluation across 5 dimensions. The end-of-session report includes a detailed performance breakdown and a personalized 7-day study plan.
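One way to frame that "Score Me" request as a prompt; the five dimension names below are assumptions for illustration, and the app's actual rubric labels may differ:

```javascript
// Hypothetical scoring rubric; the app may label its five dimensions differently.
const SCORING_DIMENSIONS = [
  "structure",     // e.g. STAR/MECE adherence
  "specificity",   // concrete, quantified detail
  "communication", // clarity and pacing
  "depth",         // technical or analytical depth
  "impact",        // outcomes and results
];

// Build a mid-session evaluation request over the transcript so far.
function buildScoringPrompt(transcriptSoFar) {
  return (
    `Score the candidate's answers so far from 1-10 on each dimension: ` +
    `${SCORING_DIMENSIONS.join(", ")}. Return JSON only.\n\n` +
    `Transcript:\n${transcriptSoFar}`
  );
}
```

Asking for JSON-only output makes the scorecard easy to render without parsing free-form prose.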
🖼️ Image Upload
Upload screenshots of coding challenges, whiteboard diagrams, or error messages — Gemma 4's multimodal capabilities analyze them in the context of your interview session.
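Mechanically, attaching a screenshot means base64-encoding the file and including it as an image part in the next request. The `inline_data` shape below follows the Google AI Studio REST convention and is an assumption; other providers may expect a different format:

```javascript
// In the browser, FileReader.readAsDataURL(file) yields a data URL like
// "data:image/png;base64,AAAA...". This helper turns that into a request part.
function dataUrlToPart(dataUrl, mimeType) {
  const base64 = dataUrl.split(",")[1]; // strip the "data:...;base64," prefix
  return { inline_data: { mime_type: mimeType, data: base64 } };
}
```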
🔄 Multi-Provider Support
Works with Google AI Studio, OpenRouter, NVIDIA NIM, and HuggingFace — all using Gemma 4. If one provider is overloaded, switch seamlessly.
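The fallback logic can be as simple as iterating a provider table; the URLs below are placeholders, not the real endpoints, which live in each provider's documentation:

```javascript
// Illustrative provider table with placeholder endpoints.
const PROVIDERS = [
  { name: "Google AI Studio", url: "https://example.com/google-gemma" },
  { name: "OpenRouter", url: "https://example.com/openrouter-gemma" },
];

// Try each provider in order, falling through on HTTP errors or network
// failures. fetchImpl is injectable so the logic can be tested offline.
async function callWithFallback(body, fetchImpl = fetch) {
  let lastError = new Error("no providers configured");
  for (const provider of PROVIDERS) {
    try {
      const res = await fetchImpl(provider.url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
      });
      if (res.ok) return res.json();
      lastError = new Error(`${provider.name} returned ${res.status}`);
    } catch (err) {
      lastError = err; // network failure: try the next provider
    }
  }
  throw lastError;
}
```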
✅ API Key Validation
One-click key test before starting — instant feedback on whether your setup is working.
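A key check like this only needs one cheap authenticated call; the endpoint below is a placeholder, since each provider exposes its own inexpensive route (a model-list call, for example) suitable for a ping:

```javascript
// Return true if the key authenticates, false on rejection or network error.
// fetchImpl is injectable for testing.
async function validateKey(apiKey, fetchImpl = fetch) {
  try {
    const res = await fetchImpl("https://example.com/v1/models", {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    return res.ok;
  } catch {
    return false; // network failure counts as "not working" for the UI
  }
}
```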
📋 Session History
LocalStorage-backed session tracking so you can see your progress over time — modes practiced, questions completed, and token usage.
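A sketch of the history store; the key name and record shape are assumptions, and the storage object is injectable so the logic also works outside the browser:

```javascript
const HISTORY_KEY = "interviewCoach.sessions"; // hypothetical key name

// Append a finished session to the persisted history; returns the new count.
function saveSession(session, storage = localStorage) {
  const history = JSON.parse(storage.getItem(HISTORY_KEY) || "[]");
  history.push({
    mode: session.mode,
    questions: session.questions,
    tokensUsed: session.tokensUsed,
    endedAt: Date.now(),
  });
  storage.setItem(HISTORY_KEY, JSON.stringify(history));
  return history.length;
}
```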
🌙 Dark Mode
Full dark/light theme toggle — easy on the eyes during late-night interview prep.
How I Used Gemma 4
Model Choice: 31B Dense — And Why It Matters
I deliberately chose the 31B Dense model over the lighter MoE variants. Here's why:
Interview coaching demands the highest reasoning quality available. When Gemma 4 evaluates a candidate's answer, it needs to simultaneously:
- Parse whether the answer follows frameworks like STAR or MECE
- Identify specific gaps ("You mentioned leading the migration, but what was the quantifiable impact?")
- Calibrate difficulty for the next question based on performance so far
- Generate feedback that's encouraging but honest
The 31B Dense model activates all 31 billion parameters for every token, producing noticeably more nuanced and accurate evaluations than the smaller variants. For a coaching tool where feedback quality IS the product, this was non-negotiable.
128K Context Window → True Multi-Turn Coaching
This is where Gemma 4 really shines. Interview practice isn't a one-shot Q&A — it's a 15–20 turn conversation where the coach needs to:
- Remember your answer to Q1 when evaluating Q8
- Notice patterns ("You keep avoiding specifics — let me push harder")
- Generate a final report that references the entire session
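Mechanically, "remembering" is just never truncating: every turn gets appended to one growing message array that is sent with each request. The role/content schema below is the common chat-API convention, an assumption about the app's internals:

```javascript
const SYSTEM_PROMPT = "You are an interview coach."; // placeholder text
const messages = [{ role: "system", content: SYSTEM_PROMPT }];

// Append a full Q&A round; the entire history goes to the model next turn.
function recordTurn(question, answer, feedback) {
  messages.push({ role: "assistant", content: question });
  messages.push({ role: "user", content: answer });
  messages.push({ role: "assistant", content: feedback });
  return messages.length;
}
```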
A typical session's token budget:
- System prompt: ~800 tokens
- Per Q&A round (question + answer + feedback): ~500 tokens
- 15 rounds: ~7,500 tokens
- Final report: ~2,000 tokens
- Total: ~10,300 tokens, comfortably within the 128K window
No chunking, no summarization, no lost context. Every answer is remembered and referenced.
Native Chain-of-Thought → Superior Evaluations
Gemma 4's built-in thinking tokens (parts flagged with `"thought": true`) make a real difference for evaluation tasks. Before responding, the model reasons internally:

```json
[
  {
    "text": "The user mentioned leading a team of 5...\n- STAR compliance? Situation ✓, Task ✓, Action partial, Result missing\n- Specificity? Medium — needs quantified metrics\n- Overall assessment: push for more concrete outcomes",
    "thought": true
  },
  {
    "text": "Great foundation! You clearly described the situation and your role. To make this a knockout answer, add the specific outcome..."
  }
]
```
This internal reasoning produces dramatically better feedback than models that generate evaluations in a single pass.
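In the UI, those thinking parts have to be separated from the visible reply before rendering. A minimal sketch, assuming each response part carries a `thought` flag as in the example:

```javascript
// Split a response's parts into internal reasoning and the visible reply.
function splitThoughts(parts) {
  const thoughts = parts.filter((p) => p.thought === true);
  const visible = parts.filter((p) => !p.thought);
  return {
    reasoning: thoughts.map((p) => p.text).join("\n"),
    reply: visible.map((p) => p.text).join("\n"),
  };
}
```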
Open & Free → Accessible to Everyone
This was the whole point. Gemma 4 runs on:
- Google AI Studio free tier — no credit card required
- OpenRouter — free tier available
- Locally via Ollama — on a decent laptop
- NVIDIA NIM / HuggingFace — for developers who prefer those platforms
Zero cost, zero gatekeeping, zero excuses not to practice.
Architecture: Why Zero Backend?
```
Browser ──(HTTPS)──> Gemma 4 API (31B Dense)
   ▲                        │
   └─────── response ───────┘
```
No server. No proxy. No database.
The entire app is a single HTML file — no React build step, no Node server. Here's why:
- Privacy — Your API key and interview responses never touch a third-party server
- Cost — $0 hosting via GitHub Pages
- Speed — No proxy round-trip. Browser → Gemma 4 → Browser
- Simplicity — `git clone && open index.html` is the full setup
The tradeoff: Users need their own API key. I chose this intentionally — it keeps the tool free forever and teaches users about AI APIs in the process.
Tech Stack
| Component | Technology |
|---|---|
| Frontend | Vanilla HTML + Tailwind CSS (CDN) |
| AI Model | Google Gemma 4 31B Dense |
| Voice | Web Speech API (STT + TTS) |
| Providers | Google AI Studio, OpenRouter, NVIDIA NIM, HuggingFace |
| Markdown | Custom lightweight renderer |
| State | In-memory + LocalStorage |
| Hosting | GitHub Pages (static) |
What I Learned
Gemma 4's thinking tokens are game-changing for evaluation tasks. The model genuinely considers multiple aspects before responding — producing feedback that feels like a real interviewer's assessment, not a generic LLM response.
128K context is overkill for most apps — but perfect for coaching. The ability to reference earlier answers creates a coherent experience that shorter-context models can't match. The final session report can cite specific moments from Q1 through Q15.
Zero-backend AI apps are viable and powerful. Browser → API → Browser eliminates 90% of infrastructure complexity. The main cost is that users bring their own key — but for free tools, that's a feature, not a bug.
Multi-provider resilience matters. Google AI Studio's free tier occasionally returns 500/503 errors under load. Having OpenRouter, NVIDIA NIM, and HuggingFace as fallbacks — all serving Gemma 4 — keeps the tool reliable.
Try It
🔗 Live Demo: hajirufai.github.io/gemma4-interview-coach
💻 Source Code: github.com/hajirufai/gemma4-interview-coach
⭐ Star the repo if you find it useful!
Built by Haji Rufai — creator of Interview Buddy, a free AI-powered interview preparation platform.