Fiewor John

How Gemma 4 Helped Us Cut the Last Wire Keeping AI Out of African Classrooms


This is a submission for the Gemma 4 Challenge: Build with Gemma 4


The Problem Started Long Before This Hackathon

In 2021, as a postgraduate student, I built something called AI-Grader. The idea was simple: teachers spend an unreasonable amount of their lives marking scripts. AI could change that. The first version used Microsoft Azure's Cognitive Services — OCR to read handwriting, key phrase extraction to compare answers. It worked, but it was brittle. It compared keywords, not understanding. It could not tell you why it gave a grade.

Then ChatGPT dropped, and the gap between what I had built and what was now possible became impossible to ignore.

In early 2024, I teamed up with two people I had only just met, and together we rebuilt everything from scratch — this time with Google Gemini — during the Google BuildWithAI Hackathon organised by GDG Lagos, renaming the project GradrAI. We placed third nationally and walked away with something more valuable than the prize: confirmation that this problem was real and that Gemini was the right foundation to build on.

GradrAI's relationship with Google had officially begun.


What GradrAI Actually Does

GradrAI is an AI-powered assessment platform built for educators in Africa. Two core workflows:

Paper-Based Test (PBT) grading: Teachers upload scanned student scripts, a question paper, and a marking guide. Gemini reads the handwriting, evaluates each answer against the rubric, and returns per-question scores, explanations, and personalised student feedback — in minutes, not days.

Computer-Based Test (CBT) generation and grading: Teachers upload lecture notes or past papers. The platform extracts topics, generates a structured quiz (MCQ, essay, or hybrid), produces a marking guide, and grades student submissions automatically.

The cloud version of this works well. Schools loved the demos.

Then came the part nobody wants to hear at a demo: "We don't have reliable internet."


The Conversation That Started Everything

On the 20th of April, I shared a NetworkChuck YouTube Short about offline AI models with my team on WhatsApp. And then a day before this hackathon announcement, FreeCodeCamp dropped a tutorial on open model coding essentials that specifically highlighted Gemma 4 as "very smart with low memory usage."

Greatness, a member of the GradrAI design team, responded based on his experience: "But you need a lot of RAM power."
Adams, our backend lead: "Yes, definitely."

They were right to be cautious. But the timing of those two things landing in the same week felt like something. And there was context behind it that made it feel even more urgent.

Three schools we had demoed GradrAI to gave us near-identical feedback: a private secondary school in Lagos told us they lacked the infrastructure to adopt a cloud-dependent solution; a school in Ogun State said internet costs were not something they were prepared to cover for teachers; a third Lagos school echoed both.

These were not fringe cases. They were our target market telling us, clearly, that the cloud dependency was the barrier. Not the price. Not the product. The wire running from the classroom to the internet.

That day, we started building.


Phase 1: An Offline App That Still Needed the Cloud

The first version of the GradrAI desktop app — an Electron application — solved part of the problem. Teachers could import an exam package from the cloud onto their laptop. Students would connect to the teacher's local network and take the exam offline on their own devices. The local SQLite database collected every attempt.

MCQs were graded immediately. The correct answer was embedded in the exam package at import, so no AI was needed.
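Because the answer key travels inside the exam package, MCQ grading reduces to a pure lookup. A minimal sketch of what that could look like — the `answerKey` and submission shapes here are illustrative assumptions, not GradrAI's actual package format:

```javascript
// gradeMcq.js — deterministic MCQ grading against an embedded answer key.
// NOTE: the answerKey/submission shapes are illustrative assumptions.
function gradeMcq(answerKey, submission) {
  return submission.map(({ questionId, selected }) => {
    const key = answerKey[questionId];
    const correct = key !== undefined && selected === key.correctOption;
    return {
      questionId,
      correct,
      score: correct ? key.marks : 0,
      maxScore: key ? key.marks : 0,
    };
  });
}

module.exports = { gradeMcq };
```

No model call, no network: the result is available the instant the student submits.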

Essay and theory questions were a different story. They sat in a queue. Ungraded. Waiting for the teacher's machine to get a stable internet connection so the answers could be synced to the cloud and processed by Gemini.

The learning loop — the thing that makes assessment valuable — was broken at exactly the moment it mattered most. A student submits an essay. They want to know what they got wrong. The system says: come back when there's WiFi.

That is not a product. That is a waiting room.


The Sign

On the 6th of May, the DEV Community announced the Gemma 4 Challenge. $3,000 in prizes. Running through May 24. The mandate: build something real with Gemma 4.

I had been sitting on a half-finished idea for weeks. This was the deadline that forced it into existence.

The first task was straightforward: move theory grading offline. Replace the cloud sync queue with a local Gemma 4 model running via Ollama as a sidecar process inside the Electron app. When a student submits an essay, the answer goes directly to Gemma 4 on the teacher's machine. Score, explanation, and feedback come back in seconds. No internet required.

This worked. But then I stopped and looked at the full picture.

The grading was offline. The exam creation was still entirely online. Teachers still needed cloud access to extract topics from their materials, generate quiz questions, and produce marking guides. We had moved the last mile offline while leaving the first mile completely dependent on the internet.

That made no sense.


Phase 2: Taking the Entire Pipeline Offline

The cloud CBT pipeline in GradrAI has four AI operations:

  1. Topic extraction — read a PDF, identify key pedagogical topics and their relative weights
  2. Exam generation — take those topics and produce a structured quiz
  3. Marking guide generation — for essay questions, produce the evaluation rubric
  4. Theory grading — evaluate student answers against the marking guide

Operations 3 and 4 are text-in, text-out. Swapping them to a local Gemma 4 call is straightforward.

Operations 1 and 2 are the hard part. The cloud implementation sends Google Cloud Storage URIs directly to Vertex AI — a mechanism specific to Google's infrastructure. Ollama cannot resolve a GCS URI. To run these operations locally, we needed to rasterise the PDF pages to images, base64-encode them, and pass them as inline multimodal parts to Gemma 4.

This is exactly where Gemma 4's native multimodal capability becomes the architectural enabler. Without it, local topic extraction from a PDF is either impossible or requires a separate OCR pipeline bolted on the side. With Gemma 4, the same model that extracts topics also generates questions and grades essays. One model. One Ollama endpoint. The entire pipeline.

The implementation:

// offlineProcessor.js — PDF pages to base64 image parts, entirely in memory
const { pdfToPng } = require("pdf-to-png-converter");

async function rasterisePdfToInlineParts(localFilePath) {
  const pages = await pdfToPng(localFilePath, {
    viewportScale: 1.5,
    outputFileMask: "page",
  });

  const capped = pages.slice(0, 20); // respect E4B context window

  return capped.map((page) => ({
    type: "image",
    data: page.content.toString("base64"),
    mimeType: "image/png",
  }));
}
// gemma.provider.js — unified Ollama interface for all four operations
const axios = require("axios");

async function generateContent({ promptText, imageParts = [] }) {
  const message = {
    role: "user",
    content: promptText,
    ...(imageParts.length > 0 && {
      images: imageParts.map((p) => p.data), // Ollama expects base64 array here
    }),
  };

  const response = await axios.post(
    "http://127.0.0.1:11434/api/chat",
    {
      model: process.env.OFFLINE_MODEL || "gemma4",
      messages: [message],
      stream: false,
    },
    { timeout: 120000 }
  );

  return response.data.message.content;
}

The provider abstraction means the rest of the application — topic extraction, exam generation, marking guide generation, theory grading — never needs to know whether it is talking to Gemini on Vertex AI or Gemma 4 on Ollama. The service layer calls generateContent. The provider handles the rest.
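That switch can be sketched as a small factory. The names below (`geminiProvider`, `gemmaProvider`, the `offline` flag) are illustrative assumptions rather than GradrAI's actual identifiers; the point is that both providers expose the same `generateContent({ promptText, imageParts })` signature:

```javascript
// provider.factory.js — picks a generateContent implementation at runtime.
// NOTE: provider names and the offline flag are illustrative assumptions.
function selectProvider({ offline, geminiProvider, gemmaProvider }) {
  // Offline mode routes everything to the local Gemma/Ollama provider;
  // otherwise calls go to Gemini on Vertex AI. Same interface either way.
  return offline ? gemmaProvider : geminiProvider;
}

module.exports = { selectProvider };
```

Because the service layer only ever holds a provider with that one method, adding a third backend later would be a one-line change here.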


Why Gemma 4 Specifically

This is not a case of "we picked a model and ran with it." The choice of Gemma 4 over other open models was deliberate and is embedded in the architecture.

The E4B variant runs on consumer hardware. Most schools in our target market do not have server-grade machines; the teacher's laptop is the server. Gemma 4 E4B runs on consumer-grade laptops and high-end Android phones, which is to say the hardware that actually exists in Nigerian secondary schools: the mid-range Windows laptop a teacher might already carry to class.

Native multimodal input eliminates a pipeline dependency. Every other approach to offline PDF processing requires a separate OCR layer. Gemma 4's vision capability means a single model handles both document understanding and structured output generation. The pipeline is simpler, the failure surface is smaller, and the resource footprint on the local machine is lower.

The 128K context window makes the full syllabus available. For topic extraction, we can pass an entire semester's lecture notes as context in a single call. For grading complex essays, the marking guide, question paper, and student answer fit comfortably within the window without chunking. This is the difference between a model that skims a document and one that reads it.

Structured JSON output is reliable enough for production use. Our grading pipeline requires strict schema adherence — every response must contain questionId, score, maxScore, explanation, and feedback fields, or the result cannot be persisted. The existing score-clamping and schema validation in our grading service guards against drift, but Gemma 4's instruction following at this task is consistent enough that the guard rarely needs to activate.
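The guard described above can be sketched as a pure validator. The required field names come from the paragraph; the exact clamping behaviour is an assumption about how our grading service handles out-of-range scores:

```javascript
// validateGrade.js — schema check plus score clamping for one grading result.
// Field names match the required schema; clamping behaviour is an assumption.
const REQUIRED_FIELDS = ["questionId", "score", "maxScore", "explanation", "feedback"];

function validateGrade(result) {
  for (const field of REQUIRED_FIELDS) {
    if (!(field in result)) {
      // A result missing any field cannot be persisted.
      throw new Error(`Grading result missing required field: ${field}`);
    }
  }
  // Clamp the score into [0, maxScore] to guard against model drift.
  const maxScore = Number(result.maxScore);
  const score = Math.min(Math.max(Number(result.score), 0), maxScore);
  return { ...result, score, maxScore };
}

module.exports = { validateGrade };
```

In practice the clamp is a safety net: when the model follows the schema, the validated object is identical to the raw one.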


What the Offline CBT Pipeline Looks Like Now

A teacher in a school with no internet connection can now do the following, entirely on their laptop:

  1. Open the GradrAI desktop app. Ollama starts automatically in the background. Gemma 4 loads.
  2. Upload a PDF — lecture notes, a past paper, a textbook chapter.
  3. The app rasterises the PDF in memory and sends the pages to Gemma 4. Topics are extracted and returned to the teacher in a visual interface where they can adjust weights and priorities.
  4. The teacher configures the exam: MCQ count, essay count, difficulty, total marks.
  5. Gemma 4 generates the full question set and marking guide simultaneously.
  6. The teacher reviews, approves, and publishes the exam to the local network.
  7. Students connect their devices to the teacher's hotspot and take the exam.
  8. MCQs are graded deterministically and instantly.
  9. Essay answers go directly to Gemma 4. Scores and feedback are returned to students within seconds.
  10. All results are stored in the local SQLite database and synced to the cloud the next time internet is available.
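Step 1's "Ollama starts automatically" implies the app should confirm the model is actually pulled before enabling offline features. One way to do that is to parse the payload of Ollama's `GET /api/tags` endpoint; the helper name below is an assumption, but the payload shape (`{ models: [{ name: "model:tag" }] }`) is Ollama's documented format:

```javascript
// modelCheck.js — confirm the local model is present in Ollama's tag list
// before enabling offline grading. Helper name is illustrative.
function hasModel(tagsPayload, modelName) {
  const models = (tagsPayload && tagsPayload.models) || [];
  // Ollama names models like "gemma4:latest"; match the bare name too.
  return models.some(
    (m) => m.name === modelName || m.name.split(":")[0] === modelName
  );
}

module.exports = { hasModel };
```

If the check fails, the app can surface a one-time "download the model" step instead of failing silently at grading time.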

No cloud token was consumed. No data left the building. The student got their feedback before the class ended.


The Broader Context: Why This Matters for Africa

The shift toward Computer-Based Testing is accelerating across Africa. Nigeria's Joint Admissions and Matriculation Board (JAMB) now administers its UTME entirely via CBT to over 1.8 million candidates annually. Ghana's WAEC has been piloting CBT for WASSCE. Kenya's KNEC has published a national CBT roadmap. The continent is moving toward digital assessment at institutional scale.

The infrastructure reality has not kept pace. Nigeria's internet penetration sits below 45%. Electricity supply in many states is unreliable, meaning internet-dependent systems face compounding failure modes — no power, no router, no exam. The tools being built for this transition cannot assume the connectivity that developed-market EdTech takes for granted.

GradrAI's offline-first architecture is not a feature for edge cases. It is the baseline requirement for the market we are actually building for.

Gemma 4 is what made it technically feasible at the hardware level that this market actually has.


What Comes Next

The offline CBT pipeline is complete. The next stage is the SmartPrep feature — GradrAI's student-facing JAMB and WASSCE practice mode. Currently, SmartPrep surfaces past questions from an external API and provides AI-generated feedback online. The logical extension is an offline practice mode: Gemma 4 generates practice questions from curriculum content, grades student responses, identifies knowledge gaps across a session, and produces a diagnostic summary — all on-device, all free of cloud cost.

A student preparing for JAMB in a rural school, with no data, no tutor, and no money for a prep centre, practising with an AI that tells them exactly what they got wrong and why. That is the version of this product we are building toward.


Links

  • GitHub Repository: (to be added before submission)
  • Demo Video: (to be added before submission)
  • GradrAI Platform: gradrai.com

What we built during this hackathon is not a demo. It is a production feature shipping to real schools. The students who take exams on GradrAI's desktop app this term will get their theory grades back before they leave the classroom — because Gemma 4 is running on the teacher's laptop while they wait.
