This is a submission for the GitHub Copilot CLI Challenge.
What I Built
Cloudgen is a product that turns cloud deployment into a conversation. You describe what you need in plain English: your app, how many users you expect, whether you need a database or cache, and an AI figures out the rest. No config files, no dashboards, no wrestling with infrastructure. You get a clear plan, a cost estimate, and you’re one approval away from your app being live.
Behind that simple experience is a lot of complexity: understanding your intent, analyzing your repo, sizing resources, generating a safe execution plan, and then actually provisioning everything and handing you live URLs and connection strings. We built Cloudgen so you never have to touch that complexity yourself.
Demo
Live project: https://cloudgenapp.vercel.app/
Video walkthrough: https://www.loom.com/share/0b5f73a623bb438ebd1ec41053427c21
GitHub Repository: https://github.com/mbcse/cloudgen
📌 The Problem
Deploying even a simple application to the cloud today requires:
- Writing Dockerfiles, YAML manifests, and IaC templates
- Navigating complex dashboards across AWS / GCP / Azure
- Manually provisioning databases, caches, compute, and networking
- Understanding container orchestration, port mappings, reverse proxies, and health checks
For a developer who just wants to ship their app, this is too much friction.
Small teams and indie developers often spend more time wrangling infrastructure than building product. The gap between "I have a repo" and "it's live on the internet" shouldn't require DevOps expertise.
💡 Our Solution
Cloudgen is a chat-first cloud control plane where users describe their infrastructure needs in plain English, and an agentic AI pipeline analyzes their repository, generates a deployment plan, and provisions real infrastructure — all with human-in-the-loop approval.
"Deploy my Next.js app from GitHub with a Postgres database for 500 users"
→ Cloudgen analyzes the repo, generates a resource plan with cost estimates, and after approval, provisions containers, databases, and networking automatically.
Why it matters
Getting from “I have a repo” to “it’s live on the internet” usually means learning orchestration, networking, databases, and billing. Small teams and indie devs end up spending more time on infra than on product. Cloudgen flips that. You say what you want, review a plan the AI proposes, approve it, and we handle the rest. Deployment becomes something you do in a chat, not a checklist of manual steps.
✨ Product Features
| Feature | Description |
|---|---|
| 🗣️ Chat-to-Deploy | Natural language interface, describe what you need, get a deployment plan |
| 🔍 Smart Repo Analysis | Auto-detects runtime (Node.js/Python), framework, build commands, and Dockerfile presence |
| 📋 Plan Review & Approval | Every deployment requires explicit human approval, see resources, rationale, cost estimates, and YAML steps before anything runs |
| 🐳 Multi-Resource Provisioning | Deploy App Services, Compute Instances, PostgreSQL, and Redis from a single conversation |
| 💰 Cost Estimation | AI-powered resource sizing with tiered pricing (ac.starter, ac.pro, ac.business) |
| 📡 Live Deployment Logs | Real-time streaming logs as containers build and start |
| 🔗 Endpoint Discovery | Automatically returns live URLs and connection strings after deployment |
| 🧠 RAG-Powered Context | Internal docs and repo READMEs are indexed for smarter, context-aware responses |
| ⚡ Graceful Degradation | Works without an LLM API key using deterministic heuristic fallbacks |
🏗️ Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│                       Next.js Frontend                          │
│     Dashboard · Chat UI · Plan Review · Deployment Logs         │
└──────────────────────────────┬──────────────────────────────────┘
                               │ REST API
┌──────────────────────────────▼──────────────────────────────────┐
│                      Fastify API Server                         │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │             LangGraph Chat Orchestrator                  │   │
│  │                                                          │   │
│  │  Parse Intent ──► RAG Retrieval ──► Plan Generation      │   │
│  │       │                                  │               │   │
│  │       ▼                                  ▼               │   │
│  │  ┌─────────┐  ┌──────────────┐  ┌────────────┐           │   │
│  │  │ Intent  │  │    Repo      │  │  Capacity  │           │   │
│  │  │  Agent  │  │  Inspector   │  │   Agent    │           │   │
│  │  │         │  │    Agent     │  │            │           │   │
│  │  └─────────┘  └──────────────┘  └────────────┘           │   │
│  │       Gemini 2.0 Flash / Deterministic Fallback          │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │   Deployer   │  │  RAG Engine  │  │  Prisma + Postgres   │   │
│  │(SSH + Docker)│  │  (pgvector)  │  │  (Control Plane DB)  │   │
│  └──────┬───────┘  └──────────────┘  └──────────────────────┘   │
└─────────┼───────────────────────────────────────────────────────┘
          │ SSH
┌─────────▼───────────────────────────────────────────────────────┐
│                      EC2 Runtime Host                           │
│                                                                 │
│  ┌─────────┐  ┌──────────┐  ┌─────────┐  ┌────────────────┐     │
│  │   App   │  │ Postgres │  │  Redis  │  │    Compute     │     │
│  │Container│  │Container │  │Container│  │   Instance     │     │
│  └─────────┘  └──────────┘  └─────────┘  └────────────────┘     │
│                  Nginx (reverse proxy)                          │
└─────────────────────────────────────────────────────────────────┘
```
🧠 How It Works — Technical Deep Dive
1. Chat Orchestration (LangGraph)
The core of Cloudgen is a stateful LangGraph pipeline (agent.ts) that processes every user message through a directed graph of nodes:
```
START ──► parse_intent ──► retrieve_context ──► route_intent
                                                     │
                                     ┌───────────────┼────────────┐
                                     ▼               ▼            ▼
                               generate_plan   answer_question   ...
                                     │
                                     ▼
                                save_plan ──► respond ──► END
```
- `parse_intent` — Regex + heuristic parser extracts repo URLs, expected user counts, database requirements, and whether the user wants to plan, deploy, or ask a question
- `retrieve_context` — Queries the RAG index (pgvector cosine similarity) for relevant internal docs and repo READMEs
- `generate_plan` — Invokes the multi-agent planning pipeline when a provisioning intent is detected
- `respond` — Streams a contextual reply back using Gemini, or falls back to a templated response
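The `parse_intent` node is essentially a deterministic extractor. As a rough sketch of the kind of heuristics involved (the function name, regexes, and output shape here are illustrative, not Cloudgen's actual code):

```typescript
// Hypothetical sketch of a parse_intent-style heuristic extractor.
// Field names and patterns are illustrative, not Cloudgen's real code.
interface ParsedIntent {
  repoUrl: string | null;
  expectedUsers: number | null;
  databaseRequired: boolean;
  wantsDeploy: boolean;
}

function parseIntent(message: string): ParsedIntent {
  const repoMatch = message.match(/https:\/\/github\.com\/[\w.-]+\/[\w.-]+/);
  const usersMatch = message.match(/(\d[\d,]*)\s*users?/i);
  return {
    repoUrl: repoMatch ? repoMatch[0] : null,
    expectedUsers: usersMatch
      ? parseInt(usersMatch[1].replace(/,/g, ""), 10)
      : null,
    databaseRequired: /postgres|database|db\b/i.test(message),
    wantsDeploy: /\bdeploy|launch|ship\b/i.test(message),
  };
}

const intent = parseIntent(
  "Deploy https://github.com/user/app with a Postgres database for 500 users"
);
console.log(intent);
// → repoUrl "https://github.com/user/app", expectedUsers 500,
//   databaseRequired true, wantsDeploy true
```

The nice property of a parser like this is that it has no failure mode: even with no LLM configured, the graph always gets a well-typed intent to route on.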
2. Multi-Agent Planning Pipeline
When a deployment intent is detected, three specialized agents collaborate (planning.ts):
| Agent | Role | Output |
|---|---|---|
| Intent Agent | Extracts high-level goals — expected users, latency sensitivity, which resources are needed (app/postgres/redis/instance) | IntentAgentOutput |
| Repo Inspector Agent | Fetches the GitHub repo structure, package.json, Dockerfile, and requirements.txt to infer runtime, framework, build commands, and app port | RepoInspectorOutput |
| Capacity Agent | Takes the outputs of the previous two agents and determines resource sizing (CPU, memory, replicas), tier selection, and cost estimation | CapacityAgentOutput |
Each agent calls Gemini 2.0 Flash via a structured JSON prompt (llm.ts). If no API key is available, each agent has a deterministic fallback so the system remains fully functional without any LLM.
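A deterministic capacity fallback can be as simple as a threshold table. The sketch below is illustrative: the tier names come from the feature table above, but the user thresholds, CPU/memory values, and prices are invented for the example (the $34/mo figure matches the sample flow later in this post):

```typescript
// Illustrative capacity-sizing fallback. Tier names match the product's
// pricing tiers; the thresholds, sizes, and prices are assumptions.
interface CapacityDecision {
  tier: "ac.starter" | "ac.pro" | "ac.business";
  cpus: number;
  memoryMb: number;
  estimatedCostUsd: number;
}

function sizeCapacity(expectedUsers: number): CapacityDecision {
  if (expectedUsers <= 100) {
    return { tier: "ac.starter", cpus: 0.5, memoryMb: 512, estimatedCostUsd: 12 };
  }
  if (expectedUsers <= 1000) {
    return { tier: "ac.pro", cpus: 1, memoryMb: 1024, estimatedCostUsd: 34 };
  }
  return { tier: "ac.business", cpus: 2, memoryMb: 4096, estimatedCostUsd: 96 };
}

console.log(sizeCapacity(500)); // a 500-user app lands in the ac.pro tier
```

Because the fallback returns the same shape as the LLM-backed agent, the rest of the pipeline never needs to know which path produced the decision.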
The pipeline produces:
- A `DeploymentPlan` stored in Postgres
- A YAML step file (`deployment-plans/<plan-id>.yaml`) describing the exact Docker commands to execute
3. RAG Engine (pgvector)
Cloudgen uses Retrieval-Augmented Generation to ground agent responses in real documentation (rag.ts):
- Internal docs (architecture, runbooks) are chunked and embedded into pgvector
- Repository READMEs from connected GitHub repos are fetched and indexed
- At query time, a deterministic 64-dim embedding is computed and a cosine similarity search retrieves the most relevant chunks
- Retrieved chunks are injected as context into the LLM prompt and returned as citations in the chat response
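The deterministic 64-dim embedding mentioned above can be approximated with simple token hashing. The actual rag.ts implementation may differ, so treat this as an assumption-laden illustration of the idea:

```typescript
// Sketch of a deterministic 64-dim bag-of-tokens embedding plus cosine
// similarity, in the spirit of the pgvector fallback described above.
// The hashing scheme here is an assumption, not Cloudgen's actual code.
const DIM = 64;

function embed(text: string): number[] {
  const vec = new Array(DIM).fill(0);
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (const ch of token) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    vec[h % DIM] += 1; // each token bumps one bucket
  }
  return vec;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < DIM; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Identical texts embed identically, so self-similarity is 1.
console.log(cosineSimilarity(embed("deploy postgres"), embed("deploy postgres")));
```

A hashed embedding like this is far weaker than a learned one, but it is deterministic and dependency-free, which is exactly what a no-API-key fallback needs.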
4. Deployment Executor (SSH + Docker)
After plan approval, the Deployer (deployer.ts) triggers the Executor (executor.ts):
- SSH tunnel to the runtime EC2 host using key-based authentication
- Clone the GitHub repo on the remote host
- Auto-generate Dockerfile if missing (for Node.js repos)
- Build the Docker image
- Provision each resource as a Docker container with CPU/memory limits:
  - `docker run` with `--cpus`, `--memory`, port mappings
  - Postgres containers use `postgres:16-alpine`
  - Redis containers use `redis:7-alpine`
  - Compute instances use `ubuntu:22.04` with long-running entry points
- Health check — polls the container endpoint until it responds
- Configure Nginx reverse proxy route (optional)
- Return endpoints — live app URL + database/cache connection strings
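The health-check step boils down to a bounded retry loop. A minimal sketch with the probe injected as a function, so it can be exercised without a live container (names, retry counts, and delays are illustrative, not the executor's exact values):

```typescript
// Sketch of a bounded health-check retry loop. The probe is injected so
// the logic is testable without a running container; the attempt count
// and delay are illustrative defaults, not the executor's real values.
async function waitForHealthy(
  probe: () => Promise<boolean>,
  attempts = 5,
  delayMs = 5000
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true; // container responded
    } catch {
      // treat network errors as "not ready yet"
    }
    if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs));
  }
  return false; // caller marks the deployment FAILED
}

// Example: a probe that only succeeds on the third call.
let calls = 0;
waitForHealthy(async () => ++calls >= 3, 5, 10).then((ok) =>
  console.log(ok, calls)
);
```

In production the probe would be an HTTP GET against the freshly mapped port; injecting it keeps the retry policy decoupled from the transport.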
5. Data Model (Prisma + PostgreSQL)
The control plane persists all state via Prisma ORM (schema.prisma):
| Model | Purpose |
|---|---|
| `Project` | A linked GitHub repo with name, slug, branch |
| `ChatSession` | Conversation thread tied to a project |
| `ChatMessage` | Individual messages with role and citations |
| `DeploymentPlan` | AI-generated plan with inputs, decision, cost, rationale |
| `Deployment` | Execution record with status, logs, and live URLs |
| `RagDocument` / `RagChunk` / `RagEmbedding` | RAG corpus with pgvector embeddings |
🛠️ Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React, Tailwind CSS, NextAuth.js |
| Backend | Fastify 5, TypeScript, Zod validation |
| Agent Framework | LangGraph (stateful graph orchestration) |
| LLM | Gemini 2.0 Flash (streaming + structured JSON output) |
| Database | PostgreSQL + pgvector (control plane + RAG) |
| ORM | Prisma with raw SQL for vector operations |
| Deployment Runtime | Docker containers on EC2 via SSH |
| Monorepo | npm workspaces |
🚀 Getting Started
Prerequisites
- Node.js 20+
- PostgreSQL 15+ with pgvector extension
- (Optional) Gemini API key for AI-powered planning
- (Optional) EC2 host with Docker for live deployments
Setup
```bash
# Install dependencies
npm install

# Generate Prisma client
npm run db:generate

# Run database migrations
npm run db:migrate -- --name init

# Seed initial data
npm run db:seed

# Start development servers
npm run dev
```
Environment Variables
Create apps/api/.env:
```bash
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/cloudgen"
PORT=4000

# LLM (optional — system works without it via fallbacks)
GEMINI_API_KEY=""
GEMINI_MODEL="gemini-2.0-flash"

# Runtime deployment host (leave empty for local preview mode)
RUNTIME_SSH_HOST=""
RUNTIME_SSH_USER="ubuntu"
RUNTIME_SSH_PORT="22"
RUNTIME_SSH_KEY_PATH="/path/to/key.pem"
RUNTIME_BASE_DIR="/tmp/cloudgen-apps"
RUNTIME_PUBLIC_BASE_URL="http://your-ec2-ip"

# Resource limits
ACTIVE_APP_CAP="8"
```
Services
| Service | URL |
|---|---|
| Web Dashboard | http://localhost:3000 |
| API Server | http://localhost:4000 |
| Health Check | http://localhost:4000/health |
📡 API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/projects` | Create a new project (link a GitHub repo) |
| GET | `/api/projects/:id` | Get project details |
| GET | `/api/projects/:id/resources` | Get provisioned resources for a project |
| POST | `/api/chat` | Send a chat message (triggers planning if needed) |
| GET | `/api/sessions/:id` | Get full chat session history |
| POST | `/api/plans/:id/approve` | Approve a deployment plan |
| GET | `/api/plans/:id/steps` | Get YAML deployment steps for a plan |
| POST | `/api/deploy` | Trigger deployment of an approved plan |
| GET | `/api/deployments/:id` | Get deployment status and logs |
🔄 End-to-End Flow
```
User: "Deploy https://github.com/user/app with Postgres for 500 users"
  │
  ├──► Intent Parser: repo URL, 500 users, postgres needed
  ├──► RAG Retrieval: fetch relevant architecture docs
  ├──► Intent Agent: high-level goals + resource list
  ├──► Repo Inspector: Node.js, has Dockerfile, port 3000
  ├──► Capacity Agent: ac.pro tier, 1 CPU, 1GB RAM, ~$34/mo
  │
  ▼
Plan Generated:
  • App container (Node.js, port 3000)
  • Postgres container (postgres:16-alpine)
  • Estimated cost: $34/mo
  • YAML steps file written
  │
  ├──► User reviews plan + rationale
  ├──► User approves
  │
  ▼
Deployment:
  • SSH into EC2 host
  • Clone repo, build Docker image
  • Start app + postgres containers
  • Health check passes
  • Nginx route configured
  │
  ▼
Result: "Your app is live at http://host:21042 🎉"
```
🧪 Smoke Test
Run the full end-to-end flow against a running instance:
```bash
npm run smoke
```
My Experience with GitHub Copilot CLI
Cloudgen is a complex product: orchestrated AI agents, a full API, real provisioning, and a dashboard. We used GitHub Copilot CLI from the terminal throughout the build. It didn’t just speed up typing; it helped us design and implement systems we’d have hesitated to tackle alone. Below is how we used it, the prompts that worked, and how you can use Copilot CLI more effectively on your own projects.
How we used Copilot CLI: interactive vs one-off
Copilot CLI has two modes we relied on every day.
Interactive mode (default). We’d run copilot in the project root and stay in a session. That’s where we did most of the work: multi-file changes, refactors, and “add a new node to the graph” style tasks. The back-and-forth let us refine in small steps (“now add error handling for when the repo URL is invalid”) without rewriting long prompts. We’d confirm we trusted the folder when asked, then work in that directory and its subdirectories.
Programmatic mode. For quick, single-shot tasks we used -p or --prompt. For example:
```bash
copilot -p "Add a Zod schema for POST /api/chat: sessionId optional UUID, projectId optional UUID, message required string min 1"
```
That gave us a schema and a route stub we could paste into the Fastify app. We used this for validation schemas, small utilities, and one-off scripts (e.g. “generate a bash script to run the API and web app with one command”). For anything that would modify or run files, Copilot asked for approval first, so we stayed in control.
Giving Copilot context: @file and scope
Copilot works better when it sees the exact code you care about. We used @file a lot.
Examples we actually used:
- “Explain @apps/api/src/agent.ts and list all graph nodes and what they write to state.” So we could onboard quickly and later ask for new nodes without breaking the graph.
- “In @apps/api/src/intent.ts add detection for 'no database' and set databaseRequired to false. Keep the same ParsedIntent shape.” One file, one behavior change, clear outcome.
- “@apps/api/src/planning.ts: the Capacity agent output needs a new field 'reasoning: string[]'. Add it to the interface and to the fallback object.” Copilot had the interfaces and the fallback logic in context, so the change was consistent.
- “Fix the bug in @apps/api/src/rag.ts: chunkContent is sometimes undefined when we build the citation. Add a filter.” We pointed at the file and described the symptom; Copilot proposed a safe fix.
We also kept sessions focused. When we switched from “agent graph” work to “frontend” work, we ran /clear and started a new mental context. That cut down on Copilot suggesting changes to the wrong layer. When we needed to touch both API and web app, we stayed in one session and mentioned both: “Update the API to return estimatedCostUsd in the plan payload and add a cost row in the plan review card in the project page.”
Plan before you code: /plan for big features
For larger chunks of work we used plan mode. Instead of “implement this whole thing,” we asked Copilot to design the steps first.
Example prompts:
- /plan Add a deployment plan approval flow: API endpoint POST /api/plans/:id/approve, update plan status in DB, return updated plan. Frontend: Approve button that calls the endpoint and then shows a “Deployment started” state.
- /plan Implement streaming chat: the /api/chat response should stream chunks. Frontend should consume the stream and append to the message content. Keep existing non-streaming fallback.
What we got: a structured plan (often saved to something like plan.md in the session), with checkboxes and ordered steps. We could review it, ask for edits (“add a step to handle approval rejection”), and only then say “implement this plan.” That reduced wrong turns and kept the codebase consistent. For quick bug fixes or single-file edits we didn’t use /plan, only for multi-file or multi-layer features.
Prompt examples that worked for Cloudgen
Here are real prompt patterns we used, so you can adapt them.
Orchestration and state (LangGraph)
- “In agent.ts we have a state graph. Add a new node `retrieve_context` that runs after `parse_intent`. It should call retrieveContext from rag.js with the message, put the result in state.chunks, and pass through to the next node. Use the same Annotation pattern as the other nodes.”
- “The respond node should include citations from state.chunks in the streamed reply. Each citation needs source title and chunk text. Update the streamChatReply call to pass citations.”
- “Our graph has a branch: if intent.wantsPlan we go to generate_plan, else if intent.asksQuestion we go to answer_question. Add a default branch that sets reply to a short ‘I didn’t understand’ message and goes to END.”
Planning pipeline and types
- “In planning.ts add a fallback for IntentAgentOutput when the LLM is unavailable: expectedUsers 100, databaseRequired true, requestedResources ['app'], confidence 0.5. Match the interface exactly.”
- “We’re adding estimatedCostUsd to the plan. Update: 1) CapacityAgentOutput and the fallback in planning.ts, 2) the place we save the plan to the DB in agent.ts, 3) the DeploymentPlan type in the Prisma schema if needed.”
- “Generate the TypeScript interface for DeployStep: id string, type enum (prepare_workspace, clone_repo, build_image, run_app, run_postgres, health_check, configure_route), target string, description string.”
API and validation
- “Add a Fastify route POST /api/plans/:id/approve. Body: { approved: true, approvedBy?: string }. Validate with Zod. On success call approvePlan(planId) and return { plan }.”
- “We need GET /api/deployments/:id that returns status, logsJson, appUrl, containerName. Use getDeployment from deployer.js. Return 404 if not found.”
Database and RAG
- “Our Prisma schema has Project, ChatSession, ChatMessage, DeploymentPlan, Deployment. Add a RagDocument model: id, sourceType enum (INTERNAL, RUNBOOK, REPO_README), sourceRef unique, title, content, createdAt, updatedAt. Add RagChunk with documentId and content.”
- “In rag.ts the retrieveContext function should take a query string, compute an embedding (use the existing deterministic embedding if no API key), query RagChunk by cosine similarity on the embedding column, and return the top 5 chunks with title and sourceRef.”
Frontend
- “In the project page, add a section that shows the latest deployment plan: rationale, resources list, estimated cost. If there’s no plan, show ‘No plan yet.’ Use the existing API types.”
- “Wire the Approve button to POST /api/plans/:id/approve and then POST /api/deploy with projectId and planId. Disable the button while loading and show a toast or inline message on error.”
- “The deployment logs should update in real time. Add a polling loop that calls GET /api/deployments/:id every 2 seconds while status is QUEUED or BUILDING, and append new log entries to the list.”
Provisioning and reliability
- “The executor runs a sequence of steps. If any step fails, we should push the error message to logs, set status to FAILED, and stop. Don’t run the next step. Add a try/catch around each step and update the deployment record.”
- “Add a health check step after the app is running: GET the app URL with a 30s timeout, retry up to 5 times with 5s delay. If it never responds, mark deployment as FAILED and add ‘Health check failed’ to logs.”
We were specific about inputs and outputs (e.g. “return the top 5 chunks with title and sourceRef”) and broke big features into small prompts (one node, one endpoint, one UI block). That gave us code that matched our architecture instead of generic snippets.
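The fail-fast executor behavior described in the reliability prompts above can be sketched as a small sequential runner. The types and names here are hypothetical, not the executor's real API; the step ids borrow from the DeployStep enum mentioned earlier:

```typescript
// Hypothetical fail-fast step runner in the spirit of the executor
// prompts above: run steps in order, log each result, stop on the
// first error instead of running the next step.
type Step = { id: string; run: () => Promise<void> };
type RunResult = { status: "SUCCEEDED" | "FAILED"; logs: string[] };

async function runSteps(steps: Step[]): Promise<RunResult> {
  const logs: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
      logs.push(`${step.id}: ok`);
    } catch (err) {
      logs.push(`${step.id}: ${(err as Error).message}`);
      return { status: "FAILED", logs }; // don't run the next step
    }
  }
  return { status: "SUCCEEDED", logs };
}

// Example: the failing build_image step halts the run before run_app.
runSteps([
  { id: "clone_repo", run: async () => {} },
  { id: "build_image", run: async () => { throw new Error("build failed"); } },
  { id: "run_app", run: async () => {} },
]).then((r) => console.log(r.status, r.logs));
```

In the real executor each step would shell out over SSH and the result would update the Deployment record, but the control flow is the same.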
Slash commands we used every day
We leaned on a few slash commands to work faster and keep context clean.
| Command | How we used it |
|---|---|
| `/clear` | Between unrelated tasks (e.g. after finishing the agent graph and before starting the dashboard). Clears conversation history so Copilot doesn’t drag in old files or decisions. |
| `/plan` | Before implementing a new feature or flow. Gets a step-by-step plan we can edit, then “implement this plan.” |
| `/cwd` | When we needed to scope Copilot to apps/api or apps/web only. We’d `/cwd apps/api` then ask for API-only changes. |
| `/model` | We switched to a more capable model for the orchestration and planning code (complex state and types), and kept a faster model for simple CRUD and UI. |
| `/help` | To discover other commands (e.g. `/context`, `/session`, `/delegate`) when we needed them. |
| `/review` | Before committing. “Review the changes in my current branch against main for potential bugs and security issues.” |
Pro tip: If you only remember three, use /clear, /cwd, and /plan. They give you control over context, scope, and how much “thinking” Copilot does before writing code.
Custom instructions so Copilot matched our stack
We added a .github/copilot-instructions.md (or repo-level instructions) so Copilot didn’t guess our conventions.
We wrote things like:
- Build commands: `npm run dev` (root, runs api + web), `npm run db:migrate` (from root, runs API migrations), `npm run typecheck` (root).
- Code style: TypeScript strict, ESM (import/export), prefer `async/await`, use Zod for request validation.
- Structure: Backend in `apps/api/src`, frontend in `apps/web/app` and `components`, shared types in `apps/web/lib/types.ts` and API `types.ts`.
- Workflow: After adding an API endpoint, export it from the right module and add the route in `index.ts`; after changing the Prisma schema, run `db:generate` and `db:migrate`.
That way, when we said “add an endpoint to list deployments for a project,” Copilot put it in the right place and used Zod and our existing patterns. Short, actionable instructions worked better than long essays.
How to use Copilot CLI more effectively (what we learned)
Break down complex tasks. “Implement the full chat flow” is too big. We got better results with: “Add the parse_intent node,” then “Add the retrieve_context node and wire it after parse_intent,” then “Add the branch that routes to generate_plan or answer_question.” Same for the frontend: one component or one API integration per prompt.
Be specific about inputs and outputs. “Add a function that fetches the plan” is vague. “Add a function getPlan(planId: string) that calls GET /api/plans/:id and returns the plan JSON or null if 404” is something Copilot can implement accurately.
Use @file for precision. When the change is in one file or you need to avoid touching others, put the path in the prompt. It reduces wrong-file edits and keeps context small.
Plan mode for multi-step work. For anything that touches API + DB + frontend, or several modules, we used /plan first. We reviewed the plan, adjusted it, then said “implement this plan.” Fewer rollbacks and cleaner diffs.
Validate Copilot’s output. We always ran npm run typecheck and the relevant tests after accepting changes. We especially reviewed anything that touched deployment or user data. Copilot is a powerful draft, not a substitute for review.
Clear context when switching tasks. /clear between “backend” and “frontend” or between features kept suggestions relevant. We also closed files we weren’t working on so Copilot didn’t pull them in unnecessarily.
What actually changed for us
- We could think in product terms. Describe what should happen next, get code that matched. Less jumping between docs and editor.
- Complex systems felt buildable. The orchestration and provisioning layers are the kind of thing that usually take weeks to get right. Copilot CLI gave us a strong first pass so we could refine behavior and edge cases instead of starting from zero.
- Consistency across the stack. By describing our conventions in natural language (and in copilot-instructions), we kept naming, error handling, and structure consistent across backend, agents, and frontend.
We still review and test everything, especially where real provisioning and user data are involved, but Copilot CLI let us build a product that sells the idea of “deploy by talking,” instead of getting stuck in the plumbing.
Summary
Cloudgen is a product that lets you deploy your app by describing what you need in chat. We handle the complexity — planning, sizing, provisioning, and returning live URLs and connection strings — so you don’t have to. We built it with GitHub Copilot CLI, and that’s what made it realistic to ship something this involved: multi-agent orchestration, a full API, real provisioning, and a polished dashboard, without drowning in boilerplate. We’re excited to keep improving Cloudgen and plan to launch it publicly soon, along with more tools to help you deploy better. Thanks to Copilot, we could ship fast!
Thanks to the GitHub Copilot team for running this challenge.

