This is a submission for the GitHub Copilot CLI Challenge.
What I Built
Cloudgen is a product that turns cloud deployment into a conversation. You describe what you need in plain English: your app, how many users you expect, whether you need a database or cache, and an AI figures out the rest. No config files, no dashboards, no wrestling with infrastructure. You get a clear plan, a cost estimate, and you’re one approval away from your app being live.
Behind that simple experience is a lot of complexity: understanding your intent, analyzing your repo, sizing resources, generating a safe execution plan, and then actually provisioning everything and handing you live URLs and connection strings. We built Cloudgen so you never have to touch that complexity yourself.
Demo
Live project: https://cloudgenapp.vercel.app/
Video walkthrough: https://www.loom.com/share/0b5f73a623bb438ebd1ec41053427c21
GitHub Repository: https://github.com/mbcse/cloudgen
📌 The Problem
Deploying even a simple application to the cloud today requires:
- Writing Dockerfiles, YAML manifests, and IaC templates
- Navigating complex dashboards across AWS / GCP / Azure
- Manually provisioning databases, caches, compute, and networking
- Understanding container orchestration, port mappings, reverse proxies, and health checks
For a developer who just wants to ship their app, this is too much friction.
Small teams and indie developers often spend more time wrangling infrastructure than building product. The gap between "I have a repo" and "it's live on the internet" shouldn't require DevOps expertise.
💡 Our Solution
Cloudgen is a chat-first cloud control plane where users describe their infrastructure needs in plain English, and an agentic AI pipeline analyzes their repository, generates a deployment plan, and provisions real infrastructure — all with human-in-the-loop approval.
"Deploy my Next.js app from GitHub with a Postgres database for 500 users"
→ Cloudgen analyzes the repo, generates a resource plan with cost estimates, and after approval, provisions containers, databases, and networking automatically.
Why it matters
Getting from “I have a repo” to “it’s live on the internet” usually means learning orchestration, networking, databases, and billing. Small teams and indie devs end up spending more time on infra than on product. Cloudgen flips that. You say what you want, review a plan the AI proposes, approve it, and we handle the rest. Deployment becomes something you do in a chat, not a checklist of manual steps.
✨ Product Features
| Feature | Description |
|---|---|
| 🗣️ Chat-to-Deploy | Natural language interface, describe what you need, get a deployment plan |
| 🔍 Smart Repo Analysis | Auto-detects runtime (Node.js/Python), framework, build commands, and Dockerfile presence |
| 📋 Plan Review & Approval | Every deployment requires explicit human approval, see resources, rationale, cost estimates, and YAML steps before anything runs |
| 🐳 Multi-Resource Provisioning | Deploy App Services, Compute Instances, PostgreSQL, and Redis from a single conversation |
| 💰 Cost Estimation | AI-powered resource sizing with tiered pricing (ac.starter, ac.pro, ac.business) |
| 📡 Live Deployment Logs | Real-time streaming logs as containers build and start |
| 🔗 Endpoint Discovery | Automatically returns live URLs and connection strings after deployment |
| 🧠 RAG-Powered Context | Internal docs and repo READMEs are indexed for smarter, context-aware responses |
| ⚡ Graceful Degradation | Works without an LLM API key using deterministic heuristic fallbacks |
🏗️ Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│                       Next.js Frontend                          │
│     Dashboard · Chat UI · Plan Review · Deployment Logs         │
└──────────────────────────────┬──────────────────────────────────┘
                               │ REST API
┌──────────────────────────────▼──────────────────────────────────┐
│                      Fastify API Server                         │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │             LangGraph Chat Orchestrator                  │   │
│  │                                                          │   │
│  │  Parse Intent ──► RAG Retrieval ──► Plan Generation      │   │
│  │       │                                  │               │   │
│  │       ▼                                  ▼               │   │
│  │  ┌─────────┐  ┌──────────────┐  ┌────────────┐           │   │
│  │  │ Intent  │  │    Repo      │  │  Capacity  │           │   │
│  │  │  Agent  │  │  Inspector   │  │   Agent    │           │   │
│  │  │         │  │    Agent     │  │            │           │   │
│  │  └─────────┘  └──────────────┘  └────────────┘           │   │
│  │       Gemini 2.0 Flash / Deterministic Fallback          │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │   Deployer   │  │  RAG Engine  │  │  Prisma + Postgres   │   │
│  │(SSH + Docker)│  │  (pgvector)  │  │  (Control Plane DB)  │   │
│  └──────┬───────┘  └──────────────┘  └──────────────────────┘   │
└─────────┼───────────────────────────────────────────────────────┘
          │ SSH
┌─────────▼───────────────────────────────────────────────────────┐
│                      EC2 Runtime Host                           │
│                                                                 │
│  ┌─────────┐  ┌──────────┐  ┌─────────┐  ┌────────────────┐     │
│  │   App   │  │ Postgres │  │  Redis  │  │    Compute     │     │
│  │Container│  │Container │  │Container│  │   Instance     │     │
│  └─────────┘  └──────────┘  └─────────┘  └────────────────┘     │
│                  Nginx (reverse proxy)                          │
└─────────────────────────────────────────────────────────────────┘
```
🧠 How It Works — Technical Deep Dive
1. Chat Orchestration (LangGraph)
The core of Cloudgen is a stateful LangGraph pipeline (agent.ts) that processes every user message through a directed graph of nodes:
```
START ──► parse_intent ──► retrieve_context ──► route_intent
                                                     │
                                     ┌───────────────┼────────────┐
                                     ▼               ▼            ▼
                               generate_plan   answer_question   ...
                                     │
                                     ▼
                                save_plan ──► respond ──► END
```
- `parse_intent` — Regex + heuristic parser extracts repo URLs, expected user counts, database requirements, and whether the user wants to plan, deploy, or ask a question
- `retrieve_context` — Queries the RAG index (pgvector cosine similarity) for relevant internal docs and repo READMEs
- `generate_plan` — Invokes the multi-agent planning pipeline when a provisioning intent is detected
- `respond` — Streams a contextual reply back using Gemini, or falls back to a templated response
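The `parse_intent` node is essentially a deterministic extractor. As a rough sketch of the kind of heuristics involved (the function name, regexes, and output shape here are illustrative, not Cloudgen's actual code):

```typescript
// Hypothetical sketch of a parse_intent-style heuristic extractor.
// Field names and patterns are illustrative, not Cloudgen's real code.
interface ParsedIntent {
  repoUrl: string | null;
  expectedUsers: number | null;
  databaseRequired: boolean;
  wantsDeploy: boolean;
}

function parseIntent(message: string): ParsedIntent {
  const repoMatch = message.match(/https:\/\/github\.com\/[\w.-]+\/[\w.-]+/);
  const usersMatch = message.match(/(\d[\d,]*)\s*users?/i);
  return {
    repoUrl: repoMatch ? repoMatch[0] : null,
    expectedUsers: usersMatch
      ? parseInt(usersMatch[1].replace(/,/g, ""), 10)
      : null,
    databaseRequired: /postgres|database|db\b/i.test(message),
    wantsDeploy: /\bdeploy|launch|ship\b/i.test(message),
  };
}

const intent = parseIntent(
  "Deploy https://github.com/user/app with a Postgres database for 500 users"
);
console.log(intent);
// → repoUrl "https://github.com/user/app", expectedUsers 500,
//   databaseRequired true, wantsDeploy true
```

The nice property of a parser like this is that it has no failure mode: even with no LLM configured, the graph always gets a well-typed intent to route on.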
2. Multi-Agent Planning Pipeline
When a deployment intent is detected, three specialized agents collaborate (planning.ts):
| Agent | Role | Output |
|---|---|---|
| Intent Agent | Extracts high-level goals — expected users, latency sensitivity, which resources are needed (app/postgres/redis/instance) | IntentAgentOutput |
| Repo Inspector Agent | Fetches the GitHub repo structure, package.json, Dockerfile, and requirements.txt to infer runtime, framework, build commands, and app port | RepoInspectorOutput |
| Capacity Agent | Takes the outputs of the previous two agents and determines resource sizing (CPU, memory, replicas), tier selection, and cost estimation | CapacityAgentOutput |
Each agent calls Gemini 2.0 Flash via a structured JSON prompt (llm.ts). If no API key is available, each agent has a deterministic fallback so the system remains fully functional without any LLM.
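A deterministic capacity fallback can be as simple as a threshold table. The sketch below is illustrative: the tier names come from the feature table above, but the user thresholds, CPU/memory values, and prices are invented for the example (the $34/mo figure matches the sample flow later in this post):

```typescript
// Illustrative capacity-sizing fallback. Tier names match the product's
// pricing tiers; the thresholds, sizes, and prices are assumptions.
interface CapacityDecision {
  tier: "ac.starter" | "ac.pro" | "ac.business";
  cpus: number;
  memoryMb: number;
  estimatedCostUsd: number;
}

function sizeCapacity(expectedUsers: number): CapacityDecision {
  if (expectedUsers <= 100) {
    return { tier: "ac.starter", cpus: 0.5, memoryMb: 512, estimatedCostUsd: 12 };
  }
  if (expectedUsers <= 1000) {
    return { tier: "ac.pro", cpus: 1, memoryMb: 1024, estimatedCostUsd: 34 };
  }
  return { tier: "ac.business", cpus: 2, memoryMb: 4096, estimatedCostUsd: 96 };
}

console.log(sizeCapacity(500)); // a 500-user app lands in the ac.pro tier
```

Because the fallback returns the same shape as the LLM-backed agent, the rest of the pipeline never needs to know which path produced the decision.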
The pipeline produces:
- A `DeploymentPlan` stored in Postgres
- A YAML step file (`deployment-plans/<plan-id>.yaml`) describing the exact Docker commands to execute
3. RAG Engine (pgvector)
Cloudgen uses Retrieval-Augmented Generation to ground agent responses in real documentation (rag.ts):
- Internal docs (architecture, runbooks) are chunked and embedded into pgvector
- Repository READMEs from connected GitHub repos are fetched and indexed
- At query time, a deterministic 64-dim embedding is computed and a cosine similarity search retrieves the most relevant chunks
- Retrieved chunks are injected as context into the LLM prompt and returned as citations in the chat response
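The deterministic 64-dim embedding mentioned above can be approximated with simple token hashing. The actual rag.ts implementation may differ, so treat this as an assumption-laden illustration of the idea:

```typescript
// Sketch of a deterministic 64-dim bag-of-tokens embedding plus cosine
// similarity, in the spirit of the pgvector fallback described above.
// The hashing scheme here is an assumption, not Cloudgen's actual code.
const DIM = 64;

function embed(text: string): number[] {
  const vec = new Array(DIM).fill(0);
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (const ch of token) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    vec[h % DIM] += 1; // each token bumps one bucket
  }
  return vec;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < DIM; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Identical texts embed identically, so self-similarity is 1.
console.log(cosineSimilarity(embed("deploy postgres"), embed("deploy postgres")));
```

A hashed embedding like this is far weaker than a learned one, but it is deterministic and dependency-free, which is exactly what a no-API-key fallback needs.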
4. Deployment Executor (SSH + Docker)
After plan approval, the Deployer (deployer.ts) triggers the Executor (executor.ts):
- SSH tunnel to the runtime EC2 host using key-based authentication
- Clone the GitHub repo on the remote host
- Auto-generate Dockerfile if missing (for Node.js repos)
- Build the Docker image
- Provision each resource as a Docker container with CPU/memory limits:
  - `docker run` with `--cpus`, `--memory`, port mappings
  - Postgres containers use `postgres:16-alpine`
  - Redis containers use `redis:7-alpine`
  - Compute instances use `ubuntu:22.04` with long-running entry points
- Health check — polls the container endpoint until it responds
- Configure Nginx reverse proxy route (optional)
- Return endpoints — live app URL + database/cache connection strings
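The health-check step boils down to a bounded retry loop. A minimal sketch with the probe injected as a function, so it can be exercised without a live container (names, retry counts, and delays are illustrative, not the executor's exact values):

```typescript
// Sketch of a bounded health-check retry loop. The probe is injected so
// the logic is testable without a running container; the attempt count
// and delay are illustrative defaults, not the executor's real values.
async function waitForHealthy(
  probe: () => Promise<boolean>,
  attempts = 5,
  delayMs = 5000
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true; // container responded
    } catch {
      // treat network errors as "not ready yet"
    }
    if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs));
  }
  return false; // caller marks the deployment FAILED
}

// Example: a probe that only succeeds on the third call.
let calls = 0;
waitForHealthy(async () => ++calls >= 3, 5, 10).then((ok) =>
  console.log(ok, calls)
);
```

In production the probe would be an HTTP GET against the freshly mapped port; injecting it keeps the retry policy decoupled from the transport.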
5. Data Model (Prisma + PostgreSQL)
The control plane persists all state via Prisma ORM (schema.prisma):
| Model | Purpose |
|---|---|
| `Project` | A linked GitHub repo with name, slug, branch |
| `ChatSession` | Conversation thread tied to a project |
| `ChatMessage` | Individual messages with role and citations |
| `DeploymentPlan` | AI-generated plan with inputs, decision, cost, rationale |
| `Deployment` | Execution record with status, logs, and live URLs |
| `RagDocument` / `RagChunk` / `RagEmbedding` | RAG corpus with pgvector embeddings |
🛠️ Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React, Tailwind CSS, NextAuth.js |
| Backend | Fastify 5, TypeScript, Zod validation |
| Agent Framework | LangGraph (stateful graph orchestration) |
| LLM | Gemini 2.0 Flash (streaming + structured JSON output) |
| Database | PostgreSQL + pgvector (control plane + RAG) |
| ORM | Prisma with raw SQL for vector operations |
| Deployment Runtime | Docker containers on EC2 via SSH |
| Monorepo | npm workspaces |
🚀 Getting Started
Prerequisites
- Node.js 20+
- PostgreSQL 15+ with pgvector extension
- (Optional) Gemini API key for AI-powered planning
- (Optional) EC2 host with Docker for live deployments
Setup
```bash
# Install dependencies
npm install

# Generate Prisma client
npm run db:generate

# Run database migrations
npm run db:migrate -- --name init

# Seed initial data
npm run db:seed

# Start development servers
npm run dev
```
Environment Variables
Create apps/api/.env:
```bash
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/cloudgen"
PORT=4000

# LLM (optional — system works without it via fallbacks)
GEMINI_API_KEY=""
GEMINI_MODEL="gemini-2.0-flash"

# Runtime deployment host (leave empty for local preview mode)
RUNTIME_SSH_HOST=""
RUNTIME_SSH_USER="ubuntu"
RUNTIME_SSH_PORT="22"
RUNTIME_SSH_KEY_PATH="/path/to/key.pem"
RUNTIME_BASE_DIR="/tmp/cloudgen-apps"
RUNTIME_PUBLIC_BASE_URL="http://your-ec2-ip"

# Resource limits
ACTIVE_APP_CAP="8"
```
Services
| Service | URL |
|---|---|
| Web Dashboard | http://localhost:3000 |
| API Server | http://localhost:4000 |
| Health Check | http://localhost:4000/health |
📡 API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/projects` | Create a new project (link a GitHub repo) |
| GET | `/api/projects/:id` | Get project details |
| GET | `/api/projects/:id/resources` | Get provisioned resources for a project |
| POST | `/api/chat` | Send a chat message (triggers planning if needed) |
| GET | `/api/sessions/:id` | Get full chat session history |
| POST | `/api/plans/:id/approve` | Approve a deployment plan |
| GET | `/api/plans/:id/steps` | Get YAML deployment steps for a plan |
| POST | `/api/deploy` | Trigger deployment of an approved plan |
| GET | `/api/deployments/:id` | Get deployment status and logs |
🔄 End-to-End Flow
```
User: "Deploy https://github.com/user/app with Postgres for 500 users"
  │
  ├──► Intent Parser: repo URL, 500 users, postgres needed
  ├──► RAG Retrieval: fetch relevant architecture docs
  ├──► Intent Agent: high-level goals + resource list
  ├──► Repo Inspector: Node.js, has Dockerfile, port 3000
  ├──► Capacity Agent: ac.pro tier, 1 CPU, 1GB RAM, ~$34/mo
  │
  ▼
Plan Generated:
  • App container (Node.js, port 3000)
  • Postgres container (postgres:16-alpine)
  • Estimated cost: $34/mo
  • YAML steps file written
  │
  ├──► User reviews plan + rationale
  ├──► User approves
  │
  ▼
Deployment:
  • SSH into EC2 host
  • Clone repo, build Docker image
  • Start app + postgres containers
  • Health check passes
  • Nginx route configured
  │
  ▼
Result: "Your app is live at http://host:21042 🎉"
```
🧪 Smoke Test
Run the full end-to-end flow against a running instance:
```bash
npm run smoke
```
My Experience with GitHub Copilot CLI
Cloudgen is a complex product: orchestrated AI agents, a full API, real provisioning, and a dashboard. We used GitHub Copilot CLI from the terminal throughout the build. It didn’t just speed up typing; it helped us design and implement systems we’d have hesitated to tackle alone. Below is how we used it, the prompts that worked, and how you can use Copilot CLI more effectively on your own projects.
How we used Copilot CLI: interactive vs one-off
Copilot CLI has two modes we relied on every day.
Interactive mode (default). We’d run copilot in the project root and stay in a session. That’s where we did most of the work: multi-file changes, refactors, and “add a new node to the graph” style tasks. The back-and-forth let us refine in small steps (“now add error handling for when the repo URL is invalid”) without rewriting long prompts. We’d confirm we trusted the folder when asked, then work in that directory and its subdirectories.
Programmatic mode. For quick, single-shot tasks we used -p or --prompt. For example:
```bash
copilot -p "Add a Zod schema for POST /api/chat: sessionId optional UUID, projectId optional UUID, message required string min 1"
```
That gave us a schema and a route stub we could paste into the Fastify app. We used this for validation schemas, small utilities, and one-off scripts (e.g. “generate a bash script to run the API and web app with one command”). For anything that would modify or run files, Copilot asked for approval first, so we stayed in control.
Giving Copilot context: @file and scope
Copilot works better when it sees the exact code you care about. We used @file a lot.
Examples we actually used:
- “Explain @apps/api/src/agent.ts and list all graph nodes and what they write to state.” So we could onboard quickly and later ask for new nodes without breaking the graph.
- “In @apps/api/src/intent.ts add detection for 'no database' and set databaseRequired to false. Keep the same ParsedIntent shape.” One file, one behavior change, clear outcome.
- “@apps/api/src/planning.ts: the Capacity agent output needs a new field 'reasoning: string[]'. Add it to the interface and to the fallback object.” Copilot had the interfaces and the fallback logic in context, so the change was consistent.
- “Fix the bug in @apps/api/src/rag.ts: chunkContent is sometimes undefined when we build the citation. Add a filter.” We pointed at the file and described the symptom; Copilot proposed a safe fix.
We also kept sessions focused. When we switched from “agent graph” work to “frontend” work, we ran /clear and started a new mental context. That cut down on Copilot suggesting changes to the wrong layer. When we needed to touch both API and web app, we stayed in one session and mentioned both: “Update the API to return estimatedCostUsd in the plan payload and add a cost row in the plan review card in the project page.”
Plan before you code: /plan for big features
For larger chunks of work we used plan mode. Instead of “implement this whole thing,” we asked Copilot to design the steps first.
Example prompts:
- /plan Add a deployment plan approval flow: API endpoint POST /api/plans/:id/approve, update plan status in DB, return updated plan. Frontend: Approve button that calls the endpoint and then shows a “Deployment started” state.
- /plan Implement streaming chat: the /api/chat response should stream chunks. Frontend should consume the stream and append to the message content. Keep existing non-streaming fallback.
What we got: a structured plan (often saved to something like plan.md in the session), with checkboxes and ordered steps. We could review it, ask for edits (“add a step to handle approval rejection”), and only then say “implement this plan.” That reduced wrong turns and kept the codebase consistent. For quick bug fixes or single-file edits we didn’t use /plan, only for multi-file or multi-layer features.
Prompt examples that worked for Cloudgen
Here are real prompt patterns we used, so you can adapt them.
Orchestration and state (LangGraph)
- “In agent.ts we have a state graph. Add a new node `retrieve_context` that runs after `parse_intent`. It should call retrieveContext from rag.js with the message, put the result in state.chunks, and pass through to the next node. Use the same Annotation pattern as the other nodes.”
- “The respond node should include citations from state.chunks in the streamed reply. Each citation needs source title and chunk text. Update the streamChatReply call to pass citations.”
- “Our graph has a branch: if intent.wantsPlan we go to generate_plan, else if intent.asksQuestion we go to answer_question. Add a default branch that sets reply to a short ‘I didn’t understand’ message and goes to END.”
Planning pipeline and types
- “In planning.ts add a fallback for IntentAgentOutput when the LLM is unavailable: expectedUsers 100, databaseRequired true, requestedResources ['app'], confidence 0.5. Match the interface exactly.”
- “We’re adding estimatedCostUsd to the plan. Update: 1) CapacityAgentOutput and the fallback in planning.ts, 2) the place we save the plan to the DB in agent.ts, 3) the DeploymentPlan type in the Prisma schema if needed.”
- “Generate the TypeScript interface for DeployStep: id string, type enum (prepare_workspace, clone_repo, build_image, run_app, run_postgres, health_check, configure_route), target string, description string.”
API and validation
- “Add a Fastify route POST /api/plans/:id/approve. Body: { approved: true, approvedBy?: string }. Validate with Zod. On success call approvePlan(planId) and return { plan }.”
- “We need GET /api/deployments/:id that returns status, logsJson, appUrl, containerName. Use getDeployment from deployer.js. Return 404 if not found.”
Database and RAG
- “Our Prisma schema has Project, ChatSession, ChatMessage, DeploymentPlan, Deployment. Add a RagDocument model: id, sourceType enum (INTERNAL, RUNBOOK, REPO_README), sourceRef unique, title, content, createdAt, updatedAt. Add RagChunk with documentId and content.”
- “In rag.ts the retrieveContext function should take a query string, compute an embedding (use the existing deterministic embedding if no API key), query RagChunk by cosine similarity on the embedding column, and return the top 5 chunks with title and sourceRef.”
Frontend
- “In the project page, add a section that shows the latest deployment plan: rationale, resources list, estimated cost. If there’s no plan, show ‘No plan yet.’ Use the existing API types.”
- “Wire the Approve button to POST /api/plans/:id/approve and then POST /api/deploy with projectId and planId. Disable the button while loading and show a toast or inline message on error.”
- “The deployment logs should update in real time. Add a polling loop that calls GET /api/deployments/:id every 2 seconds while status is QUEUED or BUILDING, and append new log entries to the list.”
Provisioning and reliability
- “The executor runs a sequence of steps. If any step fails, we should push the error message to logs, set status to FAILED, and stop. Don’t run the next step. Add a try/catch around each step and update the deployment record.”
- “Add a health check step after the app is running: GET the app URL with a 30s timeout, retry up to 5 times with 5s delay. If it never responds, mark deployment as FAILED and add ‘Health check failed’ to logs.”
We were specific about inputs and outputs (e.g. “return the top 5 chunks with title and sourceRef”) and broke big features into small prompts (one node, one endpoint, one UI block). That gave us code that matched our architecture instead of generic snippets.
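The fail-fast executor behavior described in the reliability prompts above can be sketched as a small sequential runner. The types and names here are hypothetical, not the executor's real API; the step ids borrow from the DeployStep enum mentioned earlier:

```typescript
// Hypothetical fail-fast step runner in the spirit of the executor
// prompts above: run steps in order, log each result, stop on the
// first error instead of running the next step.
type Step = { id: string; run: () => Promise<void> };
type RunResult = { status: "SUCCEEDED" | "FAILED"; logs: string[] };

async function runSteps(steps: Step[]): Promise<RunResult> {
  const logs: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
      logs.push(`${step.id}: ok`);
    } catch (err) {
      logs.push(`${step.id}: ${(err as Error).message}`);
      return { status: "FAILED", logs }; // don't run the next step
    }
  }
  return { status: "SUCCEEDED", logs };
}

// Example: the failing build_image step halts the run before run_app.
runSteps([
  { id: "clone_repo", run: async () => {} },
  { id: "build_image", run: async () => { throw new Error("build failed"); } },
  { id: "run_app", run: async () => {} },
]).then((r) => console.log(r.status, r.logs));
```

In the real executor each step would shell out over SSH and the result would update the Deployment record, but the control flow is the same.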
Slash commands we used every day
We leaned on a few slash commands to work faster and keep context clean.
| Command | How we used it |
|---|---|
| `/clear` | Between unrelated tasks (e.g. after finishing the agent graph and before starting the dashboard). Clears conversation history so Copilot doesn’t drag in old files or decisions. |
| `/plan` | Before implementing a new feature or flow. Gets a step-by-step plan we can edit, then “implement this plan.” |
| `/cwd` | When we needed to scope Copilot to apps/api or apps/web only. We’d `/cwd apps/api` then ask for API-only changes. |
| `/model` | We switched to a more capable model for the orchestration and planning code (complex state and types), and kept a faster model for simple CRUD and UI. |
| `/help` | To discover other commands (e.g. `/context`, `/session`, `/delegate`) when we needed them. |
| `/review` | Before committing. “Review the changes in my current branch against main for potential bugs and security issues.” |
Pro tip: If you only remember three, use /clear, /cwd, and /plan. They give you control over context, scope, and how much “thinking” Copilot does before writing code.
Custom instructions so Copilot matched our stack
We added a .github/copilot-instructions.md (or repo-level instructions) so Copilot didn’t guess our conventions.
We wrote things like:
- Build commands: `npm run dev` (root, runs api + web), `npm run db:migrate` (from root, runs API migrations), `npm run typecheck` (root).
- Code style: TypeScript strict, ESM (import/export), prefer `async/await`, use Zod for request validation.
- Structure: Backend in `apps/api/src`, frontend in `apps/web/app` and `components`, shared types in `apps/web/lib/types.ts` and API `types.ts`.
- Workflow: After adding an API endpoint, export it from the right module and add the route in `index.ts`; after changing the Prisma schema, run `db:generate` and `db:migrate`.
That way, when we said “add an endpoint to list deployments for a project,” Copilot put it in the right place and used Zod and our existing patterns. Short, actionable instructions worked better than long essays.
How to use Copilot CLI more effectively (what we learned)
Break down complex tasks. “Implement the full chat flow” is too big. We got better results with: “Add the parse_intent node,” then “Add the retrieve_context node and wire it after parse_intent,” then “Add the branch that routes to generate_plan or answer_question.” Same for the frontend: one component or one API integration per prompt.
Be specific about inputs and outputs. “Add a function that fetches the plan” is vague. “Add a function getPlan(planId: string) that calls GET /api/plans/:id and returns the plan JSON or null if 404” is something Copilot can implement accurately.
Use @file for precision. When the change is in one file or you need to avoid touching others, put the path in the prompt. It reduces wrong-file edits and keeps context small.
Plan mode for multi-step work. For anything that touches API + DB + frontend, or several modules, we used /plan first. We reviewed the plan, adjusted it, then said “implement this plan.” Fewer rollbacks and cleaner diffs.
Validate Copilot’s output. We always ran npm run typecheck and the relevant tests after accepting changes. We especially reviewed anything that touched deployment or user data. Copilot is a powerful draft, not a substitute for review.
Clear context when switching tasks. /clear between “backend” and “frontend” or between features kept suggestions relevant. We also closed files we weren’t working on so Copilot didn’t pull them in unnecessarily.
What actually changed for us
- We could think in product terms. Describe what should happen next, get code that matched. Less jumping between docs and editor.
- Complex systems felt buildable. The orchestration and provisioning layers are the kind of thing that usually take weeks to get right. Copilot CLI gave us a strong first pass so we could refine behavior and edge cases instead of starting from zero.
- Consistency across the stack. By describing our conventions in natural language (and in copilot-instructions), we kept naming, error handling, and structure consistent across backend, agents, and frontend.
We still review and test everything, especially where real provisioning and user data are involved, but Copilot CLI let us build a product that sells the idea of “deploy by talking,” instead of getting stuck in the plumbing.
Summary
Cloudgen is a product that lets you deploy your app by describing what you need in chat. We handle the complexity — planning, sizing, provisioning, and returning live URLs and connection strings — so you don’t have to. We built it with GitHub Copilot CLI, and that’s what made it realistic to ship something this involved: multi-agent orchestration, a full API, real provisioning, and a polished dashboard, without drowning in boilerplate. We’re excited to keep improving Cloudgen and plan to launch it publicly soon, along with more tools to help you deploy better. Thanks to Copilot, we could ship fast!
Thanks to the GitHub Copilot team for running this challenge.

