This is a submission for the GitHub Copilot CLI Challenge.
## What I Built
Cloudgen turns cloud deployment into a conversation. You describe what you need in plain English: your app, how many users you expect, whether you need a database or cache. An AI figures out the rest. No config files, no dashboards, no wrestling with infrastructure. You get a clear plan, a cost estimate, and you're one approval away from your app being live.
Behind that simple experience is a lot of complexity: understanding your intent, analyzing your repo, sizing resources, generating a safe execution plan, then actually provisioning everything and handing you live URLs and connection strings. We built Cloudgen so you never have to touch that complexity yourself.
## Demo
Live project: https://cloudgenapp.vercel.app/
Video walkthrough: https://www.loom.com/share/0b5f73a623bb438ebd1ec41053427c21
GitHub Repository: https://github.com/mbcse/cloudgen
## The Problem
Deploying even a simple application to the cloud today requires:
- Writing Dockerfiles, YAML manifests, and IaC templates
- Navigating complex dashboards across AWS / GCP / Azure
- Manually provisioning databases, caches, compute, and networking
- Understanding container orchestration, port mappings, reverse proxies, and health checks
For a developer who just wants to ship their app, this is too much friction.
Small teams and indie developers often spend more time wrangling infrastructure than building product. The gap between "I have a repo" and "it's live on the internet" shouldn't require DevOps expertise.
## Our Solution

Cloudgen is a chat-first cloud control plane where users describe their infrastructure needs in plain English, and an agentic AI pipeline analyzes their repository, generates a deployment plan, and provisions real infrastructure, all with human-in-the-loop approval.

> "Deploy my Next.js app from GitHub with a Postgres database for 500 users"

→ Cloudgen analyzes the repo, generates a resource plan with cost estimates, and after approval, provisions containers, databases, and networking automatically.
### Why it matters

Getting from "I have a repo" to "it's live on the internet" usually means learning orchestration, networking, databases, and billing. Cloudgen flips that: you say what you want, review the plan the AI proposes, approve it, and we handle the rest. Deployment becomes something you do in a chat, not a checklist of manual steps.
## Product Features

| Feature | Description |
|---|---|
| Chat-to-Deploy | Natural-language interface: describe what you need, get a deployment plan |
| Smart Repo Analysis | Auto-detects runtime (Node.js/Python), framework, build commands, and Dockerfile presence |
| Plan Review & Approval | Every deployment requires explicit human approval: see resources, rationale, cost estimates, and YAML steps before anything runs |
| Multi-Resource Provisioning | Deploy App Services, Compute Instances, PostgreSQL, and Redis from a single conversation |
| Cost Estimation | AI-powered resource sizing with tiered pricing (ac.starter, ac.pro, ac.business) |
| Live Deployment Logs | Real-time streaming logs as containers build and start |
| Endpoint Discovery | Automatically returns live URLs and connection strings after deployment |
| RAG-Powered Context | Internal docs and repo READMEs are indexed for smarter, context-aware responses |
| Graceful Degradation | Works without an LLM API key, using deterministic heuristic fallbacks |
## Architecture Overview

```
+-------------------------------------------------------------------+
|                          Next.js Frontend                         |
|        Dashboard · Chat UI · Plan Review · Deployment Logs        |
+---------------------------------+---------------------------------+
                                  | REST API
+---------------------------------v---------------------------------+
|                         Fastify API Server                        |
|                                                                   |
|  +-------------------------------------------------------------+  |
|  |                 LangGraph Chat Orchestrator                 |  |
|  |                                                             |  |
|  |   Parse Intent --> RAG Retrieval --> Plan Generation        |  |
|  |        |                |                  |                |  |
|  |        v                v                  v                |  |
|  |   +---------+   +--------------+   +------------+           |  |
|  |   | Intent  |   |     Repo     |   |  Capacity  |           |  |
|  |   |  Agent  |   |  Inspector   |   |   Agent    |           |  |
|  |   |         |   |    Agent     |   |            |           |  |
|  |   +---------+   +--------------+   +------------+           |  |
|  |        Gemini 2.0 Flash / Deterministic Fallback            |  |
|  +-------------------------------------------------------------+  |
|                                                                   |
|  +----------------+   +--------------+   +---------------------+  |
|  |    Deployer    |   |  RAG Engine  |   |  Prisma + Postgres  |  |
|  | (SSH + Docker) |   |  (pgvector)  |   | (Control Plane DB)  |  |
|  +-------+--------+   +--------------+   +---------------------+  |
+----------+--------------------------------------------------------+
           | SSH
+----------v--------------------------------------------------------+
|                          EC2 Runtime Host                         |
|                                                                   |
|  +---------+   +----------+   +---------+   +-----------------+   |
|  |   App   |   | Postgres |   |  Redis  |   |     Compute     |   |
|  |Container|   |Container |   |Container|   |    Instance     |   |
|  +---------+   +----------+   +---------+   +-----------------+   |
|                      Nginx (reverse proxy)                        |
+-------------------------------------------------------------------+
```
## How It Works: Technical Deep Dive

### 1. Chat Orchestration (LangGraph)

The core of Cloudgen is a stateful LangGraph pipeline (`agent.ts`) that processes every user message through a directed graph of nodes:
```
START --> parse_intent --> retrieve_context --> route_intent
                                                     |
                              +----------------------+------------+
                              v                      v            v
                       generate_plan         answer_question     ...
                              |
                              v
                         save_plan --> respond --> END
```
- `parse_intent` – regex + heuristic parser extracts repo URLs, expected user counts, database requirements, and whether the user wants to plan, deploy, or ask a question
- `retrieve_context` – queries the RAG index (pgvector cosine similarity) for relevant internal docs and repo READMEs
- `generate_plan` – invokes the multi-agent planning pipeline when a provisioning intent is detected
- `respond` – streams a contextual reply back using Gemini, or falls back to a templated response
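To make the graph concrete, here is a minimal sketch of how a pipeline like this wires together with LangGraph's TypeScript `Annotation` API. The state fields and node bodies are simplified stand-ins (the real `agent.ts` also has `save_plan` and richer state), not Cloudgen's actual code:

```typescript
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";

// Simplified state; Cloudgen's real channels carry intents, plans, citations.
const ChatState = Annotation.Root({
  message: Annotation<string>,
  wantsPlan: Annotation<boolean>,
  chunks: Annotation<string[]>,
  reply: Annotation<string>,
});

type State = typeof ChatState.State;

// Stub nodes: each returns a partial state update, as LangGraph expects.
const parseIntent = async (s: State) => ({ wantsPlan: /deploy|plan/i.test(s.message) });
const retrieveContext = async (_s: State) => ({ chunks: [] as string[] }); // pgvector search in the real app
const generatePlan = async (_s: State) => ({ reply: "Here is your deployment plan..." });
const answerQuestion = async (_s: State) => ({ reply: "Here is your answer..." });
const respond = async (s: State) => ({ reply: s.reply }); // streamed via Gemini in the real app

export const chatGraph = new StateGraph(ChatState)
  .addNode("parse_intent", parseIntent)
  .addNode("retrieve_context", retrieveContext)
  .addNode("generate_plan", generatePlan)
  .addNode("answer_question", answerQuestion)
  .addNode("respond", respond)
  .addEdge(START, "parse_intent")
  .addEdge("parse_intent", "retrieve_context")
  // route_intent from the diagram, modeled as a conditional edge
  .addConditionalEdges("retrieve_context", (s: State) =>
    s.wantsPlan ? "generate_plan" : "answer_question")
  .addEdge("generate_plan", "respond")
  .addEdge("answer_question", "respond")
  .addEdge("respond", END)
  .compile();
```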
### 2. Multi-Agent Planning Pipeline

When a deployment intent is detected, three specialized agents collaborate (`planning.ts`):

| Agent | Role | Output |
|---|---|---|
| Intent Agent | Extracts high-level goals: expected users, latency sensitivity, which resources are needed (app/postgres/redis/instance) | `IntentAgentOutput` |
| Repo Inspector Agent | Fetches the GitHub repo structure, `package.json`, Dockerfile, and `requirements.txt` to infer runtime, framework, build commands, and app port | `RepoInspectorOutput` |
| Capacity Agent | Takes the outputs of the previous two agents and determines resource sizing (CPU, memory, replicas), tier selection, and cost estimation | `CapacityAgentOutput` |

Each agent calls Gemini 2.0 Flash via a structured JSON prompt (`llm.ts`). If no API key is available, each agent has a deterministic fallback, so the system remains fully functional without any LLM.
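The fallback pattern is simple enough to sketch. The helper and agent names here are illustrative, not the actual exports of `llm.ts`; the deterministic default mirrors the one quoted later in this post:

```typescript
// Wrap every agent call so a missing API key or a bad LLM response
// degrades to a deterministic default instead of failing the pipeline.
interface IntentAgentOutput {
  expectedUsers: number;
  databaseRequired: boolean;
  requestedResources: string[];
  confidence: number;
}

async function withFallback<T>(llmCall: () => Promise<T>, fallback: T): Promise<T> {
  if (!process.env.GEMINI_API_KEY) return fallback; // no key: heuristic mode
  try {
    return await llmCall(); // structured-JSON Gemini call in the real app
  } catch {
    return fallback; // API error or unparseable JSON: degrade gracefully
  }
}

// Assumed agent runner; in Cloudgen this would prompt Gemini 2.0 Flash.
declare const runIntentAgent: () => Promise<IntentAgentOutput>;

const intent = await withFallback(runIntentAgent, {
  expectedUsers: 100,
  databaseRequired: true,
  requestedResources: ["app"],
  confidence: 0.5,
});
```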
The pipeline produces:

- A `DeploymentPlan` stored in Postgres
- A YAML step file (`deployment-plans/<plan-id>.yaml`) describing the exact Docker commands to execute
### 3. RAG Engine (pgvector)

Cloudgen uses Retrieval-Augmented Generation to ground agent responses in real documentation (`rag.ts`):
- Internal docs (architecture, runbooks) are chunked and embedded into pgvector
- Repository READMEs from connected GitHub repos are fetched and indexed
- At query time, a deterministic 64-dim embedding is computed and a cosine similarity search retrieves the most relevant chunks
- Retrieved chunks are injected as context into the LLM prompt and returned as citations in the chat response
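A minimal sketch of that retrieval path, assuming hypothetical table and column names (the post specifies only that the embedding is deterministic and 64-dimensional, and that vector queries go through Prisma raw SQL):

```typescript
import { createHash } from "node:crypto";
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// Illustrative deterministic embedding: hash each token into one of 64
// buckets, then L2-normalize. Cloudgen's exact function may differ.
function embed(text: string): number[] {
  const vec = new Array<number>(64).fill(0);
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    vec[createHash("sha256").update(token).digest()[0] % 64] += 1;
  }
  const norm = Math.hypot(...vec) || 1;
  return vec.map((v) => v / norm);
}

// pgvector's <=> operator is cosine distance, so ascending order returns
// the most similar chunks first.
async function retrieveContext(query: string, k = 5) {
  const qvec = `[${embed(query).join(",")}]`;
  return prisma.$queryRaw`
    SELECT c."content", d."title", d."sourceRef"
    FROM "RagEmbedding" e
    JOIN "RagChunk" c ON c."id" = e."chunkId"
    JOIN "RagDocument" d ON d."id" = c."documentId"
    ORDER BY e."embedding" <=> ${qvec}::vector
    LIMIT ${k}
  `;
}
```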
### 4. Deployment Executor (SSH + Docker)

After plan approval, the Deployer (`deployer.ts`) triggers the Executor (`executor.ts`):
- SSH tunnel to the runtime EC2 host using key-based authentication
- Clone the GitHub repo on the remote host
- Auto-generate Dockerfile if missing (for Node.js repos)
- Build the Docker image
- Provision each resource as a Docker container with CPU/memory limits:
  - `docker run` with `--cpus`, `--memory`, port mappings
  - Postgres containers use `postgres:16-alpine`
  - Redis containers use `redis:7-alpine`
  - Compute instances use `ubuntu:22.04` with long-running entry points
- Health check: polls the container endpoint until it responds
- Configure Nginx reverse proxy route (optional)
- Return endpoints: live app URL + database/cache connection strings
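To illustrate the provisioning step, here is roughly how one `run_app` step could translate into the `docker run` command the executor ships over SSH. The step shape and helper are hypothetical; only the flags come from the list above:

```typescript
interface RunAppStep {
  containerName: string;
  image: string;
  cpus: number;      // e.g. 1
  memoryMb: number;  // e.g. 1024
  hostPort: number;  // e.g. 21042, allocated per deployment
  appPort: number;   // e.g. 3000, detected by the Repo Inspector
  env: Record<string, string>;
}

function dockerRunCommand(step: RunAppStep): string {
  const envFlags = Object.entries(step.env)
    .map(([key, value]) => `-e ${key}=${JSON.stringify(value)}`)
    .join(" ");
  return [
    "docker run -d --restart unless-stopped",
    `--name ${step.containerName}`,
    `--cpus ${step.cpus}`,                 // CPU limit
    `--memory ${step.memoryMb}m`,          // memory limit
    `-p ${step.hostPort}:${step.appPort}`, // host-to-container port mapping
    envFlags,
    step.image,
  ].filter(Boolean).join(" ");
}
```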
### 5. Data Model (Prisma + PostgreSQL)

The control plane persists all state via Prisma ORM (`schema.prisma`):
| Model | Purpose |
|---|---|
| `Project` | A linked GitHub repo with name, slug, branch |
| `ChatSession` | Conversation thread tied to a project |
| `ChatMessage` | Individual messages with role and citations |
| `DeploymentPlan` | AI-generated plan with inputs, decision, cost, rationale |
| `Deployment` | Execution record with status, logs, and live URLs |
| `RagDocument` / `RagChunk` / `RagEmbedding` | RAG corpus with pgvector embeddings |
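As a hedged sketch of how the control plane might write one of these records (field names follow the table above; the real columns in `schema.prisma` may differ), persisting a freshly generated plan could look like:

```typescript
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// Assumed shape: status gates execution until POST /api/plans/:id/approve
// flips it, and the JSON columns hold the agents' structured outputs.
const plan = await prisma.deploymentPlan.create({
  data: {
    projectId: "project-id",
    status: "PENDING_APPROVAL",
    inputs: { expectedUsers: 500, repoUrl: "https://github.com/user/app" },
    decision: { tier: "ac.pro", cpus: 1, memoryMb: 1024 },
    estimatedCostUsd: 34,
    rationale: "Sized for ~500 users with headroom for traffic spikes.",
  },
});
```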
## Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React, Tailwind CSS, NextAuth.js |
| Backend | Fastify 5, TypeScript, Zod validation |
| Agent Framework | LangGraph (stateful graph orchestration) |
| LLM | Gemini 2.0 Flash (streaming + structured JSON output) |
| Database | PostgreSQL + pgvector (control plane + RAG) |
| ORM | Prisma with raw SQL for vector operations |
| Deployment Runtime | Docker containers on EC2 via SSH |
| Monorepo | npm workspaces |
## Getting Started

### Prerequisites
- Node.js 20+
- PostgreSQL 15+ with pgvector extension
- (Optional) Gemini API key for AI-powered planning
- (Optional) EC2 host with Docker for live deployments
### Setup

```bash
# Install dependencies
npm install

# Generate Prisma client
npm run db:generate

# Run database migrations
npm run db:migrate -- --name init

# Seed initial data
npm run db:seed

# Start development servers
npm run dev
```
### Environment Variables

Create `apps/api/.env`:

```bash
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/cloudgen"
PORT=4000

# LLM (optional: the system works without it via fallbacks)
GEMINI_API_KEY=""
GEMINI_MODEL="gemini-2.0-flash"

# Runtime deployment host (leave empty for local preview mode)
RUNTIME_SSH_HOST=""
RUNTIME_SSH_USER="ubuntu"
RUNTIME_SSH_PORT="22"
RUNTIME_SSH_KEY_PATH="/path/to/key.pem"
RUNTIME_BASE_DIR="/tmp/cloudgen-apps"
RUNTIME_PUBLIC_BASE_URL="http://your-ec2-ip"

# Resource limits
ACTIVE_APP_CAP="8"
```
### Services
| Service | URL |
|---|---|
| Web Dashboard | http://localhost:3000 |
| API Server | http://localhost:4000 |
| Health Check | http://localhost:4000/health |
## API Reference
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/projects` | Create a new project (link a GitHub repo) |
| `GET` | `/api/projects/:id` | Get project details |
| `GET` | `/api/projects/:id/resources` | Get provisioned resources for a project |
| `POST` | `/api/chat` | Send a chat message (triggers planning if needed) |
| `GET` | `/api/sessions/:id` | Get full chat session history |
| `POST` | `/api/plans/:id/approve` | Approve a deployment plan |
| `GET` | `/api/plans/:id/steps` | Get YAML deployment steps for a plan |
| `POST` | `/api/deploy` | Trigger deployment of an approved plan |
| `GET` | `/api/deployments/:id` | Get deployment status and logs |
## End-to-End Flow

```
User: "Deploy https://github.com/user/app with Postgres for 500 users"
  |
  +--> Intent Parser:   repo URL, 500 users, postgres needed
  +--> RAG Retrieval:   fetch relevant architecture docs
  +--> Intent Agent:    high-level goals + resource list
  +--> Repo Inspector:  Node.js, has Dockerfile, port 3000
  +--> Capacity Agent:  ac.pro tier, 1 CPU, 1GB RAM, ~$34/mo
  |
  v
Plan Generated:
  - App container (Node.js, port 3000)
  - Postgres container (postgres:16-alpine)
  - Estimated cost: $34/mo
  - YAML steps file written
  |
  +--> User reviews plan + rationale
  +--> User approves
  |
  v
Deployment:
  - SSH into EC2 host
  - Clone repo, build Docker image
  - Start app + postgres containers
  - Health check passes
  - Nginx route configured
  |
  v
Result: "Your app is live at http://host:21042"
```
## Smoke Test

Run the full end-to-end flow against a running instance:

```bash
npm run smoke
```
## My Experience with GitHub Copilot CLI
Cloudgen is a complex product: orchestrated AI agents, a full API, real provisioning, and a dashboard. We used GitHub Copilot CLI from the terminal throughout the build. It didn't just speed up typing; it helped us design and implement systems we'd have hesitated to tackle alone. Below is how we used it, the prompts that worked, and how you can use Copilot CLI more effectively on your own projects.
### How we used Copilot CLI: interactive vs one-off

Copilot CLI has two modes we relied on every day.

**Interactive mode (default).** We'd run `copilot` in the project root and stay in a session. That's where we did most of the work: multi-file changes, refactors, and "add a new node to the graph" style tasks. The back-and-forth let us refine in small steps ("now add error handling for when the repo URL is invalid") without rewriting long prompts. We'd confirm we trusted the folder when asked, then work in that directory and its subdirectories.

**Programmatic mode.** For quick, single-shot tasks we used `-p` or `--prompt`. For example:

```bash
copilot -p "Add a Zod schema for POST /api/chat: sessionId optional UUID, projectId optional UUID, message required string min 1"
```

That gave us a schema and a route stub we could paste into the Fastify app. We used this for validation schemas, small utilities, and one-off scripts (e.g. "generate a bash script to run the API and web app with one command"). For anything that would modify or run files, Copilot asked for approval first, so we stayed in control.
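For reference, the result looked roughly like this; a reconstruction of that kind of output, not the verbatim code Copilot produced:

```typescript
import Fastify from "fastify";
import { z } from "zod";

const app = Fastify();

// Schema exactly as described in the prompt.
const chatBodySchema = z.object({
  sessionId: z.string().uuid().optional(),
  projectId: z.string().uuid().optional(),
  message: z.string().min(1),
});

app.post("/api/chat", async (request, reply) => {
  const parsed = chatBodySchema.safeParse(request.body);
  if (!parsed.success) {
    return reply.code(400).send({ error: parsed.error.flatten() });
  }
  // hand parsed.data to the chat orchestrator here
  return { ok: true };
});
```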
### Giving Copilot context: @file and scope

Copilot works better when it sees the exact code you care about. We used `@file` a lot. Examples we actually used:

- `Explain @apps/api/src/agent.ts and list all graph nodes and what they write to state` – so we could onboard quickly and later ask for new nodes without breaking the graph.
- `In @apps/api/src/intent.ts add detection for "no database" and set databaseRequired to false. Keep the same ParsedIntent shape.` – one file, one behavior change, clear outcome.
- `@apps/api/src/planning.ts the Capacity agent output needs a new field 'reasoning: string[]'. Add it to the interface and to the fallback object.` – Copilot had the interfaces and the fallback logic in context, so the change was consistent.
- `Fix the bug in @apps/api/src/rag.ts: chunkContent is sometimes undefined when we build the citation. Add a filter.` – we pointed at the file and described the symptom; Copilot proposed a safe fix.

We also kept sessions focused. When we switched from "agent graph" work to "frontend" work, we ran `/clear` and started a new mental context. That cut down on Copilot suggesting changes to the wrong layer. When we needed to touch both API and web app, we stayed in one session and mentioned both: "Update the API to return estimatedCostUsd in the plan payload and add a cost row in the plan review card in the project page."
### Plan before you code: /plan for big features

For larger chunks of work we used plan mode. Instead of "implement this whole thing," we asked Copilot to design the steps first. Example prompts:

- `/plan Add a deployment plan approval flow: API endpoint POST /api/plans/:id/approve, update plan status in DB, return updated plan. Frontend: Approve button that calls the endpoint and then shows a "Deployment started" state.`
- `/plan Implement streaming chat: the /api/chat response should stream chunks. Frontend should consume the stream and append to the message content. Keep existing non-streaming fallback.`

What we got: a structured plan (often saved to something like `plan.md` in the session), with checkboxes and ordered steps. We could review it, ask for edits ("add a step to handle approval rejection"), and only then say "implement this plan." That reduced wrong turns and kept the codebase consistent. We didn't use `/plan` for quick bug fixes or single-file edits, only for multi-file or multi-layer features.
### Prompt examples that worked for Cloudgen

Here are real prompt patterns we used, so you can adapt them.

**Orchestration and state (LangGraph)**

- "In `agent.ts` we have a state graph. Add a new node `retrieve_context` that runs after `parse_intent`. It should call `retrieveContext` from `rag.js` with the message, put the result in `state.chunks`, and pass through to the next node. Use the same Annotation pattern as the other nodes."
- "The respond node should include citations from `state.chunks` in the streamed reply. Each citation needs source title and chunk text. Update the `streamChatReply` call to pass citations."
- "Our graph has a branch: if `intent.wantsPlan` we go to `generate_plan`, else if `intent.asksQuestion` we go to `answer_question`. Add a default branch that sets reply to a short 'I didn't understand' message and goes to END."
**Planning pipeline and types**

- "In `planning.ts` add a fallback for `IntentAgentOutput` when the LLM is unavailable: expectedUsers 100, databaseRequired true, requestedResources ['app'], confidence 0.5. Match the interface exactly."
- "We're adding `estimatedCostUsd` to the plan. Update: 1) `CapacityAgentOutput` and the fallback in `planning.ts`, 2) the place we save the plan to the DB in `agent.ts`, 3) the `DeploymentPlan` type in the Prisma schema if needed."
- "Generate the TypeScript interface for DeployStep: id string, type enum (prepare_workspace, clone_repo, build_image, run_app, run_postgres, health_check, configure_route), target string, description string."
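That last prompt yields, more or less, the following (reconstructed, not the exact output):

```typescript
type DeployStepType =
  | "prepare_workspace"
  | "clone_repo"
  | "build_image"
  | "run_app"
  | "run_postgres"
  | "health_check"
  | "configure_route";

interface DeployStep {
  id: string;
  type: DeployStepType;
  target: string;      // which resource the step acts on
  description: string; // human-readable summary shown in plan review
}
```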
**API and validation**

- "Add a Fastify route POST /api/plans/:id/approve. Body: { approved: true, approvedBy?: string }. Validate with Zod. On success call approvePlan(planId) and return { plan }."
- "We need GET /api/deployments/:id that returns status, logsJson, appUrl, containerName. Use getDeployment from deployer.js. Return 404 if not found."
**Database and RAG**

- "Our Prisma schema has Project, ChatSession, ChatMessage, DeploymentPlan, Deployment. Add a RagDocument model: id, sourceType enum (INTERNAL, RUNBOOK, REPO_README), sourceRef unique, title, content, createdAt, updatedAt. Add RagChunk with documentId and content."
- "In rag.ts the retrieveContext function should take a query string, compute an embedding (use the existing deterministic embedding if no API key), query RagChunk by cosine similarity on the embedding column, and return the top 5 chunks with title and sourceRef."
**Frontend**

- "In the project page, add a section that shows the latest deployment plan: rationale, resources list, estimated cost. If there's no plan, show 'No plan yet.' Use the existing API types."
- "Wire the Approve button to POST /api/plans/:id/approve and then POST /api/deploy with projectId and planId. Disable the button while loading and show a toast or inline message on error."
- "The deployment logs should update in real time. Add a polling loop that calls GET /api/deployments/:id every 2 seconds while status is QUEUED or BUILDING, and append new log entries to the list."
**Provisioning and reliability**

- "The executor runs a sequence of steps. If any step fails, we should push the error message to logs, set status to FAILED, and stop. Don't run the next step. Add a try/catch around each step and update the deployment record."
- "Add a health check step after the app is running: GET the app URL with a 30s timeout, retry up to 5 times with 5s delay. If it never responds, mark deployment as FAILED and add 'Health check failed' to logs."

We were specific about inputs and outputs (e.g. "return the top 5 chunks with title and sourceRef") and broke big features into small prompts (one node, one endpoint, one UI block). That gave us code that matched our architecture instead of generic snippets.
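For example, the health-check prompt above maps to a loop like this; a sketch of the described behavior with an illustrative helper name, not the code as it landed in `executor.ts`:

```typescript
// Poll the app URL until it responds: up to 5 attempts, 5s apart,
// with a 30s timeout per request, as the prompt specifies.
async function healthCheck(url: string, attempts = 5, delayMs = 5_000): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(url, { signal: AbortSignal.timeout(30_000) });
      if (res.ok) return true;
    } catch {
      // connection refused or timed out: fall through and retry
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // caller marks the deployment FAILED and logs the reason
}
```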
### Slash commands we used every day
We leaned on a few slash commands to work faster and keep context clean.
| Command | How we used it |
|---|---|
| `/clear` | Between unrelated tasks (e.g. after finishing the agent graph and before starting the dashboard). Clears conversation history so Copilot doesn't drag in old files or decisions. |
| `/plan` | Before implementing a new feature or flow. Gets a step-by-step plan we can edit, then "implement this plan." |
| `/cwd` | When we needed to scope Copilot to `apps/api` or `apps/web` only. We'd `/cwd apps/api` then ask for API-only changes. |
| `/model` | We switched to a more capable model for the orchestration and planning code (complex state and types), and kept a faster model for simple CRUD and UI. |
| `/help` | To discover other commands (e.g. `/context`, `/session`, `/delegate`) when we needed them. |
| `/review` | Before committing: "Review the changes in my current branch against main for potential bugs and security issues." |
**Pro tip:** If you only remember three, use `/clear`, `/cwd`, and `/plan`. They give you control over context, scope, and how much "thinking" Copilot does before writing code.
### Custom instructions so Copilot matched our stack

We added a `.github/copilot-instructions.md` (or repo-level instructions) so Copilot didn't guess our conventions.
We wrote things like:

- Build commands: `npm run dev` (root, runs api + web), `npm run db:migrate` (from root, runs API migrations), `npm run typecheck` (root).
- Code style: TypeScript strict, ESM (import/export), prefer `async/await`, use Zod for request validation.
- Structure: backend in `apps/api/src`, frontend in `apps/web/app` and `components`, shared types in `apps/web/lib/types.ts` and the API's `types.ts`.
- Workflow: after adding an API endpoint, export it from the right module and add the route in `index.ts`; after changing the Prisma schema, run `db:generate` and `db:migrate`.
That way, when we said "add an endpoint to list deployments for a project," Copilot put it in the right place and used Zod and our existing patterns. Short, actionable instructions worked better than long essays.
### How to use Copilot CLI more effectively (what we learned)

**Break down complex tasks.** "Implement the full chat flow" is too big. We got better results with: "Add the parse_intent node," then "Add the retrieve_context node and wire it after parse_intent," then "Add the branch that routes to generate_plan or answer_question." Same for the frontend: one component or one API integration per prompt.

**Be specific about inputs and outputs.** "Add a function that fetches the plan" is vague. "Add a function getPlan(planId: string) that calls GET /api/plans/:id and returns the plan JSON or null if 404" is something Copilot can implement accurately.

**Use `@file` for precision.** When the change is in one file or you need to avoid touching others, put the path in the prompt. It reduces wrong-file edits and keeps context small.

**Plan mode for multi-step work.** For anything that touches API + DB + frontend, or several modules, we used `/plan` first. We reviewed the plan, adjusted it, then said "implement this plan." Fewer rollbacks and cleaner diffs.

**Validate Copilot's output.** We always ran `npm run typecheck` and the relevant tests after accepting changes. We especially reviewed anything that touched deployment or user data. Copilot is a powerful draft, not a substitute for review.

**Clear context when switching tasks.** `/clear` between "backend" and "frontend" or between features kept suggestions relevant. We also closed files we weren't working on so Copilot didn't pull them in unnecessarily.
### What actually changed for us

- We could think in product terms: describe what should happen next, get code that matched. Less jumping between docs and editor.
- Complex systems felt buildable. The orchestration and provisioning layers are the kind of thing that usually takes weeks to get right. Copilot CLI gave us a strong first pass so we could refine behavior and edge cases instead of starting from zero.
- Consistency across the stack. By describing our conventions in natural language (and in `copilot-instructions.md`), we kept naming, error handling, and structure consistent across backend, agents, and frontend.

We still review and test everything, especially where real provisioning and user data are involved, but Copilot CLI let us build a product that sells the idea of "deploy by talking" instead of getting stuck in the plumbing.
## Summary

Cloudgen lets you deploy your app by describing what you need in chat. We handle the complexity of planning, sizing, provisioning, and surfacing live URLs and connection strings, so you don't have to. We built it with GitHub Copilot CLI, which made it realistic to ship something this involved (multi-agent orchestration, a full API, real provisioning, and a polished dashboard) without drowning in boilerplate. We're excited to keep improving Cloudgen: a public launch and more deployment tooling are coming soon. Thanks to Copilot, we shipped fast.
Thanks to the GitHub Copilot team for running this challenge.