DEV Community

MOHIT BHAT

Cloudgen Cloud Platform: Deploy Your App by Talking to It, Built with GitHub Copilot CLI

GitHub Copilot CLI Challenge Submission

This is a submission for the GitHub Copilot CLI Challenge.


What I Built

Cloudgen is a product that turns cloud deployment into a conversation. You describe what you need in plain English: your app, how many users you expect, whether you need a database or cache. An AI figures out the rest. No config files, no dashboards, no wrestling with infrastructure. You get a clear plan, a cost estimate, and you’re one approval away from your app being live.

Behind that simple experience is a lot of complexity: understanding your intent, analyzing your repo, sizing resources, generating a safe execution plan, and then actually provisioning everything and returning live URLs and connection strings. We built Cloudgen so you never have to touch that complexity yourself.

Demo

Live project: https://cloudgenapp.vercel.app/

Video walkthrough: https://www.loom.com/share/0b5f73a623bb438ebd1ec41053427c21

GitHub Repository: https://github.com/mbcse/cloudgen

📌 The Problem

Deploying even a simple application to the cloud today requires:

  • Writing Dockerfiles, YAML manifests, and IaC templates
  • Navigating complex dashboards across AWS / GCP / Azure
  • Manually provisioning databases, caches, compute, and networking
  • Understanding container orchestration, port mappings, reverse proxies, and health checks

For a developer who just wants to ship their app, this is too much friction.

Small teams and indie developers often spend more time wrangling infrastructure than building product. The gap between "I have a repo" and "it's live on the internet" shouldn't require DevOps expertise.


💡 Our Solution

Cloudgen is a chat-first cloud control plane where users describe their infrastructure needs in plain English, and an agentic AI pipeline analyzes their repository, generates a deployment plan, and provisions real infrastructure, all with human-in-the-loop approval.

"Deploy my Next.js app from GitHub with a Postgres database for 500 users"

→ Cloudgen analyzes the repo, generates a resource plan with cost estimates, and after approval, provisions containers, databases, and networking automatically.


Why it matters

Getting from “I have a repo” to “it’s live on the internet” usually means learning orchestration, networking, databases, and billing. Small teams and indie devs end up spending more time on infra than on product. Cloudgen flips that. You say what you want, review a plan the AI proposes, approve it, and we handle the rest. Deployment becomes something you do in a chat, not a checklist of manual steps.

✨ Product Features

| Feature | Description |
| --- | --- |
| 🗣️ Chat-to-Deploy | Natural language interface: describe what you need, get a deployment plan |
| 🔍 Smart Repo Analysis | Auto-detects runtime (Node.js/Python), framework, build commands, and Dockerfile presence |
| 📋 Plan Review & Approval | Every deployment requires explicit human approval: see resources, rationale, cost estimates, and YAML steps before anything runs |
| 🐳 Multi-Resource Provisioning | Deploy App Services, Compute Instances, PostgreSQL, and Redis from a single conversation |
| 💰 Cost Estimation | AI-powered resource sizing with tiered pricing (ac.starter, ac.pro, ac.business) |
| 📡 Live Deployment Logs | Real-time streaming logs as containers build and start |
| 🔗 Endpoint Discovery | Automatically returns live URLs and connection strings after deployment |
| 🧠 RAG-Powered Context | Internal docs and repo READMEs are indexed for smarter, context-aware responses |
| ⚡ Graceful Degradation | Works without an LLM API key using deterministic heuristic fallbacks |

πŸ—οΈ Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        Next.js Frontend                         │
│   Dashboard  ·  Chat UI  ·  Plan Review  ·  Deployment Logs     │
└────────────────────────────────┬────────────────────────────────┘
                                 │ REST API
┌────────────────────────────────▼────────────────────────────────┐
│                     Fastify API Server                          │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │               LangGraph Chat Orchestrator                │   │
│  │                                                          │   │
│  │   Parse Intent ──► RAG Retrieval ──► Plan Generation     │   │
│  │        │                                    │            │   │
│  │        ▼                                    ▼            │   │
│  │   ┌─────────┐   ┌──────────────┐   ┌────────────┐        │   │
│  │   │ Intent  │   │    Repo      │   │  Capacity  │        │   │
│  │   │ Agent   │   │  Inspector   │   │   Agent    │        │   │
│  │   │         │   │    Agent     │   │            │        │   │
│  │   └─────────┘   └──────────────┘   └────────────┘        │   │
│  │        Gemini 2.0 Flash  /  Deterministic Fallback       │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │   Deployer   │  │  RAG Engine  │  │   Prisma + Postgres  │   │
│  │(SSH + Docker)│  │  (pgvector)  │  │   (Control Plane DB) │   │
│  └──────┬───────┘  └──────────────┘  └──────────────────────┘   │
└─────────┼───────────────────────────────────────────────────────┘
          │ SSH
┌─────────▼───────────────────────────────────────────────────────┐
│                    EC2 Runtime Host                             │
│                                                                 │
│   ┌─────────┐  ┌──────────┐  ┌─────────┐  ┌────────────────┐    │
│   │ App     │  │ Postgres │  │  Redis  │  │   Compute      │    │
│   │Container│  │Container │  │Container│  │   Instance     │    │
│   └─────────┘  └──────────┘  └─────────┘  └────────────────┘    │
│                        Nginx (reverse proxy)                    │
└─────────────────────────────────────────────────────────────────┘
```

🧠 How It Works: Technical Deep Dive

1. Chat Orchestration (LangGraph)

The core of Cloudgen is a stateful LangGraph pipeline (agent.ts) that processes every user message through a directed graph of nodes:

```
START ──► parse_intent ──► retrieve_context ──► route_intent
                                                    │
                                    ┌───────────────┼────────────┐
                                    ▼               ▼            ▼
                              generate_plan    answer_question   ...
                                    │
                                    ▼
                                save_plan ──► respond ──► END
```
  • parse_intent β€” Regex + heuristic parser extracts repo URLs, expected user counts, database requirements, and whether the user wants to plan, deploy, or ask a question
  • retrieve_context β€” Queries the RAG index (pgvector cosine similarity) for relevant internal docs and repo READMEs
  • generate_plan β€” Invokes the multi-agent planning pipeline when a provisioning intent is detected
  • respond β€” Streams a contextual reply back using Gemini, or falls back to a templated response

2. Multi-Agent Planning Pipeline

When a deployment intent is detected, three specialized agents collaborate (planning.ts):

| Agent | Role | Output |
| --- | --- | --- |
| Intent Agent | Extracts high-level goals: expected users, latency sensitivity, which resources are needed (app/postgres/redis/instance) | IntentAgentOutput |
| Repo Inspector Agent | Fetches the GitHub repo structure, package.json, Dockerfile, and requirements.txt to infer runtime, framework, build commands, and app port | RepoInspectorOutput |
| Capacity Agent | Takes the outputs of the previous two agents and determines resource sizing (CPU, memory, replicas), tier selection, and cost estimation | CapacityAgentOutput |

Each agent calls Gemini 2.0 Flash via a structured JSON prompt (llm.ts). If no API key is available, each agent has a deterministic fallback so the system remains fully functional without any LLM.
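The LLM-with-deterministic-fallback pattern can be sketched like this. The fallback values mirror the defaults mentioned later in this post; the function and type names are hypothetical, not the actual planning.ts / llm.ts code.

```typescript
// Hypothetical sketch of the graceful-degradation pattern the agents use.
interface IntentAgentOutput {
  expectedUsers: number;
  databaseRequired: boolean;
  requestedResources: string[];
  confidence: number;
}

type LlmCall = (prompt: string) => Promise<IntentAgentOutput>;

// Deterministic defaults used when no API key is configured or the call fails
const FALLBACK: IntentAgentOutput = {
  expectedUsers: 100,
  databaseRequired: true,
  requestedResources: ["app"],
  confidence: 0.5,
};

async function runIntentAgent(
  prompt: string,
  llm?: LlmCall,
): Promise<IntentAgentOutput> {
  if (!llm) return FALLBACK; // no API key: take the heuristic path
  try {
    return await llm(prompt);
  } catch {
    return FALLBACK; // LLM error: degrade gracefully instead of failing
  }
}
```

Because every agent has a deterministic branch, the whole planning pipeline stays usable with no LLM at all, just with lower-confidence sizing.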

The pipeline produces:

  • A DeploymentPlan stored in Postgres
  • A YAML step file (deployment-plans/<plan-id>.yaml) describing the exact Docker commands to execute

3. RAG Engine (pgvector)

Cloudgen uses Retrieval-Augmented Generation to ground agent responses in real documentation (rag.ts):

  • Internal docs (architecture, runbooks) are chunked and embedded into pgvector
  • Repository READMEs from connected GitHub repos are fetched and indexed
  • At query time, a deterministic 64-dim embedding is computed and a cosine similarity search retrieves the most relevant chunks
  • Retrieved chunks are injected as context into the LLM prompt and returned as citations in the chat response

4. Deployment Executor (SSH + Docker)

After plan approval, the Deployer (deployer.ts) triggers the Executor (executor.ts):

  1. SSH tunnel to the runtime EC2 host using key-based authentication
  2. Clone the GitHub repo on the remote host
  3. Auto-generate Dockerfile if missing (for Node.js repos)
  4. Build the Docker image
  5. Provision each resource as a Docker container with CPU/memory limits:
    • docker run with --cpus, --memory, port mappings
    • Postgres containers use postgres:16-alpine
    • Redis containers use redis:7-alpine
    • Compute instances use ubuntu:22.04 with long-running entry points
  6. Health check β€” polls the container endpoint until it responds
  7. Configure Nginx reverse proxy route (optional)
  8. Return endpoints β€” live app URL + database/cache connection strings
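Step 5 boils down to assembling one docker run invocation per resource. A hypothetical TypeScript helper (not the actual executor.ts; the field names are assumptions) that illustrates the flag assembly:

```typescript
// Hypothetical helper that builds the `docker run` command for one resource.
interface ResourceSpec {
  name: string;          // container name, e.g. "pg-123"
  image: string;         // e.g. "postgres:16-alpine"
  cpus: number;          // --cpus limit
  memoryMb: number;      // --memory limit in MB
  hostPort?: number;     // optional host-side port
  containerPort?: number;
}

function dockerRunCommand(spec: ResourceSpec): string {
  const args = [
    "docker run -d",
    `--name ${spec.name}`,
    `--cpus ${spec.cpus}`,
    `--memory ${spec.memoryMb}m`,
  ];
  if (spec.hostPort && spec.containerPort) {
    args.push(`-p ${spec.hostPort}:${spec.containerPort}`);
  }
  args.push(spec.image);
  return args.join(" ");
}
```

The executor would run a command like this over SSH on the EC2 host, then record the mapped port so it can hand back a live URL.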

5. Data Model (Prisma + PostgreSQL)

The control plane persists all state via Prisma ORM (schema.prisma):

| Model | Purpose |
| --- | --- |
| Project | A linked GitHub repo with name, slug, branch |
| ChatSession | Conversation thread tied to a project |
| ChatMessage | Individual messages with role and citations |
| DeploymentPlan | AI-generated plan with inputs, decision, cost, rationale |
| Deployment | Execution record with status, logs, and live URLs |
| RagDocument / RagChunk / RagEmbedding | RAG corpus with pgvector embeddings |
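As an illustration, one of these models might look like the following Prisma excerpt. The field names are inferred from the table above and are assumptions, not the actual schema.prisma.

```prisma
// Hypothetical excerpt inferred from the table above; not the actual schema.prisma
model DeploymentPlan {
  id        String   @id @default(cuid())
  projectId String
  project   Project  @relation(fields: [projectId], references: [id])
  status    String   @default("PROPOSED")
  inputs    Json     // parsed intent + repo analysis that produced the plan
  decision  Json     // sized resources, tier, replica counts
  costUsd   Float?   // estimated monthly cost
  rationale String?
  createdAt DateTime @default(now())
}
```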

πŸ› οΈ Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 15, React, Tailwind CSS, NextAuth.js |
| Backend | Fastify 5, TypeScript, Zod validation |
| Agent Framework | LangGraph (stateful graph orchestration) |
| LLM | Gemini 2.0 Flash (streaming + structured JSON output) |
| Database | PostgreSQL + pgvector (control plane + RAG) |
| ORM | Prisma with raw SQL for vector operations |
| Deployment Runtime | Docker containers on EC2 via SSH |
| Monorepo | npm workspaces |

🚀 Getting Started

Prerequisites

  • Node.js 20+
  • PostgreSQL 15+ with pgvector extension
  • (Optional) Gemini API key for AI-powered planning
  • (Optional) EC2 host with Docker for live deployments

Setup

```bash
# Install dependencies
npm install

# Generate Prisma client
npm run db:generate

# Run database migrations
npm run db:migrate -- --name init

# Seed initial data
npm run db:seed

# Start development servers
npm run dev
```

Environment Variables

Create apps/api/.env:

```bash
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/cloudgen"
PORT=4000

# LLM (optional: the system works without it via fallbacks)
GEMINI_API_KEY=""
GEMINI_MODEL="gemini-2.0-flash"

# Runtime deployment host (leave empty for local preview mode)
RUNTIME_SSH_HOST=""
RUNTIME_SSH_USER="ubuntu"
RUNTIME_SSH_PORT="22"
RUNTIME_SSH_KEY_PATH="/path/to/key.pem"
RUNTIME_BASE_DIR="/tmp/cloudgen-apps"
RUNTIME_PUBLIC_BASE_URL="http://your-ec2-ip"

# Resource limits
ACTIVE_APP_CAP="8"
```

Services

| Service | URL |
| --- | --- |
| Web Dashboard | http://localhost:3000 |
| API Server | http://localhost:4000 |
| Health Check | http://localhost:4000/health |

📡 API Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/projects | Create a new project (link a GitHub repo) |
| GET | /api/projects/:id | Get project details |
| GET | /api/projects/:id/resources | Get provisioned resources for a project |
| POST | /api/chat | Send a chat message (triggers planning if needed) |
| GET | /api/sessions/:id | Get full chat session history |
| POST | /api/plans/:id/approve | Approve a deployment plan |
| GET | /api/plans/:id/steps | Get YAML deployment steps for a plan |
| POST | /api/deploy | Trigger deployment of an approved plan |
| GET | /api/deployments/:id | Get deployment status and logs |

🔄 End-to-End Flow

```
User: "Deploy https://github.com/user/app with Postgres for 500 users"
  │
  ├──► Intent Parser: repo URL, 500 users, postgres needed
  ├──► RAG Retrieval: fetch relevant architecture docs
  ├──► Intent Agent: high-level goals + resource list
  ├──► Repo Inspector: Node.js, has Dockerfile, port 3000
  ├──► Capacity Agent: ac.pro tier, 1 CPU, 1GB RAM, ~$34/mo
  │
  ▼
Plan Generated:
  • App container (Node.js, port 3000)
  • Postgres container (postgres:16-alpine)
  • Estimated cost: $34/mo
  • YAML steps file written
  │
  ├──► User reviews plan + rationale
  ├──► User approves
  │
  ▼
Deployment:
  • SSH into EC2 host
  • Clone repo, build Docker image
  • Start app + postgres containers
  • Health check passes
  • Nginx route configured
  │
  ▼
Result: "Your app is live at http://host:21042 🎉"
```

🧪 Smoke Test

Run the full end-to-end flow against a running instance:

```bash
npm run smoke
```

My Experience with GitHub Copilot CLI

Cloudgen is a complex product: orchestrated AI agents, a full API, real provisioning, and a dashboard. We used GitHub Copilot CLI from the terminal throughout the build. It didn’t just speed up typing; it helped us design and implement systems we’d have hesitated to tackle alone. Below is how we used it, the prompts that worked, and how you can use Copilot CLI more effectively on your own projects.


How we used Copilot CLI: interactive vs one-off

Copilot CLI has two modes we relied on every day.

Interactive mode (default). We’d run copilot in the project root and stay in a session. That’s where we did most of the work: multi-file changes, refactors, and “add a new node to the graph” style tasks. The back-and-forth let us refine in small steps (“now add error handling for when the repo URL is invalid”) without rewriting long prompts. We’d confirm we trusted the folder when asked, then work in that directory and its subdirectories.

Programmatic mode. For quick, single-shot tasks we used -p or --prompt. For example:

```bash
copilot -p "Add a Zod schema for POST /api/chat: sessionId optional UUID, projectId optional UUID, message required string min 1"
```

That gave us a schema and a route stub we could paste into the Fastify app. We used this for validation schemas, small utilities, and one-off scripts (e.g. “generate a bash script to run the API and web app with one command”). For anything that would modify or run files, Copilot asked for approval first, so we stayed in control.


Giving Copilot context: @file and scope

Copilot works better when it sees the exact code you care about. We used @file a lot.

Examples we actually used:

  • Explain @apps/api/src/agent.ts and list all graph nodes and what they write to state

    So we could onboard quickly and later ask for new nodes without breaking the graph.

  • In @apps/api/src/intent.ts add detection for "no database" and set databaseRequired to false. Keep the same ParsedIntent shape.

    One file, one behavior change, clear outcome.

  • @apps/api/src/planning.ts the Capacity agent output needs a new field 'reasoning: string[]'. Add it to the interface and to the fallback object.

    Copilot had the interfaces and the fallback logic in context, so the change was consistent.

  • Fix the bug in @apps/api/src/rag.ts: chunkContent is sometimes undefined when we build the citation. Add a filter.

    We pointed at the file and described the symptom; Copilot proposed a safe fix.

We also kept sessions focused. When we switched from “agent graph” work to “frontend” work, we ran /clear and started a new mental context. That cut down on Copilot suggesting changes to the wrong layer. When we needed to touch both API and web app, we stayed in one session and mentioned both: “Update the API to return estimatedCostUsd in the plan payload and add a cost row in the plan review card in the project page.”


Plan before you code: /plan for big features

For larger chunks of work we used plan mode. Instead of “implement this whole thing,” we asked Copilot to design the steps first.

Example prompts:

  • /plan Add a deployment plan approval flow: API endpoint POST /api/plans/:id/approve, update plan status in DB, return updated plan. Frontend: Approve button that calls the endpoint and then shows a β€œDeployment started” state.

  • /plan Implement streaming chat: the /api/chat response should stream chunks. Frontend should consume the stream and append to the message content. Keep existing non-streaming fallback.

What we got: a structured plan (often saved to something like plan.md in the session), with checkboxes and ordered steps. We could review it, ask for edits (“add a step to handle approval rejection”), and only then say “implement this plan.” That reduced wrong turns and kept the codebase consistent. For quick bug fixes or single-file edits we didn’t use /plan; we reserved it for multi-file or multi-layer features.


Prompt examples that worked for Cloudgen

Here are real prompt patterns we used, so you can adapt them.

Orchestration and state (LangGraph)

  • β€œIn agent.ts we have a state graph. Add a new node retrieve_context that runs after parse_intent. It should call retrieveContext from rag.js with the message, put the result in state.chunks, and pass through to the next node. Use the same Annotation pattern as the other nodes.”
  • β€œThe respond node should include citations from state.chunks in the streamed reply. Each citation needs source title and chunk text. Update the streamChatReply call to pass citations.”
  • β€œOur graph has a branch: if intent.wantsPlan we go to generate_plan, else if intent.asksQuestion we go to answer_question. Add a default branch that sets reply to a short β€˜I didn’t understand’ message and goes to END.”

Planning pipeline and types

  • β€œIn planning.ts add a fallback for IntentAgentOutput when the LLM is unavailable: expectedUsers 100, databaseRequired true, requestedResources ['app'], confidence 0.5. Match the interface exactly.”
  • β€œWe’re adding estimatedCostUsd to the plan. Update: 1) CapacityAgentOutput and the fallback in planning.ts, 2) the place we save the plan to the DB in agent.ts, 3) the DeploymentPlan type in the Prisma schema if needed.”
  • β€œGenerate the TypeScript interface for DeployStep: id string, type enum (prepare_workspace, clone_repo, build_image, run_app, run_postgres, health_check, configure_route), target string, description string.”

API and validation

  • β€œAdd a Fastify route POST /api/plans/:id/approve. Body: { approved: true, approvedBy?: string }. Validate with Zod. On success call approvePlan(planId) and return { plan }.”
  • β€œWe need GET /api/deployments/:id that returns status, logsJson, appUrl, containerName. Use getDeployment from deployer.js. Return 404 if not found.”

Database and RAG

  • β€œOur Prisma schema has Project, ChatSession, ChatMessage, DeploymentPlan, Deployment. Add a RagDocument model: id, sourceType enum (INTERNAL, RUNBOOK, REPO_README), sourceRef unique, title, content, createdAt, updatedAt. Add RagChunk with documentId and content.”
  • β€œIn rag.ts the retrieveContext function should take a query string, compute an embedding (use the existing deterministic embedding if no API key), query RagChunk by cosine similarity on the embedding column, and return the top 5 chunks with title and sourceRef.”

Frontend

  • β€œIn the project page, add a section that shows the latest deployment plan: rationale, resources list, estimated cost. If there’s no plan, show β€˜No plan yet.’ Use the existing API types.”
  • β€œWire the Approve button to POST /api/plans/:id/approve and then POST /api/deploy with projectId and planId. Disable the button while loading and show a toast or inline message on error.”
  • β€œThe deployment logs should update in real time. Add a polling loop that calls GET /api/deployments/:id every 2 seconds while status is QUEUED or BUILDING, and append new log entries to the list.”

Provisioning and reliability

  • β€œThe executor runs a sequence of steps. If any step fails, we should push the error message to logs, set status to FAILED, and stop. Don’t run the next step. Add a try/catch around each step and update the deployment record.”
  • β€œAdd a health check step after the app is running: GET the app URL with a 30s timeout, retry up to 5 times with 5s delay. If it never responds, mark deployment as FAILED and add β€˜Health check failed’ to logs.”

We were specific about inputs and outputs (e.g. “return the top 5 chunks with title and sourceRef”) and broke big features into small prompts (one node, one endpoint, one UI block). That gave us code that matched our architecture instead of generic snippets.


Slash commands we used every day

We leaned on a few slash commands to work faster and keep context clean.

| Command | How we used it |
| --- | --- |
| /clear | Between unrelated tasks (e.g. after finishing the agent graph and before starting the dashboard). Clears conversation history so Copilot doesn’t drag in old files or decisions. |
| /plan | Before implementing a new feature or flow. Gets a step-by-step plan we can edit, then “implement this plan.” |
| /cwd | When we needed to scope Copilot to apps/api or apps/web only. We’d /cwd apps/api then ask for API-only changes. |
| /model | We switched to a more capable model for the orchestration and planning code (complex state and types), and kept a faster model for simple CRUD and UI. |
| /help | To discover other commands (e.g. /context, /session, /delegate) when we needed them. |
| /review | Before committing: “Review the changes in my current branch against main for potential bugs and security issues.” |

Pro tip: If you only remember three, use /clear, /cwd, and /plan. They give you control over context, scope, and how much “thinking” Copilot does before writing code.


Custom instructions so Copilot matched our stack

We added a .github/copilot-instructions.md (or repo-level instructions) so Copilot didn’t guess our conventions.

We wrote things like:

  • Build commands: npm run dev (root, runs api + web), npm run db:migrate (from root, runs API migrations), npm run typecheck (root).
  • Code style: TypeScript strict, ESM (import/export), prefer async/await, use Zod for request validation.
  • Structure: Backend in apps/api/src, frontend in apps/web/app and components, shared types in apps/web/lib/types.ts and API types.ts.
  • Workflow: After adding an API endpoint, export it from the right module and add the route in index.ts; after changing Prisma schema, run db:generate and db:migrate.

That way, when we said “add an endpoint to list deployments for a project,” Copilot put it in the right place and used Zod and our existing patterns. Short, actionable instructions worked better than long essays.


How to use Copilot CLI more effectively (what we learned)

Break down complex tasks. “Implement the full chat flow” is too big. We got better results with: “Add the parse_intent node,” then “Add the retrieve_context node and wire it after parse_intent,” then “Add the branch that routes to generate_plan or answer_question.” Same for the frontend: one component or one API integration per prompt.

Be specific about inputs and outputs. “Add a function that fetches the plan” is vague. “Add a function getPlan(planId: string) that calls GET /api/plans/:id and returns the plan JSON or null if 404” is something Copilot can implement accurately.

Use @file for precision. When the change is in one file or you need to avoid touching others, put the path in the prompt. It reduces wrong-file edits and keeps context small.

Plan mode for multi-step work. For anything that touches API + DB + frontend, or several modules, we used /plan first. We reviewed the plan, adjusted it, then said “implement this plan.” Fewer rollbacks and cleaner diffs.

Validate Copilot’s output. We always ran npm run typecheck and the relevant tests after accepting changes. We especially reviewed anything that touched deployment or user data. Copilot is a powerful draft, not a substitute for review.

Clear context when switching tasks. /clear between “backend” and “frontend” or between features kept suggestions relevant. We also closed files we weren’t working on so Copilot didn’t pull them in unnecessarily.


What actually changed for us

  • We could think in product terms. Describe what should happen next, get code that matched. Less jumping between docs and editor.
  • Complex systems felt buildable. The orchestration and provisioning layers are the kind of thing that usually take weeks to get right. Copilot CLI gave us a strong first pass so we could refine behavior and edge cases instead of starting from zero.
  • Consistency across the stack. By describing our conventions in natural language (and in copilot-instructions), we kept naming, error handling, and structure consistent across backend, agents, and frontend.

We still review and test everything, especially where real provisioning and user data are involved, but Copilot CLI let us build a product that sells the idea of “deploy by talking,” instead of getting stuck in the plumbing.


Summary

Cloudgen lets you deploy your app by describing what you need in chat. We handle the complexity of planning, sizing, provisioning, and returning live URLs and connection strings, so you don’t have to. We built it with GitHub Copilot CLI, which made it realistic to ship something this involved: multi-agent orchestration, a full API, real provisioning, and a polished dashboard, without drowning in boilerplate. We’re excited to keep improving Cloudgen, and we will soon launch it publicly, along with more tools to help you deploy better. Thanks to Copilot, we could ship fast!

Thanks to the GitHub Copilot team for running this challenge.
