DEV Community

Syntora
Building an AI Proposal Pipeline: From Call Transcripts to Branded Web Pages with Python and Supabase

Every client engagement at my agency starts the same way: a 30-minute discovery call, followed by me spending 2-3 hours writing a proposal. Scoping phases, listing deliverables, pricing line items, formatting it into something presentable. It was the most repetitive part of running the business.

So I built a pipeline that turns a call recording into a branded, client-facing proposal page. The whole flow takes about 20 minutes now, and most of that is me reviewing the output and adjusting pricing.

This post walks through the full technical implementation: the Python extraction layer, the Claude API integration, the Supabase schema, and the Next.js rendering. Everything here is production code (simplified for clarity).

Pipeline Overview

The system has six stages:

  1. Discovery call gets recorded and transcribed (I use Otter.ai, but any transcription service works).
  2. A Python script extracts structured data from the raw transcript.
  3. Claude generates a full proposal JSON with scope, phases, and deliverables.
  4. I review the output in a simple terminal UI and adjust pricing.
  5. One click publishes the proposal to Supabase.
  6. The client receives a link to a branded proposal page rendered by Next.js.

The core principle: every step produces a well-defined data structure that the next step consumes. No magic, no ambiguity between stages.
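
That principle can be sketched as a single orchestration function. This is the shape of the flow, not the literal production code: each stage function is defined in the sections below, and `review_and_price` stands in for the one manual step.

```python
def run_pipeline(transcript_path, load, extract, generate, review_and_price, publish):
    """Chain the pipeline stages. Each stage consumes exactly what the
    previous stage produced -- no shared state, no ambiguity."""
    transcript = load(transcript_path)       # Step 1: read the .txt export
    call_data = extract(transcript)          # Step 2: structured extraction
    proposal = generate(call_data)           # Step 3: full proposal JSON
    prices = review_and_price(proposal)      # Step 4: manual review + pricing
    return publish(proposal, prices)         # Steps 5-6: publish, return the slug
```

Passing the stages in as callables also makes the whole thing trivial to test with stubs.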

Step 1: Transcript Ingestion

Nothing fancy here. I export the transcript as a .txt file and drop it into a watched directory. The script picks it up and reads it in.

from pathlib import Path

def load_transcript(file_path: str) -> str:
    path = Path(file_path)
    if not path.exists():
        raise FileNotFoundError(f"Transcript not found: {file_path}")

    text = path.read_text(encoding="utf-8")
    return text.strip()

If you are using an API-based transcription service (Deepgram, AssemblyAI), you can skip the file step and pipe the transcript string directly into the next stage.

Step 2: Extracting Key Info with Claude

This is where the pipeline gets interesting. I send the raw transcript to Claude with a structured extraction prompt. The goal is to pull out everything I need to scope a proposal: what the client wants, what their current setup looks like, technical constraints, timeline expectations, and budget signals.

import anthropic
import json

client = anthropic.Anthropic()

EXTRACTION_PROMPT = """You are an expert business analyst. Extract the following
structured information from this discovery call transcript.

Return valid JSON with these fields:
- company_name: string
- contact_name: string
- contact_email: string (if mentioned, else null)
- current_stack: list of strings (technologies they currently use)
- pain_points: list of strings (specific problems they described)
- desired_outcomes: list of strings (what success looks like to them)
- timeline: string (any timeline mentioned)
- budget_signals: string (any budget context)
- project_type: one of [automation, web_app, data_pipeline, integration, other]
- raw_requirements: list of strings (specific features or deliverables they asked for)
- red_flags: list of strings (scope creep risks, unrealistic expectations, etc.)

Only include information that was explicitly stated or strongly implied.
Do not fabricate details."""


def extract_call_data(transcript: str) -> dict:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=[
            {
                "role": "user",
                "content": EXTRACTION_PROMPT + "\n\nTranscript:\n" + transcript
            }
        ]
    )

    response_text = message.content[0].text.strip()
    # Strip markdown code fences the model sometimes adds before parsing
    if response_text.startswith("```"):
        response_text = response_text.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(response_text.strip())

A few things worth noting:

  • I use claude-sonnet-4-20250514 for extraction. It is fast, cheap, and accurate for structured extraction tasks. You do not need Opus here.
  • The red_flags field has saved me from bad engagements more than once. Claude is surprisingly good at catching scope creep signals in natural conversation.
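
One guard worth adding: the generation step trusts this dict completely, so I'd validate it before moving on. A minimal sketch (the field list mirrors the extraction prompt; contact_email may be null, so it is not required here, and the exception type is my choice):

```python
ALLOWED_PROJECT_TYPES = {"automation", "web_app", "data_pipeline", "integration", "other"}

REQUIRED_FIELDS = [
    "company_name", "contact_name", "current_stack", "pain_points",
    "desired_outcomes", "timeline", "budget_signals", "project_type",
    "raw_requirements", "red_flags",
]

def validate_call_data(call_data: dict) -> dict:
    """Fail fast if the model dropped a field or invented a project type."""
    missing = [f for f in REQUIRED_FIELDS if f not in call_data]
    if missing:
        raise ValueError(f"Extraction missing fields: {missing}")
    if call_data["project_type"] not in ALLOWED_PROJECT_TYPES:
        raise ValueError(f"Unknown project_type: {call_data['project_type']}")
    return call_data
```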

Step 3: Generating the Proposal JSON

With the extracted data in hand, a second Claude call generates the full proposal structure. I keep extraction and generation as separate calls because they are fundamentally different tasks, and separating them makes each prompt simpler and more reliable.

PROPOSAL_PROMPT = """You are a senior technical consultant writing a project proposal.
Given the following extracted call data, generate a complete proposal JSON.

Rules:
- Break the project into logical phases (2-4 phases typical)
- Each phase has deliverables with descriptions
- Include estimated hours per deliverable (be conservative)
- Do NOT include pricing. I will add that manually.
- Write deliverable descriptions in plain English, not jargon

Return valid JSON with: title, summary, phases (each with name,
description, deliverables), assumptions, out_of_scope, timeline_estimate."""


def generate_proposal(call_data: dict) -> dict:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4000,
        messages=[
            {
                "role": "user",
                "content": PROPOSAL_PROMPT + "\n\nCall data:\n" + json.dumps(call_data, indent=2)
            }
        ]
    )

    response_text = message.content[0].text.strip()
    # Strip markdown code fences the model sometimes adds before parsing
    if response_text.startswith("```"):
        response_text = response_text.split("\n", 1)[1].rsplit("```", 1)[0]
    proposal = json.loads(response_text.strip())

    # Inject metadata from the extraction step
    proposal["company_name"] = call_data["company_name"]
    proposal["contact_name"] = call_data["contact_name"]
    proposal["project_type"] = call_data["project_type"]

    return proposal

Here is what a real proposal JSON looks like after generation (abbreviated):

{
  "title": "CRM Integration and Automated Reporting Pipeline",
  "company_name": "Acme Corp",
  "contact_name": "Jane Smith",
  "project_type": "integration",
  "summary": "Build an automated pipeline connecting HubSpot CRM to the internal reporting dashboard, eliminating manual CSV exports and reducing reporting lag from 3 days to real-time.",
  "phases": [
    {
      "name": "Discovery and Architecture",
      "description": "Audit the current HubSpot configuration and design the integration architecture.",
      "deliverables": [
        {
          "name": "Data Flow Diagram",
          "description": "Visual map of all data moving between HubSpot and downstream dashboards.",
          "estimated_hours": 4
        },
        {
          "name": "Technical Specification",
          "description": "Detailed spec covering API endpoints, data transformations, and error handling.",
          "estimated_hours": 6
        }
      ]
    },
    {
      "name": "Core Integration Build",
      "description": "Implement the data pipeline with scheduling, error handling, and monitoring.",
      "deliverables": [
        {
          "name": "HubSpot API Connector",
          "description": "Python service that pulls contact, deal, and activity data on a configurable schedule.",
          "estimated_hours": 12
        }
      ]
    }
  ],
  "assumptions": [
    "Acme provides API credentials and a staging HubSpot environment",
    "Data volume is under 100k records per sync"
  ],
  "out_of_scope": [
    "Modifications to the existing reporting dashboard UI",
    "Historical data backfill beyond 12 months"
  ],
  "timeline_estimate": "4-6 weeks"
}

At this point I open the JSON in my terminal, review the phases, and manually add pricing. This is the one step I refuse to automate. Pricing requires judgment that depends on client relationship, strategic value, and context that no LLM has.
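
That said, the starting point for pricing is mechanical: sum the estimated hours per phase and apply a rate, then adjust by judgment. A sketch of that first-draft arithmetic (the flat hourly rate is illustrative; the output dict matches the `prices` shape that `publish_proposal` consumes):

```python
def draft_prices(proposal: dict, hourly_rate: float) -> dict:
    """Turn estimated hours into a starting-point price per phase.

    These are only first drafts; the final numbers still get adjusted
    by hand for client context before publishing.
    """
    phase_prices = []
    for phase in proposal["phases"]:
        hours = sum(d["estimated_hours"] for d in phase["deliverables"])
        phase_prices.append(round(hours * hourly_rate, 2))
    return {"phases": phase_prices, "total": round(sum(phase_prices), 2)}
```

Run against the abbreviated Acme proposal above (10 hours in phase one, 12 in phase two), a $150/hr rate yields $1,500 and $1,800 per phase.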

Step 4: The Supabase Schema

The database schema is straightforward. A proposals table holds the top-level metadata, and a proposal_phases table holds the phase/deliverable breakdown. Deliverables live in a Postgres jsonb column rather than their own table, since they are always read as a unit with their parent phase.

create table public.proposals (
  id uuid default gen_random_uuid() primary key,
  slug text unique not null,
  company_name text not null,
  contact_name text not null,
  project_type text not null,
  title text not null,
  summary text not null,
  assumptions jsonb default '[]'::jsonb,
  out_of_scope jsonb default '[]'::jsonb,
  timeline_estimate text,
  total_price numeric(10, 2),
  status text default 'draft'
    check (status in ('draft', 'published', 'accepted', 'expired')),
  published_at timestamptz,
  expires_at timestamptz,
  created_at timestamptz default now(),
  updated_at timestamptz default now()
);

create table public.proposal_phases (
  id uuid default gen_random_uuid() primary key,
  proposal_id uuid references public.proposals(id) on delete cascade,
  phase_order integer not null,
  name text not null,
  description text not null,
  deliverables jsonb not null default '[]'::jsonb,
  phase_price numeric(10, 2),
  created_at timestamptz default now()
);

create index idx_proposals_slug on public.proposals(slug);

alter table public.proposals enable row level security;

create policy "Published proposals are publicly readable"
  on public.proposals for select
  using (status = 'published');

create policy "Authenticated users can manage proposals"
  on public.proposals for all
  using (auth.role() = 'authenticated');

Two design decisions worth explaining:

  • Slug-based lookups. The client-facing URL uses a slug (e.g., /proposals/acme-crm-integration) rather than a UUID. Easier to share, easier to remember.
  • RLS for access control. Published proposals are world-readable. Drafts require authentication. This means I do not need a separate API layer for access control.

Step 5: Publishing to the Database

Once I have reviewed and priced the proposal, a single function pushes it to Supabase.

from supabase import create_client
from datetime import datetime, timezone
import os
import re

supabase = create_client(
    os.environ["SUPABASE_URL"],
    os.environ["SUPABASE_SERVICE_ROLE_KEY"]
)


def slugify(text: str) -> str:
    text = text.lower().strip()
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"[-\s]+", "-", text)


def publish_proposal(proposal: dict, prices: dict) -> str:
    slug = slugify(proposal["company_name"] + "-" + proposal["title"])

    result = supabase.table("proposals").insert({
        "slug": slug,
        "company_name": proposal["company_name"],
        "contact_name": proposal["contact_name"],
        "project_type": proposal["project_type"],
        "title": proposal["title"],
        "summary": proposal["summary"],
        "assumptions": proposal["assumptions"],
        "out_of_scope": proposal["out_of_scope"],
        "timeline_estimate": proposal["timeline_estimate"],
        "total_price": prices["total"],
        "status": "published",
        # Send a real timestamp; the literal string "now()" is not
        # valid timestamptz input through the REST API
        "published_at": datetime.now(timezone.utc).isoformat(),
    }).execute()

    proposal_id = result.data[0]["id"]

    for i, phase in enumerate(proposal["phases"]):
        supabase.table("proposal_phases").insert({
            "proposal_id": proposal_id,
            "phase_order": i + 1,
            "name": phase["name"],
            "description": phase["description"],
            "deliverables": phase["deliverables"],
            "phase_price": prices["phases"][i],
        }).execute()

    return slug

Step 6: Rendering with Next.js

The frontend is a Next.js app with a dynamic route at /proposals/[slug]. It fetches the proposal data server-side and renders a clean, branded page.

The server component calls Supabase, fetches the proposal by slug, joins the phases table, sorts phases by order, and renders everything in a responsive layout. Since the proposal row has RLS allowing public reads for published proposals, the server component can use the anon key. No auth required on the client side for viewing.

The page has three sections:

  1. Header with the proposal title, company name, and summary.
  2. Phases listed sequentially, each with its deliverables in a bordered card layout and the phase price aligned right.
  3. Footer with assumptions and out-of-scope items in a two-column grid, and the total price centered at the bottom.

The rendering code is about 80 lines of standard Next.js with TypeScript interfaces for type safety. Nothing exotic. The data structure from Supabase maps directly to the component hierarchy, which is the whole point of designing the schema around the rendering needs.

Results and What I Would Change

The pipeline has processed around 40 proposals since I built it. Actual numbers:

  • Time per proposal: dropped from 2-3 hours to roughly 20 minutes. Most of that 20 minutes is me reviewing scope and setting prices.
  • Consistency: every proposal follows the same structure. Clients get a predictable, professional format every time.
  • Accuracy: Claude's extraction catches details I used to miss when writing proposals manually, especially around assumptions and scope boundaries.

If I were rebuilding this from scratch, I would change two things. First, I would add a webhook so Otter.ai pushes transcripts directly into the pipeline instead of me dropping files manually. Second, I would version proposals in the database so clients can see revision history when scope changes during negotiation.

The broader point: proposals are one of dozens of business processes that follow the pattern of "take unstructured input, produce structured output, render it somewhere." Once you have the extraction-to-rendering pipeline working for one use case, adapting it to others (SOWs, invoices, project briefs) is mostly just changing the prompts and the schema.


I'm Parker Gawne, founder of Syntora. We build custom Python infrastructure for small and mid-size businesses. syntora.io

Top comments (1)

Syntora

Proposal sent → Proposal signed → Onboarding automation → Client portal spun up automatically → Backlog created from discovery calls. Time savings are massive.