<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Theo Oliveira</title>
    <description>The latest articles on DEV Community by Theo Oliveira (@theo_oliveira_40b15cfaf73).</description>
    <link>https://dev.to/theo_oliveira_40b15cfaf73</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1555824%2F1fafaa7d-2f2f-4742-a026-fc702e85c336.jpg</url>
      <title>DEV Community: Theo Oliveira</title>
      <link>https://dev.to/theo_oliveira_40b15cfaf73</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/theo_oliveira_40b15cfaf73"/>
    <language>en</language>
    <item>
      <title>How We Built an AI SaaS on the Edge for Nearly $0 in Infrastructure Costs</title>
      <dc:creator>Theo Oliveira</dc:creator>
      <pubDate>Thu, 18 Jun 2026 04:26:22 +0000</pubDate>
      <link>https://dev.to/theo_oliveira_40b15cfaf73/how-we-built-an-ai-saas-on-the-edge-for-nearly-0-in-infrastructure-costs-24jm</link>
      <guid>https://dev.to/theo_oliveira_40b15cfaf73/how-we-built-an-ai-saas-on-the-edge-for-nearly-0-in-infrastructure-costs-24jm</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;A few months ago, we started building &lt;a href="https://propoza.com.br" rel="noopener noreferrer"&gt;Propoza&lt;/a&gt;, a tool that generates business proposals with AI for Brazilian freelancers and small business owners.&lt;/p&gt;

&lt;p&gt;The problem was practical: freelancers spend hours crafting proposals in Canva or Word, delivering generic documents that don't protect the project scope. The tool needed to be simple — the user describes the service, the AI structures a complete proposal with scope, timeline, and payment terms.&lt;/p&gt;

&lt;p&gt;The technical challenge: how do you run a freemium SaaS for Brazilian freelancers without blowing your budget before you have revenue?&lt;/p&gt;

&lt;p&gt;Even a lean SaaS or micro SaaS typically costs tens of dollars per month before its first paying customer, just for a server, a managed database, and a CDN. For a product still in validation, that's burned money.&lt;/p&gt;

&lt;p&gt;We decided to try a different route: build everything on the edge, with serverless infrastructure and zero servers to manage. Here's what worked, the trade-offs, and what we learned.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The problem with the traditional SaaS stack
&lt;/h2&gt;

&lt;p&gt;Before choosing our stack, we calculated the minimum cost of a Brazilian SaaS running on conventional infrastructure:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Typical Provider&lt;/th&gt;
&lt;th&gt;Estimated Monthly Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Server&lt;/td&gt;
&lt;td&gt;DigitalOcean / AWS EC2&lt;/td&gt;
&lt;td&gt;$8–$30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;Managed PostgreSQL (RDS, Supabase)&lt;/td&gt;
&lt;td&gt;$10–$40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CDN&lt;/td&gt;
&lt;td&gt;Cloudflare (paid plan) or similar&lt;/td&gt;
&lt;td&gt;$5–$20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain&lt;/td&gt;
&lt;td&gt;Namecheap / Porkbun&lt;/td&gt;
&lt;td&gt;~$1/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$24–$91/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's not unreasonable for an established SaaS. But for a product still validating its market fit with zero paying customers, it means burning $100–$500 over the first several months, before you even know if the model works.&lt;/p&gt;
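
&lt;p&gt;As a sanity check on the table, the sketch below sums the low and high ends of each row as listed and projects the spend over a validation period. The row values come from the table; the six-month window and the function names are assumptions chosen for illustration.&lt;/p&gt;

```typescript
// Monthly cost ranges in USD, copied from the table above.
type CostRange = { component: string; low: number; high: number };

const traditionalStack: CostRange[] = [
  { component: "Server", low: 8, high: 30 },
  { component: "Database", low: 10, high: 40 },
  { component: "CDN", low: 5, high: 20 },
  { component: "Domain", low: 1, high: 1 },
];

// Sum the low and high ends of every row.
function monthlyTotal(rows: CostRange[]): { low: number; high: number } {
  return rows.reduce(
    (acc, row) => ({ low: acc.low + row.low, high: acc.high + row.high }),
    { low: 0, high: 0 },
  );
}

// Project the spend over a number of months (hypothetical validation window).
function burn(rows: CostRange[], months: number): { low: number; high: number } {
  const total = monthlyTotal(rows);
  return { low: total.low * months, high: total.high * months };
}

console.log(monthlyTotal(traditionalStack)); // { low: 24, high: 91 }
console.log(burn(traditionalStack, 6));      // { low: 144, high: 546 }
```

&lt;p&gt;Six months at the low end is already well past $100, which is the order of magnitude we wanted to avoid.&lt;/p&gt;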

&lt;p&gt;There's another Brazil-specific problem: &lt;strong&gt;latency&lt;/strong&gt;. Servers concentrated in São Paulo and Rio de Janeiro leave users in the North and Northeast with poor experiences, especially on mobile connections. A CDN helps, but adds cost.&lt;/p&gt;

&lt;p&gt;We needed something that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost &lt;strong&gt;near $0&lt;/strong&gt; in the first few months&lt;/li&gt;
&lt;li&gt;Offered &lt;strong&gt;low latency&lt;/strong&gt; across all of Brazil&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaled&lt;/strong&gt; without manual intervention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's when we looked at the edge serverless model.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. The stack we chose (and why)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 Cloudflare Workers as the edge runtime
&lt;/h3&gt;

&lt;p&gt;The application's backbone is Cloudflare Workers — a serverless runtime that executes code across 330+ data centers worldwide, including points of presence in Brazil (São Paulo, Rio de Janeiro, Fortaleza).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code runs in the data center closest to the user. For Brazilian users, network latency to the nearest point of presence typically stays under 50ms.&lt;/li&gt;
&lt;li&gt;Workers run in V8 isolates rather than containers, so cold starts are negligible compared with container-based functions such as AWS Lambda. The first request is nearly as fast as subsequent ones.&lt;/li&gt;
&lt;li&gt;If the product goes viral and 10,000 people access it simultaneously, Workers distributes the load automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The trade-off:&lt;/strong&gt; Workers is not Node.js. It's an isolated V8 runtime with no conventional filesystem, and the Node standard library is only partially available (through the &lt;code&gt;nodejs_compat&lt;/code&gt; compatibility flag). Most things you need exist as Web-standard APIs (fetch, Web Crypto, streams, WebSocket), but libraries that depend on &lt;code&gt;fs&lt;/code&gt; or &lt;code&gt;net&lt;/code&gt; often won't work without changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 Hono as the HTTP framework
&lt;/h3&gt;

&lt;p&gt;We needed an HTTP framework that ran inside Workers with zero overhead.&lt;/p&gt;

&lt;p&gt;Most popular Node.js frameworks (Express, Fastify, Koa) were designed for full Node.js environments and have compatibility issues. They depend on the &lt;code&gt;http&lt;/code&gt; module, use synchronous APIs, or carry middleware too heavy for the serverless model.&lt;/p&gt;

&lt;p&gt;Hono sidesteps all that. It's under 14KB, runs natively on Workers, Deno, Bun, and Node.js, and is TypeScript-first with strong type inference. It supports middleware, route params, and Zod validation through the &lt;code&gt;@hono/zod-validator&lt;/code&gt; middleware.&lt;/p&gt;

&lt;p&gt;Here's a proposal generation route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;zValidator&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@hono/zod-validator&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hono&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;proposalSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;clientName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;projectDescription&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;deliverables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;paymentMethod&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/proposals/generate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;zValidator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;proposalSchema&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Body already validated against proposalSchema by zValidator&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;// generateProposal (not shown) calls the LLM and returns the proposal&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateProposal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;proposal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validating with Zod as soon as the request arrives, combined with Hono's inferred types, catches a whole class of runtime bugs (missing fields, wrong types) with negligible processing overhead.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.3 Database on the edge
&lt;/h3&gt;

&lt;p&gt;The database choice was the hardest decision. For an application that needs to persist proposals, users, and settings, the relational model is still the most natural fit.&lt;/p&gt;

&lt;p&gt;We examined two options in the edge ecosystem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;D1 (Cloudflare):&lt;/strong&gt; Distributed, managed SQLite. The Worker queries it through a binding, so there are no connection strings or connection pools to manage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Turso:&lt;/strong&gt; Distributed SQLite with per-region replication.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We went with D1 for its native integration with Workers. You define the schema locally, run migrations with &lt;code&gt;wrangler d1 migrations apply&lt;/code&gt;, and the Worker queries the database directly through its binding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;proposals&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;client_name&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'draft'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;proposals_count&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Issues we ran into:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In our experience, D1 didn't accept every statement that plain SQLite does. &lt;code&gt;ALTER TABLE&lt;/code&gt; with complex constraints, &lt;code&gt;RETURNING&lt;/code&gt;, and some window functions gave us trouble. Always test with &lt;code&gt;wrangler dev&lt;/code&gt; before deploying.&lt;/li&gt;
&lt;li&gt;Write latency is higher than read latency, and the free plan leans toward eventual consistency. That's fine for business proposals.&lt;/li&gt;
&lt;li&gt;Migrations work, but altering tables that already hold a lot of data requires planning.&lt;/li&gt;
&lt;li&gt;Write-heavy workloads sometimes need to be spread out with queues or absorbed by KV-backed caching.&lt;/li&gt;
&lt;/ul&gt;
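
&lt;p&gt;For reference, here is a minimal sketch of writing a row to the &lt;code&gt;proposals&lt;/code&gt; table through the D1 binding. &lt;code&gt;prepare&lt;/code&gt;, &lt;code&gt;bind&lt;/code&gt;, and &lt;code&gt;run&lt;/code&gt; are D1's prepared-statement methods; the &lt;code&gt;D1Like&lt;/code&gt; interfaces and the &lt;code&gt;buildInsert&lt;/code&gt; and &lt;code&gt;saveProposal&lt;/code&gt; helpers are assumptions for illustration, not our production code.&lt;/p&gt;

```typescript
// Structural stand-in for the D1 binding so the sketch type-checks on its own.
// In a real Worker you would use the D1Database type from @cloudflare/workers-types.
interface D1StatementLike {
  bind(...values: unknown[]): D1StatementLike;
  run(): unknown; // D1 returns a promise here; await it in the route handler
}
interface D1Like {
  prepare(sql: string): D1StatementLike;
}

interface NewProposal {
  id: string;
  userId: string;
  clientName: string;
  content: string;
}

// Build the SQL and its parameters separately so values are always bound,
// never concatenated into the query string.
function buildInsert(p: NewProposal): { sql: string; params: string[] } {
  return {
    sql: "INSERT INTO proposals (id, user_id, client_name, content) VALUES (?, ?, ?, ?)",
    params: [p.id, p.userId, p.clientName, p.content],
  };
}

// Hypothetical helper: run the insert through the binding (for example env.DB).
// status, created_at, and updated_at fall back to the column defaults.
function saveProposal(db: D1Like, p: NewProposal) {
  const { sql, params } = buildInsert(p);
  return db.prepare(sql).bind(...params).run();
}
```

&lt;p&gt;Keeping the statement builder separate from the binding call also makes it easy to unit-test the SQL without a database.&lt;/p&gt;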

&lt;h3&gt;
  
  
  3.4 Cloudflare AI Gateway
&lt;/h3&gt;

&lt;p&gt;The most expensive part of an AI SaaS isn't the infrastructure; it's the LLM API. Each call costs fractions of a cent, but hundreds of calls per day add up fast.&lt;/p&gt;

&lt;p&gt;The AI Gateway acts as a proxy between the Worker and the LLM API:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Response caching:&lt;/strong&gt; if two users send an identical request (same prompt and parameters), the response comes from cache instead of the LLM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-user rate limiting:&lt;/strong&gt; on the free plan, we limit to 5 proposals/month. The gateway applies this with zero extra code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability:&lt;/strong&gt; latency, tokens, and error rates are logged automatically.&lt;/li&gt;
&lt;/ul&gt;
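
&lt;p&gt;For comparison, the same 5-proposals-per-month rule could also be checked inside the Worker before the LLM is called. This is a sketch under assumptions, not how Propoza enforces it: the &lt;code&gt;remainingThisMonth&lt;/code&gt; helper is hypothetical, and it expects the &lt;code&gt;created_at&lt;/code&gt; values of the user's proposals in SQLite's &lt;code&gt;CURRENT_TIMESTAMP&lt;/code&gt; format.&lt;/p&gt;

```typescript
const FREE_PLAN_LIMIT = 5; // proposals per month on the free plan (from the text above)

// "YYYY-MM" key for a Date, in UTC.
function monthKey(date: Date): string {
  return date.toISOString().slice(0, 7);
}

// createdAt values use SQLite's CURRENT_TIMESTAMP format: "YYYY-MM-DD HH:MM:SS" (UTC).
function remainingThisMonth(createdAt: string[], now: Date, limit = FREE_PLAN_LIMIT): number {
  const key = monthKey(now);
  const used = createdAt.filter((ts) => ts.slice(0, 7) === key).length;
  return Math.max(0, limit - used);
}

const sampleNow = new Date("2026-06-18T04:26:22Z");
console.log(remainingThisMonth(["2026-06-01 10:00:00", "2026-05-30 09:00:00"], sampleNow)); // 4
```

&lt;p&gt;Only the June entry counts against the quota, so four generations remain.&lt;/p&gt;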

&lt;p&gt;The request flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request -&amp;gt; Worker -&amp;gt; AI Gateway -&amp;gt; Cache hit? Return cached response
                                -&amp;gt; Cache miss? LLM API -&amp;gt; Cache the response -&amp;gt; Return
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Caching eliminated about 40% of actual LLM calls in the first few weeks, largely because many users try the product with the same inputs.&lt;/p&gt;
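
&lt;p&gt;The flow above can be sketched as a small cache-aside helper. This is an in-memory illustration only, assuming exact-match caching on the request body; the &lt;code&gt;ResponseCache&lt;/code&gt; class and the &lt;code&gt;fakeLlm&lt;/code&gt; function are hypothetical, and the real cache lives inside AI Gateway.&lt;/p&gt;

```typescript
// In-memory stand-in for the gateway cache, keyed by the exact request body.
class ResponseCache {
  private store = new Map();
  hits = 0;
  misses = 0;

  // Return the cached response for an identical request body,
  // or call generate(), cache its result, and return it.
  getOrGenerate(requestBody: string, generate: () => string): string {
    if (this.store.has(requestBody)) {
      this.hits += 1;
      return this.store.get(requestBody);
    }
    this.misses += 1;
    const response = generate();
    this.store.set(requestBody, response);
    return response;
  }

  // Share of requests served from cache.
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const demoCache = new ResponseCache();
const fakeLlm = () => "generated proposal"; // hypothetical stand-in for the LLM call
demoCache.getOrGenerate('{"prompt":"logo design"}', fakeLlm); // miss
demoCache.getOrGenerate('{"prompt":"logo design"}', fakeLlm); // hit: identical body
demoCache.getOrGenerate('{"prompt":"Logo design"}', fakeLlm); // miss: one character differs
console.log(demoCache.hits, demoCache.misses); // 1 2
```

&lt;p&gt;The third call shows why normalizing inputs (trimming, consistent casing) before sending them can raise the hit rate.&lt;/p&gt;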

&lt;h3&gt;
  
  
  3.5 Frontend: React + Vite + Cloudflare Pages
&lt;/h3&gt;

&lt;p&gt;The frontend is built with React + Vite and hosted on Cloudflare Pages. The free tier covers 500 builds/month and unlimited bandwidth for static sites.&lt;/p&gt;

&lt;p&gt;The frontend talks to the API through typed &lt;code&gt;fetch&lt;/code&gt; calls, which works because the Worker and the frontend share the same TypeScript types:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// shared types&lt;/span&gt;
&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;GenerateRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;clientName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;projectDescription&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;deliverables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
  &lt;span class="nx"&gt;deadline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;paymentMethod&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// client-side call; `request` is a GenerateRequest built from the form&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/proposals/generate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One detail that made a difference: &lt;strong&gt;PDFs are generated client-side&lt;/strong&gt;, using libraries like &lt;code&gt;@react-pdf/renderer&lt;/code&gt; or &lt;code&gt;html2canvas + jspdf&lt;/code&gt;. The Worker never allocates memory or CPU for PDF rendering — the user's browser handles it.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. The real cost
&lt;/h2&gt;

&lt;p&gt;After a few weeks in production with dozens of active users:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Cost/Month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API / Backend&lt;/td&gt;
&lt;td&gt;Cloudflare Workers (Free Tier)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;Cloudflare D1 (Free Tier)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend / Hosting&lt;/td&gt;
&lt;td&gt;Cloudflare Pages (Free Tier)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI (LLM)&lt;/td&gt;
&lt;td&gt;External API (cached via AI Gateway)&lt;/td&gt;
&lt;td&gt;~$1–$4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain&lt;/td&gt;
&lt;td&gt;Namecheap / Porkbun&lt;/td&gt;
&lt;td&gt;~$1/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&amp;lt; $5/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The only variable cost is the LLM. It scales with actual usage, not with registered users. If someone signs up but never generates a proposal, the cost is zero.&lt;/p&gt;

&lt;p&gt;This isn't forever. When we scale to thousands of users, the Workers free tier will need an upgrade (the paid plan starts at $5/month once you exceed 100k requests/day), and D1 may need a paid plan. But for product-market fit (PMF) validation, it buys you months of experimentation.&lt;/p&gt;
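
&lt;p&gt;A rough way to see how far the free tier stretches is to divide the daily request limit by the requests a typical user makes. The 100k figure is the limit quoted above; the 40 requests per user per day is an assumption for illustration, not a measurement.&lt;/p&gt;

```typescript
const FREE_REQUESTS_PER_DAY = 100000; // Workers free-tier limit quoted above

// Estimated daily requests for a given number of daily active users.
function dailyRequests(activeUsers: number, requestsPerUser: number): number {
  return activeUsers * requestsPerUser;
}

// Largest number of daily active users that still fits in the free tier.
function maxUsersOnFreeTier(requestsPerUser: number): number {
  return Math.floor(FREE_REQUESTS_PER_DAY / requestsPerUser);
}

console.log(dailyRequests(50, 40));  // 2000
console.log(maxUsersOnFreeTier(40)); // 2500
```

&lt;p&gt;Under that assumption, dozens of active users consume only a small share of the daily allowance.&lt;/p&gt;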




&lt;h2&gt;
  
  
  5. What we learned
&lt;/h2&gt;

&lt;p&gt;Five things we'd take to the next project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Remote debugging on Workers is harder. &lt;code&gt;wrangler tail&lt;/code&gt; helps.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There's no SSH into a Worker. &lt;code&gt;wrangler tail&lt;/code&gt; streams live production logs. Pair it with structured logging (JSON with &lt;code&gt;requestId&lt;/code&gt;, &lt;code&gt;userId&lt;/code&gt;, &lt;code&gt;latency&lt;/code&gt;).&lt;/p&gt;
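
&lt;p&gt;A structured log line of that kind can be as simple as one JSON object per event. The sketch below is an assumption about shape, not our exact logger: the &lt;code&gt;formatLog&lt;/code&gt; helper, the &lt;code&gt;level&lt;/code&gt; and &lt;code&gt;message&lt;/code&gt; fields, and the sample values are hypothetical.&lt;/p&gt;

```typescript
interface LogFields {
  requestId: string;
  userId: string;
  latency: number; // milliseconds
  message: string;
}

// Serialize one log entry as a single JSON line so it is easy to filter in the log stream.
function formatLog(fields: LogFields): string {
  return JSON.stringify({ level: "info", ...fields });
}

console.log(
  formatLog({ requestId: "req-123", userId: "user-42", latency: 87, message: "proposal generated" }),
);
// {"level":"info","requestId":"req-123","userId":"user-42","latency":87,"message":"proposal generated"}
```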

&lt;p&gt;&lt;strong&gt;2. Secrets, vars, and bindings have nuances.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;API keys go in &lt;code&gt;secrets&lt;/code&gt;, public config goes in &lt;code&gt;vars&lt;/code&gt;, and resources like D1 or KV go in &lt;code&gt;bindings&lt;/code&gt; (direct runtime references). Under &lt;code&gt;wrangler dev&lt;/code&gt;, secrets are only available if you define them in a &lt;code&gt;.dev.vars&lt;/code&gt; file.&lt;/p&gt;
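
&lt;p&gt;All three kinds of values arrive on the same &lt;code&gt;env&lt;/code&gt; object at runtime, which is why a missing secret only shows up when the code first reads it. A small guard makes that failure explicit. The &lt;code&gt;Env&lt;/code&gt; shape, the key names, and the &lt;code&gt;requireString&lt;/code&gt; helper below are assumptions for illustration.&lt;/p&gt;

```typescript
// Hypothetical shape of the Worker's env object for this kind of project.
interface Env {
  LLM_API_KEY?: string; // secret: set with `wrangler secret put`, or in .dev.vars locally
  APP_ORIGIN?: string;  // var: plain-text config from the Wrangler config file
  DB?: object;          // binding: the D1 database handle
}

// Fail fast with a readable message when a secret or var is missing.
function requireString(env: Env, key: "LLM_API_KEY" | "APP_ORIGIN"): string {
  const value = env[key];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`Missing ${key}. Secrets need a .dev.vars entry for local development.`);
  }
  return value;
}

console.log(requireString({ LLM_API_KEY: "sk-test" }, "LLM_API_KEY")); // sk-test
```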

&lt;p&gt;&lt;strong&gt;3. D1 doesn't accept everything SQLite accepts.&lt;/strong&gt;&lt;/p&gt;

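&lt;p&gt;A migration that worked locally broke on D1 because it used &lt;code&gt;ALTER TABLE ... ADD COLUMN&lt;/code&gt; with constraints D1 rejects. Test every migration with &lt;code&gt;wrangler d1 migrations apply --local&lt;/code&gt; first, then apply it to a separate staging database with &lt;code&gt;--remote&lt;/code&gt; before touching production, since a local pass doesn't guarantee the hosted database will accept it.&lt;/p&gt;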

&lt;p&gt;&lt;strong&gt;4. AI Gateway saved us instrumentation work.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We didn't need to implement token logging, caching, or rate limiting manually. The gateway provides all of it through configuration: you point the LLM client at the gateway endpoint instead of the provider's API. That saved us days of work.&lt;/p&gt;
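
&lt;p&gt;In practice the switch mostly amounts to changing the base URL the LLM client talks to. The sketch below builds that URL from its parts; the &lt;code&gt;gatewayBaseUrl&lt;/code&gt; helper and the placeholder IDs are assumptions, and the provider segment depends on which LLM API you use.&lt;/p&gt;

```typescript
// Build the AI Gateway base URL for a provider.
// Account and gateway IDs are placeholders; take the real ones from the Cloudflare dashboard.
function gatewayBaseUrl(accountId: string, gatewayId: string, provider: string): string {
  const parts = [accountId, gatewayId, provider].map((part) => encodeURIComponent(part));
  return `https://gateway.ai.cloudflare.com/v1/${parts.join("/")}`;
}

console.log(gatewayBaseUrl("ACCOUNT_ID", "my-gateway", "openai"));
// https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway/openai
```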

&lt;p&gt;&lt;strong&gt;5. Hono + Zod is a solid combo for secure edge APIs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Validation at the request edge with auto-inferred types eliminates runtime errors without adding noticeable latency.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Conclusion
&lt;/h2&gt;

&lt;p&gt;Building on the edge with Cloudflare Workers let us validate the product without the fixed costs of a traditional SaaS. The stack is lean, deployment is &lt;code&gt;wrangler deploy&lt;/code&gt;, and the infra bill doesn't scare you.&lt;/p&gt;

&lt;p&gt;Is this for every SaaS? No. If you need heavy computation, complex queues, or a database with all the features of PostgreSQL, the edge stack has limitations. But for validation in markets where every dollar counts, it works.&lt;/p&gt;

&lt;p&gt;The tool we built is called Propoza — an AI-powered proposal generator for freelancers. It's free to use.&lt;/p&gt;

&lt;p&gt;If you've used this stack or have a different approach to low-cost SaaS infrastructure, I'd love to hear about it in the comments. The goal here is to share experience, not to sell technology.&lt;/p&gt;

</description>
      <category>cloudflare</category>
      <category>saas</category>
      <category>typescript</category>
      <category>serverless</category>
    </item>
  </channel>
</rss>
