<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: fastapier (Freelance Backend)</title>
    <description>The latest articles on DEV Community by fastapier (Freelance Backend) (@fastapier).</description>
    <link>https://dev.to/fastapier</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3782726%2F0884f0b4-b8ab-4239-9022-2ccd7532aba9.jpg</url>
      <title>DEV Community: fastapier (Freelance Backend)</title>
      <link>https://dev.to/fastapier</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/fastapier"/>
    <language>en</language>
    <item>
      <title>Building a Production-Aware AI Backend with FastAPI</title>
      <dc:creator>fastapier (Freelance Backend)</dc:creator>
      <pubDate>Wed, 25 Mar 2026 17:13:45 +0000</pubDate>
      <link>https://dev.to/fastapier/building-a-production-aware-ai-backend-with-fastapi-2f5p</link>
      <guid>https://dev.to/fastapier/building-a-production-aware-ai-backend-with-fastapi-2f5p</guid>
      <description>&lt;p&gt;Most AI backend examples stop at one thing:&lt;/p&gt;

&lt;p&gt;send a prompt, get a response.&lt;/p&gt;

&lt;p&gt;That is fine for demos, but real systems usually need more than that.&lt;/p&gt;

&lt;p&gt;Once you try to use AI inside an actual product, a few practical questions show up immediately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How much does each request cost?&lt;/li&gt;
&lt;li&gt;How long does each response take?&lt;/li&gt;
&lt;li&gt;Can we stream output instead of waiting for a full response?&lt;/li&gt;
&lt;li&gt;Can we reduce hallucinations by grounding responses in known data?&lt;/li&gt;
&lt;li&gt;Can we log usage for billing and analytics?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted to build something closer to that reality.&lt;/p&gt;

&lt;p&gt;So instead of making another thin OpenAI wrapper, I built a FastAPI-based AI backend with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;synchronous responses&lt;/li&gt;
&lt;li&gt;streaming responses&lt;/li&gt;
&lt;li&gt;usage logging&lt;/li&gt;
&lt;li&gt;token-based cost estimation&lt;/li&gt;
&lt;li&gt;response time monitoring&lt;/li&gt;
&lt;li&gt;lightweight context-based answering&lt;/li&gt;
&lt;li&gt;Docker reproducibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a backend that feels much closer to something you could actually extend into an internal AI tool or SaaS feature.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I Built It This Way
&lt;/h2&gt;

&lt;p&gt;A lot of AI tutorials focus on model output.&lt;/p&gt;

&lt;p&gt;I wanted to focus on backend behavior.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;responses should be explainable&lt;/li&gt;
&lt;li&gt;system behavior should be predictable&lt;/li&gt;
&lt;li&gt;logs should support observability&lt;/li&gt;
&lt;li&gt;the backend should be structured for extension, not just for demo screenshots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, I was less interested in “Can this call OpenAI?”&lt;br&gt;
and more interested in “Can this behave like a real backend feature?”&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.11&lt;/li&gt;
&lt;li&gt;FastAPI&lt;/li&gt;
&lt;li&gt;SQLAlchemy 2.0&lt;/li&gt;
&lt;li&gt;Alembic&lt;/li&gt;
&lt;li&gt;OpenAI API&lt;/li&gt;
&lt;li&gt;SQLite&lt;/li&gt;
&lt;li&gt;Docker&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What the Backend Does
&lt;/h2&gt;

&lt;p&gt;The project currently includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;POST /ai/test&lt;/code&gt; for standard AI responses&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /ai/stream&lt;/code&gt; for streaming output&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /ai/upload&lt;/code&gt; for adding text-based context data&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /seed&lt;/code&gt; for inserting sample context&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /ai/logs&lt;/code&gt; for inspecting stored usage logs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also stores request-level data such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt&lt;/li&gt;
&lt;li&gt;response&lt;/li&gt;
&lt;li&gt;total tokens&lt;/li&gt;
&lt;li&gt;estimated cost&lt;/li&gt;
&lt;li&gt;response time&lt;/li&gt;
&lt;li&gt;endpoint name&lt;/li&gt;
&lt;li&gt;user id&lt;/li&gt;
&lt;li&gt;timestamp&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That logging layer turned out to be one of the most important parts of the project.&lt;/p&gt;

&lt;p&gt;Because once you can see how AI is being used, the backend starts becoming operational instead of experimental.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lightweight RAG-Style Answering
&lt;/h2&gt;

&lt;p&gt;One of the goals was to reduce hallucinated answers.&lt;/p&gt;

&lt;p&gt;Instead of letting the model answer freely from its general knowledge, I added a lightweight retrieval flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;search the database for relevant records&lt;/li&gt;
&lt;li&gt;inject the retrieved content into the prompt&lt;/li&gt;
&lt;li&gt;instruct the model to answer only from that context&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is not a full vector database setup.&lt;br&gt;
It is intentionally lightweight.&lt;/p&gt;

&lt;p&gt;The retrieval logic uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;simple keyword-based matching&lt;/li&gt;
&lt;li&gt;basic query pre-processing for Japanese text&lt;/li&gt;
&lt;li&gt;AND search first&lt;/li&gt;
&lt;li&gt;OR fallback if needed&lt;/li&gt;
&lt;li&gt;safe fallback responses when no context is found&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the project is better described as a lightweight RAG-style backend rather than a full enterprise retrieval system.&lt;/p&gt;

&lt;p&gt;That was deliberate.&lt;/p&gt;

&lt;p&gt;I wanted something small enough to understand, but structured enough to feel useful.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Streaming Matters
&lt;/h2&gt;

&lt;p&gt;Streaming changes the feel of an AI product a lot.&lt;/p&gt;

&lt;p&gt;Without streaming, the user waits for the full answer.&lt;br&gt;
With streaming, the user gets feedback immediately.&lt;/p&gt;

&lt;p&gt;That makes the backend feel much closer to a real assistant feature.&lt;/p&gt;

&lt;p&gt;So I added &lt;code&gt;/ai/stream&lt;/code&gt; and then made sure streaming requests were not treated like second-class citizens.&lt;/p&gt;

&lt;p&gt;I wanted them logged too.&lt;/p&gt;

&lt;p&gt;That meant tracking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;total tokens&lt;/li&gt;
&lt;li&gt;estimated cost&lt;/li&gt;
&lt;li&gt;response time&lt;/li&gt;
&lt;li&gt;endpoint name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was important because a lot of examples show streaming output, but do not show how to observe or measure it properly.&lt;/p&gt;

&lt;p&gt;In practice, that observability layer is what makes the feature maintainable.&lt;/p&gt;
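&lt;p&gt;One way to log a streamed response is to wrap the chunk iterator and write a single log entry once the stream ends. The sketch below is an assumption about how this could look, not the repository's implementation: &lt;code&gt;StreamLog&lt;/code&gt;, &lt;code&gt;stream_with_logging&lt;/code&gt;, &lt;code&gt;count_tokens&lt;/code&gt; and &lt;code&gt;save&lt;/code&gt; are hypothetical names, and the token counter stands in for whatever usage data the provider reports.&lt;/p&gt;

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class StreamLog:
    """Hypothetical log record; field names mirror those listed in the article."""
    endpoint: str
    response: str
    total_tokens: int
    response_time_ms: float


def stream_with_logging(
    chunks: Iterable[str],
    endpoint: str,
    count_tokens: Callable[[str], int],
    save: Callable[[StreamLog], None],
) -> Iterator[str]:
    """Yield chunks to the client, then record one log entry when the stream ends."""
    start = time.perf_counter()
    parts: list[str] = []
    try:
        for chunk in chunks:
            parts.append(chunk)
            yield chunk
    finally:
        # Runs on normal completion and when a started generator is closed early.
        text = "".join(parts)
        elapsed_ms = (time.perf_counter() - start) * 1000
        save(StreamLog(endpoint, text, count_tokens(text), elapsed_ms))
```

&lt;p&gt;Because the logging sits in a &lt;code&gt;finally&lt;/code&gt; block, a partially delivered response is still recorded with the text that was actually sent.&lt;/p&gt;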




&lt;h2&gt;
  
  
  Cost Tracking and Latency Monitoring
&lt;/h2&gt;

&lt;p&gt;For regular &lt;code&gt;/ai/test&lt;/code&gt; responses, token usage was straightforward to capture.&lt;/p&gt;

&lt;p&gt;For streaming, it required a bit more work.&lt;/p&gt;

&lt;p&gt;I refactored the provider layer so the streaming flow could still capture usage data at the end, then calculate an estimated cost and store it together with the final response log.&lt;/p&gt;

&lt;p&gt;That gave me a much more useful log structure.&lt;/p&gt;

&lt;p&gt;Instead of only storing “prompt” and “response,” I could now see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how many tokens were used&lt;/li&gt;
&lt;li&gt;approximately how much the request cost&lt;/li&gt;
&lt;li&gt;how long the request took&lt;/li&gt;
&lt;li&gt;which endpoint generated it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a much stronger foundation for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cost visibility&lt;/li&gt;
&lt;li&gt;future billing models&lt;/li&gt;
&lt;li&gt;usage analytics&lt;/li&gt;
&lt;li&gt;performance monitoring&lt;/li&gt;
&lt;/ul&gt;
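&lt;p&gt;Token-based cost estimation is simple arithmetic: tokens divided by 1,000, multiplied by a per-1K price. The sketch below assumes a single blended price; &lt;code&gt;PRICE_PER_1K_TOKENS&lt;/code&gt; is a placeholder value, not a real OpenAI price, and &lt;code&gt;estimate_cost&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
from decimal import Decimal

# Placeholder price per 1,000 tokens in USD; real prices vary by model.
PRICE_PER_1K_TOKENS = Decimal("0.002")


def estimate_cost(total_tokens: int, price_per_1k: Decimal = PRICE_PER_1K_TOKENS) -> Decimal:
    """Return the estimated cost in USD for a given token count."""
    return (Decimal(total_tokens) / Decimal(1000)) * price_per_1k
```

&lt;p&gt;Using &lt;code&gt;Decimal&lt;/code&gt; rather than &lt;code&gt;float&lt;/code&gt; avoids rounding surprises when many small per-request costs are summed later.&lt;/p&gt;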




&lt;h2&gt;
  
  
  One Practical Issue I Hit
&lt;/h2&gt;

&lt;p&gt;Alembic autogeneration tried to include unrelated schema changes while I was extending the logging table.&lt;/p&gt;

&lt;p&gt;It detected the new columns I wanted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;estimated_cost&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;response_time_ms&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;endpoint&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But it also tried to remove an unrelated &lt;code&gt;documents&lt;/code&gt; table.&lt;/p&gt;

&lt;p&gt;That was a good reminder that migration generation is helpful, but not magical.&lt;/p&gt;

&lt;p&gt;I manually cleaned the migration so it only included the actual intended schema change.&lt;/p&gt;

&lt;p&gt;That was one of those small but very real backend moments:&lt;br&gt;
not “how do I make the feature work?”&lt;br&gt;
but “how do I make the change safe?”&lt;/p&gt;




&lt;h2&gt;
  
  
  Current Logging Model
&lt;/h2&gt;

&lt;p&gt;The request log now stores:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;prompt&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;response&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;total_tokens&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;estimated_cost&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;response_time_ms&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;endpoint&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;user_id&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;created_at&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes the backend feel much more production-aware than a simple AI demo.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Project Is Really About
&lt;/h2&gt;

&lt;p&gt;The most important thing I learned is that AI backend work is not just model integration.&lt;/p&gt;

&lt;p&gt;It is also about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structure&lt;/li&gt;
&lt;li&gt;safety&lt;/li&gt;
&lt;li&gt;logging&lt;/li&gt;
&lt;li&gt;reproducibility&lt;/li&gt;
&lt;li&gt;monitoring&lt;/li&gt;
&lt;li&gt;extension paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Calling an API is easy.&lt;/p&gt;

&lt;p&gt;Building something that behaves predictably when it grows is the harder part.&lt;/p&gt;

&lt;p&gt;That is what I wanted this project to reflect.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Would Add Next
&lt;/h2&gt;

&lt;p&gt;The next natural steps would be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JWT authentication&lt;/li&gt;
&lt;li&gt;token quota control&lt;/li&gt;
&lt;li&gt;admin-facing usage analytics&lt;/li&gt;
&lt;li&gt;Stripe integration&lt;/li&gt;
&lt;li&gt;richer retrieval strategies&lt;/li&gt;
&lt;li&gt;vector-based search when the use case really needs it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But even in its current form, the backend already demonstrates something important:&lt;/p&gt;

&lt;p&gt;AI features become much more valuable when they are treated as backend systems, not just model calls.&lt;/p&gt;




&lt;h2&gt;
  
  
  Repository
&lt;/h2&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/hiro-kuroe/fastapi-ai-core" rel="noopener noreferrer"&gt;fastapi-ai-core&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repository includes Docker setup, logging examples, and a lightweight context-retrieval flow.&lt;/p&gt;

&lt;p&gt;If you are building AI-enabled backend systems with FastAPI, I think this kind of structure is worth caring about early.&lt;/p&gt;

&lt;p&gt;Because once usage grows, observability stops being a nice-to-have.&lt;br&gt;
It becomes part of the product itself.&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>python</category>
      <category>ai</category>
      <category>openai</category>
    </item>
    <item>
      <title>Building a Production-Ready AI Backend with FastAPI and OpenAI</title>
      <dc:creator>fastapier (Freelance Backend)</dc:creator>
      <pubDate>Wed, 18 Mar 2026 22:16:54 +0000</pubDate>
      <link>https://dev.to/fastapier/building-a-production-ready-ai-backend-with-fastapi-and-openai-2hna</link>
      <guid>https://dev.to/fastapier/building-a-production-ready-ai-backend-with-fastapi-and-openai-2hna</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Many developers today use ChatGPT.&lt;/p&gt;

&lt;p&gt;But in real-world systems, the real value is not using AI —&lt;br&gt;&lt;br&gt;
it's &lt;strong&gt;integrating AI into a reliable backend system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Connecting to the OpenAI API is easy.&lt;br&gt;&lt;br&gt;
However, in production, you quickly run into real problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slow responses causing user drop-off&lt;/li&gt;
&lt;li&gt;Uncontrolled token usage and unpredictable costs&lt;/li&gt;
&lt;li&gt;AI logic becoming a black box&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This project focuses on solving those issues by building a&lt;br&gt;&lt;br&gt;
&lt;strong&gt;manageable, production-oriented AI backend&lt;/strong&gt; using FastAPI.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;A simple but practical AI backend API:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI-based endpoint&lt;/li&gt;
&lt;li&gt;OpenAI API integration&lt;/li&gt;
&lt;li&gt;Clean JSON response design&lt;/li&gt;
&lt;li&gt;Dockerized for environment consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example endpoint:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /ai/test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Request:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "prompt": "Explain FastAPI and AI integration"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Response:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "result": "..."
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI&lt;/li&gt;
&lt;li&gt;OpenAI API&lt;/li&gt;
&lt;li&gt;SQLAlchemy (for logging design)&lt;/li&gt;
&lt;li&gt;Docker&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Implementation Points
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Secure API Key Management
&lt;/h3&gt;

&lt;p&gt;The OpenAI API key is handled via environment variables:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=your_api_key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
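&lt;p&gt;On the Python side, it helps to read the key once at startup and fail fast if it is missing, which also makes Docker misconfiguration obvious. This is a minimal sketch; &lt;code&gt;load_api_key&lt;/code&gt; and &lt;code&gt;MissingAPIKeyError&lt;/code&gt; are hypothetical names, not part of the repository.&lt;/p&gt;

```python
import os


class MissingAPIKeyError(RuntimeError):
    """Raised when the OpenAI API key is not configured."""


def load_api_key(env=None) -> str:
    """Return the API key from the environment, failing fast if it is absent."""
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise MissingAPIKeyError("OPENAI_API_KEY is not set")
    return key
```

&lt;p&gt;The optional &lt;code&gt;env&lt;/code&gt; argument exists only so the function can be tested with a plain dictionary.&lt;/p&gt;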




&lt;h3&gt;
  
  
  2. Fully Asynchronous Design
&lt;/h3&gt;

&lt;p&gt;The backend uses async/await so that waiting on an AI response does not block other requests.&lt;/p&gt;

&lt;p&gt;This ensures the system remains responsive under concurrent requests.&lt;/p&gt;
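&lt;p&gt;The effect is easy to see with a simulated call. In the sketch below, &lt;code&gt;fake_ai_call&lt;/code&gt; is a hypothetical stand-in for a network request to a model provider; three calls that each wait 0.2 seconds finish in roughly the time of one, because the event loop is free while each call is waiting.&lt;/p&gt;

```python
import asyncio
import time


async def fake_ai_call(prompt: str, delay: float = 0.2) -> str:
    """Stand-in for a network call to a model provider (hypothetical)."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return f"echo: {prompt}"


async def handle_many(prompts: list[str]) -> tuple[list[str], float]:
    """Run all calls concurrently and report the wall-clock time taken."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_ai_call(p) for p in prompts))
    return list(results), time.perf_counter() - start
```

&lt;p&gt;Calling &lt;code&gt;asyncio.run(handle_many(["a", "b", "c"]))&lt;/code&gt; returns the three answers in order together with the elapsed time. The benefit only holds if the provider client itself is awaited; a blocking call inside an &lt;code&gt;async def&lt;/code&gt; would still stall the loop.&lt;/p&gt;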




&lt;h3&gt;
  
  
  3. Clean Response Structure
&lt;/h3&gt;

&lt;p&gt;The API returns a simple JSON format, making it easy to integrate with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend applications&lt;/li&gt;
&lt;li&gt;External services&lt;/li&gt;
&lt;li&gt;Automation pipelines&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4. Dockerized Environment
&lt;/h3&gt;

&lt;p&gt;To eliminate environment inconsistencies:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build -t fastapi-ai .
docker run -p 8000:8000 fastapi-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Challenges
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Handling environment variables inside Docker&lt;/li&gt;
&lt;li&gt;Debugging API key issues&lt;/li&gt;
&lt;li&gt;Differences between local and container execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are common pitfalls when moving from “it works locally” to production.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Philosophy
&lt;/h2&gt;

&lt;p&gt;The goal is not just to "use AI", but to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;build AI as a controllable backend component&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Key principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API-first design (modular and reusable)&lt;/li&gt;
&lt;li&gt;Async processing (scalable)&lt;/li&gt;
&lt;li&gt;Dockerized deployment (reproducible)&lt;/li&gt;
&lt;li&gt;Logging-ready structure (cost &amp;amp; monitoring)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🛠 Roadmap (Toward SaaS)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Streaming responses (real-time UX)&lt;/li&gt;
&lt;li&gt;Usage tracking (token-level logging per user)&lt;/li&gt;
&lt;li&gt;JWT authentication integration&lt;/li&gt;
&lt;li&gt;RAG-based knowledge integration&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This setup is intentionally simple, but designed with production in mind.&lt;/p&gt;

&lt;p&gt;It demonstrates how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Treat AI as part of your backend architecture&lt;/li&gt;
&lt;li&gt;Control cost and performance&lt;/li&gt;
&lt;li&gt;Build systems that can scale beyond prototypes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Moving from "using AI" to &lt;strong&gt;"integrating AI into real systems"&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
is a key step for backend engineers today.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔗 Source Code
&lt;/h2&gt;

&lt;p&gt;GitHub Repository:&lt;br&gt;
&lt;a href="https://github.com/hiro-kuroe/fastapi-ai-core" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-ai-core&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This repository is part of a series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authentication (JWT)
&lt;/li&gt;
&lt;li&gt;Payment Integration (Stripe)
&lt;/li&gt;
&lt;li&gt;AI Backend (this project)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Together, they form a production-ready backend foundation.&lt;/p&gt;




&lt;p&gt;If you're building an AI-powered product and facing issues like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API rate limits (429 errors)
&lt;/li&gt;
&lt;li&gt;unstable batch processing
&lt;/li&gt;
&lt;li&gt;usage tracking / token logging
&lt;/li&gt;
&lt;li&gt;OpenAI integration problems
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I specialize in fixing and stabilizing FastAPI backends.&lt;/p&gt;

&lt;p&gt;Feel free to check the repository or contact me directly.&lt;/p&gt;

&lt;p&gt;📩 &lt;a href="mailto:fastapienne@gmail.com"&gt;fastapienne@gmail.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>openai</category>
      <category>ai</category>
      <category>saas</category>
    </item>
    <item>
      <title>Building a Stripe Subscription Backend with FastAPI</title>
      <dc:creator>fastapier (Freelance Backend)</dc:creator>
      <pubDate>Sun, 08 Mar 2026 13:28:20 +0000</pubDate>
      <link>https://dev.to/fastapier/building-a-stripe-subscription-backend-with-fastapi-3n3</link>
      <guid>https://dev.to/fastapier/building-a-stripe-subscription-backend-with-fastapi-3n3</guid>
      <description>&lt;p&gt;Many Stripe tutorials stop at &lt;strong&gt;Checkout integration&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But real SaaS products require more than that.&lt;/p&gt;

&lt;p&gt;A production subscription backend must handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;subscription state management&lt;/li&gt;
&lt;li&gt;webhook processing&lt;/li&gt;
&lt;li&gt;access control&lt;/li&gt;
&lt;li&gt;expiration logic&lt;/li&gt;
&lt;li&gt;duplicate webhook protection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To explore this architecture, I built a small project:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FastAPI Revenue Core&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Repository&lt;br&gt;
&lt;a href="https://github.com/hiro-kuroe/fastapi-revenue-core" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-revenue-core&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This project demonstrates a minimal &lt;strong&gt;SaaS-style subscription backend&lt;/strong&gt; using FastAPI and Stripe.&lt;/p&gt;


&lt;h1&gt;
  
  
  What This Project Implements
&lt;/h1&gt;

&lt;p&gt;The backend includes the essential components required for subscription-based services.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JWT authentication&lt;/li&gt;
&lt;li&gt;Stripe Checkout integration&lt;/li&gt;
&lt;li&gt;Stripe Webhook processing&lt;/li&gt;
&lt;li&gt;Subscription state engine&lt;/li&gt;
&lt;li&gt;Automatic expiration logic&lt;/li&gt;
&lt;li&gt;Docker deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was to build a &lt;strong&gt;reusable revenue backend foundation&lt;/strong&gt; that could power subscription products.&lt;/p&gt;


&lt;h1&gt;
  
  
  Architecture
&lt;/h1&gt;

&lt;p&gt;The system is intentionally simple.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client
   ↓
FastAPI API
   ↓
Stripe Checkout
   ↓
Stripe Webhook
   ↓
Subscription Status Engine
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stripe handles payment processing.&lt;/p&gt;

&lt;p&gt;The FastAPI backend manages &lt;strong&gt;user access state&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This separation keeps payment logic clean and allows the API to control feature access.&lt;/p&gt;




&lt;h1&gt;
  
  
  Subscription State Engine
&lt;/h1&gt;

&lt;p&gt;Each user has a subscription status stored in the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FREE
PRO
EXPIRED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stripe webhook events update these states.&lt;/p&gt;

&lt;p&gt;Example transitions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;subscription.created
FREE → PRO

subscription.deleted
PRO → EXPIRED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This state engine ensures the backend always knows whether a user has access to paid features.&lt;/p&gt;
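&lt;p&gt;A state engine like this can be as small as a lookup table from event type to resulting status. The sketch below is illustrative only: &lt;code&gt;Status&lt;/code&gt;, &lt;code&gt;EVENT_TO_STATUS&lt;/code&gt; and &lt;code&gt;apply_event&lt;/code&gt; are hypothetical names, and the event types are the two Stripe events this article handles.&lt;/p&gt;

```python
from enum import Enum


class Status(str, Enum):
    FREE = "FREE"
    PRO = "PRO"
    EXPIRED = "EXPIRED"


# Maps a Stripe event type to the status a user should have afterwards.
EVENT_TO_STATUS = {
    "customer.subscription.created": Status.PRO,
    "customer.subscription.deleted": Status.EXPIRED,
}


def apply_event(current: Status, event_type: str) -> Status:
    """Return the new status; unknown event types leave the status unchanged."""
    return EVENT_TO_STATUS.get(event_type, current)
```

&lt;p&gt;Keeping the transitions in one table makes it easy to review which events can change access, and new event types can be added without touching the handler.&lt;/p&gt;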




&lt;h1&gt;
  
  
  Handling Subscription Expiration
&lt;/h1&gt;

&lt;p&gt;Stripe provides the timestamp:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;current_period_end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example extraction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;period_end_ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current_period_end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This timestamp is stored in the database and used to determine whether the user's subscription has expired.&lt;/p&gt;

&lt;p&gt;Whenever a protected API endpoint is accessed, the backend checks the expiration timestamp.&lt;/p&gt;

&lt;p&gt;If the subscription is no longer valid:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PRO → EXPIRED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps access control correct even if a webhook arrives late or is missed.&lt;/p&gt;




&lt;h1&gt;
  
  
  Webhook Design
&lt;/h1&gt;

&lt;p&gt;Stripe webhooks are essential for subscription systems.&lt;/p&gt;

&lt;p&gt;The backend processes events such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;customer.subscription.created
customer.subscription.deleted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The webhook updates user subscription status and expiration timestamps.&lt;/p&gt;

&lt;p&gt;Because Stripe may resend webhook events, the backend design supports &lt;strong&gt;idempotent event handling&lt;/strong&gt; to avoid duplicate processing.&lt;/p&gt;
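&lt;p&gt;Idempotent handling usually comes down to remembering which event IDs have already been processed. The sketch below keeps them in memory purely for illustration; &lt;code&gt;WebhookProcessor&lt;/code&gt; is a hypothetical class, and a real deployment would more likely store the IDs in the database behind a unique constraint.&lt;/p&gt;

```python
from typing import Callable


class WebhookProcessor:
    """Processes each event ID at most once (in-memory sketch)."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._seen: set[str] = set()

    def process(self, event: dict) -> bool:
        """Run the handler for new events; return False for duplicates."""
        event_id = event["id"]
        if event_id in self._seen:
            return False
        self._handler(event)
        self._seen.add(event_id)  # mark only after the handler succeeded
        return True
```

&lt;p&gt;Marking the event as seen only after the handler succeeds means a failed attempt can be retried when Stripe resends the event.&lt;/p&gt;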




&lt;h1&gt;
  
  
  Running the Project
&lt;/h1&gt;

&lt;p&gt;Clone the repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/hiro-kuroe/fastapi-revenue-core
cd fastapi-revenue-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;uvicorn app.main:app --reload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Swagger documentation will be available at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8000/docs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h1&gt;
  
  
  Running with Docker
&lt;/h1&gt;

&lt;p&gt;The project also supports Docker deployment.&lt;/p&gt;

&lt;p&gt;Build the image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build -t fastapi-revenue-core .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -p 8000:8000 fastapi-revenue-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h1&gt;
  
  
  What This Project Demonstrates
&lt;/h1&gt;

&lt;p&gt;This repository demonstrates a minimal backend architecture for subscription-based services.&lt;/p&gt;

&lt;p&gt;It combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI&lt;/li&gt;
&lt;li&gt;Stripe subscription billing&lt;/li&gt;
&lt;li&gt;webhook processing&lt;/li&gt;
&lt;li&gt;access control logic&lt;/li&gt;
&lt;li&gt;Docker deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This type of backend is commonly used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SaaS products&lt;/li&gt;
&lt;li&gt;paid API services&lt;/li&gt;
&lt;li&gt;membership platforms&lt;/li&gt;
&lt;li&gt;subscription-based applications&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  Repository
&lt;/h1&gt;

&lt;p&gt;Full source code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/hiro-kuroe/fastapi-revenue-core" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-revenue-core&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;Stripe integration tutorials often focus only on payment processing.&lt;/p&gt;

&lt;p&gt;But real subscription systems require:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;backend state management&lt;/li&gt;
&lt;li&gt;expiration handling&lt;/li&gt;
&lt;li&gt;webhook reliability&lt;/li&gt;
&lt;li&gt;API access control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This project demonstrates how these pieces can be combined into a simple but practical backend architecture.&lt;/p&gt;

&lt;p&gt;If you're building a FastAPI SaaS backend or working with Stripe subscriptions, feel free to check out the repository.&lt;/p&gt;




&lt;h1&gt;
  
  
  Incident Intake
&lt;/h1&gt;

&lt;p&gt;If you are experiencing issues with Stripe payments or subscription systems,&lt;br&gt;&lt;br&gt;
you can submit a diagnosis request through the intake form below.&lt;/p&gt;

&lt;p&gt;Typical problems include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stripe Webhook not triggering
&lt;/li&gt;
&lt;li&gt;Subscription status not updating
&lt;/li&gt;
&lt;li&gt;Checkout succeeds but user access does not change
&lt;/li&gt;
&lt;li&gt;Cancelled subscriptions still retaining access
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Submit an incident report here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/hiro-kuroe/fastapi-revenue-core/blob/main/Intake-en.md" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-revenue-core/blob/main/Intake-en.md&lt;/a&gt;&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>stripe</category>
      <category>python</category>
      <category>webdev</category>
    </item>
    <item>
      <title>401 Is Not the Bug. It’s the Signal.</title>
      <dc:creator>fastapier (Freelance Backend)</dc:creator>
      <pubDate>Sat, 21 Feb 2026 12:56:21 +0000</pubDate>
      <link>https://dev.to/fastapier/401-is-not-the-bug-its-the-signal-3am0</link>
      <guid>https://dev.to/fastapier/401-is-not-the-bug-its-the-signal-3am0</guid>
      <description>&lt;p&gt;You fixed the endpoint.&lt;br&gt;
You rewrote the dependency.&lt;br&gt;
You regenerated the token.&lt;/p&gt;

&lt;p&gt;Still 401.&lt;/p&gt;

&lt;p&gt;Here’s the uncomfortable truth:&lt;/p&gt;

&lt;p&gt;401 is not the root cause.&lt;br&gt;
It’s the signal that something deeper is inconsistent.&lt;/p&gt;

&lt;p&gt;In FastAPI authentication flows, 401 usually appears when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;SECRET_KEY&lt;/code&gt; used to sign the token is not the one used to verify it&lt;/li&gt;
&lt;li&gt;Docker injects a different &lt;code&gt;.env&lt;/code&gt; than your local environment&lt;/li&gt;
&lt;li&gt;Multiple instances are running with inconsistent configurations&lt;/li&gt;
&lt;li&gt;The token algorithm (HS256 / RS256) does not match&lt;/li&gt;
&lt;li&gt;Clock drift invalidates the token timestamp&lt;/li&gt;
&lt;/ul&gt;
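&lt;p&gt;The first cause above is easy to reproduce with nothing but the standard library. The sketch below hand-rolls HS256 signing to show the mechanism; it is not how a real application should issue tokens (use a maintained JWT library for that), and &lt;code&gt;sign_hs256&lt;/code&gt;, &lt;code&gt;verify_hs256&lt;/code&gt; and the two secret values are made up for illustration.&lt;/p&gt;

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Create a compact header.payload.signature token signed with HMAC-SHA256."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode("ascii")
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"


def verify_hs256(token: str, secret: str) -> bool:
    """Recompute the signature with `secret` and compare in constant time."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode("ascii")
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)
```

&lt;p&gt;A token signed with one secret verifies with that secret and fails with any other, even though the token itself is perfectly well-formed. That is exactly the situation where /token succeeds in one process and /me rejects the result in another.&lt;/p&gt;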

&lt;p&gt;The controller is fine.&lt;br&gt;
The route is fine.&lt;br&gt;
The dependency is fine.&lt;/p&gt;

&lt;p&gt;The layers are not aligned.&lt;/p&gt;

&lt;p&gt;Authentication is not just code.&lt;br&gt;
It’s configuration.&lt;br&gt;
It’s environment.&lt;br&gt;
It’s deployment consistency.&lt;/p&gt;

&lt;p&gt;When /token works but /me returns 401,&lt;br&gt;
your application is telling you:&lt;/p&gt;

&lt;p&gt;“The layers don’t agree.”&lt;/p&gt;

&lt;p&gt;Stop fixing the endpoint.&lt;/p&gt;

&lt;p&gt;Start mapping the layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment variables&lt;/li&gt;
&lt;li&gt;Key consistency&lt;/li&gt;
&lt;li&gt;Container configuration&lt;/li&gt;
&lt;li&gt;Token structure&lt;/li&gt;
&lt;li&gt;Deployment topology&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;401 is not your enemy.&lt;/p&gt;

&lt;p&gt;It’s the signal that your architecture is out of sync.&lt;/p&gt;

&lt;p&gt;Treat it as a bug, and you’ll chase symptoms.&lt;br&gt;
Treat it as a signal, and you’ll repair the architecture.  &lt;/p&gt;




&lt;p&gt;I built a reproducible playground for this type of incident:&lt;br&gt;
&lt;a href="https://github.com/hiro-kuroe/fastapi-auth-crud-docker" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-auth-crud-docker&lt;/a&gt;&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>jwt</category>
      <category>authentication</category>
      <category>python</category>
    </item>
    <item>
      <title>Your API Is Replying… But Authentication Is Already Broken (FastAPI JWT)</title>
      <dc:creator>fastapier (Freelance Backend)</dc:creator>
      <pubDate>Fri, 20 Feb 2026 15:00:41 +0000</pubDate>
      <link>https://dev.to/fastapier/your-api-is-replying-but-authentication-is-already-broken-fastapi-jwt-333e</link>
      <guid>https://dev.to/fastapier/your-api-is-replying-but-authentication-is-already-broken-fastapi-jwt-333e</guid>
      <description>&lt;h2&gt;
  
  
  🔥 Opening
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;/token works.&lt;/p&gt;

&lt;p&gt;You receive a JWT.&lt;/p&gt;

&lt;p&gt;Swagger shows “Authorized”.&lt;/p&gt;

&lt;p&gt;Everything looks correct.&lt;/p&gt;

&lt;p&gt;And then —&lt;/p&gt;

&lt;p&gt;/me returns 401.&lt;/p&gt;

&lt;p&gt;You check the endpoint.&lt;/p&gt;

&lt;p&gt;You rewrite the dependency.&lt;/p&gt;

&lt;p&gt;You add print logs.&lt;/p&gt;

&lt;p&gt;Nothing changes.&lt;/p&gt;

&lt;p&gt;The API is replying.&lt;/p&gt;

&lt;p&gt;But authentication is already broken.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧩 Why This Happens
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Most of the time, the endpoint is not the problem.&lt;/p&gt;

&lt;p&gt;The token is valid.&lt;/p&gt;

&lt;p&gt;The route is correct.&lt;/p&gt;

&lt;p&gt;The dependency is wired properly.&lt;/p&gt;

&lt;p&gt;The failure happens one layer deeper.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Common structural causes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Different &lt;code&gt;SECRET_KEY&lt;/code&gt; between environments&lt;/li&gt;
&lt;li&gt;Docker using a different &lt;code&gt;.env&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;Multiple instances running with inconsistent configs&lt;/li&gt;
&lt;li&gt;Clock drift causing token validation issues&lt;/li&gt;
&lt;li&gt;Algorithm mismatch (HS256 vs RS256)&lt;/li&gt;
&lt;/ul&gt;
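&lt;p&gt;Clock drift is the least obvious item in that list, so here is a small sketch of how it produces a rejection. It assumes the token carries the standard &lt;code&gt;exp&lt;/code&gt; and &lt;code&gt;nbf&lt;/code&gt; claims as Unix timestamps; &lt;code&gt;is_time_valid&lt;/code&gt; is a hypothetical helper, not a library function.&lt;/p&gt;

```python
def is_time_valid(claims: dict, now: float, leeway: float = 0.0) -> bool:
    """Check the exp and nbf claims against `now`, allowing `leeway` seconds of drift."""
    exp = claims.get("exp")
    nbf = claims.get("nbf")
    if exp is not None and now - leeway >= exp:
        return False  # token has expired from the verifier's point of view
    if nbf is not None and nbf > now + leeway:
        return False  # token is not valid yet from the verifier's point of view
    return True
```

&lt;p&gt;If the verifying instance's clock runs a few seconds behind the issuing one, a freshly issued token can look "not valid yet" and be rejected. A small leeway absorbs that difference, but the durable fix is to keep the clocks synchronized.&lt;/p&gt;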

&lt;blockquote&gt;
&lt;p&gt;Nothing is “wrong” in the controller.&lt;/p&gt;

&lt;p&gt;The structure is inconsistent.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧠 The Real Problem
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;When authentication fails, most developers try to fix code.&lt;/p&gt;

&lt;p&gt;They debug the endpoint.&lt;/p&gt;

&lt;p&gt;They rewrite dependencies.&lt;/p&gt;

&lt;p&gt;They patch logic.&lt;/p&gt;

&lt;p&gt;But authentication is not just code.&lt;/p&gt;

&lt;p&gt;It is configuration.&lt;/p&gt;

&lt;p&gt;It is environment.&lt;/p&gt;

&lt;p&gt;It is instance consistency.&lt;/p&gt;

&lt;p&gt;It is key management.&lt;/p&gt;

&lt;p&gt;If those layers are misaligned,&lt;/p&gt;

&lt;p&gt;no endpoint fix will solve it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🛠 My Approach
&lt;/h2&gt;

&lt;p&gt;I specialize in structural FastAPI/JWT authentication incidents.&lt;/p&gt;

&lt;p&gt;I don’t start by modifying code.&lt;/p&gt;

&lt;p&gt;I map the layers first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment variables&lt;/li&gt;
&lt;li&gt;Key consistency&lt;/li&gt;
&lt;li&gt;Instance configuration&lt;/li&gt;
&lt;li&gt;Token structure&lt;/li&gt;
&lt;li&gt;Deployment differences&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Playground (reproducible setup): 👉 &lt;a href="https://github.com/hiro-kuroe/fastapi-auth-crud-docker" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-auth-crud-docker&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Modifying code only makes sense once the structure is clear.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏁 Closing
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;If your API keeps replying&lt;/p&gt;

&lt;p&gt;but authentication keeps failing,&lt;/p&gt;

&lt;p&gt;you may be debugging the wrong layer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I built a reproducible playground for this exact type of incident:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/hiro-kuroe/fastapi-auth-crud-docker" rel="noopener noreferrer"&gt;https://github.com/hiro-kuroe/fastapi-auth-crud-docker&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Analysis-first. Structure before patching.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is not a patch guide.&lt;br&gt;
It’s a structural diagnosis.&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>python</category>
      <category>jwt</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
