I spent the last week building DocuFlow, an event-driven data pipeline that automatically ingests PDF invoices and extracts structured financial data (Vendor, Date, Amount) using AI.
The architecture was solid:
Watcher Service to detect new files.
Redis & Celery for asynchronous task queues.
PostgreSQL for storage.
Google Gemini 2.5 Flash as the intelligence layer.
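To give a sense of how the pieces talk to each other, here is a minimal sketch of the queue wiring; the task and service names are illustrative, not the exact code from the repo:
Python
from celery import Celery

# The broker URL points at the Redis service from docker-compose
app = Celery("docuflow", broker="redis://redis:6379/0")

@app.task
def process_invoice(pdf_path):
    # The watcher enqueues this whenever a new PDF lands; the worker
    # OCRs the file, calls Gemini, and writes the result to PostgreSQL
    ...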
Everything worked perfectly on my local machine. But the moment I containerized it with Docker, everything crashed. 💥
Here is the story of how a "simple" SDK versioning error nearly killed the project, and how I fixed it by ripping out the library and going raw.
The Problem: "404 Model Not Found"
I was using the standard google-generativeai Python library. In my Dockerfile, I was installing the latest (unpinned) dependencies.
When I ran docker compose up, my worker service threw this error immediately:
Plaintext
Error: 404 models/gemini-1.5-flash is not found for API version v1beta, or is not supported for generateContent.
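The call that blew up was the completely standard SDK pattern, something like this (not my exact code):
Python
import os

import google.generativeai as genai

genai.configure(api_key=os.getenv("GEMINI_API_KEY"))

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Extract the Vendor, Date, and Total Amount from this OCR text: ..."
)
print(response.text)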
This made no sense. The model definitely exists. It worked on my laptop. Why was Docker failing?
The Root Cause
It turns out that Google is iterating on its GenAI SDKs so quickly that version mismatches are common.
My Docker container was pulling a slightly different version of the SDK than my local environment had.
That version was hitting the older v1beta API surface with a model name (gemini-1.5-flash) it no longer serves, and it didn't recognize the newer gemini-2.5-flash alias either.
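A quick way to confirm drift like this is to print the installed SDK version on the host and again inside the container; a different number on each side is the smoking gun:
Python
from importlib.metadata import version

# Run on the host, then inside the container (e.g. via docker compose exec)
print(version("google-generativeai"))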
I tried upgrading to the newer google-genai library, but that introduced its own conflicts with the other pinned packages in my slim Docker image.
I was stuck in dependency hell. 📉
The Solution: The "Raw" Approach
Instead of fighting pip and version numbers, I realized I didn't need the SDK. Under the hood, the SDK is just making HTTP requests.
So, I fired the SDK. 🚫📦
I rewrote the extraction engine using Python's standard requests library to hit the Gemini REST API directly. This gave me 100% control over the endpoint and the payload.
The Code
Here is the robust, Docker-proof implementation:
Python
import os
import requests


def parse_invoice_with_rest(text):
    api_key = os.getenv("GEMINI_API_KEY")

    # Call the REST endpoint directly; the model alias is pinned in the URL,
    # so no SDK default can silently swap it out from under me
    url = (
        "https://generativelanguage.googleapis.com/v1beta/models/"
        f"gemini-2.5-flash-latest:generateContent?key={api_key}"
    )

    # Construct the payload manually
    payload = {
        "contents": [{
            "parts": [{
                "text": f"Extract the Vendor, Date, and Total Amount from this OCR text: {text}. Return JSON only."
            }]
        }]
    }

    try:
        # Standard HTTP POST request, with a timeout so a hung call
        # can't stall the worker forever
        response = requests.post(
            url,
            headers={"Content-Type": "application/json"},
            json=payload,
            timeout=30,
        )
        if response.status_code != 200:
            print(f"Error: {response.text}")
            return None
        return response.json()
    except requests.RequestException as e:
        print(f"Connection failed: {e}")
        return None
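One note on consuming the result: the REST response nests the generated text a few levels deep, at candidates[0].content.parts[0].text. A small helper keeps the call sites clean (the defensive except is my own habit, since a blocked or empty response omits those keys):
Python
def extract_text(response_json):
    # Path matches the documented generateContent response shape
    try:
        return response_json["candidates"][0]["content"]["parts"][0]["text"]
    except (KeyError, IndexError, TypeError):
        return None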
Why this is better
Zero Dependency Hell: I don't care if Google updates their Python SDK tomorrow. As long as the REST endpoint exists, my code works.
Lighter Containers: I removed the heavy AI libraries from my requirements.txt, making my Docker image smaller.
Debuggability: When an error happens, I see the raw HTTP response code (400, 404, 500) instead of a cryptic Python stack trace.
The Result: Green Logs 🟢
After deploying the REST client, the pipeline processed the invoice in 1.8 seconds.
![Screenshot of green terminal logs showing successful extraction]

The full pipeline now runs smoothly in Docker Compose, handling OCR (Tesseract), queuing (Redis), and AI extraction without a single library conflict.
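If you are wiring up the same stack, the OCR step that feeds parse_invoice_with_rest is short. Here is a sketch assuming pdf2image and pytesseract; the helper name is illustrative and the repo's actual code may differ:
Python
from pdf2image import convert_from_path
import pytesseract

def ocr_pdf(pdf_path):
    # Render each PDF page to an image, then run Tesseract over it
    pages = convert_from_path(pdf_path)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)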
The Takeaway
If you are building AI agents in production—especially in containerized environments—don't be afraid to bypass the "official" SDKs. Sometimes, a simple curl or requests.post is the most robust engineering decision you can make.
Repo Link: https://github.com/Shashank0701-byte/docuflow
Let me know if you've faced similar "SDK Hell" with other AI providers! 👇