How I replaced multiple AI SaaS subscriptions with one simple self‑hosted gateway.
Why I Built This
Over the past few months I kept experimenting with different AI tools.
Chat apps. Coding assistants. Agent platforms. Most of them were impressive — but almost all of them required separate subscriptions.
After a while it became obvious that many of these tools are simply thin layers on top of the same foundation models.
So instead of paying for multiple platforms, I decided to build a single local AI stack that:
- Uses AWS Bedrock for model access
- Uses LiteLLM as a unified gateway
- Uses Open WebUI as a chat interface
- Connects to VS Code agents for coding
- Runs locally using Docker
The result ended up being cleaner than expected.
Architecture
The full flow looks like this:
Open WebUI / VS Code Agents
│
▼
LiteLLM Gateway
│
▼
AWS Bedrock
(Claude, DeepSeek, Qwen, etc)
The key idea is LiteLLM acting as a gateway.
LiteLLM exposes an OpenAI‑compatible API, which means almost every AI tool can connect to it without needing Bedrock‑specific integrations.
That single layer simplifies the entire ecosystem.
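To make that concrete, here is a minimal sketch of calling the gateway with nothing but the Python standard library. The port (4010) and the sk-dev key come from the steps later in this article; the payload is the standard OpenAI chat-completions shape, so the same code would work against any OpenAI-compatible endpoint.

```python
import json
import urllib.request

# Values from this article's setup -- adjust to your own gateway and key
GATEWAY = "http://localhost:4010/v1/chat/completions"
API_KEY = "sk-dev"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(model: str, prompt: str) -> str:
    """POST the payload to the LiteLLM gateway and return the reply text."""
    req = urllib.request.Request(
        GATEWAY,
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI response shape: choices[0].message.content
    return body["choices"][0]["message"]["content"]


# With the stack from this article running:
# reply = ask("us.deepseek.r1-v1:0", "Hello!")
```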
Step 1 — Project Structure
I started with a simple project directory.
AI/
└── litellm/
    ├── config.yaml
    ├── docker-compose.yml
    └── .env
LiteLLM runs alongside Redis and Postgres.
These are used for:
- caching
- usage tracking
- gateway state
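As a sketch of how the Redis side gets wired up: LiteLLM's response cache can be enabled in config.yaml (the keys below follow LiteLLM's caching docs; the host and port match the Compose file in the next step):

```yaml
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: redis
    port: 6379
```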
Step 2 — Docker Compose Setup
litellm/docker-compose.yml
services:
  postgres:
    image: postgres:16
    container_name: litellm-postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: litellm
      POSTGRES_DB: litellm
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "55432:5432"

  redis:
    image: redis:7
    container_name: litellm-redis
    restart: unless-stopped
    ports:
      - "56379:6379"

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    container_name: litellm
    restart: unless-stopped
    ports:
      - "4010:4000"
    volumes:
      - ./config.yaml:/app/config.yaml
    env_file:
      - .env
    environment:
      DATABASE_URL: postgres://litellm:litellm@postgres:5432/litellm
      REDIS_HOST: redis
      REDIS_PORT: 6379
    depends_on:
      - postgres
      - redis
    command: --config /app/config.yaml

volumes:
  postgres_data:
Step 3 — LiteLLM Configuration
config.yaml
general_settings:
  stream_response: true
  master_key: sk-admin
The master_key acts as the admin API key for the gateway. For anything beyond local experimentation, load it from .env rather than hard-coding it in the config.
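Models can also be declared directly in config.yaml instead of through the dashboard (Step 6 below). A sketch using LiteLLM's model_list format, with the same Bedrock model ID used later in this article:

```yaml
model_list:
  - model_name: us.deepseek.r1-v1:0        # name clients see
    litellm_params:
      model: bedrock/us.deepseek.r1-v1:0   # provider-prefixed Bedrock model ID
      aws_region_name: us-east-1
```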
Step 4 — Start LiteLLM
From inside the litellm directory:
docker compose up -d
Once everything starts, the LiteLLM dashboard is available at:
http://localhost:4010
Step 5 — Add AWS Bedrock Credentials
Inside the LiteLLM dashboard navigate to:
Models + Endpoints → LLM Credentials
Then click:
Add Credential
Select provider:
Amazon Bedrock
Fill the required fields:
- AWS Access Key ID
- AWS Secret Access Key
- AWS Region
Example region:
us-east-1
Once saved, LiteLLM can start communicating with Bedrock.
Step 6 — Register Bedrock Models
Next I added the models I wanted LiteLLM to expose.
Navigate to:
Models + Endpoints → Add Model
Provider:
Amazon Bedrock
Example model:
us.deepseek.r1-v1:0
Mapping configuration:
Public Model Name: us.deepseek.r1-v1:0
LiteLLM Model Name: us.deepseek.r1-v1:0
Select your credential and click:
Test Connect
If everything is configured correctly you should see:
Connection successful
Then click:
Add Model
I repeated this for multiple Bedrock models.
Step 7 — Create an API Key
LiteLLM allows generating API keys for client applications.
Navigate to:
Virtual Keys
Create a key such as:
sk-dev
This key will be used by tools like Open WebUI or VS Code agents.
Step 8 — Run Open WebUI
To get a ChatGPT‑style interface I used Open WebUI.
Run it with Docker (the volume flag persists chat history across restarts; the add-host flag lets the container reach the gateway on the host, which Docker on Linux doesn't map by default):

docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name openwebui \
  ghcr.io/open-webui/open-webui:main
Open the interface:
http://localhost:3000
Create an account on first launch.
Step 9 — Connect Open WebUI to LiteLLM
Inside Open WebUI go to:
Settings → Connections → OpenAI API
Configure:
Base URL
http://host.docker.internal:4010/v1
API Key
sk-dev
Note that because Open WebUI runs in its own container, localhost would point at the container itself; host.docker.internal reaches the LiteLLM gateway running on the host.
After saving, Open WebUI automatically loads all models registered in LiteLLM.
Final Result
At this point the stack looks like this:
Open WebUI
↓
LiteLLM Gateway
↓
AWS Bedrock Models
From a single interface I can now:
- switch between models
- test prompts
- track token usage
- monitor costs
Using It with Coding Agents
Because LiteLLM exposes an OpenAI-compatible API, it integrates directly with developer tools.
For example in VS Code tools like:
- Continue.dev
- OpenCode
Configuration simply requires:
Base URL: http://localhost:4010
API Key: sk-dev
This lets the same Bedrock models power coding workflows.
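For example, Continue.dev's classic config.json (newer versions use a config.yaml instead) would point its openai provider at the gateway like this; the title is just a display label:

```json
{
  "models": [
    {
      "title": "Bedrock via LiteLLM",
      "provider": "openai",
      "model": "us.deepseek.r1-v1:0",
      "apiBase": "http://localhost:4010/v1",
      "apiKey": "sk-dev"
    }
  ]
}
```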
Why This Setup Works Well
A few reasons I ended up liking this architecture:
- No vendor lock‑in
- Pay only for inference
- One API for multiple models
- Works with most AI tooling
- Fully self‑hosted gateway
LiteLLM effectively becomes the central router for every AI tool I use.
Closing Thoughts
The AI tooling ecosystem moves extremely fast. Most products are simply wrappers around the same models.
Building a small modular stack turned out to be more flexible than relying on several separate platforms.
Now I have:
- a local chat interface
- coding agents inside my editor
- access to Bedrock models
- a single gateway controlling everything
All running locally with Docker while using AWS only for inference.
If you're experimenting with AI development, agents, or multi‑model workflows, this setup is a solid foundation.
If you build something similar or improve this stack, I'd love to see how others are approaching it.