DEV Community

Nitesh Kumar

Building a Local AI Development Stack with LiteLLM + AWS Bedrock + Open WebUI

How I replaced multiple AI SaaS subscriptions with one simple self‑hosted gateway.


Why I Built This

Over the past few months I kept experimenting with different AI tools.

Chat apps. Coding assistants. Agent platforms. Most of them were impressive — but almost all of them required separate subscriptions.

After a while it became obvious that many of these tools are simply thin layers on top of the same foundation models.

So instead of paying for multiple platforms, I decided to build a single local AI stack that:

  • Uses AWS Bedrock for model access
  • Uses LiteLLM as a unified gateway
  • Uses Open WebUI as a chat interface
  • Connects to VS Code agents for coding
  • Runs locally using Docker

The result ended up being cleaner than expected.


Architecture

The full flow looks like this:

Open WebUI / VS Code Agents
            │
            ▼
        LiteLLM Gateway
            │
            ▼
        AWS Bedrock
 (Claude, DeepSeek, Qwen, etc.)

The key idea is LiteLLM acting as a gateway.

LiteLLM exposes an OpenAI‑compatible API, which means almost every AI tool can connect to it without needing Bedrock‑specific integrations.

That single layer simplifies the entire ecosystem.
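To make "OpenAI-compatible" concrete, here is a minimal sketch of the request any client ends up sending to the gateway. It uses only the standard library, and the base URL, key, and model name assume the values configured in the later steps (port 4010, a virtual key named sk-dev, and one registered Bedrock model):

```python
import json
import urllib.request

# Assumed from this setup: gateway on port 4010, virtual key "sk-dev",
# and a Bedrock model registered under this public name.
BASE_URL = "http://localhost:4010"
API_KEY = "sk-dev"
MODEL = "us.deepseek.r1-v1:0"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style /v1/chat/completions request."""
    payload = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_chat_request("Say hello in one word.")
# With the stack running, send it with:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because this is the same wire format every OpenAI client library speaks, swapping a tool from OpenAI to this gateway is usually just a base-URL and API-key change.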


Step 1 — Project Structure

I started with a simple project directory.

AI/
 └── litellm/
     ├── config.yaml
     ├── docker-compose.yml
     └── .env

LiteLLM runs alongside Redis and Postgres.

These are used for:

  • caching
  • usage tracking
  • gateway state

Step 2 — Docker Compose Setup

litellm/docker-compose.yml

services:

  postgres:
    image: postgres:16
    container_name: litellm-postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: litellm
      POSTGRES_DB: litellm
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "55432:5432"

  redis:
    image: redis:7
    container_name: litellm-redis
    restart: unless-stopped
    ports:
      - "56379:6379"

  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    container_name: litellm
    restart: unless-stopped
    ports:
      - "4010:4000"
    volumes:
      - ./config.yaml:/app/config.yaml
    env_file:
      - .env
    environment:
      DATABASE_URL: postgres://litellm:litellm@postgres:5432/litellm
      REDIS_HOST: redis
      REDIS_PORT: 6379
    depends_on:
      - postgres
      - redis
    command: --config /app/config.yaml

volumes:
  postgres_data:

Step 3 — LiteLLM Configuration

config.yaml

general_settings:
  stream_response: true
  master_key: sk-admin  # for real deployments, reference an env var instead, e.g. os.environ/LITELLM_MASTER_KEY

The master_key acts as the admin API key for the gateway.


Step 4 — Start LiteLLM

From inside the litellm directory:

docker compose up -d

Once everything starts, the LiteLLM dashboard is available at:

http://localhost:4010

Step 5 — Add AWS Bedrock Credentials

Inside the LiteLLM dashboard navigate to:

Models + Endpoints → LLM Credentials

Then click:

Add Credential

Select provider:

Amazon Bedrock

Fill the required fields:

  • AWS Access Key ID
  • AWS Secret Access Key
  • AWS Region

Example region:

us-east-1

Once saved, LiteLLM can start communicating with Bedrock.
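The dashboard stores these credentials in LiteLLM's database. A common alternative is to keep them out of the UI entirely and put them in the .env file from Step 1, since the compose file already loads it — a sketch, using the standard AWS variable names that LiteLLM picks up from the environment (the values here are placeholders):

```
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_REGION_NAME=us-east-1
```

Either way, the credentials never need to be pasted into any of the client tools — only the gateway talks to AWS.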


Step 6 — Register Bedrock Models

Next I added the models I wanted LiteLLM to expose.

Navigate to:

Models + Endpoints → Add Model

Provider:

Amazon Bedrock

Example model:

us.deepseek.r1-v1:0

Mapping configuration:

Public Model Name: us.deepseek.r1-v1:0
LiteLLM Model Name: us.deepseek.r1-v1:0

Select your credential and click:

Test Connect

If everything is configured correctly you should see:

Connection successful

Then click:

Add Model

I repeated this for multiple Bedrock models.
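The same registrations can also live declaratively in config.yaml instead of the dashboard, which makes the setup reproducible. A sketch using LiteLLM's model_list format — the bedrock/ prefix tells LiteLLM to route the call through its Bedrock provider, and the region assumes us-east-1 from Step 5:

```yaml
model_list:
  - model_name: us.deepseek.r1-v1:0        # public name that clients see
    litellm_params:
      model: bedrock/us.deepseek.r1-v1:0   # provider-prefixed Bedrock model id
      aws_region_name: us-east-1
```

Adding more models is just more entries in the list, one per Bedrock model id.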


Step 7 — Create an API Key

LiteLLM allows generating API keys for client applications.

Navigate to:

Virtual Keys

Create a key such as:

sk-dev

This key will be used by tools like Open WebUI or VS Code agents.
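Keys can also be minted programmatically against LiteLLM's /key/generate admin endpoint, authenticated with the master key from Step 3 — useful for scripting per-project keys. A hedged sketch (the optional restrictions shown are illustrative; the endpoint path and bearer-auth scheme follow LiteLLM's key-management API):

```python
import json
import urllib.request

def build_key_request(master_key: str = "sk-admin") -> urllib.request.Request:
    """Build a request to LiteLLM's /key/generate admin endpoint."""
    # Optional restrictions on the new key (models it may call, lifetime).
    payload = {"models": ["us.deepseek.r1-v1:0"], "duration": "30d"}
    return urllib.request.Request(
        "http://localhost:4010/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {master_key}",
        },
        method="POST",
    )

req = build_key_request()
# With the gateway running, urllib.request.urlopen(req) returns a JSON
# body whose "key" field is the newly generated virtual key.
```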


Step 8 — Run Open WebUI

To get a ChatGPT‑style interface I used Open WebUI.

Run it with Docker:

docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name openwebui \
  ghcr.io/open-webui/open-webui:main

The --add-host flag makes the host machine reachable from inside the container as host.docker.internal (Docker Desktop provides this automatically; the flag is needed on Linux), and the named volume persists accounts and chat history across restarts.

Open the interface:

http://localhost:3000

Create an account on first launch.


Step 9 — Connect Open WebUI to LiteLLM

Inside Open WebUI go to:

Settings → Connections → OpenAI API

Configure:

Base URL

http://host.docker.internal:4010/v1

Note: because Open WebUI itself runs in a container, localhost would resolve to the Open WebUI container rather than your machine — host.docker.internal points back at the host where LiteLLM is listening, and the /v1 suffix matches the OpenAI-style base path the gateway serves.

API Key

sk-dev

After saving, Open WebUI automatically loads all models registered in LiteLLM.


Final Result

At this point the stack looks like this:

Open WebUI
     ↓
LiteLLM Gateway
     ↓
AWS Bedrock Models

From a single interface I can now:

  • switch between models
  • test prompts
  • track token usage
  • monitor costs

Using It with Coding Agents

Because LiteLLM exposes an OpenAI-compatible API, it integrates directly with developer tools.

For example, in VS Code this works with tools like:

  • Continue.dev
  • OpenCode

Configuration simply requires:

Base URL: http://localhost:4010
API Key: sk-dev

This lets the same Bedrock models power coding workflows.
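As an illustration, a Continue.dev model entry might look roughly like the following — a hedged sketch of its JSON config, with field names following Continue's OpenAI-provider format at the time of writing, and the /v1 suffix being the OpenAI-style base path the gateway serves:

```json
{
  "models": [
    {
      "title": "Bedrock via LiteLLM",
      "provider": "openai",
      "model": "us.deepseek.r1-v1:0",
      "apiBase": "http://localhost:4010/v1",
      "apiKey": "sk-dev"
    }
  ]
}
```

Any other tool with an "OpenAI-compatible" or "custom OpenAI endpoint" option can be pointed at the gateway the same way.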


Why This Setup Works Well

A few reasons I ended up liking this architecture:

  • No vendor lock‑in
  • Pay only for inference
  • One API for multiple models
  • Works with most AI tooling
  • Fully self‑hosted gateway

LiteLLM effectively becomes the central router for every AI tool I use.


Closing Thoughts

The AI tooling ecosystem moves extremely fast. Most products are simply wrappers around the same models.

Building a small modular stack turned out to be more flexible than relying on several separate platforms.

Now I have:

  • a local chat interface
  • coding agents inside my editor
  • access to Bedrock models
  • a single gateway controlling everything

All running locally with Docker while using AWS only for inference.

If you're experimenting with AI development, agents, or multi‑model workflows, this setup is a solid foundation.


If you build something similar or improve this stack, I'd love to see how others are approaching it.
