DEV Community

jeann

Setting up LiteLLM (SDK + Proxy Gateway)

I recently spent time setting up LiteLLM, trying to unify multiple LLM providers (OpenAI, Anthropic, Vertex, etc.) under a single interface.

The main idea was simple:

Reduce provider coupling and move toward a model-agnostic LLM abstraction layer.


SDK setup (straightforward part)

The Python SDK installation was simple:

uv add litellm

Basic usage:

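from litellm import completion

# Reads OPENAI_API_KEY from the environment.
response = completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)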

What stood out here:

  • Same API across providers
  • Minimal setup
  • No SDK fragmentation

This part worked immediately without friction.
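
To make the "same API across providers" point concrete, here is a minimal sketch of a helper where only the model string changes between providers. The helper names (build_messages, ask) are my own, and the Anthropic model string is an assumption for illustration; the original setup only used openai/gpt-4o.

```python
def build_messages(prompt):
    """Wrap a prompt in the OpenAI-style chat message list used above."""
    return [{"role": "user", "content": prompt}]


def ask(model, prompt):
    """Send one prompt to a provider; only the model string differs per provider."""
    # Imported lazily so build_messages works even without litellm installed.
    from litellm import completion

    response = completion(model=model, messages=build_messages(prompt))
    return response.choices[0].message.content


# Hypothetical usage; each provider reads its own key from the environment
# (for example OPENAI_API_KEY or ANTHROPIC_API_KEY):
#   ask("openai/gpt-4o", "Hello")
#   ask("anthropic/claude-3-5-sonnet-20240620", "Hello")
```

The calling code stays identical, so swapping providers is a one-string change rather than a new SDK integration.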


The interesting part: LiteLLM Proxy

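The real value started when I explored the proxy, LiteLLM's LLM gateway layer. The proxy CLI ships as an extra, so it needs litellm[proxy] installed rather than the bare litellm package. Starting it takes one command: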

litellm --model gpt-4o

This exposes a local OpenAI-compatible endpoint:

http://0.0.0.0:4000

At this stage, LiteLLM stops feeling like a library and starts behaving like infrastructure.
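
Because the endpoint speaks the OpenAI wire format, a plain HTTP request is enough to talk to it. The sketch below uses only the standard library and assumes the proxy is reachable at http://localhost:4000 and accepts a placeholder bearer token (no master key configured); both are assumptions about a local setup, and the function names are my own.

```python
import json
import urllib.request

BASE_URL = "http://localhost:4000"  # assumption: proxy running locally on port 4000


def build_chat_request(model, prompt, api_key="sk-placeholder", base_url=BASE_URL):
    """Build an OpenAI-style chat completion request aimed at the proxy."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder token; replace it if the proxy enforces keys.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


# Hypothetical usage, with the proxy from the command above running:
#   print(send(build_chat_request("gpt-4o", "Hello")))
```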


Core abstraction: YAML configuration

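The routing layer becomes explicit once the proxy runs from a configuration file, where model_name is the alias clients request and litellm_params names the provider model and credentials that serve it: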

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

This is where the mental model shifts:

LiteLLM is not just a client; it becomes a model routing system.
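
The same alias-to-deployment mapping can also be tried from Python with LiteLLM's Router class, which takes a model_list of the same shape as the YAML. This is a rough sketch rather than my actual setup: the second entry (the "claude" alias and its Anthropic model string) is a hypothetical addition, and the helper names are my own.

```python
import os

# Same shape as the YAML model_list, written as Python data.
MODEL_LIST = [
    {
        "model_name": "gpt-4o",  # alias that clients request
        "litellm_params": {
            "model": "openai/gpt-4o",  # provider model that serves it
            "api_key": os.getenv("OPENAI_API_KEY"),
        },
    },
    {
        "model_name": "claude",  # hypothetical alias, added for illustration
        "litellm_params": {
            "model": "anthropic/claude-3-5-sonnet-20240620",
            "api_key": os.getenv("ANTHROPIC_API_KEY"),
        },
    },
]


def aliases(model_list):
    """Return the public model names clients may request, in config order."""
    return [entry["model_name"] for entry in model_list]


def ask(alias, prompt):
    """Route a prompt through litellm.Router using a public alias."""
    # Imported lazily so the data above can be inspected without litellm.
    from litellm import Router

    router = Router(model_list=MODEL_LIST)
    response = router.completion(
        model=alias,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Callers only ever see the alias, so the provider behind "gpt-4o" or "claude" can change in one place without touching application code.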


Production setup (Docker)

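Running the proxy in Docker is straightforward, but two things have to line up: the config file must be mounted at the path passed to --config, and every os.environ/ reference in it must resolve to a variable set inside the container: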

docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY=your-key \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:main-latest \
  --config /app/config.yaml
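
Since unresolved environment references are the easy thing to get wrong here, a small preflight check can help. The sketch below is my own helper, not part of LiteLLM: it scans the config text for os.environ/NAME references and reports which names are unset. It assumes the references always follow the os.environ/NAME pattern shown in the config above.

```python
import os
import re

# Matches references such as "os.environ/OPENAI_API_KEY" in the config text.
ENV_REF = re.compile(r"os\.environ/([A-Za-z_][A-Za-z0-9_]*)")


def referenced_env_vars(config_text):
    """Return the sorted, de-duplicated variable names the config refers to."""
    return sorted(set(ENV_REF.findall(config_text)))


def missing_env_vars(config_text, env=None):
    """Return referenced variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in referenced_env_vars(config_text) if not env.get(name)]


# Hypothetical usage before starting the container:
#   with open("litellm_config.yaml") as f:
#       print(missing_env_vars(f.read()))
```

This only checks the shell that runs it; each variable still has to be passed into the container with -e, as in the docker run command.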

Why this matters

Once running, any OpenAI-compatible client can interact with the gateway:

  • Model abstraction becomes centralized
  • Routing becomes configurable
  • Provider switching becomes transparent
  • Infrastructure concerns move out of application code
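
One way to see the centralization from the client side is to ask the gateway which model names it exposes. This sketch assumes the proxy serves the OpenAI-style /v1/models listing at http://localhost:4000 and accepts a placeholder token; the function names are my own.

```python
import json
import urllib.request


def build_models_request(base_url="http://localhost:4000", api_key="sk-placeholder"):
    """Build a GET request for the gateway's OpenAI-style model listing."""
    return urllib.request.Request(
        url=f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload):
    """Extract model names from an OpenAI-style list response."""
    return [item["id"] for item in payload.get("data", [])]


def list_gateway_models(**kwargs):
    """Fetch and return the model names the gateway currently exposes."""
    request = build_models_request(**kwargs)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return model_ids(json.load(resp))
```

The names returned should be the model_name aliases from the config, which is what application code depends on, not the provider-specific model strings.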

Key takeaway

What initially looks like a simple SDK quickly becomes a lightweight LLM infrastructure layer.

The key mental shift:

from calling models directly → to managing model routing as infrastructure


Final thoughts

The most interesting part of LiteLLM is not the SDK itself, but the proxy layer that enables:

  • multi-provider routing
  • centralized control
  • deployment flexibility

It's a practical step toward treating LLMs as infrastructure components rather than isolated APIs.

