DEV Community

Santiago Morales
Santiago Morales

Posted on • Originally published at sandmor.dev

Bifrost and Bloated Headers: My Journey Setting Up an AI Gateway

Recently, I was trying to set up an AI gateway. I wasn't very familiar with Bifrost before but it caught my eye that it was written in go, a language that I have come to love and see as part of the future of server systems.

In any case, after choosing Bifrost, I encountered my first hurdle. There were two ways to configure it, either by configuration file or by UI. In cases like this I'll usually go for the configuration file, I find pretty straightforward to have a YAML file carrying a exact definition of the configuration that I can then set in another environment if needed.

However, in this case I went with UI, why? Because I wasn't very familiar with the tool, it is not as well known either so documentation may be scarce, and most importantly because I wanted to familiarize myself with its features, something that I can do more easily if I can explore a dashboard. Before taking this decision though, I made sure to inform myself if the UI was just basic and a configuration file was recommended for "advanced features." Thankfully, that didn't seem to be the case, according to the Bifrost documentation they offered first class UI support, though I was given to understand that some very niche features still may only be accessible through YAML.

Choice made, I wrote a little docker compose file.

services:
  bifrost:
    image: maximhq/bifrost:latest
    container_name: bifrost
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - ./bifrost-data:/app/data
Enter fullscreen mode Exit fullscreen mode

And deployed it in a VPS. As you can see I am mapping to 127.0.0.1:8080, which makes sure that not other devices can access the insecure HTTP connection.

With docker fired up, the next step was creating a configuration file in NGINX to expose this through a secure connection.

NGINX configuration is one of those things that I have done many times and yet I always find myself looking up the exact syntax. The basics are simple enough, but getting location blocks and proxy headers exactly right comes with its own set of pitfalls that are easy to forget if you don't deal with them regularly.

The first thing I needed was a domain name pointing to my VPS. I already had a subdomain ready for this, so I created an A record pointing to the server's IP address and waited for DNS propagation. That done, I created an NGINX server block for the subdomain, proxying through to Bifrost on port 8080.

Importantly, I turned off proxy buffering (proxy_buffering off;). When you're streaming model responses chunk by chunk, NGINX's default buffering can introduce lag that breaks the real-time feel.

My initial NGINX configuration looked like this:

server {
    listen 80;
    server_name bifrost.mydomain.com;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        proxy_buffering off;
        proxy_read_timeout 600;
    }
}
Enter fullscreen mode Exit fullscreen mode

With the server block configured and buffering disabled, I was ready to add TLS. I already had certbot installed, so granting my new subdomain a SSL certificate and configuring it in NGINX could be done easily with a single command:

sudo certbot --nginx -d bifrost.mydomain.com
Enter fullscreen mode Exit fullscreen mode

This command both issues the certificate and updates NGINX configuration file to use it (changing port 80 to 443 and adding the SSL paths). With TLS in place, I could finally access the Bifrost dashboard through a secure connection. Opening the browser and navigating to my subdomain, I was greeted with a clean, minimal interface.

The first that I noticed was that I wasn't prompted to create a first user or authentication at all, that is not gonna fly, was the first thing that I thought. As I was quite excited to test my gateway and didn't want to sift through the large interface, I quickly asked an AI assistant who didn't seem very familiar with the application as its first response was setting the dashboard behind HTTP Basic Auth.

I have to admit that I toyed with the idea for a minute, until I just decided to check if there was an option built-in in Bifrost dashboard. Which was! Buried in Settings -> Security there as a way to set a password to the dashboard, though by the time of writing this article it says to be in beta.

Once I set a username and password, and confirmed the change, the page reloaded and I found myself confronted by a login form. Filling up the credentials prove easy enough and now I have my dashboard secure.

Next step was setting up a provider, and I thought this was going to be easy, but of course, nothing is ever that straightforward when you're trying a new tool.

After configuring my first provider, a custom one. I noticed quickly that Bifrost showed a error indicator, most likely of failing to try and list the available models.

But as it didn't show more information, I decided to try and call the API endpoint to see what would Bifrost do with this faulty provider.

Digging a bit in how it should work by default, the convention was to call the api with a model in the format provider/model, so I did that using curl and immediately getting a concerning answer.

{"is_bifrost_error":false,"error":{"error":"error when reading response headers: small read buffer. Increase ReadBufferSize. Buffer size=4096, contents: \"HTTP/1.1 200 OK\\r\\nAccess-Control-Allow-Headers: Content-Type, Authorization, x-api-key, x-request-id, x-client, x-app, x-billing-mode, x-test-billing-mode, x-payment, x-x402, x-x402-payment-id, x-pr\"...\" application/json\\r\\nDate: Fri, 30 May 2026 21:15:53 GMT\\r\\nReferrer-Policy: strict-origin-when-cross-origin\\r\\nServer: Vercel\\r\\nStrict-Transport-Security: max-age=63072000\\r\\nVary: rsc, next-router-state-tree\"","message":"failed to execute HTTP request to provider API"},"extra_fields":{"provider":"test","original_model_requested":"moonshotai/kimi-k2.6","resolved_model_used":"moonshotai/kimi-k2.6","request_type":"chat_completion"}}
Enter fullscreen mode Exit fullscreen mode

I was just trying to configure my first provider and I was already getting a error! It seems like whatever Go library that Bifrost is using to handle HTTP has a buffer size of 4096 bytes for requests, and my provider was sending a bit more than that.

Given the fields I figured out that my most likely suspect was Access-Control-Allow-Headers, its value must be massive but actually not needed for our purposes. Therefore stripping it should fix the problem, right?

Given that I'm already using NGINX, I could simply add a proxy layer to strip the CORS headers.

server {
    listen 8081;
    server_name host.docker.internal _;
    resolver 8.8.8.8 1.1.1.1 valid=300s;
    resolver_timeout 5s;

    location / {
        set $upstream https://myprovider.com;

        proxy_pass $upstream$request_uri;
        proxy_ssl_server_name on;
        proxy_ssl_name myprovider.com;
        proxy_set_header Host myprovider.com;

        # Strip theCORS headers that are crashing Bifrost
        proxy_hide_header Access-Control-Allow-Headers;
        proxy_hide_header Access-Control-Allow-Methods;
        proxy_hide_header Access-Control-Allow-Origin;
        proxy_hide_header Access-Control-Expose-Headers;

        proxy_pass_request_headers on;

        proxy_buffering off;
        proxy_read_timeout 600;
    }
}
Enter fullscreen mode Exit fullscreen mode

Secondary proxy added, I needed to update and restart my docker compose file for Bifrost to be able to reach it:

    extra_hosts:
      - "host.docker.internal:host-gateway"
Enter fullscreen mode Exit fullscreen mode

With this I expected it to work, but sadly, I met another error, as registered in NGINX logs.

2026/05/31 17:54:24 [error] 202599#202599: *25120 upstream sent too big header while reading response header from upstream, client: 172.19.0.2, server: host.docker.internal, request: "POST /v1/chat/completions HTTP/1.1", upstream: "https://215.151.1.1:443/v1/chat/completions", host: "host.docker.internal:8081"
Enter fullscreen mode Exit fullscreen mode

Apparently NGINX was also suffering from the same problem than Bifrost, the response headers from the provider were too large for its default buffer sizes. The fix was straightforward though, NGINX lets us tune these limits with a couple of directives. I added proxy_buffer_size 32k; to my server block, giving it enough room to handle those large headers without choking.

With the buffer sizes increased, I restarted NGINX and tried my curl request again. And it failed again! Checking the NGINX logs I found this new message.

2026/05/31 17:57:21 [error] 256520#256520: *25172 upstream sent too big header while reading response header from upstream, client: 13.212.191.221, server: bifrost.mydomain.com, request: "POST /v1/chat/completions HTTP/1.1", upstream: "http://127.0.0.1:8080/v1/chat/completions", host: "bifrost.mydomain.com"
Enter fullscreen mode Exit fullscreen mode

This error is very similar to the previous one, however I immediately notice how it is pointing to my client and the domain that I gave the proxy. Not the provider, strange but this just means that now is Bifrost who is sending a large response to the client. I'm not sure if that means that it is passing through other headers from the providers that may be engorging the response, but I found easily enough to also bump up the buffer sizes of the main Bifrost server block as well. Same directive, proxy_buffer_size 32k;, this time applied to the server block handling the connection between the outside world and Bifrost itself.

server {
    server_name bifrost.mydomain.com;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Increase buffer size strictly for reading large upstream headers
        proxy_buffer_size 32k;

        proxy_buffering off;
        proxy_read_timeout 600;
    }

    listen 443 ssl;
    # SSL certificates managed by Certbot...
}
Enter fullscreen mode Exit fullscreen mode

After reloading NGINX one more time, I sent my curl request and finally got a proper response back. The model answered without errors, and Bifrost successfully proxied the request through to my custom provider. A satisfactory victory after all the unexpected trouble that I went through, even if it stings me a bit that I had to patch things up to make it work.

With the provider working I now have to secure the LLM endpoint so it couldn't be prompted from Bifrost gateway without a key. Setting a key in Bifrost was easy enough, though I have to dig a bit to find the option to disable keyless inference.

Once I found that toggle and switched it off, I tested the endpoint again without passing an API key and was met with a proper authentication error instead of a successful response. That was reassuring.

Adding a second provider turned out to be smoother than the first, probably because I had already dealt with the NGINX buffering issues and knew what to expect. I set up OpenAI as my second provider, which being a first-class citizen in Bifrost meant no custom proxy shenanigans were needed. The dashboard had a dedicated section for it, I plugged in my API key, and Bifrost immediately recognized it without any errors. A refreshing contrast to my earlier experience.

Satisfied with the setup, I took a step back and reviewed what I had built. A Bifrost instance running in Docker, behind NGINX with TLS, two providers configured, the dashboard secured with authentication, and the inference endpoint protected by API keys. There were still things I could improve, experimenting with dynamic routing, configuring rate limits, maybe exploring some of those niche features that only the YAML configuration exposes, but the core functionality was there and working.

For now, that was enough. The gateway was up, it was secure, and it was routing requests the way I wanted.

Top comments (0)