DEV Community

IronSoftware

Gotenberg Docker Setup: Understanding the Hidden Complexity (Fixed)

Gotenberg has become a popular choice for developers seeking HTML to PDF conversion in containerized environments. The project markets itself as a "developer-friendly API" that bundles Chromium and LibreOffice into a Docker image. This sounds convenient on paper, but the architecture introduces operational complexity that many teams only discover after deployment.

This article examines the real-world challenges of running Gotenberg in production: the DevOps overhead of managing a separate service, network latency implications, container orchestration complexity, and the ongoing maintenance burden. It also presents an alternative architectural approach that eliminates these concerns.

The Gotenberg Architecture

Gotenberg operates as a stateless HTTP API packaged in a Docker container. To convert an HTML document to PDF, your application must:

  1. Run the Gotenberg container as a separate service
  2. Send HTTP requests with multipart form data containing your HTML
  3. Receive the PDF binary in the HTTP response
  4. Handle timeouts, retries, and error cases across the network boundary

The basic setup requires pulling the image and running it:

docker run --rm -p 3000:3000 gotenberg/gotenberg:8

While this single command appears simple, production deployments demand considerably more configuration.

Container Configuration Complexity

The Gotenberg image exposes numerous configuration flags for tuning Chromium and LibreOffice behavior. A production-ready docker-compose configuration often looks like this:

version: "3.8"
services:
  gotenberg:
    image: gotenberg/gotenberg:8
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=false"
      - "--chromium-allow-list=file:///tmp/.*"
      - "--chromium-deny-list="
      - "--chromium-ignore-certificate-errors=true"
      - "--chromium-disable-web-security=true"
      - "--api-timeout=180s"
      - "--chromium-max-queue-size=20"
      - "--libreoffice-disable-routes=false"
    ports:
      - "3000:3000"
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

This configuration addresses several issues developers encounter:

  • Memory limits prevent runaway Chromium processes from consuming all host resources
  • Health checks enable container orchestrators to restart failed instances
  • Queue size limits prevent request pile-up during traffic spikes
  • Timeout configuration balances between allowing complex renders and preventing hung requests

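Each of these settings requires understanding both Gotenberg's internals and your specific workload characteristics. Two of the flags shown, --chromium-ignore-certificate-errors and --chromium-disable-web-security, relax Chromium's protections; enable them only when a workload genuinely requires it.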

DevOps Overhead

Running Gotenberg means operating another service in your infrastructure. This creates ongoing work that compounds over time.

Service Monitoring

Your existing application monitoring needs to extend to Gotenberg:

  • Response time tracking for conversion requests
  • Error rate monitoring for failed conversions
  • Memory and CPU utilization alerts
  • Queue depth monitoring to detect backpressure
  • Health check integration with your alerting system

A typical Prometheus scrape job for Gotenberg, added under scrape_configs:

- job_name: 'gotenberg'
  static_configs:
    - targets: ['gotenberg:3000']
  metrics_path: '/prometheus/metrics'
  scrape_interval: 15s

You then need Grafana dashboards, alert rules, and runbooks for when things go wrong.

Version Management

Gotenberg releases new versions regularly. Each update potentially changes:

  • Chromium version (affecting rendering behavior)
  • LibreOffice version (affecting document conversion)
  • API behavior or configuration options
  • Resource consumption patterns

Staying on old versions risks security vulnerabilities and missing bug fixes. Upgrading requires testing to ensure your existing conversions still work correctly. This creates a recurring maintenance task that many teams underestimate.

Security Considerations

Gotenberg accepts arbitrary HTML and converts it using a full browser engine. This creates security surface area:

  • The service should not be exposed to the public internet
  • Input validation at the application layer remains essential
  • Network policies should restrict which services can reach Gotenberg
  • Container security hardening (non-root user, read-only filesystem) adds complexity

# Kubernetes NetworkPolicy example
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: gotenberg-ingress
spec:
  podSelector:
    matchLabels:
      app: gotenberg
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: your-application
      ports:
        - protocol: TCP
          port: 3000

Network Latency Impact

Every PDF conversion requires a round trip over the network. This adds latency that accumulates in high-volume scenarios.

Request Flow Analysis

A typical Gotenberg HTML to PDF conversion involves:

  1. Serialization: Your application serializes HTML, CSS, and assets into multipart form data
  2. Network transfer (outbound): The request travels to Gotenberg over TCP
  3. Parsing: Gotenberg parses the multipart request
  4. Chromium rendering: The actual PDF generation occurs
  5. Network transfer (inbound): The PDF binary returns to your application
  6. Deserialization: Your application reads the response bytes

Steps 2 and 5 introduce latency that does not exist with in-process conversion. Within a Kubernetes cluster, this might add 1-5ms per request. Across availability zones or regions, the penalty grows to 10-50ms or more.
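
To put those figures in context, here is a rough back-of-the-envelope sketch of how per-request network overhead adds up across a batch. The NetworkOverheadEstimate class and its parameters are hypothetical names introduced for illustration, and the model is deliberately simple: it assumes requests proceed in waves of a fixed concurrency and ignores rendering time entirely.

```csharp
using System;

// Hypothetical helper for rough estimates; not part of Gotenberg or any library.
public static class NetworkOverheadEstimate
{
    // Added wall-clock time when `requests` conversions each pay
    // `perRequestOverhead` of network time and at most `concurrency`
    // requests are in flight at once.
    public static TimeSpan Total(int requests, TimeSpan perRequestOverhead, int concurrency = 1)
    {
        if (requests < 0) throw new ArgumentOutOfRangeException(nameof(requests));
        if (concurrency < 1) throw new ArgumentOutOfRangeException(nameof(concurrency));

        // Requests proceed in waves of `concurrency`; each wave pays the overhead once.
        long waves = (requests + (long)concurrency - 1) / concurrency;
        return TimeSpan.FromTicks(perRequestOverhead.Ticks * waves);
    }
}
```

Under these assumptions, 1,000 sequential conversions at 5ms each add about 5 seconds in total, and the same batch at 50ms per request with 10 requests in flight also adds about 5 seconds. Whether that matters depends on how it compares with the rendering time of each document.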

Throughput Constraints

Network-based architecture creates throughput limitations:

  • TCP connection overhead for each request (or connection pool management)
  • Serialization/deserialization CPU cost
  • Network bandwidth consumption for large HTML documents or PDFs
  • Potential for network-related failures (timeouts, connection resets)

Consider a scenario generating 1,000 invoices. With Gotenberg, each invoice requires a network round trip. Even with connection pooling and parallelization, the network overhead accumulates:

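using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Gotenberg approach - each conversion is a network call
// Reuse one HttpClient so connections are pooled across conversions
private static readonly HttpClient Client = new HttpClient();

public async Task<byte[]> ConvertWithGotenberg(string html)
{
    using var content = new MultipartFormDataContent();

    // Serialize HTML into form data as a file named index.html
    var htmlContent = new StringContent(html, Encoding.UTF8, "text/html");
    content.Add(htmlContent, "files", "index.html");

    // Network round trip to Gotenberg
    using var response = await Client.PostAsync(
        "http://gotenberg:3000/forms/chromium/convert/html",
        content);

    // Fail loudly rather than returning an error body as if it were a PDF
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsByteArrayAsync();
}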

Container Orchestration Challenges

Production deployments rarely run a single Gotenberg instance. Scaling introduces additional complexity.

Kubernetes Deployment Considerations

A production Kubernetes deployment requires:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gotenberg
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gotenberg
  template:
    metadata:
      labels:
        app: gotenberg
    spec:
      containers:
        - name: gotenberg
          image: gotenberg/gotenberg:8
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: gotenberg
spec:
  selector:
    app: gotenberg
  ports:
    - port: 3000
      targetPort: 3000
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gotenberg
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gotenberg
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

This configuration addresses several concerns:

  • Replicas: Multiple instances for availability and throughput
  • Resource limits: Prevent individual pods from consuming excessive resources
  • Health probes: Enable Kubernetes to detect and replace unhealthy instances
  • HPA: Automatic scaling based on load

Each component requires tuning based on your workload, and misconfiguration leads to either wasted resources or degraded performance.
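
To see why the HPA target needs tuning, it helps to look at how the autoscaler picks a replica count. The Kubernetes documentation describes the core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies that formula with the min/max bounds from the manifest above. HpaEstimate is a hypothetical helper name, and the sketch ignores the autoscaler's tolerance band and stabilization windows, so treat it as an approximation only.

```csharp
using System;

// Hypothetical helper, not part of Kubernetes or Gotenberg. Mirrors the
// documented HPA formula and clamps the result to [minReplicas, maxReplicas].
public static class HpaEstimate
{
    public static int DesiredReplicas(
        int currentReplicas,
        double currentUtilization,
        double targetUtilization,
        int minReplicas,
        int maxReplicas)
    {
        if (targetUtilization <= 0)
            throw new ArgumentOutOfRangeException(nameof(targetUtilization));

        // desired = ceil(current * currentMetric / targetMetric)
        var raw = (int)Math.Ceiling(currentReplicas * currentUtilization / targetUtilization);

        // The HPA never scales outside the configured bounds.
        return Math.Clamp(raw, minReplicas, maxReplicas);
    }
}
```

With the manifest's values (target 70%, bounds 2 to 10), three replicas averaging 90% CPU would scale to four. Because CPU utilization is measured against the pod's CPU request, the request value in the Deployment directly affects when scaling triggers.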

State and Stickiness Issues

While Gotenberg is stateless by design, certain conversion scenarios benefit from request affinity:

  • Multi-step conversions that share temporary files
  • Conversions requiring pre-warmed Chromium instances
  • Debugging scenarios where you need consistent routing

Implementing session affinity adds another layer of configuration and can conflict with load balancing efficiency.

Resource Overhead

Running Gotenberg means dedicating compute resources to a service that sits idle between conversion requests.

Memory Consumption

A running Gotenberg instance with Chromium consumes significant memory even when idle:

  • Base container: ~500MB
  • Chromium process: ~300-500MB additional
  • LibreOffice (if enabled): ~200-400MB additional
  • Per-request overhead: varies with document complexity

For a three-replica deployment sized at the 4GB per-instance limit used in the configurations above, that sets aside up to 12GB of memory for Gotenberg alone.

CPU Utilization Patterns

PDF conversion is CPU-intensive during rendering but leaves CPU idle between requests. Unless your traffic patterns show consistent conversion load, you pay for capacity that sits unused. The bursty nature of most PDF generation workloads makes right-sizing difficult.

An Alternative Approach: Embedded Conversion

The operational complexity of Gotenberg stems from its architecture as a separate service. An alternative approach embeds PDF conversion directly into your application process, eliminating the service boundary.

IronPDF takes this approach, packaging a Chrome-based rendering engine as a NuGet package. Conversion happens in-process without network calls:

using IronPdf;

public class InvoiceGenerator
{
    public byte[] GenerateInvoice(InvoiceData data)
    {
        // Create renderer - Chrome engine is embedded
        var renderer = new ChromePdfRenderer();

        // Configure rendering options
        renderer.RenderingOptions.MarginTop = 20;
        renderer.RenderingOptions.MarginBottom = 20;
        renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;

        // Build HTML from template
        string html = BuildInvoiceHtml(data);

        // Convert in-process - no network call
        PdfDocument pdf = renderer.RenderHtmlAsPdf(html);

        return pdf.BinaryData;
    }

    private string BuildInvoiceHtml(InvoiceData data)
    {
        return $@"
            <!DOCTYPE html>
            <html>
            <head>
                <style>
                    body {{ font-family: Arial, sans-serif; }}
                    .invoice-header {{ display: flex; justify-content: space-between; }}
                    .line-items {{ width: 100%; border-collapse: collapse; }}
                    .line-items th, .line-items td {{
                        border: 1px solid #ddd;
                        padding: 8px;
                        text-align: left;
                    }}
                </style>
            </head>
            <body>
                <div class='invoice-header'>
                    <h1>Invoice #{data.InvoiceNumber}</h1>
                    <p>Date: {data.Date:yyyy-MM-dd}</p>
                </div>
                <!-- Invoice content here -->
            </body>
            </html>";
    }
}

Architectural Comparison

| Aspect | Gotenberg (Service) | IronPDF (Embedded) |
| --- | --- | --- |
| Deployment | Separate container | NuGet package |
| Network calls | Required | None |
| Scaling | Independent service scaling | Scales with application |
| Monitoring | Separate metrics pipeline | Application metrics |
| Versioning | Container image updates | Package updates |
| Latency | Network round-trip | In-process |
| Resource isolation | Container boundaries | Process shared |

Deployment Simplification

With embedded conversion, your Dockerfile remains straightforward:

# Runtime stage only; assumes an earlier stage named "build" that publishes to /app/publish
FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "YourApplication.dll"]

No sidecar containers, no service mesh configuration, no inter-service authentication. Your application handles PDF conversion as another function call.

Code Comparison

Batch invoice generation illustrates the difference:

Gotenberg approach:

public async Task<List<byte[]>> GenerateInvoicesBatch(List<InvoiceData> invoices)
{
    using var client = new HttpClient { Timeout = TimeSpan.FromMinutes(5) };

    // Process in parallel with concurrency limit
    using var semaphore = new SemaphoreSlim(10);
    var tasks = invoices.Select(async invoice =>
    {
        await semaphore.WaitAsync();
        try
        {
            using var content = new MultipartFormDataContent();
            var html = BuildInvoiceHtml(invoice);
            content.Add(new StringContent(html), "files", "index.html");

            var response = await client.PostAsync(
                "http://gotenberg:3000/forms/chromium/convert/html",
                content);

            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsByteArrayAsync();
        }
        finally
        {
            semaphore.Release();
        }
    });

    return (await Task.WhenAll(tasks)).ToList();
}

IronPDF approach:

public List<byte[]> GenerateInvoicesBatch(List<InvoiceData> invoices)
{
    var renderer = new ChromePdfRenderer();
    renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;

    // Process in parallel - no network, no semaphore needed for external service
    // AsOrdered keeps results in the same order as the input invoices
    return invoices
        .AsParallel()
        .AsOrdered()
        .Select(invoice =>
        {
            string html = BuildInvoiceHtml(invoice);
            PdfDocument pdf = renderer.RenderHtmlAsPdf(html);
            return pdf.BinaryData;
        })
        .ToList();
}

The embedded approach eliminates:

  • HTTP client configuration
  • Semaphore-based concurrency limiting for external service
  • Network timeout handling
  • Response deserialization
  • Retry logic for transient network failures
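
For reference, the retry logic in the last bullet is the kind of helper a Gotenberg client usually ends up carrying. The following is a minimal sketch, assuming exponential backoff and treating HttpRequestException and TaskCanceledException (which HttpClient raises on timeout) as transient. TransientRetry and its parameters are hypothetical names, not part of Gotenberg or IronPDF.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of Gotenberg or IronPDF.
public static class TransientRetry
{
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        var delay = baseDelay ?? TimeSpan.FromMilliseconds(200);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            // Retry only failures that look transient, and only while attempts remain.
            // Note: TaskCanceledException also covers caller-initiated cancellation,
            // which a production version would want to distinguish.
            catch (Exception ex) when (
                attempt < maxAttempts &&
                (ex is HttpRequestException || ex is TaskCanceledException))
            {
                // Exponential backoff: baseDelay, then 2x, then 4x, ...
                await Task.Delay(TimeSpan.FromTicks(delay.Ticks << (attempt - 1)));
            }
        }
    }
}
```

A caller would wrap the network call from earlier in the article, for example `await TransientRetry.RunAsync(() => ConvertWithGotenberg(html))`. Retrying is reasonable here because an HTML to PDF conversion has no side effects on the caller's data.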

Platform Support

IronPDF runs on the same platforms where your .NET application runs:

  • Windows (x64, x86)
  • Linux (Debian, Ubuntu, CentOS, Alpine)
  • macOS (Intel and Apple Silicon)
  • Docker containers
  • Azure App Service, AWS Lambda, Google Cloud Run

The rendering engine binaries are included in the NuGet package and extracted at runtime.

When Gotenberg Makes Sense

Despite the complexity, Gotenberg remains appropriate for certain scenarios:

  1. Polyglot environments: When PDF generation is needed from multiple applications written in different languages
  2. Strict resource isolation: When PDF conversion must be isolated from application memory for security or stability
  3. Existing microservices infrastructure: When your team already operates extensive service meshes and the incremental cost is minimal
  4. LibreOffice document conversion: When you need Word/Excel to PDF conversion alongside HTML conversion

Migration Considerations

Teams moving from Gotenberg to embedded conversion should consider:

API Surface Changes

Gotenberg uses multipart form requests. IronPDF uses method calls:

// Gotenberg: HTTP multipart request
// POST /forms/chromium/convert/html
// Content-Type: multipart/form-data
// files: index.html

// IronPDF: Method call
var pdf = renderer.RenderHtmlAsPdf(htmlString);
// or, from a file on disk
var pdfFromFile = renderer.RenderHtmlFileAsPdf("path/to/file.html");

Configuration Translation

Common Gotenberg configurations map to IronPDF options:

var renderer = new ChromePdfRenderer();

// Gotenberg: --chromium-disable-javascript=true
renderer.RenderingOptions.EnableJavaScript = false;

// Gotenberg: waitDelay=1s form field on the request
renderer.RenderingOptions.WaitFor.RenderDelay(1000);

// Gotenberg: margin form fields in the request body (inches by default; IronPDF margins are in millimeters)
renderer.RenderingOptions.MarginTop = 10;
renderer.RenderingOptions.MarginBottom = 10;
renderer.RenderingOptions.MarginLeft = 10;
renderer.RenderingOptions.MarginRight = 10;

// Gotenberg: paper size in request
renderer.RenderingOptions.PaperSize = IronPdf.Rendering.PdfPaperSize.A4;
// or custom size
renderer.RenderingOptions.SetCustomPaperSizeInMillimeters(210, 297);

Licensing

IronPDF is commercial software. Evaluate the licensing cost against the operational cost savings from eliminating a separate service. For many teams, the reduced DevOps burden justifies the license expense.

A free trial allows testing with your actual workload before commitment.

Conclusion

Gotenberg's Docker-based architecture buys development convenience at the cost of operational complexity. Running a separate conversion service means managing containers, configuring orchestration, monitoring another system, handling network failures, and accepting latency overhead.

Embedding PDF conversion in your application process eliminates these concerns. The conversion code becomes part of your application, scaling and deploying together, without network boundaries or service management overhead.

For .NET teams, IronPDF provides Chrome-based HTML rendering as a NuGet package. Because both tools render HTML with a Chromium-based engine, output should be broadly comparable, though it is worth verifying against your own documents. The approach particularly benefits teams without dedicated DevOps capacity or those seeking to reduce their operational footprint.


Jacob Mellor is CTO at Iron Software with over 25 years building developer tools.


For the latest IronPDF documentation and tutorials, visit ironpdf.com.
