Ajit Kumar

Parsing NGINX Configurations with Rust: A Complete Guide to nginx-discovery

If you've ever worked with web applications, you've likely encountered NGINX, the powerful web server and reverse proxy that powers a significant portion of the internet. But have you ever needed to analyze NGINX configuration files programmatically? Maybe you're building a monitoring tool, generating documentation automatically, or managing configurations across multiple servers.

In this tutorial, we'll explore nginx-discovery, a Rust library that makes parsing and analyzing NGINX configurations surprisingly easy. Whether you're new to NGINX or a seasoned DevOps engineer, this guide will help you understand both NGINX configurations and how to work with them programmatically.

Table of Contents

  1. Understanding NGINX Configuration
  2. Why Parse NGINX Configs Programmatically?
  3. Getting Started with nginx-discovery
  4. Basic Configuration Parsing
  5. Extracting Server Blocks
  6. Working with Locations and Proxies
  7. Real-World Use Case: Building a Log Parser
  8. Conclusion

Understanding NGINX Configuration

Before we dive into parsing, let's understand what NGINX configurations actually are.

What is NGINX?

NGINX (pronounced "engine-x") is a web server that can also function as a reverse proxy, load balancer, and HTTP cache. It's known for being fast and reliable, and for handling large numbers of concurrent connections efficiently.

The Structure of NGINX Configs

NGINX configuration files are text files that use a simple, hierarchical syntax. Here's a basic example:

Simple NGINX configuration:

user nginx;
worker_processes auto;

http {
    server {
        listen 80;
        server_name example.com;

        location / {
            root /var/www/html;
            index index.html;
        }
    }
}

Let's break down the key concepts:

1. Directives: Simple statements like user nginx; or listen 80;

2. Blocks: Containers for other directives, defined by { }. Common blocks include:

  • http - HTTP server configuration
  • server - Virtual server (like a virtual host)
  • location - URL path matching and handling

3. Context: Where a directive appears matters. Some directives only work in certain contexts (http, server, or location).
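
To make the difference between directives and blocks concrete, here is a minimal tokenizer sketch in plain Rust. It is not how nginx-discovery is implemented: the `Token` enum and the `tokenize` and `flush` functions are hypothetical names made up for illustration, and the grammar is simplified (quoted strings are not handled).

```rust
/// Token kinds in a simplified NGINX config grammar.
#[derive(Debug, PartialEq)]
enum Token {
    Word(String), // directive name or argument
    BlockStart,   // {
    BlockEnd,     // }
    Semicolon,    // ;
}

/// Move any pending word into the token list.
fn flush(word: &mut String, tokens: &mut Vec<Token>) {
    if !word.is_empty() {
        tokens.push(Token::Word(std::mem::take(word)));
    }
}

/// Split config text into tokens.
/// Simplification: quoted strings are not handled.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    let mut chars = input.chars();

    while let Some(c) = chars.next() {
        match c {
            '{' | '}' | ';' => {
                flush(&mut word, &mut tokens);
                tokens.push(match c {
                    '{' => Token::BlockStart,
                    '}' => Token::BlockEnd,
                    _ => Token::Semicolon,
                });
            }
            // A '#' at the start of a token begins a comment that runs to end of line.
            '#' if word.is_empty() => {
                for next in chars.by_ref() {
                    if next == '\n' {
                        break;
                    }
                }
            }
            c if c.is_whitespace() => flush(&mut word, &mut tokens),
            c => word.push(c),
        }
    }
    flush(&mut word, &mut tokens);
    tokens
}

fn main() {
    let tokens = tokenize("server { listen 80; } # trailing comment");
    println!("{:?}", tokens);
}
```

A directive is then a run of words ending in `Semicolon`, and a block is a run of words followed by `BlockStart` and closed by the matching `BlockEnd`.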

Common NGINX Configuration Patterns

Serving static files:

location / {
    root /var/www/html;
    try_files $uri $uri/ =404;
}

Reverse proxy (forwarding to an app server):

location /api {
    proxy_pass http://localhost:3000;
    proxy_set_header Host $host;
}

SSL/HTTPS configuration:

server {
    listen 443 ssl;
    ssl_certificate /etc/ssl/cert.pem;
    ssl_certificate_key /etc/ssl/key.pem;

    server_name secure.example.com;
}

Why Parse NGINX Configs Programmatically?

You might wonder: "Why would I need to parse NGINX configs with code?" Here are some real-world scenarios:

1. Configuration Management

When managing dozens or hundreds of servers, you need to:

  • Verify all servers follow security policies
  • Check which servers have SSL enabled
  • Find which ports are being used across your infrastructure
  • Ensure logging is configured consistently

2. Monitoring and Observability

Building monitoring tools that need to:

  • Discover log file locations automatically
  • Find all upstream servers for health checks
  • Map which services are proxied where
  • Track configuration changes over time

3. Migration and Documentation

When migrating infrastructure, you need to:

  • Generate documentation from existing configs
  • Convert configs to other formats (Kubernetes, Terraform)
  • Analyze dependencies between services
  • Create inventory of your infrastructure

4. Log Analysis

For centralized logging systems:

  • Automatically discover all log files
  • Parse log format definitions to understand log structure
  • Route logs from different services appropriately

This is exactly where nginx-discovery comes in!

Getting Started with nginx-discovery

Installation

First, add nginx-discovery to your Rust project:

Cargo.toml:

[dependencies]
nginx-discovery = "0.2"

Your First Parse

Let's start with the simplest example - parsing a basic NGINX configuration:

main.rs:

use nginx_discovery::parse;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        server {
            listen 80;
            server_name example.com;
        }
    "#;

    let parsed = parse(config)?;
    println!("Successfully parsed {} directives", parsed.directives.len());

    Ok(())
}

Run this and you'll see:

Successfully parsed 1 directives

Simple, right? The parse() function takes NGINX configuration text and returns an Abstract Syntax Tree (AST) that represents the structure.
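
If the term AST is new to you, the sketch below shows one plausible shape for such a tree, along with a depth-first search over it. The `Directive` struct and `find_recursive` function are illustrative assumptions written for this tutorial, not the types nginx-discovery actually exposes.

```rust
/// One node of a hypothetical config tree: a directive with its arguments
/// and, for blocks, the directives nested inside it.
#[derive(Debug)]
struct Directive {
    name: String,
    args: Vec<String>,
    children: Vec<Directive>,
}

impl Directive {
    fn new(name: &str, args: &[&str], children: Vec<Directive>) -> Self {
        Directive {
            name: name.to_string(),
            args: args.iter().map(|a| a.to_string()).collect(),
            children,
        }
    }
}

/// Depth-first search: collect every directive called `name`, at any depth.
fn find_recursive<'a>(nodes: &'a [Directive], name: &str, out: &mut Vec<&'a Directive>) {
    for node in nodes {
        if node.name == name {
            out.push(node);
        }
        find_recursive(&node.children, name, out);
    }
}

fn main() {
    // Hand-built tree for: http { server { listen 80; server_name example.com; } }
    let tree = vec![Directive::new(
        "http",
        &[],
        vec![Directive::new(
            "server",
            &[],
            vec![
                Directive::new("listen", &["80"], vec![]),
                Directive::new("server_name", &["example.com"], vec![]),
            ],
        )],
    )];

    let mut found = Vec::new();
    find_recursive(&tree, "listen", &mut found);
    for d in &found {
        println!("{} {}", d.name, d.args.join(" "));
    }
}
```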

Basic Configuration Parsing

Now let's explore what we can do with parsed configurations.

Understanding the AST

When you parse a config, you get a tree structure:

use nginx_discovery::parse;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        http {
            access_log /var/log/nginx/access.log;

            server {
                listen 80;
                server_name example.com;
            }
        }
    "#;

    let parsed = parse(config)?;

    // Find all 'server' blocks
    let servers = parsed.find_directives_recursive("server");
    println!("Found {} server blocks", servers.len());

    // Access directive arguments
    for server in servers {
        for listen in server.find_children("listen") {
            let port = &listen.args[0];
            println!("Server listening on port: {}", port);
        }
    }

    Ok(())
}

Output:

Found 1 server blocks
Server listening on port: 80

Using the High-Level API

For common tasks, nginx-discovery provides a convenient high-level API:

use nginx_discovery::NginxDiscovery;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        http {
            log_format main '$remote_addr - $request';
            access_log /var/log/nginx/access.log main;

            server {
                server_name example.com www.example.com;
            }
        }
    "#;

    let discovery = NginxDiscovery::from_config_text(config)?;

    // Get all access logs
    let logs = discovery.access_logs();
    for log in &logs {
        println!("Log file: {}", log.path.display());
        if let Some(format) = &log.format_name {
            println!("  Format: {}", format);
        }
    }

    // Get all server names
    let names = discovery.server_names();
    println!("\nServer names: {:?}", names);

    // Get log formats
    let formats = discovery.log_formats();
    for format in &formats {
        println!("\nLog format '{}' uses variables:", format.name());
        for var in format.variables() {
            println!("  - ${}", var);
        }
    }

    Ok(())
}

Output:

Log file: /var/log/nginx/access.log
  Format: main

Server names: ["example.com", "www.example.com"]

Log format 'main' uses variables:
  - $remote_addr
  - $request
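
To get a feel for what extracting variables from a log format involves, here is a small standalone sketch. The `extract_variables` function is a hypothetical helper, not part of nginx-discovery. It assumes variable names consist of ASCII letters, digits, and underscores, and it accepts both the `$name` and `${name}` spellings.

```rust
/// Collect variable names from a log_format string, in order of appearance.
/// Duplicates are kept; deduplicate afterwards if you need a set.
fn extract_variables(format: &str) -> Vec<String> {
    let chars: Vec<char> = format.chars().collect();
    let mut vars: Vec<String> = Vec::new();
    let mut i = 0;

    while i < chars.len() {
        if chars[i] != '$' {
            i += 1;
            continue;
        }
        i += 1; // skip '$'

        // Optional opening brace of the ${name} form.
        let braced = i < chars.len() && chars[i] == '{';
        if braced {
            i += 1;
        }

        let start = i;
        while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
            i += 1;
        }
        if i > start {
            vars.push(chars[start..i].iter().collect());
        }

        // Skip the closing brace of the ${name} form.
        if braced && i < chars.len() && chars[i] == '}' {
            i += 1;
        }
    }
    vars
}

fn main() {
    for var in extract_variables("$remote_addr - $request") {
        println!("  - ${}", var);
    }
}
```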

Extracting Server Blocks

One of the most powerful features in v0.2.0 is server block extraction. Let's explore this in detail.

Basic Server Extraction

use nginx_discovery::NginxDiscovery;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        http {
            server {
                listen 80;
                server_name example.com;
                root /var/www/html;
            }

            server {
                listen 443 ssl;
                server_name secure.example.com;

                ssl_certificate /etc/ssl/cert.pem;
                ssl_certificate_key /etc/ssl/key.pem;
            }
        }
    "#;

    let discovery = NginxDiscovery::from_config_text(config)?;

    // Extract all servers
    let servers = discovery.servers();
    println!("Found {} server blocks\n", servers.len());

    for (i, server) in servers.iter().enumerate() {
        println!("Server {}:", i + 1);

        // Server names
        if !server.server_names.is_empty() {
            println!("  Names: {}", server.server_names.join(", "));
        }

        // Listen directives
        for listen in &server.listen {
            let ssl = if listen.ssl { " (SSL)" } else { "" };
            println!("  Listening: {}:{}{}", listen.address, listen.port, ssl);
        }

        // Root directory
        if let Some(root) = &server.root {
            println!("  Root: {}", root.display());
        }

        println!();
    }

    Ok(())
}

Output:

Found 2 server blocks

Server 1:
  Names: example.com
  Listening: *:80
  Root: /var/www/html

Server 2:
  Names: secure.example.com
  Listening: *:443 (SSL)
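
If you are curious how a `listen` line turns into an address, a port, and flags, the following sketch does it by hand. The `Listen` struct and `parse_listen` function are assumptions made for illustration and only mirror the fields printed above; the library's real types may differ. The sketch deliberately returns `None` for forms it does not cover, such as `unix:` sockets or an address without a port.

```rust
#[derive(Debug, PartialEq)]
struct Listen {
    address: String,
    port: u16,
    ssl: bool,
    http2: bool,
}

/// Parse the arguments of a `listen` directive,
/// e.g. ["443", "ssl", "http2"] or ["127.0.0.1:8080"].
fn parse_listen(args: &[&str]) -> Option<Listen> {
    let (first, flags) = args.split_first()?;

    // "addr:port" or just "port"; a bare port listens on every address ("*").
    let (address, port) = match first.rsplit_once(':') {
        Some((addr, p)) => (addr.to_string(), p.parse().ok()?),
        None => ("*".to_string(), first.parse().ok()?),
    };

    Some(Listen {
        address,
        port,
        ssl: flags.contains(&"ssl"),
        http2: flags.contains(&"http2"),
    })
}

fn main() {
    for args in [vec!["80"], vec!["443", "ssl", "http2"], vec!["[::]:8080"]] {
        println!("{:?} -> {:?}", args, parse_listen(&args));
    }
}
```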

Finding SSL Servers

Here's a practical example - finding all SSL-enabled servers:

use nginx_discovery::NginxDiscovery;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        server {
            listen 80;
            server_name http.example.com;
        }

        server {
            listen 443 ssl http2;
            server_name https.example.com;
        }

        server {
            listen 80;
            listen 443 ssl;
            server_name dual.example.com;
        }
    "#;

    let discovery = NginxDiscovery::from_config_text(config)?;

    // Get only SSL servers
    let ssl_servers = discovery.ssl_servers();

    println!("SSL-enabled servers:");
    for server in ssl_servers {
        if let Some(name) = server.primary_name() {
            println!("  - {}", name);

            // Check for HTTP/2
            for listen in &server.listen {
                if listen.ssl && listen.http2 {
                    println!("    (HTTP/2 enabled)");
                }
            }
        }
    }

    // Get all listening ports
    let ports = discovery.listening_ports();
    println!("\nAll ports in use: {:?}", ports);

    Ok(())
}

Output:

SSL-enabled servers:
  - https.example.com
    (HTTP/2 enabled)
  - dual.example.com

All ports in use: [80, 443]
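
A flat port list tells you what is open, but for an audit you often want to know which server owns each port. Here is a minimal sketch of that grouping using only the standard library. The `servers_by_port` function and its tuple input are hypothetical; in a real tool you would fill the input from whatever your parser returns.

```rust
use std::collections::BTreeMap;

/// Group server names by listening port. A BTreeMap keeps the ports sorted.
fn servers_by_port(servers: &[(&str, Vec<u16>)]) -> BTreeMap<u16, Vec<String>> {
    let mut map: BTreeMap<u16, Vec<String>> = BTreeMap::new();
    for (name, ports) in servers {
        for port in ports {
            map.entry(*port).or_default().push(name.to_string());
        }
    }
    map
}

fn main() {
    // Same three servers as in the example config above.
    let servers = vec![
        ("http.example.com", vec![80u16]),
        ("https.example.com", vec![443]),
        ("dual.example.com", vec![80, 443]),
    ];

    for (port, names) in servers_by_port(&servers) {
        println!("{}: {}", port, names.join(", "));
    }
}
```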

Working with Locations and Proxies

Location blocks define how NGINX handles different URL paths. Let's see how to work with them:

Extracting Location Blocks

use nginx_discovery::NginxDiscovery;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        server {
            server_name api.example.com;

            location / {
                root /var/www/docs;
            }

            location /api/v1 {
                proxy_pass http://backend-v1:3000;
            }

            location /api/v2 {
                proxy_pass http://backend-v2:4000;
            }

            location ~ \.php$ {
                fastcgi_pass unix:/var/run/php-fpm.sock;
            }
        }
    "#;

    let discovery = NginxDiscovery::from_config_text(config)?;
    let servers = discovery.servers();

    if let Some(server) = servers.first() {
        println!("Locations for {}:\n", 
            server.primary_name().unwrap_or("unknown"));

        for location in &server.locations {
            println!("Path: {}", location.path);

            if location.is_proxy() {
                println!("  Type: Reverse Proxy");
                if let Some(upstream) = &location.proxy_pass {
                    println!("  Upstream: {}", upstream);
                }
            } else if location.is_static() {
                println!("  Type: Static Files");
                if let Some(root) = &location.root {
                    println!("  Root: {}", root.display());
                }
            } else {
                println!("  Type: Other");
            }

            println!();
        }
    }

    Ok(())
}

Output:

Locations for api.example.com:

Path: /
  Type: Static Files
  Root: /var/www/docs

Path: /api/v1
  Type: Reverse Proxy
  Upstream: http://backend-v1:3000

Path: /api/v2
  Type: Reverse Proxy
  Upstream: http://backend-v2:4000

Path: \.php$
  Type: Other
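
The last entry is a regex location. NGINX lets a location carry a modifier (`=`, `~`, `~*`, or `^~`) in front of the pattern, and the modifier changes how the path is matched. The sketch below classifies the arguments of a `location` directive accordingly. `MatchKind` and `parse_location` are made-up names for illustration, not nginx-discovery API, and the sketch assumes the modifier is separated from the pattern by whitespace.

```rust
#[derive(Debug, PartialEq)]
enum MatchKind {
    Prefix,          // location /path
    Exact,           // location = /path
    Regex,           // location ~ pattern   (case-sensitive)
    RegexIgnoreCase, // location ~* pattern  (case-insensitive)
    PrefixNoRegex,   // location ^~ /path    (prefix match that skips regex checks)
}

/// Classify the arguments of a `location` directive.
/// Returns None for an unknown modifier or an unexpected argument count.
fn parse_location(args: &[&str]) -> Option<(MatchKind, String)> {
    match args {
        [path] => Some((MatchKind::Prefix, path.to_string())),
        [modifier, pattern] => {
            let kind = match *modifier {
                "=" => MatchKind::Exact,
                "~" => MatchKind::Regex,
                "~*" => MatchKind::RegexIgnoreCase,
                "^~" => MatchKind::PrefixNoRegex,
                _ => return None,
            };
            Some((kind, pattern.to_string()))
        }
        _ => None,
    }
}

fn main() {
    for args in [vec!["/"], vec!["/api/v1"], vec!["~", r"\.php$"]] {
        println!("{:?} -> {:?}", args, parse_location(&args));
    }
}
```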

Finding All Proxy Locations

This is super useful for service discovery:

use nginx_discovery::NginxDiscovery;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = r#"
        server {
            location / {
                root /var/www;
            }

            location /auth {
                proxy_pass http://auth-service:5000;
            }

            location /users {
                proxy_pass http://user-service:5001;
            }
        }
    "#;

    let discovery = NginxDiscovery::from_config_text(config)?;

    // Get all proxy locations
    let proxies = discovery.proxy_locations();

    println!("Found {} proxy locations:\n", proxies.len());
    for proxy in proxies {
        // Skip any entry without a proxy_pass target instead of panicking on unwrap()
        if let Some(upstream) = &proxy.proxy_pass {
            println!("{} -> {}", proxy.path, upstream);
        }
    }

    Ok(())
}

Output:

Found 2 proxy locations:

/auth -> http://auth-service:5000
/users -> http://user-service:5001
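
For health checks you usually need the host and port rather than the whole URL. Here is a rough sketch that splits a `proxy_pass` target with the standard library only. The `Upstream` struct and `parse_proxy_pass` function are hypothetical helpers. The sketch gives up (returns `None`) on targets that contain variables, on `unix:` sockets, and on IPv6 literals without a port. Keep in mind that the host part may also be the name of an `upstream` block rather than a DNS name.

```rust
#[derive(Debug, PartialEq)]
struct Upstream {
    scheme: String,
    host: String,
    port: u16,
}

/// Split a proxy_pass target such as "http://auth-service:5000/path".
fn parse_proxy_pass(url: &str) -> Option<Upstream> {
    let (scheme, rest) = url.split_once("://")?;
    let default_port = match scheme {
        "http" => 80,
        "https" => 443,
        _ => return None,
    };

    // The authority (host[:port]) ends at the first '/'.
    let authority = rest.split('/').next()?;
    if authority.is_empty() || authority.contains('$') {
        return None;
    }

    let (host, port) = match authority.rsplit_once(':') {
        Some((h, p)) => (h, p.parse().ok()?),
        None => (authority, default_port),
    };

    Some(Upstream {
        scheme: scheme.to_string(),
        host: host.to_string(),
        port,
    })
}

fn main() {
    for target in ["http://auth-service:5000", "https://backend/api/"] {
        println!("{} -> {:?}", target, parse_proxy_pass(target));
    }
}
```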

Real-World Use Case: Building a Log Parser

Let's build something practical - a tool that discovers all NGINX log files and their formats, then helps you set up a centralized logging system.

The Problem

You have multiple NGINX servers with different configurations. You need to:

  1. Find all log files
  2. Understand their formats
  3. Configure a log collector (like Fluentd or Vector)

The Solution

Here's a complete example that reads an NGINX config and generates a log collection configuration:

Complete log discovery tool:

use nginx_discovery::NginxDiscovery;
use std::collections::HashMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Sample NGINX configuration
    let config = r#"
        http {
            log_format main '$remote_addr - $remote_user [$time_local] '
                            '"$request" $status $body_bytes_sent '
                            '"$http_referer" "$http_user_agent"';

            log_format json '{"remote_addr":"$remote_addr",'
                            '"time":"$time_local",'
                            '"request":"$request",'
                            '"status":$status}';

            access_log /var/log/nginx/access.log main;

            server {
                server_name api.example.com;
                access_log /var/log/nginx/api-access.log json;

                location /v1 {
                    access_log /var/log/nginx/api-v1.log main;
                    proxy_pass http://api-v1:3000;
                }
            }

            server {
                server_name web.example.com;
                access_log /var/log/nginx/web-access.log main;
            }
        }
    "#;

    let discovery = NginxDiscovery::from_config_text(config)?;

    // Get all log formats
    let formats = discovery.log_formats();
    let mut format_map: HashMap<String, Vec<String>> = HashMap::new();

    for format in &formats {
        format_map.insert(
            format.name().to_string(),
            format.variables().to_vec()
        );
    }

    // Get all access logs
    let logs = discovery.access_logs();

    println!("=== NGINX Log Discovery Report ===\n");

    println!("Found {} log formats:", formats.len());
    for format in &formats {
        println!("\nFormat: '{}'", format.name());
        println!("Variables: {}", format.variables().join(", "));
    }

    println!("\n\nFound {} log files:", logs.len());
    for log in &logs {
        println!("\nLog file: {}", log.path.display());
        println!("Context: {:?}", log.context);

        if let Some(format_name) = &log.format_name {
            println!("Format: {}", format_name);

            if let Some(vars) = format_map.get(format_name) {
                println!("Fields: {}", vars.join(", "));
            }
        } else {
            println!("Format: combined (default)");
        }
    }

    // Generate a sample Fluentd config
    println!("\n\n=== Sample Fluentd Configuration ===\n");

    for log in &logs {
        let source_name = log.path
            .file_stem()
            .and_then(|s| s.to_str())
            .unwrap_or("unknown");

        println!("<source>");
        println!("  @type tail");
        println!("  path {}", log.path.display());
        println!("  pos_file /var/log/fluentd/{}.pos", source_name);
        println!("  tag nginx.{}", source_name);

        if let Some(format_name) = &log.format_name {
            if format_name == "json" {
                println!("  <parse>");
                println!("    @type json");
                println!("  </parse>");
            } else {
                println!("  <parse>");
                println!("    @type nginx");
                println!("  </parse>");
            }
        }

        println!("</source>\n");
    }

    Ok(())
}

Output:

=== NGINX Log Discovery Report ===

Found 2 log formats:

Format: 'main'
Variables: remote_addr, remote_user, time_local, request, status, body_bytes_sent, http_referer, http_user_agent

Format: 'json'
Variables: remote_addr, time_local, request, status


Found 4 log files:

Log file: /var/log/nginx/access.log
Context: Http
Format: main
Fields: remote_addr, remote_user, time_local, request, status, body_bytes_sent, http_referer, http_user_agent

Log file: /var/log/nginx/api-access.log
Context: Server("api.example.com")
Format: json
Fields: remote_addr, time_local, request, status

Log file: /var/log/nginx/api-v1.log
Context: Location("/v1")
Format: main
Fields: remote_addr, remote_user, time_local, request, status, body_bytes_sent, http_referer, http_user_agent

Log file: /var/log/nginx/web-access.log
Context: Server("web.example.com")
Format: main
Fields: remote_addr, remote_user, time_local, request, status, body_bytes_sent, http_referer, http_user_agent


=== Sample Fluentd Configuration ===

<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd/access.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<source>
  @type tail
  path /var/log/nginx/api-access.log
  pos_file /var/log/fluentd/api-access.pos
  tag nginx.api-access
  <parse>
    @type json
  </parse>
</source>

<source>
  @type tail
  path /var/log/nginx/api-v1.log
  pos_file /var/log/fluentd/api-v1.pos
  tag nginx.api-v1
  <parse>
    @type nginx
  </parse>
</source>

<source>
  @type tail
  path /var/log/nginx/web-access.log
  pos_file /var/log/fluentd/web-access.pos
  tag nginx.web-access
  <parse>
    @type nginx
  </parse>
</source>

This tool automatically:

  1. ✅ Discovers all log files across your NGINX config
  2. ✅ Identifies which format each log uses
  3. ✅ Extracts all variables from each format
  4. ✅ Generates Fluentd configuration to collect all logs
  5. ✅ Handles both standard and JSON log formats
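
One case the sample config does not exercise is `access_log off;`, which disables logging for a context and should not produce a collector entry. The sketch below shows how the arguments of an `access_log` directive could be interpreted by hand. `AccessLog` and `parse_access_log` are hypothetical names, not nginx-discovery API, and paths are taken as-is (no special handling for `syslog:` targets or variables). When no format is named, NGINX falls back to its predefined `combined` format.

```rust
#[derive(Debug, PartialEq)]
enum AccessLog {
    Off,
    File { path: String, format: String },
}

/// Interpret the arguments of an `access_log` directive.
/// Extra parameters after the format (buffer=, gzip, flush=, if=) are ignored.
fn parse_access_log(args: &[&str]) -> Option<AccessLog> {
    match args {
        [] => None,
        ["off"] => Some(AccessLog::Off),
        [path] => Some(AccessLog::File {
            path: path.to_string(),
            format: "combined".to_string(), // NGINX default when no format is given
        }),
        [path, format, ..] => Some(AccessLog::File {
            path: path.to_string(),
            format: format.to_string(),
        }),
    }
}

fn main() {
    let samples = [
        vec!["off"],
        vec!["/var/log/nginx/access.log"],
        vec!["/var/log/nginx/api-access.log", "json", "buffer=32k"],
    ];
    for args in &samples {
        println!("{:?} -> {:?}", args, parse_access_log(args));
    }
}
```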

Extending the Tool

You could extend this further to:

  • Parse actual log files based on discovered formats
  • Send logs to different destinations based on context
  • Generate Elasticsearch index templates
  • Create Grafana dashboard definitions
  • Build log volume predictions
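
As a starting point for the first idea, here is a sketch of parsing one log line against a discovered format. It splits the format into literal and variable segments, then reads each variable's value up to the next literal. `Segment`, `split_format`, and `parse_line` are hypothetical names, and the approach is deliberately naive: it fails when two variables are adjacent and can mis-split when a value contains the literal text that follows it.

```rust
use std::collections::HashMap;

#[derive(Debug)]
enum Segment {
    Literal(String),
    Variable(String),
}

/// Split a log_format string into literal text and `$variable` segments.
fn split_format(format: &str) -> Vec<Segment> {
    let mut segments = Vec::new();
    let mut literal = String::new();
    let mut chars = format.chars().peekable();

    while let Some(c) = chars.next() {
        if c != '$' {
            literal.push(c);
            continue;
        }
        // Read the variable name that follows '$'.
        let mut name = String::new();
        while let Some(&n) = chars.peek() {
            if n.is_ascii_alphanumeric() || n == '_' {
                name.push(n);
                chars.next();
            } else {
                break;
            }
        }
        if name.is_empty() {
            literal.push('$'); // a lone '$' is kept as literal text
            continue;
        }
        if !literal.is_empty() {
            segments.push(Segment::Literal(std::mem::take(&mut literal)));
        }
        segments.push(Segment::Variable(name));
    }
    if !literal.is_empty() {
        segments.push(Segment::Literal(literal));
    }
    segments
}

/// Match one log line against the segments; None if the line does not fit.
fn parse_line(segments: &[Segment], line: &str) -> Option<HashMap<String, String>> {
    let mut fields = HashMap::new();
    let mut rest = line;

    for (i, segment) in segments.iter().enumerate() {
        match segment {
            Segment::Literal(lit) => rest = rest.strip_prefix(lit.as_str())?,
            Segment::Variable(name) => {
                // The value runs up to the next literal, or to the end of the line.
                let end = match segments.get(i + 1) {
                    Some(Segment::Literal(next)) => rest.find(next.as_str())?,
                    Some(Segment::Variable(_)) => return None, // ambiguous boundary
                    None => rest.len(),
                };
                fields.insert(name.clone(), rest[..end].to_string());
                rest = &rest[end..];
            }
        }
    }
    if rest.is_empty() { Some(fields) } else { None }
}

fn main() {
    let segments = split_format("$remote_addr - $remote_user [$time_local] \"$request\" $status");
    let line = "203.0.113.7 - alice [10/Oct/2024:13:55:36 +0000] \"GET /v1/users HTTP/1.1\" 200";
    if let Some(fields) = parse_line(&segments, line) {
        println!("request = {}", fields["request"]);
    }
}
```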

Conclusion

The nginx-discovery library makes it easy to programmatically work with NGINX configurations. Whether you're building DevOps tools, monitoring systems, or configuration management solutions, this library provides the foundation you need.

Key Takeaways

  1. NGINX configs are structured - Understanding blocks, directives, and contexts is essential
  2. Parsing is easy - The high-level API handles most common use cases
  3. Server extraction is powerful - v0.2.0 adds comprehensive server block support
  4. Real-world applications - From log analysis to service discovery, the possibilities are endless

What's Next?

Here are some ideas for what you could build:

  • 🔍 Configuration auditor - Check if all servers follow security best practices
  • 📊 Infrastructure mapper - Visualize your microservices architecture
  • 🔄 Config migration tool - Convert NGINX configs to Kubernetes Ingress
  • 📝 Documentation generator - Auto-generate docs from your configs
  • 🚨 Monitoring setup - Automatically configure monitoring for all services

Get Involved

Have ideas for improvements? Found a bug? Contributions are welcome! Check out the GitHub repository and feel free to open issues or submit pull requests.


Happy parsing! 🦀

If you found this tutorial helpful, please consider:

  • ⭐ Starring the GitHub repo
  • 📝 Sharing this article
  • 💬 Leaving a comment with your use case

What will you build with nginx-discovery? Let me know in the comments below!
