📋 Part of the OpenTelemetry ecosystem: The Filelog Receiver is a component of the OpenTelemetry Collector that transforms file-based logs into structured OpenTelemetry format. New to OpenTelemetry? Start with What is OpenTelemetry?
Traditional log collection ties you to specific backends. The OpenTelemetry Filelog Receiver breaks this lock-in by converting file logs into a vendor-neutral format, so you can send data to any compatible backend (Uptrace, Jaeger, Elasticsearch, Splunk).
This guide takes you from basic file tailing to production-grade log pipelines with parsing, multiline handling, and Kubernetes integration.
What is the Filelog Receiver?
The Filelog Receiver reads log files from disk, parses their content, and converts them into OpenTelemetry log records. It's part of the OpenTelemetry Collector Contrib distribution.
Why OpenTelemetry format?
OpenTelemetry provides a vendor-neutral standard for telemetry data. Instead of being locked into proprietary formats, your logs become portable:
- Vendor-neutral: Send logs to any backend without changing instrumentation
- Structured: Consistent format with timestamps, severity, attributes
- Correlated: Link logs to distributed traces and metrics
- Standardized: Part of CNCF OpenTelemetry specification
💡 Learn more: Read our OpenTelemetry Architecture guide to understand how components fit together.
Core capabilities:
- Tails log files continuously
- Handles log rotation automatically
- Parses JSON, regex, and structured formats
- Combines multiline entries (stack traces)
- Checkpoints file positions across restarts
- Enriches logs with metadata
How it works - the 4-step loop:
flowchart LR
A[1. Discover Files] --> B[2. Read Lines]
B --> C[3. Parse & Transform]
C --> D[4. Emit Logs]
D --> A
- Discover: Scans the filesystem using include/exclude patterns
- Read: Opens files and reads new lines based on the start_at setting
- Parse: Applies operators to structure raw text
- Emit: Sends structured logs to the pipeline
Quick Start: Your First Log Collection
📚 Prerequisites: This guide assumes you have the OpenTelemetry Collector installed. Need help getting started? See our Collector installation guide.
💡 Backend Configuration: Examples in this guide use Uptrace as the
observability backend. OpenTelemetry is vendor-neutral - you can send logs to
Jaeger, Grafana Cloud, Elasticsearch, Splunk, or any OTLP-compatible platform.
See backend examples at the end of this guide.
Let's start simple. Imagine your app writes JSON logs to /var/log/myapp.log:
{"time":"2025-10-06 14:30:00","level":"ERROR","message":"Connection failed","user_id":"123"}
Here's the minimal configuration:
receivers:
filelog:
include: [/var/log/myapp.log]
start_at: beginning
operators:
- type: json_parser
timestamp:
parse_from: attributes.time
layout: '%Y-%m-%d %H:%M:%S'
severity:
parse_from: attributes.level
exporters:
otlp/uptrace: # OTLP = OpenTelemetry Protocol
endpoint: api.uptrace.dev:4317
headers:
uptrace-dsn: 'YOUR_UPTRACE_DSN_HERE'
service:
pipelines:
logs:
receivers: [filelog]
exporters: [otlp/uptrace]
💡 About OTLP: OpenTelemetry Protocol (OTLP) is the standard protocol for transmitting telemetry data. Learn more about configuring exporters.
What happens:
- File is discovered and opened
- Each line is parsed as JSON
- time becomes the log timestamp
- level becomes severity
- All fields become attributes
Result: Structured logs sent to your OpenTelemetry APM.
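For the JSON line above, the emitted record looks roughly like this (field names follow the OpenTelemetry log data model; the exact rendering depends on your backend or the debug exporter):
Timestamp:    2025-10-06 14:30:00 UTC
SeverityText: ERROR
Body:         {"time":"2025-10-06 14:30:00","level":"ERROR","message":"Connection failed","user_id":"123"}
Attributes:   time, level, message, user_id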
Understanding Operators
Operators are the building blocks of log processing. Each operator does one thing well:
Raw Log → [Operator 1] → [Operator 2] → [Operator 3] → Structured Log
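Operators run in the order they are listed, each passing its output to the next. A minimal sketch of a three-step chain (the team attribute and file path are placeholders, not required settings):
receivers:
  filelog:
    include: [/var/log/app.log]
    operators:
      - type: json_parser # 1. structure the raw JSON line into attributes
      - type: add # 2. enrich the parsed record
        field: attributes.team
        value: payments
      - type: move # 3. promote a parsed field into the log body
        from: attributes.message
        to: body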
Common Operators
JSON Parser - for structured JSON logs:
receivers:
filelog:
include: [/var/log/app/*.json]
operators:
- type: json_parser
timestamp:
parse_from: attributes.timestamp
layout: '%Y-%m-%dT%H:%M:%S.%fZ'
Regex Parser - for text logs:
receivers:
filelog:
include: [/var/log/app.log]
operators:
- type: regex_parser
regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(?P<level>\w+)\] (?P<message>.*)$'
timestamp:
parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S' # strptime-style layout (the parser's default layout_type)
severity:
parse_from: attributes.level
Example log:
2025-10-06 14:30:00 [ERROR] Database connection timeout
Add Attributes - enrich with metadata:
operators:
- type: add
field: attributes.environment
value: production
- type: add
field: attributes.service.name
value: payment-api
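Note that by OpenTelemetry semantic conventions service.name is normally a resource attribute rather than a per-log attribute; if your backend expects it there, the same add operator can target the resource instead:
operators:
  - type: add
    field: resource["service.name"]
    value: payment-api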
Container Parser - for Kubernetes (simplified):
receivers:
filelog:
include: [/var/log/pods/*/*/*.log]
include_file_path: true
operators:
- type: container # Handles all container formats automatically
🔗 Kubernetes Logs: For complete Kubernetes log collection, see our Kubernetes Log Management guide.
Automatically extracts:
- Kubernetes namespace, pod, container names
- Container runtime format (CRI-O, containerd, Docker)
- stdout/stderr stream
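For reference, a containerd (CRI) log line on disk looks roughly like this; the container operator strips the runtime prefix and keeps only the application message as the log body:
2025-10-06T14:30:00.123456789Z stdout F {"level":"info","msg":"server started"}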
Multiline Logs: Stack Traces
Stack traces span multiple lines. Without configuration, each line becomes a separate log:
2025-10-06 14:30:00 ERROR Exception occurred
at com.example.App.main(App.java:42)
at java.lang.Thread.run(Thread.java:829)
Solution: Use multiline to combine them:
receivers:
filelog:
include: [/var/log/app/errors.log]
multiline:
line_start_pattern: '^\d{4}-\d{2}-\d{2}' # New entry starts with date
operators:
- type: regex_parser
        regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>\w+) (?P<message>[\s\S]*)$'
Now the entire stack trace is captured as one log record.
Common patterns:
# Java exceptions
line_start_pattern: '^(\d{4}-\d{2}-\d{2}|Exception|Caused by:)'
# Python tracebacks
line_start_pattern: '^(Traceback|\d{4}-\d{2}-\d{2})'
# Go panics
line_start_pattern: '^(panic:|\d{4}/\d{2}/\d{2})'
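The multiline setting operates on raw file lines. When every physical line is wrapped by the container runtime (as with Kubernetes logs), you can instead stitch entries back together after parsing with the recombine operator; a sketch, assuming application logs that start with a date:
receivers:
  filelog:
    include: [/var/log/pods/*/*/*.log]
    operators:
      - type: container # unwrap the runtime format first
      - type: recombine # then merge continuation lines into one record
        combine_field: body
        is_first_entry: body matches "^[0-9]{4}-[0-9]{2}-[0-9]{2}"
        combine_with: "\n"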
Kubernetes Log Collection
🔗 Integration Tip: For automated deployment in Kubernetes, use the OpenTelemetry Operator which simplifies Collector management and auto-instrumentation.
Simple Method (Recommended)
Use the container operator for automatic parsing:
receivers:
filelog:
include: [/var/log/pods/*/*/*.log]
include_file_path: true
operators:
- type: container
processors:
batch:
timeout: 10s
exporters:
otlp/uptrace:
endpoint: api.uptrace.dev:4317
headers:
uptrace-dsn: 'YOUR_UPTRACE_DSN_HERE'
service:
pipelines:
logs:
receivers: [filelog]
processors: [batch]
exporters: [otlp/uptrace]
This automatically:
- Detects container runtime format
- Parses timestamps and metadata
- Handles partial log lines
- Extracts Kubernetes metadata
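The container operator takes namespace, pod, and container names from the log file path. If you also want pod labels, deployment names, or node names, the k8sattributes processor can enrich the same pipeline; a sketch (it needs RBAC permissions to query the Kubernetes API and should be listed before batch in the processors list):
processors:
  k8sattributes:
    auth_type: serviceAccount
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.deployment.name
        - k8s.node.name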
DaemonSet Deployment
Deploy as DaemonSet to collect from all nodes:
📖 Full Kubernetes Setup: See our complete Kubernetes Monitoring with OpenTelemetry guide for cluster-wide observability.
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: otel-collector
namespace: monitoring
spec:
selector:
matchLabels:
app: otel-collector
template:
metadata:
labels:
app: otel-collector
spec:
containers:
- name: otel-collector
image: otel/opentelemetry-collector-contrib:0.136.0
volumeMounts:
- name: varlog
mountPath: /var/log
readOnly: true
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
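The manifest above only covers the log volumes. In practice the container also needs the Collector configuration mounted, typically from a ConfigMap (the otel-collector-config name below is just a placeholder), and passed via --config:
# Additional pieces for the DaemonSet above (sketch)
containers:
  - name: otel-collector
    args: ['--config=/etc/otelcol/config.yaml']
    volumeMounts:
      - name: otel-config
        mountPath: /etc/otelcol
volumes:
  - name: otel-config
    configMap:
      name: otel-collector-config # placeholder ConfigMap holding the Collector config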
State Persistence: Never Lose Logs
Without persistence, restarting the Collector means:
- start_at: beginning → Re-read everything (duplicates!)
- start_at: end → Miss logs written during downtime
Solution: Checkpointing with storage extension:
extensions:
file_storage:
directory: /var/lib/otelcol/file_storage
timeout: 10s
receivers:
filelog:
include: [/var/log/app.log]
storage: file_storage # Link to storage extension
service:
extensions: [file_storage]
pipelines:
logs:
receivers: [filelog]
exporters: [otlp/uptrace]
What happens:
- File positions saved to disk
- On restart, reading resumes from last position
- No duplicates, no data loss
Essential for production!
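When running as the Kubernetes DaemonSet shown earlier, the checkpoint directory itself must survive pod restarts; one common approach (a sketch) is a hostPath volume mounted at the file_storage directory:
containers:
  - name: otel-collector
    volumeMounts:
      - name: otelcol-state
        mountPath: /var/lib/otelcol/file_storage
volumes:
  - name: otelcol-state
    hostPath:
      path: /var/lib/otelcol/file_storage
      type: DirectoryOrCreate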
Real-World Examples
NGINX Access Logs
receivers:
filelog/nginx:
include: [/var/log/nginx/access.log]
operators:
- type: regex_parser
regex: '^(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" (?P<status>\d{3}) (?P<bytes_sent>\d+)'
timestamp:
parse_from: attributes.time
          layout: '02/Jan/2006:15:04:05 -0700'
          layout_type: gotime # Go-style layout; the operator's default layout_type is strptime
severity:
parse_from: attributes.status
          mapping:
            warn: "401" # map 401 responses to WARN
            error: 5xx  # any 5xx response becomes ERROR
Handles:
192.168.1.1 - - [06/Oct/2025:14:30:00 +0000] "GET /api/users HTTP/1.1" 200 1234
PostgreSQL Logs
receivers:
filelog/postgres:
include: [/var/log/postgresql/*.log]
multiline:
line_start_pattern: '^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'
operators:
- type: regex_parser
regex: '^(?P<timestamp>\S+ \S+ \w+) \[(?P<pid>\d+)\] (?P<level>\w+): (?P<message>[\s\S]*)$'
timestamp:
parse_from: attributes.timestamp
          layout: '2006-01-02 15:04:05.000 MST'
          layout_type: gotime # Go-style layout; the operator's default layout_type is strptime
severity:
parse_from: attributes.level
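A line like the following (a representative example, not real server output) matches the pattern above:
2025-10-06 14:30:00.123 UTC [4242] ERROR: relation "users" does not exist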
Application with Trace Context
receivers:
filelog/app:
include: [/var/log/app/*.json]
operators:
- type: json_parser
timestamp:
parse_from: attributes.timestamp
layout: '%Y-%m-%dT%H:%M:%S.%fZ'
trace:
trace_id:
parse_from: attributes.trace_id
span_id:
parse_from: attributes.span_id
Log example:
{
"timestamp": "2025-10-06T14:30:00.123Z",
"level": "error",
"msg": "Payment failed",
"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
"span_id": "00f067aa0ba902b7"
}
Trace context links logs to distributed traces!
Troubleshooting
Logs Not Appearing
Problem: No logs show up in your backend.
Check 1: Permissions
# Verify collector can read files
ls -la /var/log/myapp.log
# Check collector process user
ps aux | grep otelcol
Fix:
# Add collector user to log group
usermod -a -G adm otelcol
# Or adjust permissions
chmod 644 /var/log/myapp.log
Check 2: start_at Setting
Default is end (only new logs). For testing, use:
start_at: beginning
Check 3: File Glob Pattern
# Test your pattern
ls /var/log/pods/*/*/*.log
# Verify files exist
find /var/log -name "*.log"
Regex Not Matching
Debug with test:
# Test pattern locally
echo "2025-10-06 14:30:00 ERROR Test" | grep -P '^\d{4}-\d{2}-\d{2}'
Add debug exporter:
exporters:
debug:
verbosity: detailed
service:
pipelines:
logs:
receivers: [filelog]
exporters: [debug]
Check logs:
journalctl -u otelcol -f | grep -i error
Multiline Not Working
Common mistake:
# Bad - pattern too specific
multiline:
line_start_pattern: '^2025-10-06'
# Good - flexible pattern
multiline:
line_start_pattern: '^\d{4}-\d{2}-\d{2}'
High Memory Usage
Cause: Too many files or large entries.
Solutions:
- Exclude unnecessary files:
exclude:
- /var/log/debug-*.log
- /var/log/archive/*.log
- Limit entry size:
receivers:
filelog:
max_log_size: 1MiB
- Add memory limiter:
processors:
memory_limiter:
check_interval: 1s
limit_mib: 512
Performance Optimization
Batching
Reduce network calls:
processors:
batch:
timeout: 10s
send_batch_size: 1024
service:
pipelines:
logs:
receivers: [filelog]
processors: [batch]
exporters: [otlp/uptrace]
Guidelines:
- Low latency: timeout: 1-5s
- High volume: send_batch_size: 1024-2048
- Balance both for production
Resource Limits
processors:
memory_limiter:
check_interval: 1s
limit_mib: 512
spike_limit_mib: 128
batch:
timeout: 10s
Important: The memory limiter must be the first processor in the pipeline!
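In practice that means listing it ahead of every other processor, for example:
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [memory_limiter, batch] # memory_limiter first, then batch
      exporters: [otlp/uptrace]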
Polling Interval
receivers:
filelog:
poll_interval: 500ms # Default: 200ms
- Lower: More CPU, faster detection
- Higher: Less CPU, higher latency
- Default (200ms) good for most cases
Advanced Features
Compressed Files
receivers:
filelog:
include: [/var/log/archive/*.log.gz]
compression: auto # or 'gzip'
Header Metadata
receivers:
filelog:
include: [/var/log/app.log]
start_at: beginning
header:
pattern: '^# METADATA:.*$'
metadata_operators:
- type: regex_parser
regex: 'version="(?P<app_version>[^"]+)"'
File example:
# METADATA: version="1.2.3" env="production"
2025-10-06 14:30:00 INFO App started
Metadata added to all logs in the file.
Router Operator
Route logs based on content:
receivers:
filelog:
include: [/var/log/app.log]
operators:
- type: json_parser
- type: router
routes:
- output: error_logs
expr: 'attributes.level == "ERROR"'
- output: default_logs
expr: 'true'
- id: error_logs
type: add
field: attributes.alert
value: true
- id: default_logs
type: noop
FAQ
- Why aren't my logs appearing?
  Possible causes include:
  - The Collector can't read the files or directories due to permission issues.
  - start_at: end only reads new logs (use beginning for testing).
  - The glob pattern may not match the files specified in the include setting.
- How do I handle log rotation?
  Log rotation is handled automatically. The receiver tracks files by fingerprint, not filename. When app.log rotates to app.log.1, it finishes reading the old file and starts the new one.
- What's the difference between attributes and resource?
  - Attributes: Log-level metadata (varies per log)
  - Resource: Service-level metadata (shared across all logs from the same service)
  operators:
    - type: add
      field: attributes.request_id # Per-log
      value: 'req-123'
    - type: add
      field: resource["service.name"] # Per-service
      value: 'api'
Can I parse logs from multiple services?
Yes. Use multiple receivers:
receivers:
filelog/app1:
include: [/var/log/app1/*.log]
operators:
- type: add
field: attributes.service.name
value: app1
filelog/app2:
include: [/var/log/app2/*.log]
operators:
- type: add
field: attributes.service.name
value: app2
service:
pipelines:
logs:
receivers: [filelog/app1, filelog/app2]
exporters: [otlp/uptrace]
How do I test regex patterns?
Use Regex101 with the "Golang" flavor to match the Collector's regex engine.
What if I have mixed compressed and uncompressed files?
receivers:
filelog:
include: [/var/log/app/*.log, /var/log/app/*.log.gz]
compression: auto # Auto-detects .gz files
- How do I move attributes to resource?
operators:
- type: move
from: attributes["file.name"]
to: resource["log.file.name"]
- Can I filter logs before sending? Yes. Use a filter processor:
processors:
filter:
logs:
exclude:
match_type: regexp
record_attributes:
- key: level
value: DEBUG
- How do I parse key-value logs?
operators:
  - type: key_value_parser # regex_parser requires named capture groups, so use the dedicated key-value operator
    parse_from: body
    delimiter: '='
Example: user=john status=200 duration=123ms
- What's the max file size limit?
  There's no hard limit, but you can use max_log_size to prevent memory issues:
  receivers:
    filelog:
      max_log_size: 1MiB # Per log entry
Backend Examples
This guide uses Uptrace in examples, but OpenTelemetry works with any OTLP-compatible backend. Here are quick configuration examples for other platforms:
Uptrace
exporters:
otlp/uptrace:
endpoint: api.uptrace.dev:4317
headers:
uptrace-dsn: 'YOUR_UPTRACE_DSN'
Grafana Cloud
exporters:
otlp:
endpoint: otlp-gateway.grafana.net:443
headers:
authorization: "Bearer YOUR_GRAFANA_TOKEN"
Jaeger
exporters:
otlp:
endpoint: jaeger-collector:4317
tls:
insecure: true
Datadog
exporters:
otlp:
endpoint: trace.agent.datadoghq.com:4317
headers:
dd-api-key: "YOUR_DATADOG_API_KEY"
New Relic
exporters:
otlp:
endpoint: otlp.nr-data.net:4317
headers:
api-key: "YOUR_NEW_RELIC_LICENSE_KEY"
Prometheus (metrics only)
exporters:
prometheus:
endpoint: "0.0.0.0:8889"
Multiple Backends
Send data to multiple platforms simultaneously:
exporters:
otlp/uptrace:
endpoint: api.uptrace.dev:4317
headers:
uptrace-dsn: 'YOUR_UPTRACE_DSN'
otlp/datadog:
endpoint: trace.agent.datadoghq.com:4317
headers:
dd-api-key: "YOUR_DATADOG_API_KEY"
service:
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [otlp/uptrace, otlp/datadog] # Send to both!
More backends: See the OpenTelemetry Vendors directory for 40+ compatible platforms.