Collecting logs is only half the job. The real value comes from turning logs into insights.
If you are running Nginx on a server, you already have everything you need to understand:
- How many people visit your site
- Which pages are popular
- Where traffic comes from
- How bots and crawlers behave
- How much bandwidth you consume
In this article, we will explore five practical ways to analyze Nginx logs:
- GoAccess directly on the server
- GoAccess from rotated logs
- GoAccess from S3-backed logs
- Third-party log analytics services
- Custom analytics using Python
1. What Is GoAccess?
GoAccess is a fast, open-source, terminal-based web log analyzer that can also generate HTML dashboards.
Key features:
- Real-time and offline analysis
- Works directly with Nginx access logs
- No JavaScript tracking on users
- Can visualize bots, visitors, requests, and bandwidth
- Produces a single static HTML file
GoAccess reads raw log files and generates analytics — no instrumentation required.
2. Understanding Nginx Log Format (Important)
Before using GoAccess, confirm your log format.
Most Nginx installations use the combined format:
```nginx
log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';
```
This matters because GoAccess must be told the exact format; a mismatch produces empty or garbled reports.
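A quick way to confirm your log lines really are in combined format is to parse one yourself. A minimal sketch in Python — the regex mirrors the combined format above, and the sample line is made up for illustration:

```python
import re

# Regex for the Nginx "combined" log format shown above.
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

# An illustrative sample line; paste one from your own access.log instead.
sample = ('203.0.113.7 - - [16/Dec/2025:10:15:32 +0000] '
          '"GET /index.html HTTP/1.1" 200 5320 '
          '"https://example.com/" "Mozilla/5.0"')

m = COMBINED.match(sample)
if m:
    print(m.group("status"), m.group("request"))
else:
    print("Line does not match the combined format")
```

If a real line from your log does not match, adjust GoAccess's `--log-format` accordingly.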
3. Option 1 — GoAccess Directly from /var/log/nginx
This is the simplest and most common setup.
Installation
```bash
sudo apt install goaccess
```
Verify:
```bash
goaccess --version
```
Generate an HTML Report
```bash
sudo goaccess /var/log/nginx/access.log \
  --log-format=COMBINED \
  --date-format=%d/%b/%Y \
  --time-format=%H:%M:%S \
  -o /var/www/html/nginx_report.html
```
Open in browser:
http://your-domain/nginx_report.html
Advantages
- Fast
- No data movement
- Great for quick diagnostics
Limitations
- Only current log
- Loses history after rotation
- Not suitable for long-term analysis
4. Option 2 — GoAccess from Rotated Logs
To analyze historical data, include rotated logs. Note the `-f` flag: it lets zcat pass the uncompressed current log through unchanged instead of failing on it:
```bash
sudo zcat -f /var/log/nginx/access.log*.gz /var/log/nginx/access.log | \
  goaccess --log-format=COMBINED \
  --date-format=%d/%b/%Y \
  --time-format=%H:%M:%S \
  -o /var/www/html/nginx_full_report.html
```
This gives you multi-day analytics.
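If you'd rather slice the same multi-day data yourself, here is a minimal Python sketch. The `requests_per_day` helper and its day-extraction logic are my own illustration, not part of GoAccess:

```python
import glob
import gzip
from collections import Counter

def open_log(path):
    # Rotated logs are gzip-compressed; the current log is plain text.
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)

def requests_per_day(paths):
    days = Counter()
    for path in paths:
        with open_log(path) as f:
            for line in f:
                # $time_local looks like [16/Dec/2025:10:15:32 +0000];
                # the day is everything between "[" and the first ":".
                start = line.find("[")
                if start != -1:
                    days[line[start + 1:line.index(":", start)]] += 1
    return days

# Example usage against the standard log directory:
# for day, count in requests_per_day(
#         sorted(glob.glob("/var/log/nginx/access.log*"))).most_common():
#     print(day, count)
```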
5. Option 3 — GoAccess from S3 Backups (Recommended for Production)
If logs are backed up to S3, you can analyze them without touching the server.
Step 1: Download logs from S3
```bash
aws s3 sync s3://ngnix-logs/16-12-2025/ip-172-31-44-115/ ./logs/
```
Step 2: Analyze with GoAccess
```bash
zcat logs/access.log*.gz | \
  goaccess --log-format=COMBINED \
  --date-format=%d/%b/%Y \
  --time-format=%H:%M:%S \
  -o nginx_report_dec_2025.html
```
This can be done:
- On your laptop
- On a CI server
- On a reporting EC2 instance
Why This Is Powerful
- Zero load on production server
- Unlimited historical analysis
- Easy monthly or yearly reports
- Cheap (S3 storage is low-cost)
6. Handling Bots and Crawlers in GoAccess
GoAccess automatically detects known bots.
You can:
- View bots and humans together (the default)
- Exclude bots with `--ignore-crawlers`
For transparency, do not ignore crawlers unless you explicitly need human-only stats.
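As a rough illustration of what crawler detection looks like, here is a simplified user-agent check in Python. The `BOT_HINTS` list is a toy assumption; GoAccess ships a much larger crawler database:

```python
# A simplified bot heuristic -- GoAccess's actual list is far more complete.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")

def looks_like_bot(user_agent):
    ua = user_agent.lower()
    return any(hint in ua for hint in BOT_HINTS)

print(looks_like_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))       # True
print(looks_like_bot("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"))   # False
```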
7. Option 4 — Third-Party Log Analytics Services
If you do not want to manage infrastructure, consider hosted solutions.
Popular Options
- Datadog
- ELK Stack (Elastic Cloud)
- Splunk
- Sumo Logic
- CloudWatch Logs + Insights
How They Work
- Logs are shipped via agent or pipeline
- Data is indexed
- Queries and dashboards are built
Pros
- Powerful querying
- Alerting
- Correlation across services
Cons
- Cost
- Vendor lock-in
- Complexity for small projects
For early-stage or low-traffic projects, GoAccess is often sufficient.
8. Option 5 — Python-Based Log Analytics
If you want custom analytics, Python is a great choice.
Example: Counting Page Views per URL
```python
import gzip
from collections import Counter

counter = Counter()

with gzip.open("access.log.2.gz", "rt") as f:
    for line in f:
        # $request is the first quoted field: "GET /path HTTP/1.1"
        parts = line.split('"')
        if len(parts) > 1:
            request = parts[1].split(" ")
            if len(request) > 1:          # skip malformed request lines
                counter[request[1]] += 1

for path, count in counter.most_common(10):
    print(path, count)
```
When Python Makes Sense
- Custom metrics
- Feeding data to ML models
- Exporting to CSV or databases
- Advanced filtering
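The CSV export case above takes only a few lines. A sketch — the `write_top_paths` helper name and the column headers are my own choice:

```python
import csv
from collections import Counter

def write_top_paths(counter, out_path, n=10):
    # Persist the most-requested paths so they can be loaded into a
    # spreadsheet or database later.
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "requests"])
        writer.writerows(counter.most_common(n))

# Example with illustrative counts; in practice, pass the Counter
# built from your access logs.
write_top_paths(Counter({"/": 120, "/about": 30, "/feed": 7}), "top_paths.csv")
```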
When It Doesn’t
- Real-time dashboards
- Visualization (GoAccess is better)
9. Automating GoAccess Reports
You can generate daily or monthly reports using cron.
Example (the schedule `0 1 1 * *` runs at 01:00 on the first day of each month):
```bash
0 1 1 * * zcat /var/log/nginx/access.log*.gz | goaccess \
  --log-format=COMBINED \
  --date-format=%d/%b/%Y \
  --time-format=%H:%M:%S \
  -o /var/www/html/nginx_report_last_month.html
```
10. Verifying Analytics Accuracy
Always sanity-check:
- Total requests vs AWS data transfer
- Page size × visits ≈ bandwidth
- Bot traffic vs real users
- Status code distribution
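The "page size × visits ≈ bandwidth" check is simple arithmetic. A sketch with made-up numbers — substitute your own averages from GoAccess:

```python
# Rough sanity check: does served bandwidth line up with what the
# billing dashboard reports? The numbers below are illustrative.
avg_page_bytes = 150 * 1024        # ~150 KB per page, including assets
requests = 40_000                  # monthly requests from GoAccess
expected_gb = avg_page_bytes * requests / 1024**3
print(f"Expected transfer: {expected_gb:.2f} GB")  # compare with AWS data transfer
```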
Logs never lie — dashboards sometimes do.
11. Choosing the Right Approach
| Scenario | Best Option |
|---|---|
| Small server | GoAccess local |
| Historical analysis | GoAccess + rotated logs |
| Production | GoAccess from S3 |
| Enterprise | Third-party tools |
| Custom logic | Python |
Final Thoughts
Analytics does not require invasive tracking scripts.
Your server already knows:
- Who visited
- What they requested
- How much data you served
By combining:
- Proper log rotation
- Reliable backups
- GoAccess analytics
…you gain visibility, cost awareness, and operational confidence with minimal complexity.
Logs are not just records.
They are observability without guesswork.