This Is What’s Really Hitting Your Website (Hint: Not People)
I wanted to understand how much of our traffic was actually human, so I pulled and analyzed 48 hours of raw request logs.
No filters, no analytics layer, just direct log data.
Time Frame
Start: 2026-03-31 10:00 UTC
End: 2026-04-02 10:00 UTC
All requests within that window were grouped by path patterns and behavior.
Traffic Breakdown
Requests were classified into four categories:
- WordPress probing (paths containing wp)
- XMLRPC access attempts
- PHP endpoint probing
- General scanning and enumeration
Results:
- WordPress probes: 34 percent
- XMLRPC attempts: 18 percent
- PHP probes: 27 percent
- Other scanning: 21 percent
Taken together, roughly 79 percent of all requests in the window were not normal user activity; the percentages above describe how that automated traffic was distributed across the four categories.
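The path-pattern side of this classification can be sketched in a few lines. This is a minimal illustration, not the actual classifier: the pattern lists are hypothetical, and the real grouping also looked at behavior, not just paths.

```python
import re
from collections import Counter

# Illustrative patterns mirroring the four buckets above.
# Order matters: xmlrpc.php must be checked before the generic PHP rule.
CATEGORIES = [
    ("wordpress", re.compile(r"wp-login|wp-admin|wp-content|/wp")),
    ("xmlrpc",    re.compile(r"xmlrpc\.php")),
    ("php",       re.compile(r"\.php$|/\.env")),
]

def classify(path: str) -> str:
    """Return the first matching category, or 'other' for general scanning."""
    for name, pattern in CATEGORIES:
        if pattern.search(path):
            return name
    return "other"

def breakdown(paths):
    """Percentage of requests per category, rounded to whole percent."""
    counts = Counter(classify(p) for p in paths)
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}
```

Running `breakdown` over the request paths from a raw access log yields the kind of distribution shown above.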
Sample of Active IPs
Below is a subset of IPs with the highest request volume or repeated attack patterns during the window:
- 185.220.101.45: WordPress login brute force patterns
- 45.146.165.12: XMLRPC pingback attempts
- 103.248.70.33: PHP endpoint scanning
- 91.134.23.198: Multi-path probing (/admin, /login, /.env)
- 176.65.148.92: High-frequency requests consistent with botnet behavior
- 198.54.117.210: Credential stuffing attempts
- 5.188.62.76: Known scanner signature patterns
- 194.147.142.88: Repeated wp-login hits
- 212.83.150.120: PHPMyAdmin probing
- 139.59.37.12: Generic crawler with attack signatures
Many of these generated hundreds to thousands of requests over the 48 hour period.
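Finding these top talkers requires nothing more than counting requests per source address. A minimal sketch, assuming an Apache/nginx-style access log where the client IP is the first whitespace-separated field on each line:

```python
from collections import Counter

def top_talkers(log_lines, n=10):
    """Return the n source IPs with the highest request counts."""
    ips = Counter()
    for line in log_lines:
        # First field of a combined-format access log line is the client IP.
        ip = line.split(maxsplit=1)[0]
        ips[ip] += 1
    return ips.most_common(n)
```

Anything that tops this list with hundreds or thousands of requests over 48 hours is worth a closer look at its request paths.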
Observed Attack Patterns
WordPress Probing
Even on non-WordPress systems, these paths were repeatedly hit:
- /wp-login.php
- /wp-admin/
- /wp-content/plugins/
This is automated scanning, not targeted behavior.
XMLRPC Access
Frequent hits on:
/xmlrpc.php
Common uses include pingback abuse and brute force via API endpoints.
PHP File Probing
Requests targeting common configuration and entry points:
- /index.php
- /config.php
- /.env
- /db.php

These requests are looking for exposed configs or weak deployments.
Credential Stuffing
Repeated requests to:
- /login
- /admin
- /api/auth
Often with high frequency and rotating IPs.
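One rough signal for the rotating-IP pattern is counting distinct source addresses hitting a login endpoint within a short window. This is an illustrative sketch, not the detection actually used; the path list, window size, and threshold are assumptions.

```python
from collections import defaultdict

# Hypothetical endpoints of interest, matching the paths listed above.
LOGIN_PATHS = {"/login", "/admin", "/api/auth"}

def stuffing_signals(requests, window=300, min_ips=20):
    """Flag (path, window) buckets hit by many distinct IPs.

    requests: iterable of (unix_timestamp, ip, path) tuples.
    A login path touched by >= min_ips distinct addresses inside one
    `window`-second bucket is a crude credential-stuffing signal.
    """
    buckets = defaultdict(set)  # (path, window_index) -> set of IPs
    for ts, ip, path in requests:
        if path in LOGIN_PATHS:
            buckets[(path, int(ts // window))].add(ip)
    return {key: len(ips) for key, ips in buckets.items() if len(ips) >= min_ips}
```

A single user retrying a password never trips this; twenty-five addresses hammering /login in five minutes does.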
What This Means
If you rely on standard analytics:
- Traffic volume may be inflated
- Engagement metrics may be misleading
- Infrastructure may be handling unnecessary load
More importantly, this traffic is constant. It is not tied to visibility or popularity. Any exposed service will receive it.
Internal Response
After seeing this across multiple systems, we started aggregating this data instead of treating each site in isolation.
The approach:
- Track IPs across multiple deployments
- Classify behavior based on request patterns
- Identify repeat offenders
- Apply blocking rules based on shared observations
This evolved into a simple shared threat dataset.
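At its core, the shared dataset is a merge of per-site observations keyed by IP. The sketch below shows the idea; the report shape, site names, and category labels are illustrative, not the actual schema.

```python
from collections import defaultdict

def merge_observations(per_site_reports):
    """Merge per-site IP observations into one shared dataset.

    per_site_reports: {site_name: [{"ip": ..., "category": ..., "hits": ...}]}
    Returns {ip: {"sites": set, "categories": set, "hits": int}}.
    """
    shared = defaultdict(lambda: {"sites": set(), "categories": set(), "hits": 0})
    for site, observations in per_site_reports.items():
        for obs in observations:
            entry = shared[obs["ip"]]
            entry["sites"].add(site)          # which deployments saw it
            entry["categories"].add(obs["category"])  # what it was doing
            entry["hits"] += obs["hits"]      # total volume across sites
    return shared
```

An IP probing wp-login on one site and xmlrpc on another collapses into a single entry with both behaviors attached.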
Threat Network Concept
Instead of reacting per site:
- An IP flagged on one system is known to others
- Patterns such as WordPress probing or XMLRPC abuse are categorized
- Repeated behavior increases confidence in classification
- Blocking decisions become faster and more consistent
This reduces duplicate analysis and speeds up mitigation.
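The "repeated behavior increases confidence" idea can be expressed as a simple score over a shared-dataset entry. The weights and threshold below are made up for illustration; the point is that independent sightings across systems count for more than raw volume on one.

```python
def confidence(entry):
    """Score an entry of shape {"sites": set, "categories": set, "hits": int}."""
    score = 0.0
    score += min(len(entry["sites"]) * 0.3, 0.6)       # seen on multiple systems
    score += min(len(entry["categories"]) * 0.1, 0.2)  # varied attack behavior
    score += min(entry["hits"] / 1000, 0.2)            # sheer request volume
    return round(min(score, 1.0), 2)

def should_block(entry, threshold=0.7):
    """Block only once enough independent evidence has accumulated."""
    return confidence(entry) >= threshold
```

An address seen probing on three deployments with thousands of requests clears the bar immediately; a one-off scanner on a single site does not.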
Outcome
After applying filtering based on this data:
- Cleaner traffic metrics
- Reduced unnecessary requests
- Lower noise in logs
- Better visibility into actual users
Closing
The main takeaway from this dataset is straightforward.
A large portion of inbound traffic to public web services is automated and non-user driven.
This data is from a limited 48 hour window across a small set of systems. Patterns may vary, but the presence of automated scanning is consistent.
If you are interested in testing this type of visibility or contributing additional data points, I am running a small beta around this approach.