3 things you might see in your logs once your site is public

Danny Perez ・4 min read

You've finished deploying your website to its new domain. You start to see your normal user traffic, but then you also notice funny patterns in your access logs. Here are a few examples of things you might see in your access logs once your site or API is public in production.

  1. Automated vulnerability scanners from all over the world
  2. Crawlers
  3. SQL Injection attempts

Scanners

You know what regularly shows up? Scanners. Lots and lots of open source scanners, from IPs originating all over the world: China, Ukraine, Russia. Randomly throughout the week, we'll see scanner traffic. What does scanner traffic look like?

There are a few signs you can use to tell:

  • You see slightly elevated error rates, a mix of 4xx's and 5xx's, for a sustained period of ~30 minutes, and then the traffic suddenly dies off.
  • The requests target paths that don't exist in your application.
  • Some scanners send a User-Agent header that identifies them as a scanner.
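
The first two signs can be checked mechanically. Here is a minimal sketch, assuming access log lines shaped like the examples in this post; the function name, the regex, and the thresholds are illustrative assumptions you would tune for your own traffic.

```python
import re
from collections import defaultdict

# Pulls the client IP and status code out of an access log line that has
# a quoted request ("METHOD path PROTOCOL") followed by the status code.
LOG_RE = re.compile(r'^(?P<ip>\S+) .*?"[A-Z]+ \S+ [^"]*" (?P<status>\d{3}) ')


def suspicious_clients(lines, min_requests=20, error_ratio=0.8):
    """Return the IPs whose requests are mostly 4xx/5xx responses.

    min_requests and error_ratio are arbitrary starting points, not
    recommended values.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip = match["ip"]
        totals[ip] += 1
        if match["status"][0] in "45":
            errors[ip] += 1
    return sorted(
        ip
        for ip, count in totals.items()
        if count >= min_requests and errors[ip] / count >= error_ratio
    )
```

Feeding it the lines from a 30-minute window of your log gives you a short list of IPs worth a closer look.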

You might see requests like this...

x.x.x.x - - [07/Apr/2018:02:50:27 +0000] "GET /wp-json/oembed/1.0/embed?url=..%2F..%2F..%2F..%2F..%2F..%2F..%2FWindows%2FSystem32%2Fdrivers%2Fetc%2Fhosts%00&format=xml HTTP/1.0" 404 132 "http://www.zzzz.yyy/xxxx" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.6 Safari/534.57.2"

In there is a URL-encoded path: ../../../../../../../Windows/System32/drivers/etc/hosts, followed by an encoded null byte (%00). This type of request looks mainly like a recon attempt: a path traversal probe to see if it can even get to files in that directory.
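
If you want to decode requests like this yourself, here is a small sketch using Python's standard urllib.parse module. The helper names and the traversal check are my own assumptions for illustration; the check is deliberately crude.

```python
from urllib.parse import parse_qs, urlsplit


def decoded_query_values(request_target):
    """Return every percent-decoded query-string value in a request target."""
    query = urlsplit(request_target).query
    return [value for values in parse_qs(query).values() for value in values]


def looks_like_traversal(value):
    """Crude check: a parent-directory hop or a null byte in a parameter."""
    normalized = value.replace("\\", "/")
    return "../" in normalized or "\x00" in value


target = (
    "/wp-json/oembed/1.0/embed?url=..%2F..%2F..%2F..%2F..%2F..%2F..%2F"
    "Windows%2FSystem32%2Fdrivers%2Fetc%2Fhosts%00&format=xml"
)
for value in decoded_query_values(target):
    # repr() makes the trailing null byte visible as \x00
    print(repr(value), looks_like_traversal(value))
```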

If you're running a WordPress site, always be prepared to receive a barrage of known exploits. We're not even running a WordPress website and we still get this type of thing in our access logs. Included there is an attempt to use the revslider_show_image vulnerability from 2014 to steal the wp-config.php file.

| count | uri                                                                           |
|-------|-------------------------------------------------------------------------------|
| 12    | /wp-admin/admin-ajax.php                                                      |
| 6     | /wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php     |
| 6     | /wp-admin/tools.php?page=backup_manager&download_backup_file=../wp-config.php |
| 3     | /wp-admin/wp-login.php?action=register                                        |
| 2     | /help/wp/wp-admin/setup-config.php                                            |
| 1     | /help/new/wp-admin/setup-config.php                                           |
| 1     | /help/wp-admin/setup-config.php                                               |
| 1     | /wp-admin/js/password-strength-meter.min.js?ver=4.9.1                         |
| 1     | /wp-admin/js/password-strength-meter.min.js?ver=4.9.2                         |
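
A count like the one above can be produced straight from the access log. This is a rough sketch rather than the query we actually ran; the function name and the regex are assumptions, and it only expects each line to contain a quoted request such as "GET /path HTTP/1.1".

```python
import re
from collections import Counter

# Captures the request path from the quoted request part of a log line.
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) [^"]*"')


def count_uris(lines, needle="/wp-admin/", limit=10):
    """Count requested URIs containing `needle`, most frequent first."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and needle in match.group(1):
            counts[match.group(1)] += 1
    return counts.most_common(limit)
```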

Depending on how your application is hosted, you may even see increased latencies during a scan.
(Figure: increased latency graph)

Crawlers

These are innocuous. Just Google and Bing indexing content. Typical internet being the internet. The crawler in the log line below is nice enough to include a User-Agent header that sends you to a site that says exactly what they do with all the data they collect. Check it out - http://www.exensa.com/crawl

x.x.x.x [30/Mar/2018:04:39:15 +0000] "GET /robots.txt HTTP/1.1" 404 136 "-" "Barkrowler/0.7 (+http://www.exensa.com/crawl)"

SQL injection attacks

You might see these recon attacks, where someone takes a known public URL and adds a URL-encoded SELECT statement to see if anything funny comes out in the response. Like this:

x.x.x.x [11/Apr/2018:16:09:37 +0000] "GET /REDACTED&response_mode=%28SELECT%20%28CHR%28113%29%7C%7CCHR%28107%29%7C%7CCHR%28120%29%7C%7CCHR%28107%29%7C%7CCHR%28113%29%29%7C%7C%28SELECT%20%28CASE%20WHEN%20%285423%3D5423%29%20THEN%201%20ELSE%200%20END%29%29%3A%3Atext%7C%7C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29&response_type=code&scope=openid HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/532.0 (KHTML, like Gecko) Chrome/4.0.202.0 Safari/532.0"

There's a SQL injection string in there that is URL-encoded. If you decode it, you get this:

(SELECT (CHR(113)||CHR(107)||CHR(120)||CHR(107)||CHR(113))||(SELECT (CASE WHEN (5423=5423) THEN 1 ELSE 0 END))::text||(CHR(113)||CHR(122)||CHR(112)||CHR(98)||CHR(113)))

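These are tests for injection. If the response contains something out of the ordinary, they know they can do it. As far as I can tell, the CHR() calls spell out two marker strings (qkxkq and qzpbq) wrapped around a CASE expression that evaluates to 1, so a response containing qkxkq1qzpbq would tell the attacker that the SQL was actually executed.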
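
To pick apart a payload like this one, the sketch below URL-decodes the parameter value from the log line and turns each run of CHR(n) calls back into the string it builds. The helper name is made up for this example.

```python
import re
from urllib.parse import unquote

# The response_mode value from the log line above, split across lines.
encoded = (
    "%28SELECT%20%28CHR%28113%29%7C%7CCHR%28107%29%7C%7CCHR%28120%29"
    "%7C%7CCHR%28107%29%7C%7CCHR%28113%29%29%7C%7C%28SELECT%20%28CASE"
    "%20WHEN%20%285423%3D5423%29%20THEN%201%20ELSE%200%20END%29%29"
    "%3A%3Atext%7C%7C%28CHR%28113%29%7C%7CCHR%28122%29%7C%7CCHR%28112%29"
    "%7C%7CCHR%2898%29%7C%7CCHR%28113%29%29%29"
)


def chr_markers(sql):
    """Return the string built by each run of CHR(n)||CHR(n)||... calls."""
    runs = re.findall(r"(?:CHR\(\d+\)(?:\|\|)?)+", sql)
    return [
        "".join(chr(int(code)) for code in re.findall(r"\d+", run))
        for run in runs
    ]


decoded = unquote(encoded)
print(decoded)
print(chr_markers(decoded))
```

Searching your response logs or captured response bodies for those marker strings is one way to check whether a probe like this got a meaningful answer.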

Defend against it

Here are a few things that you can use to keep most of these people away.

  • Use security groups (in AWS) or iptables to limit inbound traffic to only the ports that you need. If only ports 80 and 443 are open, you're primarily exposed to web application attacks instead of attacks on SSH, FTP, DNS, and the like. Here's an AWS Whitepaper on Security Best Practices

  • Periodically review your logs to see what's going on. If you see SQL injection attempts, see which calls have returned a 200 or 500, since this means something potentially useful may have made it to the attacker.

  • Outright reject anything that's not even remotely close to your application's URLs. For example, when your application is hosted on Linux, you'd want to block requests with Windows paths, like those containing System32, cmd.exe, or C:\\.

  • Run a scan against your application locally and be one step ahead of hackers. Know what you're exposed to first. Check out these open source scanners

  • Many of these problems go away if you host your static website on AWS S3 - here's a guide I wrote if you're doing it for the first time

  • If you're looking for an alternative to WordPress, see this guide for building a blog with Gatsby.js
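
As an illustration of the "outright reject" idea for a Linux-hosted app, here is a minimal sketch of a check you could call from a request filter. The deny-list and function name are hypothetical; in practice you would more likely express this as a rule in your web server, load balancer, or WAF.

```python
import re
from urllib.parse import unquote

# Hypothetical deny-list: Windows-only markers that a Linux-hosted
# application should never legitimately see in a request.
WINDOWS_MARKERS = re.compile(r"system32|cmd\.exe|[a-z]:\\", re.IGNORECASE)


def should_reject(request_target):
    """True if the percent-decoded request target has a Windows-only marker."""
    return bool(WINDOWS_MARKERS.search(unquote(request_target)))
```

Decoding before matching matters here, because the scanner requests shown earlier arrive percent-encoded.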

If you found this helpful, you can ❤️ it and follow me here on dev.to 😄

Danny Perez (@intricatecloud)

DevOps Engineer & Engineering Manager at an ed-tech company helping our teams ship software quickly and reliably.

Discussion


You need to add fail2ban to your list. An appropriately tuned fail2ban goes a long way towards cutting down on how quickly your logs fill up with scanners and brute-forcers.

 

Nice write-up Danny. It might also be worth talking about how things like AWS WAF can be used to prevent some of these scans/attacks.

 

Funny story about WAF. We had enabled it and were piping the logs to our Splunk instance. A few days later we found that WAF was dumping 300GB/day! It very helpfully dumps the full request context, which was great, albeit a little verbose.