Filtering out unwanted website traffic to improve SEO
My solo project, which runs through my limited company, is a cryptocurrency analytics platform called Crypto Statto. When you are immensely busy working on the underlying infrastructure, architecture and functionality, the small things get left until the end. Having now released a fairly coherent version of the platform, the main goal is to get traffic to the site and users signing up. One of the big problems is unwanted traffic.
When you are trying to get reasonably accurate website statistics and you look into the traffic, you can be forgiven for thinking that you're getting more traffic than you actually are.
This is a problem when trying to gauge increases in traffic, especially if you're attempting to present credible figures when seeking investment for your new venture/startup.
About the website technology and infrastructure
The website runs in dotnet core, served by Internet Information Services (IIS) hosting. Effectively the pipeline is for requests to go to IIS, which then passes them down to the dotnet core app. Naturally, excessive unwanted requests are not only an irritation, but they also consume significant resources on the web server in the form of extra memory usage and CPU.
My website uses SmarterStats, which is pretty cool. It contains a lot of reports and different functionality within it.
Discovering that a lot of traffic is malevolent traffic - not all traffic is good traffic
When digging into my web traffic I found large numbers of requests attempting to access PHP files and sundry paths such as folders related to WordPress.
Even without being that familiar with PHP, it's obvious that people are scanning the website for potential security vulnerabilities.
One of the most incredible parts of this is that bots and scammers were trying to find whether I had an exposed ".git" folder - incredible!
In the last week - "/wordpress/" was requested 158 times for example.
Here we have cheeky tykes hammering my website looking for security vulnerabilities that will never be there, unnecessarily making my application work even harder.
The short term solution - web.config
The short term solution is to add rules to web.config to block or redirect unwanted requests.
Here is a subset of the web.config.
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Block unwanted endpoints" stopProcessing="true">
        <match url="^(wp/?|bc/?|bk/?|backup/?|old/?|new/?|main/?|\.git/config)$" ignoreCase="true" />
        <action type="CustomResponse" statusCode="404" statusReason="Not Found" statusDescription="Resource not found." />
      </rule>
      <rule name="Redirect example.com to www" patternSyntax="ECMAScript" stopProcessing="true">
        <match url=".*$" />
        <conditions logicalGrouping="MatchAll">
          <add input="{HTTP_HOST}" pattern="^www" negate="true" />
        </conditions>
        <action type="Redirect" url="https://www.cryptostatto.com{REQUEST_URI}" appendQueryString="false" redirectType="Permanent" />
      </rule>
      <rule name="HTTPS force" enabled="true" stopProcessing="true">
        <match url=".*$" />
        <conditions logicalGrouping="MatchAll">
          <add input="{HTTPS}" pattern="^OFF$" />
        </conditions>
        <action type="Redirect" url="https://www.cryptostatto.com{REQUEST_URI}" appendQueryString="false" redirectType="Permanent" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
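A quick way to sanity-check which paths the blocking rule will actually catch is to run its regular expression against some sample URLs. This is a minimal sketch (a standalone console program, not part of the site) using the same pattern as the "Block unwanted endpoints" rule; note that IIS matches against the URL path without the leading slash.

```csharp
using System;
using System.Text.RegularExpressions;

class BlockRuleCheck
{
    static void Main()
    {
        // Same pattern as the web.config "Block unwanted endpoints" rule.
        var block = new Regex(
            @"^(wp/?|bc/?|bk/?|backup/?|old/?|new/?|main/?|\.git/config)$",
            RegexOptions.IgnoreCase);

        foreach (var path in new[] { "wp", "wp/", ".git/config", "old", "blog", "wordpress/" })
        {
            Console.WriteLine($"{path,-12} blocked: {block.IsMatch(path)}");
        }
    }
}
```

One thing this check highlights: as written, the pattern does not match "wordpress/" (the path requested 158 times last week), so the alternation may be worth extending.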
Filtering out these searches within my dotnet core application
Yes, this can be done. However, we are asking the web application to do extra unnecessary work.
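For completeness, here is a minimal sketch of what that in-app filtering could look like as inline middleware in a dotnet core minimal-API app. This is an illustration of the approach, not the site's actual code, and it demonstrates the downside: the request still has to reach the app before being rejected.

```csharp
using System.Text.RegularExpressions;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Same alternation as the web.config rule, with a leading slash
// because Request.Path includes it.
var blocked = new Regex(
    @"^/(wp/?|bc/?|bk/?|backup/?|old/?|new/?|main/?|\.git/config)$",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);

app.Use(async (context, next) =>
{
    if (blocked.IsMatch(context.Request.Path.Value ?? string.Empty))
    {
        // Short-circuit: return 404 without running the rest of the pipeline.
        context.Response.StatusCode = StatusCodes.Status404NotFound;
        return;
    }
    await next();
});

app.MapGet("/", () => "Hello");
app.Run();
```

Even with the early short-circuit, every spurious request still costs the app a regex match and a trip through Kestrel, which is why handling this at the IIS layer is preferable.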
Filtering out these searches from my website traffic stats monitor
The thing I did not want to do first off was to filter these URLs out of my traffic. This would have made my statistics more accurate but my website would still be getting hammered by these spurious requests. I will probably look at the filtering options within SmarterStats to see if I can ensure that these URLs are not included within my statistics.
More long term solutions
There are much better long term solutions, but when you're trying to run a website at the lowest possible cost, this at least helps deflect some of the noise from website traffic statistics.
The obvious longer term solution is to move certain features of the application to separate microservices (APIs), cloud-based hosting, and NoSQL databases, with better security infrastructure.
Some will say, what about Cloudflare? Maybe at some point.
Hope it helps
The benefits are hopefully:
- Using IIS to do more of the upfront management of website traffic
- Less unwanted traffic
- Maybe bots stop attempting to access my site - big hope
- Less drain on application CPU and memory
- Better website traffic statistics
You can find my site at https://www.cryptostatto.com