Daniel Fenton

Posted on • Originally published at website.auditmy.co.uk

How to Write a Robots.txt File for Your Small Business Website

What on earth is a robots.txt file?

Imagine you run a bakery and a journalist turns up wanting to photograph everything. You might say, "Yes, photograph the front of house, but please don't go into the kitchen or the staff room." A robots.txt file does exactly that for your website. It is a small text file that tells search engine robots [automated programmes that crawl and read websites] which pages they are allowed to look at and which they should leave alone.

Every time Google, Bing, or any other search engine wants to index [add to its list of searchable pages] your website, its robot visits your robots.txt file first. It reads the instructions, then behaves accordingly.

Robots.txt is not a security measure. It is a polite set of instructions, not a lock on the door. A well-behaved search engine will follow it. A badly behaved one might not. Never rely on robots.txt to hide sensitive information.

Why does this matter to your business?

If your robots.txt file is misconfigured or set to block everything, Google simply cannot index your website properly. Fewer people find you when they search for what you sell.

This is one of the most common reasons a small business website disappears from search results without any obvious explanation. If you have ever wondered why your website is not showing up on Google, a misconfigured robots.txt file is often the culprit.

Getting this right takes about ten minutes once you understand what you are doing.

Where does the file live?

Your robots.txt file must always sit at the root [the top level, main address] of your website. That means it should be accessible at a web address that looks like this:

https://www.yourbusiness.co.uk/robots.txt

Try that now with your own web address. Type your website address into a browser, add /robots.txt at the end, and press enter. If you see a page of text, you already have one. If you see a blank page or an error, you need to create one.

What does a robots.txt file actually look like?

Here is the simplest possible robots.txt file, the one that most small business websites need:

User-agent: *
Disallow:

That is genuinely it. Two lines. Here is what each part means.

User-agent: refers to which robot you are talking to. The asterisk [the star symbol] means "everyone" or "all robots." So User-agent: * means "this instruction applies to all search engine robots."

Disallow: tells the robot which pages it cannot visit. When you leave it blank after the colon, it means "there is nothing off limits, you can look at everything." This is the right setting for most small business websites.

💡 If your robots.txt file currently contains Disallow: / with a forward slash after the colon, you have accidentally blocked every search engine from your entire website. Change it to Disallow: with nothing after the colon as soon as possible.
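You can see the difference between the two settings using `urllib.robotparser`, the robots.txt parser built into Python's standard library, which follows roughly the same logic a well-behaved crawler does. The domain below is the placeholder address used throughout this article, not a real site:

```python
from urllib.robotparser import RobotFileParser

# The "everything is allowed" version: blank Disallow.
open_rules = RobotFileParser()
open_rules.parse(["User-agent: *", "Disallow:"])

# The accidental "block the whole site" version: Disallow with a slash.
blocked_rules = RobotFileParser()
blocked_rules.parse(["User-agent: *", "Disallow: /"])

page = "https://www.yourbusiness.co.uk/about/"

# A blank Disallow lets any robot fetch the page.
print(open_rules.can_fetch("*", page))

# Disallow: / refuses every page on the site, including this one.
print(blocked_rules.can_fetch("*", page))
```

That single forward slash is the entire difference between a fully open website and a fully blocked one.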

When would you want to block something?

There are a few cases where telling robots to stay away from certain pages actually makes sense.

Login pages, thank-you pages that appear after a purchase, and internal search results pages do not need to appear in Google. They do not help customers find you, and a site cluttered with low-quality indexed pages can work against you in search rankings.

Here is how you would block specific pages:

User-agent: *
Disallow: /thank-you/
Disallow: /login/
Disallow: /search-results/

Each Disallow: line blocks one folder or page. The forward slash before the page name matters. Do not leave it out.
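To confirm that rules like these block only the folders you named and nothing else, you can again check them with Python's standard-library parser. The paths here are the example ones from the snippet above:

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /thank-you/",
    "Disallow: /login/",
    "Disallow: /search-results/",
])

base = "https://www.yourbusiness.co.uk"

# The home page and ordinary pages are still open to robots.
print(rules.can_fetch("*", base + "/"))

# The three listed folders are refused.
print(rules.can_fetch("*", base + "/thank-you/"))
print(rules.can_fetch("*", base + "/login/"))
```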

⚠️ Never block your CSS [the files that control how your website looks] or JavaScript [the files that make your website interactive] files. Google needs to see those to understand your site properly. An older piece of advice used to suggest blocking them, but that guidance is well out of date.

A note about sitemaps

A sitemap [a file that lists all the pages on your website, a bit like a contents page] is a separate file from robots.txt, but the two work well together. You can point robots towards your sitemap by adding one line at the bottom of your robots.txt file:

User-agent: *
Disallow:

Sitemap: https://www.yourbusiness.co.uk/sitemap.xml

This helps search engines find all your pages more quickly. If you are not sure whether you have a sitemap, try visiting your web address followed by /sitemap.xml and see what comes up.
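Crawlers that honour the Sitemap line pick it up when they read the file. As a sketch of how that works, Python's standard-library parser (version 3.8 or later) exposes any Sitemap entries it finds; the sitemap address is the article's placeholder:

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow:",
    "",
    "Sitemap: https://www.yourbusiness.co.uk/sitemap.xml",
])

# site_maps() lists every Sitemap line found in the file.
print(rules.site_maps())
```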

Once you have your sitemap sorted, submitting it directly through Google Search Console is a smart next step. It is a free tool from Google that shows you how your site is performing in search results.

How to actually create or edit the file

This depends on what platform your website is built on. For most popular platforms, you either do not need to touch robots.txt manually, or there is a simple settings panel that handles it for you.

WordPress
If you use an SEO plugin like Yoast or Rank Math, go to the plugin settings and look for a section called "Tools" or "Edit robots.txt." You can edit the file directly from there without touching any code. If you do not have an SEO plugin, you can use a free plugin called "WP Robots Txt" to manage it.


Wix
Wix generates a robots.txt file for you automatically. You can customise it by going to your Wix dashboard, clicking Settings, then SEO Tools, then Robots.txt. Be careful here as the interface can be a little confusing. If in doubt, leave the default settings in place.


Squarespace
Squarespace does not let you fully edit robots.txt on most plans. It generates one automatically. On higher-tier plans you have more control. If you are on a basic plan and need changes made, contact Squarespace support directly.


Shopify
Shopify controls your robots.txt file for you and does a reasonable job by default. You can edit it using Liquid [Shopify's own template language] in your theme files, but this is best left to a developer unless you are confident. For most small shops, the default is absolutely fine.


Custom or bespoke website
If your website was built by a developer, ask them to check the robots.txt file for you. Share this article with them and ask them to confirm the file matches what is described here. You should not have to pay much, if anything, for a quick check like this.


The most common mistakes to avoid

Blocking the whole website with Disallow: / is the biggest one, and it happens more often than you would think. It is usually left over from the build phase, when the developer did not want Google indexing a half-finished website, and was simply never changed before launch.

The second most common mistake is having no robots.txt file at all. Search engines will still crawl your site and treat everything as allowed, but you lose the ability to guide them and have nowhere to point them towards your sitemap.

Typos and formatting errors are the third problem. Robots.txt is very literal. A space in the wrong place or a missing forward slash can make an instruction meaningless or cause unintended blocking.

💡 After making any changes to your robots.txt file, run a free check at website.auditmy.co.uk to see whether search engines can still reach your key pages. It takes about thirty seconds and flags the most common problems in plain English. You might also want to read our guide on why your website might not be showing up on Google while you are at it.

A simple robots.txt file for most small businesses

If you want a safe, sensible starting point that works for the vast majority of small business websites, here it is:

User-agent: *
Disallow:

Sitemap: https://www.yourbusiness.co.uk/sitemap.xml

Replace the web address in the sitemap line with your actual address. Save the file as a plain text file called robots.txt with no other words in the name. Upload it to the top level of your website. That is all there is to it.

This file says: every robot is welcome, nothing is off limits, and here is where to find the full list of pages. Simple, honest, and effective.

One last thing

Robots.txt sits quietly in the background doing its job when it is right, and causing real problems when it is wrong. Most small business owners never think about it, which is exactly why checking it is such a useful quick win.

Ten minutes today could be the difference between your website appearing in search results and sitting invisible while your competitors take the customers you should be getting.
