Part 1: The Importance Of A Robots.txt File

You might be surprised to hear that one small text file, known as robots.txt, could be the downfall of your site. It is therefore essential to understand the purpose of a robots.txt file in search engine optimization and to check that you are using it correctly. A robots.txt file gives web robots instructions about which pages the website owner does not want crawled. To create one, make a new plain-text file (you can use Notepad on a Windows PC or TextEdit in plain-text mode on a Mac) and save it as “robots.txt”.


Upload it to the root directory of your site.

This is usually a root-level folder called htdocs or www, so the file appears immediately after your domain name (for example, www.example.com/robots.txt). If you use subdomains, you will need to create a separate robots.txt file for each subdomain.
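
As a rough illustration of where crawlers look for the file, the sketch below derives the robots.txt address from any page address. The helper name robots_txt_url and the example.com addresses are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Each subdomain is a separate host, so each gets its own file.
print(robots_txt_url("https://www.example.com/blog/post.html"))
print(robots_txt_url("https://shop.example.com/cart?item=1"))
```

The two calls print different addresses, which is why a file uploaded for the main site does not cover a subdomain.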


What to include in the robots.txt file

There are frequent disagreements over what should and should not be put in a robots.txt file. Be aware that robots.txt is not meant to deal with security issues for your website, so I’d recommend that the locations of any admin or private pages on your site not be listed in it, because those locations would be visible to anyone viewing the file.

You might list a cache directory, image files, an irrelevant part of a forum, or the adult section of a site, for instance. Any URL that begins with a disallowed path will be skipped by search engines that honor the file. You can use rules like the following to control where robots can and can’t go.


Allow Googlebot to crawl every part of your site

User-agent: Googlebot
Allow: /
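
One way to check how rules like these are read is Python's standard urllib.robotparser module. The sketch below is only an illustration: the /cache/ and /forum/off-topic/ paths, the OtherBot name, and the example.com addresses are made up, and real crawlers may interpret a file somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Googlebot may crawl everything, while every other
# robot is kept out of two made-up directories.
RULES = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /cache/
Disallow: /forum/off-topic/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Googlebot matches its own group, so the Allow rule applies.
print(parser.can_fetch("Googlebot", "https://www.example.com/cache/page.html"))  # True
# Other robots fall back to the * group, where the path prefix is disallowed.
print(parser.can_fetch("OtherBot", "https://www.example.com/cache/page.html"))   # False
# Paths that match no Disallow rule stay crawlable.
print(parser.can_fetch("OtherBot", "https://www.example.com/blog/post.html"))    # True
```

Running a check like this before uploading the file is a cheap way to catch a rule that blocks more than intended.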


Allow everything apart from certain files

Occasionally you might want to display media or offer documents on your website without having them appear in image search results, social network previews, or document search listings. Files you might want to block include animated GIFs, PDF instruction manuals, or development PHP files, as shown below:

User-agent: Googlebot
Disallow: /*.gif$
Disallow: /*.pdf$
Disallow: /*.php$
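
Googlebot treats * in a rule as "any sequence of characters" and a trailing $ as "end of the URL". The sketch below imitates that matching with a regular expression so you can test a pattern before relying on it. It is a simplified approximation, not Google's implementation, and the function names and sample paths are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each * match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # Without a trailing $, the pattern only has to match a prefix of the path.
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def is_blocked(path: str, patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches the URL path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


by_extension = ["/*.gif$", "/*.pdf$", "/*.php$"]
print(is_blocked("/images/banner.gif", by_extension))  # True
print(is_blocked("/index.html", by_extension))         # False
# Without the wildcard, a rule such as /gif$ matches only the exact path /gif.
print(is_blocked("/images/banner.gif", ["/gif$"]))     # False
```

The last call shows why the wildcard matters: a pattern with no * is compared against the start of the path, so it never reaches the file extension.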


Continue on to Part 2 of this post.

