What is a robots.txt file?
A robots.txt file is a plain text file containing a set of directives. You can create a robots.txt file easily as per your requirements. The directives in this file tell the crawlers of various search engines which pages of a website they may or may not crawl.
In other words, a robots.txt file gives you control over search engine crawlers. The file name is case sensitive: it must be exactly robots.txt. A file named Robots.txt or robots.Txt will not be treated the same way. It is one of the most important technical parts of Search Engine Optimization (SEO).
How to find it?
A robots.txt file can be found very easily: append robots.txt to the homepage URL of any site, provided the site has a robots.txt file.
In the example below you can see the robots.txt file of https://bloggersmaker.com/.
The robots.txt is added at the root of the URL, which gives https://bloggersmaker.com/robots.txt.
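As a small sketch of the rule above, the robots.txt location can be derived from any page URL with Python's standard library (the domain is the one from the article; the function name is ours):

```python
from urllib.parse import urljoin

def robots_url(page_url: str) -> str:
    # robots.txt always lives at the root of the host,
    # regardless of any path in the starting URL
    return urljoin(page_url, "/robots.txt")

print(robots_url("https://bloggersmaker.com/"))
# https://bloggersmaker.com/robots.txt
```

The same function also works when the starting URL is an inner page, because the leading slash in /robots.txt resolves against the host root.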
User-agent: This directive names the web crawler that should follow the crawl instructions that come after it. You can address the crawler of a particular search engine or the crawlers of all search engines.
For a single crawler (Google's crawler is called Googlebot):
User-agent: Googlebot
For all crawlers:
User-agent: *
Disallow: This directive tells the specified user-agent not to crawl a particular URL path. Each Disallow: line takes a single path.
To exclude the home page or root:
Disallow: /
To exclude a thank-you page:
Disallow: /thankyou/
Sitemap: This directive calls out the location of any XML sitemap(s) associated with the domain. It is supported by Google, Ask, Bing, and Yahoo.
Allow: This directive tells a crawler (it is mainly honoured by Googlebot) that it may crawl a particular page even when its parent directory is disallowed.
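Putting the four directives together, a small robots.txt might look like the sketch below. The paths and sitemap URL are illustrative placeholders, not taken from the article:

```text
User-agent: *
Disallow: /thankyou/

User-agent: Googlebot
Disallow: /blog/
Allow: /blog/featured-post/

Sitemap: https://www.example.com/sitemap.xml
```

Here all crawlers are kept out of the thank-you page, while Googlebot is kept out of the blog directory except for one featured post it is explicitly allowed to crawl.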
Examples of various commands
To exclude all robots from the entire server:
User-agent: *
Disallow: /
To allow all robots complete access:
User-agent: *
Disallow:
To exclude all robots from part of the server:
User-agent: *
Disallow: /tmp/
Disallow: /junk/
To exclude a single robot:
User-agent: Googlebot
Disallow: /
To allow a single robot:
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
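The examples above can be checked without deploying anything, using Python's standard library robots.txt parser. This sketch feeds the "exclude part of the server" rules to urllib.robotparser and asks whether two URLs may be fetched (the URLs are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Rules equivalent to "exclude all robots from part of the server"
rules = """\
User-agent: *
Disallow: /tmp/
Disallow: /junk/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A URL under a disallowed directory is blocked...
print(parser.can_fetch("*", "https://www.example.com/tmp/cache.html"))  # False
# ...while everything else remains crawlable
print(parser.can_fetch("*", "https://www.example.com/index.html"))      # True
```

This is a convenient way to sanity-check a robots.txt draft before uploading it.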
How to create robots.txt
It is very easy to create a robots.txt file for any website. It is advisable to keep the following points in mind.
- Any plain text editor, e.g. Notepad, can be used.
- The text editor should be able to save standard UTF-8 text files.
- Avoid word processors, as they may save the file in a proprietary format or add formatting that crawlers cannot read.
- The file must be named exactly robots.txt.
- A site can have only one robots.txt file.
- The robots.txt file must be located at the root of the website host to which it applies, as shown above with https://bloggersmaker.com/robots.txt.
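The checklist above can also be followed programmatically. This minimal Python sketch writes a standard UTF-8 file with the required name (the rules themselves are placeholders; the file still has to be uploaded to the root of your host):

```python
from pathlib import Path

rules = """\
User-agent: *
Disallow: /tmp/

Sitemap: https://www.example.com/sitemap.xml
"""

# Must be named exactly "robots.txt" and saved as plain UTF-8 text
path = Path("robots.txt")
path.write_text(rules, encoding="utf-8")

print(path.read_text(encoding="utf-8").splitlines()[0])  # User-agent: *
```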