robots.txt file

What is a robots.txt file?

A robots.txt file is a plain text file containing a set of commands for search engine crawlers. You can create one easily to suit your site's requirements. The commands tell the crawlers of various search engines which pages of a website they may or may not crawl.

In other words, a robots.txt file gives you control over how search engine crawlers access your site. The file name is case sensitive.

It must be named robots.txt, all lowercase. Crawlers request /robots.txt, so a file named Robots.txt or robots.Txt may not be found and will not work as the site's robots.txt file. It is one of the most important technical parts of Search Engine Optimization (SEO).

How to find it?

A robots.txt file is very easy to find. If a site has one, you can see it by adding robots.txt after the homepage URL of the site.

For example, take the site https://bloggersmaker.com/.

Adding robots.txt to the root of the URL gives https://bloggersmaker.com/robots.txt.
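
The same step can be done in code. The following is a minimal Python sketch, assuming you start from any page URL on the site; the function name robots_url and the example page path are illustrative and not part of any tool mentioned here.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt lives at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The page path below is a made-up example.
print(robots_url("https://bloggersmaker.com/some/post/?page=2"))
```

Whatever page you start from, the path, query string and fragment are dropped, because the file is always looked up at the root of the host.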

Basic Terms

User-agent: It names the web crawler that the crawl instructions below it apply to. You can give instructions to the crawler of a particular search engine or to the crawlers of all search engines.

For a single crawler
User-agent: Googlebot
For all crawlers
User-agent: *

Disallow: This command tells the specified user-agent not to crawl a particular URL path. Each Disallow line takes only one path.

To exclude the whole site (the root and everything under it)
Disallow: /
To exclude the thank-you page
Disallow: /thankyou/

Sitemap: It gives the location of any XML sitemap(s) associated with this site. The command is only supported by Google, Ask, Bing, and Yahoo.

Sitemap: https://bloggersmaker.com/sitemap_index.xml

Allow: This command tells a crawler that it may crawl a particular page or subfolder even though its parent folder is disallowed. It was first supported by Googlebot and is now recognized by the major crawlers.

Allow: /wp-admin/admin-ajax.php
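
To see how these commands work together, here is a small Python sketch that uses the standard library module urllib.robotparser. The rule set is a hypothetical one assembled from the snippets above, and the helper name allowed and the /blog/ and /wp-admin/options.php paths are illustrative. Python's parser applies the first matching rule in file order, while Google picks the most specific matching path, so the Allow line is placed before the Disallow line to get the same answer from both.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set built from the commands described above.
RULES = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /thankyou/

Sitemap: https://bloggersmaker.com/sitemap_index.xml
"""

BASE = "https://bloggersmaker.com"

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, agent: str = "*") -> bool:
    """Return True if the given crawler may fetch BASE + path."""
    return parser.can_fetch(agent, BASE + path)


for path in ("/blog/", "/thankyou/", "/wp-admin/options.php",
             "/wp-admin/admin-ajax.php"):
    print(path, "allowed" if allowed(path) else "blocked")

# site_maps() needs Python 3.8 or later; it returns the Sitemap URLs.
print(parser.site_maps())
```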

Examples of various commands

To exclude all robots from the entire server:

User-agent: *
Disallow: /

To allow all robots complete access:

User-agent: *
Disallow:

To exclude all robots from part of the server:

User-agent: *
Disallow: /tmp/
Disallow: /junk/

To exclude a single robot:

User-agent: Googlebot
Disallow: /

To allow a single robot:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
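
Before publishing rules for a single robot, it can help to test them per crawler. The sketch below again uses Python's urllib.robotparser. It assumes Googlebot as the single robot and uses Bingbot only as an illustrative second crawler. To allow just one robot, it pairs that robot's empty Disallow with a catch-all group that blocks every other robot. The names parse_rules, EXCLUDE_ONE, ALLOW_ONE and the page URL are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Exclude a single robot; every other robot stays unrestricted.
EXCLUDE_ONE = """\
User-agent: Googlebot
Disallow: /
"""

# Allow a single robot and block all the others.
ALLOW_ONE = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

URL = "https://bloggersmaker.com/any-page/"  # illustrative page

for name, text in (("exclude one", EXCLUDE_ONE), ("allow one", ALLOW_ONE)):
    rules = parse_rules(text)
    for agent in ("Googlebot", "Bingbot"):
        print(name, agent, rules.can_fetch(agent, URL))
```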

How to create robots.txt

It is very easy to create a robots.txt file for any website. Keep the following points in mind.

  • Any plain text editor, e.g. Notepad, can be used.
  • The text editor should be able to create standard UTF-8 text files.
  • Avoid word processors. They often save files in their own format and can add unexpected characters.
  • The file must be named robots.txt.
  • A site can have only one robots.txt file.
  • The robots.txt file must be located at the root of the website host to which it applies, as explained above with the help of https://bloggersmaker.com/robots.txt.
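
If you prefer to generate the file with a script instead of an editor, the points above can be sketched in Python as follows. This is only a sketch: the function name write_robots and the rules it writes are illustrative, and a temporary folder stands in for the real root folder of the site.

```python
import tempfile
from pathlib import Path

# Illustrative rules; replace them with the ones your site needs.
RULES = [
    "User-agent: *",
    "Disallow: /thankyou/",
    "",
    "Sitemap: https://bloggersmaker.com/sitemap_index.xml",
]


def write_robots(site_root: Path, lines: list[str]) -> Path:
    """Write lines to site_root/robots.txt as plain UTF-8 text."""
    target = site_root / "robots.txt"  # the name must be exactly robots.txt
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# A temporary folder stands in for the root folder of the website.
with tempfile.TemporaryDirectory() as tmp:
    path = write_robots(Path(tmp), RULES)
    print(path.name)
    print(path.read_text(encoding="utf-8"))
```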

About the author

Vikas Yadav is a writer at Bloggers Maker and the founder of bloggersmaker.com. He has more than ten years of SEO experience in various niches, including Education, Pharmacy, Realty, Airline, Gifts, Data Recovery, Best Website Hosting, Mobile Application Development and News.
