What is robots.txt?

Robots.txt is the common name of a text file that is uploaded to a Web site's root directory. It does not need to be linked from the site's HTML code; crawlers request it directly at /robots.txt. The file is used to give instructions about the Web site to Web robots and spiders.
 
So what exactly is robots.txt?
Robots.txt: a file located in the site's root directory. It contains only plain text (not HTML).

It allows the webmaster to set separate permissions for each crawler. In other words, through this file webmasters can flexibly allow or disallow search engine (SE) bots from crawling and indexing a certain area of the website.

Robots.txt can state, separately for each search engine's bot, whether that bot may visit the whole website, only certain areas of it, or nothing at all.

Examples:

User-agent: *: the rules that follow apply to all bots

Disallow: /administrator/: blocks bots from accessing the administration pages

Disallow: /: blocks bots from accessing the entire website

Disallow: /images/nguoidep.JPG: blocks bots from accessing the image file named nguoidep.JPG
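
To see how a crawler might read rules like these, here is a rough sketch using Python's standard urllib.robotparser module. The robots.txt content, the bot names (BadBot, ExampleBot) and the example.com URLs are made up for illustration; only the directives themselves come from the examples above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content assembled from the directives above.
# "BadBot" is an invented bot name used to show a per-bot rule.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /administrator/
Disallow: /images/nguoidep.JPG
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` on the (assumed) site."""
    return parser.can_fetch(agent, "https://example.com" + path)


for path in ["/", "/administrator/login.php",
             "/images/nguoidep.JPG", "/images/other.JPG"]:
    print(path, "->", "allowed" if allowed(path) else "blocked")

# BadBot has its own group, so it is blocked from the entire site.
print("BadBot on / ->", "allowed" if allowed("/", "BadBot") else "blocked")
```

Note that path matching is case-sensitive, so a rule for nguoidep.JPG does not cover nguoidep.jpg.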
 
The robots.txt file is a simple text file on your web site that tells search engine bots how to crawl and index the website or its pages. It is great when search engines frequently visit your site and index your content, but there are often cases when indexing parts of your online content is not what you want.
 
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note saying "Please, do not enter" on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naive to rely on robots.txt to protect it from being indexed and displayed in search results.
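
The "unlocked door" point can be sketched in a few lines of Python. Everything here is hypothetical (the /private/ rule, the GoodBot name, and the fake_server function standing in for a real web server); the idea is only that the check lives in the crawler, not in the server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: ask all bots to stay out of /private/.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /private/"])


def fake_server(path):
    # Stand-in for a web server: it answers every request.
    # robots.txt does not change what the server is willing to serve.
    return f"contents of {path}"


def polite_crawler(path, agent="GoodBot"):
    # A well-behaved bot checks the rules first and skips blocked paths.
    if not rules.can_fetch(agent, path):
        return None
    return fake_server(path)


def rude_crawler(path):
    # A badly behaved bot never looks at robots.txt at all.
    return fake_server(path)


print(polite_crawler("/private/report.html"))
print(rude_crawler("/private/report.html"))
```

The polite crawler gets nothing back for the blocked path, while the rude one still receives the page. Real protection for that path would have to come from the server side, for example a login.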
 