Cheaper Domains

Just how important is a Robots.txt file to GoogleBot?

Date
19 Jun, 2008 | Posted in: Search Engine Optimisation | 1 Comment

I have always been under the impression that a site does not necessarily require a robots.txt file. In fact, a lot of sites I have built do not have a robots.txt file, and yet they are very well indexed by GoogleBot and other search engine robots.

However, Google employee JohnMu recently said in a Google Groups thread that if your robots.txt file is unreachable because of a timeout or some other issue (not counting a 404 Not Found status), Google "tends not to crawl the site at all just to be safe."

What does this mean? Google might not crawl your website at all if it has trouble reaching your robots.txt file.

In the case in the thread, the robots.txt file was unreachable due to a complex set of redirects that made Googlebot very dizzy.
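To illustrate how a chain of redirects can make a file effectively unreachable, here is a minimal Python sketch. It is not Googlebot's actual logic: the follow_redirects helper, the redirects mapping, and the limit of five hops are all assumptions made purely for illustration.

```python
from typing import Dict, Optional


def follow_redirects(start: str, redirects: Dict[str, str], max_hops: int = 5) -> Optional[str]:
    """Return the final URL, or None if the chain is too long or loops.

    `redirects` maps a URL to the URL it redirects to (hypothetical data).
    `max_hops` is an assumed limit, not a value taken from the thread.
    """
    url = start
    for _ in range(max_hops + 1):
        if url not in redirects:
            return url  # no further redirect: this URL answers directly
        url = redirects[url]
    return None  # gave up: treat the file as unreachable


# A redirect loop never settles on a final URL, so the crawler gives up.
looping = {"/robots.txt": "/a", "/a": "/robots.txt"}
print(follow_redirects("/robots.txt", looping))
```

The point of the sketch is that a crawler has to stop somewhere; once it gives up, the robots.txt file counts as unreachable even though the server is technically responding.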

John explains later on that "unparsable robots.txt" files are "generally" okay, since Google is getting back some type of server response. The problem generally arises when "the URL is just unreachable (perhaps a "security update" that ended up blocking us in general) or situations like this where we give up trying to access the URL (which in a way is unreachable as well)," said John.
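The behaviour John describes can be summarised as a small decision function. This is only a sketch of my reading of the thread, not Google's implementation: the crawl_decision name, its arguments, and the returned labels are hypothetical, and cases the thread does not mention (such as 5xx errors) are deliberately left as "not covered".

```python
from typing import Optional


def crawl_decision(status: Optional[int], parsable: bool = True) -> str:
    """Summarise what the thread says about fetching /robots.txt.

    `status` is the HTTP status code of the response, or None when the
    request timed out, was blocked, or the crawler gave up (assumption:
    all of these count as "unreachable").
    """
    if status is None:
        return "skip site"  # unreachable: "tends not to crawl the site at all"
    if status == 404:
        return "crawl site"  # no robots.txt at all is fine
    if 200 <= status < 300:
        # A server response came back; unparsable content is "generally" okay.
        return "crawl site, obeying rules" if parsable else "crawl site"
    return "not covered"  # the thread does not describe other status codes


print(crawl_decision(None))  # timed-out robots.txt
print(crawl_decision(404))   # missing robots.txt
```

In other words, a missing robots.txt is harmless, while one that cannot be reached at all is the case to watch out for.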


Comments

Comment: Richard Green said:

i always wondered how that work, cheers.

23 Jun, 2008 @ 01:44