Just how important is a robots.txt file to Googlebot?
19 Jun, 2008 | Posted in: Search Engine Optimisation | 1 Comment
I have always been under the impression that a site does not necessarily require a robots.txt file. In fact, a lot of sites I have built do not have a robots.txt file, and yet they are very well indexed by Googlebot and other search engine robots.
However, Google employee JohnMu recently said in a Google Groups thread that if your robots.txt file is unreachable due to timing out or other issues (not including a 404 Not Found status), Google "tends not to crawl the site at all just to be safe."
What does this mean? Google might not crawl your website at all if it has trouble reaching your robots.txt file.
In the case in the thread, the robots.txt file was unreachable due to a complex set of redirects that made Googlebot very dizzy.
John explains later on that "unparsable robots.txt" files are "generally" okay, since Google is getting back some type of server response. The problem generally arises when "the URL is just unreachable (perhaps a "security update" that ended up blocking us in general) or situations like this where we give up trying to access the URL (which in a way is unreachable as well)," said John.
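To make the three cases a bit more concrete, here is a rough Python sketch of the decision as I understand it from the thread. It is only an illustration, not Google's actual logic: the function names `classify_robots_response` and `check_robots`, the return labels and the 10-second timeout are all my own assumptions, and the thread says nothing about other status codes such as 5xx.

```python
import urllib.error
import urllib.request


def classify_robots_response(status=None, error=None):
    """Map a robots.txt fetch outcome to a crawl decision (hypothetical labels).

    status: HTTP status code if the server answered, otherwise None.
    error:  description of a transport failure (e.g. a timeout), otherwise None.
    """
    if error is not None or status is None:
        return "skip"      # unreachable: the site may not be crawled at all
    if 300 <= status < 400:
        return "skip"      # assumption: an unresolved redirect chain counts as unreachable
    if status == 404:
        return "crawl"     # no robots.txt: explicitly excluded from the problem cases
    if 200 <= status < 300:
        return "parse"     # got a response; even an unparsable file is "generally" okay
    return "unknown"       # anything else is not covered by the thread


def check_robots(site, timeout=10):
    """Fetch <site>/robots.txt and classify the outcome."""
    url = site.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_robots_response(status=resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered with an error status (404, 500, failed redirect...).
        return classify_robots_response(status=exc.code)
    except OSError as exc:
        # Timeouts, DNS failures and refused connections all end up here.
        return classify_robots_response(error=str(exc))
```

Running `check_robots("http://www.example.com")` against your own domain is a quick way to see which bucket your site falls into. If it comes back as "skip", that is the situation John is warning about.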



i always wondered how that work, cheers.
23 Jun, 2008 @ 01:44