SEO problems and Drupal - robots.txt
Drupal is a powerful CMS and does a lot of wonderful things. However, just because the folks behind Drupal are genius coders or system architects doesn't mean they know the first thing about SEO. In fact, there are so many SEO issues with Drupal that they almost merit their own section.
In this installment, we're going to discuss a MAJOR flaw in the robots.txt file that shipped with Drupal 6 up to 6.19. What was it? This little line:
Disallow: /sites/
It looks innocent enough, but this little line in your robots.txt file is a real traffic killer.
Since the Drupal core and most modules (and especially anything and everything that has to do with CCK) store all your images, PDFs, etc. in the /sites/default/files folder, this line tells search engine crawlers to stay away from that important content. This means no PDFs, docs, or spreadsheets in the search results, no images in Google Images, and so on. If you have a site that depends heavily on this kind of content, this little line could be the difference between a successful site and a complete and utter failure.
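If you want to see the effect for yourself, here's a rough sketch using Python's standard urllib.robotparser module. The can_crawl helper, the brochure.pdf file name, and the two sample rule sets are made up for illustration; they are not copied from a real Drupal install. Python's parser is also only an approximation of how any given search engine reads the file, so treat the result as a sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical rule set containing the problematic line.
WITH_SITES_RULE = """\
User-agent: *
Disallow: /sites/
"""

# Hypothetical rule set with that line removed.
WITHOUT_SITES_RULE = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical uploaded file living in the default files folder.
PDF_PATH = "/sites/default/files/brochure.pdf"

with_rule = can_crawl(WITH_SITES_RULE, PDF_PATH)
without_rule = can_crawl(WITHOUT_SITES_RULE, PDF_PATH)

print("Crawlable with 'Disallow: /sites/':", with_rule)
print("Crawlable without it:", without_rule)
```

Paste in your own robots.txt text and a few real file paths from your site to check, folder by folder, what crawlers are being told to skip.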
Since everyone's site is different in terms of what they may want to allow and disallow, I don't have any specific advice other than to consider every folder individually. Here's an example of our robots.txt for guidance.
