Sitemaps come in two kinds: a conventional sitemap, a page (or pages) on your site that lists the pages of your website, often hierarchically organized; and ‘Google Sitemaps’, a process that allows site owners to submit the URLs of pages they would like included in Google’s index. The two kinds serve slightly different purposes, both important.

A conventional sitemap is designed to help human visitors who can’t find what they are looking for, and also to ensure that Googlebot (Google’s web crawler) finds the important pages of your site. A well-executed example of this kind of sitemap is Apple’s sitemap. From an optimization point of view, a page like this is an opportunity to link to your own pages with appropriate anchor text (see the last paragraph of Internal Links). If your site has more than a few pages, a sitemap can only be advantageous.

Google Sitemaps, however, is a solution to a problem Google has with crawling the entire web. Googlebot spends a great deal of time and resources fetching pages that have not changed since its last visit. Crawling billions of pages only to find that most are the same as last time is inefficient, and Google Sitemaps was designed to improve the process. The idea is that site owners submit a sitemap to Google, so the next time Googlebot visits their site it knows where to look for new or changed pages.

For site owners, using Google Sitemaps reduces server load and bandwidth, because Googlebot no longer fetches unchanged pages (in other words, it saves them money), and their new content gets indexed more quickly. Google have provided a sitemap protocol and an automated process for the whole procedure.

Google Sitemaps does not replace the established Googlebot crawling procedure and should be used to solve specific problems, such as:

  • If you need to reduce the bandwidth taken by Googlebot.
  • If your site has (accessible) pages that are not crawled.
  • If you generate a lot of new pages and want them crawled quickly.
  • If you have two or more pages listed for the same search, you can use page priority to indicate the better one.
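Under the protocol, these hints are expressed in a simple XML file. A minimal sketch of such a file (the URLs, dates, and priority values here are placeholders, not prescriptions):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2007-04-15</lastmod>       <!-- tells Googlebot when the page last changed -->
    <changefreq>weekly</changefreq>     <!-- how often the page is expected to change -->
    <priority>1.0</priority>            <!-- the page you would prefer listed for a search -->
  </url>
  <url>
    <loc>http://www.example.com/archive.html</loc>
    <priority>0.3</priority>            <!-- a similar page you consider less important -->
  </url>
</urlset>
```

Note that priority is relative only to the other pages on your own site (the default is 0.5); it does not affect how your pages compare with anyone else’s.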

Google have extensive help and an explanation of the procedure at About Google Sitemaps.

August 5, 2006

Google has renamed Google Sitemaps to Google Webmaster Tools under the new heading of Webmaster Central.

April 15, 2007

Google, MSN, Yahoo and Ask have recently announced support for sitemap auto-discovery via the robots.txt file and have agreed on a standard sitemaps protocol.

By adding the following line to your robots.txt file, you let the search engines auto-discover your sitemap file.

Sitemap: http://www.example.com/sitemap.xml