Creating a Google Sitemap that doesn't slow down publishing

These days it is essential for virtually every website to publish a Google Sitemap, a document that tells Google's search engine (and others) about every single page on your website. This means crawlers do not have to discover your content by following links alone, and it gives Google additional information about your site, such as how often its crawler should come back looking for updates.

Publishing an XML file that indexes every single page and entry on your website, though, can have a massive effect on your site's publishing performance. This HOWTO shows you how to set up a Google Sitemap in such a way that it doesn't kill your system.

Prerequisites

  • This HOWTO requires the use of Movable Type 4.2 or greater, or any version of Melody.
  • You must have the run-periodic-tasks script set up to run regularly (recommended: once every 5 minutes).

Instructions

  1. Create a template module called "Sitemap Include". Paste in the contents of the following file: m_sitemap.mtml

  2. Set the newly created template module's caching settings to one of the following:

    • Expire on the creation of a page and/or entry. This keeps the Sitemap most up to date, but it will be slower.
    • Expire every 60 minutes. This ensures that the contents of your Sitemap are rebuilt at most once per hour. This keeps it sufficiently up to date for most sites and minimizes unnecessary publishing even further.

  3. Create an index template called "Google Sitemap". Set its output file to "sitemap.xml". Paste in the contents of the following file:
    sitemap.mtml

  4. Set the newly created template to publish via Publish Queue.
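
The two template files named in steps 1 and 3 are not reproduced in this HOWTO. Purely as an illustration, here is a rough sketch of what such a pair might look like, assuming the standard Movable Type entry and page tags. The lastn="9999" limit and the date format are assumptions chosen for the example, not necessarily what the actual files use. First, a possible body for the "Sitemap Include" module:

```mtml
<mt:Ignore>Hypothetical sketch of a "Sitemap Include" module; not the actual m_sitemap.mtml.</mt:Ignore>
<mt:Ignore>lastn="9999" is an arbitrary upper bound meant to list every entry and page.</mt:Ignore>
<mt:Entries lastn="9999">
  <url>
    <loc><mt:EntryPermalink encode_xml="1"></loc>
    <lastmod><mt:EntryModifiedDate utc="1" format="%Y-%m-%dT%H:%M:%SZ"></lastmod>
  </url>
</mt:Entries>
<mt:Pages lastn="9999">
  <url>
    <loc><mt:PagePermalink encode_xml="1"></loc>
    <lastmod><mt:PageModifiedDate utc="1" format="%Y-%m-%dT%H:%M:%SZ"></lastmod>
  </url>
</mt:Pages>
```

And a possible "Google Sitemap" index template, which does nothing but wrap the module in the sitemap envelope:

```mtml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<mt:Ignore>Hypothetical sketch; not the actual sitemap.mtml. All of the heavy lifting lives in the cached module.</mt:Ignore>
<mt:Include module="Sitemap Include">
</urlset>
```

Keeping the expensive loops inside the module is what makes the caching setting from step 2 effective: the index template itself has almost nothing to compute.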

Explanation

So what is happening here? We leverage the fact that a template module's contents can be cached to create an index template that does nothing but include a template module. When that index template is published, one of two things will happen:

  1. If the cache is still valid, the contents of the module will be pulled from the cache and the template will publish with the snap of a finger.
  2. If the cache has expired, the contents of the module will be rebuilt via the regular publishing process.

On a busy site, where the index templates are frequently being republished because of new comments and the like, this ensures that most of the time the Sitemap publishes quickly and does not affect overall site performance. Periodically (according to your caching settings), the Sitemap's contents are rebuilt to ensure it remains up to date.
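
Movable Type's Include tag also accepts caching attributes directly, which can make the expiry visible in the template itself rather than only in the module's settings. A minimal sketch, assuming module caching is enabled in the blog's publishing settings and that a 3600-second lifetime is meant to mirror the 60-minute option above:

```mtml
<mt:Ignore>Sketch only: cache the module's output and rebuild it after ttl seconds (3600 = 60 minutes).</mt:Ignore>
<mt:Include module="Sitemap Include" cache="1" ttl="3600">
```

Whether you set the expiry here or in the module's settings is a matter of preference; the effect on publishing should be the same.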

This is how Endevver implements sitemaps for all of its clients, and we recommend that others do the same.

Advanced implementations could use custom fields to control values and preferences within the sitemap itself. We leave those enhancements up to you, though!