How To: XML Sitemap Error Checking

Published on 20th February 2015

Today, I want to share a tip for quickly weeding out errors in sitemap XML files. This post will teach you a few new tricks with my favourite crawl tool, Screaming Frog, and (hopefully) save you some time in your own technical SEO audit projects. By the end of this tutorial, your XML sitemap will be audited for any 404 errors (or 5xx errors, etc).

How to check your XML Sitemap for errors with Screaming Frog

  1. Download and save your XML sitemap
  2. Choose Mode->List in Screaming Frog
  3. Select URL list file
  4. Choose the XML file and click “Open”
  5. Click Start to start crawling

Why a Nice Clean Sitemap?

Google have invested a lot of time and effort into improving the sitemaps functionality in Search Console, and the advice I’ve always heard from Google people is to keep your sitemaps as error free as you can and to use the correct canonical URLs. I’ve always felt that a sitemap file with a very low load time is also advisable, if you can speed up the dynamic elements of the file generation.

We have a very tight threshold on how clean your sitemap needs to be. When people are learning about how to build sitemaps, it’s really critical that they understand that this isn’t something that you do once and forget about. This is an ongoing maintenance item, and it has a big impact on how Bing views your website. What we want is end state URLs and we want hyper-clean. We want only a couple of percentage points of error.

Duane Forrester (Bing)
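
To put a number on “a couple of percentage points”, here is a minimal Python sketch that works out what share of a sitemap’s URLs return an error. The function name `error_rate` and the rule that any status code of 400 or above counts as an error are assumptions for illustration; neither search engine publishes an exact threshold in the quote above.

```python
def error_rate(status_codes):
    """Return the percentage of status codes that are errors (400 or above)."""
    if not status_codes:
        return 0.0
    # Assumption: any 4xx or 5xx response counts as an error.
    errors = sum(1 for code in status_codes if code >= 400)
    return 100 * errors / len(status_codes)
```

For example, a sitemap of 100 URLs where one returns 404 and one returns 500 has an error rate of 2 percent.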

Checking for Problems During a Site Audit

When I’m working on a site, I sometimes need to work out what state the XML sitemap is in. If I know there’s been a recent update to the file(s), it’s not always a good idea to rely totally on the data coming from Search Console. If you’re ever in that situation, here’s how to get a fresher impression of the state of your sitemap.xml file.

Import your XML Sitemap File into Screaming Frog

First, find your XML sitemap URL. It should be available in the Sitemaps report in Search Console or, as we use Yoast’s SEO plugin for WordPress, via the plugin settings too. I find the URL and do a quick “Save as” to my desktop:

Our XML Sitemap viewed in Chrome
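
If you would rather script the “Save as” step, a minimal Python sketch using only the standard library might look like this. The function name `save_sitemap` and the 30 second timeout are assumptions for illustration, not part of Screaming Frog or Yoast.

```python
import shutil
import urllib.request
from pathlib import Path


def save_sitemap(sitemap_url, destination):
    """Download the sitemap at sitemap_url and write it to destination."""
    target = Path(destination)
    # Stream the response to disk so a large sitemap is never held in memory.
    with urllib.request.urlopen(sitemap_url, timeout=30) as response:
        with target.open("wb") as out:
            shutil.copyfileobj(response, out)
    return target
```

Calling `save_sitemap("https://www.example.com/sitemap.xml", "sitemap.xml")` (a placeholder URL) would write the file to the current folder, ready to load in list mode.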

Then, upload the file to Screaming Frog. First, you’ll need to switch to list mode:

open-xml-sitemap

Screaming Frog lists all the URLs it finds in the XML, ready for you to start crawling:

open-xml-sitemap2
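
For a rough idea of what that import step involves, here is a short Python sketch that pulls every `<loc>` entry out of a saved sitemap file. It assumes the file follows the standard sitemaps.org format; `extract_urls` is a hypothetical helper name, and this is not how Screaming Frog itself is implemented.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_path):
    """Return every <loc> value found in the sitemap file, in document order."""
    root = ET.parse(sitemap_path).getroot()
    urls = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        # Skip empty elements and trim stray whitespace around the URL.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls
```

Note that in a sitemap index file the `<loc>` entries point at child sitemaps rather than pages, so each of those would need to be downloaded and parsed in turn.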

From there, the crawl begins. You can see the errors by sorting by “Status” like this:

report-sitemap-errors
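
The same check can be approximated in a few lines of Python. This is only a sketch under some assumptions: `fetch_status` and `errors_first` are hypothetical names, a status of 0 stands in for “no response”, and `urlopen` follows redirects, so a redirecting URL shows the status of its final destination rather than the 3xx code itself. Some servers also answer HEAD requests differently from GET requests.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses are raised as exceptions
    except OSError:
        return 0  # DNS failure, refused connection, timeout and similar


def errors_first(results):
    """Sort (url, status) pairs so errors come first, highest status at the top."""
    def sort_key(item):
        status = item[1]
        is_error = status == 0 or status >= 400
        return (not is_error, -status)
    return sorted(results.items(), key=sort_key)
```

Building a dictionary such as `{url: fetch_status(url) for url in urls}` and passing it to `errors_first` gives a list with the 5xx and 4xx responses at the top, much like sorting the status column.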

And there you go; thanks to Screaming Frog we’ve audited our sitemap. Happy analysing!

Responses

  1. Richard

    Thanks for the useful post on the sitemaps and using Excel to get the HTTP status codes for URLs. I have installed the SEO extension and have tried to add a new column followed by adding the HTTP status. I’m having no luck with this. I have simply pointed Excel to my sitemap, loaded all the data, added a new column and pasted the code. I also replaced the typo html with http.

    Am I missing something? I am running Office 2007.

    cheers

    shivun

  2. Hi Shivun

    Paste this query into a cell: =seotoolsversion()

    You *should* get a version number. If you don’t, the tools haven’t installed correctly.

  3. Richard, when I click in a cell to type it shows the function called =seotoolsversion() but doesn’t run?

    shivun
