Today, I want to share a tip for quickly weeding out errors in sitemap XML files. This post will teach you a few new tricks with my favourite crawl tool, Screaming Frog, and (hopefully) save you some time in your own technical SEO audit projects. By the end of this tutorial, your XML sitemap will be audited for any 404 errors (or 5xx errors, etc.).
How to check your XML Sitemap for errors with Screaming Frog
- Download and save your XML sitemap
- Choose Mode->List in Screaming Frog
- Select URL list file
- Choose the XML file and click “Open”
- Click Start to start crawling
Why a Nice Clean Sitemap?
Google have invested a lot of time and effort into improving the sitemaps functionality in Search Console, and the advice I’ve always heard from Google people runs along these lines: keep your sitemaps as error free as you can, and use the correct canonical URL. I’ve always felt that a sitemap file with a very low load time is also advisable, if you can speed up the dynamic elements of the file generation.
We have a very tight threshold on how clean your sitemap needs to be. When people are learning about how to build sitemaps, it’s really critical that they understand that this isn’t something that you do once and forget about. This is an ongoing maintenance item, and it has a big impact on how Bing views your website. What we want is end state URLs and we want hyper-clean. We want only a couple of percentage points of error.
Duane Forrester (Bing)
Checking for Problems During a Site Audit
When I’m working on a site, I sometimes need to work out what state the XML sitemap is in. If I know there’s been a recent update to the file(s), it’s not always a good idea to rely totally on the data coming from Search Console. If you’re ever in that situation, here’s how to get a fresher impression of the state of your sitemap.xml file.
Import your XML Sitemap File into Screaming Frog
First, find your XML Sitemap URL. It should be available in the sitemaps report in Search Console or, as we use Yoast’s SEO plugin for WordPress, via its settings too. I find the URL and do a quick “Save as” to my desktop:
Our XML Sitemap viewed in Chrome
Then, upload the file to Screaming Frog. First, you’ll need to switch to list mode:
Screaming Frog lists all the URLs it finds in the XML, ready for you to start crawling:
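If you’re curious what that extraction step amounts to, here’s a rough Python sketch that pulls every `<loc>` value out of a sitemap. To be clear, this is my own illustration and not how Screaming Frog works internally. The function name `extract_urls` and the example.com URLs are made up for the example; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml):
    """Return every non-empty <loc> value found in a sitemap document.

    Accepts the XML as str or bytes. Hypothetical helper for illustration.
    """
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    ]


if __name__ == "__main__":
    # Made-up sample; swap in the contents of your own saved sitemap file.
    sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
</urlset>"""
    for url in extract_urls(sample):
        print(url)
```

For a file saved to disk, reading it as bytes and passing those to `extract_urls` should work the same way. Note that on a sitemap index file, the `<loc>` values are child sitemaps rather than pages.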
From there, the crawl begins. You can see the errors by sorting by “Status” like this:
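If you’d like a quick scripted sanity check alongside the crawl, the same idea can be sketched in Python: request each URL, keep anything that doesn’t return 200, and sort by status. This is a minimal sketch, not a replacement for Screaming Frog. The names `fetch_status` and `find_errors` are my own, and one difference worth knowing is that `urllib` follows redirects, so a redirected URL is reported with the status of its final destination rather than as a 3xx.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses are raised as exceptions
    except OSError:
        return 0  # DNS failure, refused connection, timeout and similar


def find_errors(urls, fetch=fetch_status):
    """Return (status, url) pairs for non-200 responses, highest status first."""
    results = [(fetch(url), url) for url in urls]
    errors = [pair for pair in results if pair[0] != 200]
    return sorted(errors, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Placeholder URLs; use the list taken from your own sitemap.
    for status, url in find_errors(["https://example.com/"]):
        print(status, url)
```

The `fetch` parameter is there so the status lookup can be swapped out, for example with canned results while testing. Some servers handle HEAD requests differently from GET, so treat odd results as a prompt to check the URL by hand.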
And there you go; thanks to Screamingfrog we’ve audited our sitemap. Happy analysing!