Identifying dead links for more effective link analysis

URL Profiler:

One of the easiest and most accessible tools to filter out links is URL Profiler. The cost of a license is relatively low and it is capable of handling large amounts of data.

To identify dead links, load the URL list and add in the domain that you want to analyse.

Check against within the ‘Link Analysis’ section and click “Run Profiler”.

Once the crawl is complete, the results can be viewed in the standard export. To be specific, the Link Status column, highlighted below.

If you encounter large number of server errors, you might want to consider slowing down your crawl or changing the user-agent to GoogleBot.

Kerboo:

Another excellent option, and the tool we typically use internally for link auditing, is Kerboo. Kerboo is a specialist tool for backlink analysis and has real utility whether you are auditing existing links or prospecting for new opportunities.

Kerboo automatically merges link profile data which can be imported via the APIs of Majestic, Ahrefs and Search Console on an ongoing basis, or manually uploaded as a list of URLs. As it is cloud based, Kerboo won’t hog all of your PC’s processing power, which on large profiles can take days to complete.

In the above example, 70% of the complete link profile – gathered from Majestic, Ahrefs, Search Console and MOZ – was no longer live.

One of the advantages of Kerboo over other tools, is that it allows scanning for multiple domains. This can come handy if your website went through a domain migration in the past and you would like to scan for links pointing to the legacy version as well.

Screaming Frog:

Identifying pages with links no longer live is also possible via the Swiss army knife of SEO – Screaming Frog. Using Screaming Frog’s custom extraction function and a relatively simple RegEx rule can establish whether the crawled page contains a link to the site under review.

The RegEx rule below, once inserted into the extraction field within Screaming Frog, will list any page that features a valid link pointing to builtvisible.com. It will filter out links with incorrect HTML syntax and look for links pointing to subdomains as well (“www.”,”testing.”, etc.).

Our example:

(?i)a.{1,}\s*href=\s*("|)\s*(http:\/\/|https:\/\/|)([a-z0-9]{1,}+\.|)builtvisible\.com

Regex template:

(?i)a.{1,}\s*href=\s*("|)\s*(http:\/\/|https:\/\/|)([a-z0-9]{1,}+\.|)[Example Domain]\.[Domain TLD]

Please note, in order to maintain valid regex syntax, the “.” characters must be escaped by inserting a “\” sign in front of them when entering the domain’s TLD.

To make sure you get the most accurate results possible, it’s also recommended that the crawls run with the following settings:

Always Follow Redirects
5xx Response Retries (changed from 0 to 5)
Crawl Speed (changed to 5 URLs/s from unlimited)
User Agent (changed to Google-bot from Screaming Frog Bot)

After Screaming Frog has completed its crawl the results should look like this within the extraction tab:

If the “Link Status Check 1” field is populated, it means that a link pointing to the defined domain has been spotted.

It’s important to keep in mind that – unlike the previous tools – Screaming Frog is quite memory intensive using the default storage mode. If you are scanning several thousand links, it’s worth switching the memory mode to internal storage (HDD/SSD) instead. This will help to reduce system stress in exchange for speed.

Taking things further

Although this post primarily focused on reviewing the status of a link for auditing purposes, there are so many other uses for this approach. As an example, when trying to recover lost link equity, knowing what links are live can drastically reduce your list of recommended redirects.

Regardless of what you use this for, I hope the techniques discussed in this post save you some time – so you can concentrate on analysing the links that matter.

Zgred

31st January 2019 at 15:40

Take a look on Clusteric.com :) Nice software from Poland, also available in English version. Very useful to prepare Onsite and Offsite analysis.

Dan Smullen

31st January 2019 at 18:38

Screaming frog and that custom regex extraction to do this is the business. Thanks for sharing. Another cool way of using SF and this technique is to grab all brand mentions from google news to check for unlinked brand mentions at scale! A little off topic, but popped to mind when reading this.

Consulenza

9th February 2019 at 17:09

Another pro for Screaming Frog. I just love this software more and more!

Lisa George

25th February 2019 at 12:49

Very interesting post on Link Analysis.

Awesome tips and step by step explanation. Really appreciate the way you have written and explained.

I am really gonna apply this for the future. Worth reading it.

Thanks for sharing it with us.

Good work..!!

Identifying dead links for more effective link analysis

URL Profiler:

Kerboo:

Screaming Frog:

Our example:

Regex template:

Taking things further

Comments are closed.

Zgred

Dan Smullen

Consulenza

Lisa George

You might also be interested in

Identifying dead links for more effective link analysis

Identifying dead links for more effective link analysis

Category page link building for ecommerce sites. The good,...

Category page link building for ecommerce sites. The good, the bad and the ugly.

What good looks like when it comes to effective SEO auditing

What good looks like when it comes to effective SEO auditing

Join the Inner Circle