One of my favourite tools in OSE has always been the classic “Top Pages” report, allowing for deeper visibility on your most linked to pages.
With the top pages report, you may find that lower authority sites with many pages get less crawl bandwidth than sites with higher authority. This can lead to URLs in the report with “No Data” in the HTTP status field. No problem, here’s a tip to get the most out of that data using Xenu’s Link Sleuth and a simple VLOOKUP in Excel.
Collect your Open Site Explorer data
First, you’ll need to navigate over to the Top Pages tab in Open Site Explorer, and export all of the data into Excel via CSV.
Paste the exported data in to Excel and make a table
I like working with Excel tables, so highlight the exported data from the CSV and press “CTRL + L” to make your table:
Grab a refreshed and up to date list of HTTP headers
Now you have your list of internal pages, copy the list and paste what you’ve got into Notepad (Or Notepad++, Programmers Notepad or whatever else you use). We’re going to save that list and upload to Xenu via “Check URL list”.
With the list of URLs saved, open Xenu and head to “Options > Preferences“.
It’s really important to set “Maximum level” to 0, that way only the list of URLs you’ve provided will get crawled and Xenu won’t discover and follow any new URLs.
Next, head to “File > Check URL list” ready to upload your URL list. Xenu will get straight on with the business of crawling that URL list. If you’ve got thousands of URLs, this part could take a while, so make yourself a coffee or get on with some other work for a few hours… As soon as Xenu is finished, export the crawl results into a TAB separated file.
Import the crawl data in to the Excel spreadsheet and perform a VLOOKUP
Import your new list of URLs into Excel by pasting the data into a new tab. If you’re using tables, be sure to create a table with your new data. The new table should be called “Table2″. Next, you need to VLOOKUP the URL and fetch Xenu’s recorded server header response. The query will look something like this:
If you’re not familiar with Excel, try reading using tables in Excel 2007, and this post explaining VLOOKUP. The end result should be a table that has a new column with your up to date Xenu server header results making for a powerful, highly actionable review of your site’s most linked to pages.