
How to use Google advanced search operators to find indexation and technical SEO issues

by on 8th May 2019

We SEOs like our tools. Each one is instrumental in uncovering aspects of a site we might not have thought about, or in trawling through information that would take years to process manually. One of the best tools, however, is often underutilised: Google itself.

Google has the advantage of not only being free but also storing exactly which of your site’s URLs (and your competitors’) are indexed. With a little persuasion, the SERPs will reveal that knowledge, helping with tasks such as competitor research, content analysis and technical auditing.

In this article, I’ll be focusing on the last of these, giving examples so you can see the operators in action and use them to eliminate those issues.

Basic Search Operators

Before we consider specific usage cases, it is important to look at the basic commands and how they can refine the SERPs. These search operators can all be combined to dig deep into Google’s index.

1. Quotation marks search operator

By surrounding your search query in quotation marks, you are telling Google you want to search for the exact term. This can either be the entire string of the search term or certain words that need to be grouped together within the result.

Example use:

“technical audit” seo

2. Site search operator

This search operator can be used to restrict your search to a specific website.

Example use:

site:builtvisible.com

3. OR search operator

Use this to find pages that contain any one of several words or phrases. Multiple OR search operators can be strung together. This can also be used in combination with the site search operator to search multiple sites.

Example use:

“webmaster tools” OR “search console”

4. Inurl search operator

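This command will search within page URLs for a specific word or phrase. It applies only to the term that immediately follows it, so wrap a multi-word phrase in quotation marks.

Example use:

inurl:“screaming frog”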

5. Filetype search operator

This restricts the search to certain filetypes only – really useful for finding those PDFs that were uploaded in 2010 and subsequently forgotten about.

Example use:

filetype:pdf

6. Around search operator

This one is a personal favourite. You can find examples of content in which the words you are searching for appear in close proximity. The number within the brackets defines the maximum number of words that may separate the two terms.

Example use:

“Joe Cole” AROUND(3) “west ham”

7. Exclude search operator (-)

Used to provide results that do not include a certain word. It can often be included before other search operators to exclude sites/titles. E.g. -site:builtvisible.com would exclude this site from your search.

Example use:

Google -analytics

8. Include words (+)

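Again, this can be combined with most search operators and was traditionally used to make sure that a specific word/phrase is included in the page. Bear in mind that Google retired the + operator for this purpose in 2011, so wrapping the term in quotation marks is the more reliable way to force a match.

Example use:

Google +“data studio”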

9. Date Range search operators

This one is currently being tested by Google but, considering how often their Search Liaison Twitter account has tweeted about it, it is likely to stay.

[Screenshot: SearchLiaison tweet about the before: and after: search operators]

This search operator allows you to search for content created before or after specified dates. Super handy for content and competitor research, or just to see if any legacy (or new) pages have an issue either pre- or post-migration.

Example use:

GDPR before:2018-06-01 after:2016-06-01

This list isn’t exhaustive but will give you some ideas of how you can refine your search queries to get more specific results.
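Since all of these operators are just fragments of text separated by spaces, it can be handy to assemble them programmatically when you are running the same checks across lots of sites. Below is a minimal Python sketch of that idea. The helper names build_query and search_url are my own inventions for illustration, and the URL is simply the ordinary Google search address with the query percent-encoded; nothing here talks to Google itself.

```python
from urllib.parse import quote_plus


def build_query(*parts: str) -> str:
    """Join operator fragments into one query string, skipping empty ones."""
    return " ".join(p.strip() for p in parts if p and p.strip())


def search_url(query: str) -> str:
    """Return a Google search URL with the query percent-encoded."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Combine a site restriction, an exact phrase and a filetype filter.
query = build_query("site:builtvisible.com", '"technical audit"', "filetype:pdf")
print(query)
# site:builtvisible.com "technical audit" filetype:pdf
print(search_url(query))
# https://www.google.com/search?q=site%3Abuiltvisible.com+%22technical+audit%22+filetype%3Apdf
```

Keeping each operator as its own fragment makes it easy to add or drop one refinement at a time while you experiment.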

How to use search operators for technical audits

While the likes of Screaming Frog, Botify and Deep Crawl can find potential issues on your site, they tend not to show you what is stored by search engines and therefore what users can find when they search for your site. It is imperative that you find out how Google perceives your site to not only find the technical issues but to be able to understand the scope of the problem and how much of a priority it should be.

Every site is different and therefore has its own unique issues. Below we will look at some of the problems we often find and what can be done to fix them.

It is worth bearing in mind that, while the screenshots below will feature the results at the top of the SERPs, it is still worth clicking onto page two (or even page 15) to find the strange pages that Google has indexed.

1. Check your site

site:huffingtonpost.co.uk

Whenever I do a technical audit, a simple site search is always my first port of call. Ignore page 1 entirely and start looking at high page numbers. Simply note down what you find so that you can follow it up with more refined searches at a later stage. Some common things that can be found are malformed URLs, indexed parameters and pages with filler text.

Examples:

[Screenshot: combined examples from site: search results]

All sorts of interesting issues can be thrown up if you dig through site:xx SERPs and, when found, you’ll have your work cut out for you.

2. Check if filler content has been indexed

There are several ways of doing this, such as checking for dev or test sub-domains (see below). However, one that frequently gets overlooked is checking for dummy ‘lorem ipsum’ text. This text is frequently used by developers to see if the layout looks right. While the text itself can vary, it almost always begins with lorem ipsum, which means we can do an intext search for that phrase to see if any has erroneously been indexed.

Example use:

site:bbc.com intext:“lorem ipsum”

[Screenshot: bbc.com results for lorem ipsum]

3. Find duplicate titles

If in your tech audit you notice that a page with the word ‘test’ in its title has been indexed, you can easily find all the others that Google has indexed using the intitle search operator, which restricts matches to words in the page title.

Example use:

site:xxx intitle:test

[Screenshot: results with ‘test’ in the title]

4. Check if non-secure pages have been indexed

This is a good way to find out which of your non-secure http pages have been indexed by Google. This could point to canonicalization issues (or rather that Google is ignoring your canonical tags) or that a global redirect from http to https is needed. Insecure content tends to be ranked slightly lower than secure content and, depending on the site, could pose a genuine security issue for users.

Example use:

site:bbc.co.uk inurl:http

OR

site:bbc.co.uk -inurl:https

[Screenshot: bbc.co.uk http checks]

Using the first string will simply find all the http pages. The second exclude string may bring up some weird pages such as FTP pages or other protocols that really should not be in the index.

5. Finding subdomains that you were not aware of

This search operator is especially powerful when you combine it with multiple exclude search operators so that you can take away sub-domains that you know should be indexed. This is a great pointer to all manner of technical issues, and a way to find long-forgotten and redundant sub-domains.

Example use:

site:amazon.com -www -music -aws

[Screenshot: amazon.com sub-domains]
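In practice you tend to run this search several times, adding another exclusion each time a legitimate sub-domain turns up. A small Python sketch of that loop is below; the function name unknown_subdomain_query is hypothetical and it only builds the query text in the same form as the example above.

```python
def unknown_subdomain_query(domain: str, known_subdomains: list[str]) -> str:
    """Build a site: query that excludes each sub-domain label we already know about."""
    exclusions = " ".join(f"-{label}" for label in known_subdomains)
    return f"site:{domain} {exclusions}".strip()


known = ["www", "music", "aws"]
print(unknown_subdomain_query("amazon.com", known))
# site:amazon.com -www -music -aws

# After spotting another legitimate sub-domain in the results, add it and search again.
known.append("smile")
print(unknown_subdomain_query("amazon.com", known))
# site:amazon.com -www -music -aws -smile
```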

6. Finding all the non-html content on a site

This is great for finding useful content that can be converted into standard HTML pages, which will be easier to index and rank in Google. Filetype can also be used to find files that shouldn’t be indexed; these can be given a noindex directive (for non-HTML files, via the X-Robots-Tag HTTP header rather than a meta tag) or a robots.txt disallow directive.

Example use:

site:bbc.co.uk filetype:xml OR filetype:pdf OR filetype:txt

[Screenshot: bbc.co.uk file types]
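If you keep a list of the file extensions you care about, the OR chain can be generated rather than typed out. This is a rough Python sketch; filetype_query is a made-up helper name and the list of extensions is only an example.

```python
def filetype_query(domain: str, extensions: list[str]) -> str:
    """Build a site: query that matches any of the given file extensions."""
    clause = " OR ".join(
        f"filetype:{ext.lstrip('.').lower()}" for ext in extensions
    )
    return f"site:{domain} {clause}".strip()


# Leading dots and upper-case extensions are normalised before joining.
print(filetype_query("bbc.co.uk", ["xml", ".PDF", "txt"]))
# site:bbc.co.uk filetype:xml OR filetype:pdf OR filetype:txt
```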

7. Find specific parameters that have been indexed

Annoyingly, the inurl search operator doesn’t like wildcards, which means you must find the parameters manually.

The ‘site:’ search we looked at earlier will likely have given you a few parametrised URLs to get you started. Other ways to find things to look at include the site’s robots.txt file (check to see if the parameters you have asked to be disallowed have actually been removed from the index), the Google Search Console Parameters report (old search console only at the moment) or by crawling the site with something like Screaming Frog.

Once you have this information you can simply do an inurl search with the parameters you have found to work out the scale of the problem and if it is worth the development time to fix it.

Example use:

site:amazon.co.uk inurl:?ie=

2.8 million URLs is probably worth fixing…

[Screenshot: amazon.co.uk indexed parameters]
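If you have a crawl export to hand, pulling the parameter names out of it is easy to script. The sketch below assumes you have the crawled URLs as a plain Python list; the sample URLs, the example.com domain and the function names are all invented for illustration. It counts how many URLs each parameter appears on and turns the most common ones into inurl queries you can paste into Google.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def parameter_counts(urls: list[str]) -> Counter:
    """Count how many URLs each query parameter name appears on."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values so that "?ref=" with no value is still counted.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


def inurl_queries(domain: str, urls: list[str], top: int = 5) -> list[str]:
    """Build one site: + inurl: query per parameter, most common first."""
    most_common = parameter_counts(urls).most_common(top)
    return [f"site:{domain} inurl:{name}=" for name, _ in most_common]


# Hypothetical crawl export.
sample_urls = [
    "https://example.com/shoes?ie=UTF8&ref=nav",
    "https://example.com/shoes?ie=UTF8&sort=price",
    "https://example.com/bags?ie=UTF8&ref=home",
    "https://example.com/about",
]

for q in inurl_queries("example.com", sample_urls, top=2):
    print(q)
```

I have left the leading ? out of the generated queries because a parameter can just as easily follow an & in the URL.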

8. Finding old addresses or phone numbers

If your business has recently relocated or changed its contact details, then the following search operator can be used to make sure that every page has been updated. This can also be used to check directory sites that need to be updated with the new details.

Example use:

site:builtvisible.com intext:“old address” OR intext:“old phone number”

[Screenshot: address checks]

9. Finding internal link opportunities

As part of your technical auditing process, you may have noted that your internal linking isn’t as great as it could be. But how can you find the pages that should link to the page? There are many methods that usually involve a lot of page categorisation and a massive Excel sheet of some description. Why not let Google do the heavy lifting for you?

Simply enter a relevant search term via intext, exclude the page you want to link to, and all should be revealed. Or, at the very least, the results will give you a few ideas to get you started.

Example use:

site:builtvisible.com -site:builtvisible.com/scraping-people-also-ask-boxes-for-seo-and-content-research/ intext:“scraping”

[Screenshot: potential internal link checks]
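If you have a list of target pages and the phrases you would like them linked from, the same query can be generated for each pair. Here is a small Python sketch; internal_link_query is a hypothetical helper, and it strips the protocol from the target URL so that the exclusion is in the host/path form used in the example above.

```python
def internal_link_query(domain: str, target_url: str, phrase: str) -> str:
    """Find pages on the domain that mention the phrase, excluding the target page itself."""
    # Drop "https://" or "http://" so the value suits the site: operator.
    target = target_url.split("://", 1)[-1]
    return f'site:{domain} -site:{target} intext:"{phrase}"'


print(
    internal_link_query(
        "builtvisible.com",
        "https://builtvisible.com/scraping-people-also-ask-boxes-for-seo-and-content-research/",
        "scraping",
    )
)
```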

10. Find URLs with a different domain

This will allow you to find all the sites or pages that sit on a different top-level domain (TLD) than your main site. It might not necessarily be a technical issue, but it is worth drilling down into these TLDs for duplicate content issues or to see if the same technical issues that are affecting your main site are replicated on other TLDs.

Example use:

site:asos.* -site:asos.com

[Screenshot: asos TLD checks]

Go forth and Google

Hopefully, this article has shown you the type of information that can be prised from the SERPs to diagnose specific issues facing your website.

Most of the search queries discussed only feature two or three operators, but with greater refinement even more knowledge can be found when chaining four, five or more of these operators together – it’s up to you to experiment and figure out what works for your sites or the task you’re trying to achieve.

Technical audits are not the only use we have found for this kind of methodology. Our content team regularly use similar search operator queries when conducting competitor, outreach or content research for our clients, which can help to provide a great insight into the overall search landscape for any given topic or keyword.

Responses

  1. Glenn Gabe has pointed out on Twitter that another search operator that can be used is the src: operator within Google Image search.

    This operator is useful for finding all the images that are being indexed on a site (or can be combined with other search operators to drill down further). He recently encountered an issue where the site: search operator was not giving a full picture and, with the help of some of the team at Google, found this solution to work better.

    Example code:
    banana src:bbc.co.uk

    Cheers for that one Glenn!

  2. Great guide Dave, great use of them in 9 and 10

    Another use would be checking for internal duplicate content.

    site:example.com “paste a sentence or two of content you want to check here”

    Can be easier than just pasting it in without the site operator if a lot of scraper sites have used the content.

  3. Great Guide and helpful

  4. It’s definitely a great and useful article Dave. Thank you.
