SEO tools show the what but not the why
These and other SEO tools focus on measuring traffic and capturing data. SEO teams around the world use them to report on the improvement (or decline) of organic search revenue, but they do so by looking at traffic data in aggregate, without knowing the detailed impact of each action taken.
Consider this chart. With what degree of confidence can you say that the increases seen are the result of the two optimisations annotated?
Don’t just take my word for it:
Reporting the true impact of SEO changes on organic traffic and revenue can be tricky with so many contributing variables in play. Seasonality, PR, technical issues, environmental challenges like COVID and of course Google updates can all greatly influence and distort our view, and that’s before encountering limitations on the data we have available to us.
Dan Butler, Performance Director, Builtvisible
Causal Inference – Showing the true impact of change
In industry, data science is used in a wide range of business problems, from customer churn to sales forecasts, supply chain logistics, product development, and everything in between. Causal inference is an area of data science dedicated to the investigation of causes that can be inferred from data.
Humans have been interested in causality for hundreds of years. However, causal science as an area of study is fairly recent. Only in the latter half of the 20th century did we see research actively focused on causality, thanks to the work of pioneering methodologists such as Donald Rubin and Judea Pearl.
The prime tools for causal inference used today are A/B and multivariate testing. Many businesses rely on randomised controlled trials (or RCTs, the generalised form of A/B and MV tests) to set up test and control groups that help them understand the causal effects any change may have upon bottom-line metrics. Marketing campaigns, sales drives, conversion rate optimisations, and customer success strategies are just some examples of operations that benefit from RCTs, which are applicable to almost any industry.
In general, whenever you can take independent, blind samples, you are able to set up an A/B test and work to understand the impact of any proposed action. SEO, however, is one of the few areas of study where A/B tests are simply not viable. There is no way to perform sampling on a population of one, which in SEO’s case is the search engine.
From a data perspective applied to SEO, we are looking to understand the impact that our actions have upon the ranking and searchability of our properties. We are working with and against an unknown algorithm powering a search engine: our sole test subject and the root of our measurement problem.
Causal inference helps solve this problem. Causality is an area of study full of interesting and impactful topics, but also some of the most difficult. Unlike traditional machine learning problems, causal questions cannot be answered with tried and tested predictive or descriptive tools. In causal inference we need to develop models that can look back in time: models that, from a starting point in the past, can estimate what would have happened if no action had been taken. In statistics, this estimate is called the counter-factual.
By comparing present-day data with an estimated counter-factual we can report causal uplifts using only observational info. There is no need to set up expensive experiments or extract samples and split them into test and control groups. You can even apply causal inference methods to understand the impact of optimisations implemented months ago if you have sufficient data before and after said action was taken.
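To make the idea concrete, here is a minimal sketch of comparing observed data with a counter-factual. It is not how SEOCausal works internally; it assumes the simplest possible model, namely that without the intervention the test pages would have kept the same ratio to a group of control pages that they had beforehand. The function names and the daily click figures are invented for illustration.

```python
from statistics import mean


def scaled_counterfactual(test_pre, control_pre, control_post):
    """Estimate what the test pages would have done with no intervention.

    Assumption: absent the change, test traffic keeps the same ratio to
    control traffic that it had before the intervention.
    """
    ratio = mean(test_pre) / mean(control_pre)
    return [ratio * value for value in control_post]


def relative_uplift(observed_post, counterfactual_post):
    """Relative difference between observed and counter-factual totals."""
    expected = sum(counterfactual_post)
    return (sum(observed_post) - expected) / expected


# Invented daily clicks, for illustration only.
test_pre = [100, 110, 90, 100]
control_pre = [200, 220, 180, 200]
control_post = [210, 190, 200, 200]
observed_post = [120, 110, 115, 115]

counterfactual = scaled_counterfactual(test_pre, control_pre, control_post)
uplift = relative_uplift(observed_post, counterfactual)
print(f"Estimated uplift: {uplift:.1%}")
```

A fixed ratio is a strong assumption, which is why production methods tend to use richer models, but the comparison of observed against estimated "no action" values is the same.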
We’ve found causal science invaluable in our work at Builtvisible; the causal inference products we have developed internally enable us to solve challenging but critical data science problems that would otherwise be impossible to tackle.
SEOCausal – Builtvisible’s proprietary testing framework
Enter Builtvisible’s SEOCausal, a causal inference product developed in-house dedicated to measuring impacts of SEO actions. This product allows us to measure, with accuracy and precision, the impacts that optimisations have had upon any property or domain.
Our tool provides us with a suite of uplift modelling and causal inference methods using machine learning algorithms based on state-of-the-art research from leading teams including Google and Uber. The product was designed to be quick, flexible, and effective, and is tailored specifically to solve SEO problems.
SEOCausal allows analysts to understand the relationship between the optimisations applied and changes in bottom-line metrics. It can measure the impact upon any metric for which you have sufficient time-series data. What is the impact of adding keyword-optimised content? Or the impact of new backlinks? Or a re-categorisation of products? Or a technical backend optimisation? All these questions and more can be answered (and quantified) with causal inference and SEOCausal.
By the nature of causal inference, there are a couple of requirements for SEOCausal to work effectively.
The first is that the intervention needs to happen over a short time period. The causal algorithms powering the tool break down when the intervention is rolled out over a longer period. In other words, there must be a clear and defined test date that splits the pre-intervention period from the testing period.
The second requirement relates to the group of pages affected by the intervention, our test set. This test set, consisting of selected pages that have had an optimisation applied, needs to remain unchanged for the entirety of the test period. Otherwise, the impacts of any additional changes are conflated with those of our test and the results turn muddy and inconclusive. We recommend that the test period last no fewer than 28 days, and that the pages be frozen to eliminate any possibility of other changes affecting the area being tested.
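Both requirements are easy to check before any modelling starts. The sketch below is a hypothetical pre-flight check, not part of SEOCausal: it splits a daily series at the test date, rejects test periods shorter than 28 days, and flags test pages that had another change logged during the test. The function names, the change log format, and the data are all assumptions made for illustration.

```python
from datetime import date, timedelta

MIN_TEST_DAYS = 28  # minimum test length recommended above


def split_at_test_date(series, test_date):
    """Split a {date: value} daily series into pre-intervention and test periods.

    Assumes one observation per day with no gaps, so the number of
    observations equals the number of days.
    """
    ordered = sorted(series.items())
    pre = [value for day, value in ordered if day < test_date]
    post = [value for day, value in ordered if day >= test_date]
    if not pre:
        raise ValueError("no pre-intervention data before the test date")
    if len(post) < MIN_TEST_DAYS:
        raise ValueError(
            f"test period covers {len(post)} days; "
            f"at least {MIN_TEST_DAYS} are recommended"
        )
    return pre, post


def pages_changed_during_test(change_log, test_pages, test_date):
    """Return test pages with another change logged after the test date.

    change_log is a hypothetical list of (page, date) deployment records.
    """
    return {page for page, day in change_log if page in test_pages and day > test_date}


# Invented data: 90 days of clicks, intervention on day 60.
start = date(2024, 1, 1)
series = {start + timedelta(days=i): 100 + i for i in range(90)}
test_date = start + timedelta(days=60)

pre, post = split_at_test_date(series, test_date)
print(len(pre), len(post))

change_log = [("/shoes", test_date), ("/hats", test_date + timedelta(days=10))]
print(pages_changed_during_test(change_log, {"/shoes", "/hats"}, test_date))
```

Any page returned by the second function has broken the freeze, so its results should be treated as inconclusive.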
And that’s it, really: you have your causal test ready!
SEOCausal in action
We have had huge success using this tool to measure the impacts of dozens of interventions on smaller groups of pages for enterprise-scale clients. By controlling the number of pages, you minimise the risk that a negative action poses to the company's overall organic traffic. The ability to control the exposure of SEO implementations while gathering data on the uplift of each one is nothing short of extraordinary.
SEO teams can use causal inference to understand priorities, identify quick wins, eliminate bad options and, generally, plan an SEO strategy grounded in data and analytics.
We have completed SEOCausal analyses with a wide range of large ecommerce or enterprise-scale multinational websites, resulting in tangible, attributable uplift to individual work streams such as:
- Adding category titles to landing page titles
- Revamping content to target high-value keywords
- Creating internal back-links across high-traffic areas of the property
- Simple HTML changes like updating the header values or changing title case
- Loads more!
We have been able to effectively measure the impact of these specific implementations and report the uplift seen on metrics like impressions, sessions, users, clicks, position, and more. For example, in a recent PLP content update targeting high-value keywords for an enterprise-scale ecommerce client, our SEOCausal models concluded that the new content was the cause of an increase in impressions of more than 14%.
Planning for test success
So, what is the actual process behind SEOCausal? As simple as: Build, Match, Fit.
1) Build the Dataset:
The first step takes a global dataset that contains as much data, from as many pages, as your device can handle. The more data you include, the better your chance of finding strongly matching controls, at the expense of more computing resources. This global dataset can be obtained from any source, but we mainly use GSC and GA for our tests. This dataset has a companion: a list of the pages that are to be tested.
After running the algorithm, the output from this step is a properly formatted and aggregated dataset ready for control matching.
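As a rough illustration of this step, the sketch below turns raw (page, day, clicks) rows into aligned daily series, summing the test pages into one series and keeping every other page as a candidate control. The row layout, the function name `build_dataset`, and the choice to fill missing days with zero are assumptions for the example; an export from GSC or GA would need its own parsing.

```python
from collections import defaultdict


def build_dataset(rows, test_pages):
    """Aggregate (page, day, clicks) rows into aligned per-page daily series.

    Returns (days, test_series, candidates): the sorted list of days, the
    summed series for the test pages, and a dict of series for every other
    page. Days with no row for a page are filled with 0 (an assumption).
    """
    # ISO-formatted date strings sort chronologically.
    days = sorted({day for _, day, _ in rows})
    index = {day: i for i, day in enumerate(days)}

    per_page = defaultdict(lambda: [0] * len(days))
    for page, day, clicks in rows:
        per_page[page][index[day]] += clicks

    test_pages = set(test_pages)
    test_series = [0] * len(days)
    candidates = {}
    for page, series in per_page.items():
        if page in test_pages:
            test_series = [a + b for a, b in zip(test_series, series)]
        else:
            candidates[page] = series
    return days, test_series, candidates


# Invented rows, for illustration only.
rows = [
    ("/a", "2024-01-01", 10),
    ("/a", "2024-01-02", 12),
    ("/b", "2024-01-01", 5),
    ("/b", "2024-01-02", 7),
    ("/c", "2024-01-02", 3),
]
days, test_series, candidates = build_dataset(rows, test_pages=["/a", "/b"])
print(days)
print(test_series)
print(candidates)
```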
2) Match the Controls:
This step takes the dataset built during the previous step and finds the dynamic time warping (DTW) distance from the test set to every other page, in a one-to-many fashion. Pages with the lowest DTW distance are considered the best controls for our test and will be used for modelling in the final step. We usually match thousands of pages and select no more than 300.
3) Fit the Model:
The final step receives as input the formatted dataset from step 1 and a selection of up to 300 control pages from step 2. SEOCausal then applies a causal model that accounts for seasonality, variance, and bias variables, and returns an estimated impact from the intervention applied.
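For readers unfamiliar with dynamic time warping, here is a minimal sketch of the textbook algorithm and of ranking candidate pages by their distance to the test series. It uses the absolute difference as the point-wise cost and no warping window, both of which are assumptions; the source does not say which variant SEOCausal uses. It is also assumed here that matching would be done on pre-intervention data only. The function names and figures are invented.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    # prev holds the previous row of the cumulative cost matrix.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Cheapest way to reach this cell: from above, left, or diagonal.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[-1]


def match_controls(test_series, candidates, k=300):
    """Rank candidate pages by distance to the test series; keep the k closest."""
    ranked = sorted(
        candidates,
        key=lambda page: dtw_distance(test_series, candidates[page]),
    )
    return ranked[:k]


# Invented series, for illustration only.
test_series = [1, 2, 3, 4]
candidates = {
    "/x": [1, 2, 2, 3, 4],   # same shape, slightly stretched in time
    "/y": [4, 3, 2, 1],      # opposite trend
    "/z": [10, 20, 30, 40],  # same trend, very different scale
}
print(match_controls(test_series, candidates, k=2))
```

Note that raw values make the match sensitive to scale, so "/z" ranks last despite sharing the trend. Whether to normalise each series first is a design choice this sketch leaves open. The quadratic cost per pair is also why matching thousands of pages is the expensive part of the process.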
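The model inside SEOCausal is proprietary, so the sketch below is only a simplified stand-in for this step: it averages the matched controls into one signal, fits an ordinary least squares line to the pre-intervention period, projects that line over the test period as the counter-factual, and reports the difference from what was observed. It does not model seasonality or uncertainty explicitly; any seasonality is captured only to the extent that the controls share it. All names and numbers are invented for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def estimate_impact(test_pre, test_post, controls_pre, controls_post):
    """Fit on the pre-period, predict the test period, compare with observed.

    controls_pre and controls_post are lists of per-page series; they are
    averaged day by day into a single control signal.
    """
    control_pre = [mean(day) for day in zip(*controls_pre)]
    control_post = [mean(day) for day in zip(*controls_post)]
    intercept, slope = fit_line(control_pre, test_pre)
    counterfactual = [intercept + slope * c for c in control_post]
    absolute = sum(test_post) - sum(counterfactual)
    return {
        "counterfactual": counterfactual,
        "absolute_uplift": absolute,
        "relative_uplift": absolute / sum(counterfactual),
    }


# Invented data: two control pages, four pre-period days, two test days.
controls_pre = [[10, 20, 30, 40], [30, 40, 50, 60]]
controls_post = [[20, 30], [40, 50]]
test_pre = [45, 65, 85, 105]
test_post = [75, 90]

result = estimate_impact(test_pre, test_post, controls_pre, controls_post)
print(f"Relative uplift: {result['relative_uplift']:.1%}")
```

A real analysis would also report how uncertain the estimate is, for example by checking how far the fitted line strays from the observed values during the pre-period.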
Final thoughts
Measuring the detailed uplift of a single SEO action is not easy, but it is not impossible. Recent research in data science has enabled Builtvisible to supercharge reporting and analytics in ways that simply were not possible before. We can confidently inform clients of the results of any optimisation and provide revenue-centric analysis that highlights the value of well-done SEO.
By integrating data science teams more closely with SEO experts, we realise gains and create synergies that in turn become great products like SEOCausal. I am personally excited, and so is most of my team, about the possibilities yet to be found at the intersection of these two important areas of a modern economy.
If you are interested in learning more about the applications of SEOCausal, curious about setting up a fresh SEO test for any of your properties, or simply want to reach out and comment on this post, just contact our team and we can help you get this show on the road!