Pre-test Modeling

What Is Pre-test Modeling?

Pre-test modeling uses statistical models to better understand a test before you run it. Will the test reach statistical significance in a reasonable amount of time? What would the business impact of different lifts be? How confident are we that we can achieve those lifts?

The Why and the How

By building simple tables that display the key figures for a test before significant work goes into it, an organization can better target the tests most likely to advance company-wide goals. The tables come from a simple statistical significance calculation run across a range of test conversion rates. This lets the team view not only the expected result but also a range of outcomes, both better and worse, so they can accurately assess whether the test is worth running.
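As a rough illustration of the kind of significance calculation described above, the Python sketch below checks whether a pair of conversion rates would be distinguishable at a given sample size. It assumes a pooled two-proportion z-test, equal-sized groups, and a 95% two-sided confidence level. The article does not say which test its tables use, and the function names and sample values here are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_score(control_rate, variant_rate, n_per_group):
    """Pooled two-proportion z statistic for two equal-sized groups."""
    pooled = (control_rate + variant_rate) / 2
    std_err = sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    return (variant_rate - control_rate) / std_err


def is_significant(control_rate, variant_rate, n_per_group, confidence=0.95):
    """True when the two-sided test clears the chosen confidence level."""
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return abs(z_score(control_rate, variant_rate, n_per_group)) > critical


# Screen a few candidate variant rates against a 20% control (illustrative values).
for rate in (0.22, 0.25, 0.28):
    verdict = is_significant(0.20, rate, n_per_group=500)
    print(f"{rate:.0%} vs 20% with 500 entries per group: significant = {verdict}")
```

Running the same check across a range of rates and sample sizes is what turns a single calculation into the kind of planning table this article describes.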

Overview

To optimize a testing strategy, you need to know where you are and where you want to go. A team can build a strong predictive model from just those two pieces of information. The example in this article looks at an organization that knows its average number of entries for a given time period, its average conversion rate, and a company goal expressed as a net increase in conversions. This is only one way to build a model; adapt it to fit your own process and goals.

Some common issues occur when pre-test modeling is not performed:

Running on too small a testing audience: If a test audience is too small, it can take months or even a year to reach a statistically significant outcome.
Not knowing what to do when the lift is smaller than expected: During the test, at what point is the lift low enough that the test should be scrapped?
Running a successful test that has little to no business impact: Even if the test succeeds, will it make enough progress on the organization's key objectives to have been worth running in the first place?

The Example

In this example, we look at a process with 120 daily entries and an average conversion rate of 20%. The test is proposed as a way to work toward the company goal of increasing sales by 10,000 this year. Once the team decides on a potential strategy and an expected conversion rate (25% in this example), we can input the values into our model.

As you can see below, the only values we need to understand the impact of this test are our current conversion rate and entry average, the company goal, the expected new conversion rate set by the team, and the expected start date.

With this information, we can easily see the most important results of our test. If the test begins by January 14th, we can reach statistical significance within nine days, and the expected lift translates to 2,000 additional sales, or 20% of our goal.
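The figures above can be approximated with a short Python sketch. It assumes a 50/50 traffic split, a pooled two-proportion z-test at 95% two-sided confidence, observed rates that exactly match the expected rates, and a full rollout of the variant once significance is reached. None of these assumptions is stated in the article, and the year 2023 is a placeholder because only "January 14th" is given. The function names are hypothetical.

```python
from datetime import date, timedelta
from math import ceil
from statistics import NormalDist


def days_to_significance(daily_entries, control_rate, variant_rate, confidence=0.95):
    """Days for a 50/50 split to reach significance if the expected rates hold."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pooled = (control_rate + variant_rate) / 2
    lift = variant_rate - control_rate  # must be non-zero
    n_per_group = ceil(z**2 * 2 * pooled * (1 - pooled) / lift**2)
    return ceil(2 * n_per_group / daily_entries)


def additional_sales(daily_entries, control_rate, variant_rate, rollout, year_end):
    """Extra conversions if every entry sees the variant from rollout to year_end."""
    days_live = (year_end - rollout).days + 1
    return daily_entries * (variant_rate - control_rate) * days_live


start = date(2023, 1, 14)  # assumed year; the article only says "January 14th"
days = days_to_significance(120, 0.20, 0.25)
sales = additional_sales(120, 0.20, 0.25, start + timedelta(days=days), date(2023, 12, 31))
print(f"days to significance: {days}")
print(f"additional sales: {sales:.0f} ({sales / 10_000:.0%} of a 10,000 goal)")
```

Under these assumptions the sketch gives nine days to significance, matching the article, and a little over 2,000 additional sales. The article's own table may count days or test-period traffic differently, so treat the sales figure as an approximation.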

These tables only produce the expected returns; it is up to the team to decide if those returns are worth the expected effort. For example, if the team feels it will only be able to implement three tests in the year, three tests at 20% of the goal each would fall well short, so they may need to look for more impactful tests than this.

Now you may be asking yourself, “What if I run the test and we do not reach the 25% conversion rate we are predicting?” A good model will be prepared for such a situation. In our example, we use a step analysis to see how different conversion rates would change the results of the test.

As you can see above, we now have an expanded table covering several potential conversion rates for our test. A few notes on the design of this table:

The central column is highlighted green, as that is our expected result.
All columns have special rules to indicate errors and warnings. For example, the red 20% indicates no lift over the control, and the yellow #N/A values show that statistical significance would take more than X days (120 in this case) to achieve.
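A step analysis like the one described above could be sketched in Python as follows. The candidate rates, the "-" shown for a no-lift row, and the function names are hypothetical, and the significance estimate assumes a 50/50 split with a pooled two-proportion z-test at 95% confidence, which the article does not specify.

```python
from math import ceil
from statistics import NormalDist

MAX_DAYS = 120  # the cutoff the article uses before showing #N/A


def days_to_significance(daily_entries, control_rate, variant_rate, confidence=0.95):
    """Estimated days to significance; None when the variant shows no lift."""
    lift = variant_rate - control_rate
    if lift <= 0:
        return None
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pooled = (control_rate + variant_rate) / 2
    n_per_group = ceil(z**2 * 2 * pooled * (1 - pooled) / lift**2)
    return ceil(2 * n_per_group / daily_entries)


def step_table(daily_entries, control_rate, candidate_rates):
    """One (rate, days cell, flag) row per candidate conversion rate."""
    rows = []
    for rate in candidate_rates:
        days = days_to_significance(daily_entries, control_rate, rate)
        if days is None:
            rows.append((rate, "-", "error: no lift over control"))
        elif days > MAX_DAYS:
            rows.append((rate, "#N/A", f"warning: more than {MAX_DAYS} days"))
        else:
            rows.append((rate, days, ""))
    return rows


rows = step_table(120, 0.20, [0.20, 0.21, 0.23, 0.25, 0.27, 0.30])
for rate, days, flag in rows:
    print(f"{rate:>4.0%}  {str(days):>5}  {flag}")
```

With these inputs, the 20% row is flagged as an error, the 21% row falls past the 120-day cutoff, and the expected 25% rate reaches significance in nine days.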

With these additional pieces of information, the team should be able to easily determine whether a test is worth running to achieve the desired company goal.

Related Information

Errors and Warnings

You will notice red and yellow highlights in some of the cells above. These rules notify the user of errors or warnings in their inputs. You can learn more about general rules in our Errors and Warnings article. In this specific case, we are highlighting one error and two warnings. The error is that one of the conversion rates in our range is equal to or lower than the current conversion rate. The first warning is for the expected start date in the input section, letting the operator know that they have entered a date in the past. The second warning is for the cells reading #N/A in the range-of-conversions section. It triggers when the lift will not achieve statistical significance within 120 days of the test's start.
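The three rules above could be expressed as a small validation routine. This Python sketch is one possible arrangement, not the logic behind the article's tables. The function name, the shape of the days_by_rate input, and the sample values are all hypothetical.

```python
from datetime import date

MAX_DAYS = 120  # cutoff used in the article's example


def check_inputs(control_rate, days_by_rate, start_date, today=None):
    """Return (level, message) pairs for the one error and two warnings.

    days_by_rate maps each candidate conversion rate to its estimated days
    to significance, or None when no estimate is available.
    """
    today = today or date.today()
    issues = []
    if start_date < today:
        issues.append(("warning", f"start date {start_date} is in the past"))
    for rate, days in days_by_rate.items():
        if rate <= control_rate:
            issues.append(("error", f"{rate:.0%} does not exceed the current {control_rate:.0%}"))
        elif days is not None and days > MAX_DAYS:
            issues.append(("warning", f"{rate:.0%} needs more than {MAX_DAYS} days"))
    return issues


# Illustrative inputs: a start date that has already passed and three candidate rates.
issues = check_inputs(
    control_rate=0.20,
    days_by_rate={0.20: None, 0.21: 209, 0.25: 9},
    start_date=date(2023, 1, 14),
    today=date(2023, 2, 1),
)
for level, message in issues:
    print(f"{level}: {message}")
```

Keeping errors and warnings as separate levels mirrors the red and yellow highlights: an error means the input cannot produce a meaningful result, while a warning leaves the decision to the operator.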

Multivariate Testing

A simple A/B test was used in this example to demonstrate how modeling can be useful. To get more out of a test, you may want to run several variants at once, sometimes known as an A/B/C… test and loosely referred to here as multivariate testing. With this, you can try more options to determine which produces the best lift. However, you may have to use a “likelihood to reach desired lift” calculation instead of a standard statistical significance test. This is because adding more variants reduces the audience size of each group, which could delay reaching statistical significance.
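To make the audience-size effect concrete, the Python sketch below estimates how long each variant would need to reach significance against the control when the same daily traffic is split evenly across more groups. It assumes a pooled two-proportion z-test at 95% confidence and applies no multiple-comparison correction, which would lengthen the estimates further. It does not implement the “likelihood to reach desired lift” calculation, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def days_to_significance(daily_entries, control_rate, variant_rate, groups=2, confidence=0.95):
    """Days until each of `groups` equal-sized groups has enough entries."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pooled = (control_rate + variant_rate) / 2
    lift = variant_rate - control_rate  # must be non-zero
    n_per_group = ceil(z**2 * 2 * pooled * (1 - pooled) / lift**2)
    return ceil(groups * n_per_group / daily_entries)


# Same example inputs (120 daily entries, 20% control, 25% expected), more groups.
for groups in (2, 3, 4):
    days = days_to_significance(120, 0.20, 0.25, groups=groups)
    print(f"{groups} groups: about {days} days")
```

With the example's inputs, the estimate grows from 9 days with two groups to 14 with three and 18 with four, since the required entries per group stay the same while each group's share of traffic shrinks.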