How to Analyze A/B Tests in Tableau Using Z-Tests


This is the fourth post in a series on statistical analysis in Tableau. For other applications, see How to Isolate Linear Regression Equations in Tableau.

So you’ve completed an A/B test. Now what? It’s easy to determine which version performed better, but is the outcome statistically significant?

You could use an online calculator to find out. Search for a calculator, type in the test results, and it will tell you whether the results are significant. When you analyze just a few tests, these calculators work well. But as the number of tests grows, this method becomes tedious, invites typos, and limits your options. Building the statistical test in Tableau takes more time upfront but eliminates the tedium and risk of manual entry. You also gain control of the confidence level and statistical method used.

This workbook applies a two-tailed z-test to a sample dataset of email opens. It flags the winner of each email test and indicates whether the result was statistically significant. You can download the workbook as-is and replace the dataset with your own, or keep reading for instructions to build your own copy.

An Introduction to the Statistics

This paragraph briefly reviews the statistics and is completely optional. A z-test determines whether the means (here, the open rates) of two populations differ by a statistically significant amount, and it is appropriate when each sample has at least 30 observations. A two-tailed test considers both over- and underperformance, making it ideal for A/B testing. For z-tests, we need two metrics: the z-score and the z-test result. The z-score is the threshold for statistical significance (also called the critical value) and is determined by the confidence level. At the same confidence level, two-tailed tests have higher thresholds than one-tailed tests. These values come from lookup tables. Next, we use the observations to calculate the z-test result. If the z-test result is greater in magnitude than the z-score, we can state that the populations are significantly different.
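If you want to check a lookup-table value rather than take it on faith, the thresholds can be reproduced from the standard normal distribution. The sketch below is written in Python rather than Tableau, and the function name critical_z is a hypothetical helper, not something from the workbook.

```python
from statistics import NormalDist


def critical_z(confidence: float, two_tailed: bool = True) -> float:
    """Return the z threshold for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    # A two-tailed test splits alpha across both tails of the distribution.
    tail_area = alpha / 2.0 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)


# Compare two-tailed and one-tailed thresholds at common confidence levels.
for level in (0.90, 0.95, 0.99):
    two = critical_z(level)
    one = critical_z(level, two_tailed=False)
    print(f"{level:.0%}: two-tailed {two:.3f}, one-tailed {one:.3f}")
```

At 95% confidence the two-tailed threshold is roughly 1.96, which matches the value found in standard lookup tables.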

First, create a calculation to determine which subject line performed better.

This is the z-test calculation we’re going to translate into Tableau. Here, A and B represent our open rates, while popA and popB represent emails sent.
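For reference, one common form of the two-proportion z-test is written out here. It matches the numerator and two-part denominator built in the steps that follow, but it assumes the unpooled standard error, so treat it as a sketch; the workbook's exact calculation may differ.

```latex
z = \frac{A - B}{\sqrt{\dfrac{A\,(1 - A)}{\mathrm{popA}} + \dfrac{B\,(1 - B)}{\mathrm{popB}}}}
```

The first term under the square root uses only A and popA, and the second uses only B and popB.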

First, calculate the numerator. Make sure that your table calculation includes both A and B and is calculated separately for each email.

Next, calculate the first part of the denominator.

Then we calculate the second part of the denominator, wrapping the same calculation in a lookup function.

Finally, bring the components together. The z-tests for A and B have the same value but opposite signs. You can test this by changing the sort order, putting B before A.

To generalize the significance across both A and B sets, we use Window Max and take the absolute value.
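To sanity-check the Tableau output outside the workbook, the same steps can be sketched in Python. This assumes the unpooled two-proportion form of the z-test; the function name z_test and the sample open rates and send counts are hypothetical, not taken from the workbook's dataset.

```python
from math import sqrt


def z_test(rate_a: float, pop_a: int, rate_b: float, pop_b: int) -> float:
    """Unpooled two-proportion z-test result for A relative to B."""
    numerator = rate_a - rate_b
    # Each part of the denominator depends on one variant only.
    part_a = rate_a * (1 - rate_a) / pop_a
    part_b = rate_b * (1 - rate_b) / pop_b
    return numerator / sqrt(part_a + part_b)


# Hypothetical example: 22% vs 20% open rate, 5,000 emails sent for each.
z_ab = z_test(0.22, 5000, 0.20, 5000)
z_ba = z_test(0.20, 5000, 0.22, 5000)  # same test with A and B swapped

# Same magnitude, opposite signs.
print(round(z_ab, 3), round(z_ba, 3))

# Analogue of taking the absolute value and the window maximum:
# one shared magnitude for both rows of the test.
z_magnitude = max(abs(z_ab), abs(z_ba))
print(round(z_magnitude, 3))
```

Swapping the arguments plays the same role as changing the sort order in Tableau: the sign flips and the magnitude stays the same.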

Next, create a confidence level parameter. For background on parameters, see An Introduction to Parameters in Tableau.

Comparing Z-Test Results to the Z-Score

Determine significance by comparing the z-test result to the z-score. The z-score values in the calculation come from lookup tables for two-tailed z-tests. You’ll need to update these values if you want to use a one-tailed test. Then use tSignificance and SL Winner to indicate winners and significant winners.
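The comparison logic can be sketched as follows. This is a Python illustration, not the workbook's calculated fields: the names TWO_TAILED_Z and evaluate_test are hypothetical, and the thresholds are the standard two-tailed values for 90%, 95%, and 99% confidence.

```python
# Two-tailed thresholds from a standard normal lookup table,
# keyed by confidence level in percent.
TWO_TAILED_Z = {90: 1.645, 95: 1.960, 99: 2.576}


def evaluate_test(rate_a: float, rate_b: float, z_result: float,
                  confidence: int = 95) -> tuple[str, bool]:
    """Return the better-performing variant and whether the test is significant."""
    if rate_a > rate_b:
        winner = "A"
    elif rate_b > rate_a:
        winner = "B"
    else:
        winner = "Tie"
    # Significant when the z-test result exceeds the threshold in magnitude.
    significant = abs(z_result) > TWO_TAILED_Z[confidence]
    return winner, significant


# Hypothetical test: A opens at 22%, B at 20%, z-test result of 2.456.
print(evaluate_test(0.22, 0.20, 2.456, confidence=95))
print(evaluate_test(0.22, 0.20, 2.456, confidence=99))
```

The same result can be significant at one confidence level and not at another, which is why exposing the confidence level as a parameter is useful. Switching to a one-tailed test would mean replacing the values in the lookup dictionary.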

For each A/B test, we now know which subject line performed better and whether the test was statistically significant. With this information, we can prioritize updating emails and focus future testing on emails with uncertain results. I hope this tutorial saved you some time.

– Felicia
