Deep-diving into A/B testing

Rohit Verma
3 min readMay 14, 2021

A/B testing as a concept is quite straightforward: experiment with two versions and select the one that performs better. However, there are a few glaring questions that need to be answered before conducting the test, namely:

  • What should be the duration of the experiment?
  • How do we assess statistically which version performed better?
  • What is the optimum number of users to expose to the A/B test?

Usually, the longer the experiment runs, the higher the confidence in the output. But organizations rarely have the luxury of time and resources to let an experiment run indefinitely, and in a fast-paced environment decisions need to be taken with adequate confidence in the output.
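The question of duration and the question of how many users to expose are really the same question: how large a sample do you need to detect the lift you care about? Below is a minimal sketch of the standard two-proportion sample-size formula using only Python's standard library. The function name and the conversion-rate numbers are illustrative, not from any particular tool.

```python
from statistics import NormalDist
import math

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.8):
    """Approximate users needed in EACH group to detect a change
    in conversion rate from p1 to p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha=0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Detecting a lift from 10% to 12% conversion needs a few thousand
# users per group; a bigger lift (10% -> 15%) needs far fewer.
print(sample_size_per_variant(0.10, 0.12))
print(sample_size_per_variant(0.10, 0.15))
```

Note how sensitive the answer is to the size of the lift you want to detect: halving the minimum detectable effect roughly quadruples the required sample, which is why small improvements take so long to validate.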

In summary, the idea of A/B testing is to ensure that the results are statistically significant, i.e. unlikely to be explained by randomness alone.

Two values are commonly calculated to establish statistical significance, as explained below:

Confidence interval 👍

A confidence interval is one way of presenting the uncertainty associated with a given measurement of a parameter of interest.

For instance, if you’re interested in the difference between the conversion rate of a test variation of a checkout page and your current checkout page, you would:

  • Perform an A/B test to measure the difference between the test and control groups.

If your experiment yields a 95% confidence interval of +5% to +10% for the change in conversion, you can be confident at the 95% level that the test variation increased conversion by somewhere between 5% and 10%. Kudos! 🙌 If it comes out the other way, say -7% to -5%, the variation hurt conversion and the test is a failure.

A confidence interval may also span 0% (for example, -2% to +4%). In such cases the result is inconclusive: you cannot say with confidence that the test variation either increased or decreased the conversion rate.
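The checkout-page example above can be sketched in a few lines of Python using the normal approximation for the difference of two proportions. This is a minimal illustration, not production experiment tooling; the function name and the traffic numbers are made up for the example.

```python
from statistics import NormalDist
import math

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Confidence interval for (test rate - control rate), using the
    normal approximation to the difference of two proportions."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf((1 + level) / 2)  # ~1.96 at 95%
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# Control: 500 of 10,000 users converted; test variation: 600 of 10,000.
low, high = diff_confidence_interval(500, 10_000, 600, 10_000)
print(f"{low:.4f} to {high:.4f}")  # interval for the lift in conversion
```

Here the whole interval sits above zero, so the test variation is a winner at the 95% level; had the interval crossed zero, the result would be inconclusive as described above.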

As a PM, you should also evaluate factors other than metrics, such as ease of use, industry standards, and long-term improvement, to decide whether or not to pursue the test variation.

P-values 👎

The p-value is a measure of evidence against the null hypothesis. In an A/B test, the null hypothesis is that there is no real difference between the two versions, as if we had run an A/A experiment instead. The p-value is the probability of observing a difference at least as large as the one we saw, assuming the null hypothesis is true. A small p-value suggests there really is some change, but it doesn’t help in concluding whether that change is good or bad, or how large it is.

The most commonly used threshold is a p-value of 0.05, which corresponds to a 95% confidence level. The two are directly related: if the 95% confidence interval for the difference excludes zero, the p-value will be below 0.05.
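To make the relationship concrete, here is a minimal sketch of the pooled two-proportion z-test, again using only Python's standard library and the same illustrative checkout numbers as before (the function name and figures are assumptions for the example, not from any specific tool).

```python
from statistics import NormalDist
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: 'both versions convert at the
    same rate' (pooled two-proportion z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Control: 500/10,000 converted; test variation: 600/10,000.
p = two_proportion_p_value(500, 10_000, 600, 10_000)
print(f"p = {p:.4f}")  # p is well below 0.05 here
```

Consistent with the confidence-interval view, the p-value comes out below 0.05 for exactly the data whose 95% interval excluded zero; it tells you the difference is unlikely to be chance, but says nothing about its size or direction on its own.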

PMs often prefer confidence intervals over p-values because an interval also conveys the size and direction of the effect, giving clearer insight into the success or failure of the experiment.

Thanks for reading! If you’ve got ideas to contribute to this conversation please comment. If you like what you read and want to see more, clap me some love! Follow here, or connect with me on LinkedIn


Rohit Verma

Senior Product Manager @AngelOne, ex-@Flipkart, @Cleartrip @IIM Bangalore. https://topmate.io/rohit_verma_pm