A/B testing is one of the most reliable ways to make decisions in marketing and product development. Instead of arguments like "I think the red button is better," you get concrete data. But to get a valid result, you need to follow certain rules. Let's walk through the entire process step by step.
What Is A/B Testing
An A/B test (split test) is an experiment in which the audience is randomly divided into two (or more) groups. Each group sees a different version of a page, email, ad, or UI element. You then compare the results to determine which version performs better on a chosen metric.
Group A sees the control version (the current variant), and Group B sees the test version (with the change). The difference in conversion between the groups reveals the effect of the change.
Step 1: Formulate a Hypothesis
Every test starts with a hypothesis. A good hypothesis contains three elements:
- What you want to change (e.g., the CTA button text).
- Why you believe it will affect the metric (e.g., "the current text is unclear; users don't understand what will happen when they click").
- What result you expect (e.g., "conversion will increase by 10–15%").
Example: "If we replace the 'Submit' button with 'Get a Free Quote,' form conversion will increase by 15% because the user will have a clear understanding of the benefit."
Step 2: Define the Metric
Choose one primary metric by which you'll judge the test's success. This could be:
- Click-through or conversion rate (CTR, form completion rate, purchase rate).
- Average order value or revenue per user.
- Bounce rate or pages per session.
Important: don't try to optimize multiple metrics in a single test. This complicates interpretation and increases the risk of false conclusions.
Step 3: Calculate the Required Sample Size
One of the biggest mistakes is stopping a test too early. To achieve statistical significance, you need a sufficient amount of data. The sample size depends on:
- Current conversion rate — the lower it is, the more data you need.
- Minimum Detectable Effect (MDE) — the minimum improvement you want to detect. Detecting a 1% effect requires far more data than detecting a 20% effect.
- Significance level and power — typically, a 5% significance level (alpha = 0.05, i.e., 95% confidence) and 80% power (beta = 0.2) are used.
Use dedicated calculators to determine the required sample size. Don't rely on intuition — the math is critical here.
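Those calculators typically implement a standard two-proportion sample-size formula. Here's a sketch of that math in Python (the function name and the example rates are illustrative, not from any specific tool):

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_group(p1, mde_rel, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-proportion test.

    p1:      baseline conversion rate (e.g., 0.05 for 5%)
    mde_rel: relative minimum detectable effect (e.g., 0.15 for +15%)
    """
    p2 = p1 * (1 + mde_rel)                        # expected variant conversion
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# The smaller the effect you want to detect, the more traffic you need:
print(sample_size_per_group(0.05, 0.15))  # +15% relative lift on a 5% baseline
print(sample_size_per_group(0.05, 0.50))  # +50% relative lift needs far less data
```

Note how sensitive the result is to the MDE: shrinking the detectable effect by a few times inflates the required sample roughly quadratically.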
Step 4: Launch the Test
When launching, follow these rules:
- Randomization. Assignment to groups must be random. You can't show version A in the morning and version B in the evening — that introduces systematic bias.
- Simultaneity. Both versions must be shown during the same time period. Comparing Monday to Sunday is invalid.
- Isolate changes. Test one change at a time. If you simultaneously change the headline, button color, and copy, you won't know which change drove the result.
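One common way to satisfy both the randomization and simultaneity rules is deterministic hash-based bucketing: every user is assigned at first contact, the same user always lands in the same group, and both groups fill up in parallel. A minimal sketch (the experiment name and user IDs are placeholders):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")):
    """Deterministic, approximately uniform assignment.

    Hashing user_id together with the experiment name keeps the split
    stable per user while still being independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always gets the same variant, no matter when they visit:
print(assign_variant("user-42", "cta-text-test"))
print(assign_variant("user-42", "cta-text-test"))
```

Salting the hash with the experiment name matters: without it, the same users would always fall into the same bucket across all your tests, which is exactly the kind of interaction effect you want to avoid.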
Step 5: Wait for Results
Don't peek at the results every hour and don't stop the test as soon as you see one version "winning." Early peeking is a statistical trap: with a small sample, random fluctuations can easily be mistaken for a real effect. Set the test end date in advance and stick to it.
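The peeking trap is easy to demonstrate with an A/A simulation: both groups get the same true conversion rate, so any "significant" result is by definition a false positive. The sketch below (trial counts and rates are arbitrary) compares checking 20 times during the test against checking once at the end:

```python
import random
from statistics import NormalDist

def z_test_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference of two proportions."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0
    z = abs(conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(z))

def run_experiment(n, p, peeks, rng):
    """A/A test: both groups convert at the same true rate p.

    Returns True if we would have stopped and declared a 'winner'
    at any of the `peeks` evenly spaced checkpoints.
    """
    a = b = 0
    checkpoints = {n * k // peeks for k in range(1, peeks + 1)}
    for i in range(1, n + 1):
        a += rng.random() < p
        b += rng.random() < p
        if i in checkpoints and z_test_p(a, i, b, i) < 0.05:
            return True
    return False

rng = random.Random(0)
trials = 1000
peeky = sum(run_experiment(1000, 0.05, peeks=20, rng=rng) for _ in range(trials))
fixed = sum(run_experiment(1000, 0.05, peeks=1, rng=rng) for _ in range(trials))
print(f"false positive rate, 20 peeks: {peeky / trials:.1%}")
print(f"false positive rate, 1 look:   {fixed / trials:.1%}")
```

With a single look the false positive rate stays near the nominal 5%; with repeated peeking it climbs several times higher, even though nothing actually changed between the groups.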
Step 6: Interpret the Results
After the test ends, evaluate:
- Statistical significance. If the p-value is below 0.05, a difference at least this large would be unlikely to arise by chance alone if the variants truly performed the same.
- Practical significance. Even a statistically significant 0.1% increase in conversion may be economically meaningless.
- Confidence interval. It shows the range of plausible values for the true effect. If the interval includes zero, you cannot rule out that the change had no effect at all.
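All three checks can be computed from four numbers: conversions and visitors in each group. A minimal sketch for a conversion test, using a two-proportion z-test (the function name and the counts in the example are made up for illustration):

```python
from math import sqrt
from statistics import NormalDist

def ab_summary(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """p-value (two-sided z-test) and confidence interval
    for the difference in conversion rates (B minus A)."""
    pa, pb = conv_a / n_a, conv_b / n_b
    # Pooled standard error for the hypothesis test
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (pb - pa) / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval
    se = sqrt(pa * (1 - pa) / n_a + pb * (1 - pb) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (pb - pa - z_crit * se, pb - pa + z_crit * se)
    return p_value, ci

# 5.00% vs 5.85% conversion on 10,000 visitors per group:
p, (lo, hi) = ab_summary(conv_a=500, n_a=10_000, conv_b=585, n_b=10_000)
print(f"p-value: {p:.4f}")
print(f"95% CI for the lift: [{lo:.4%}, {hi:.4%}]")
```

Even when the interval excludes zero, check its lower bound against your economics: if the smallest plausible lift doesn't cover the cost of the change, the result is significant but not practically meaningful.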
Common Mistakes
- Stopping the test at the first "positive" results.
- Testing very small changes with an insufficient sample size.
- Running multiple tests simultaneously on the same audience without accounting for interaction effects.
- Ignoring segments — results may differ between mobile and desktop users.
Conclusion
A/B testing is a discipline, not a magic button. A properly designed experiment provides objective data for decision-making. Follow the steps described above, respect the statistics — and your product decisions will become significantly more accurate.
Calculate the statistical significance of your A/B test using our A/B test calculator.