A/B Test Calculator

Check if your A/B test results are statistically significant. Calculates p-value, Z-score, uplift, and statistical power. Free.

Significance Thresholds Reference

Confidence    Required p-value    Use Case
90%           p < 0.10            Quick iteration
95%           p < 0.05            Industry standard
99%           p < 0.01            High-stakes tests

Power ≥ 80% recommended. Low power = higher risk of missing real improvements.
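
The thresholds in the table follow from alpha = 1 − confidence. As a minimal sketch (the function name `is_significant` is hypothetical and not part of the calculator), the check could look like this:

```python
def is_significant(p_value: float, confidence: float) -> bool:
    """Return True when p_value falls below alpha = 1 - confidence.

    `confidence` is a fraction such as 0.90, 0.95, or 0.99.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be a fraction between 0 and 1, e.g. 0.95")
    # Round to avoid floating-point noise such as 1 - 0.95 = 0.050000000000000044.
    alpha = round(1.0 - confidence, 12)
    return p_value < alpha


if __name__ == "__main__":
    # The same p-value can pass at 95% confidence and fail at 99%.
    print(is_significant(0.03, 0.95))
    print(is_significant(0.03, 0.99))
```

The comparison is strict, matching the "p < 0.05" wording in the table.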

How to Use

  1. Select your desired confidence level: 90% for quick tests, 95% (standard), or 99% for high-stakes decisions.
  2. Enter the number of visitors and conversions for your Control group (the original version).
  3. Enter the number of visitors and conversions for your Variant group (the new version you're testing).
  4. The calculator shows whether results are statistically significant, along with p-value, Z-score, uplift, and statistical power.
  5. If results are not significant, you may need more data; use the A/B Test Sample Size Calculator to plan your test.
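
The page does not say which test the calculator runs. A common choice for this kind of input is a two-sided two-proportion Z-test with a pooled standard error, so the sketch below assumes that method. The function name `ab_test` and its parameters are illustrative only.

```python
import math


def ab_test(control_visitors: int, control_conversions: int,
            variant_visitors: int, variant_conversions: int) -> tuple[float, float]:
    """Return (z_score, p_value) for a two-sided two-proportion Z-test.

    Assumption: pooled standard error under the null hypothesis of
    equal conversion rates. The real calculator may differ.
    """
    control_rate = control_conversions / control_visitors
    variant_rate = variant_conversions / variant_visitors

    # Pooled conversion rate across both groups.
    pooled = (control_conversions + variant_conversions) / (
        control_visitors + variant_visitors
    )
    std_error = math.sqrt(
        pooled * (1 - pooled) * (1 / control_visitors + 1 / variant_visitors)
    )
    if std_error == 0:
        raise ValueError("cannot test: no variation in conversions")

    z_score = (variant_rate - control_rate) / std_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_score) / math.sqrt(2))
    return z_score, p_value


if __name__ == "__main__":
    z, p = ab_test(10_000, 300, 10_000, 360)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions, 300 conversions from 10,000 control visitors against 360 from 10,000 variant visitors gives a Z-score of about 2.4 and a p-value of about 0.018: significant at 95% confidence, but not at 99%.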

Use Cases

Validate Landing Page Tests

After running an A/B test on a landing page headline or CTA button, enter visitor and conversion counts for both versions to determine if the difference is statistically significant and not due to chance.

Email Subject Line Testing

Compare open rates or click rates between two email subject lines. Enter total sends and opens/clicks for each variant to check if the winner is statistically meaningful.

Pricing Page Experiments

For high-stakes tests like pricing changes, use the 99% confidence level to ensure you have strong statistical evidence before rolling out a change that could impact revenue.

Post-Test Analysis

After stopping a test, analyze whether your observed uplift is real. The calculator shows both the p-value and statistical power, helping you avoid false positives and false negatives.

FAQ

Statistical significance means the difference between your control and variant is unlikely to be due to random chance alone. At 95% confidence, you accept a 5% chance of a false positive when there is in fact no real difference. A p-value below 0.05 at 95% confidence = statistically significant.

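The p-value is the probability that you'd see a difference this large (or larger) purely by chance, assuming there's no real difference. A p-value of 0.03 means that, if the two versions truly performed the same, a difference at least this large would show up about 3% of the time. It is not the probability that the result itself is random. Lower p-values = stronger evidence. At 95% confidence, you need p < 0.05.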

Statistical power (1-β) is the probability that your test can detect a real effect if one exists. 80% power is the standard target — meaning a 20% chance of missing a real improvement (Type II error). Low power means you might stop a test too early and miss a genuine winner.
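
The page does not say how the calculator derives power. One common approximation treats the observed conversion rates as the true rates and asks how often a two-sided Z-test at the chosen confidence level would detect that gap. The sketch below assumes this "observed power" approach; the name `observed_power` is hypothetical.

```python
import math
from statistics import NormalDist


def observed_power(control_visitors: int, control_conversions: int,
                   variant_visitors: int, variant_conversions: int,
                   confidence: float = 0.95) -> float:
    """Approximate power (1 - beta) of a two-sided two-proportion Z-test.

    Assumption: the observed rates are taken as the true rates, and the
    unpooled standard error is used under the alternative hypothesis.
    """
    control_rate = control_conversions / control_visitors
    variant_rate = variant_conversions / variant_visitors
    std_error = math.sqrt(
        control_rate * (1 - control_rate) / control_visitors
        + variant_rate * (1 - variant_rate) / variant_visitors
    )
    if std_error == 0:
        raise ValueError("cannot estimate power: no variation in conversions")

    normal = NormalDist()  # standard normal distribution
    # Critical value, e.g. about 1.96 at 95% confidence.
    z_critical = normal.inv_cdf(1 - (1 - confidence) / 2)
    shift = abs(variant_rate - control_rate) / std_error
    # Probability that the test statistic lands in either rejection region.
    return normal.cdf(shift - z_critical) + normal.cdf(-shift - z_critical)


if __name__ == "__main__":
    print(f"{observed_power(10_000, 300, 10_000, 360):.2f}")
    print(f"{observed_power(40_000, 1_200, 40_000, 1_440):.2f}")
```

With the same 3% and 3.6% rates, quadrupling the sample size moves the estimate from below the 80% target to well above it, which is why underpowered tests usually need more data rather than a different threshold.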

Uplift = (Variant Rate − Control Rate) / Control Rate × 100. It's the relative improvement of the variant over control. A control converting at 3% and variant at 3.6% = 20% uplift. Note: uplift and statistical significance are separate — even a large uplift can be insignificant with small samples.
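
The uplift formula above can be sketched directly from visitor and conversion counts. The function name `uplift_percent` is illustrative only.

```python
def uplift_percent(control_visitors: int, control_conversions: int,
                   variant_visitors: int, variant_conversions: int) -> float:
    """Relative improvement of the variant over control, in percent."""
    control_rate = control_conversions / control_visitors
    variant_rate = variant_conversions / variant_visitors
    if control_rate == 0:
        raise ValueError("uplift is undefined when the control rate is zero")
    return (variant_rate - control_rate) / control_rate * 100


if __name__ == "__main__":
    # Control at 3% and variant at 3.6%, as in the example above.
    print(round(uplift_percent(10_000, 300, 10_000, 360), 2))
```

A negative result means the variant converted worse than control.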

Your data is not stored or shared. All calculations run entirely in your browser. No data is sent to any server.
