DEV Community

TildAlice

Posted on • Originally published at tildalice.io

A/B Test False Positives: p=0.03 with 50 Users Explained

The 200-Conversion Mirage

You ship a new checkout button. After 200 conversions, the p-value hits 0.03. Your manager celebrates. You push to production.

Two weeks later, the "winning" variant is underperforming the control by 8%. What happened?

The culprit isn't bad luck — it's a fundamental mismatch between frequentist hypothesis testing and small-sample reality. Most A/B test calculators assume you're flipping a coin thousands of times. But early-stage products, B2B funnels, and niche features rarely see that volume. And that's where the math breaks down in ways most data analysts never learned in school.
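One concrete way this breaks is optional stopping: checking the p-value as data comes in and shipping the moment it dips under 0.05. A minimal stdlib-only simulation can show the effect. This is a sketch, not a production test harness; the conversion rate, batch size, and peeking cadence below are made-up illustration values, and the two-proportion z-test is implemented by hand so nothing outside the standard library is needed.

```python
import math
import random

def two_prop_p(x1, n1, x2, n2):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no conversions yet in either arm
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

def peeking_trial(rate=0.05, batch=100, max_n=2000, alpha=0.05):
    """One A/A test: both arms share the same true rate, so any
    'significant' result is a false positive. We peek after every
    batch and stop at the first p < alpha, like the checkout story."""
    xa = xb = na = nb = 0
    while na < max_n:
        xa += sum(random.random() < rate for _ in range(batch))
        xb += sum(random.random() < rate for _ in range(batch))
        na += batch
        nb += batch
        if two_prop_p(xa, na, xb, nb) < alpha:
            return True  # would have shipped a phantom winner
    return False

random.seed(0)
trials = 1000
fp = sum(peeking_trial() for _ in range(trials)) / trials
print(f"false-positive rate with peeking: {fp:.0%}")
```

Run enough trials and the false-positive rate climbs well above the nominal 5%, because each peek is another chance for noise to cross the threshold. A single pre-registered look at a fixed sample size would keep it at roughly 5%.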

Photo by Tima Miroshnichenko on Pexels

Why p < 0.05 Doesn't Mean What You Think

The classical frequentist test answers: "If there were truly no difference, how often would I see a result this extreme?"

But that's not the question you actually care about. What you want to know is the probability that your variant is genuinely better, given the data you observed.
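A back-of-the-envelope Bayes calculation shows how far apart the two questions can be. The base rate, power, and alpha below are hypothetical numbers chosen for illustration, not measurements:

```python
# Hypothetical assumptions: 10% of tested variants have a real effect,
# the test has 50% power to detect that effect, and alpha = 0.05.
prior_real = 0.10
power = 0.50
alpha = 0.05

# P(significant) = P(real) * P(sig | real) + P(null) * P(sig | null)
p_sig = prior_real * power + (1 - prior_real) * alpha

# Bayes' rule: what fraction of significant results are real effects?
p_real_given_sig = prior_real * power / p_sig
print(f"P(real effect | p < 0.05) = {p_real_given_sig:.0%}")  # → roughly 53%
```

Under these assumptions, nearly half of your "significant" winners are noise, even though every one of them cleared p < 0.05. The p-value never claimed otherwise; it just answers a different question.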


Continue reading the full article on TildAlice
