Why is stopping an A/B test at first significance problematic?

Tests peeking and Type I error inflation. Name peeking; explain that daily looks inflate the false positive rate above the nominal alpha; note that p-values assume a single look at a fixed sample size; recommend a pre-committed runtime or sequential testing.
Tests understanding of the peeking problem and Type I error inflation in frequentist A/B testing. A strong answer names peeking and explains that every interim look is another chance for noise to cross the significance threshold, so daily looks inflate the overall false positive rate from a nominal 5% to 20% or higher over a few weeks. It also notes that standard p-values assume a single analysis at a fixed sample size, and recommends committing to a fixed runtime or using sequential testing with alpha spending. Red flags include conflating this with p-hacking or citing sample size without mentioning repeated testing.
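The inflation described above can be checked with a small A/A simulation. This is a minimal sketch under stated assumptions, not taken from the original source: there is no true difference between arms, the metric has known unit variance, and a two-sided z-test at the 1.96 threshold is run once per day for 14 days. The function name and all parameters are hypothetical.

```python
import math
import random


def simulate_false_positive_rates(n_sims=20000, n_days=14, users_per_day=500,
                                  z_crit=1.96, seed=0):
    """Compare 'stop at first significance' with a single fixed-horizon look.

    Both arms are drawn from the same distribution, so every significant
    result counted here is a false positive.
    """
    rng = random.Random(seed)
    # Under the null, each arm's daily sum is N(0, users_per_day), so the
    # daily sum of (B - A) is N(0, 2 * users_per_day).
    daily_sd = math.sqrt(2 * users_per_day)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(n_sims):
        cum_diff = 0.0
        ever_significant = False
        z = 0.0
        for day in range(1, n_days + 1):
            cum_diff += rng.gauss(0.0, daily_sd)
            # z-statistic on all data collected up to and including this day
            z = cum_diff / math.sqrt(2 * users_per_day * day)
            if abs(z) > z_crit:
                ever_significant = True  # a peeker would have stopped here
        if ever_significant:
            peeking_hits += 1
        if abs(z) > z_crit:  # only the final, pre-committed look counts
            fixed_hits += 1
    return peeking_hits / n_sims, fixed_hits / n_sims


if __name__ == "__main__":
    peeking, fixed = simulate_false_positive_rates()
    print(f"stop at first significance: {peeking:.3f}")
    print(f"single look on day 14:      {fixed:.3f}")
```

Under these assumptions the single-look rate should land near the nominal 5%, while the peeking rate should come out several times higher. Exact values depend on the number of looks and the random seed.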
Read the original → docs.growthbook.io
- #ab testing
- #peeking
- #statistics
- #type i error
- #experimentation