1. The Null Hypothesis
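A hypothesis test such as the T-Test starts from the 'Null Hypothesis': the assumption that your new feature did absolutely nothing, so any difference in sales is just random variation. The T-Test cannot mathematically disprove the Null Hypothesis. Instead, it measures how surprising your data would be if the Null Hypothesis were true, and when the data is surprising enough you reject it.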
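To see what "pure random variance" looks like, here is a minimal simulation sketch using only the standard library. Both groups are drawn from the same distribution, so the Null Hypothesis is true by construction. The function name `simulate_null`, the mean of 100, the standard deviation of 15, the group size of 30, and the 5-point gap are all illustrative assumptions, not values from a real experiment.

```python
import random


def simulate_null(n_per_group=30, gap=5.0, trials=5000, seed=0):
    """Fraction of no-effect experiments whose group means differ by >= gap."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Both groups come from the SAME distribution: the feature does nothing.
        a = [rng.gauss(100, 15) for _ in range(n_per_group)]
        b = [rng.gauss(100, 15) for _ in range(n_per_group)]
        diff = sum(b) / n_per_group - sum(a) / n_per_group
        if abs(diff) >= gap:
            hits += 1
    return hits / trials


chance_rate = simulate_null()
print(f"Share of no-effect experiments with a gap of 5 or more: {chance_rate:.1%}")
```

Under these assumed settings, a noticeable share of experiments show a 5-point gap even though nothing changed, which is why a raw difference in averages is not enough evidence on its own.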
2. The T-Test
The `ttest_ind` function (Independent T-Test) compares the means (averages) of two independent groups. However, it doesn't just look at the averages; it also weighs the variance within each group. If Group A is exactly 100, 100, 100 and Group B is 105, 105, 105, the 5-point gap is highly significant because there is no noise that could explain it. If Group A is 10, 190, 100 and Group B is 5, 200, 110, the averages are still 100 and 105, but the variance is so wild that the same 5-point gap is statistically meaningless.
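The role of variance can be sketched by computing the t-statistic by hand. This is a minimal standard-library version of the pooled (equal-variance) formula, not a replacement for `ttest_ind`; the helper name `pooled_t_statistic` is hypothetical. One assumption to flag: groups that are exactly constant have zero variance, which makes the denominator zero, so the tight groups below are jittered by one point in each direction.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t-statistic for two independent groups (pooled variance)."""
    n_a, n_b = len(a), len(b)
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    # Standard error of the difference between the two means.
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(b) - mean(a)) / standard_error


# Tight groups (illustrative, jittered around 100 and 105).
t_tight = pooled_t_statistic([99, 100, 101], [104, 105, 106])

# Noisy groups from the text: same means of 100 and 105, huge spread.
t_noisy = pooled_t_statistic([10, 190, 100], [5, 200, 110])

print(f"tight groups: t = {t_tight:.2f}")
print(f"noisy groups: t = {t_noisy:.2f}")
```

Both pairs have the same 5-point gap in the numerator; only the standard error in the denominator changes, and that is what separates a large t-statistic from one close to zero.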
3. The 0.05 Threshold
The T-Test outputs a p-value. A p-value of 0.03 means: 'If the new feature did nothing, there is only a 3% chance we would see a difference at least this large by pure coincidence'. Because 3% is less than the conventional threshold of 5% (0.05), we 'reject the null hypothesis' and conclude that the new feature had a statistically significant effect. Whether that effect is large enough to count as a success is a separate question.
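The decision rule can be sketched with SciPy's `stats.ttest_ind`, assuming SciPy is installed. The sample values and the names `ALPHA`, `control_*`, and `variant_*` are made up for illustration; they are not real sales data.

```python
from scipy import stats

ALPHA = 0.05  # conventional significance threshold

# Illustrative data: a 5-point gap with very little noise.
control_tight = [99, 100, 101]
variant_tight = [104, 105, 106]

# Illustrative data: the same 5-point gap buried in noise.
control_noisy = [10, 190, 100]
variant_noisy = [5, 200, 110]

tight = stats.ttest_ind(variant_tight, control_tight)
noisy = stats.ttest_ind(variant_noisy, control_noisy)

for label, result in (("tight", tight), ("noisy", noisy)):
    verdict = "reject" if result.pvalue < ALPHA else "fail to reject"
    print(f"{label}: p = {result.pvalue:.4f} -> {verdict} the null hypothesis")
```

Note that `ttest_ind` assumes equal variances by default; passing `equal_var=False` switches to Welch's version, which is a common choice when the two groups may have different spreads.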
Frequently Asked Questions
What if I am testing the same users before and after a change?
Then you cannot use the 'independent' T-Test. You must use `stats.ttest_rel()` (the related-samples, or paired, T-Test), which is designed for paired samples (the same people measured twice).
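A short sketch of why the paired test matters, assuming SciPy is installed. The five users and their before/after values are invented for illustration: every user improves by 2 or 3 points, but the users differ widely from one another.

```python
from scipy import stats

# Illustrative data: the same five users, measured before and after a change.
before = [20, 35, 50, 65, 80]
after = [22, 38, 52, 68, 83]

# Paired test: looks at each user's own change (after - before).
paired = stats.ttest_rel(after, before)

# Independent test on the same numbers, shown only for contrast:
# it ignores the pairing, so user-to-user spread drowns out the small gain.
unpaired = stats.ttest_ind(after, before)

print(f"paired   (ttest_rel): p = {paired.pvalue:.4f}")
print(f"unpaired (ttest_ind): p = {unpaired.pvalue:.4f}")
```

The paired test works on the per-user differences, so the large spread between users cancels out and the consistent small gain becomes visible.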
Is 0.05 a magical number?
No, it's an arbitrary convention established decades ago. In high-stakes settings such as medical trials involving human lives, the required threshold might be as strict as 0.001 (a 0.1% chance of a false positive when the treatment truly does nothing) before a drug is approved.
