1. The Problem of Noise
If you plot the daily active users of an app, the line can look like a seismograph trace: traffic dips on weekends and spikes on Mondays. That weekly swing makes it hard to tell whether the app is actually growing.
2. The Rolling Average
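A 7-day rolling average solves this. For every day, it calculates the average of that day plus the previous 6 days. With one row per day, every window contains exactly one of each weekday, so the weekly pattern largely cancels out and the underlying trend becomes visible. In Pandas, this is `df['Users'].rolling(window=7).mean()`. By default the first 6 rows are NaN because they do not yet have a full window; pass `min_periods=1` if you want partial averages there instead.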
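As a minimal sketch of the idea, the snippet below builds three weeks of made-up daily traffic (the numbers and the `Users_7d` column name are assumptions for illustration, not real data) and applies the 7-day rolling mean to it.

```python
import pandas as pd

# Hypothetical data: 100 users on weekdays, 40 on weekends, for three weeks.
dates = pd.date_range("2024-01-01", periods=21, freq="D")  # starts on a Monday
users = [40 if d.dayofweek >= 5 else 100 for d in dates]
df = pd.DataFrame({"Users": users}, index=dates)

# Each 7-day window holds exactly one of every weekday,
# so the weekend dips no longer show up in the smoothed column.
df["Users_7d"] = df["Users"].rolling(window=7).mean()

print(df.head(10))
```

The first 6 values of `Users_7d` are NaN, and from the seventh day on the smoothed line is flat, because this toy traffic has no real growth underneath the weekly cycle.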
3. Beyond the Mean
The moving average is the best-known case, but you can apply other aggregations to the rolling window. On daily data, `rolling(window=30).sum()` gives 'Trailing 30-Day Revenue'. `rolling().std()` (standard deviation) is used heavily in finance to measure how volatile a stock is over time; Bollinger Bands, for example, are drawn a set number of rolling standard deviations above and below a rolling mean.
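The following sketch shows both aggregations on a short series of made-up closing prices. The prices, the 3-row window, and the multiplier `k = 2` are assumptions chosen to keep the numbers easy to check by hand; charting tools commonly use a longer window such as 20 periods.

```python
import pandas as pd

# Hypothetical closing prices; values are made up for illustration.
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 108.0],
    name="Close",
)

window = 3  # short window so the arithmetic is easy to follow
rolling = prices.rolling(window=window)

trailing_sum = rolling.sum()  # same idea as trailing 30-day revenue with window=30
mid_band = rolling.mean()
volatility = rolling.std()    # pandas uses the sample standard deviation (ddof=1) by default

# Bollinger-style bands: rolling mean plus/minus k rolling standard deviations.
k = 2
bands = pd.DataFrame({
    "mid": mid_band,
    "upper": mid_band + k * volatility,
    "lower": mid_band - k * volatility,
})

print(bands)
```

Note that some charting packages compute the bands with the population standard deviation instead; in pandas that would be `rolling.std(ddof=0)`.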
Frequently Asked Questions
Can I do a rolling window based on time instead of number of rows?
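Yes. If your DataFrame index is a DatetimeIndex, you can pass an offset string to the window parameter, like `rolling(window='30D')` for 30 days, or `window='6h'` for 6 hours. Each window then covers that span of time, however many rows fall inside it, which helps when rows are irregularly spaced. The index must be sorted for this to work.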
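Here is a small sketch of a time-based window over irregularly spaced rows. The timestamps and values are invented for illustration; the point is that the number of rows per window varies while the time span stays fixed.

```python
import pandas as pd

# Hypothetical, irregularly spaced events.
idx = pd.to_datetime([
    "2024-01-01 00:00",
    "2024-01-01 02:00",
    "2024-01-01 05:00",
    "2024-01-01 09:00",
    "2024-01-01 10:00",
])
events = pd.Series([1, 2, 3, 4, 5], index=idx)

# Each window covers the 6 hours up to and including the current timestamp.
sum_6h = events.rolling(window="6h").sum()

print(sum_6h)
```

With an offset window, pandas defaults `min_periods` to 1, so the first rows produce values rather than NaN.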
What is an Expanding Window?
Unlike a rolling window (which has a fixed size and slides forward), an `.expanding()` window starts at the beginning of the data and grows by one row at each step, taking all past data into account. For example, `expanding().mean()` gives the running average of everything seen so far.
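To make the contrast concrete, this short sketch runs both window types over the same made-up series (the values and variable names are assumptions for illustration).

```python
import pandas as pd

values = pd.Series([10, 20, 30, 40])

# Expanding: each result uses every row from the start up to the current one.
running_mean = values.expanding().mean()  # 10.0, 15.0, 20.0, 25.0
running_max = values.expanding().max()    # 10.0, 20.0, 30.0, 40.0

# Rolling: each result uses only the last 2 rows.
rolling_mean = values.rolling(window=2).mean()  # NaN, 15.0, 25.0, 35.0

print(pd.DataFrame({
    "expanding_mean": running_mean,
    "expanding_max": running_max,
    "rolling_mean_2": rolling_mean,
}))
```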
