1. The Split
When you call df.groupby('Country'), Pandas does not calculate anything yet. It returns a GroupBy object that records which rows belong to which Country, conceptually splitting the large DataFrame into one smaller DataFrame per distinct Country value.
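The lazy split can be inspected directly. The snippet below is a minimal sketch; the sample data and the 'Revenue' column are assumptions made up for illustration, not part of any real dataset.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "Country": ["France", "Japan", "France", "Brazil", "Japan"],
    "Revenue": [100, 250, 150, 300, 50],
})

# No aggregation happens here; we only get a GroupBy object back.
grouped = df.groupby("Country")

# The object knows how many groups exist and which rows belong to each.
n_groups = grouped.ngroups
france_rows = grouped.get_group("France")  # the hidden subset for France

# Iterating yields one (key, sub-DataFrame) pair per Country.
for country, subset in grouped:
    print(country, len(subset))
```

Nothing is summed or averaged at this stage; the subsets only become visible when you iterate or ask for a specific group.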
2. Apply and Combine
To get a result, you must apply an aggregation function. When you append .sum() to the groupby object, Pandas conceptually visits every hidden subset, calculates its sum, and combines the results into a new DataFrame with one row per Country showing that Country's total.
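As a rough sketch of apply-and-combine, the example below sums a hypothetical 'Revenue' column per Country and then repeats the same work by hand with an explicit loop. The data is invented for illustration, and the manual loop only mirrors the concept, not how Pandas implements it internally.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "Country": ["France", "Japan", "France", "Brazil", "Japan"],
    "Revenue": [100, 250, 150, 300, 50],
})

# Apply + combine in one step: one row per Country.
totals = df.groupby("Country").sum()
print(totals)

# The same idea spelled out: visit each subset, sum it, collect the results.
manual = {
    country: subset["Revenue"].sum()
    for country, subset in df.groupby("Country")
}
```

Both approaches produce the same totals; the one-line version is shorter and usually much faster.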
3. The New Index
A critical quirk of groupby() is that the column you grouped by (e.g., 'Country') becomes the Index of the resulting DataFrame. If you want it to behave like a normal column again, for example before merging on it or before writing a CSV without the index, append .reset_index() to the end of your chain. Alternatively, pass as_index=False to groupby() so the column is never moved into the Index.
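The index behaviour is easy to check. This is a small sketch on made-up data (the 'Revenue' column is an assumption) that shows the grouped column sitting in the Index and two ways to get it back as a regular column.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "Country": ["France", "Japan", "France", "Brazil", "Japan"],
    "Revenue": [100, 250, 150, 300, 50],
})

totals = df.groupby("Country").sum()
print(totals.index.name)      # 'Country' now lives in the Index
print(list(totals.columns))   # only 'Revenue' remains as a column

# Option 1: move the Index back into a regular column.
flat = totals.reset_index()

# Option 2: keep 'Country' as a column from the start.
flat_alt = df.groupby("Country", as_index=False)["Revenue"].sum()
```

Either result has 'Country' and 'Revenue' as ordinary columns with a default integer index.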
Frequently Asked Questions
Can I apply a custom mathematical function to a group?
Yes. Instead of a built-in aggregation such as `.mean()`, you can call the `.apply()` method and pass in a custom Python function or lambda expression. Pandas calls your function once per group, passing in that group's data.
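A minimal sketch of a custom per-group calculation follows. The helper name `revenue_range` and the sample data are hypothetical, chosen only to illustrate the pattern.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "Country": ["France", "Japan", "France", "Brazil", "Japan"],
    "Revenue": [100, 250, 150, 300, 50],
})

def revenue_range(values):
    """Hypothetical helper: spread between the largest and smallest value."""
    return values.max() - values.min()

# Pass a named function: it receives one Series of Revenue values per Country.
spread = df.groupby("Country")["Revenue"].apply(revenue_range)

# The same calculation written as a lambda expression.
spread_lambda = df.groupby("Country")["Revenue"].apply(lambda s: s.max() - s.min())

print(spread)
```

For simple reductions like this one, `.agg()` accepts the same function and is often a better fit; `.apply()` is the more general tool when the function returns something other than a single value per group.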
Why does `groupby().mean()` sometimes throw an error?
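If your DataFrame has text columns, Pandas cannot calculate the mean of text. In pandas 2.0 and later, `groupby().mean()` raises a TypeError in that case, whereas older versions typically dropped the text columns instead. You should explicitly select the numeric columns you want to average: `df.groupby('Country')['Revenue'].mean()`. You can also pass `numeric_only=True` to `.mean()` to skip non-numeric columns.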
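The two workarounds can be sketched as follows. The 'Product' text column and the sample values are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample data with one text column ('Product') besides the group key.
df = pd.DataFrame({
    "Country": ["France", "Japan", "France", "Brazil", "Japan"],
    "Product": ["A", "B", "C", "D", "E"],
    "Revenue": [100, 250, 150, 300, 50],
})

# Option 1: select the numeric column explicitly before averaging.
avg = df.groupby("Country")["Revenue"].mean()

# Option 2: ask Pandas to ignore non-numeric columns.
avg_all = df.groupby("Country").mean(numeric_only=True)

print(avg)
print(avg_all)
```

Option 1 returns a Series, while option 2 returns a DataFrame containing every numeric column.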
