1. The Algorithm
The genius of K-Means is its simplicity. Step 1: Drop K random 'Centroids' into the data. Step 2: Assign every data point to its closest Centroid. Step 3: Move each Centroid to the exact middle (mean) of all the points assigned to it. Repeat Steps 2 and 3 until the Centroids stop moving. The clusters are now stable.
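The three steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation. It assumes the starting Centroids are K randomly chosen data points (one common way to "drop" them), and the function name `kmeans`, its parameters, and the sample points are all made up for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: pick K distinct data points as the starting centroids
    # (an assumption of this sketch; other initializations exist).
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Step 2: assign every point to its closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 3: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once no centroid moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Two obvious groups: one near (1, 1.5) and one near (8.5, 8.5).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster gets label 0 depends on the random starting Centroids.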
2. The Elbow Method
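Because you must choose K up front, data scientists often use the Elbow Method. You loop K from 1 to 10 and, for each value, fit K-Means and record the 'Inertia' (the sum of squared distances from each point to its assigned Centroid; lower means tighter clusters). With K=1, Inertia is massive. With K=10, Inertia is tiny (but the clusters are usually too fragmented to mean anything). Plotted against K, Inertia forms a descending curve. The 'Elbow' (the bend where the drop slows sharply) is a reasonable estimate of the natural number of clusters. On real data the bend can be ambiguous, so treat it as a heuristic rather than a proof.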
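As a rough sketch of that loop, the snippet below uses scikit-learn's `KMeans` and reads its `inertia_` attribute for each K. The dataset is synthetic and purely an assumption for illustration: three well-separated blobs, so the bend should appear at K=3.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data (an assumption for illustration): three well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

inertias = []
for k in range(1, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # inertia_ is the sum of squared distances from each point to its centroid.
    inertias.append(model.inertia_)

for k, inertia in zip(range(1, 11), inertias):
    print(f"K={k:2d}  inertia={inertia:10.2f}")
```

Inertia should fall steeply up to K=3 and only slightly after that. Plotting `inertias` against K with any charting library gives the familiar elbow curve.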
3. The Scaling Trap
K-Means uses Euclidean Distance (basic geometry). If you do not run StandardScaler on your data, features with large numeric ranges (like Salary or Distance) will drown out features with small ranges (like Age or Ratings). The algorithm effectively treats Salary as far more important than Age just because the numbers are bigger. Whenever features are measured in different units, scale them before running K-Means.
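A small sketch makes the effect concrete. The four rows are hypothetical (age in years, salary in dollars), and `dist` is a helper defined only for this example. It compares Euclidean distances before and after scikit-learn's `StandardScaler`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical rows: [age in years, salary in dollars].
X = np.array([
    [25.0, 50_000.0],
    [60.0, 50_500.0],   # very different age, almost the same salary
    [26.0, 58_000.0],   # almost the same age, different salary
    [45.0, 90_000.0],
])


def dist(a, b):
    """Euclidean distance between two rows."""
    return float(np.linalg.norm(a - b))


# Raw distances are driven almost entirely by the salary column.
raw_age_gap = dist(X[0], X[1])      # about 501
raw_salary_gap = dist(X[0], X[2])   # about 8000

# StandardScaler rescales each column to mean 0 and standard deviation 1.
X_scaled = StandardScaler().fit_transform(X)
scaled_age_gap = dist(X_scaled[0], X_scaled[1])
scaled_salary_gap = dist(X_scaled[0], X_scaled[2])

print(raw_age_gap, raw_salary_gap)
print(scaled_age_gap, scaled_salary_gap)
```

On the raw data, the person 35 years older looks much closer to row 0 than the person earning 8,000 more. After scaling, the ordering flips because both columns now contribute on a comparable footing.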
Frequently Asked Questions
What happens if I set K equal to the number of rows?
Every single data point will become its own cluster, giving an Inertia of 0. That is the lowest Inertia possible, but it is completely useless for analysis because nothing has been grouped. This is why you look for the Elbow instead of simply minimizing Inertia.
Are there Unsupervised algorithms that don't require guessing K?
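Yes, for example DBSCAN and Hierarchical Clustering. DBSCAN groups points by density and Hierarchical Clustering builds a tree of merges, so neither needs K up front. They are not parameter-free, though: DBSCAN needs a neighborhood radius and a minimum point count, and with Hierarchical Clustering you still decide where to cut the tree.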
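For a quick illustration, the sketch below runs scikit-learn's `DBSCAN` on synthetic data with three blobs and counts the clusters it finds. The data and the `eps` and `min_samples` values are assumptions chosen for this toy example, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic data (an assumption for illustration): three well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# eps and min_samples are illustrative values picked for this synthetic data.
labels = DBSCAN(eps=1.5, min_samples=5).fit_predict(X)

# Label -1 marks noise points, so it is excluded from the cluster count.
n_clusters = len(set(labels.tolist()) - {-1})
print(n_clusters)
```

No K is passed in. The cluster count comes out of the density settings, so a different `eps` can yield a different number of clusters on the same data.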
