What if you have data but no answers? Unsupervised learning looks for structure in data that has no labels. K-Means is one of the most widely used algorithms for the job: it groups unlabeled points into clusters of similar items.
1. The Centroid Logic
K-Means works by placing K points, called centroids, in your feature space. It then assigns every data point to its nearest centroid. After assignment, it calculates the average of the points in each group and moves that group's centroid to the new 'mean' position. These two steps, assign and update, repeat until the assignments stop changing.
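The assign-and-update loop can be sketched in plain Python. This is a minimal illustration, not a production implementation: it assumes points are tuples of numbers and that the initial centroids are drawn at random from the data, and the names `kmeans`, `squared_distance`, and `mean_point` are made up for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iters=100, seed=0):
    """Return (centroids, assignments) after running the K-Means loop."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the centroids are stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # leave a centroid in place if its cluster is empty
                centroids[j] = mean_point(members)
    return centroids, assignments


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(centroids)
print(assignments)
```

On this toy data the two groups are far apart, so the first three points end up in one cluster and the last three in the other. Which cluster gets index 0 depends on the random starting centroids.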
2. Finding the Elbow
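A common question is: 'How do I know what K should be?' The Elbow Method is a widely used heuristic for answering it. Plot the inertia (the sum of squared distances from each point to its nearest centroid) against the number of clusters. Inertia drops sharply at first and then levels off. The 'elbow' of this curve, where adding another cluster stops paying off, is a reasonable choice for K. On real data the elbow is not always sharp, so treat it as a guide, not a guarantee.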
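To make the inertia curve concrete, the sketch below computes inertia for several values of K on a small made-up dataset with three obvious groups. It is an illustration under stated assumptions: the helper names (`sq_dist`, `farthest_first`, `kmeans_inertia`) are hypothetical, and the farthest-first initialization is used only to keep the example deterministic. It is not part of the standard algorithm.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic start (an assumption of this sketch): begin with the
    first point, then repeatedly add the point farthest from all chosen
    centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, max_iters=100):
    """Run K-Means and return the inertia: the sum of squared distances
    from each point to its assigned centroid."""
    centroids = farthest_first(points, k)
    assignments = None
    for _ in range(max_iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
               for p in points]
        if new == assignments:
            break
        assignments = new
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignments))


# Three well-separated groups of three points each (made-up data).
points = [(0, 0), (1, 0), (0, 1),
          (10, 0), (11, 0), (10, 1),
          (5, 9), (6, 9), (5, 10)]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(f"K={k}: inertia={value:.2f}")
```

On this data, inertia falls steeply up to K=3 and barely changes afterwards, which is the elbow. In practice you would plot these values and look for the bend by eye.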
3. The Limitations
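K-Means is fast, but it has weaknesses. It implicitly assumes clusters are roughly spherical and similar in size, so elongated or unevenly sized groups can be split in the wrong place. It also struggles with noise and outliers: because each centroid is a mean, a few extreme points can pull it away from the true center of the data.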
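The outlier effect is easy to see with a tiny example. The sketch below uses made-up numbers and a hypothetical helper `centroid_of` to compare the centroid of a tight cluster with and without one extreme point.

```python
def centroid_of(points):
    """Coordinate-wise mean, which is where K-Means places a centroid."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# A tight cluster centered on (1.5, 1.5).
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
# The same cluster plus a single far-away point (illustrative outlier).
with_outlier = cluster + [(20.0, 20.0)]

clean_center = centroid_of(cluster)
pulled_center = centroid_of(with_outlier)

print("without outlier:", clean_center)
print("with outlier:   ", pulled_center)
```

A single outlier moves the centroid from the middle of the cluster to a spot outside it, which no longer represents any of the four original points well.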
