Finding the best line to divide two groups sounds simple, but in high-dimensional space, it's an art. SVM is the algorithm that masters this art by maximizing the margin.
1The Optimal Hyperplane
Support Vector Machines (SVM) work by finding a Hyperplane (a decision boundary) that separates classes with the maximum possible Margin. A larger margin means the model is more likely to generalize well to new, unseen data.
2Support Vectors
The algorithm is named after Support Vectorsβthe data points that lie closest to the decision boundary. These points are the most difficult to classify and directly define the position and orientation of the hyperplane. Removing other data points wouldn't change the boundary at all.
3The Kernel Trick
When data cannot be separated by a straight line, we use a Kernel. This mathematical trick project data into a higher-dimensional space where a flat hyperplane CAN separate the classes. The RBF (Radial Basis Function) kernel is the most popular choice for non-linear datasets.
