How a Support Vector Machine trains — one update at a time
An SVM finds the widest possible margin separating the classes. Here we train it the way it's actually done at scale — stochastic sub-gradient descent on the hinge loss (the Pegasos solver) — so you can step through one example at a time and watch the boundary, the margin, and the support vectors move. Switch the kernel to see linear fail and RBF succeed on non-linear data.
1 · Margin. The boundary is w·x + b = 0. The margin is the empty band
between the two dashed lines w·x + b = ±1. Its geometric width is 2/‖w‖ —
so minimizing ‖w‖² maximizes the margin.
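To make the 2/‖w‖ claim concrete, here is a minimal numeric check. The values of w and b are hypothetical, picked only so the arithmetic is easy to follow; they do not come from the demo.

```python
import numpy as np

# Hypothetical weights and bias, chosen only for illustration.
w = np.array([3.0, 4.0])
b = -2.0

norm_w = np.linalg.norm(w)       # ‖w‖ = 5
margin_width = 2.0 / norm_w      # claimed width of the band

# Closest points to the origin on each dashed line, found by moving along w:
# x = t * w / ‖w‖² satisfies w·x + b = t + b.
x_plus = (1.0 - b) * w / norm_w**2    # lies on w·x + b = +1
x_minus = (-1.0 - b) * w / norm_w**2  # lies on w·x + b = -1

# The two points differ only along w, so their distance is the band width.
measured_width = np.linalg.norm(x_plus - x_minus)
print(margin_width, measured_width)
```

Doubling w and b leaves the boundary w·x + b = 0 where it is but halves the width, which is why shrinking ‖w‖ widens the margin.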
2 · Support vectors. Only the points on the margin, inside it, or on the wrong side of the boundary (for kernels, the points with a non-zero coefficient) determine the boundary. Delete every other point, retrain to convergence, and you get the same model. Support vectors are circled in the plot.
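For a linear model, the support-vector condition can be sketched as a test on the functional margin y·(w·x + b) ≤ 1. The data, w, and b below are hypothetical values used to show the test, not the output of a trained model.

```python
import numpy as np

# Hypothetical 2-D points, labels in {-1, +1}, and an illustrative linear model.
X = np.array([[2.0, 0.0], [1.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [-3.0, 0.0]])
y = np.array([1, 1, 1, -1, -1])
w = np.array([1.0, 0.0])
b = 0.0

functional_margin = y * (X @ w + b)   # y_i (w·x_i + b) for every point

# On the margin (== 1), inside it (< 1), or misclassified (< 0).
tol = 1e-9
is_support = functional_margin <= 1.0 + tol
support_idx = np.flatnonzero(is_support)
print(support_idx)
```

Points 0 and 4 sit well outside the band, so they would not be circled under this rule.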
3 · Soft margin (C). Real data overlaps. C trades margin width against
violations: high C = punish every mistake = narrow, hard margin; low C =
tolerate violations for a wider, smoother margin.
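The trade-off can be seen in the usual soft-margin objective, ½‖w‖² + C·Σ max(0, 1 − y·(w·x + b)). The sketch below assumes that form (the page does not state its exact objective) and compares two hypothetical candidate models on toy data.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Half squared norm plus C times the summed hinge loss (assumed form)."""
    hinge = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * (w @ w) + C * hinge.sum()

# Hypothetical toy data: one positive point sits close to the boundary.
X = np.array([[2.0, 0.0], [1.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [-3.0, 0.0]])
y = np.array([1, 1, 1, -1, -1])

wide = (np.array([1.0, 0.0]), 0.0)    # wider margin, one violation
narrow = (np.array([2.0, 0.0]), 0.0)  # narrower margin, no violations

for C in (1.0, 10.0):
    obj_wide = soft_margin_objective(*wide, X, y, C)
    obj_narrow = soft_margin_objective(*narrow, X, y, C)
    print(C, obj_wide, obj_narrow)
```

With C = 1 the wide model has the lower objective; with C = 10 the single violation costs more than the extra ‖w‖², so the narrow model wins.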
4 · Kernel trick. Replace the dot product x·z with a kernel
K(x, z) and the SVM separates data with a curved boundary, without ever computing the
high-dimensional feature map. Linear, RBF, and polynomial kernels are available below.
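A small sketch of the three kernels, plus a check that a degree-2 polynomial kernel matches a dot product in an explicit 6-dimensional feature space. The parameter names and defaults (gamma, degree, coef0) are assumptions for illustration; the demo's own settings are not shown on the page.

```python
import numpy as np

def linear_kernel(x, z):
    return x @ z

def rbf_kernel(x, z, gamma=1.0):
    diff = x - z
    return np.exp(-gamma * (diff @ diff))

def poly_kernel(x, z, degree=2, coef0=1.0):
    return (x @ z + coef0) ** degree

def phi_poly2(x):
    """Explicit feature map whose dot product equals (x·z + 1)² for 2-D inputs."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 * x1, x2 * x2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_value = poly_kernel(x, z)                 # computed in the 2-D input space
explicit_value = phi_poly2(x) @ phi_poly2(z)  # computed in the 6-D feature space
print(k_value, explicit_value)
```

The kernel gets the same number from two multiplications and an addition in the input space. For RBF the corresponding feature space is infinite-dimensional, so the explicit route is not available at all.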
SVMs are not scale-invariant, so features are standardized first.
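Standardizing here means subtracting each feature's mean and dividing by its standard deviation. A minimal sketch, with hypothetical helper names and toy data; the statistics are fitted on the training data and reused for any new points:

```python
import numpy as np

def fit_standardizer(X):
    """Return per-feature mean and standard deviation of the training data."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0   # leave constant features unchanged instead of dividing by zero
    return mean, std

def standardize(X, mean, std):
    return (X - mean) / std

# Hypothetical data: the second feature is 1000x larger than the first,
# so it would dominate w·x and any distance-based kernel.
X_train = np.array([[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]])
mean, std = fit_standardizer(X_train)
Z = standardize(X_train, mean, std)
print(Z)
```

After scaling, both columns have zero mean and unit variance, so neither one decides the margin by magnitude alone.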
Task, kernel & dataset
// what to separate, and how
Hyperparameters
// drag to tune · changes reset the model
Training controls
// one update, one epoch, or all at once
Why this update?
// the sub-gradient step for the chosen example
Click ▷ Next update to take one sub-gradient step; the hinge-loss decision for that example will be shown here.
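One such sub-gradient step, for the linear case, might look like the sketch below. It assumes the standard Pegasos step size 1/(λt) and an unregularized bias term; the page does not show its actual implementation, and the names pegasos_step, lam, and t are illustrative. Pegasos is usually written with λ rather than C; dividing the C-form objective by nC gives the correspondence λ = 1/(nC) for n training points.

```python
import numpy as np

def pegasos_step(w, b, x_i, y_i, lam, t):
    """One stochastic sub-gradient step on (lam/2)‖w‖² + hinge loss of one example.

    Returns the updated (w, b) and whether the example violated the margin.
    """
    eta = 1.0 / (lam * t)                 # decaying step size
    margin = y_i * (w @ x_i + b)          # functional margin of this example
    violated = margin < 1.0               # the hinge-loss decision

    w_new = (1.0 - eta * lam) * w         # shrink: gradient of the regularizer
    b_new = b
    if violated:
        w_new = w_new + eta * y_i * x_i   # pull the boundary toward this example
        b_new = b + eta * y_i             # assumption: bias updated without shrinkage
    return w_new, b_new, violated

# Two consecutive steps on the same hypothetical example.
w, b = np.zeros(2), 0.0
x_i, y_i, lam = np.array([1.0, 2.0]), 1, 0.5

w, b, hit1 = pegasos_step(w, b, x_i, y_i, lam, t=1)  # margin 0 < 1: violation
w, b, hit2 = pegasos_step(w, b, x_i, y_i, lam, t=2)  # margin now >= 1: shrink only
print(w, b, hit1, hit2)
```

When the example already has margin at least 1, only the shrink term applies, which is the part of the update that widens the margin.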
Epoch history
// one row per completed epoch
No epochs completed yet.