// interactive walkthrough

How a Support Vector Machine trains — one update at a time

An SVM finds the widest possible margin separating the classes. Here we train it with a method that scales to large datasets: stochastic sub-gradient descent on the hinge loss (the Pegasos solver). You can step through one example at a time and watch the boundary, the margin, and the support vectors move. Switch the kernel to see a linear boundary fail and an RBF boundary succeed on non-linear data.
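The training loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the page's own implementation: the function names, the mapping lam = 1 / (n * C) from C to the Pegasos regularization strength, and the choice to learn the bias as the weight of an appended constant feature are all assumptions made for the sketch.

```python
import math
import random

def pegasos_linear(X, y, C=1.0, epochs=10, seed=0):
    """Train a linear SVM by stochastic sub-gradient descent on the hinge loss.

    X: list of feature lists; y: labels in {-1, +1}.
    Assumptions (not from the page): lam = 1 / (n * C), and the bias is
    learned as the weight of a constant feature appended to every example.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    lam = 1.0 / (n * C)
    Xa = [list(x) + [1.0] for x in X]      # last weight plays the role of b
    w = [0.0] * (d + 1)
    order = list(range(n))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)                 # one epoch = one pass in random order
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # Pegasos step size, shrinks over time
            score = sum(wj * xj for wj, xj in zip(w, Xa[i]))
            active = y[i] * score < 1.0    # margin violated: hinge term is active
            # Shrink step, from the gradient of (lam / 2) * ||w||^2.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if active:
                # Hinge step: pull the boundary toward classifying example i.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w[:d], w[d]

def margin_width(w):
    """Geometric width of the band between w.x + b = -1 and w.x + b = +1."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))

# Tiny usage example on four hand-picked points.
X = [[2.0, 1.5], [1.0, 2.5], [-2.0, -1.0], [-1.5, -3.0]]
y = [1, 1, -1, -1]
w, b = pegasos_linear(X, y)
print(w, b, margin_width(w))
```

Each pass through the inner loop corresponds to one "update" in the walkthrough: the example either violates the margin and moves the boundary, or it only triggers the shrink step.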

The four ideas you need

1 · Margin. The boundary is w·x + b = 0. The margin is the empty band between the two dashed lines w·x + b = ±1. Its geometric width is 2/‖w‖ — so minimizing ‖w‖² maximizes the margin.

2 · Support vectors. Only the points on the margin, inside it, or on the wrong side of the boundary (for kernels: the points with a non-zero coefficient) determine the boundary. Delete every other point, retrain, and you get the same model. They're circled in the plot.

3 · Soft margin (C). Real data overlaps. C trades margin width against violations: high C = punish every mistake = narrow, hard margin; low C = tolerate violations for a wider, smoother margin.

4 · Kernel trick. Replace the dot product x·z with a kernel K(x, z) and the SVM can separate data with a curved boundary, without ever computing the high-dimensional feature map. The linear, RBF, and poly kernels are available below. SVMs are not scale-invariant, so features are standardized first.
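The ideas above can be made concrete with a short, hedged sketch. The function names, the default values for gamma, degree, and coef0, and the layout of the support-vector list are illustrative assumptions, not taken from the page. The sketch standardizes features, defines the three kernels, and evaluates a kernel decision function in which points with a zero coefficient contribute nothing.

```python
import math

def standardize(X):
    """Rescale each feature (column) to zero mean and unit variance."""
    n = len(X)
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))

def rbf_kernel(x, z, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def poly_kernel(x, z, degree=3, coef0=1.0):
    return (linear_kernel(x, z) + coef0) ** degree

def decision_value(x, support, alphas, b, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b over (x_i, y_i) pairs."""
    return sum(a * yi * kernel(xi, x) for a, (xi, yi) in zip(alphas, support)) + b

# Kernel trick check: the degree-2 kernel (x.z)^2 equals an ordinary dot
# product in the explicit feature space (x1^2, sqrt(2)*x1*x2, x2^2).
x, z = [1.0, 2.0], [3.0, 4.0]
phi = lambda v: [v[0] ** 2, math.sqrt(2) * v[0] * v[1], v[1] ** 2]
print(poly_kernel(x, z, degree=2, coef0=0.0), linear_kernel(phi(x), phi(z)))
```

The two printed numbers agree up to floating-point rounding, which is the point of the trick: the kernel gives the feature-space dot product without building the feature vectors.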

Task, kernel & dataset

// what to separate, and how

Hyperparameters

// drag to tune · changes reset the model

Training controls

// one update, one epoch, or all at once

[plot: decision boundary & margin]

[panel: model state]

Why this update?

// the sub-gradient step for the chosen example

Click ▷ Next update to take one sub-gradient step; the hinge-loss decision for that example will be shown here.
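For reference, the decision made at each step can be written out as follows. This is the standard Pegasos-style update for the linear case, given here as a sketch: the symbols λ (regularization strength), η_t (step size at update t), and the schedule η_t = 1/(λt) are assumptions, since the page does not state its exact settings, and the bias update is left out for the same reason.

```latex
% Per-example objective for the sampled pair (x_i, y_i), with y_i in {-1, +1}
\[
\ell_i(w, b) = \frac{\lambda}{2}\lVert w\rVert^2
             + \max\bigl(0,\; 1 - y_i\,(w\cdot x_i + b)\bigr)
\]

% Sub-gradient step: the hinge term contributes only when the margin is violated
\[
w \leftarrow
\begin{cases}
(1 - \eta_t\lambda)\,w + \eta_t\,y_i\,x_i, & \text{if } y_i\,(w\cdot x_i + b) < 1,\\[4pt]
(1 - \eta_t\lambda)\,w, & \text{otherwise,}
\end{cases}
\qquad \eta_t = \frac{1}{\lambda t}
\]
```

In words: an example that sits safely outside the margin only shrinks w slightly, which widens the margin; an example on or inside the margin also pulls w toward y_i x_i.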

[live readouts: mean hinge loss, training accuracy]

Epoch history

// one row per completed epoch
