// interactive walkthrough

How Extra Trees trains — random forest, more random

Extremely Randomized Trees (Geurts et al. 2006). The overall structure matches random forest (many trees, averaged predictions, a random feature subset per split), with two changes: no bootstrap by default, so each tree trains on the full training set, and a random threshold per candidate feature at each split instead of the optimal one. Training is faster because there is no threshold search. The extra randomness decorrelates the trees, which lowers ensemble variance at the cost of slightly higher bias per tree, and accuracy is often comparable.

Extra Trees vs Random Forest — the two changes

1. Threshold drawing. RF scans the candidate thresholds (midpoints between consecutive sorted feature values) and picks the one with maximum impurity decrease. Extra Trees draws one uniform-random threshold τ ~ U[min(X_f), max(X_f)] for each feature f in the random subset, with min and max taken over the samples that reach the node, then picks the feature whose random split gives the largest impurity decrease. No threshold search means faster training, more randomness, and smoother decision boundaries.

2. Bootstrap. scikit-learn's default for ET is bootstrap=False (vs. True for RF). Each tree sees the full training set, so randomness comes entirely from feature and threshold sampling. With bootstrap=False there are no out-of-bag rows; set bootstrap=True to get an OOB error estimate.
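
The threshold drawing in item 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not scikit-learn's implementation: the function names `gini` and `extra_trees_split`, the list-of-rows data layout, and the choice of Gini impurity are all assumptions made for the example.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def extra_trees_split(X, y, max_features, rng):
    """Choose one split by drawing a random threshold per candidate feature.

    X is a list of rows (lists of floats) reaching this node, y the labels.
    Returns (feature, threshold, impurity_decrease), or None if no candidate
    feature produces a non-empty left and right child.
    """
    n = len(y)
    parent = gini(y)
    n_features = len(X[0])
    # Random feature subset, as in random forest.
    candidates = rng.sample(range(n_features), min(max_features, n_features))
    best = None
    for f in candidates:
        values = [row[f] for row in X]
        lo, hi = min(values), max(values)  # node-local range of feature f
        if lo == hi:
            continue  # constant feature at this node, cannot split
        tau = rng.uniform(lo, hi)  # one random threshold, no search
        left = [y[i] for i in range(n) if values[i] <= tau]
        right = [y[i] for i in range(n) if values[i] > tau]
        if not left or not right:
            continue
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - child
        if best is None or gain > best[2]:
            best = (f, tau, gain)
    return best

# Toy node: feature 1 is constant, so only feature 0 can be split.
rng = random.Random(0)
X = [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
y = [0, 0, 1, 1]
split = extra_trees_split(X, y, max_features=2, rng=rng)
```

The only per-feature work is one min/max pass and one impurity evaluation, which is where the speedup over an exhaustive threshold scan comes from.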

Task & impurity criterion

// per-tree split scoring

task

criterion

Hyperparameters

// drag to tune · changes reset the forest

bootstrap

Training controls

// build the forest tree by tree

data & ensemble prediction —

selected tree —

Random thresholds at the root

// the splits this tree's root considered

Forest gallery

// click a tree to inspect

train / OOB error vs n_trees (series: OOB error, train error)
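
An OOB error curve only exists when each tree is trained on a bootstrap sample, because the error is measured on the rows a tree did not see. The sketch below shows how one tree's in-bag and out-of-bag rows could be drawn; `bootstrap_indices` is a hypothetical helper written for this example, not a library function.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement, plus the out-of-bag rest."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    seen = set(in_bag)
    # Rows never drawn for this tree are its out-of-bag evaluation set.
    oob = [i for i in range(n_samples) if i not in seen]
    return in_bag, oob

rng = random.Random(0)
in_bag, oob = bootstrap_indices(1000, rng)
oob_fraction = len(oob) / 1000  # tends toward 1/e (about 0.368) as n grows
```

In scikit-learn the equivalent switch is passing `bootstrap=True` together with `oob_score=True` to the Extra Trees estimator.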

ensemble vs actuals —

Feature importance

// weighted impurity decrease across all trees
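
One way to read "weighted impurity decrease across all trees" is sketched below: every split adds its impurity decrease, weighted by the fraction of training rows that reach the node, to the feature it split on, and the totals are normalized to sum to 1. The record layout `(feature, n_node, n_total, decrease)` is a hypothetical one chosen for this example, and libraries may normalize differently (for instance per tree before averaging).

```python
def impurity_importance(forest, n_features):
    """Sum weighted impurity decreases per feature over all trees.

    forest is a list of trees; each tree is a list of split records
    (feature, n_node, n_total, decrease). This layout is an assumption
    made for the sketch.
    """
    totals = [0.0] * n_features
    for tree in forest:
        for feature, n_node, n_total, decrease in tree:
            # Weight each split by the share of rows reaching its node.
            totals[feature] += (n_node / n_total) * decrease
    norm = sum(totals)
    return [t / norm for t in totals] if norm > 0 else totals

# Two toy trees over two features.
forest = [
    [(0, 100, 100, 0.3), (1, 40, 100, 0.2)],
    [(1, 100, 100, 0.1)],
]
importance = impurity_importance(forest, n_features=2)
```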

Forest history

// one row per tree
