// interactive walkthrough

How Extra Trees trains — random forest, more random

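Extremely Randomized Trees (Geurts et al., 2006). Extra Trees keeps the overall structure of a random forest (many trees, averaged predictions, a random feature subset at each split) but makes two changes: by default there is no bootstrap, so every tree is trained on the full training set, and each candidate feature gets a single random threshold at each split instead of the optimal one. Training is faster because there is no threshold search. The extra randomization decorrelates the trees, which lowers the variance of the averaged ensemble at the cost of slightly higher bias per tree, and accuracy is often comparable to random forest.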

Extra Trees vs Random Forest — the two changes

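1. Threshold drawing. For each candidate feature, RF scans the midpoints between consecutive sorted feature values and keeps the threshold with the maximum impurity decrease. Extra Trees instead draws one uniform-random threshold τ ~ U[min(X_f), max(X_f)] for each feature f in the random subset, with the min and max taken over the samples in the current node, then keeps the feature whose random split gives the largest impurity decrease. With no threshold search, split finding is cheaper, the trees are more random, and the averaged decision boundary tends to be smoother.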
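The two split rules above can be sketched side by side. This is a minimal illustration in plain Python, not the walkthrough's own code: the helper names (`gini`, `impurity_decrease`, `random_threshold_split`, `exhaustive_split`), the Gini criterion, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(y, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(y)
    return gini(y) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)


def random_threshold_split(X, y, feature_subset, rng):
    """Extra Trees style: one uniform-random threshold per candidate feature."""
    best = None
    for f in feature_subset:
        values = [row[f] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # feature is constant in this node, so it cannot split
        tau = rng.uniform(lo, hi)
        left = [label for v, label in zip(values, y) if v <= tau]
        right = [label for v, label in zip(values, y) if v > tau]
        gain = impurity_decrease(y, left, right)
        if best is None or gain > best[2]:
            best = (f, tau, gain)
    return best  # (feature, threshold, impurity decrease) or None


def exhaustive_split(X, y, feature_subset):
    """Random forest style: scan every midpoint and keep the best threshold."""
    best = None
    for f in feature_subset:
        values = sorted(set(row[f] for row in X))
        for a, b in zip(values, values[1:]):
            tau = (a + b) / 2
            left = [label for row, label in zip(X, y) if row[f] <= tau]
            right = [label for row, label in zip(X, y) if row[f] > tau]
            gain = impurity_decrease(y, left, right)
            if best is None or gain > best[2]:
                best = (f, tau, gain)
    return best


# Toy node: four samples, two features, two classes (made-up data).
X = [[0.1, 5.0], [0.4, 3.0], [0.6, 4.0], [0.9, 1.0]]
y = [0, 0, 1, 1]
rng = random.Random(0)
et = random_threshold_split(X, y, [0, 1], rng)
rf = exhaustive_split(X, y, [0, 1])
print("extra trees:", et)
print("random forest:", rf)
```

The random split can never score higher than the exhaustive one on the same node, but it costs one pass per feature instead of one pass per midpoint.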

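2. Bootstrap. scikit-learn's default for Extra Trees is bootstrap=False (vs. True for random forest). Each tree sees the full training set, so the randomness comes entirely from feature and threshold sampling. Setting bootstrap=True (together with oob_score=True) makes an out-of-bag (OOB) estimate available, because only then does each tree leave some rows unseen.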
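Why the OOB estimate depends on bootstrap can be seen by counting the rows a single tree never trains on. The sketch below uses only the standard library; `in_bag_indices` and `oob_indices` are hypothetical helper names for this illustration, not scikit-learn functions.

```python
import random


def in_bag_indices(n_samples, bootstrap, rng):
    """Row indices that one tree trains on."""
    if bootstrap:
        # Sample n rows with replacement: some rows repeat, others are left out.
        return [rng.randrange(n_samples) for _ in range(n_samples)]
    # Without bootstrap, every tree sees every row exactly once.
    return list(range(n_samples))


def oob_indices(n_samples, in_bag):
    """Rows the tree never saw; only these can give an out-of-bag error."""
    seen = set(in_bag)
    return [i for i in range(n_samples) if i not in seen]


rng = random.Random(0)
n = 1000
full = in_bag_indices(n, bootstrap=False, rng=rng)
boot = in_bag_indices(n, bootstrap=True, rng=rng)

oob_full = oob_indices(n, full)  # empty, so no OOB estimate is possible
oob_boot = oob_indices(n, boot)  # close to n / e rows, about 37% for large n
print(len(oob_full), len(oob_boot))
```

With bootstrap off, the OOB set is empty for every tree, which is why an OOB error curve only makes sense once bootstrap is enabled.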

01

Task & impurity criterion

// per-tree split scoring
task
criterion
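As a rough sketch of what per-tree split scoring means for each task: classification typically scores a split with Gini impurity or entropy, and regression with the variance of the targets (equivalent to mean squared error around the node mean). The page does not list its criteria, so the three functions below and the `split_score` helper are assumptions for illustration.

```python
import math
from collections import Counter


def gini(labels):
    """Classification: probability of mislabeling a random sample."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Classification: Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def variance(targets):
    """Regression: mean squared deviation from the node mean."""
    n = len(targets)
    if n == 0:
        return 0.0
    mean = sum(targets) / n
    return sum((t - mean) ** 2 for t in targets) / n


def split_score(impurity, parent, left, right):
    """Impurity decrease: parent impurity minus size-weighted child impurity."""
    n = len(parent)
    return (impurity(parent)
            - (len(left) / n) * impurity(left)
            - (len(right) / n) * impurity(right))


# A perfect split under each criterion (made-up targets).
gini_gain = split_score(gini, [0, 0, 1, 1], [0, 0], [1, 1])
entropy_gain = split_score(entropy, [0, 0, 1, 1], [0, 0], [1, 1])
variance_gain = split_score(variance, [1.0, 2.0, 5.0, 6.0], [1.0, 2.0], [5.0, 6.0])
print(gini_gain, entropy_gain, variance_gain)
```

Whichever criterion is chosen, the split rule stays the same: the candidate with the largest decrease wins.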
02

Hyperparameters

// drag to tune · changes reset the forest
bootstrap
03

Training controls

// build the forest tree by tree
data & ensemble prediction
selected tree (legend: internal node, leaf)
04

Random thresholds at the root

// the splits this tree's root considered


05

Forest gallery

// click a tree to inspect
train / OOB error vs n_trees (legend: OOB error, train error)
ensemble vs actuals
06

Feature importance

// weighted impurity decrease across all trees
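One way to compute a weighted impurity decrease across all trees is sketched below. The split-record layout (dicts with `feature`, `n`, `impurity`, and child fields), the function name `feature_importances`, and the choice to normalize within each tree before averaging are assumptions for this example; the page's own bookkeeping may differ.

```python
def feature_importances(forest, n_features):
    """Average per-tree importances; each tree is a list of split records.

    The first record of each tree is assumed to be its root split.
    """
    totals = [0.0] * n_features
    for tree in forest:
        n_root = tree[0]["n"]
        per_tree = [0.0] * n_features
        for s in tree:
            decrease = (s["impurity"]
                        - (s["n_left"] / s["n"]) * s["impurity_left"]
                        - (s["n_right"] / s["n"]) * s["impurity_right"])
            # Weight by the fraction of training samples reaching this node.
            per_tree[s["feature"]] += (s["n"] / n_root) * decrease
        tree_total = sum(per_tree)
        if tree_total > 0:
            for f in range(n_features):
                totals[f] += per_tree[f] / tree_total
    grand = sum(totals)
    return [t / grand for t in totals] if grand > 0 else totals


# Two made-up trees over two features.
forest = [
    [  # tree 1: a single perfect split on feature 0
        {"feature": 0, "n": 4, "impurity": 0.5,
         "n_left": 2, "impurity_left": 0.0,
         "n_right": 2, "impurity_right": 0.0},
    ],
    [  # tree 2: a weaker root split on feature 1, then a split on feature 0
        {"feature": 1, "n": 4, "impurity": 0.5,
         "n_left": 1, "impurity_left": 0.0,
         "n_right": 3, "impurity_right": 4 / 9},
        {"feature": 0, "n": 3, "impurity": 4 / 9,
         "n_left": 2, "impurity_left": 0.0,
         "n_right": 1, "impurity_right": 0.0},
    ],
]
importances = feature_importances(forest, n_features=2)
print(importances)
```

Because the scores are normalized, they sum to 1 and can be read as each feature's share of the total impurity reduction.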


07

Forest history

// one row per tree
