How Extra Trees trains — random forest, more random
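Extremely Randomized Trees (Geurts et al. 2006). Same overall structure as random forest (many trees, averaged predictions, a random feature subset per split), with two key differences: no bootstrap by default (each tree uses the full training set) and a random threshold per candidate feature at each split instead of the optimal one. Training is faster because there is no threshold search. The extra randomization decorrelates the trees, which lowers the variance of the averaged ensemble at the cost of slightly higher bias per tree, and accuracy is often comparable to random forest.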
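As a rough illustration of the "no bootstrap" difference, the sketch below shows which rows a single tree would train on under each scheme. It uses only the Python standard library; the helper name `tree_sample_indices` is hypothetical and is not part of any library.

```python
import random

def tree_sample_indices(n_samples, bootstrap, rng):
    """Return (in_bag, out_of_bag) row indices for one tree."""
    if not bootstrap:
        # Extra Trees style: every tree trains on all rows, nothing is held out.
        return list(range(n_samples)), []
    # Random forest style: draw n_samples rows with replacement.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    seen = set(in_bag)
    # Rows never drawn are "out of bag" and can serve as a validation set.
    out_of_bag = [i for i in range(n_samples) if i not in seen]
    return in_bag, out_of_bag

rng = random.Random(0)
et_in, et_oob = tree_sample_indices(1000, bootstrap=False, rng=rng)
rf_in, rf_oob = tree_sample_indices(1000, bootstrap=True, rng=rng)
print("full set: ", len(set(et_in)), "distinct rows,", len(et_oob), "out of bag")
print("bootstrap:", len(set(rf_in)), "distinct rows,", len(rf_oob), "out of bag")
```

On average a bootstrap sample leaves out about 37% of the rows (the limit of (1 - 1/n)^n is 1/e), which is why an out-of-bag estimate is only available when bootstrapping is enabled.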
1. Threshold drawing. At each split, RF scans the midpoints between consecutive sorted values of each candidate feature and picks
the one with maximum impurity decrease. Extra Trees draws one uniform-random threshold
τ ~ U[min(Xf), max(Xf)] per feature in the random subset, with the min and max taken over the samples at the current node, then picks
the feature whose random split gives the largest impurity decrease. No threshold search → faster, more random, and smoother decision
boundaries once many trees are averaged.
2. Bootstrap. scikit-learn's default for ET is bootstrap=False (vs. True
for RF). Each tree sees the full training set; randomness comes entirely from feature/threshold
sampling. Setting bootstrap=True (together with oob_score=True) enables an out-of-bag (OOB) estimate.
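The threshold-drawing step in item 1 can be sketched in plain Python as follows. This is a minimal illustration for classification with Gini impurity, not scikit-learn's implementation; the function names (`gini`, `impurity_decrease`, `extra_trees_split`) and the toy data are assumptions made for the example.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def impurity_decrease(X, y, feature, threshold):
    """Parent impurity minus the size-weighted impurity of the two children."""
    left = [yi for xi, yi in zip(X, y) if xi[feature] <= threshold]
    right = [yi for xi, yi in zip(X, y) if xi[feature] > threshold]
    n = len(y)
    return gini(y) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

def extra_trees_split(X, y, max_features, rng):
    """Pick a split the Extra Trees way: one random threshold per candidate feature."""
    n_features = len(X[0])
    candidates = rng.sample(range(n_features), max_features)  # random feature subset
    best = None
    for f in candidates:
        values = [row[f] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # feature is constant at this node, so it cannot split
        tau = rng.uniform(lo, hi)  # tau ~ U[min(Xf), max(Xf)], no threshold search
        gain = impurity_decrease(X, y, f, tau)
        if best is None or gain > best[2]:
            best = (f, tau, gain)
    return best  # (feature index, threshold, impurity decrease) or None

rng = random.Random(0)
X = [[0.1, 5.0], [0.4, 3.0], [0.6, 4.0], [0.9, 2.0]]  # toy node: 4 samples, 2 features
y = [0, 0, 1, 1]
feature, tau, gain = extra_trees_split(X, y, max_features=2, rng=rng)
print(f"split on feature {feature} at tau={tau:.3f}, impurity decrease={gain:.3f}")
```

Each candidate feature costs one pass over the node's samples here, whereas an exhaustive search would evaluate every midpoint for every candidate feature.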
[Interactive demo panels: Task & impurity criterion (per-tree split scoring); Hyperparameters; Training controls (build the forest tree by tree); Random thresholds at the root (the splits this tree's root considered); Forest gallery; Feature importance (weighted impurity decrease across all trees); Forest history (one row per tree).]