
Interactive Explainer

Dropout Playground

Slide the dropout probability and watch a small network flicker. Every frame is a different sub-network. Over many frames you're training an exponential ensemble with shared weights — the core intuition behind the most influential regularizer of 2012.

~8 min Deep Learning · Regularization

Dropout was Hinton's 2012 insight: on every forward pass, randomly silence hidden units, keeping each one alive with probability p. At inference, turn dropout off and use the whole network. In effect, you train an ensemble of 2^N thinned networks (one per binary mask over N units) that all share weights, for the price of one forward pass per batch.

The playground

[Interactive playground readouts: Layer 1 alive, Layer 2 alive, active edges, and the inverted scale factor 1/p.]

Try this: set p = 0.5 and press Auto-resample. Each frame you see a different sub-network. Switch to Eval mode — dropout turns off, every unit fires, and predictions become deterministic.

Why the 1/p rescale?

Naive dropout would make activations at eval time (all units alive) systematically larger than at train time (only a fraction alive). To keep the expected activation constant, we scale the surviving units up by 1/p during training:

h_drop = (h ⊙ mask) / p    mask ~ Bernoulli(p)

This is called inverted dropout, and it's the form PyTorch uses (note that PyTorch's nn.Dropout is parameterized by the drop probability, so its scale is 1/(1 − p)). Because the rescaling happens at training time, the eval path stays the simplest thing: identity.
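The training-time rescale can be sketched in a few lines of NumPy. This is a toy sketch of the formula above, not PyTorch's internals; the function name `inverted_dropout` and the shapes are made up for illustration, and p is the keep probability as in the equation:

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(h, p, train=True):
    """Inverted dropout: keep each unit with probability p, rescale survivors by 1/p.

    At eval time (train=False) this is the identity, matching the text.
    """
    if not train:
        return h
    mask = rng.random(h.shape) < p      # Bernoulli(p) keep-mask
    return h * mask / p                 # rescale so E[output] == h

h = np.ones((100000, 1))                # toy activations, all 1.0
out = inverted_dropout(h, p=0.9)
print(out.mean())                       # ≈ 1.0: expected activation is preserved
```

About 10% of entries are zeroed, and the survivors are scaled to 1/0.9 ≈ 1.11 (the scale factor shown in the playground), so the mean stays at 1.0.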

Common bug. Forgetting model.eval() at inference means dropout stays on — your predictions will flicker randomly, and calibration will be broken.

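The bug is easy to demonstrate in a few lines of PyTorch. The model and shapes here are made up for illustration; the train/eval toggle and nn.Dropout are the real API:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(),
                    nn.Dropout(p=0.5), nn.Linear(16, 1))
x = torch.randn(4, 8)

net.train()                     # dropout active: repeated calls disagree
a, b = net(x), net(x)
print(torch.allclose(a, b))     # False: predictions flicker

net.eval()                      # dropout off: deterministic path
c, d = net(x), net(x)
print(torch.allclose(c, d))     # True
```

model.eval() flips every dropout (and batch-norm) submodule to its inference behavior in one call, which is why forgetting it silently breaks calibration.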

Two intuitions for why it works

Ensemble view. Each minibatch trains a different thinned network. Over training you're implicitly training 2^N networks that all share weights. At test time, no mask ⇒ you get (approximately) the geometric mean of the ensemble's predictions for free.

Co-adaptation view. Without dropout, unit j relies on unit k being alive to do its job. Dropout forces every unit to be useful on its own, distributing the representation rather than localizing it.
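The ensemble view can be checked numerically: with inverted dropout, averaging train-mode predictions over many sampled masks should land very close to the single eval-mode prediction. This is a toy check with a made-up model; the match is tight here because the mask enters the output linearly after the last nonlinearity:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
                    nn.Dropout(p=0.5), nn.Linear(32, 1))
x = torch.randn(1, 8)

net.eval()
y_eval = net(x)                         # deterministic "ensemble" prediction

net.train()
with torch.no_grad():                   # Monte Carlo over sampled masks
    samples = torch.stack([net(x) for _ in range(5000)])
y_mc = samples.mean(0)                  # arithmetic mean over thinned networks

print((y_eval - y_mc).abs().item())     # small: eval ≈ mask-averaged prediction
```

For deeper networks (dropout between nonlinear layers) the agreement is only approximate, which is the sense in which weight scaling approximates the ensemble average.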

Where dropout lives in 2026

Part of the ES 667 Deep Learning course · IIT Gandhinagar · Aug 2026.