Pruning — The Jenga Approach
Analogy: In Jenga, you remove blocks that aren't structurally important. The tower still stands.
Neural networks are the same: many weights sit close to zero and contribute almost nothing to the output.
Before pruning: [0.8, 0.001, -0.7, 0.002, 0.9, -0.003]
After pruning:  [0.8, 0,     -0.7, 0,     0.9, 0]
Set small weights to zero. The model still works.
Typical result: 50–90% of weights can be pruned with under 1% accuracy loss, often after a brief fine-tuning pass to recover.
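A minimal sketch of magnitude pruning with NumPy, using the example vector above. The threshold of 0.01 is an assumption chosen to match the example; in practice you would pick it to hit a target sparsity:

```python
import numpy as np

# The example weight vector from above.
weights = np.array([0.8, 0.001, -0.7, 0.002, 0.9, -0.003])

# Magnitude pruning: zero out any weight whose absolute value
# falls below a chosen threshold (0.01 here, an assumption).
threshold = 0.01
pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

sparsity = np.mean(pruned == 0)  # fraction of weights removed
```

In a real model you would apply this per layer (or globally across layers) and store the zeros as a binary mask so pruned weights stay zero during fine-tuning.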
Structured pruning (removing entire neurons/channels) gives real speedups because hardware can skip whole blocks of computation; unstructured zeros, by contrast, usually need specialized sparse kernels before they save any time.
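To make the contrast concrete, here is a hedged sketch of structured pruning: score each output neuron (column of a layer's weight matrix) by its L2 norm and drop the weakest one. The 4×3 matrix and the keep-all-but-the-minimum rule are illustrative assumptions:

```python
import numpy as np

# Hypothetical layer weights: 4 inputs -> 3 output neurons (columns).
W = np.array([
    [ 0.9,  0.01, -0.8],
    [-0.7,  0.02,  0.6],
    [ 0.8, -0.01, -0.9],
    [ 0.5,  0.03,  0.7],
])

# Score each neuron by the L2 norm of its column of weights.
scores = np.linalg.norm(W, axis=0)

# Drop the lowest-scoring neuron entirely (illustrative rule;
# real schemes prune to a target ratio, then fine-tune).
keep = scores > scores.min()
W_pruned = W[:, keep]
```

Because a whole column disappears, the resulting matrix is genuinely smaller, so every downstream matrix multiply does less work with no sparse-kernel support required.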