Understanding Each Line

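Why zero_grad()? PyTorch accumulates gradients: each backward() call adds to the values already stored in .grad instead of overwriting them. Without clearing, gradients from earlier batches mix into the current one, which is usually not what we want.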
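To make the accumulation concrete, here is a minimal sketch. The scalar weight `w`, the input `x`, and the toy loss `w * x` are illustrative assumptions, not part of the lecture's training loop; the SGD optimizer is there only so that `zero_grad()` can be called.

```python
import torch

# Hypothetical toy setup: one learnable weight, loss = w * x, so d(loss)/dw = x = 3.
w = torch.tensor(1.0, requires_grad=True)
x = torch.tensor(3.0)
optimizer = torch.optim.SGD([w], lr=0.1)  # only used here for zero_grad()

# First backward pass: w.grad holds the fresh gradient.
(w * x).backward()
grad_first = w.grad.item()

# Second backward pass WITHOUT clearing: the new gradient is added to the old one.
(w * x).backward()
grad_accumulated = w.grad.item()

# Clear the stored gradient, then run backward again: only the fresh gradient remains.
optimizer.zero_grad()
(w * x).backward()
grad_after_clear = w.grad.item()

print(grad_first, grad_accumulated, grad_after_clear)  # 3.0 6.0 3.0
```

The middle value is double the true gradient, which is the kind of silent error a training loop hits when zero_grad() is missing. Depending on the PyTorch version, zero_grad() either zeros the gradient tensors or resets them to None; in both cases the next backward() starts fresh.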

Neural Networks

Part 1: From Linear Models to Neural Networks

The Story So Far

Where Neural Networks Are Today

Today's Roadmap

Part 1: The Paradigm Change

The Old Way: Hand-Crafted Features

The Problem with Hand-Crafted Features

The Neural Network Way: Learn EVERYTHING

Why This Matters: One Architecture, Many Problems

But How? Let's Build Up From What We Know

Part 2: The Perceptron

Inspiration: The Biological Neuron

The Perceptron (Rosenblatt, 1958)

A Neuron = Something You Already Know!

Building Logic Gates with Perceptrons

AND Gate

OR Gate

NOT Gate

What Does the Perceptron Actually Do?

Visualizing Decision Boundaries

Exercise: Try NAND Yourself!

Part 3: The XOR Problem

Now Try: XOR Gate

Why XOR is Impossible for One Neuron

The Minsky & Papert Crisis (1969)

Two Ways to Fix the XOR Problem

The Key Idea: Transform the Space

Notebook Time!

Part 4: The Multi-Layer Perceptron

The MLP: Input → Hidden → Output

Why "Hidden"?

But Wait — Why Do We Need Non-Linearity?

With vs Without Activation: Visual Proof

Activation Functions: Adding the "Bends"

Activation Functions: Visual Comparison

ReLU: The Modern Default

XOR Solved: Step Activations by Hand

What Just Happened?

Visualizing the Transformation

The Big Picture

Part 5: Forward Propagation

Forward Pass: The Big Idea

Worked Example: A Tiny Network

Worked Example: Step by Step

Worked Example: Output

How Many Parameters Does This Have?

Notebook Time!

Multi-Class: From 2 Classes to 10

Softmax: Visual

Softmax: A Worked Example

Summary of Part 1

What's Missing?

Break

Neural Networks

Part 2: Training Neural Networks

Recap: Where We Are

Part 6: The Learning Problem

Random Weights → Random Predictions

Loss Functions: Measuring "How Wrong"

The Goal: Minimize the Loss

Part 7: Backpropagation

The Gradient Question

The Computational Graph

Backprop Intuition: The Blame Game

Backprop: You Don't Need to Do It by Hand!

PyTorch Autograd: A Simple Demo

Notebook Time!

Part 8: The Training Loop

The Training Loop: 4 Steps, Repeat

The Training Loop in PyTorch

Understanding Each Line

Learning Rate: How Big a Step?

Mini-Batch: Not Too Much, Not Too Little

Epochs, Batches, Iterations

Watching Training Progress

Notebook Time!

Part 9: What Do Hidden Layers Learn?

Hidden Layers Learn a Hierarchy of Features