Primary textbook · Simon J. D. Prince, Understanding Deep Learning (MIT Press, 2023). Free PDF at udlbook.github.io.
The course follows the UDL chapter order closely. Read the cited chapter(s) before each lecture.
Lecture 0 is a probability / MLE primer; 24 main lectures follow. L6 (Regularization) is comprehensive — spans two class sessions.
Module 0 · Probability & MLE Primer

| # | Topic | Core reading | Key ideas |
|---|-------|--------------|-----------|
| 0 | Probability, MLE & NLL | Bishop & Bishop Ch 2–5; Prince Ch 1, Ch 3 | KL, MAP, reparameterization, score-function preview |
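As a warm-up for L0, a minimal NumPy sketch (illustrative, not from any of the readings) of the MLE/NLL connection: for Bernoulli data, the parameter that minimizes the negative log-likelihood is exactly the sample mean.

```python
import numpy as np

# Negative log-likelihood of Bernoulli data under parameter theta.
def bernoulli_nll(theta, x):
    return -np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

x = np.array([1, 0, 1, 1, 0, 1, 1, 0])        # observed coin flips
thetas = np.linspace(0.01, 0.99, 99)          # candidate parameters
nlls = [bernoulli_nll(t, x) for t in thetas]
theta_mle = thetas[np.argmin(nlls)]

print(theta_mle, x.mean())  # the NLL minimizer sits at the sample mean
```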
Module 1 · Foundations & Going Deep

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 1 | Why DL + MLP Recap | Ch 1 Introduction · Ch 3 Shallow networks | Bishop Ch 6 |
| 2 | UAT, ResNets & Initialization | Ch 4 Deep networks · Ch 7 Gradients & init · Ch 11 Residual networks | He et al. 2015 (ResNet) |
| 3 | Training Deep Networks in Practice | Ch 6 Fitting models (early) · Ch 8 Measuring performance | Karpathy makemore-1 |
Module 2 · Optimization

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 4 | SGD, Momentum, Nesterov | Ch 6 (SGD + momentum) | Sutskever et al. 2013 |
| 5 | Adam, AdamW, Schedules | Ch 6 (Adam) · Ch 7 (initialization revisited) | Kingma & Ba 2015; Loshchilov & Hutter 2017 |
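For L4–L5, an illustrative NumPy sketch (textbook update rules, not course code) of heavy-ball momentum and Adam, minimizing f(w) = w²/2 whose gradient is w:

```python
import numpy as np

def grad(w):
    return w  # gradient of f(w) = 0.5 * w**2

def sgd_momentum(w0, lr=0.1, beta=0.9, steps=200):
    w, v = w0, 0.0
    for _ in range(steps):
        v = beta * v + grad(w)   # accumulate a running direction
        w = w - lr * v           # step along the velocity
    return w

def adam(w0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=200):
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g        # first-moment estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment estimate
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Both end much closer to the minimum at 0 than where they started.
print(sgd_momentum(5.0), adam(5.0))
```

Note that Adam's per-coordinate step is roughly lr in magnitude regardless of gradient scale, which is why it can keep oscillating near an optimum while momentum SGD settles.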
Module 3 · Regularization

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 6 | Regularization (two sessions): classical penalties, data augmentation, dropout, normalization | Ch 9 Regularization · Ch 11 (BatchNorm) | Belkin et al. 2019 (double descent); Santurkar et al. 2018 (BatchNorm); Ba et al. 2016 (LayerNorm) |
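For the dropout session in L6, a sketch of the standard inverted-dropout formulation (assumed textbook form, not code from UDL): zero each activation with probability p at train time and rescale survivors by 1/(1−p), so the expected activation is unchanged and test time is the identity.

```python
import numpy as np

def dropout(x, p, rng, train=True):
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)       # rescale to preserve E[x]

rng = np.random.default_rng(0)
x = np.ones(10000)
y = dropout(x, p=0.5, rng=rng)
print(y.mean())  # close to 1.0: expectation is preserved
```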
Module 4 · CNNs & Visual Recognition

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 7 | CNN Deep Dive + Classic Architectures | Ch 10 Convolutional networks (early) | LeCun et al. 1998; Krizhevsky et al. 2012 |
| 8 | Modern CNNs & Transfer Learning | Ch 10 (advanced) · Ch 11 (skip connections in CNNs) | Szegedy et al. 2014; Howard et al. 2017 |
| 9 | Detection & Segmentation | Not in Prince — use Bishop Ch 10 + CS231n notes | Ren et al. 2015 (Faster R-CNN); Redmon et al. 2015 (YOLO); Ronneberger et al. 2015 (U-Net); SAM |
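For L9, a sketch of intersection-over-union (IoU), the box-matching metric underlying detectors like Faster R-CNN and YOLO (illustrative code, not from the cited papers). Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7: unit overlap, union of 7
```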
Module 5 · Sequence Models

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 10 | RNNs, LSTMs, GRUs | Bishop Ch 12 · d2l §9–10 | Hochreiter & Schmidhuber 1997 |
| 11 | Seq2Seq & Motivation for Attention | Bishop Ch 12 · d2l §10.6–10.8 | Sutskever et al. 2014 |
Module 7 · LLMs

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 15 | Large Language Models | Hoffmann et al. 2022 (Chinchilla); HuggingFace course Ch 1 | Karpathy, State of GPT; Touvron et al. 2023 (Llama 2); Su et al. 2021 (RoPE) |
| 16 | Alignment & Fine-tuning | HF PEFT docs; Ouyang et al. 2022 (InstructGPT); Rafailov et al. 2023 (DPO) | Hu et al. 2021 (LoRA); Dettmers et al. 2023 (QLoRA); Anthropic, Constitutional AI |
Module 8 · Self-Supervision & Vision-Language

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 17 | Self-Supervised & Contrastive Learning | Ch 14 Unsupervised learning (contrastive) | Chen et al. 2020 (SimCLR); Grill et al. 2020 (BYOL); He et al. 2021 (MAE); Oquab et al. 2023 (DINOv2) |
| 18 | Vision-Language Models | Ch 12 (ViT) + papers | Dosovitskiy et al. 2020 (ViT); Radford et al. 2021 (CLIP); Liu et al. 2023 (LLaVA); Alayrac et al. 2022 (Flamingo) |
Module 9 · Generative Models

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 19 | Autoencoders & VAEs | Ch 17 Variational autoencoders | Kingma & Welling 2013 |
| 20 | GANs | Ch 15 GANs | Goodfellow et al. 2014; Radford et al. 2015 (DCGAN); Arjovsky et al. 2017 (WGAN) |
| 21 | Diffusion Models — Theory | Ch 18 Diffusion models (early) | Ho et al. 2020 (DDPM); Song et al. 2020 (Score-SDE) |
| 22 | Diffusion Models — Practice | Ch 18 (later) + HF diffusers docs | Rombach et al. 2022 (Stable Diffusion); Ho & Salimans 2022 (CFG); Peebles & Xie 2023 (DiT) |
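For L21, the DDPM forward process (Ho et al. 2020) has a closed form: x_t = √(ᾱ_t)·x₀ + √(1−ᾱ_t)·ε with ᾱ_t the cumulative product of (1−β_t). A NumPy sketch with an illustrative linear schedule:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # illustrative variance schedule
alpha_bars = np.cumprod(1.0 - betas)      # cumulative signal retention

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in one shot via the closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

rng = np.random.default_rng(0)
x0 = np.full(100000, 2.0)                 # constant "image" of value 2
x_last = q_sample(x0, T - 1, rng)
print(x_last.mean(), x_last.std())        # near (0, 1): signal destroyed
```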
Module 10 · Wrap-up

| # | Topic | Core reading | Papers & extras |
|---|-------|--------------|-----------------|
| 23 | Efficient Inference: KV-cache, quantization, FlashAttention, distillation, speculative decoding | Chip Huyen blog; HF inference docs | Dao et al. 2022 (FlashAttention); Hinton et al. 2015 (distillation); Leviathan et al. 2022 (speculative decoding); Kwon et al. 2023 (vLLM) |
| 24 | Frontier: Agents, Reasoning, Interpretability + Course Recap | Curated blogs + papers | Yao et al. 2022 (ReAct); Wei et al. 2022 (CoT); OpenAI o1 blog; Anthropic interpretability blog; Elhage et al. 2021 (circuits) |
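For L23, a toy single-head NumPy sketch of KV caching (illustrative only; no projections, batching, or masking): during autoregressive decoding, keys and values of earlier positions are stored once and reused, so the new token attends against the cache instead of recomputing the whole prefix, and the output is identical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
d = 8
q_all = rng.standard_normal((5, d))   # per-position queries
k_all = rng.standard_normal((5, d))   # per-position keys
v_all = rng.standard_normal((5, d))   # per-position values

# Full recompute: attention for the last position over the whole prefix.
full = softmax(q_all[-1] @ k_all.T / np.sqrt(d)) @ v_all

# Incremental decode: each step only appends its K/V to the cache.
k_cache, v_cache = [], []
for t in range(5):
    k_cache.append(k_all[t])
    v_cache.append(v_all[t])
K, V = np.stack(k_cache), np.stack(v_cache)
cached = softmax(q_all[-1] @ K.T / np.sqrt(d)) @ V

print(np.allclose(full, cached))  # True: caching changes cost, not output
```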
Gaps in UDL coverage
- L9 · Detection & Segmentation — UDL does not cover object detection or segmentation. Use Bishop Ch 10 + CS231n notes.
- L15 · LLMs at scale — scaling laws, RoPE, GQA, distributed training. Use Chinchilla paper + HF course.
- L16 · Alignment & Fine-tuning — LoRA, RLHF, DPO. Use HF PEFT docs + DPO paper.
- L23 · Efficient Inference — KV-cache, quantization, FlashAttention. Blog posts + papers.
- L24 · Frontier · Agents, Reasoning, Interpretability — active 2024–26 research. Curated blogs + papers.
Other references
- Bishop & Bishop, Deep Learning: Foundations and Concepts (2024) — rigorous second opinion.
- Zhang et al., Dive into Deep Learning (d2l.ai) — hands-on PyTorch notebooks.
- Karpathy, Neural Networks: Zero to Hero (YouTube) — build-from-scratch video series; backup for L12–L14.
- Goodfellow, Bengio, Courville, Deep Learning (2016) — classical reference.