Build a Large Language Model from the ground up.
Seven notebooks. Pure PyTorch. No magic.

Character-Level Language Model
Learn next-token prediction by building a model that generates names
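As a taste of what this first notebook builds, here is a minimal sketch of character-level next-token prediction using a bigram count table; the tiny list of names is illustrative, the notebooks use a real dataset.

```python
import torch

# Toy corpus of names (illustrative only)
names = ["emma", "olivia", "ava", "mia"]

# Character vocabulary with '.' as a combined start/end token
chars = sorted(set("".join(names)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}

# Count bigram transitions: counts[i, j] = times char j follows char i
counts = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for name in names:
    seq = ["."] + list(name) + ["."]
    for c1, c2 in zip(seq, seq[1:]):
        counts[stoi[c1], stoi[c2]] += 1

# Normalize each row into a next-character distribution
probs = counts.float()
probs /= probs.sum(dim=1, keepdim=True)

# Sample a new name: start at '.', draw next chars until '.' again
g = torch.Generator().manual_seed(42)
ix = 0
out = []
while True:
    ix = torch.multinomial(probs[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```

A neural model replaces the count table with learned logits, but the sampling loop stays the same.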
Scale up to literature with temperature sampling and creative text generation
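Temperature sampling amounts to dividing the logits by a temperature T before the softmax; a sketch (the function name is ours, not the notebook's):

```python
import torch

def sample_with_temperature(logits: torch.Tensor, temperature: float,
                            generator=None) -> int:
    # T < 1 sharpens the distribution (safer text),
    # T > 1 flattens it (more creative, more mistakes)
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1, generator=generator).item()

logits = torch.tensor([2.0, 1.0, 0.1])
low_t = torch.softmax(logits / 0.5, dim=-1)   # peaked on the top token
high_t = torch.softmax(logits / 2.0, dim=-1)  # closer to uniform
idx = sample_with_temperature(logits, 1.0)
```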
Implement Byte-Pair Encoding to understand subword tokenization
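The core BPE training loop is short: repeatedly count adjacent symbol pairs and merge the most frequent one. A sketch on the classic toy corpus (words stored as space-separated symbols with a `</w>` end marker; the string `replace` is fine here but can over-merge overlapping pairs in edge cases):

```python
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs across the corpus (word -> frequency)."""
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, words):
    """Replace every occurrence of the pair with its merged symbol."""
    a, b = pair
    return {word.replace(f"{a} {b}", f"{a}{b}"): freq
            for word, freq in words.items()}

words = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}
merges = []
for _ in range(5):
    pairs = get_pair_counts(words)
    best = max(pairs, key=pairs.get)  # most frequent pair wins
    words = merge_pair(best, words)
    merges.append(best)
```

After a few merges the frequent suffix "est" becomes a single subword token, which is the whole point: common fragments get short codes.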
Build the transformer architecture: self-attention, multi-head attention, positional encoding
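The heart of the architecture is scaled dot-product attention with a causal mask so each position can only attend to earlier tokens. A minimal single-head sketch:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, causal=True):
    """q, k, v: (batch, seq_len, d_k). Returns (batch, seq_len, d_k)."""
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (B, T, T)
    if causal:
        T = q.size(1)
        # Mask out the upper triangle: no attention to future tokens
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v  # weighted average of values

x = torch.randn(2, 4, 8)
out = scaled_dot_product_attention(x, x, x)
```

Multi-head attention runs several of these in parallel on learned projections of the input and concatenates the results.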
Transform your base model into an instruction-following assistant with SFT
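The key trick in supervised fine-tuning is computing the next-token loss only on the response: prompt positions get the label -100, which PyTorch's cross-entropy ignores. A sketch with made-up token ids and random logits standing in for a model:

```python
import torch
import torch.nn.functional as F

vocab_size = 10
prompt = torch.tensor([1, 2, 3])    # instruction tokens (hypothetical ids)
response = torch.tensor([4, 5, 6])  # target completion

input_ids = torch.cat([prompt, response])
labels = input_ids.clone()
labels[: len(prompt)] = -100        # mask the prompt out of the loss

# Stand-in for model output; a real run uses the transformer's logits
logits = torch.randn(len(input_ids), vocab_size, requires_grad=True)

# Shift by one: logits at position t predict the token at t+1
loss = F.cross_entropy(logits[:-1], labels[1:], ignore_index=-100)
loss.backward()
```

Positions whose target is masked contribute no gradient, so the model is only pushed to imitate the assistant's reply, not the user's prompt.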
Implement Direct Preference Optimization for human preference alignment
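The DPO objective itself fits in a few lines: it rewards the policy for raising the log-probability of the preferred response relative to a frozen reference model. A sketch with hypothetical sequence log-probabilities:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """-log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Hypothetical log-probs for a batch of 2 preference pairs
loss = dpo_loss(torch.tensor([-3.0, -2.5]), torch.tensor([-4.0, -5.0]),
                torch.tensor([-3.5, -3.0]), torch.tensor([-3.8, -4.5]))
```

Unlike RLHF, no reward model or sampling loop is needed; the preference data enters the loss directly.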
Train a model to generate working Python code with automated testing
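Automated testing of generated code boils down to executing the candidate and its unit tests and checking for failures. A simplified sketch (plain `exec` in-process; a real pipeline would sandbox and time-limit this):

```python
def passes_tests(candidate_src: str, test_src: str) -> bool:
    """Run candidate code, then its tests, in a fresh namespace."""
    namespace = {}
    try:
        exec(candidate_src, namespace)  # define the generated function
        exec(test_src, namespace)       # raise AssertionError on failure
        return True
    except Exception:
        return False

good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = "assert add(2, 3) == 5"
```

The pass/fail signal can then filter samples or serve as a training reward.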
git clone https://github.com/nipunbatra/llm-from-scratch.git
cd llm-from-scratch
pip install torch matplotlib jupyter
Built by Nipun Batra