Category: pytorch
-

GPT From Scratch #7: Building a GPT
Self-attention is the heart of Transformers, the T of GPT. But there are a few additional critical parts of the Transformer architecture that actually make it shine.
-

GPT From Scratch #6: Coding Self Attention
This is where we get to understand the ~20 most important and impactful lines of code that started the gen AI revolution.
-

GPT From Scratch #3: The Bigram Model
The simplest model we can use to predict the next character is a bigram model. But if implemented as a neural net, its building blocks stay the same all the way up to GPT.
-

GPT From Scratch #2: The Training Set
Don’t underestimate the importance of building a proper training set. It is a critical part of the process, and in GPT’s case, a beautifully clever one as well.
-

GPT From Scratch #1: Intro
You probably use AI, but do you understand it? Get ready to dive into the internals of what started the (gen) AI revolution: GPT.
-

Deep Learning Gymnastics #4: Master Your (LLM) Cross Entropy
Use all the gymnastics tricks we’ve learned in order to master (LLM) cross-entropy in PyTorch and TensorFlow.
-

Deep Learning Gymnastics #3: Tensor (re)Shaping
Your tensors aren’t the right shape? Learn how to reshape, squeeze, and stack them like a deep learning gymnast.
-

Deep Learning Gymnastics #2: Tensor Indexing
Learn how smart indexing lets you build batches, embeddings, and masked ops efficiently in modern DL frameworks.
-

Deep Learning Gymnastics #1: Tensor Broadcasting
Master broadcasting like a pro and learn how a single trick can make your deep learning code faster, cleaner, and more elegant.
