Philippe Adjiman's blog

November 22, 2025

GPT From Scratch #7: Building a GPT

Self-attention is the heart of Transformers, the T of GPT. But there are a few additional critical parts of the Transformer architecture that actually made it shine.
November 19, 2025

GPT From Scratch #6: Coding Self Attention

This is where we get to understand the ~20 most important and impactful lines of code that started the gen AI revolution.
November 15, 2025

GPT From Scratch #5: Positional Encodings

In this post, we’ll show how to give the neural net a notion of each token’s position. Simple but powerful.
November 14, 2025

GPT From Scratch #4: The Mathematical Trick Behind Self Attention

One simple mathematical trick: the cleverest matrix multiplication of the gen AI revolution, and what enabled ultra-fast self-attention.
November 10, 2025

GPT From Scratch #3: The Bigram Model

The simplest model we can use to predict the next character is a Bigram Model. But once it is implemented as a neural net, its building blocks stay the same all the way up to GPT.
November 6, 2025

GPT From Scratch #2: The Training Set

Don’t underestimate the importance of building a proper training set. It is a critical part of the process, and in GPT’s case, a beautifully clever one as well.
October 24, 2025

GPT From Scratch #1: Intro

You probably use AI, but do you understand it? Get ready to dive into the internals of what started the (gen) AI revolution: GPT.
November 28, 2024

Decoding Transformers: The Neural Nets Behind LLMs and More

Dive into the magic of self-attention and learn why Transformers became the backbone of every cutting-edge genAI model.
March 9, 2024

Deep Learning Gymnastics #4: Master Your (LLM) Cross Entropy

Use all the gymnastics tricks we’ve learned in order to master (LLM) cross-entropy in PyTorch and TensorFlow.
February 3, 2024

Deep Learning Gymnastics #3: Tensor (re)Shaping

Your tensors aren’t the right shape? Learn how to reshape, squeeze, and stack them like a deep learning gymnast.
December 23, 2023

Deep Learning Gymnastics #2: Tensor Indexing

Learn how smart indexing lets you build batches, embeddings, and masked ops efficiently in modern DL frameworks.
July 16, 2023

Deep Learning Gymnastics #1: Tensor Broadcasting

Master broadcasting like a pro and learn how a single trick can make your deep learning code faster, cleaner, and more elegant.
November 3, 2018

Visualising SGD with Momentum, Adam and Learning Rate Annealing

Watch optimizers battle it out in a visual showdown: Momentum vs Adam vs LR schedules, explained with intuition and flair.
April 3, 2018

Deep Dive Into Logistic Regression: Part 3

In this third and last post of this series, we present the use of a very effective and powerful library to build logistic regression models (among others) in practice: Vowpal Wabbit.
February 26, 2018

Deep Dive Into Logistic Regression: Part 2

Want to know how to implement Stochastic Gradient Descent for logistic regression that can learn millions of parameters with a tiny memory footprint, using the hashing trick and per-coordinate adaptive learning rates? This post is for you.
December 9, 2017

Deep Dive Into Logistic Regression: Part 1

Learn the fundamental theory behind logistic regression.
September 12, 2013

A Data Science Exploration From the Titanic in R

Step aboard the Titanic dataset: Explore, feature-engineer, and model your way to survival predictions with style.
December 30, 2010

How To Easily Build And Observe TF-IDF Weight Vectors With Lucene And Mahout

Want to peek inside TF-IDF weights? Here’s a quick way to build and analyze them without the headache.
February 6, 2010

What Are The 10 Most Cited Websites On Twitter When Tweeting About Hot Trends?

Scrapes and analyzes tweets around Google Hot Trends to see which domains dominate the conversation.
January 14, 2010

Hadoop Tutorial Series, Issue #4: To Use Or Not To Use A Combiner

Explains when Hadoop Combiners help (or hurt) performance and correctness, with code‑level guidance.
January 7, 2010

Hadoop Tutorial Series, Issue #3: Counters In Action

Shows how to instrument MapReduce jobs with Hadoop Counters to track custom metrics during large‑scale processing.
January 6, 2010

How To Build A Relevant Real Time Search Engine Prototype In Few Hundreds Lines Of Code

A hands‑on blueprint for a lightweight, low‑latency (toy) search engine that ingests and surfaces fresh content fast.
December 20, 2009

Hadoop Tutorial Series, Issue #2: Getting Started With (Customized) Partitioning

Teaches key partitioning patterns (e.g., partial sorts to specific reducers) to control data flow in MapReduce jobs.
December 7, 2009

Hadoop Tutorial Series, Issue #1: Setting Up Your MapReduce Learning Playground

Step‑by‑step setup of a Cloudera VM + Maven project so you can quickly experiment with Hadoop wordcount and beyond.
November 11, 2009

Flexible Collaborative Filtering In JAVA With Mahout Taste

Rapid prototyping approach to a recommendation engine using Mahout Taste’s pluggable similarity and scoring components.
November 2, 2009

Writing A Token N-Grams Analyzer In Few Lines Of Code Using Lucene

Leverages Lucene analyzers to emit token n‑grams for downstream text mining or search tasks with minimal Java.