Philippe Adjiman's blog

The Hadoop Tutorial Series

[Note: this was written in 2009 (!), when Hadoop was just starting to get popular, so it is already “very” old, but I am keeping this series live as an archive and souvenir of the early days of the “big data” era.]

In this series, you’ll find a progressive set of tutorials on the Apache Hadoop project, written as I went along (images are broken and will be restored soon).

Hadoop Tutorial Series, Issue #4: To Use Or Not To Use A Combiner

Explains when Hadoop Combiners help (or hurt) performance and correctness, with code‑level guidance.

Hadoop Tutorial Series, Issue #3: Counters In Action

Shows how to instrument MapReduce jobs with Hadoop Counters to track custom metrics during large‑scale processing.

Hadoop Tutorial Series, Issue #2: Getting Started With (Customized) Partitioning

Teaches key partitioning patterns (e.g., partial sorts to specific reducers) to control data flow in MapReduce jobs.

Hadoop Tutorial Series, Issue #1: Setting Up Your MapReduce Learning Playground

Step‑by‑step setup of a Cloudera VM + Maven project so you can quickly experiment with Hadoop wordcount and beyond.
