[Note: this was written in 2009 (!), when Hadoop was just starting to get popular, so it is already “very” old, but the series is kept live as an archive and souvenir of the early days of the “big data” era.]
In this series, you’ll find a progressive set of tutorials on the Apache Hadoop project, written along the way (images are broken and will be restored soon).
-
Hadoop Tutorial Series, Issue #4: To Use Or Not To Use A Combiner
Explains when Hadoop Combiners help (or hurt) performance and correctness, with code‑level guidance.
-
Hadoop Tutorial Series, Issue #3: Counters In Action
Shows how to instrument MapReduce jobs with Hadoop Counters to track custom metrics during large‑scale processing.
-
Hadoop Tutorial Series, Issue #2: Getting Started With (Customized) Partitioning
Teaches key partitioning patterns (e.g., partial sorts to specific reducers) to control data flow in MapReduce jobs.
-
Hadoop Tutorial Series, Issue #1: Setting Up Your MapReduce Learning Playground
Step‑by‑step setup of a Cloudera VM + Maven project so you can quickly experiment with Hadoop wordcount and beyond.




