Hadoop Cookbook – 3, How to build your own Hadoop distribution.

Problem : You want to build your own Hadoop distribution.

Often you need particular feature added through patch in your Hadoop build and it’s still in trunk and  not available in Hadoop releases . In such cases you can build and distribute your own Hadoop distribution.

Solution: You can build your own version of Hadoop distribution by following steps given below.

1. Checkout latest released branch (lets say we want to work on Hadoop 0.20 branch)

  > svn checkout \
  http://svn.apache.org/repos/asf/hadoop/common/tags/release-X.Y.Z/ hadoop-common-X.Y.Z

2. Download required patch

3. Apply required patch -> patch -p0 -E < /path/to/patch

4. Test patch

 ant \
  -Dpatch.file=/patch/to/my.patch \
  -Dforrest.home=/path/to/forrest/ \
  -Dfindbugs.home=/path/to/findbugs \
  -Dscratch.dir=/path/to/a/temp/dir \ (optional)
  -Dsvn.cmd=/path/to/subversion/bin/svn \ (optional)
  -Dgrep.cmd=/path/to/grep \ (optional)
  -Dpatch.cmd=/path/to/patch \ (optional)

5. Build Hadoop binary with documentation
ant -Djava5.home=$Java5Home -Dforrest.home=/path_to/apache-forrest
-Dfindbugs.home=/path_to/findbugs/latest compile-core tar

Successful completion of above command will create hadoop tar which can be used as hadoop distribution.

One thought on “Hadoop Cookbook – 3, How to build your own Hadoop distribution.

Comments are closed.