Adventures in Data

Hadoop Cookbook – 3, How to build your own Hadoop distribution.

Problem: You want to build your own Hadoop distribution.

Often you need a particular feature that has been added through a patch, or that is still in trunk and not yet available in any Hadoop release. In such cases you can build and distribute your own Hadoop distribution.

Solution: You can build your own Hadoop distribution by following the steps below.

1. Check out the source of the latest release (let's say we want to work on the Hadoop 0.20 branch)

  > svn checkout \
  http://svn.apache.org/repos/asf/hadoop/common/tags/release-X.Y.Z/ hadoop-common-X.Y.Z

2. Download the required patch

3. Apply the required patch

  > patch -p0 -E < /path/to/patch
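
Because step 3 modifies the working copy in place, it can be worth checking first that the patch applies cleanly. The following is only a minimal sketch: the function name apply_patch_checked is made up for illustration, and it assumes a patch implementation that supports the --dry-run option (GNU patch does).

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): apply a patch only if a dry run
# succeeds. Assumes the installed patch supports --dry-run, as GNU patch does.
apply_patch_checked() {
    patch_file=$1
    # First pass: check that every hunk applies, without modifying any file.
    if patch -p0 -E --dry-run < "$patch_file" >/dev/null; then
        # Second pass: apply for real, with the same options as in step 3.
        patch -p0 -E < "$patch_file"
    else
        echo "patch does not apply cleanly: $patch_file" >&2
        return 1
    fi
}
```

Run it from the top of the checked-out source tree, for example: apply_patch_checked /path/to/patch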

4. Test the patch

 ant \
  -Dpatch.file=/path/to/my.patch \
  -Dforrest.home=/path/to/forrest/ \
  -Dfindbugs.home=/path/to/findbugs \
  -Dscratch.dir=/path/to/a/temp/dir \
  -Dsvn.cmd=/path/to/subversion/bin/svn \
  -Dgrep.cmd=/path/to/grep \
  -Dpatch.cmd=/path/to/patch \
  test-patch

The -Dscratch.dir, -Dsvn.cmd, -Dgrep.cmd and -Dpatch.cmd properties are optional and can be left out.
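
Since several of the test-patch properties are optional, one way to keep the command manageable is to assemble it from environment variables. This is a rough sketch rather than anything the Hadoop build provides: the function name and the variable names (PATCH_FILE, FORREST_HOME, FINDBUGS_HOME, SCRATCH_DIR, SVN_CMD, GREP_CMD, PATCH_CMD) are assumptions chosen for illustration. It only prints the command line so that you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print the ant test-patch command.
# All variable names below are illustrative; the Hadoop build does not read them.
test_patch_command() {
    # Required properties: stop with an error message if any is unset or empty.
    set -- ant \
        "-Dpatch.file=${PATCH_FILE:?PATCH_FILE is required}" \
        "-Dforrest.home=${FORREST_HOME:?FORREST_HOME is required}" \
        "-Dfindbugs.home=${FINDBUGS_HOME:?FINDBUGS_HOME is required}"
    # Optional properties: appended only when the matching variable is non-empty.
    if [ -n "${SCRATCH_DIR:-}" ]; then set -- "$@" "-Dscratch.dir=$SCRATCH_DIR"; fi
    if [ -n "${SVN_CMD:-}" ]; then set -- "$@" "-Dsvn.cmd=$SVN_CMD"; fi
    if [ -n "${GREP_CMD:-}" ]; then set -- "$@" "-Dgrep.cmd=$GREP_CMD"; fi
    if [ -n "${PATCH_CMD:-}" ]; then set -- "$@" "-Dpatch.cmd=$PATCH_CMD"; fi
    set -- "$@" test-patch
    # Print the assembled command on one line, arguments separated by spaces.
    printf '%s\n' "$*"
}
```

With only the three required variables set, the printed line contains just those three properties followed by test-patch.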

5. Build the Hadoop binary with documentation

 ant -Djava5.home=$Java5Home -Dforrest.home=/path_to/apache-forrest \
  -Dfindbugs.home=/path_to/findbugs/latest compile-core tar

Successful completion of the above command creates a Hadoop tarball that can be used as your Hadoop distribution.
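
Before distributing the result, a quick sanity check of the tarball can help. The sketch below makes two assumptions that the steps above do not confirm: that the archive is named hadoop-*.tar.gz, and that it sits somewhere under the directory you pass in. The function name check_dist_tarball is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): find the built tarball and check
# that it is a readable gzip-compressed tar archive.
# Assumption: the archive is named hadoop-*.tar.gz and lives under $1.
check_dist_tarball() {
    search_dir=${1:-.}
    # Take the first match; a clean build tree should contain only one.
    tarball=$(find "$search_dir" -type f -name 'hadoop-*.tar.gz' | head -n 1)
    if [ -z "$tarball" ]; then
        echo "no hadoop-*.tar.gz found under $search_dir" >&2
        return 1
    fi
    # Listing the archive reads it through, so a corrupt file fails here.
    if tar -tzf "$tarball" >/dev/null; then
        echo "$tarball"
    else
        echo "unreadable archive: $tarball" >&2
        return 1
    fi
}
```

Called from the top of the source tree as check_dist_tarball . it prints the path of the archive it found, or returns a non-zero status if none is found or the archive cannot be read.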

Written by Ravi

May 27, 2010 at 1:18 am

Posted in Hadoop cookbook

