Why Hadoop?

Why is Hadoop the latest buzzword? Not only in the Valley but everywhere else. In Europe, CERN scientists are evaluating HDFS for storing and processing the huge volume of data (approximately 40 TB/day) generated by the LHC. In Asia, China Mobile, the world’s biggest telecom company, is using Hadoop to process huge amounts of data. And in Silicon Valley, every other web company is already using Hadoop (Yahoo!, Facebook, LinkedIn, Netflix, IBM, Twitter, Zynga, Amazon, and the list goes on); see the complete list on the Powered By page on the Hadoop wiki.

Last month Microsoft announced that it will support Hadoop on Azure (Microsoft’s cloud computing platform, which competes with Amazon EC2 and S3).

There can’t be any argument that Hadoop gained this momentum because it’s an open source project. And much of the credit goes to Yahoo! Although Hadoop was the brainchild of Doug Cutting (a former Yahoo! employee), Yahoo! spent tremendous resources making Hadoop what it is today. Yahoo! contributes more than 80% of Hadoop’s code.

Who said there is no free lunch?

Being an open source project, Hadoop offers the best price for everybody: FREE. Its very active community of developers and users is also a useful resource for newbies. The beauty of open source software is that not only are its license and distribution free, but its support is free too, provided by the user and developer community.

The Hadoop wiki has detailed documentation on Hadoop cluster setup and tutorials for MapReduce programming. You don’t have to pay anybody to set up your cluster or teach you how to write MapReduce programs.

Bend it like you want.

Open source = source code is available to everyone. As Hadoop is an Apache project, everybody can contribute to the source code and express an opinion on how things should be done. Because its source code is available to everyone, you can customize it to your needs, add your own functionality, and, if you want, contribute it back (unfortunately, there are some people out there who don’t contribute back to the community).

It’s the complete package

Hadoop comes as a complete package, and more is being added to it all the time. New applications are being built on top of Hadoop that take advantage of its scalability and durability. A NoSQL stack is already rising to solve problems that traditional SQL cannot. Hadoop supports related projects in this stack such as Pig, Hive, and HBase, which can be used for data mining, web mining, ETL, and BI.