Getting started with Hadoop

Getting started with the Hadoop ecosystem is quite different from getting your hands dirty with any other library. Here I plan to document a few important points from my learning for the benefit of others.

Local Installation

I’ve chosen to try the Hortonworks Sandbox for the following reasons. You can also install the sandbox provided by Cloudera (known as the Cloudera QuickStart VM) instead.

  • lets me try out a few additional packages like Hue, HCatalog, etc.
  • runs on 32-bit and 64-bit OS (Windows XP, Windows 7, Windows 8 and Mac OS X)


  • Minimum 4 GB RAM; 8 GB required to run Ambari and HBase
  • Virtualization enabled in the BIOS
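
As a quick sanity check before downloading anything, the requirements above can be written down as a small function. This is only a sketch: the function name `sandbox_capability` and the wording of its messages are my own, and the thresholds are simply the 4 GB and 8 GB figures from the list.

```python
def sandbox_capability(ram_gb: float, virtualization_enabled: bool) -> str:
    """Classify a host against the requirements listed above.

    Hypothetical helper: thresholds come from the list (4 GB minimum,
    8 GB for Ambari and HBase); everything else is illustrative.
    """
    if not virtualization_enabled:
        return "blocked: enable virtualization in the BIOS first"
    if ram_gb < 4:
        return "blocked: at least 4 GB RAM is needed"
    if ram_gb < 8:
        return "ok: sandbox runs, but not Ambari and HBase"
    return "ok: sandbox runs, including Ambari and HBase"


if __name__ == "__main__":
    # Example: a 6 GB laptop with virtualization already switched on.
    print(sandbox_capability(6, True))
```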

I had to re-image my laptop after I tried to enable virtualization in the BIOS, so make sure to back up all important data and proceed at your own discretion. Instructions for enabling virtualization are provided below.


Enable virtualization in BIOS

1. Press Esc during system start-up to bring up the start-up menu


2. Press F10 to enter BIOS and then select System Configuration

3. Select Device Configurations


4. Enable the two virtualization settings on that screen


5. Download Oracle VirtualBox from here
6. Download Hortonworks Sandbox VM from here
7. Follow the rest of the instructions from here


  • root/hadoop
  • hue/hadoop


  2. SSH: 2222
  3. SCP: 2222
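
To make the port numbers concrete, here is a minimal sketch that builds the matching `ssh` and `scp` command lines. It assumes the sandbox is reached through the host's loopback address (127.0.0.1), which is my assumption and not stated above, and that `root` from the credentials list is the login user. The helper names and the file paths in the example are hypothetical. Note that `ssh` takes the port as lowercase `-p` while `scp` takes uppercase `-P`.

```python
import shlex

SANDBOX_HOST = "127.0.0.1"  # assumption: sandbox ports are forwarded to localhost
SANDBOX_PORT = 2222         # SSH and SCP port from the list above


def ssh_command(user="root", host=SANDBOX_HOST, port=SANDBOX_PORT):
    """Return the argument list for an SSH login (ssh uses lowercase -p)."""
    return ["ssh", "-p", str(port), f"{user}@{host}"]


def scp_command(local_path, remote_path, user="root",
                host=SANDBOX_HOST, port=SANDBOX_PORT):
    """Return the argument list for copying a file in (scp uses uppercase -P)."""
    return ["scp", "-P", str(port), local_path, f"{user}@{host}:{remote_path}"]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is executed by accident.
    print(shlex.join(ssh_command()))
    print(shlex.join(scp_command("data.csv", "/tmp/data.csv")))
```

Either list can be passed to `subprocess.run` once the VM is up; you will be prompted for the password.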

