Getting started with Hadoop

Getting started with the Hadoop ecosystem is quite different from getting your hands dirty with an ordinary library. Here I document a few important points from my own learning, for the benefit of others.

Local Installation

You can install the sandbox provided by Cloudera (known as the Cloudera QuickStart VM) instead, but I chose the Hortonworks Sandbox for the following reasons:

  • it lets me try out a few additional packages such as Hue and HCatalog
  • it runs on 32-bit and 64-bit operating systems (Windows XP, Windows 7, Windows 8 and Mac OS X)

Prerequisites

  • Minimum 4 GB RAM; 8 GB required to run Ambari and HBase
  • Virtualization enabled in the BIOS
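
The RAM thresholds above can be expressed as a small check. This is only a sketch: the function name and the returned labels are made up for illustration, and the amount of RAM has to be supplied by you (look it up in your operating system's system information).

```python
def sandbox_capability(ram_gb: float) -> str:
    """Classify a host by installed RAM, using the thresholds listed above.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    if ram_gb < 4:
        # Below the stated minimum for running the sandbox at all.
        return "insufficient"
    if ram_gb < 8:
        # Enough for the sandbox, but not for Ambari and HBase.
        return "sandbox-only"
    # 8 GB or more: Ambari and HBase can be run as well.
    return "sandbox-with-ambari-hbase"


if __name__ == "__main__":
    for ram in (2, 4, 6, 8, 16):
        print(f"{ram} GB -> {sandbox_capability(ram)}")
```

The check says nothing about the second prerequisite; whether virtualization is enabled still has to be verified in the BIOS itself.
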
NOTE

I had to re-image my laptop after I tried to enable virtualization in the BIOS, so back up all important data first and proceed at your own discretion. Instructions for enabling virtualization are provided below.

Installation

Enable virtualization in BIOS

1. Press Esc during system start-up to bring up the screen below

screen1

2. Press F10 to enter BIOS and then select System Configuration

screen2

3. Select Device Configurations

screen3

4. Enable the two settings shown below

screen4

5. Download Oracle VirtualBox from here
6. Download Hortonworks Sandbox VM from here
7. Follow the rest of the instructions from here

Credentials

  • root/hadoop
  • hue/hadoop

Connectivity

  1. Browser: http://127.0.0.1:8888
  2. SSH: 127.0.0.1, port 2222
  3. SCP: 127.0.0.1, port 2222
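
Once the VM is running, a quick way to confirm that these endpoints respond is to attempt a TCP connection to each port. The following is a minimal sketch using only the Python standard library. The helper names (`is_port_open`, `ssh_command`, `scp_command`) are my own, and it assumes that `root/hadoop` above means user `root` with password `hadoop`, which the `ssh` or `scp` client will prompt for.

```python
import socket

# Host/port pairs taken from the Connectivity list; the labels are illustrative.
SANDBOX_ENDPOINTS = {
    "browser": ("127.0.0.1", 8888),
    "ssh/scp": ("127.0.0.1", 2222),
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def ssh_command(user: str = "root", host: str = "127.0.0.1", port: int = 2222) -> list:
    """Build the argument list for an SSH login (ssh takes the port with -p)."""
    return ["ssh", "-p", str(port), f"{user}@{host}"]


def scp_command(local_path: str, remote_path: str, user: str = "root",
                host: str = "127.0.0.1", port: int = 2222) -> list:
    """Build the argument list for copying a file into the VM (scp takes the port with -P)."""
    return ["scp", "-P", str(port), local_path, f"{user}@{host}:{remote_path}"]


if __name__ == "__main__":
    for label, (host, port) in SANDBOX_ENDPOINTS.items():
        state = "reachable" if is_port_open(host, port) else "not reachable"
        print(f"{label} ({host}:{port}): {state}")
    # Print the commands instead of running them, so nothing happens by surprise.
    print(" ".join(ssh_command()))
    print(" ".join(scp_command("data.csv", "/tmp/data.csv")))
```

If both ports show as not reachable, the VM is probably still booting or its port forwarding is not set up. The file names passed to `scp_command` are placeholders.
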
