Hadoop on Cloudera VM

To start the daemons, run
$ for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done

Then open
http://localhost:50070/dfshealth.jsp
(the NameNode status page) to see if the daemons are up
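
The same check can be scripted instead of done in a browser. This is only a minimal sketch: the helper name web_ui_up? and the 5-second timeout are my own choices, and an HTTP 2xx answer only shows that the NameNode web UI responds, not that every daemon is healthy.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: returns true if the URL answers with an HTTP 2xx
# status, false on any other status, connection error, or timeout.
def web_ui_up?(url, timeout = 5)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(uri.request_uri).is_a?(Net::HTTPSuccess)
  end
rescue StandardError
  # Connection refused, timeout, malformed URL: treat all as "not up".
  false
end
```

Calling web_ui_up?("http://localhost:50070/dfshealth.jsp") right after starting the services may still return false for a few seconds while the NameNode comes up.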

Hadoop is installed in
/usr/lib/hadoop-0.2x/

I then uploaded a file into HDFS
hadoop fs -mkdir input
hadoop fs -put /.../.../test.file input

and ran two local Ruby scripts as mapper and reducer on that input
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-CDH3B4.jar \
  -input input/* \
  -output output \
  -mapper /home/cloudera/mapper1.rb \
  -reducer /home/cloudera/reducer1.rb \
  -file /home/cloudera/mapper1.rb \
  -file /home/cloudera/reducer1.rb
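
For reference, here is roughly what a streaming mapper and reducer look like in Ruby. This is a hypothetical word count, not the actual mapper1.rb and reducer1.rb, and the method names map_line and reduce_lines are made up for the sketch. Streaming feeds each script lines on STDIN and expects tab-separated key/value lines on STDOUT; the reducer receives its input sorted by key.

```ruby
#!/usr/bin/env ruby
# Hypothetical word-count mapper and reducer for Hadoop Streaming.
# Both roles live in one file here; in practice they would be split
# into two scripts such as mapper1.rb and reducer1.rb.

# Mapper step: turn one input line into "word\t1" records.
def map_line(line)
  line.split.map { |word| "#{word}\t1" }
end

# Reducer step: sum the counts per key. Input is sorted by key, so all
# lines for one key are adjacent and a single running total suffices.
def reduce_lines(lines)
  results = []
  current = nil
  total = 0
  lines.each do |line|
    key, value = line.chomp.split("\t", 2)
    next if key.nil? # skip empty lines
    if key != current
      results << "#{current}\t#{total}" unless current.nil?
      current = key
      total = 0
    end
    total += value.to_i
  end
  results << "#{current}\t#{total}" unless current.nil?
  results
end

# Pick the role from the first argument; with no argument nothing runs.
case ARGV[0]
when "map"
  STDIN.each_line { |line| map_line(line).each { |rec| puts rec } }
when "reduce"
  reduce_lines(STDIN.each_line).each { |rec| puts rec }
end
```

Scripts passed to -mapper and -reducer need a shebang line and the executable bit (chmod +x), since the task nodes run them directly.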

Afterwards I looked at the result
hadoop fs -cat /user/cloudera/output/part-00000
