Hadoop installation on Fedora 20

  1. Hadoop download
    http://mirror.hosting90.cz/apache/hadoop/common/current/
  2. Java JDK SE download
    http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
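    A rough sketch of the download and install steps from points 1 and 2 (the archive and RPM names below are examples only, pick whatever is actually listed on the mirror and the Oracle page):
    $ wget http://mirror.hosting90.cz/apache/hadoop/common/current/hadoop-2.5.1.tar.gz   # use the tarball the mirror really offers
    $ tar -xzf hadoop-2.5.1.tar.gz -C /usr/local
    $ ln -s /usr/local/hadoop-2.5.1 /usr/local/hadoop
    $ rpm -ivh jdk-7u67-linux-x64.rpm                                                    # the Oracle JDK RPM from the link above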
  3. Install guide for reference
    http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
    http://accretioninfinity.wordpress.com/2013/06/11/installing-hadoop-for-fedora-oracle-linuxsingle-node-cluster/
  4. After starting DFS, check that the web interface is responding at http://localhost:50070/dfshealth.html#tab-overview
    $ sbin/start-dfs.sh
    [screenshot hadoopgui2: NameNode web UI]
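    A quick way to verify from the shell that the HDFS daemons and the web UI came up (jps ships with the JDK; 50070 is the default NameNode HTTP port in Hadoop 2.x):
    $ jps                                            # should list NameNode, DataNode, SecondaryNameNode
    $ curl -I http://localhost:50070/dfshealth.html  # expect an HTTP 200 response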
  5. In step 5 of the guide there seems to be a mistake in the following line:
    [root@drash hadoop]# bin/hdfs dfs -put etc/hadoop ../input
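    Judging from the official guide, what was probably meant is to create your user directory in HDFS first and then put into a relative input path, something like:
    [root@drash hadoop]# bin/hdfs dfs -mkdir -p /user/root
    [root@drash hadoop]# bin/hdfs dfs -put etc/hadoop input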
  6. The main config files I found in /usr/local/hadoop/etc/hadoop:
    [hduser@drash hadoop]$ ls -ltr *.xml
    -rw-r--r-- 1 hduser hduser 9257 Jun 21 08:05 hadoop-policy.xml
    -rw-r--r-- 1 hduser hduser  620 Jun 21 08:05 httpfs-site.xml
    -rw-r--r-- 1 hduser hduser 3589 Jun 21 08:05 capacity-scheduler.xml
    -rw-r--r-- 1 hduser hduser  867 Oct 12 08:22 hdfs-site.xml
    -rw-r--r-- 1 hduser hduser  754 Oct 12 08:29 yarn-site.xml
    -rw-r--r-- 1 hduser hduser  221 Oct 12 20:48 mapred-site.xml
    -rw-r--r-- 1 hduser hduser  243 Oct 12 20:48 core-site.xml
    [hduser@drash hadoop]$
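    For reference, a minimal single-node sketch of the two key settings (the property names are the standard Hadoop 2.x ones; the IP address follows the advice in point 8, and a replication factor of 1 is only meant for a single node):
    core-site.xml:
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.1.106:9000</value>
      </property>
    hdfs-site.xml:
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>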
  7. Change permissions to 755 and adjust the env settings, also in /usr/local/hadoop/etc/hadoop
    [root@drash hadoop]# ll *.sh
    -rwxr-xr-x 1 hduser hduser 3511 Oct 12 08:20 hadoop-env.sh
    -rwxr-xr-x 1 hduser hduser 1449 Jun 21 08:05 httpfs-env.sh
    -rwxr-xr-x 1 hduser hduser 1383 Jun 21 08:05 mapred-env.sh
    -rwxr-xr-x 1 hduser hduser 4603 Oct 12 20:56 yarn-env.sh
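    The permission and environment change boils down to roughly this (the JDK path is an example, point JAVA_HOME at your own install):
    [root@drash hadoop]# chmod 755 *.sh
    [root@drash hadoop]# sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/latest|' hadoop-env.sh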
  8. In case of Connection refused error check http://wiki.apache.org/hadoop/ConnectionRefused
    In all configuration files DO NOT list localhost, 0.0.0.0 or 127.0.0.1 but your IP address, like 192.168.1.106
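    A quick check that nothing in the configs still points to a loopback address (the IP is an example, use your own):
    [root@drash hadoop]# grep -nE 'localhost|127\.0\.0\.1|0\.0\.0\.0' *.xml   # should print nothing
    [root@drash hadoop]# grep -n 192.168.1.106 core-site.xml yarn-site.xml    # your real IP should show up here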
  9. Starting YARN will enable the admin port http://localhost:8042/node
    [screenshot hadoopgui: NodeManager web UI]
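    Roughly, starting YARN and checking the NodeManager port:
    $ sbin/start-yarn.sh
    $ jps                                   # ResourceManager and NodeManager should now appear
    $ curl -I http://localhost:8042/node    # expect an HTTP 200 response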
  10. Start each daemon separately and check the result in /usr/local/hadoop/logs
    [root@drash logs]# ls -ltr | tail -5
    -rw-r--r-- 1 root root  40198 Oct 12 21:05 hadoop-root-secondarynamenode-drash.log
    -rw-r--r-- 1 root root    702 Oct 12 21:06 yarn-root-resourcemanager-drash.out
    -rw-r--r-- 1 root root 103382 Oct 12 21:07 yarn-root-resourcemanager-drash.log
    -rw-r--r-- 1 root root  51453 Oct 12 21:07 yarn-root-nodemanager-drash.log
    -rw-r--r-- 1 root root   2062 Oct 12 21:08 yarn-root-nodemanager-drash.out
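    One way to start the daemons one by one with the per-daemon helper scripts in sbin/ and watch the logs as they come up (run from /usr/local/hadoop):
    [root@drash hadoop]# sbin/hadoop-daemon.sh start namenode
    [root@drash hadoop]# sbin/hadoop-daemon.sh start datanode
    [root@drash hadoop]# sbin/hadoop-daemon.sh start secondarynamenode
    [root@drash hadoop]# sbin/yarn-daemon.sh start resourcemanager
    [root@drash hadoop]# sbin/yarn-daemon.sh start nodemanager
    [root@drash hadoop]# tail -f logs/*.log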
  11. Port 9000 should be in LISTEN state – check also your Firewall configuration
    [root@drash hadoop]# netstat -an | grep 9000
    tcp 0 0 192.168.1.106:9000 0.0.0.0:* LISTEN
    tcp 0 0 192.168.1.106:9000 192.168.1.106:49996 ESTABLISHED
    tcp 0 0 192.168.1.106:49996 192.168.1.106:9000 ESTABLISHED
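    Fedora 20 uses firewalld, so a sketch of opening the ports used in this guide would be (open only what you actually need):
    [root@drash ~]# firewall-cmd --permanent --add-port=9000/tcp
    [root@drash ~]# firewall-cmd --permanent --add-port=50070/tcp
    [root@drash ~]# firewall-cmd --permanent --add-port=8042/tcp
    [root@drash ~]# firewall-cmd --reload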
  12. You should now be able to play with Hadoop and HDFS
    [root@drash bin]# ./hdfs fsck -list-corruptfileblocks
    Connecting to namenode via http://localhost:50070
    The filesystem under path '/' has 0 CORRUPT files
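    A few basic HDFS commands to try out (the paths are just examples):
    [root@drash bin]# ./hdfs dfs -mkdir -p /user/root/test
    [root@drash bin]# ./hdfs dfs -put /etc/hosts /user/root/test/
    [root@drash bin]# ./hdfs dfs -ls /user/root/test
    [root@drash bin]# ./hdfs dfs -cat /user/root/test/hosts
    [root@drash bin]# ./hdfs dfsadmin -report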
  13. Reference links
    http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
    http://docs.sigmoidanalytics.com/index.php/Installing_Hadoop_1.0.4_on_Fedora_19_with_spark_0.8.0

Jan D.