
Create a Multi-Node Cluster for a Distributed Hadoop Environment

Software specifications
Ubuntu 18.04
Hadoop 2.5.0 (Cloudera CDH 5.3.2)
Java 11 (Ubuntu's default-jdk)

STEPS TO SET UP THE CLUSTER

Step 1: Create a common user, for example anju (on all nodes)

Code
sudo useradd -s /bin/bash -d /home/anju -m -G sudo anju
sudo passwd anju
(Set a password and log in to the system as this user.)

****************************************************************
Step 2: Edit the /etc/hosts and /etc/hostname files (on all nodes)
Code
sudo nano /etc/hosts
sudo nano /etc/hostname

#Note the IP addresses of the master and slaves, and add them to /etc/hosts on every
#node in the form "IP hostname":

192.168.75.93 master
192.168.75.85 slave1
192.168.75.156 slave2

#(Remove or comment out the loopback entry for the machine's own hostname, e.g. the 127.0.1.1 line.)
In /etc/hostname, set each machine's name to master, slave1, or slave2 respectively.
(Restart the system.)
****************************************************************
Step 3: Set up the SSH connections (on all nodes)
Code
sudo apt-get install openssh-server
ssh-keygen -b 4096
--------------------------------------------------------------
ssh-copy-id 192.168.75.156
ssh-copy-id 192.168.75.93
ssh-copy-id 192.168.75.85
-----------------------------------------------------------
#To connect to the other machines:
ssh 192.168.75.85
ssh 192.168.75.156 (Use Ctrl+Shift+T to open multiple terminal tabs.)

#Steps to enable root login for the SSH connection:

cd /etc/ssh
ls
sudo nano sshd_config
#Change PermitRootLogin to yes and remove the #(comment) before it.
sudo service sshd restart
#This restarts the SSH service and allows you to use scp.
****************************************************************

Step 4: Install Java (on all nodes)


Code
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install default-jdk
java -version
jps
update-alternatives --config java (Copy the Java installation path for later use.)
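
If you want the JDK path non-interactively (it is reused as JAVA_HOME in Step 6), a minimal sketch; the path shown is only an example of what Ubuntu's default-jdk typically resolves to:

Code
readlink -f "$(which java)" | sed 's|/bin/java||'
#Example output (assumed, varies by machine): /usr/lib/jvm/java-11-openjdk-amd64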
****************************************************************
Step 5: Install Hadoop (on all nodes)
#Go to the user's directory inside the home folder.
#Download the tar file.
#tar is the archiving utility: x = extract files from the archive, v = verbose (list files as
#they are processed), f = file (operate on the named file), z = decompress through gzip
#(used when the file ends in .tar.gz instead of just .tar).

Code
sudo wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz
sudo tar -xzvf hadoop-2.5.0-cdh5.3.2.tar.gz
sudo mv hadoop-2.5.0-cdh5.3.2 hadoop
ls
****************************************************************
Step 6: Configure Hadoop (on the master node only)
sudo nano ~/.bashrc
source ~/.bashrc
#(Copy the environment-variable content from the reference website and change the
#Hadoop path prefix to match your username, "anju". A sketch of typical entries follows.)
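
The exact lines come from the reference website; a minimal sketch of what is typically appended to ~/.bashrc, assuming Hadoop was extracted to /home/anju/hadoop and using the JDK path from Step 4 (adjust both for your machine):

export HADOOP_HOME=/home/anju/hadoop
export HADOOP_PREFIX=/home/anju/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin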

#Check that everything is installed (running bash opens a new shell so the updated PATH takes effect):

bash
hdfs
(If there is no error, go ahead with the remaining steps.)
----------------------------------------------------------------------
-> Edit hadoop-env.sh
cd /home/anju/hadoop/etc/hadoop
#(Copy the Java path from Step 4 up to the JDK directory, excluding lib/..., and set it in
#hadoop-env.sh.)
export JAVA_HOME=<PATH>
echo $JAVA_HOME
#This should print the Java path.
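For example, with the path that Ubuntu's default-jdk typically installs to (an assumption; use whatever path update-alternatives showed in Step 4):
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64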
--------------------------------------------------------------------------
-> Edit core-site.xml
#Copy the <configuration> ... </configuration> block to the end of the file
(also adjust the paths to match your user directory, "anju"; a sketch of a typical block follows below).
sudo nano core-site.xml
#We need to create the Hadoop data folder hdata referenced in the configuration properties,
#so create a directory hdata using mkdir in the correct location:
mkdir /home/anju/hdata
#Give it read, write, and other permissions:
sudo chmod -R 777 /home/anju/hdata
(-R applies the permission change recursively to all files inside.)
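
The exact properties come from the reference website; a minimal sketch of a core-site.xml <configuration> block consistent with the hdata directory above (the port 9000 and the property values are typical assumptions, not taken from the original):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/anju/hdata</value>
  </property>
</configuration>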
-----------------------------------------------------------------------------
-> Edit hdfs-site.xml
#Copy the corresponding <configuration> block in the same way (a sketch follows).
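
A minimal sketch of what the hdfs-site.xml block typically contains for a cluster with two slaves (values and paths are assumptions; adjust them for user "anju"):

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/anju/hdata/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/anju/hdata/datanode</value>
  </property>
</configuration>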
---------------------------------------------------------------------------
-> Edit mapred-site.xml (created from mapred-site.xml.template)
#Copy the corresponding <configuration> block at the end of the file (a sketch follows).
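
The template usually has to be copied to mapred-site.xml before editing; a minimal sketch (the copy step and the YARN setting are the usual Hadoop 2.x choices, not taken from the original):

cp mapred-site.xml.template mapred-site.xml
sudo nano mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>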
----------------------------------------------------------------------------
-> Edit yarn-site.xml
#Copy the corresponding <configuration> block in the same way (a sketch follows).
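
A minimal sketch of the yarn-site.xml block, assuming the ResourceManager runs on master (values are typical defaults, not taken from the original):

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>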
--------------------------------------------------------------------------
-> Edit the slaves file
sudo nano slaves
#List one worker hostname per line:
slave1
slave2
****************************************************************
Step 7: Copy this configuration to the slave nodes (on the master node only)
sudo scp -r /home/anju/hadoop/etc/hadoop/* 192.168.75.156:/home/anju/hadoop/etc/hadoop
sudo scp -r /home/anju/hadoop/etc/hadoop/* 192.168.75.85:/home/anju/hadoop/etc/hadoop
****************************************************************
Step 8: Check on all nodes
hadoop version
java -version
****************************************************************
Step 9: Start the Hadoop cluster
#(Go to the bin folder, or rely on the PATH set in ~/.bashrc.)
hdfs namenode -format (Do this only once, or all existing HDFS data will be deleted.)
#Start the HDFS services and check the running daemons:
start-dfs.sh
jps
#To stop the services, or to start a single DataNode manually:
stop-dfs.sh
hadoop-daemon.sh start datanode
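
A rough verification note (the daemon names below are the usual ones, not listed in the original): after start-dfs.sh, jps on the master should show NameNode and SecondaryNameNode, and jps on each slave should show DataNode.

Code
jps        #on the master: expect NameNode and SecondaryNameNode
#Then log in to each slave (ssh slave1, ssh slave2) and run jps: expect DataNode.
#If a slave is missing its DataNode, start it there with: hadoop-daemon.sh start datanode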

********
Use these links
https://data-flair.training/blogs/hadoop-2-6-multinode-cluster-setup/
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz
