Single Node Cluster Creation in AWS Educate EC2

This document provides step-by-step instructions for creating a single node Hadoop cluster on an AWS EC2 instance running Ubuntu. It describes launching an EC2 instance, connecting to it via SSH, installing Java and Hadoop, configuring core Hadoop files, formatting the namenode, and starting Hadoop services to stand up a single node Hadoop cluster for testing or learning purposes.



Step-1: Open https://aws.amazon.com/education/awseducate/ and click on Login to AWS Educate


If you don’t have an AWS account, create one using your KL mail ID.

Step-2: Navigate to AWS Account, then click on AWS Educate Starter Account → AWS Console.

Step-3: Go to EC2, which is under Services → Compute → EC2.


Step-4: Click on Launch Instance. In the search bar, type Ubuntu, then select the Ubuntu Server 18.04 LTS instance.

Step-5: Select an instance type (General purpose t2.medium is recommended), then click Review and Launch → Launch.

Step-6: A popup window appears. You can select Choose an existing key pair, browse to your key pair, and launch the instance.

If you are creating an instance for the first time, create a new key pair, give it a name of your choice, and download it safely to your system (the key pair is mandatory to log in to your instance).

Then click on View Instances.

Step-7: Select your instance, click on Connect, then choose A standard SSH client.
Step-8: Open Command Prompt on your Windows machine and navigate to the folder containing the key pair you downloaded earlier (refer to Step-6).
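For example, if you saved the key pair to your Downloads folder (a hypothetical location; use whichever folder you chose in Step-6):

cd %USERPROFILE%\Downloads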

Step-9: Connect to your instance using its public DNS: copy and paste the ssh command shown in the AWS Connect to your instance window.
Example: ssh -i "tarunsai.pem" ubuntu@ec2-18-232-129-119.compute-1.amazonaws.com

Note: the .pem file name and instance public DNS differ from one user to another.
You are now logged in to your instance over SSH.
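If ssh rejects the key with an unprotected-private-key warning (common when connecting from a Linux or macOS client rather than Windows Command Prompt), restrict the key file's permissions first, using the example key name from Step-9:

$ chmod 400 tarunsai.pem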

Step-10: Update and upgrade packages in Ubuntu using the command

$ sudo apt-get update && sudo apt-get upgrade

Step-11: Install Hadoop from the Ubuntu terminal

1. Install Java on Ubuntu → $ sudo apt-get install default-jdk
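To confirm the JDK installed correctly (on Ubuntu 18.04, default-jdk installs OpenJDK 11, which should match the java-11-openjdk-amd64 path used later in this guide):

$ java -version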


2. Generate an SSH key for Hadoop → $ ssh-keygen -t rsa -P ""
3. Enable SSH access to your newly created machine with this newly created key (Hadoop's start scripts use SSH to launch daemons, even on a single node) → cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
4. Test your connectivity to localhost → $ ssh localhost (accept the host key with yes if prompted)
5. Exit from the localhost session → $ exit
6. Download Hadoop → $ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz
7. Extract the Hadoop tar file → $ tar -xzvf hadoop-2.8.5.tar.gz
8. Edit .bashrc → $ nano ./.bashrc
9. Paste these export statements at the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_HOME=/home/ubuntu/hadoop-2.8.5
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Save and exit with CTRL+O, Enter, CTRL+X.

10. Source the .bashrc file → $ source ~/.bashrc
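At this point the hadoop command should be on your PATH; a quick sanity check is:

$ hadoop version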


11. Edit the hadoop-env.sh file → $ nano /home/ubuntu/hadoop-2.8.5/etc/hadoop/hadoop-env.sh
Modify the export JAVA_HOME line to: export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
Modify the export HADOOP_CONF_DIR line to: export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/ubuntu/hadoop-2.8.5/etc/hadoop"}
Save and exit with CTRL+O, Enter, CTRL+X.
12. Edit the core-site.xml configuration → $ nano /home/ubuntu/hadoop-2.8.5/etc/hadoop/core-site.xml
Add this configuration to the core-site.xml file (fs.defaultFS sets the HDFS URI that clients and daemons use; hadoop.tmp.dir is the base directory for Hadoop's temporary files):
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/hadooptmpdata</value>
</property>
</configuration>

Save and exit.


13. Create these directories (run from your home directory, /home/ubuntu, so they match the paths used in hdfs-site.xml):
$ mkdir hadooptmpdata
$ mkdir -p hdfs/datanode
$ mkdir -p hdfs/namenode
14. Edit the hdfs-site.xml file → $ nano /home/ubuntu/hadoop-2.8.5/etc/hadoop/hdfs-site.xml
Add this configuration to the hdfs-site.xml file (each name/value pair needs its own <property> element; dfs.replication is 1 because a single node cannot hold multiple replicas):
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/ubuntu/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/ubuntu/hdfs/datanode</value>
</property>
</configuration>
15. Copy the mapred template → $ cp hadoop-2.8.5/etc/hadoop/mapred-site.xml.template hadoop-2.8.5/etc/hadoop/mapred-site.xml
16. Edit the mapred-site.xml file → $ nano /home/ubuntu/hadoop-2.8.5/etc/hadoop/mapred-site.xml
Add this configuration to the mapred-site.xml file (this tells MapReduce to run its jobs on YARN):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

17. Edit the yarn-site.xml file → $ nano /home/ubuntu/hadoop-2.8.5/etc/hadoop/yarn-site.xml
Add this configuration to the yarn-site.xml file (the mapreduce_shuffle auxiliary service lets the NodeManager serve map outputs to reducers):
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
18. Format the namenode before first use (do this only once; reformatting wipes HDFS metadata) → $ hdfs namenode -format
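If the format succeeds, the output should include a line similar to the following (exact wording can vary slightly by version):

Storage directory /home/ubuntu/hdfs/namenode has been successfully formatted.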
19. Start the Hadoop services → $ start-all.sh (this script may print a deprecation warning suggesting start-dfs.sh and start-yarn.sh; either route works on 2.8.5)
20. Check the started services → $ jps
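On a healthy single node cluster, jps should list roughly these daemons (process IDs will differ):

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps

As a further smoke test, you can create and list a directory in HDFS:

$ hdfs dfs -mkdir -p /user/ubuntu
$ hdfs dfs -ls /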
21. If all the Hadoop services have started, exit from the Ubuntu connection → $ exit

Go back to the AWS console in the browser and select the created instance → Actions → Instance state → Stop.

NOTE: An active internet connection is required while using AWS instances, and you must stop running instances before signing out of the AWS console.
