CC EXP 8 VBHV
Aim: Install a Hadoop single-node cluster and run a simple application such as word
count.
The Hadoop framework is most comfortable in a Linux environment, but this article is
written for users who are not familiar with Linux and still want to use Hadoop. It aims
to install a Hadoop single-node cluster on Windows and run a simple application such as
WordCount.
Procedure:
1. Install Java
2. Install and configure Hadoop
3. Test the Hadoop installation
4. Create the WordCount program
5. Provide the input file to MapReduce
6. Display the output
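To make steps 4 to 6 concrete, the sketch below imitates what a word count job computes, using plain Python rather than the Hadoop API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are illustrative assumptions, not part of Hadoop or of this experiment's program.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group the emitted counts by word, as happens between map and reduce."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped counts to get one total per word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in order over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    sample = ["hello hadoop", "hello world"]  # hypothetical input file content
    for word, total in sorted(word_count(sample).items()):
        print(word, total)
```

On a real cluster the map and reduce functions run as separate tasks and the framework performs the grouping; this sketch only shows the data flow on one machine.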
I. JAVA Installation
1. Go to the official Java download page:
https://www.oracle.com/java/technologies/javase-jre8-downloads.html
2. After downloading Java, run the jdk-8u241-windows-x64.exe file.
3. Follow the instructions and click Next.
4. After the installation finishes, set the Java environment variable.
5. Go to Start -> Edit the system environment variables -> Environment
Variables.
6. Click New and enter the variable name “JAVA_HOME”.
7. In the value field, enter the Java path, such as
“C:\Java\jdk1.8.0_241” (use your own installation folder).
Fig-3.1
Fig-3.2
8. Click OK and open Command Prompt.
9. Type “java -version”. If it prints the installed version of Java, then Java is
successfully installed on your system.
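The same check can be scripted. The sketch below is a minimal example that looks for the java binary under JAVA_HOME and, if found, runs it with -version. The helper name find_java_executable is hypothetical, and the script assumes Python is available on the machine.

```python
import os
import subprocess


def find_java_executable(env=None):
    """Return the path to the java binary under JAVA_HOME, or None if missing."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return None
    # java.exe on Windows, java elsewhere
    for name in ("java.exe", "java"):
        candidate = os.path.join(java_home, "bin", name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    java = find_java_executable()
    if java is None:
        print("JAVA_HOME is not set or does not contain bin/java")
    else:
        # java -version writes its banner to stderr, so capture both streams
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        print(result.stderr or result.stdout)
```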
Fig-3.3
Fig-3.4
Fig-3.6
i. core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
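Hadoop's site files all share this name/value layout, so they are easy to inspect from a script. The sketch below parses the core-site.xml content above with Python's standard library and splits fs.defaultFS into host and port. The helper name read_hadoop_conf is an assumption for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_hadoop_conf(xml_text):
    """Return the <property> entries of a Hadoop site file as a dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            conf[name.strip()] = (value or "").strip()
    return conf


# Same content as the core-site.xml shown in this step
CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>"""

conf = read_hadoop_conf(CORE_SITE)
uri = urlparse(conf["fs.defaultFS"])
print(uri.scheme, uri.hostname, uri.port)
```

To check a file on disk instead, read its text first and pass it to read_hadoop_conf.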
ii. Rename "mapred-site.xml.template" to "mapred-site.xml", open
D:\Hadoop\etc\hadoop\mapred-site.xml, paste the XML below, and save
the file.
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
iv. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>D:\hadoop\data\namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>D:\hadoop\data\datanode</value>
  </property>
</configuration>
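The two directories named in hdfs-site.xml can be prepared ahead of time if you prefer not to rely on Hadoop creating them. The sketch below creates a namenode and a datanode folder under a base directory. The helper name ensure_hdfs_dirs is hypothetical, and whether pre-creating the folders is needed on your setup is an assumption.

```python
from pathlib import Path


def ensure_hdfs_dirs(base_dir):
    """Create <base_dir>/namenode and <base_dir>/datanode if they do not exist."""
    base = Path(base_dir)
    created = []
    for sub in ("namenode", "datanode"):
        target = base / sub
        # parents=True builds missing parent folders; exist_ok=True makes reruns safe
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For the paths used in this experiment, the call would be ensure_hdfs_dirs(r"D:\hadoop\data").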
v. yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
viii. Delete the bin folder at D:\Hadoop\bin and replace it with the bin folder
from the downloaded configuration archive (Hadoop Configuration.zip).
ix. Open cmd and type the command "hdfs namenode -format". The command
prompt shows which tasks are being processed. On completion you will get
a message that the namenode was formatted successfully, followed by a
shutdown message.
Fig-3.7
2. To start Hadoop, change to “D:\hadoop\sbin” in the command prompt and run
start-all.cmd
Fig-3.8
Now you can see the NameNode, DataNode, and YARN daemons starting up.
Fig-3.9
3. Now type “jps”. jps (Java Virtual Machine Process Status Tool) lists the running
Java processes and is used here to check that the Hadoop daemons such as NameNode,
DataNode, ResourceManager, and NodeManager are running.
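Checking the jps listing by eye can be replaced by a small script. The sketch below parses lines of the form "pid name" and reports which of the four daemons named in this step are absent. The sample output and its process IDs are made up for illustration, and the helper names are hypothetical.

```python
EXPECTED_DAEMONS = {"NameNode", "DataNode", "ResourceManager", "NodeManager"}


def running_daemons(jps_output):
    """Parse jps output lines of the form '<pid> <name>' into a set of names."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


def missing_daemons(jps_output):
    """Return the expected Hadoop daemons that do not appear in the output."""
    return EXPECTED_DAEMONS - running_daemons(jps_output)


# Hypothetical jps output; the PIDs are invented and NodeManager is left out
sample = """4512 NameNode
7320 DataNode
9104 ResourceManager
1288 Jps"""
print(sorted(missing_daemons(sample)))  # prints ['NodeManager']
```

An empty result means all four daemons were listed.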
Fig-3.10
4. Open: http://localhost:8088 in any browser
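Before opening the browser, a quick connection test can tell whether anything is listening on port 8088. The sketch below only checks that a TCP connection succeeds; it does not verify that the page served is the YARN web UI. The helper name is_port_open is an assumption for illustration.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    print("Port 8088 reachable:", is_port_open("localhost", 8088))
```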
Fig-3.11
Fig-3.12
Fig-3.13
Fig-3.15
Analysis:
This write-up provides a clear, step-by-step guide for installing and configuring Hadoop on a
Windows system, along with running a basic WordCount program. It covers essential
tasks such as setting up Java, configuring Hadoop, testing the installation, and executing
the WordCount program. The instructions are detailed, including screenshots for clarity.
However, it could benefit from explanations of Hadoop concepts, troubleshooting tips,
and considerations for security. Overall, it's a useful resource for beginners aiming to set
up Hadoop on Windows.
Conclusion:
In this experiment, we installed and ran Hadoop on a Windows environment, complete
with executing a simple WordCount program. By following the detailed instructions
provided, users can successfully set up their Hadoop single-node cluster and perform
basic MapReduce tasks. While the guide covers essential steps and includes helpful
visuals, there's room for improvement in terms of explaining Hadoop concepts, offering
troubleshooting guidance, and addressing security considerations. Nonetheless, it serves
as a valuable resource for beginners seeking to explore Hadoop in a Windows setting.
Result:
Installed and ran Hadoop on Windows, including executing a WordCount program,
explained the concepts involved, and addressed potential issues.