Data Analytics Lab

This document outlines the steps for installing Apache Hadoop on Windows 10, including prerequisites like Java installation and setting environment variables. It details the configuration of various XML files necessary for Hadoop's operation and provides instructions for testing the installation. Additionally, it highlights the applications of Hadoop in business for data processing and decision-making.

Uploaded by

21040452

Experiment No.

Title: Steps for installing Hadoop on Windows 10

The Apache Hadoop software library is a framework that allows for the distributed
processing of large data sets across clusters of computers using simple programming
models. It is designed to scale up from single servers to thousands of machines, each
offering local computation and storage. Rather than rely on hardware to deliver high-
availability, the library itself is designed to detect and handle failures at the application
layer, so delivering a highly-available service on top of a cluster of computers, each of
which may be prone to failures.

Install Java

– Download the Java JDK (version 8) from:
https://www.oracle.com/java/technologies/javase-jdk8-downloads.html
– Extract and install Java in C:\Java
– Open cmd and type javac -version to confirm that the compiler is found
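
The version check can also be scripted. The following is a minimal Python sketch and not part of the original lab. The helper names parse_java_major and installed_javac_major are hypothetical, and the sketch assumes javac prints a line such as "javac 1.8.0_281" (JDK 8 style) or "javac 11.0.2".

```python
import re
import shutil
import subprocess


def parse_java_major(version_line):
    """Return the major Java version from a 'javac -version' line, or None."""
    match = re.search(r"(\d+)\.(\d+)", version_line)
    if match is None:
        # Handle bare versions such as "javac 17"
        match = re.search(r"(\d+)", version_line)
        return int(match.group(1)) if match else None
    first, second = int(match.group(1)), int(match.group(2))
    # JDK 8 and earlier report themselves as 1.x
    return second if first == 1 else first


def installed_javac_major():
    """Run 'javac -version' if javac is on PATH; return its major version or None."""
    if shutil.which("javac") is None:
        return None
    result = subprocess.run(["javac", "-version"], capture_output=True, text=True)
    # JDK 8 prints the version on stderr, newer JDKs on stdout
    return parse_java_major(result.stdout + result.stderr)
```

A result of None from installed_javac_major() suggests that javac is not on the Path yet.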

Download Hadoop

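– Download the release archive from:
https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
– Extract it to C:\Hadoop-3.3.0 (the folder that the remaining steps refer to)
1. Set the JAVA_HOME environment variable to the Java folder (C:\Java)
2. Set the HADOOP_HOME environment variable to the Hadoop folder (C:\Hadoop-3.3.0)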
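
To confirm that both variables are visible to new processes, a small check along the following lines may help. It is a sketch under assumptions: the function name check_hadoop_env is hypothetical, and the warning about spaces reflects a commonly reported problem with Hadoop's .cmd scripts rather than anything stated in this lab.

```python
import os


def check_hadoop_env(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    for name in ("JAVA_HOME", "HADOOP_HOME"):
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif " " in value:
            # Assumption: paths with spaces (e.g. under "Program Files")
            # often break Hadoop's .cmd scripts
            problems.append(f"{name} contains a space: {value}")
    return problems


if __name__ == "__main__":
    for line in check_hadoop_env(os.environ) or ["JAVA_HOME and HADOOP_HOME look fine"]:
        print(line)
```

Run it from a freshly opened cmd window, since windows opened before the variables were set do not see the new values.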
Configurations:

Edit the file C:\Hadoop-3.3.0\etc\hadoop\core-site.xml, paste the XML below into it and save the file.

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
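
All of the *-site.xml files in this lab share the same name/value layout, so a pasted file can be sanity-checked by parsing it. This is a minimal sketch using Python's standard library. The helper name read_properties is hypothetical, and the XML is embedded as a string here only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each <name> to its <value> in a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.findall("property")
    }


CORE_SITE = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""

print(read_properties(CORE_SITE)["fs.defaultFS"])  # hdfs://localhost:9000
```

A parse error raised by ET.fromstring usually points to an unclosed tag or a stray character left over from pasting.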
======================================================

If the folder contains “mapred-site.xml.template” (as older Hadoop releases do), rename it to “mapred-site.xml”. Then edit the file C:\Hadoop-3.3.0\etc\hadoop\mapred-site.xml, paste the XML below into it and save the file.

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
======================================================
Create folder “data” under “C:\Hadoop-3.3.0”
Create folder “datanode” under “C:\Hadoop-3.3.0\data”
Create folder “namenode” under “C:\Hadoop-3.3.0\data”
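
The three folders can also be created in one step. The following is a small sketch, with the function name create_data_dirs chosen here for illustration. It takes the Hadoop folder as an argument, so the C:\Hadoop-3.3.0 path from this lab is passed in rather than hard-coded.

```python
from pathlib import Path


def create_data_dirs(hadoop_home):
    """Create data/namenode and data/datanode under hadoop_home; return the paths."""
    created = []
    for name in ("namenode", "datanode"):
        path = Path(hadoop_home) / "data" / name
        # parents=True also creates the intermediate "data" folder;
        # exist_ok=True makes the call safe to repeat
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For this lab the call would be create_data_dirs(r"C:\Hadoop-3.3.0").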

======================================================
Edit the file C:\Hadoop-3.3.0\etc\hadoop\hdfs-site.xml, paste the XML below into it and save the file.

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/hadoop-3.3.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hadoop-3.3.0/data/datanode</value>
</property>
</configuration>
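
As an alternative to pasting, the same file content can be generated from a dictionary, which avoids mismatched tags. This is a sketch only: build_configuration is a hypothetical helper, and the property names and values are the ones shown above.

```python
import xml.etree.ElementTree as ET


def build_configuration(properties):
    """Return a <configuration> XML string with one <property> per dict item."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):
        ET.indent(root)  # pretty-printing is available from Python 3.9
    return ET.tostring(root, encoding="unicode")


hdfs_site = build_configuration({
    "dfs.replication": 1,
    "dfs.namenode.name.dir": "/hadoop-3.3.0/data/namenode",
    "dfs.datanode.data.dir": "/hadoop-3.3.0/data/datanode",
})
print(hdfs_site)
```

The printed text can then be saved as hdfs-site.xml in the etc\hadoop folder.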
======================================================

Edit the file C:\Hadoop-3.3.0\etc\hadoop\yarn-site.xml, paste the XML below into it and save the file.

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
======================================================

Edit the file C:\Hadoop-3.3.0\etc\hadoop\hadoop-env.cmd and replace the line
set JAVA_HOME=%JAVA_HOME%
with
set JAVA_HOME=C:\Java
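
The change to hadoop-env.cmd amounts to pointing the JAVA_HOME line at the Java folder. A possible way to script it is sketched below. The function name set_java_home and the sample text are illustrative assumptions, not the actual contents of the shipped file.

```python
import re


def set_java_home(cmd_text, java_home):
    """Replace the first 'set JAVA_HOME=...' line in hadoop-env.cmd text."""
    pattern = re.compile(r"^set JAVA_HOME=.*$", re.IGNORECASE | re.MULTILINE)
    # A function replacement keeps backslashes in the path from being
    # interpreted as regex escape sequences
    return pattern.sub(lambda _: f"set JAVA_HOME={java_home}", cmd_text, count=1)


# Hypothetical fragment standing in for the real file
sample = "@rem sample fragment\nset JAVA_HOME=%JAVA_HOME%\n@rem other settings\n"
print(set_java_home(sample, r"C:\Java"))
```

To apply it to the real file, read the file's text, pass it through set_java_home and write the result back, keeping a backup copy first.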

======================================================
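Testing:

– Open cmd and change directory to C:\Hadoop-3.3.0\sbin
– If this is the first start, the namenode storage normally has to be formatted once beforehand: type ..\bin\hdfs namenode -format
– Type start-all.cmd to start all the daemons in one step
– Alternatively, start them separately:
– type start-dfs.cmd to start the namenode and datanode
– type start-yarn.cmd to start YARN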

Make sure these processes are running:

– Hadoop Namenode
– Hadoop datanode
– YARN Resource Manager
– YARN Node Manager
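
One way to check the four processes is the JDK's jps tool, which lists running JVMs as a process id followed by the main class name. The sketch below parses such output. The function name missing_daemons and the sample process ids are made up for illustration.

```python
EXPECTED = ("NameNode", "DataNode", "ResourceManager", "NodeManager")


def missing_daemons(jps_output):
    """Return the expected daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            # jps prints "<pid> <main class name>" for each JVM
            running.add(parts[1])
    return [name for name in EXPECTED if name not in running]


# Hypothetical jps output in which the node manager has not started
sample = "4112 NameNode\n5320 DataNode\n6024 ResourceManager\n7128 Jps\n"
print(missing_daemons(sample))  # ['NodeManager']
```

An empty list means all four daemons were found in the output.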

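Open http://localhost:8088 in a browser to see the YARN Resource Manager web page. In Hadoop 3.x the namenode web page is served by default at http://localhost:9870.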
======================================================

Hadoop installed successfully.

======================================================

Applications of Hadoop

Hadoop was created by Doug Cutting and Mike Cafarella. It is maintained by the Apache
Software Foundation and licensed under the Apache License 2.0. It is attractive for large
businesses because it runs on inexpensive servers, which lowers the cost of storing and
processing data. Hadoop supports better business decisions by keeping a history of data and
company records and by processing the collected data to produce results that can guide
future decisions. A company can use this technology to improve its business.
