Installing Hadoop On Ubuntu
Installing Hadoop On Ubuntu
AMRITPAL SINGH
Introduction
• Hadoop is a Java-based programming framework that supports the
processing and storage of extremely large datasets on a cluster of
inexpensive machines.
• It was the first major open source project in the big data playing field
and is sponsored by the Apache Software Foundation.
Introduction
• Hadoop 2.7 is comprised of four main layers:
• Many other processing models are available for the 2.x version of
Hadoop.
Introduction
• Hadoop clusters are relatively complex to set up, so the project
includes a stand-alone mode which is suitable for learning about
Hadoop, performing simple operations, and debugging.
• We'll install Hadoop in stand-alone mode and run one of the example
example MapReduce programs it includes to verify the installation
Prerequisites
• An Ubuntu 16.04 server with a non-root user with sudo privileges
• Java
Steps
• Step 1 — Installing Java
• java –version
• With Java in place, we'll visit the Apache Hadoop Releases page to
find the most recent stable release.
• http://hadoop.apache.org/releases.html
Steps
Steps
• On the server, we'll use wget to fetch it:
• wget http://apache.mirrors.tds.net/hadoop/common/hadoop-
2.7.3/hadoop-2.7.3.tar.gz
• wget
https://dist.apache.org/repos/dist/release/hadoop/common/hadoop
-2.7.3/hadoop-2.7.3.tar.gz.mds
Steps
• Then run the verification:
• Output
• d489df3808244b906eb38f4d081ba49e50c4603db03efd5e594a1e98b
09259c2 hadoop-2.7.3.tar.gz
Steps
• Compare this value with the SHA-256 value in the .mds file:
• cat hadoop-2.7.3.tar.gz.mds
Steps
• You can safely ignore the difference in case and the spaces.
• Finally, we'll move the extracted files into /usr/local, the appropriate
place for locally installed software.
• We will use readlink with the -f flag to follow every symlink in every
part of the path, recursively.
• Then, we'll use sed to trim bin/java from the output to give us the
correct value for JAVA_HOME.
Steps
• To find the default Java path
• Output
• /usr/lib/jvm/java-8-openjdk-amd64/jre/
Steps
• You can copy this output to set Hadoop's Java home to this specific
version, which ensures that if the default Java changes, this value will
not.