
19ECS442P: BIG DATA LAB

Lab experiments for Big Data

S.NO  Name of the Task

1  Installation of Hadoop
   1.1 Linux OS
   (OR)
   1.2 Windows OS

2  Perform file management tasks in Hadoop
   a. Create a directory
   b. List the contents of a directory
   c. Upload and download a file
   d. See the contents of a file
   e. Copy a file from source to destination
   f. Move a file from source to destination
1.1 Hadoop Installation (Linux)

Prerequisite Test

=============================

sudo apt update

sudo apt install openjdk-8-jdk -y        # Hadoop 3.2.x runs on Java 8

java -version; javac -version            # verify the JDK installation

sudo apt install openssh-server openssh-client -y

sudo adduser hdoop                       # dedicated (non-root) Hadoop user

su - hdoop

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa # key pair for passwordless SSH

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

chmod 0600 ~/.ssh/authorized_keys

ssh localhost                            # should log in without a password prompt

Downloading Hadoop (note: the link was updated to a newer Hadoop version on 6 May 2022)

===============================

wget https://downloads.apache.org/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz

tar xzf hadoop-3.2.3.tar.gz
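A quick check that the archive unpacked correctly (the listing below is the standard layout of a Hadoop 3.x distribution):

ls hadoop-3.2.3
# bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share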

Editing 6 important files

=================================

1st file

===========================

sudo nano .bashrc   (you may hit an error here saying hdoop is not a sudoer)

If that happens, switch back to your original sudo-capable user (aman in this example), add hdoop to the sudo group, then return to hdoop:

su - aman

sudo adduser hdoop sudo

su - hdoop

sudo nano .bashrc

#Add below lines in this file

#Hadoop Related Options
export HADOOP_HOME=/home/hdoop/hadoop-3.2.3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

source ~/.bashrc
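Before moving on, it is worth confirming that the shell picked up the new variables:

echo $HADOOP_HOME        # should print /home/hdoop/hadoop-3.2.3

hadoop version           # should report Hadoop 3.2.3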

2nd File

============================

sudo nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh

#Add the line below at the end of this file

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
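If you are unsure of the JDK path on your system, it can be derived from the javac binary; on Ubuntu with the openjdk-8 package the result should match the export above:

readlink -f /usr/bin/javac | sed 's|/bin/javac||'
# prints e.g. /usr/lib/jvm/java-8-openjdk-amd64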

3rd File

===============================

sudo nano $HADOOP_HOME/etc/hadoop/core-site.xml

#Add the lines below between <configuration> and </configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hdoop/tmpdata</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
  <description>The name of the default file system.</description>
</property>
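The directory named in hadoop.tmp.dir is not part of the extracted archive; assuming the value above, it is safest to create it before the first format:

mkdir -p /home/hdoop/tmpdata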

4th File

====================================

sudo nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml

#Add the lines below between <configuration> and </configuration>

<property>
  <name>dfs.name.dir</name>
  <value>/home/hdoop/dfsdata/namenode</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/home/hdoop/dfsdata/datanode</value>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
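Likewise, the NameNode and DataNode storage directories referenced above should exist before HDFS is formatted:

mkdir -p /home/hdoop/dfsdata/namenode /home/hdoop/dfsdata/datanode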

5th File

================================================

sudo nano $HADOOP_HOME/etc/hadoop/mapred-site.xml

#Add the lines below between <configuration> and </configuration>

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

6th File

==================================================

sudo nano $HADOOP_HOME/etc/hadoop/yarn-site.xml

#Add the lines below between <configuration> and </configuration>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>127.0.0.1</value>
</property>

<property>
  <name>yarn.acl.enable</name>
  <value>0</value>
</property>

<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>

Launching Hadoop

==================================

hdfs namenode -format            # format HDFS (first run only)

cd $HADOOP_HOME/sbin

./start-dfs.sh                   # starts the NameNode, DataNode and SecondaryNameNode

./start-yarn.sh                  # starts the ResourceManager and NodeManager

jps                              # lists the running Java daemons to confirm startup
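Once the daemons are up, the web UIs give a quick visual check (these are the default ports for Hadoop 3.x):

http://localhost:9870   --- NameNode web UI

http://localhost:8088   --- YARN ResourceManager web UI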

Reference: https://www.youtube.com/watch?v=Ih5cuJYYz6Y
1.2 Hadoop Installation (Windows)

Cloudera Quick Start VM Installation

Step 1: Download & install VMware Workstation Player

Step 2: Download the Cloudera QuickStart VM

Step 3: Install the Cloudera QuickStart VM on VMware Workstation Player

1) Download & Install VMware Workstation Player

Download VMware Workstation Player from the given link:

https://www.vmware.com/in/products/workstation-player/workstation-player-evaluation.html
2) Download the Cloudera QuickStart VM

https://downloads.cloudera.com/demo_vm/vmware/cloudera-quickstart-vm-5.13.0-0-vmware.zip

Extract the file using 7-Zip or WinRAR.


3) Install the Cloudera QuickStart VM on VMware Workstation Player

(NOTE: It takes time to load, depending on system performance.)

(Screenshots: Cloudera desktop view; Cloudera exploration via Hue, the GUI-based query editor.)
2 Hadoop Commands

 Hadoop is an open-source distributed framework used to store and process large datasets. To store data, Hadoop uses HDFS; to process data, it uses MapReduce & YARN.
 hadoop fs and hdfs dfs are the file system commands used to interact with HDFS.
 These commands are very similar to Unix commands, as the table below shows.

Unix Commands                               Hadoop Commands
default path: /home/cloudera                default path: /user/cloudera

a) ls command: lists all the files.

   Unix:   [cloudera@quickstart ~]$ ls
   HDFS:   [cloudera@quickstart ~]$ hdfs dfs -ls
     or:   [cloudera@quickstart ~]$ hadoop fs -ls

NOTE: From the same terminal, we can type both Unix and HDFS commands.
b) mkdir: creates a directory

[cloudera@quickstart ~]$ cd demoLocal

[cloudera@quickstart demoLocal]$ cd ..

[cloudera@quickstart ~]$ clear               --- to clear the screen

Note: The default path to HDFS files is /user/cloudera

[cloudera@quickstart ~]$ hdfs dfs -mkdir demoHdfs

[cloudera@quickstart ~]$ hdfs dfs -ls

To check whether the demoHdfs directory was created, do the following:

i) Open a browser -> click HUE (username and password: cloudera)
[cloudera@quickstart ~]$ hadoop version
Hadoop 2.6.0-cdh5.13.0
Subversion http://github.com/cloudera/hadoop -r 42e8860b182e55321bd5f5605264da4adc8882be
Compiled by jenkins on 2017-10-04T18:08Z
Compiled with protoc 2.5.0
From source with checksum 5e84c185f8a22158e2b0e4b8f85311
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.0.jar
[cloudera@quickstart ~]$
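The remaining file-management tasks from the task list (upload/download, view, copy, move) use the same hdfs dfs command with different flags; a minimal sketch, assuming a hypothetical local file demo.txt and the demoHdfs directory created above:

[cloudera@quickstart ~]$ hdfs dfs -put demo.txt demoHdfs            --- upload a local file to HDFS
[cloudera@quickstart ~]$ hdfs dfs -get demoHdfs/demo.txt copy.txt   --- download an HDFS file locally
[cloudera@quickstart ~]$ hdfs dfs -cat demoHdfs/demo.txt            --- see the contents of a file
[cloudera@quickstart ~]$ hdfs dfs -cp demoHdfs/demo.txt demoHdfs/demo2.txt    --- copy within HDFS
[cloudera@quickstart ~]$ hdfs dfs -mv demoHdfs/demo2.txt demoHdfs/demo3.txt   --- move/rename within HDFS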
