Hands-On Exercise: Access HDFS With Command Line and Hue
In this exercise you will practice working with HDFS, the Hadoop Distributed File System. You will use the HDFS command line tool and the Hue File Browser web-based interface to manipulate files in HDFS.
$ $DEV1/scripts/training_setup_dev1.sh
The simplest way to interact with HDFS is by using the hdfs command. To execute file system commands within HDFS, use the hdfs dfs command.
2. Enter:
This shows you the contents of the root directory in HDFS. There will be multiple entries, one of which is /user. Individual users have a “home” directory under this directory, named after their username; your username in this course is training, therefore your home directory is /user/training.
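The command for step 2 is not present in this copy of the exercise. Judging from the description above, it is presumably a listing of the HDFS root directory. The sketch below is a reconstruction under that assumption, not the original text; the guard only checks that the hdfs client is on the PATH.

```shell
# Assumed reconstruction of the step 2 command: list the HDFS root directory.
HDFS_ROOT="/"
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -ls "$HDFS_ROOT"   # the listing should include an entry for /user
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```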
3. Try viewing the contents of the /user directory by running:
You will see your home directory in the directory listing.
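The command for step 3 is also missing here. Since the step asks for the contents of the /user directory, a listing along these lines would fit; treat it as an assumed reconstruction rather than the original command.

```shell
# Assumed reconstruction of the step 3 command: list the /user directory.
USER_PARENT="/user"
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -ls "$USER_PARENT"   # /user/training should appear in the output
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```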
There are no files yet, so the command silently exits. This is different than if you ran hdfs dfs -ls /foo, which refers to a directory that doesn’t exist and which would display an error message.
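To make the contrast concrete, the sketch below lists the home directory and then the nonexistent /foo, printing the exit status of each. It assumes the silent command referred to above is a listing of /user/training; the second listing is expected to print an error and return a non-zero status.

```shell
# Compare listing an existing (possibly empty) directory with a nonexistent one.
HOME_DIR="/user/training"   # home directory named in the text
MISSING_DIR="/foo"          # nonexistent path used in the text
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -ls "$HOME_DIR"
    echo "exit status for $HOME_DIR: $?"
    hdfs dfs -ls "$MISSING_DIR"
    echo "exit status for $MISSING_DIR: $?"
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```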
Note that the directory structure in HDFS has nothing to do with the directory structure of the local filesystem; they are completely separate namespaces.
6. Change directories to the local filesystem directory containing the sample data we will be using in the course.
$ cd $DEV1DATA
If you perform a regular Linux ls command in this directory, you will see several files and directories used in this class. One of the data directories is kb. This directory holds Knowledge Base articles that are part of Loudacre’s customer service website.
This copies the local kb directory and its contents into a remote HDFS directory named /loudacre/kb.
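The commands that create /loudacre and upload kb are missing from this copy. One way to get the result described, assuming you are in the local directory that contains kb, is sketched below; the use of -mkdir -p and the final listing are assumptions, not the original steps.

```shell
# Assumed reconstruction: create /loudacre in HDFS, upload kb, then list it.
LOCAL_DIR="kb"            # local directory under $DEV1DATA
HDFS_PARENT="/loudacre"   # target parent directory in HDFS
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -mkdir -p "$HDFS_PARENT"             # create the parent directory if needed
    hdfs dfs -put "$LOCAL_DIR" "$HDFS_PARENT/"    # copies kb and its contents to /loudacre/kb
    hdfs dfs -ls "$HDFS_PARENT/$LOCAL_DIR"        # list the uploaded articles
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```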
You should see the KB articles that were in the local directory.
Relative paths
In HDFS, any relative (non-absolute) paths are considered relative to your home
directory. There is no concept of a “current” or “working” directory as there is in
Linux and similar file systems.
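As an illustration of the relative path rule, the two listings below should show the same directory when run as the training user: the first gives no path, so HDFS falls back to the home directory, and the second names it explicitly. This is a sketch, not part of the original exercise steps.

```shell
# A listing with no path resolves to the current user's HDFS home directory.
HDFS_USER="training"            # username used in this course
HOME_DIR="/user/$HDFS_USER"
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -ls                # no path given: lists the home directory
    hdfs dfs -ls "$HOME_DIR"    # absolute path to the same directory
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```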
10. Enter:
This prints the last 20 lines of the article to your terminal. This command is handy for viewing HDFS data. An individual file is often very large, making it inconvenient to view the entire file in the terminal. For this reason, it’s often a good idea to pipe the output of the fs -cat command into head, tail, more, or less.
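The command for step 10 did not survive in this copy. A pipeline of the following shape matches the description; the file name is a hypothetical placeholder, so substitute the name of an article from your own listing of /loudacre/kb.

```shell
# Pipe -cat output through tail or head instead of printing the whole file.
ARTICLE="/loudacre/kb/EXAMPLE.html"   # hypothetical file name, replace with a real one
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -cat "$ARTICLE" | tail -n 20   # last 20 lines of the article
    hdfs dfs -cat "$ARTICLE" | head -n 20   # first 20 lines; may report a write error when head closes the pipe
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```

For interactive paging, the same output can be piped into less or more.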
12. There are several other operations available with the hdfs dfs command to perform most common filesystem manipulations: mv, cp, mkdir, etc. In the terminal window, enter:
$ hdfs dfs
You see a help message describing all the file system commands provided by HDFS.
Try playing around with a few of these commands if you like.
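If you want a starting point, the sketch below exercises mkdir, cp, mv, and rm on a scratch directory. The directory name is hypothetical and, being a relative path, resolves under your HDFS home directory; the sketch assumes /loudacre/kb already exists from the earlier steps.

```shell
# Try a few file system commands on a throwaway directory.
SCRATCH="scratch_demo"   # hypothetical relative path under the HDFS home directory
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -mkdir "$SCRATCH"                              # create the scratch directory
    hdfs dfs -cp /loudacre/kb "$SCRATCH/kb_copy"            # copy within HDFS
    hdfs dfs -mv "$SCRATCH/kb_copy" "$SCRATCH/kb_renamed"   # rename within HDFS
    hdfs dfs -ls "$SCRATCH"                                 # confirm the result
    hdfs dfs -rm -r "$SCRATCH"                              # remove the scratch directory and its contents
    hdfs dfs -help ls                                       # detailed help for one command
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```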
15. Because this is the first time anyone has logged into Hue on this server, you will be prompted to create a new user account. Enter username training and password training, then click Create Account. (If prompted, you may click “Remember Password”.)
• Note: When you first log in to Hue you may see a misconfiguration warning. This is because not all the services Hue depends on are running on the course VM. You can disregard the message.
• Note: If your Firefox window is too small to display the full menu names, you will see just the icons instead.
17. By default, the contents of your HDFS home directory (/user/training) display. In the directory path name, click the leading slash (/) to view the HDFS root directory.
18. The contents of the root directory display, including the loudacre directory you created earlier. Click on that directory to see the contents.
19. Click on the name of the kb directory to see the knowledge base articles you uploaded.
20. View one of the files by clicking on the name of any one of the articles.
21. In the file viewer, the contents of the file are displayed on the right. In this case, the file is fairly small, but typical files in HDFS are very large, so rather than displaying the entire contents on one screen, Hue provides buttons to move between pages.
22. Return to the directory view by clicking View file location in the Action panel on the left.
24. To upload a file, click the Upload button. You can choose to upload a plain file, or to upload a zipped file (which will be automatically unzipped after upload). In this case, select Files, then click Select Files.
27. When the file has uploaded, it will be displayed in the directory. Click the checkbox next to the file’s icon, then click the Actions button to see a list of actions that can be performed on the selected file(s).
28. Optional: explore the various file actions available. When you’ve finished, select any unneeded files you have uploaded and click the Move to trash button to delete.
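For comparison, the Hue upload and delete actions have rough command line counterparts. The sketch below is not part of the original exercise; the local file name is hypothetical, and whether -rm moves a file to the trash or deletes it outright depends on whether trash is enabled on the cluster.

```shell
# Approximate command line counterparts of the Hue upload and delete actions.
LOCAL_FILE="example.txt"   # hypothetical local file, replace with one you want to upload
HDFS_DIR="/loudacre"       # target directory used earlier in the exercise
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -put "$LOCAL_FILE" "$HDFS_DIR/"   # upload, like the Upload button
    hdfs dfs -ls "$HDFS_DIR"                   # confirm the file arrived
    hdfs dfs -rm "$HDFS_DIR/$LOCAL_FILE"       # delete; goes to the trash if trash is enabled
else
    echo "hdfs client not found; run this on the course VM" >&2
fi
```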