Experiment 3: Hive: Aim: To Understand Data Processing Tool - Hive and HQL (Hive Query Language)
Aim: To understand the data processing tool Hive and HQL (Hive
Query Language)
Objectives:
1. Create Managed and External tables in HIVE
2. Load data in HIVE table from Local File System
3. Load data in HIVE table from HDFS
4. Query data sets using Hive QL
5. Create partitions and buckets
Key concept:
Q1: How to enter the Hive shell?
Open a terminal and type hive. The prompt changes to hive>.
use emp_details;
create table emp(empno int, ename string, job string, sal int,
deptno int)
row format delimited fields terminated by ',';
1,A,clerk,4000,10
2,A,clerk,4000,30
3,B,mgr,8000,20
4,C,peon,2000,40
5,D,clerk,4000,10
6,E,mgr,8000,50
The keyword 'OVERWRITE' signifies that existing data in the
table is deleted. If the 'OVERWRITE' keyword is omitted, data
files are appended to existing data sets.
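As a minimal sketch of the difference, the two statements below load the same file into emp, first appending and then replacing. The local path is a hypothetical placeholder; only the file name empdetails.txt and the table emp come from this experiment.

```sql
-- Assumed local path; substitute the actual location of empdetails.txt.
-- Without OVERWRITE: the file is added alongside any files already in the table.
LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/empdetails.txt' INTO TABLE emp;

-- With OVERWRITE: existing files in the table directory are removed first.
LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/empdetails.txt' OVERWRITE INTO TABLE emp;

-- The row count now reflects only the file just loaded.
SELECT count(*) FROM emp;
```

Dropping the LOCAL keyword makes Hive read the path from HDFS instead of the local file system, and the file is then moved into the table directory rather than copied.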
Found 2 items
drwxrwxrwx - cloudera supergroup 0 2018-07-24 02:40
/user/hive/warehouse/emp_details.db/emp
drwxrwxrwx - cloudera supergroup 0 2018-07-24 02:28
/user/hive/warehouse/emp_details.db/emp1
Found 1 items
-rwxrwxrwx 1 cloudera supergroup 104 2018-07-24 02:40
/user/hive/warehouse/emp_details.db/emp/empdetails.txt
1,A,clerk,4000,10
2,A,clerk,4000,30
3,B,mgr,8000,20
4,C,peon,2000,40
5,D,clerk,4000,10
6,E,mgr,8000,50
Q7: How to see all the tables present in the database?
show tables;
Q14: List the employee names where the job has 'l' as the second
character
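One possible answer uses the LIKE operator with its two wildcards. This is a sketch against the emp table created earlier, not necessarily the query intended by the worksheet.

```sql
-- '_' matches exactly one character and '%' matches any run of characters,
-- so '_l%' means: any first character, then 'l', then anything.
SELECT ename FROM emp WHERE job LIKE '_l%';
```

With the sample rows listed earlier, only the job value clerk fits this pattern.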
Q18: How to drop a table?
drop table emp;
Syntax:
1,2,3
4,5,6
NOTE: The external table does not appear under
/user/hive/warehouse/emp_details.db the way a managed table does.
An external table only refers to the location where the text
file already resides; the data is not loaded into the Hive
warehouse.
Dropping a managed table deletes its data from
/user/hive/warehouse/emp_details.db, whereas dropping an
external table leaves the data in HDFS.
The warehouse listing shows only the two managed tables, not
ext1, which is an external table.
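The behaviour described in the note can be tried with a sketch like the following. The table name ext1 comes from the note; the column names, column types, and the HDFS directory /user/cloudera/ext_data are assumptions made for illustration.

```sql
-- Hypothetical HDFS directory that already holds a comma-separated text file.
-- The three INT columns are an assumption, not the worksheet's definition.
CREATE EXTERNAL TABLE ext1 (a INT, b INT, c INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/ext_data';

-- For an external table this removes only the table metadata;
-- the files under the LOCATION directory stay in HDFS.
DROP TABLE ext1;

-- Hive CLI command to confirm that the files are still present.
dfs -ls /user/cloudera/ext_data;
```

Running the same DROP TABLE on a managed table removes its directory under the warehouse as well, which is the contrast the note draws.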
WORKING WITH MOVIES DATA SET
Q4: Load the movies data set from the local file system into a Hive table
hive> LOAD DATA LOCAL INPATH
'/home/cloudera/Desktop/hive_demo/movies_new' INTO table
movie_details;
Q8: Select all records where the movie name starts with the
letter c or C
Q9: Select all records where the movie name starts with 'The'
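Both questions can be approached with LIKE patterns. The sketch below assumes the movie title is stored in a column called movie_name, which is a hypothetical name because the table definition is not shown here.

```sql
-- Q8: titles beginning with 'c' or 'C'.
-- movie_name is an assumed column name; lower() makes the match case-insensitive.
SELECT * FROM movie_details WHERE lower(movie_name) LIKE 'c%';

-- Q9: titles beginning with the word 'The' (case-sensitive match).
SELECT * FROM movie_details WHERE movie_name LIKE 'The%';
```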
Q10: What is the maximum rating of a movie?
hive> select max(rating) from movie_details;
Q13: List all the years with the total number of views in each
year (hint: group by year); restrict the output to 5 records
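A possible shape for this query is sketched below. The column year follows the hint in the question; the column views is an assumed name, and sum() assumes each row stores a view count. If each row instead represents a single view, count(*) would be the aggregate to use.

```sql
-- year comes from the hint; views is an assumed column name.
SELECT year, sum(views) AS total_views
FROM movie_details
GROUP BY year
LIMIT 5;
```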
hive> create database shopping;
hive> use shopping;
1,purse,bag,shimla
2,lipstick,cosmetic,delhi
3,bowl,utensils,jammu
4,mobile,electronic gadget,hyderabad
5,skirt,apparel,chennai
6,bed cover,furnishing,chandigarh
7,car,toys,karnal
8,hand purse,bag,solan
9,cream,cosmetic,jhodpur
10,plate,utensils,mohali
11,head phones,electronic gadget,calicut
12,top,apparel,mumbai
13,table cover,furnishing,agra
17,truck,toys,jaipur
18,wallet,bag,solan
19,foundation,cosmetic,jhodpur
20,spoon,utensils,mohali
21,speaker,electronic gadget,calicut
22,suit,apparel,mumbai
33,table sheet,furnishing,agra
24,auto,toys,jaipur
hive> create table shopping3(code INT, item_name STRING, place
string)
> partitioned by (category string)
> clustered by (place) into 3 buckets
> row format delimited fields terminated by ','
> stored as textfile;
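A partitioned and bucketed table is normally filled with INSERT ... SELECT from a plain staging table, so that Hive can route each row to its partition and bucket. The sketch below shows one way to do that. The staging table name shopping_raw and the local path are hypothetical; the column order follows the sample data (code, item name, category, place).

```sql
-- Hypothetical staging table whose columns follow the order in the data file.
CREATE TABLE shopping_raw (code INT, item_name STRING, category STRING, place STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Assumed local path; substitute the actual location of the shopping data file.
LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/shopping.txt' INTO TABLE shopping_raw;

-- Let Hive create one partition per distinct category value.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Needed on Hive 1.x; later versions always enforce bucketing.
SET hive.enforce.bucketing=true;

-- The partition column must be the last column in the SELECT list.
INSERT OVERWRITE TABLE shopping3 PARTITION (category)
SELECT code, item_name, place, category FROM shopping_raw;
```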
Found 8 items
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=__HIVE_DEFAULT_PARTITION__
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=apparel
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=bag
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=cosmetic
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=electronic gadget
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=furnishing
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=toys
drwxrwxrwx - cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=utensils
Q6: Check out the buckets for the partition “utensils”?
Found 3 items
-rwxrwxrwx 1 cloudera supergroup 0 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=utensils/000000_0
-rwxrwxrwx 1 cloudera supergroup 13 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=utensils/000001_0
-rwxrwxrwx 1 cloudera supergroup 32 2018-07-24 10:48
/user/hive/warehouse/shopping.db/shopping3/category=utensils/000002_0
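A listing like the one above can be obtained without leaving the Hive shell, and a single bucket can be read with TABLESAMPLE. This is a sketch only; the worksheet does not show which command produced its listing.

```sql
-- Hive CLI: list the bucket files of one partition (path taken from the listing above).
dfs -ls /user/hive/warehouse/shopping.db/shopping3/category=utensils;

-- Read only the rows that hash into the second of the three buckets.
SELECT * FROM shopping3 TABLESAMPLE (BUCKET 2 OUT OF 3 ON place)
WHERE category = 'utensils';
```

Each partition directory holds one file per bucket, which is why three files appear for a table declared with 3 buckets.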