Lecture 11- Introduction to Apache Hive

The document provides an overview of Apache Hive, a data analysis tool designed for querying and managing large datasets on Hadoop using Hive Query Language (HQL). It compares Hive with other technologies like Hadoop and Pig, highlighting its strengths in handling large-scale data and its limitations in real-time analytics. Additionally, it discusses the history of Hive's development and its role in processing data for various industries.

Lecture 11

Apache Hive

By
Dr. Aditya Bhardwaj

aditya.bhardwaj@bennett.edu.in

Big Data Analytics and Business Intelligence (CSET/CMCA-580)

Apache Hive RoadMap

Lecture 11: Introduction to Apache Hive
Lecture 12-13: Working Architecture, Commands
Lecture 14: Joins and Partitions in Hive
Lecture 15: Practical Demonstrations on HQL

Hadoop vs Pig vs Hive: Quick Look at Industrial Use Cases

| Use Case | Apache Hadoop | Apache Pig | Apache Hive |
|---|---|---|---|
| Data Storage and Management | Distributed storage and processing of large-scale datasets across multiple nodes using HDFS. | Used for ETL (Extract, Transform, Load) processes to prepare data stored in Hadoop for analysis. | Used for querying and managing large datasets stored in Hadoop with a SQL-like interface. |
| Real-Time Analytics | Not suitable for real-time analytics due to batch processing nature. | Suitable for batch data processing but not for real-time analytics. | Supports faster querying compared to raw MapReduce; however, not ideal for real-time analytics. |
| Financial Analysis and Fraud Detection | Provides a scalable environment for storing and processing transaction data and detecting anomalies. | Used to clean, aggregate, and transform raw financial data for further analysis. | Supports SQL-like queries for fast analysis and reporting of financial data stored in Hadoop. |
| Healthcare Data Processing | Stores and processes large datasets, such as electronic health records (EHR), medical imaging, etc. | Used to transform and clean healthcare data for analysis or model building. | Used for querying and analyzing structured healthcare data, such as patient records and clinical data. |

History of Hive
▪ At Facebook, data volume grew from gigabytes (2006) to 1 TB/day (2007), and later to more than 500 TB per day.
▪ This rapid growth meant traditional data warehousing could no longer keep up with processing.
▪ Hadoop offered an alternative for storing and processing large data.
▪ But MapReduce is very low-level and requires custom code.
▪ Facebook developed Hive as a solution.
▪ Sept 2008: Hive becomes a Hadoop subproject.
▪ Apache has continued the development of Hive since then.

Why Go for Hive When Pig is There?
Hive vs. SQL: Which Is the Better Choice for Data Analysis?

Hive and SQL Server are not comparable in any way other than the similarity in the syntax of their query languages.
SQL Server is built to respond in real time from a single machine, whereas Hive is built for processing large datasets that may span hundreds or thousands of machines.
Apache Hive is an open-source project run by volunteers at the Apache Software Foundation, used for querying, managing, and storing structured data on Hadoop.
Hive uses HQL (Hive Query Language), which lets you write SQL-like queries that Hive turns into map and reduce steps.
Hive vs. SQL
Challenges of Hive
▪ Compared to Apache Pig, the latency of Apache Hive queries is generally very high.
Key Summary Points on Hive
▪ Hive is not a database; it is a data analysis tool that offers a SQL-like syntax over data stored in Hadoop.
▪ HiveQL (Hive Query Language) queries are automatically translated into MapReduce jobs that are executed on Hadoop.
▪ Apache Hive converts each SQL query into one or more MapReduce jobs and then submits them to the Hadoop cluster.
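
To make the translation into MapReduce more concrete, here is a minimal Python sketch of how an aggregation such as SELECT dept, COUNT(*) FROM employees GROUP BY dept could be broken into map, shuffle, and reduce steps. This is an illustration only, not Hive's actual execution code: the employees table, its rows, and all function names are hypothetical, and everything runs in a single process rather than on a Hadoop cluster.

```python
from collections import defaultdict

# Hypothetical rows of an `employees` table: (name, dept).
ROWS = [
    ("asha", "sales"),
    ("ben", "engineering"),
    ("chen", "sales"),
    ("dana", "engineering"),
    ("eli", "sales"),
]


def map_phase(rows):
    """Map step: emit one (dept, 1) pair per input row."""
    for _name, dept in rows:
        yield dept, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values per key, giving COUNT(*) per dept."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_dept(rows):
    """Chain the three steps, mirroring the shape of one MapReduce job."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(ROWS))
```

The one-line query replaces all three hand-written functions, which is the point of the summary above: the user states what result is wanted, and Hive generates the low-level steps.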
Reference

▪ https://hive.apache.org/
Thanks
