Hive Updated
Hive Updated
HIVE
HIVE
• Hive is a data warehouse system - used to analyse
structured data.
• Built on the top of Hadoop.
• Developed by Facebook.
• Functionality of reading, writing, and managing
large datasets residing in distributed storage.
• Runs SQL like queries called HQL (Hive query
language) which gets internally converted to
MapReduce jobs.
• Using Hive, - skip writing complex MapReduce
programs.
• Hive supports Data Definition Language (DDL),
Features of HIVE
• Hive is fast and scalable.
• Capable of analyzing large datasets stored in
HDFS.
• Allows different storage types - plain text, RCFile,
and HBase.
• It uses indexing to accelerate queries.
• It can operate on compressed data stored in the
HDFS.
• It supports user-defined functions (UDFs) where
user can provide its functionality.
4
Limitations of HIVE
• Hive is not capable of handling real-time data.
• It is not designed for online transaction
processing.
• Hive queries contain high latency.
PIG vs HIVE
Hive Pig
Decimal
BIGINT Types 8-byte signed -9,223,372,036,854,775,808 to
integer 9,223,372,036,854,775,807
Type Size Range