
Big Data and Analytics Syllabus 2021

This document outlines a course on Big Data and Analytics that introduces students to big data concepts, Hadoop, and analytics tools. The course contains 5 units: an overview of big data; the Hadoop ecosystem, including HDFS, MapReduce, HBase, Hive, Pig, and Zookeeper; MapReduce fundamentals and HBase; MongoDB and Cassandra; and Hive and Pig. The course objectives are to understand big data platforms and use cases, introduce big data challenges, and teach skills for managing and analyzing big data using Hadoop tools.

Uploaded by

rashmi

BIG DATA AND ANALYTICS

Course Code: CS8T02 Credit: 4-0-0-4


COURSE OBJECTIVES:
• Understand the Big Data platform and its use cases.
• Introduce students to the concepts and challenges of big data.
• Provide HDFS concepts and interfacing with HDFS.
• Teach students to apply skills and tools to manage and analyze big data.

UNIT I: Getting an Overview of Big Data 10 Hrs


What is Big Data? History of Data Management - Evolution of Big Data, Structuring Big Data -
Types of Data, Elements of Data, Advantages of Big Data Analytics. Introducing Technologies
for Handling Big Data: Distributed and Parallel Computing for Big Data, Introducing Hadoop,
Cloud Computing and Big Data: Cloud Delivery Models, Cloud Services for Big Data, Cloud
Providers in Big Data Market, In-Memory Computing Technology for Big Data.

UNIT II: Understanding Hadoop Ecosystem 10 Hrs


Hadoop Ecosystem, Hadoop Distributed File System: HDFS Architecture, Concept of Blocks in
HDFS Architecture, NameNodes and DataNodes, The Command-line Interface, Using
HDFS Files, HDFS High Availability, Features of HDFS, MapReduce, Hadoop YARN,
Introducing HBase: HBase Architecture, Regions, Storing Big Data with HBase, Interacting with
Hadoop Ecosystem, HBase in Operation – Programming with HBase,
Combining HBase and HDFS: REST and Thrift, Data Integrity in HDFS, Features of HBase,
Hive, Pig and Pig Latin, Sqoop, Zookeeper, Flume, Oozie.
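
As an illustration of the "Concept of Blocks in HDFS" topic, the following minimal sketch estimates how a file is split into blocks and how much raw storage its replicas occupy. The 128 MB block size and replication factor of 3 are assumptions that match common Hadoop defaults; a real cluster may be configured differently, and the function name is hypothetical.

```python
import math

BLOCK_SIZE_MB = 128      # assumed block size (a common Hadoop default)
REPLICATION_FACTOR = 3   # assumed replication factor (a common Hadoop default)


def hdfs_block_layout(file_size_mb, block_size_mb=BLOCK_SIZE_MB,
                      replication=REPLICATION_FACTOR):
    """Return (number of blocks, size of last block in MB, total MB stored)."""
    if file_size_mb <= 0:
        return 0, 0, 0
    # A file is cut into fixed-size blocks; only the last block may be smaller.
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    last_block_mb = file_size_mb - (num_blocks - 1) * block_size_mb
    # Every block is copied `replication` times across DataNodes.
    total_stored_mb = file_size_mb * replication
    return num_blocks, last_block_mb, total_stored_mb


blocks, last_mb, stored_mb = hdfs_block_layout(500)
print(blocks, last_mb, stored_mb)
```

Under these assumptions, a 500 MB file occupies three full blocks plus one partial block, and the cluster stores three copies of each.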

UNIT III: Understanding MapReduce Fundamentals and HBase 11 Hrs


The MapReduce Framework: Exploring the Features of MapReduce, Working of MapReduce,
Exploring Map and Reduce Functions.
Techniques to Optimize MapReduce Jobs: Hardware / Network Topology, Synchronization, File
System. Uses of MapReduce, Role of HBase in Big Data Processing: Characteristics of HBase,
Installation of HBase.
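
The map and reduce functions named in this unit can be sketched with a word count, the usual introductory example. This is a single-process imitation of the map, shuffle, and reduce phases in plain Python, not the Hadoop API; all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data needs big tools", "data tools"])
print(counts)
```

In a real cluster the map calls run in parallel on different input splits and the framework performs the shuffle over the network; the sketch only shows the data flow between the phases.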

UNIT IV: Introduction to MongoDB and Cassandra 10 Hrs


Introduction to MongoDB: What is MongoDB? Why MongoDB? Terms used in RDBMS and
MongoDB, Data Types in MongoDB, MongoDB Query Language.
Apache Cassandra: Features, CQL Data Types, CQLSH, Keyspaces, CRUD, Collections, TTL,
Using a Counter, ALTER Commands, Import and Export, Querying System Tables.
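
The "Terms used in RDBMS and MongoDB" and "MongoDB Query Language" topics can be illustrated without a running server. The sketch below maps the usual RDBMS terms to their MongoDB counterparts and imitates a small subset of query-document matching (equality, `$gt`, `$lt`) over plain Python dictionaries. It is an illustrative stand-in, not the MongoDB driver; the helper names and sample data are hypothetical.

```python
# Usual correspondence between RDBMS and MongoDB vocabulary.
RDBMS_TO_MONGODB = {
    "database": "database",
    "table": "collection",
    "row": "document",
    "column": "field",
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"score": {"$gt": 70}}
            for operator, operand in condition.items():
                if operator == "$gt":
                    if value is None or not value > operand:
                        return False
                elif operator == "$lt":
                    if value is None or not value < operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {operator}")
        elif value != condition:
            # Plain form, e.g. {"unit": 4}, means equality.
            return False
    return True


def find(collection, query):
    """Return all documents in the collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]


students = [
    {"name": "Asha", "unit": 4, "score": 82},
    {"name": "Ravi", "unit": 4, "score": 67},
    {"name": "Meena", "unit": 5, "score": 91},
]
result = find(students, {"unit": 4, "score": {"$gt": 70}})
print([doc["name"] for doc in result])
```

The query dictionary has the same shape as a MongoDB filter document, which is why a "row" with "columns" in an RDBMS reads naturally as a "document" with "fields" here.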

UNIT V: Introduction to Hive and Pig 11 Hrs


What is Hive? Hive Architecture, Hive Data Types, Hive File Format, Hive Query Language
(HQL), RCFile Implementation, SerDe, User-Defined Function (UDF).
What is Pig? The Anatomy of Pig, Pig on Hadoop, Pig Philosophy, Use Case for Pig: ETL
Processing, Pig Latin Overview, Data Types in Pig, Running Pig, Execution Modes of Pig,
HDFS Commands, Relational Operators, Eval Function, Complex Data Types, Piggy Bank,
User-Defined Functions (UDF), Parameter Substitution, Diagnostic Operator, Word Count
Example using Pig, When to use Pig? When not to use Pig? Pig at Yahoo!, Pig versus Hive.

TEXT BOOKS
1. Big Data: Black Book, DT Editorial Services, Dreamtech Press, Edition 2016 (Chapter 1).
2. Big Data and Analytics, Seema Acharya, Subhashini Chellappan, Infosys Limited,
Publication: Wiley India Private Limited, 1st Edition 2015.

REFERENCE BOOKS
1. Hadoop in Practice, Alex Holmes, Manning Publications Co., September 2014, Second
Edition.
2. Programming Pig, Alan Gates, O’Reilly, Kindle Publication.
3. Programming Hive, Dean Wampler, O’Reilly, Kindle Publication.

COURSE OUTCOMES
1. Identify the characteristics of datasets and compare trivial data and big data for
various applications.
2. Demonstrate the open source software framework Hadoop and its supporting tools to
enable meaningful discussion of Big Data and analytics.
3. Compare and contrast different Hadoop supporting tools with traditional tools.
4. Explain how Big Data can be analyzed to extract knowledge, and apply tools for big data analytics.
