
Big Data Analytics

The document outlines a Big Data Analytics course, detailing its objectives, units, and outcomes. Key topics include Apache Hadoop, HDFS, Map Reduce, and data analysis using R. Students will learn to identify Big Data implications, manage Hadoop environments, and apply statistical techniques for data analysis.

Uploaded by

tharan26072001

BIG DATA ANALYTICS L T P C

3 0 0 3

COURSE OBJECTIVES:

• Understand the Big Data platform and its use cases
• Provide an overview of Apache Hadoop
• Provide HDFS concepts and interfacing with HDFS
• Understand Map Reduce jobs
• Provide hands-on experience with the Hadoop Ecosystem
• Provide exposure to data analytics with R

UNIT I INTRODUCTION TO BIG DATA 8


Types of Digital Data, Introduction to Big Data, Big Data Analytics, History of Hadoop, Apache Hadoop, Analyzing Data with Unix Tools, Analyzing Data with Hadoop, Hadoop Streaming, Hadoop Ecosystem, IBM Big Data Strategy, Introduction to InfoSphere BigInsights and BigSheets.
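As an illustration of the Hadoop Streaming topic listed above, the following is a minimal sketch of a word-count mapper and reducer written as plain Python functions. It assumes the usual streaming convention of tab-separated key/value text records and that the reducer receives records sorted by key. The function names and the `run_locally` helper are hypothetical; `run_locally` only stands in for the framework's sort step so the logic can be tried without a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each key; assumes records arrive sorted by key."""
    parsed = (record.split("\t") for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Hypothetical stand-in for the framework: map, sort by key, reduce."""
    return list(reducer(sorted(mapper(lines))))
```

On a real cluster the two functions would typically live in separate scripts that read from standard input and write to standard output; the sketch keeps them in one file only for readability.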
UNIT II HDFS (HADOOP DISTRIBUTED FILE SYSTEM) MAP REDUCE 8
HDFS (Hadoop Distributed File System): The Design of HDFS, HDFS Concepts, Command Line Interface, Hadoop File System Interfaces, Data Flow, Data Ingest with Flume and Sqoop, Hadoop Archives. Hadoop I/O: Compression, Serialization, Avro and File-Based Data Structures.

UNIT III MAP REDUCE 7


Anatomy of a Map Reduce Job Run, Failures, Job Scheduling, Shuffle and Sort, Task Execution, Map Reduce Types and Formats, Map Reduce Features.
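The map, shuffle and sort, and reduce stages named in this unit can be sketched in memory as follows. This is a simplified, single-process illustration and not how Hadoop itself is implemented. The 'year,temperature' record format and all function names are assumptions chosen for the example.

```python
from collections import defaultdict


def map_record(record):
    """Parse a hypothetical 'year,temperature' record into a (key, value) pair."""
    year, temperature = record.split(",")
    return year, int(temperature)


def shuffle_and_sort(pairs):
    """Group values by key and return the groups in key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_max(key, values):
    """Reduce one key's values to their maximum."""
    return key, max(values)


def run_job(records):
    """Run the three stages in sequence on an in-memory list of records."""
    pairs = [map_record(record) for record in records]
    return [reduce_max(key, values) for key, values in shuffle_and_sort(pairs)]
```

In a real job the shuffle moves data between machines and the sort happens on disk, which is why this stage usually dominates the cost of a Map Reduce run.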
UNIT IV HADOOP ECOSYSTEM 8
Pig: Introduction to Pig, Execution Modes of Pig, Comparison of Pig with Databases, Grunt, Pig Latin, User Defined Functions, Data Processing Operators. Hive: Hive Shell, Hive Services, Hive Metastore, Comparison with Traditional Databases, HiveQL, Tables, Querying Data and User Defined Functions. HBase: HBasics, Concepts, Clients, Example, HBase versus RDBMS. Big SQL: Introduction.
UNIT V GETTING STARTED WITH R 7
Installing R - The R environment - R packages - Basics of R - Data Structures - Reading data into R - Graphics in R - Writing R functions - Control Statements (if and else, switch, ifelse, compound tests) - Loops in R (for, while, controlling loops) - Applications using the functions and loops.
UNIT VI DATA MANIPULATION AND ANALYSIS 7
Group manipulation - Data Reshaping - Manipulating Strings - Basic Statistics using R (Summaries, Correlation, t-tests, ANOVA) - Linear Models - Simple and Multiple Regression, GLM - Logistic Regression, Model Diagnostics - Residuals, Cross-validation, Bootstrapping.
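As a small worked illustration of the simple regression and residuals topics above, the sketch below fits a least-squares line and computes residuals using only the Python standard library. The course itself uses R, where `lm()` would normally do this; the function names here are hypothetical and the code is meant only to show the underlying arithmetic.

```python
from statistics import mean


def fit_simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Return observed minus fitted values for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

Residuals that show no pattern when plotted against the fitted values are the usual first check in model diagnostics.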

TOTAL: 45 PERIODS

COURSE OUTCOMES

At the end of the course, students will be able to

CO1: Identify Big Data and its business implications.
CO2: List the components of Hadoop and the Hadoop Ecosystem.
CO3: Access and process data on a distributed file system.
CO4: Manage job execution in a Hadoop environment.
CO5: Develop Big Data solutions using the Hadoop Ecosystem.
CO6: Gain basic knowledge of the R language and apply statistical computing techniques and graphics for analyzing big data.

TEXT BOOKS
1. Tom White, "Hadoop: The Definitive Guide", Third Edition, O'Reilly Media, 2012.
2. Seema Acharya, Subhasini Chellappan, "Big Data Analytics", Wiley, 2015.
3. Sandip Rakshit, "R Programming for Beginners", McGraw Hill Education, 2017.
4. G. Sudhamathy, "R Programming: An Approach to Data Analytics", MJP Publications, 2021.

REFERENCES (Minimum 3)

1. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer, 2007.

2. Jay Liebowitz, "Big Data and Business Analytics", Auerbach Publications, CRC Press, 2013.

3. Tom Plunkett, Mark Hornick, "Using R to Unlock the Value of Big Data: Big Data Analytics with Oracle R Enterprise and Oracle R Connector for Hadoop", McGraw-Hill/Osborne Media, Oracle Press, 2013.

4. Anand Rajaraman and Jeffrey David Ullman, "Mining of Massive Datasets", Cambridge University Press, 2012.

5. Bill Franks, "Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics", John Wiley & Sons, 2012.

6. Glenn J. Myatt, "Making Sense of Data", John Wiley & Sons, 2007.

7. Pete Warden, "Big Data Glossary", O'Reilly, 2011.

CO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12

CO1 2 1 1 1 1

CO2 2 1 2 2 3 2 3

CO3 2 1 2 2 3 2 3

CO4 2 1 2 2 3 3 3

CO5 2 2 3 2 3 3 3

CO6 2 2 3 3 3 3 3
