BDA Question Bank

This document outlines the units and topics covered in the subject "Big Data Analytics". It includes 6 units: [1] Introduction to Big Data, [2] Hadoop, [3] NoSQL, [4] Mining Data Streams, [5] Frameworks, and [6] Spark. Unit 1 introduces concepts like characteristics of big data and how big data analytics can benefit applications like smart cities. Unit 2 focuses on Hadoop components like MapReduce, HDFS, and YARN. Unit 3 covers NoSQL databases and how they differ from SQL. Unit 4 discusses analyzing data streams. Unit 5 examines frameworks like Pig, Hive, and HBase. Unit 6 details Apache Spark and how it improves on Map

Uploaded by

Stuti Shah

Computer Engineering Department

Subject code: 3170722


Subject Name: Big Data Analytics

UNIT-1 Introduction to Big Data


1. What is Big Data? Explain characteristics of Big Data.
2. What are the benefits of Big Data? Discuss the challenges of Big Data. How can Big
Data Analytics be useful in the development of smart cities? (Discuss one
application.)
3. What is Big Data? Explain how big data processing differs from distributed
processing.
4. List various applications of Big Data. How can it be used to improve business for a
superstore?
5. Explain characteristics of Big Data.
6. What is data serialization? With proper examples, discuss and differentiate
structured, unstructured and semi-structured data. Make a note on how the type of data
affects data serialization.

UNIT-2 Hadoop
1. Explain the working of the following phases of MapReduce with one common example.
● Map Phase
● Combiner Phase
● Shuffle and Sort Phase
● Reducer Phase
2. Write MapReduce code for counting occurrences of specific words in the input text
file(s). Also write the commands to compile and run the code.
3. Explain job scheduling in MapReduce. How is it done in the case of
● The Fair Scheduler
● The Capacity Scheduler
4. Explain the Avro data serialization technique in MapReduce.
5. Explain “Map Phase” and “Combiner Phase” in MapReduce.
6. What is a Resilient Distributed Dataset (RDD) in Apache Spark? Explain in detail. Make a
note on why RDD is better than MapReduce data storage.
7. What are the advantages of Hadoop? Explain the Hadoop architecture and its
components with a proper diagram.
8. Explain the working of Hive with proper steps and a diagram.
9. What do you mean by HiveQL Data Definition Language? Explain any three
HiveQL DDL commands with their syntax and an example.
10. Draw the HDFS architecture. Explain any two of the following HDFS
commands with syntax and at least one example of each.
● copyFromLocal
● setrep
● checksum
11. Explain the core architecture of Hadoop with a suitable block diagram. Discuss the role of
each component in detail.
12. What is Hadoop Ecosystem? Discuss various components of Hadoop Ecosystem.
13. List the various configuration files used in Hadoop installation. What is the use of
mapred-site.xml?
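
The four phases named in question 1 (Map, Combiner, Shuffle and Sort, Reducer) can be traced on the word-count task of question 2 with a small single-process simulation. This is only a sketch in plain Python, not Hadoop's Java MapReduce API: the function names and the `splits` sample input are hypothetical, and each inner list merely stands in for one input split handled by one mapper.

```python
from collections import defaultdict
from itertools import groupby


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine_phase(pairs):
    """Combiner: pre-aggregate counts locally on a single mapper's output."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def shuffle_and_sort(mapper_outputs):
    """Shuffle and sort: merge all mapper outputs, sort by key, group values per key."""
    merged = sorted(pair for output in mapper_outputs for pair in output)
    return [(key, [count for _, count in group])
            for key, group in groupby(merged, key=lambda pair: pair[0])]


def reduce_phase(key, values):
    """Reduce: sum the partial counts received for one word."""
    return key, sum(values)


def word_count(splits):
    """Run all four phases; each element of `splits` plays the role of one input split."""
    mapper_outputs = [
        combine_phase([pair for line in split for pair in map_phase(line)])
        for split in splits
    ]
    return dict(reduce_phase(key, values)
                for key, values in shuffle_and_sort(mapper_outputs))


# Hypothetical sample input: two splits, one line each.
splits = [["big data is big"], ["data streams are big data"]]
result = word_count(splits)
print(result)
```

The combiner reduces the number of pairs that cross the network: the first split sends one ("big", 2) pair to the shuffle instead of two ("big", 1) pairs.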

UNIT-3 NoSQL
1. Write a short note on NoSQL databases. List the differences between NoSQL and
relational databases.
2. Define NoSQL database.
3. What is a key-value data store?
4. Compare document stores and key-value stores.
5. Differentiate between master-slave and peer-to-peer models.
6. List the classification of NoSQL databases and explain key-value stores.
7. What is NoSQL? What are the advantages of NoSQL? Explain the types of NoSQL
databases.
8. Explain the differences between SQL and NoSQL with a suitable example.

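The contrast asked for in question 4 (document store versus key-value store) can be illustrated with two tiny in-memory classes. This is a hedged sketch only: `KeyValueStore`, `DocumentStore`, their methods and the sample records are all hypothetical and do not correspond to any real database's API. The assumption being illustrated is that a key-value store treats the value as opaque, while a document store can look inside the value and query by field.

```python
import json


class KeyValueStore:
    """Values are opaque bytes; the only access path is the key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """Values are structured documents; fields inside them can be queried."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, document):
        self._docs[doc_id] = document

    def find(self, **criteria):
        """Return every document whose fields match all given criteria."""
        return [doc for doc in self._docs.values()
                if all(doc.get(field) == value for field, value in criteria.items())]


# Hypothetical sample records.
kv = KeyValueStore()
kv.put("user:1", json.dumps({"name": "Asha", "city": "Surat"}).encode())

docs = DocumentStore()
docs.insert("user:1", {"name": "Asha", "city": "Surat"})
docs.insert("user:2", {"name": "Ravi", "city": "Rajkot"})

by_key = kv.get("user:1")            # opaque bytes; the caller must decode them
by_field = docs.find(city="Surat")   # the store itself filters on a field
print(by_key, by_field)
```

With the key-value store, finding all users in a given city would require the application to fetch and decode every value itself.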
UNIT-4 Mining Data Streams


1. What is a data stream?
2. Explain the different applications of data streams in detail.
3. Explain the stream model and Data stream management system architecture.
4. List the problems on data streams.
5. What is Real-Time Analytics? Discuss its technologies in detail.
6. Write the applications of RTAP.
7. Define a decaying window and explain how it is applied in data analytics.
8. Write a short note on the following:
a. Counting distinct elements in a stream.
b. Finding the most popular elements using a decaying window.
9. What is Streaming Data Architecture?
10. Explain Filtering a stream in detail.
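
For questions 7 and 8(b), the idea of an exponentially decaying window can be sketched as follows: on every arrival, all existing scores are multiplied by (1 - c), the arriving element's score is increased by 1, and scores that fall below a threshold are dropped. This is a minimal sketch under stated assumptions: the function name, the decay constant `c = 0.1`, the threshold of 0.5 and the sample stream are illustrative choices, not values taken from the syllabus.

```python
def decaying_scores(stream, c=0.1, threshold=0.5):
    """Maintain exponentially decaying scores for the elements of a stream.

    c         -- decay constant (assumed small, 0 < c < 1)
    threshold -- scores below this value are discarded to bound memory
    """
    scores = {}
    for item in stream:
        # Decay every existing score and drop those that became negligible.
        for key in list(scores):
            scores[key] *= (1 - c)
            if scores[key] < threshold:
                del scores[key]
        # The arriving element contributes weight 1 at the current time.
        scores[item] = scores.get(item, 0.0) + 1.0
    return scores


# Hypothetical sample stream.
scores = decaying_scores(["a", "b", "a", "a", "c"])
most_popular = max(scores, key=scores.get)
print(most_popular, scores)
```

Recent arrivals weigh more than old ones, so an element that was frequent long ago gradually loses its score and is eventually removed from the table.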

UNIT-5 Frameworks
1. What is Zookeeper? What are the benefits of Zookeeper?
2. Draw the architecture of Apache Pig and explain it in short.
3. Explain Hive in detail.
4. What is Hive?
5. Differentiate between HBase and Hive.
6. What is HBase?

UNIT-6 Spark
1. Explain Spark components in detail. Also list the features of Spark.
2. What are the problems related to MapReduce data storage? How does Apache Spark
solve them using Resilient Distributed Datasets? Explain RDDs in detail.
3. What is Apache Spark?
4. Explain the key features of Apache Spark.
5. What are the benefits of Spark over MapReduce?
6. Describe HBase and ZooKeeper in detail.
