Big Data Qpapers

1. Big data is characterized by volume, velocity, and variety. Examples of big data sources include e-commerce sites, which generate large amounts of diverse customer data at high speed.
2. The data architecture design for big data involves distributed processing across clustered systems to handle large and diverse datasets. It uses a shared-nothing architecture with data sharding across nodes.
3. Grid computing and cluster computing are approaches for distributed processing of large datasets. Grid computing links distributed resources, while cluster computing groups nodes within an organization.

Uploaded by

Sushma S

CIE QUESTION PAPER (UG)

Global Academy of Technology, Bengaluru
Department of Artificial Intelligence and Data Science

Internal Test No: 1st IA
Date: 7 1 2    Semester: 5th    USN: A 2
Subject Name: BIG DATA ANALYTICS: TOOLS AND TECHNIQUES
Subject Code: 0 A D S 5
Time: 75 Mins.    Max. Marks: 40
Note: Answer all full questions.
No.  Questions  [CO's and Cognitive level]  [Marks]

1. List and describe the characteristics of big data. Illustrate by considering examples of e-commerce and how big data is used there. [CO1 L2] [10]
OR
2. With a neat block diagram, explain the Data Architecture design. [CO1 L2] [10]

3. Write a note on: [CO1 L2] [10]
a) Grid Computing
b) Cluster Computing
OR
4. Define Data Quality in big data. List and discuss the factors affecting Data Quality. [CO1 L2] [10]

5. Describe the following data pre-processing needs in detail: [CO1 L2] [10]
a) Data Cleaning
b) Data Enrichment
c) Data Editing
d) Data Reduction
e) Data Wrangling
OR
6. Write a note on Analytics Scalability to Big Data and Massively Parallel Processing (MPP) platforms. [CO1 L2] [10]

7. With a neat diagram, explain Hadoop's main or core components. [CO3 L2] [10]
OR
8. List all the features of the Hadoop system and explain any four in detail. [CO3 L2] [10]
CIE QUESTION PAPER (UG)

Global Academy of Technology, Bengaluru
Second Internal Assessment - January 2023    Semester:

Subject Code: 2 D S 5 3
Subject Name: BIG DATA ANALYTICS: TOOLS AND TECHNIQUES
Time: 75 mins    Max. Marks: 40
Note: Answer ALL full questions; each question carries 10 marks.

Q. No.  Questions  [CO]

Q1. [CO2]
a. List the pros and cons of distribution using sharding. (5M)
b. Give the comparison between NoSQL and SQL/RDBMS. (5M)
OR
Q2. With neat diagrams, explain the following for shared-nothing architecture for big data tasks: [CO2]
i. Single server model
ii. Sharding very large databases
iii. Master-slave distribution model
iv. Peer-to-peer distribution model

Q3. List and explain all the features of MongoDB. [CO2]
OR
Q4. Illustrate the CQL commands and their functionality. [CO2]

Q5. [CO3]
a. Explain the MapReduce framework and its programming model. (6M)
b. Write all the features of the MapReduce framework. (4M)
OR
Q6. Explain the following: [CO3]
i) HDFS block replication
ii) HDFS Safemode
iii) Rack awareness
iv) NameNode high availability

Q7. [CO3]
a. Explain any five HDFS commands with examples. (5M)
b. Illustrate HBase along with its features. (5M)
OR
Q8. Briefly explain Hadoop physical organization with a diagram. [CO3]

CIE QUESTION PAPER (UG)

Global Academy of Technology, Bengaluru
Third Internal Assessment - February 2023    Semester: V

Subject Name: BIG DATA ANALYTICS: TOOLS AND TECHNIQUES
Subject Code: A D 3
Time: 75 mins    Max. Marks: 40
Note: Answer ALL full questions; each question carries 10 marks.

Q. No.  Questions  [CO]

Q1. Define data streaming. Illustrate the data stream model in detail with a supporting diagram. [CO4]
OR
Q2. [CO4]
i) Explain the query processing architecture in the stream model. (6M)
ii) Write all the comparisons between DBMS and DSMS. (4M)

Q3. Write a note on: [CO4]
i) Stream computing
ii) Sampling data in a stream
OR
Q4. Explain the Bloom filter analysis and a variant of the Bloom filter. [CO4]

Q5. What is Hive? Illustrate the main features and architecture of Hive with a neat diagram. [CO5]
OR
Q6. Draw a diagram depicting the workflow between Hive and Hadoop, and explain each component. [CO5]

Q7. Describe the relational operators of Pig Latin for the following: [CO5]
i) Loading and Storing
ii) Filtering
iii) Grouping and Joining
iv) Combining and Splitting
OR
Q8. List and describe any 10 frequently used Pig commands. [CO5]

20ADS53

GLOBAL ACADEMY OF TECHNOLOGY, BENGALURU


(An Autonomous Institute, affiliated to VTU, Belagavi)

USN 2 A

Fifth Semester B.E. Degree Examination, February 2023


Big Data Analytics: Tools and Techniques
Time: 3 hrs.    Max. Marks: 100
Note: Answer any FIVE full questions, choosing ONE full question from each module.
Module - 1
1 a. List and explain the characteristics of big data. Also explain structured and unstructured data with an example. (10 Marks)
b. Define data quality in big data. List and explain the factors affecting data quality. (10 Marks)
OR
2 a. What are the different phases available in big data analytics? Explain each phase and draw a diagram giving an overview of a reference model for analytics architecture. (10 Marks)
b. List all the features of cloud computing and explain all the services provided by cloud computing with examples. (10 Marks)
Module - 2
3 a. With neat diagrams, explain the following for shared-nothing architecture for big data tasks: i) Master-slave replication model ii) Peer-to-peer distribution model. (10 Marks)
b. Discuss the NoSQL data stores and their characteristic features. (10 Marks)
OR
4 a. List and explain the characteristics of Cassandra. (05 Marks)
b. List and explain the features of MongoDB. (05 Marks)
c. Explain the following with respect to MongoDB: i) Dynamic schema ii) MongoDB replication. (10 Marks)
Module - 3
5 a. With a neat diagram, explain Hadoop's main components and ecosystem components. (10 Marks)
b. Explain any five HDFS commands with examples. (10 Marks)
OR
6 a. Briefly explain Hadoop physical organization with a neat diagram. (10 Marks)
b. Write a note on: i) HDFS block replication ii) HDFS NameNode federation. (10 Marks)
Module - 4
7 a. What is data streaming? Explain the data stream model in detail. (10 Marks)
b. Write a note on: i) Stream computing ii) Sampling data in a stream. (10 Marks)
OR
8 a. Explain the features of data processing using Spark Streaming. (05 Marks)
b. Briefly describe the Spark Streaming functionalities and operators. (10 Marks)
c. Write all the comparisons between DBMS and DSMS. (05 Marks)
Module - 5
9 a. What is Hive? Illustrate the main features and architecture of Hive with a neat diagram. (10 Marks)
b. List any 5 (five) Hive DDL commands and DML commands. Explain each with an example. (10 Marks)
OR
10 a. What is Apache Pig? List and explain the features and advantages of Apache Pig. (10 Marks)
b. Describe the relational operators of Pig Latin for the following: i) Loading and Storing ii) Grouping and Joining. (10 Marks)
