Bda Assignment
ANS :-
Wikipedia defines "Big Data" as a collection of data sets so large and complex that it becomes
difficult to process them using on-hand database management tools or traditional data processing
applications.
In simple terms,
"Big Data" consists of very large volumes of heterogeneous data that are being generated, often
at high speed. These data sets cannot be managed and processed using the traditional data
management tools and applications at hand. Big Data requires a new set of tools, applications
and frameworks to process and manage it. Its key characteristics are:
• Volume
• Variety
• Velocity
• Variability
(i) Volume – The name Big Data itself relates to a size that is enormous. The size of data
plays a very crucial role in determining its value. Whether a particular data set can actually
be considered Big Data or not also depends upon its volume.
Hence, ‘Volume’ is one characteristic which needs to be considered while dealing with Big
Data solutions.
(ii) Variety – Variety refers to heterogeneous sources and the nature of data, both structured
and unstructured. In earlier days, spreadsheets and databases were the only sources of data
considered by most applications. Nowadays, data in the form of emails, photos, videos,
monitoring devices, PDFs, audio, etc. is also considered in analysis applications.
This variety of unstructured data poses certain issues for storing, mining and analyzing data.
(iii) Velocity – The term ‘velocity’ refers to the speed at which data is generated. How fast
the data is generated and processed to meet demands determines the real potential in the data.
Big Data velocity deals with the speed at which data flows in from sources such as business
processes, application logs, networks, social media sites, sensors, mobile devices, etc.
The flow of data is massive and continuous.
(iv) Variability – This refers to the inconsistency which can be shown by the data at times,
hampering the ability to handle and manage the data effectively.
Q.4. :- Describe Traditional vs. Big Data business approach. Explain challenges of conventional
systems.
ANS :-
NameNode :-
NameNode can be regarded as the system’s master. It keeps track of the file system tree and
the metadata for all of the system’s files and folders. This metadata is stored in two files:
the ‘namespace image’ and the ‘edit log’. The NameNode knows which DataNodes hold the blocks
of a particular file, but it does not persist block locations; this information is rebuilt
from the DataNodes each time the system starts.
The NameNode is the HDFS controller and manager, since it is aware of the state and metadata of
all HDFS files, including file permissions, names and block locations. Because the metadata
is small, it is kept in the NameNode’s memory, allowing for faster access.
Furthermore, even though the HDFS cluster is accessed by several clients at the same time, all
of this metadata is handled by a single machine. The NameNode performs file system namespace
operations such as opening, closing, renaming, and so on.
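To make the NameNode's role concrete, the following is a minimal sketch using the standard Hadoop FileSystem Java API; the path used is purely illustrative, and the NameNode address is assumed to come from the cluster's core-site.xml. Every operation shown here is answered by the NameNode alone, since only metadata (names, permissions, block locations) is involved:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NameNodeMetadataDemo {
    public static void main(String[] args) throws Exception {
        // fs.defaultFS (the NameNode address) is read from the cluster configuration.
        FileSystem fs = FileSystem.get(new Configuration());

        Path file = new Path("/user/demo/sample.txt"); // example path, not from the assignment
        FileStatus status = fs.getFileStatus(file);    // permissions, size, owner, etc.

        // Ask the NameNode which DataNodes hold each block of the file.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
            System.out.println("offset " + b.getOffset()
                    + " hosts: " + String.join(",", b.getHosts()));
        }

        // Rename is also a pure metadata operation handled by the NameNode.
        fs.rename(file, new Path("/user/demo/sample-renamed.txt"));
    }
}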
DataNode :-
The DataNode is a commodity computer with the GNU/Linux operating system and the DataNode
software installed. In a cluster, there will be a DataNode for each node (commodity
hardware/system). These nodes are in charge of the system’s data storage.
DataNodes respond to client requests by performing read-write operations on the file system.
They also carry out actions such as block creation, deletion, and replication in accordance
with the NameNode’s instructions.
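As a rough sketch of where DataNodes come in, the snippet below writes and then reads a small file with the same FileSystem API; the client obtains block locations from the NameNode, but the actual bytes are streamed to and from the DataNodes (the path is again illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DataNodeIoDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path path = new Path("/user/demo/hello.txt"); // example path

        // Write: blocks are pipelined to DataNodes and replicated
        // according to the NameNode's instructions.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.writeUTF("hello hdfs");
        }

        // Read: bytes are fetched from whichever DataNodes hold the blocks.
        try (FSDataInputStream in = fs.open(path)) {
            System.out.println(in.readUTF());
        }
    }
}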
Q.12. :- Discuss the role of JobTracker and TaskTracker in processing data with Hadoop.
ANS :-
Job Tracker –
1. JobTracker process runs on a separate node and not usually on a DataNode.
2. JobTracker is an essential Daemon for MapReduce execution in MRv1. It is
replaced by ResourceManager/ApplicationMaster in MRv2.
3. JobTracker receives the requests for MapReduce execution from the client.
4. JobTracker talks to the NameNode to determine the location of the data.
5. JobTracker finds the best TaskTracker nodes to execute tasks based on the data
locality (proximity of the data) and the available slots to execute a task on a
given node.
6. JobTracker monitors the individual TaskTrackers and submits the overall status of
the job back to the client (a minimal job-submission sketch follows this list).
7. JobTracker process is critical to the Hadoop cluster in terms of MapReduce
execution.
8. When the JobTracker is down, HDFS will still be functional, but MapReduce
execution cannot be started and the existing MapReduce jobs will be halted.
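A minimal job-submission sketch in the classic MRv1 API is given below. The JobTracker address, the input/output paths and the WordCountReducer class are assumptions made for illustration; the WordCountMapper class is sketched after the TaskTracker list that follows:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SubmitWordCount {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SubmitWordCount.class);
        conf.setJobName("wordcount");

        // Address of the JobTracker; host and port are placeholders.
        conf.set("mapred.job.tracker", "jobtracker-host:8021");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(WordCountMapper.class);   // sketched after the TaskTracker list
        conf.setReducerClass(WordCountReducer.class); // hypothetical, not shown

        FileInputFormat.setInputPaths(conf, new Path("/user/demo/input"));   // example path
        FileOutputFormat.setOutputPath(conf, new Path("/user/demo/output")); // example path

        // JobClient hands the job to the JobTracker, which schedules map and
        // reduce tasks on TaskTrackers; this call blocks until the JobTracker
        // reports the job as finished.
        JobClient.runJob(conf);
    }
}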
TaskTracker –
1. TaskTracker runs on DataNodes, typically one on every DataNode.
2. TaskTracker is replaced by Node Manager in MRv2.
3. Mapper and Reducer tasks are executed on DataNodes administered by
TaskTrackers (a minimal Mapper sketch follows this list).
4. TaskTrackers will be assigned Mapper and Reducer tasks to execute by
JobTracker.
5. TaskTracker will be in constant communication with the JobTracker signalling
the progress of the task in execution.
6. TaskTracker failure is not considered fatal. When a TaskTracker becomes
unresponsive, JobTracker will assign the task executed by the TaskTracker to
another node.
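For completeness, here is a minimal word-count Mapper in the classic MRv1 API (the same WordCountMapper assumed in the job-submission sketch above); instances of this class are what a TaskTracker launches in child JVMs on a DataNode:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WordCountMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Split the input line into tokens and emit a (word, 1) pair for each.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                output.collect(word, ONE);
            }
        }
    }
}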