Bigdata-Bigdata (Set 1)

The document provides an introduction to big data including definitions, key concepts, and benefits. It covers the three V's of big data, different forms it can take, and advantages of big data processing. Common big data technologies like Hadoop, Spark, and Kafka are also discussed.

Uploaded by

n.tiwari1114

Chapter: Introduction to Bigdata (subject: Bigdata)

Set 1 of 2

1. Data whose size is measured in ___________ bytes is called Big Data.


A. Tera
B. Giga
C. Peta
D. Meta
Answer:C

2. How many V's of Big Data are there?


A. 2
B. 3
C. 4
D. 5
Answer:D

3. The transaction data of a bank is ______.


A. Structured data
B. Unstructured data
C. Both A and B
D. None of the above
Answer:A

4. In how many forms can Big Data be found?


A. 2
B. 3
C. 4
D. 5
Answer:B

5. Which of the following are Benefits of Big Data Processing?


A. Businesses can utilize outside intelligence while taking decisions
B. Improved customer service
C. Better operational efficiency
D. All of the above
Answer:D

6. Which of the following is not a Big Data technology?


A. Apache Hadoop
B. Apache Spark
C. Apache Kafka
D. Apache Pytarch
Answer:D

7. What percentage of the world’s total data has been created within just the
past two years?
A. 80%
B. 85%
C. 90%
D. 95%
Answer:C

8. Apache Kafka is an open-source platform that was created by?


A. LinkedIn
B. Facebook
C. Google
D. IBM
Answer:A

9. What was Hadoop named after?


A. Creator Doug Cutting’s favorite circus act
B. Cutting’s high school rock band
C. The toy elephant of Cutting’s son
D. A sound Cutting’s laptop made during Hadoop development
Answer:C

10. What are the main components of a Hadoop-based Big Data system?


A. MapReduce
B. HDFS
C. YARN
D. All of the above
Answer:D

11. All of the following accurately describe Hadoop, EXCEPT ____________


A. Open-source
B. Real-time
C. Java-based
D. Distributed computing approach
Answer:B

12. __________ has the world’s largest Hadoop cluster.


A. Apple
B. Datamatics
C. Facebook
D. None of the above
Answer:C

13. Facebook Tackles Big Data With _______ based on Hadoop.


A. Project Prism
B. Prism
C. Project Big
D. Project Data
Answer:A

14. ___________ is a general-purpose computing model and runtime system for
distributed data analytics.
A. MapReduce
B. Drill
C. Oozie
D. None of the above
Answer:A

15. The examination of large amounts of data to see what patterns or other useful
information can be found is known as ______.
A. Data examination
B. Information analysis
C. Big data analytics
D. Data analysis
Answer:C

16. Big data analysis does all of the following except:


A. Collects data
B. Spreads data
C. Organizes data
D. Analyzes data
Answer:B

17. What makes Big Data analysis difficult to optimize?


A. Big Data is not difficult to optimize
B. Both data and cost effective ways to mine data to make business sense out of it
C. The technology to mine data
D. None of the above
Answer:B

18. The new source of big data that will trigger a Big Data revolution in the years
to come is?
A. Business transactions
B. Social media
C. Transactional data and sensor data
D. RDBMS
Answer:C

19. The unit of data that flows through a Flume agent is


A. Log
B. Row
C. Record
D. Event
Answer:D

20. All of the following are steps followed to deploy a Big Data solution
except:
A. Data Processing
B. Data dissemination
C. Data Storage
D. Data Ingestion
Answer:B

21. Who popularized the term "Big Data"?


A. John Deere
B. John Mashey
C. johny Mashe
D. Jhon Mash
Answer:B

22. Numbers, text, image, audio, and video data represent ____


A. Volume
B. Value
C. Varity
D. Variety
Answer:D

23. Real time data is ______.


A. Field
B. Primary Key
C. unique
D. record
Answer:C

24. ______ is the term that is used to describe data that is high volume, high
velocity, and/or high variety.
A. Analytics
B. Bigdata
C. Hadoop Data
D. Bigdata analytics
Answer:B

25. According to analysts, for what can traditional IT systems provide a foundation
when they’re integrated with big data technologies like Hadoop?
A. Big data management and data mining
B. Data warehousing and business intelligence
C. Management of Hadoop clusters
D. Collecting and storing unstructured data
Answer:A

26. Point out the wrong statement.


A. Hadoop processing capabilities are huge and its real advantage lies in the ability to process
terabytes & petabytes of data
B. Hadoop processing capabilities are huge and its real advantage lies in the ability to process
terabytes & petabytes of data
C. The programming model, MapReduce, used by Hadoop is difficult to write and test
D. All of these
Answer:C

27. __________ can best be described as a programming model used to develop
Hadoop-based applications that can process massive amounts of data.
A. MapReduce
B. Mahout
C. Oozie
D. All of the mentioned
Answer:A
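As an illustration of the programming model named in questions 14 and 27, here is a minimal single-machine sketch of the map, shuffle, and reduce steps applied to word counting. It is plain Python, not the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample sentences are assumptions made for this example only.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of counts into one total."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map over every document, then shuffle and reduce the results."""
    pairs = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))


# Hypothetical input: two tiny "documents".
counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

In a real cluster the map and reduce functions would run in parallel on different nodes and the framework would perform the shuffle; this sketch only shows the shape of the computation.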

28. __________ has the world’s largest Hadoop cluster.


A. Apple
B. Datamatics
C. Facebook
D. None of the mentioned
Answer:C

29. Facebook Tackles Big Data With _______ based on Hadoop.


A. ‘Project Prism’
B. ‘Prism’
C. ‘Project Big’
D. ‘Project Data’
Answer:A

30. Data science is the processing of diverse sets of data through:


A. organizing data
B. processing data
C. analysing data
D. All of the above
Answer:D

31. The modern conception of data science as an independent discipline is
sometimes attributed to?
A. William S.
B. John McCarthy
C. Arthur Samuel
D. Satoshi Nakamoto
Answer:A

32. Which of the following languages is used in data science?


A. C
B. C++
C. R
D. Ruby
Answer:C

33. Which of the following is false?


A. Subsetting can be used to select and exclude variables and observations
B. Raw data should be processed only one time.
C. Merging concerns combining datasets on the same observations to produce a result with more
variables
D. None of the above
Answer:B
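The subsetting and merging described in options A and C of question 33 can be sketched in a few lines. This is a minimal plain-Python sketch; the sample records, the incomes lookup, and the helper names subset and merge are hypothetical and exist only for this example.

```python
# Hypothetical sample data: each dict is one observation, keyed by "id".
people = [
    {"id": 1, "name": "Asha", "age": 34},
    {"id": 2, "name": "Ben", "age": 19},
    {"id": 3, "name": "Chen", "age": 52},
]
incomes = {1: 52000, 2: 18000, 3: 61000}


def subset(rows, keep_columns, predicate):
    """Keep only observations matching predicate, and only the named variables."""
    return [
        {col: row[col] for col in keep_columns}
        for row in rows
        if predicate(row)
    ]


def merge(rows, extra, key, new_column):
    """Add one variable to each observation by looking up its key in extra.

    Observations whose key is missing from extra are dropped.
    """
    return [
        {**row, new_column: extra[row[key]]}
        for row in rows
        if row[key] in extra
    ]


# Subsetting: exclude the "age" variable and the observations under 21.
adults = subset(people, ["id", "name"], lambda r: r["age"] >= 21)

# Merging: same observations, one more variable ("income").
merged = merge(people, incomes, "id", "income")
print(adults)
print(merged)
```

Subsetting shrinks the data along either axis, while merging on the same observations widens it, which is why option C speaks of "more variables".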

34. What is the work of a Data Architect?


A. utilize large data sets to gather information that meets their company's needs
B. work with businesses to determine the best usage of the information yielded from data
C. build data solutions that are optimized for performance and design applications
D. All of the above
Answer:C

35. Which of the following are correct skills for a Data Scientist?


A. Probability & Statistics
B. Machine Learning / Deep Learning
C. Data Wrangling
D. All of the above
Answer:D

36. Which of the following are correct components of data science?
A. Data Engineering
B. Advanced Computing
C. Domain expertise
D. All of the above
Answer:D

37. Which of the following is not a part of the data science process?


A. Discovery
B. Model Planning
C. Communication Building
D. Operationalize
Answer:C

38. Which of the following are the Data Sources in data science?
A. Structured
B. Unstructured
C. Both A and B
D. None of the above
Answer:C

39. Which of the following is not an application of data science?


A. Recommendation Systems
B. Image & Speech Recognition
C. Online Price Comparison
D. Privacy Checker
Answer:D

40. Point out the correct statement.


A. Raw data is the original source of data
B. Preprocessed data is the original source of data
C. Raw data is the data obtained after processing steps
D. None of the above
Answer:A

41. Which of the following is one of the key data science skills?
A. Statistics
B. Machine Learning
C. Data Visualization
D. All of the above
Answer:D

42. Which of the following is a key characteristic of a hacker?


A. Afraid to say they don't know the answer
B. Willing to find answers on their own
C. Not Willing to find answers on their own
D. All of the above
Answer:B

43. Raw data should be processed only one time.


A. True
B. False
C. Can be true or false
D. Cannot say
Answer:B

44. Which of the following is the common goal of statistical modelling?


A. Inference
B. Summarizing
C. Subsetting
D. None of the above
Answer:A

45. Causal analysis is commonly applied to census data.


A. True
B. False
C. Can be true or false
D. Cannot say
Answer:B

46. Which of the following models is usually the gold standard for data analysis?
A. Inferential
B. Descriptive
C. Causal
D. All of the above
Answer:C

47. Which of the following is a revision control system?


A. Git
B. NumPy
C. SciPy
D. Slidify
Answer:A

48. Which of the following steps is performed by a data scientist after acquiring the
data?
A. Data Cleaning
B. Data Integration
C. Data Replication
D. All of the above
Answer:A
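To make the data cleaning step from question 48 concrete, the sketch below shows one possible cleaning pass over freshly acquired records. It is only an illustration under stated assumptions: the field names ("name", "amount"), the sample rows, and the clean function are hypothetical, and real cleaning rules depend on the dataset.

```python
def clean(records):
    """Trim whitespace, drop incomplete or unparseable rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        amount = (rec.get("amount") or "").strip()
        if not name or not amount:
            continue  # a required field is missing
        try:
            value = float(amount)
        except ValueError:
            continue  # the amount is not a number
        key = (name.lower(), value)
        if key in seen:
            continue  # duplicate once case and whitespace are normalised
        seen.add(key)
        cleaned.append({"name": name, "amount": value})
    return cleaned


# Hypothetical raw input with typical defects.
raw = [
    {"name": " Asha ", "amount": "10.5"},
    {"name": "asha", "amount": "10.5"},  # duplicate after normalisation
    {"name": "Ben", "amount": ""},       # missing value
    {"name": "Chen", "amount": "n/a"},   # unparseable value
    {"name": "Dev", "amount": "7"},
]
result = clean(raw)
print(result)
```

Only after a pass like this would later steps such as integration or modelling see consistent records.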

49. Which of the following focuses on the discovery of (previously) unknown
properties of the data?
A. Data mining
B. BigData
C. Data wrangling
D. Machine Learning
Answer:A

View all MCQ's at McqMate.com
