
DSBDa MCQ

1. The document contains a multiple choice question (MCQ) quiz on data science and big data analytics. 2. The quiz contains 31 questions related to topics like introduction to data science and big data, big data characteristics, data warehousing, Hadoop ecosystem components, data mining etc. 3. The questions test knowledge on key concepts, technologies, approaches and skills used for data science and analytics on large datasets.


Sinhgad Technical Education Society’s

SINHGAD INSTITUTE OF TECHNOLOGY


(Affiliated to Savitribai Phule Pune University, Pune, and approved by AICTE, New Delhi.)
Gat No. 309/310, Kusgaon (Bk), off Mumbai-Pune Expressway,
Lonavala, Pune 410401. Website: www.sinhgad.edu
Department of Information Technology

MCQ - DATA SCIENCE AND BIG DATA ANALYTICS (TE-IT)

UNIT NO.1: INTRODUCTION: DATA SCIENCE AND BIG DATA


Syllabus
Introduction to Data science and Big Data, Defining Data science and Big Data, Big Data examples,
Data explosion, Data volume, Data Velocity, Big data infrastructure and challenges, Big Data
Processing Architectures, Data Warehouse, Re-Engineering the Data Warehouse, Shared everything
and shared nothing architecture, Big data learning approaches.

Q. No MCQ Ans Marks
Q.1 According to analysts, for what can traditional IT systems provide a foundation when A [01]
they’re integrated with big data technologies like Hadoop?
A. Big data management and data mining B. Data warehousing and business
intelligence C. Management of Hadoop clusters D. Collecting and storing unstructured
data
Q.2 What are the five V’s of Big Data? D [01]
A. Volume B. Velocity C. Variety D. All the above
Q.3 Which of the following is the most important language for Data Science? C [01]
A) Java B) Ruby C) R D) None
Q.4 Which of the following approaches should be used to ask a data analysis question? B [01]
A) Find only one solution for particular problem B) Find out the question which is to be
answered C) Find out answer from dataset without asking question D) None of the
mentioned
Q.5 Which of the following is one of the key data science skills? D [01]
A) Statistics B) Machine Learning C) Data Visualization D) All of the mentioned
Q.6 Which of the following terms is appropriate for the figure below? B [01]

A) Large Data B) Big Data C) Dark Data D) None of the mentioned


Q.7 Which of the following characteristics of big data is relatively more relevant to data B [01]
science?
A) Velocity B) Variety C) Volume D) None of the mentioned
Q.8 3V’s are not sufficient to describe big data. A [01]
A) True B) False
Q.9 Which of the following focuses on the discovery of (previously) unknown properties on A [01]
the data?
A) Data mining B) Big Data C) Data wrangling D) Machine Learning
Q.10 Beyond volume, variety and velocity, big data also raises the issue of veracity. A [01]
A) True B) False
Q.11 __________ contains information that gives users an easy-to-understand perspective of D [01]
the information stored in the data warehouse.
A) Financial metadata B) Operational metadata C) Technical metadata D) Business
metadata
Q.12 The full form of OLAP is A [01]
A) Online Analytical Processing B) Online Advanced Processing C) Online Advanced
Preparation D) Online Analytical Performance

Prepared by Prof. Swapnali B. Ware, SIT-IT Department

Q.13 ……………………. is a subject-oriented, integrated, time-variant, nonvolatile B [01]
collection of data in support of management decisions.
A) Data Mining B) Data Warehousing C) Document Mining D) Text Mining
Q.14 The ……………… allows the selection of the relevant information necessary for the data A [01]
warehouse.
A) top-down view B) data warehouse view C) data source view D) business query view
Q.15 Which of the following is not a component of a data warehouse? D [01]
A) Metadata B) Current detail data C) Lightly summarized data D) Component Key
Q.16 Point out the correct statement. A [02]
A) Raw data is original source of data B) Preprocessed data is original source of data
C) Raw data is the data obtained after processing steps D) None of the mentioned
Q.17 Which of the following is performed by a data scientist? D [02]
A) Define the question B) Create reproducible code C) Challenge results D) All of the
mentioned
Q.18 Point out the wrong statement. C [02]
A) The big volume indeed represents Big Data B) The data growth and social media
explosion have changed how we look at the data C) Big Data is just about lots of data
D) All of the mentioned
Q.19 Which of the following language should be replaced with the question mark in the A [02]
below figure?

A) Java B) PHP C) COBOL D) None of the mentioned


Q.20 Which of the following is not a kind of data warehouse application? D [02]
A) Information processing B) Analytical processing C) Data mining D) Transaction
processing
Q.21 Which of the following is a good way of performing experiments in data science? D [02]
A. Measure variability B. Generalize to the problem C. Have Replication D. All of the
Mentioned
Q.22 Which of the following approaches should be used to ask a data analysis question? B [02]
A. Find only one solution for particular problem B. Find out the question which is to be
answered C. Find out answer from dataset without asking question D. None of the
mentioned
Q.23 Which of the following techniques comes under practical machine learning? B [02]
A. Bagging B. Boosting C. Forecasting D. None of the Mentioned
Q.24 _________ hides the limitations of Java behind a powerful and concise Clojure API for B [02]
Cascading.
A. Scalding B. Cascalog C. Hcatalog D. Hcalding
Q.25 What are the main components of Big Data? D [02]
A. MapReduce B. HDFS C. YARN D. All of these

Q.26 What are the different features of Big Data Analytics? D [02]
A. Open-Source B. Scalability C. Data Recovery D. All the above
Q.27 Facebook Tackles Big Data With _______ based on Hadoop A [02]
A. Project Prism B. Prism C. ProjectData D. ProjectBid
Q.28 What is a unit of data that flows through a Flume agent? B [02]
A. Record B. Event C. Row D. Log
Q.29 [02]
Q.30 [02]
Q.31 Which of the following would be more appropriate to be replaced with question mark B [03]
in the following figure?

A) Data Analysis B) Data Science C) Descriptive Analytics D) None of the mentioned


Q.32 The method by which customer data or other types of information is analyzed in an B [03]
effort to identify patterns and discover relationships between different data elements is
often referred to as:
A. Customer data management B. Data mining C. Data digging D. None
Q.33 What percentage of digital information is generated by individuals? D [03]
A. 55% B. 27.5% C. 6.5% D. 75%
Q.34 To maximize the benefits of big data analytics techniques, it is critical for companies to A [03]
select the right tools and involve people who possess analytical skills to a project.
A. True B. False
Q.35 Choose the best answer(s): which industries use so-called "Big Data" in their D [03]
day-to-day operations (choose one or many)?
A. Weather forecasting B. Marketing C. Healthcare D. All of the above
Q.36 What is the projected volume of eCommerce transactions in 2016? D [03]
A. $1 trillion B. $290.7 billion C. $197.8 billion D. $326 billion
Q.37 Investment in digital enterprises has increased by how much since 2005? C [03]
A. 40% B. 28.2% C. Over 50% D. 39.7%
Q.38 What is the recommended best practice for managing big data analytics programs? A [03]
A. Focusing on business goals and how to use big data analytics technologies to meet
them B. Adopting data analysis tools based on a laundry list of their capabilities C.
Letting go entirely of “old ideas” related to data management
D. None
Q.39 …………………… is a good alternative to the star schema. C [03]
A) Star schema B) Snowflake schema C) Fact constellation D) Star-snowflake schema
Q.40 Point out the wrong statement: A [03]
A. Elastic MapReduce (EMR) is Facebook’s packaged Hadoop offering B. Amazon
Web Service Elastic MapReduce (EMR) is Amazon’s packaged Hadoop offering
C. Scalding is a Scala API on top of Cascading that removes most Java boilerplate
D. All of the mentioned

UNIT NO.2: MATHEMATICAL FOUNDATION OF BIG DATA
Syllabus
Probability theory, Tail bounds with applications, Markov chains and random walks, Pairwise
independence and universal hashing, Approximate counting, Approximate median, the streaming
models, Flajolet Martin Distance sampling, Bloom filters, Local search and testing connectivity,
Enforce test techniques, Random walks and testing, Boolean functions, BLR test for linearity.

Q. No MCQ Ans Marks
Q.1 The expected value or _______ of a random variable is the center of its distribution. c [01]
a) mode b) median c) mean d) Bayesian inference
Q.2 Point out the correct statement. d [01]
a) Some cumulative distribution function F is non-decreasing and right-continuous b)
Every cumulative distribution function F is decreasing and right-continuous c) Every
cumulative distribution function F is increasing and left-continuous d) None of the
mentioned
Q.3 Which of the following properties of a random variable is a measure of spread? a [01]
a) variance b) standard deviation c) empirical mean d) all of the mentioned
Q.4 The square root of the variance is called the ________ deviation. d [01]
a) empirical b) mean c) continuous d) standard
Q.5 For a situation with weekly dining at either an Italian or Mexican restaurant, A [01]
a. the weekly visit is the trial and the restaurant is the state. b. the weekly visit is the state
and the restaurant is the trial. c. the weekly visit is the trend and the restaurant is the
transition. d. the weekly visit is the transition and the restaurant is the trend.
Q.6 In a toss of a fair coin, what is the probability of getting a head? C [01]
A. 1 B. 2 C. 1/2 D. 0
Q.7 Three unbiased coins are tossed, what is the probability of getting at least 2 tails? C [01]
A.1/3 B.1/6 C.1/2 D.1/8
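
The answer to Q.7 can be checked by brute force. The following is a minimal sketch, assuming each of the eight head/tail sequences is equally likely; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 2**3 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

# Keep the outcomes that contain at least two tails.
favourable = [o for o in outcomes if o.count("T") >= 2]

p_at_least_two_tails = Fraction(len(favourable), len(outcomes))
print(p_at_least_two_tails)  # 1/2
```
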
Q.8 Point out the wrong statement. C [01]
A) A percentile is simply a quantile expressed as a percent B) There are two types
of random variable C) R cannot approximate quantiles for you for common
distributions D) None of the mentioned
Q.9 Which of the following inequality is useful for interpreting variances? A [01]
A) Chebyshev B) Stautaory C) Testory D) All of the mentioned
Q.10 For continuous random variables, the CDF is the derivative of the PDF. B [01]
a) True b) False
Q.11 Which of the following random variables are the default model for random samples? A [01]
a) iid b) id c) pmd d) all of the mentioned
Q.12 Cumulative distribution functions are used to specify the distribution of multivariate A [01]
random variables.
a) True b) False
Q.13  __________ random variables are used to model rates. C [01]
a) Empirical b) Binomial c) Poisson d) All of the mentioned
Q.14 Which of the following is incorrect with respect to use of Poisson distribution? B [01]
a) Modeling event/time data b) Modeling bounded count data c) Modeling contingency
tables d) All of the mentioned
Q.15 Bernoulli random variables take (only) the values 1 and 0. A [01]
a) True b) False
Q.16 Which of these measures are used to analyze the central tendency of data? B [02]
A) Mean and Normal Distribution B) Mean, Median and Mode C) Mode, Alpha &
Range D) Standard Deviation, Range and Mean E) Median, Range and Normal
Distribution
Q.17 Five numbers are given: (5, 10, 15, 5, 15). Now, what would be the sum of deviations D [02]
of individual data points from their mean?
A) 10 B)25 C) 50 D) 0
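
Deviations from the mean always cancel, which is why the answer to Q.17 does not depend on the particular numbers. A short check using the five values from the question (variable names are illustrative):

```python
data = [5, 10, 15, 5, 15]

# Arithmetic mean of the five values: 50 / 5.
mean = sum(data) / len(data)

# Signed deviation of each point from the mean.
deviations = [x - mean for x in data]

total = sum(deviations)
print(mean, total)  # 10.0 0.0
```
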
Q.18 A test is administered annually. The test has a mean score of 150 and a standard A [02]
deviation of 20. If Ravi’s z-score is 1.50, what was his score on the test?
A) 180 B) 130 C) 30 D) 150
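
Q.18 inverts the z-score definition z = (x - mean) / standard deviation to recover the raw score. A minimal sketch, where the helper name `score_from_z` is hypothetical:

```python
def score_from_z(z, mean, std_dev):
    """Invert z = (x - mean) / std_dev to recover the raw score x."""
    return mean + z * std_dev

# Values taken from Q.18: mean 150, standard deviation 20, z-score 1.50.
ravi_score = score_from_z(1.50, 150, 20)
print(ravi_score)  # 180.0
```
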
Q.19 If a positively skewed distribution has a median of 50, which of the following statement D [02]
is true?
A) Mean is greater than 50 B) Mean is less than 50 C) Mode is less than 50 D) Both A
and C
Q.20 If the variance of a dataset is correctly computed with the formula using (n – 1) in the A [02]
denominator, which of the following option is true?
A) Dataset is a sample B) Dataset is a population C) Dataset could be either a sample or
a population D) Dataset is from a census
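
The distinction in Q.20 between the (n - 1) and n denominators can be seen with Python's standard `statistics` module. The data below are reused from Q.17 purely for illustration.

```python
import statistics

values = [5, 10, 15, 5, 15]  # illustrative data, reused from Q.17

# Sample variance: sum of squared deviations divided by (n - 1).
sample_var = statistics.variance(values)

# Population variance: the same sum divided by n.
population_var = statistics.pvariance(values)

print(sample_var, population_var)
```

The squared deviations sum to 100, so the two formulas give 100/4 and 100/5 respectively.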
Q.21 Standard deviation is robust to outliers? B [02]
A) True B) False
Q.22 [True or False] The standard normal curve is symmetric about 0 and the total area A [02]
under it is 1.
A) TRUE B) FALSE
Q.23 What is the probability of getting a sum 9 from two throws of dice? B [02]
A.1/3 B.1/9 C.1/12 D.2/9
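
Q.23 can be verified by listing the 36 equally likely ordered pairs of a two-dice throw. A small sketch, assuming fair six-sided dice:

```python
from fractions import Fraction
from itertools import product

# All ordered results of throwing two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# Pairs whose faces add up to 9.
hits = [r for r in rolls if sum(r) == 9]

p_sum_nine = Fraction(len(hits), len(rolls))
print(hits)        # [(3, 6), (4, 5), (5, 4), (6, 3)]
print(p_sum_nine)  # 1/9
```
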
Q.24 A bag contains 10 black and 20 white balls. One ball is drawn at random. What is the B [02]
probability that the ball is white?
A.1 B.2/3 C.1/3 D.4/3
Q.25 From a pack of 52 cards, Rohan draws two cards together. What is the probability B [02]
that one is a spade and one is a heart?
A.11/102 B.13/102 C.11/104 D.11/102
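
For Q.25, the count of favourable pairs is 13 spades times 13 hearts, out of all unordered pairs from 52 cards. A sketch of that arithmetic, assuming a standard deck:

```python
from fractions import Fraction
from math import comb

spades, hearts, deck = 13, 13, 52

# Choose one spade and one heart.
favourable = spades * hearts

# Number of unordered two-card hands from the full deck.
total = comb(deck, 2)

p_spade_and_heart = Fraction(favourable, total)
print(favourable, total, p_spade_and_heart)  # 169 1326 13/102
```
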
Q.26 Which of the following theorems states that the distribution of averages of iid variables, A [02]
properly normalized, becomes that of a standard normal as the sample size increases?
A) Central Limit Theorem B) Central Mean Theorem C) Centroid Limit Theorem
D) All of the mentioned
Q.27 The binomial random variables are obtained as the sum of iid Bernoulli trials. A [02]
A) True B) False
Q.28 Which of the following is the most important thing in data science? B [02]
A) answer B) question C) data D) none of the mentioned
Q.29 Which of the following approaches should be used if you can’t fix the variable? A [02]
A) randomize it B) non stratify it C) generalize it D) none of the mentioned
Q.30 If X predicts Y, it means that X causes Y. B [02]
A) True B) False
Q.31 The probability density function of a Markov process is A [03]
A) p(x1, x2, x3, ..., xn) = p(x1) p(x2|x1) p(x3|x2) ... p(xn|xn-1)
B) p(x1, x2, x3, ..., xn) = p(x1) p(x1|x2) p(x2|x3) ... p(xn-1|xn)
C) p(x1, x2, x3, ..., xn) = p(x1) p(x2) p(x3) ... p(xn)
D) p(x1, x2, x3, ..., xn) = p(x1) p(x2*x1) p(x3*x2) ... p(xn*xn-1)
Q.32 In Markov analysis, we are concerned with the probability that the B [03]
A. state is part of a system. B. system is in a particular state at a given time. C. time has
reached a steady state. D. transition will occur.
Q.33 Below, six data points are shown on a scale whose vertical lines mark unit intervals. C [03]
Which of the following lines represents the mean of the given data points, where the
scale is divided into equal units?
A) A B) B C) C D) D
Q.34 Which of the following is a possible value for the median of the below distribution? B [03]

A) 32 B) 26 C) 17 D) 40


Q.35 For the normal distributions below, which of the following options holds true? B [03]
σ1, σ2 and σ3 represent the standard deviations for curves 1, 2 and 3 respectively.

A) σ1> σ2> σ3 B) σ1< σ2< σ3 C) σ1= σ2= σ3 D) None


Q.36 Which of the graphs below shows a very strong positive correlation? B [03]

A) B) C) D)
Q.37 A certain stock price has been observed to follow a pattern. If the stock price goes up [03]
one day, there's a 20% chance of it rising tomorrow, a 30% chance of it falling, and a
50% chance of it remaining the same. If the stock price falls one day, there's a 35%
chance of it rising tomorrow, a 50% chance of it falling, and a 15% chance of it
remaining the same. Finally, if the price is stable on one day, then it has a 50-50 chance
of rising or falling the next day. Which matrix below is the transition matrix for this
Markov chain, if we list states in the order: (rising, falling, constant)?

A. B. C. D.
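
The verbal description in Q.37 translates directly into a matrix. The sketch below assumes the row-stochastic convention (each row is today's state, each column is tomorrow's state); the answer options were images and are not reproduced here, so the convention they used is unknown. The names `states` and `P` are illustrative.

```python
# States in the order given by the question.
states = ["rising", "falling", "constant"]

# P[i][j] = probability of moving from states[i] today to states[j] tomorrow.
P = [
    [0.20, 0.30, 0.50],  # after a rise: 20% rise, 30% fall, 50% same
    [0.35, 0.50, 0.15],  # after a fall: 35% rise, 50% fall, 15% same
    [0.50, 0.50, 0.00],  # after a stable day: 50-50 rise or fall
]

# Every row of a transition matrix must sum to 1.
row_sums = [sum(row) for row in P]
for name, total in zip(states, row_sums):
    print(name, round(total, 10))
```

If the options use the column-stochastic convention instead, the matching matrix would be the transpose of `P`.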

Q.38 Choose the correct transition matrix representing the Markov chain with state diagram A [03]
shown below. Assume the states are ordered with A before B.

A. B. C. D.
Q.39 Given the initial state vector (1, 0) and the transition matrix shown below, find the state [03]
vector corresponding to two steps later (n = 2).

A. (0.2002, 0.7998) B) (0.8086, 0.7998) C. (0.8086, 0.1914) D. (0.7998, 0.2002)


Q.40 A dental surgery has two operation rooms. The service times are assumed to be A [03]
independent, exponentially distributed with mean 15 minutes. Andrew arrives when
both operation rooms are empty. Bob arrives 10 minutes later while Andrew is still
under medical treatment. Another 20 minutes later Caroline arrives and both Andrew
and Bob are still under treatment. No other patient arrives during this 30- minute
interval. What is the probability that Andrew will be ready before Bob?
A. 1/2 B. 1/4 C. 2/3 D. 0
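
Q.40 rests on the memoryless property of the exponential distribution: given that both patients are still in treatment when Caroline arrives, their remaining service times are again independent exponentials with the same mean, so each is equally likely to finish first. The seeded simulation below is only a sanity check of that reasoning under the question's assumptions; the function name and trial count are arbitrary choices.

```python
import random

def estimate_andrew_first(n_trials=400_000, mean=15.0, seed=0):
    """Estimate P(Andrew finishes before Bob | both still treated at t = 30)."""
    rng = random.Random(seed)
    kept = 0
    andrew_first = 0
    for _ in range(n_trials):
        a_end = rng.expovariate(1 / mean)        # Andrew starts at t = 0
        b_end = 10 + rng.expovariate(1 / mean)   # Bob starts at t = 10
        # Condition on both still being under treatment when Caroline arrives.
        if a_end > 30 and b_end > 30:
            kept += 1
            if a_end < b_end:
                andrew_first += 1
    return andrew_first / kept

estimate = estimate_andrew_first()
print(round(estimate, 2))
```

The printed estimate should sit close to 0.5, in line with the memoryless argument.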

UNIT NO.3: BIG DATA PROCESSING
Syllabus
Big Data technologies, Introduction to Google file system, Hadoop Architecture, Hadoop Storage: HDFS,
Common Hadoop Shell commands, Anatomy of File Write and Read, NameNode, Secondary NameNode,
and DataNode, Hadoop MapReduce paradigm, Map Reduce tasks, Job, Task trackers - Cluster Setup – SSH
& Hadoop Configuration, Introduction to: NOSQL, Textual ETL processing.

Q. No MCQ Ans Marks
Q.1 All of the following accurately describe Hadoop, EXCEPT: B [01]
A. Open source B. Real-time C. Java-based D. Distributed computing approach
Q.2 __________ has the world’s largest Hadoop cluster. C [01]
A. Apple B. Datamatics C. Facebook D. None of the mentioned
Q.3 Who created the popular Hadoop software framework for storage and processing of B [01]
large datasets?
A) Larry Page B) Doug Cutting C) Richard Stallman D) Alan Cox
Q.4 Point out the wrong statement: C [01]
A. Hadoop’s processing capabilities are huge and its real advantage lies in the ability to
process terabytes & petabytes of data B. Hadoop uses a programming model called
“MapReduce”; all programs should conform to this model in order to work on the
Hadoop platform C. The programming model, MapReduce, used by Hadoop is difficult
to write and test D. All of the mentioned
Q.5 What was Hadoop named after? C [01]
A. Creator Doug Cutting’s favorite circus act B. Cutting’s high school rock band
C. The toy elephant of Cutting’s son D. A sound Cutting’s laptop made during Hadoop’s
development
Q.6 Point out the wrong statement: A [01]
A. Elastic MapReduce (EMR) is Facebook’s packaged Hadoop offering
B. Amazon Web Service Elastic MapReduce (EMR) is Amazon’s packaged Hadoop
offering
C. Scalding is a Scala API on top of Cascading that removes most Java boilerplate
D. All of the mentioned
Q.7 _______ is the most popular high-level Java API in Hadoop Ecosystem D [01]
A. Scalding
B. HCatalog
C. Cascalog
D. Cascading
Q.8 ___________ is general-purpose computing model and runtime system for distributed A [01]
data analytics.
A. Mapreduce
B. Drill
C. Oozie
D. None of the mentioned
Q.9 Which of the following genres does Hadoop produce? A [01]
A. Distributed file system
B. JAX-RS
C. Java Message Service
D. Relational Database Management System
Q.10 Which of the following platforms does Hadoop run on? C [01]
A. Bare metal
B. Debian
C. Cross-platform
D. Unix-like
Q.11 InputFormat class calls the ________ function and computes splits for each file and then C [01]
sends them to the jobtracker.
A. puts
B. gets
C. getSplits
D. All of the mentioned
Q.12 On a tasktracker, the map task passes the split to the createRecordReader() method on B [01]
InputFormat to obtain a _________ for that split.
A. InputReader
B. RecordReader
C. OutputReader
D. None of the mentioned
Q.13 _________ is a pluggable Map/Reduce scheduler for Hadoop which provides a way to C [01]
share large clusters.
A. Flow Scheduler
B. Data Scheduler
C. Capacity Scheduler
D. None of the mentioned
Q.14 Applications can use the ____________ to report progress and set application-level C [01]
status messages
A. Partitioner
B. OutputSplit
C. Reporter
D. All of the mentioned
Q.15 The right level of parallelism for maps seems to be around _________ maps per-node B [01]
A. 1-10
B. 10-100
C. 100-150
D. 150-200
Q.16 __________ is the primary interface for a user to describe a MapReduce job to the B [02]
Hadoop framework for execution.
A. JobConfig
B. JobConf
C. JobConfiguration
D. All of the mentioned
Q.17 HCatalog is installed with Hive, starting with Hive release C [02]
A. 0.10.0
B. 0.9.0
C. 0.11.0
D. 0.12.0
Q.18 _________ method clears all keys from the configuration. A [02]
A. clear
B. addResource
C. getClass
D. None of the mentioned
Q.19 Which of the following methods is used to get the user-specified job name? A [02]
A. getJobName()
B. getJobState()
C. getPriority()
D. All of the mentioned
Q.20 __________ get events indicating completion (success/failure) of component tasks. D [02]
A. getJobName()
B. getJobState()
C. getPriority()
D. getTaskCompletionEvents(int startFrom)
Q.21 The output of the _______ is not sorted in the Mapreduce framework for Hadoop. D [02]
A. Mapper
B. Cascader
C. Scalding
D. None of the mentioned
Q.22 Which of the following phases occur simultaneously? A [02]
A. Shuffle and Sort
B. Reduce and Sort
C. Shuffle and Map
D. All of the mentioned
Q.23 HDFS and NoSQL file systems focus almost exclusively on adding nodes to : A [02]
A. Scale out
B. Scale up
C. Both Scale out and up
D. None of the mentioned
Q.24 Which is the most popular NoSQL database for scalable big data store with Hadoop? A [02]
A. HBase
B. MongoDB
C. Cassandra
D. None of the mentioned
Q.25 The ________ option allows you to copy jars locally to the current working directory of A [02]
tasks and automatically unjar the files.
A. archives B. files C. task D. None of the mentioned
Q.26 The need for data replication can arise in various scenarios like: D [02]
A. Replication Factor is changed B. DataNode goes down C. Data Blocks get corrupted
D. All of the mentioned
Q.27 ________ is the slave/worker node and holds the user data in the form of Data Blocks. A [02]
A. DataNode B. NameNode C. Data block D. Replication
Q.28 Interface ____________ reduces a set of intermediate values which share a key to a B [02]
smaller set of values.
A. Mapper B. Reducer C. Writable D. Readable
Q.29 The input to a Reducer is the grouped output of a: A [02]
A. Mapper B. Reducer C. Writable D. Readable
Q.30 The ___________ executes the Mapper/Reducer task as a child process in a separate B [02]
jvm.
A. JobTracker B. TaskTracker C. TaskScheduler D. None of the mentioned
Q.31 During the execution of a streaming job, the names of the _______ parameters are D [03]
transformed.
A. vmap B. mapvim C. mapreduce D. mapred
Q.32 The standard output (stdout) and error (stderr) streams of the task are read by the B [03]
TaskTracker and logged to:
A. ${HADOOP_LOG_DIR}/user B. ${HADOOP_LOG_DIR}/userlogs C. ${HADOOP_LOG_DIR}/logs
D. None of the mentioned

Q.33 The ________ class provides the getValue() method to read the values from its B [03]
instance.
A. Get B. Result C. Put D. Value
Q.34 The _________ Server assigns regions to the region servers and takes the help of B [03]
Apache ZooKeeper for this task.
A. Region B. Master C. Zookeeper D. All of the mentioned
Q.35 _________ is a shell utility which can be used to run Hive queries in either interactive C [03]
or batch mode.
A. $HIVE/bin/hive B. $HIVE_HOME/hive C. $HIVE_HOME/bin/hive D. All of the
mentioned
Q.36 In order to turn on RPC authentication in Hadoop, set the value of the B [03]
hadoop.security.authentication property to:
A. zero B. Kerberos C. false D. None of the mentioned
Q.37 Although the Hadoop framework is implemented in Java, MapReduce applications A [03]
need not be written in:
A. Java B. C C. C# D. None of the mentioned
Q.38 Apache Hadoop YARN stands for: C [03]
A. Yet Another Reserve Negotiator B. Yet Another Resource Network C. Yet Another
Resource Negotiator D. All of the mentioned
Q.39 The updated queue configuration should be a valid one i.e. queue-capacity at each level C [03]
should be equal to:
A. 50% B. 75% C. 100% D. 0%
Q.40 An ___________ is responsible for creating the input splits, and dividing them into D [03]
records.
A. TextOutputFormat B. TextInputFormat C. OutputInputFormat D. InputFormat

UNIT NO.4: BIG DATA ANALYTICS
Syllabus

Data analytics life cycle, Data cleaning , Data transformation, Comparing reporting and analysis,
Types of analysis, Analytical approaches, Data analytics using R, Exploring basic features of R,
Exploring R GUI, Reading data sets, Manipulating and processing data in R, Functions and
packages in R, Performing graphical analysis in R, Integrating R and Hadoop, Hive, Data analytics.

Q. No MCQ Ans Marks
Q.1 Which of the following steps is performed by a data scientist after acquiring the data? A [01]
A) Data Cleansing B) Data Integration C) Data Replication D) All of the mentioned
Q.2 [01]
Q.3 [01]
Q.4 [01]
Q.5 [01]
Q.6 [01]
Q.7 [01]
Q.8 [01]
Q.9 [01]
Q.10 [01]
Q.11 [01]
Q.12 [01]
Q.13 [01]
Q.14 [01]
Q.15 [01]
Q.16 [02]
Q.17 [02]
Q.18 [02]
Q.19 [02]
Q.20 [02]
Q.21 [02]
Q.22 [02]
Q.23 [02]
Q.24 [02]
Q.25 [02]
Q.26 [02]
Q.27 [02]
Q.28 [02]
Q.29 [02]
Q.30 [02]
Q.31 [03]
Q.32 [03]
Q.33 [03]
Q.34 [03]
Q.35 [03]
Q.36 [03]

Q.37 [03]
Q.38 [03]
Q.39 [03]
Q.40 [03]
Q.41

UNIT NO.5: BIG DATA VISUALIZATION
SYLLABUS

Introduction to Data visualization, Challenges to Big data visualization, Conventional data visualization
tools, Techniques for visual data representations, Types of data visualization, Visualizing Big Data, Tools
used in data visualization, Proprietary data visualization tools, Open-source data visualization tools,
Analytical techniques used in Big data visualization, Data visualization with Tableau, Introduction to:
Pentaho, Flare, Jasper Reports, Dygraphs, Datameer Analytics Solution and Cloudera, Platfora, NodeBox,
Gephi, Google Chart API, Flot, D3, and Visual.ly.

http://www.allindiaexams.in/engineering/cse/data-science-mcq/data-analysis-research

Q. No MCQ Ans Marks
Q.1 [01]
Q.2 [01]
Q.3 [01]
Q.4 [01]
Q.5 [01]
Q.6 [01]
Q.7 [01]
Q.8 [01]
Q.9 [01]
Q.10 [01]
Q.11 [01]
Q.12 [01]
Q.13 [01]
Q.14 [01]
Q.15 [01]
Q.16 [02]
Q.17 [02]
Q.18 [02]
Q.19 [02]
Q.20 [02]
Q.21 [02]
Q.22 [02]
Q.23 [02]
Q.24 [02]
Q.25 [02]
Q.26 [02]
Q.27 [02]
Q.28 [02]
Q.29 [02]
Q.30 [02]
Q.31 Which of the following is true about the histogram given below? B [03]
A) Above histogram is unimodal
B) Above histogram is bimodal
C) Given above is not a histogram
D) None of the above

Q.32 Consider a regression line y=ax+b, where a is the slope and b is the intercept. If C [03]
we know the value of the slope then by using which option can we always find
the value of the intercept?

A) Put the value (0,0) in the regression line

B) Put any value from the points used to fit the regression line and compute the
value of b

C) Put the mean values of x & y in the equation along with the value a to get b

D) None of the above can be used
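
Option C in Q.32 works because a least-squares line always passes through the point (mean of x, mean of y), so b = mean(y) - a * mean(x). The sketch below uses made-up data and a hand-written least-squares slope to show this, and also shows that plugging in an arbitrary fitted point generally gives a different value.

```python
# Made-up data for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.5, 4.5, 7.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least-squares slope: covariance of x and y over variance of x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
a = s_xy / s_xx

# Option C: the fitted line passes through (mean_x, mean_y).
b = mean_y - a * mean_x

# Option B for comparison: solving for b with one arbitrary data point.
b_from_single_point = ys[1] - a * xs[1]

print(round(a, 4), round(b, 4), round(b_from_single_point, 4))
```

With noisy data the single-point value differs from the true intercept, which is why only the mean point can always be used.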

Q.33 [03]
Q.34 [03]
Q.35 [03]

Q.36 [03]
Q.37 [03]
Q.38 [03]
Q.39 [03]
Q.40 [03]

UNIT NO.6: BIG DATA TECHNOLOGIES APPLICATION AND IMPACT


SYLLABUS

Social media analytics, Text mining, Mobile analytics , Roles and responsibilities of Big data person,
Organizational impact, Data analytics life cycle, Data Scientist roles and responsibility, Understanding
decision theory, creating big data strategy, big data value creation drivers, Michael Porter’s valuation
creation models, Big data user experience ramifications, Identifying big data use cases.

http://www.allindiaexams.in/engineering/cse/tableau-mcq-quiz-tableau-online-test

Q. No MCQ Ans Marks
Q.1 Accurate prediction depends heavily on measuring the right variables. A [01]
A) True B) False
Q.2 [01]
Q.3 [01]
Q.4 [01]
Q.5 [01]
Q.6 [01]
Q.7 [01]
Q.8 [01]
Q.9 [01]
Q.10 [01]
Q.11 [01]
Q.12 [01]
Q.13 [01]
Q.14 [01]
Q.15 [01]
Q.16 Which of the following relationships are usually identified as average effects? B [02]
A) Descriptive B) Causal C) Predictive D) None of the mentioned
Q.17 Which of the following analyses is usually modeled by a deterministic set of equations? C [02]
A) Predictive B) Causal C) Mechanistic D) All of the mentioned
Q.18 [02]
Q.19 [02]
Q.20 [02]

Q.21 [02]
Q.22 [02]
Q.23 [02]
Q.24 [02]
Q.25 [02]
Q.26 [02]
Q.27 [02]
Q.28 [02]
Q.29 [02]
Q.30 [02]
Q.31 Which of the following is more applicable to the figure below? A [03]

A) Descriptive B) Causal C) Predictive D) None of the mentioned


Q.32 [03]
Q.33 [03]
Q.34 [03]
Q.35 [03]
Q.36 [03]
Q.37 [03]
Q.38 [03]
Q.39 [03]
Q.40 [03]
