Ocs353 DSF Question Bank 25-26
QUESTION BANK
Course Outcomes
Course Outcome No | Course Outcomes
BT Levels: K1 Remembering (Knowledge) | K2 Understanding (Comprehension) | K3 Applying (Application of Knowledge)
Q.No | Questions | Topic | BT Level | Mark
1. | Define Data Science and Big Data. (NOV/DEC 2022) | Data Science: Uses and Benefits | K1 | 2 Marks
2. | What is the role of data science in business, medical research, healthcare, education, social media, technology and financial institutions? | Data Science: Uses and Benefits | K1 | 2 Marks
3. | Write the main types/categories of data. (AU NOV/DEC 2023) | Facets of Data | K1 | 2 Marks
4. | How are missing values present in a dataset treated during the data analysis phase? (AU APR/MAY 2024) | Data Science Process | K1 | 2 Marks
5. | Define Median with example. (AU NOV/DEC 2023) | Statistical Description of Data | K2 | 2 Marks
6. | Identify and write down various data analytic challenges faced in the conventional system. (AU APR/MAY 2024) | Data Science Process | K2 | 2 Marks
9. | What is the standard approach to supervised learning? | Machine Learning | K2 | 2 Marks
10. | What is meant by the k-means algorithm? | Machine Learning | K1 | 2 Marks
11. | What is the difference between artificial learning and machine learning? | Machine Learning | K2 | 2 Marks
12. | State the DBSCAN algorithm. | Machine Learning | K2 | 2 Marks
13. | What is linear regression? | Linear Regression | K2 | 2 Marks
14. | What is the difference between classification and regression? | Regression | K2 | 2 Marks
15. | What is the difference between K-means and KNN? | Clustering | K2 | 2 Marks
16. | How is a model trained in machine learning? | Machine Learning | K2 | 2 Marks
17. | What is a classification algorithm? | Classification Algorithm | K2 | 2 Marks
18. | Define regression line. | Regression | K2 | 2 Marks
19. | State the learners in a classification problem. Mention lazy learners. | Classification Algorithm | K2 | 2 Marks
20. | What are the steps involved in the data science process? | Data Science Process | K2 | 2 Marks
PART - B
1. | Explain various learning techniques involved in Unsupervised Learning. (AU APR/MAY 2024) | Unsupervised Learning | K3 | 13 Marks
2. | Explain the types of Machine Learning. | Machine Learning | K3 | 13 Marks
3. | Assume an image has pixel size 240x180. Elaborate how K-means clustering can be used to achieve lossy data compression of that image. (Nov/Dec’23) | Clustering | K2 | 13 Marks
4. | Explain Semi Supervised Learning in detail. | Machine Learning | K2 | 13 Marks
5. | List the applications of clustering and identify the advantages and disadvantages of clustering algorithms. (AU APR/MAY 2024) | Clustering | K2 | 13 Marks
6. | What is a Classification Algorithm? Explain the steps to construct a Classification Algorithm. List and explain the different procedures used. (April/May 2023) | Classification Algorithm | K2 | 13 Marks
7. | Explain in detail about outlier analysis. (Nov/Dec’23) | Outlier | K2 | 13 Marks
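For Part B question 3 (K-means image compression), the storage saving can be estimated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the question gives the 240x180 size, but the 24-bit RGB pixel depth and the choice of K = 16 clusters are assumed here, and the function name is hypothetical.

```python
import math


def kmeans_compression_bits(width, height, k, bits_per_pixel=24):
    """Return (original_bits, compressed_bits) for K-means colour quantisation.

    Assumption: every pixel is stored with `bits_per_pixel` bits (24-bit RGB
    by default), which the question itself does not specify.
    """
    n_pixels = width * height
    original_bits = n_pixels * bits_per_pixel
    # After clustering, each pixel stores only the index of its nearest centroid.
    index_bits = math.ceil(math.log2(k))
    # The codebook keeps the K centroid colours at full precision.
    codebook_bits = k * bits_per_pixel
    compressed_bits = n_pixels * index_bits + codebook_bits
    return original_bits, compressed_bits


# K = 16 is an assumed value chosen only for illustration.
original, compressed = kmeans_compression_bits(240, 180, k=16)
ratio = compressed / original
print(original, compressed, round(ratio, 4))
```

The compression is lossy because every pixel is replaced by its cluster centroid, so the original colours cannot be recovered exactly; a smaller K gives a smaller file and a coarser image.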
PART – C (If applicable)
1. | Consider five points {x1, x2, x3, x4, x5} with the following coordinates as two-dimensional samples for clustering: x1=(0.5, 1.75), x2=(1, 2), x3=(1.75, 0.25), x4=(4, 1), x5=(6, 3). Illustrate the k-means algorithm on the above data set. The required number of clusters is two and initially, clusters are formed from random distribution of samples: C1={x1, x2, x4} and C2={x3, x5}. (April/May 2023) | Clustering | K3 | 15 Marks
2. | List non-parametric techniques and explain K-nearest neighbor estimation. | Clustering | K3 | 15 Marks
3. | The values of x and their corresponding values of y are shown below: x = 1, 2, 3, 4, 5, 6, 7; y = 3, 4, 5, 5, 6, 8, 10. i) Find the least squares regression line y=ax+b. ii) Estimate the value of y when x=10. (April/May 2023) | Linear Regression Models: Least squares | K3 | 15 Marks
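Part C questions 1 and 3 above can be checked numerically. The following is a minimal pure-Python sketch, not a model answer: it runs the standard k-means (Lloyd) iterations from the initial partition given in question 1, and applies the usual closed-form least squares formulas to the table in question 3. All function names are hypothetical, and squared Euclidean distance is assumed for the cluster assignment.

```python
def mean_point(points):
    """Centroid (coordinate-wise mean) of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_from_partition(points, clusters):
    """Iterate k-means from an initial partition given as lists of point indices.

    Note: this sketch does not handle a cluster becoming empty, which does
    not happen for the data in the question.
    """
    while True:
        centroids = [mean_point([points[i] for i in c]) for c in clusters]
        new_clusters = [[] for _ in clusters]
        for i, p in enumerate(points):
            nearest = min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
            new_clusters[nearest].append(i)
        if new_clusters == clusters:  # assignments unchanged: converged
            return clusters, centroids
        clusters = new_clusters


def least_squares_line(xs, ys):
    """Slope a and intercept b of the least squares line y = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - a * sx) / n
    return a, b


# Question 1: indices 0..4 stand for x1..x5; C1={x1,x2,x4}, C2={x3,x5}.
points = [(0.5, 1.75), (1, 2), (1.75, 0.25), (4, 1), (6, 3)]
clusters, centroids = kmeans_from_partition(points, [[0, 1, 3], [2, 4]])
print(clusters, centroids)

# Question 3: fit the line, then estimate y at x = 10.
a, b = least_squares_line([1, 2, 3, 4, 5, 6, 7], [3, 4, 5, 5, 6, 8, 10])
y_at_10 = a * 10 + b
print(a, b, y_at_10)
```

For this data the iteration settles on C1={x1, x2, x3} and C2={x4, x5}, and the fitted line has a = 15/14 and b = 11/7, which gives y of about 12.29 at x = 10.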
UNIT 4- DATA VISUALIZATION
Importing Matplotlib – Line plots – Scatter plots – visualizing errors – density and contour plots –
Histograms – legends – colors – subplots – text and annotation – customization – three dimensional
plotting – Geographic Data with Basemap – Visualization with Seaborn.
PART – A
Q.No | Questions | Topic | BT Level | Mark
1. | What is the purpose of the errorbar function in Matplotlib? Give an example. (NOV/DEC 2022) | Visualizing errors | K2 | 2 Marks
2. | Showcase 3-dimensional drawing in Matplotlib with corresponding Python code. (NOV/DEC 2022) | Three dimensional plotting | K2 | 2 Marks
3. | State the two possible options in IPython notebook used to embed graphics directly in the notebook. (APRIL/MAY 2023) | Three dimensional plotting | K2 | 2 Marks
4. | How does the plt.scatter function differ from the plt.plot function? (APRIL/MAY 2023) | Scatter plots | K2 | 2 Marks
5. | What is the purpose of Matplotlib? | Importing Matplotlib | K2 | 2 Marks
6. | Write the dual interfaces of Matplotlib. | Importing Matplotlib | K2 | 2 Marks
7. | How to draw a simple line plot using Matplotlib? | Line plots | K2 | 2 Marks
8. | What functions can be used to draw scatter plots? | Scatter plots | K2 | 2 Marks
9. | Write the difference between plot and scatter functions. | Scatter plots | K2 | 2 Marks
10. | Define contour plots. | Density and contour plots | K2 | 2 Marks
11. | What functions can be used to draw contour plots? | Density and contour plots | K2 | 2 Marks
12. | Write a Python code snippet that generates a time-series graph representing COVID-19 incidence cases for a particular week: Day 1 = 7, Day 2 = 18, Day 3 = 9, Day 4 = 44, Day 5 = 2, Day 6 = 5, Day 7 = 89. (AU APR/MAY 2024) | Histograms | K2 | 2 Marks
13. | Write a Python code snippet that draws a histogram for the following list of positive numbers: 7 18 9 44 2 5 89 91 11 6 77 85 91. (AU APR/MAY 2024) | Histograms | K2 | 2 Marks
14. | How to create a 3-D wireframe plot? | Three dimensional plotting | K2 | 2 Marks
15. | Define surface plot. | Three dimensional plotting | K2 | 2 Marks
19. | Define scatter plots. | Scatter plots | K2 | 2 Marks
20. | Define cylindrical projections. | Scatter plots | K2 | 2 Marks
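Part A questions 12 and 13 above both ask for short plotting snippets. One possible sketch, using the data given in those questions, is shown below. The axis labels, title, bin count of 5, and the non-interactive "Agg" backend are assumptions made here so the script runs without a display; they are not part of the questions.

```python
import matplotlib
matplotlib.use("Agg")  # assumed: non-interactive backend, no display needed
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5, 6, 7]
cases = [7, 18, 9, 44, 2, 5, 89]                           # question 12 data
values = [7, 18, 9, 44, 2, 5, 89, 91, 11, 6, 77, 85, 91]   # question 13 data

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Question 12: time-series line plot of daily incidence cases.
(line,) = ax_line.plot(days, cases, marker="o")
ax_line.set_xlabel("Day")
ax_line.set_ylabel("Incidence cases")
ax_line.set_title("COVID-19 cases over one week")

# Question 13: histogram of the list; bins=5 is an arbitrary choice.
counts, bin_edges, _ = ax_hist.hist(values, bins=5, edgecolor="black")
ax_hist.set_xlabel("Value")
ax_hist.set_ylabel("Frequency")

fig.tight_layout()
# In an interactive session, drop the Agg line and call plt.show() here.
```

`ax.hist` returns the bin counts and bin edges, which is handy for checking that every value landed in some bin.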
PART – B
1. | Explain about Matplotlib with its import, setting styles and displaying the plots. (NOV/DEC 2022) | Text and annotation | K3 | 13 Marks
2. | Appraise the following (i) Histograms (ii) Binnings (iii) Density with appropriate Python code. (NOV/DEC 2022) | Histogram | K3 | 13 Marks
3. | Explain in detail about three dimensional plotting in Matplotlib. (AU NOV/DEC 2023) | Customization | K1 | 13 Marks
4. | Explain various features of the Matplotlib platform used for data visualization and illustrate its challenges. (AU NOV/DEC 2023) | Importing Matplotlib | K1 | 13 Marks
5. | Explain contour plot and density. | Density and contour plots | K2 | 13 Marks
6. | Write a code snippet that projects our globe as a 2-D flat surface (using cylindrical projection) and conveys information about the location of any three major Indian cities in the map (using scatter plot). (AU APR/MAY 2024) | Three dimensional plotting | K3 | 13 Marks
7. | (i) Write a working code that performs a simple Gaussian process regression (GPR), using the Scikit-Learn API. (ii) Briefly explain about visualization with Seaborn. Give an example working code segment that represents a 2-D kernel density plot for any data. (AU APR/MAY 2024) | Visualization with Seaborn | K3 | 13 Marks
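For Part B question 5 (contour plots), a small sketch may help fix the idea: a contour plot draws a function z = f(x, y) on a 2-D grid, either as level lines (`contour`) or as filled bands (`contourf`). The test function, grid sizes, level counts, and colormap below are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumed: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np


def f(x, y):
    """An arbitrary smooth test function, assumed only for illustration."""
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)


x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)  # 2-D grids of shape (40, 50)
Z = f(X, Y)               # height value at every grid point

fig, (ax_lines, ax_filled) = plt.subplots(1, 2, figsize=(10, 4))

# Level lines only; with a single colour, negative levels are drawn dashed.
ax_lines.contour(X, Y, Z, levels=10, colors="black")

# Filled contours with a colour bar that maps colours to z values.
filled = ax_filled.contourf(X, Y, Z, levels=20, cmap="RdGy")
fig.colorbar(filled, ax=ax_filled)
```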
PART – C (If applicable)
1. | Perform an exploratory data analysis for the following data with different types of plots. The dataset contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago’s Billings Hospital on the survival of patients who had undergone surgery for breast cancer. Data attributes: Age of patient at the time of operation (numerical); Patient’s year of operation (year - 1900, numerical); Number of positive axillary nodes detected (numerical); Survival status (class attribute): 1 = the patient survived 5 years or longer, 2 = the patient died within 5 years. (NOV/DEC 2022) | Visualization with Seaborn | K3 | 15 Marks
2. | Explain in detail about Visualizing a Mobius Strip. | Three dimensional plotting | K3 | 15 Marks
3. | Explain about Geographic data with Basemap with different Map Projections, Map background and Plotting data in Maps. | Geographic data with Basemap | K3 | 15 Marks
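For Part C question 2 (Mobius strip), one common way to visualize the strip is to parametrize it by an angle around the loop and a position across the width, then let the strip rotate by half a turn over one full loop. The sketch below assumes a loop radius of 1, a strip half-width of 0.25, and the grid resolution shown; none of these values come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumed: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

theta = np.linspace(0, 2 * np.pi, 30)  # angle around the loop
w = np.linspace(-0.25, 0.25, 8)        # position across the strip width
w, theta = np.meshgrid(w, theta)       # both arrays have shape (30, 8)

phi = 0.5 * theta                      # half a twist per full loop

# Radius in the x-y plane, then Cartesian coordinates of the surface.
r = 1 + w * np.cos(phi)
x = r * np.cos(theta)
y = r * np.sin(theta)
z = w * np.sin(phi)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(x, y, z, cmap="viridis")
ax.set_zlim(-1, 1)
```

Because of the half twist, the edge at theta = 2π meets the edge at theta = 0 with the width direction reversed, which is what makes the surface one-sided.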
UNIT 5 HANDLING LARGE DATA
Problems - techniques for handling large volumes of data - programming tips for dealing with large
data sets- Case studies: Predicting malicious URLs, Building a recommender system - Tools and
techniques needed - Research question - Data preparation - Model building – Presentation and
automation.
PART – A
Q.No | Questions | Topic | BT Level | Mark
1. | What is meant by large data? | Techniques for handling large volumes of data | K2 | 2 Marks