ML Mqp2 Solved

Uploaded by

csumant94
Copyright
© © All Rights Reserved

ML

MQP-2

SECTION-A

I. 4*2=8

1. What is Supervised ML? Give an example

Ans:

2. Why is Python used for ML?

Ans: Python is a powerful, open-source, high-level, interpreted, object-oriented programming language.

• Python is easy to understand.
• Python comes with a large number of libraries.
• Python allows easy and powerful implementation.
• Friendly syntax and human-level readability.
• Large, active community.

3. What is data preparation?

Ans:

4. What is regression? Give an example.

Ans:

5. What is Discrete Output variable? Give an example.

Ans:

6. Mention two kinds of Unsupervised Learning.

Ans: i) Clustering

ii) Association

SECTION-B
II. 5*4=20
7. What is Unsupervised ML? Explain key components of Unsupervised ML.

Ans: Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision.

Key components:

• Data Preprocessing:
   • Normalization/Standardization: Scaling the data to ensure features have similar ranges.
   • Dimensionality Reduction: Techniques like Principal Component Analysis (PCA) to reduce the number of features while retaining important information.
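As a minimal sketch of the standardization step described above, the following rescales each feature to zero mean and unit variance (z-scores). The function name `standardize` and the sample columns are illustrative assumptions, not part of the original answer.

```python
from statistics import mean, pstdev

def standardize(column):
    """Return z-scores (x - mean) / std for one feature column."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Two hypothetical features on very different scales.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
incomes = [20000.0, 35000.0, 50000.0, 65000.0, 80000.0]

scaled_heights = standardize(heights_cm)
scaled_incomes = standardize(incomes)
```

After scaling, both columns lie in the same range, so a distance-based algorithm such as K-Means is not dominated by the feature with the larger raw values.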

• Algorithms:

1. Clustering: Grouping data points into clusters based on similarity.
   • K-Means: Partitions data into K clusters by minimizing variance within each cluster.
   • Hierarchical Clustering: Builds a tree of clusters, merging or splitting them based on distance criteria.
   • DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Forms clusters based on the density of data points.

2. Association: Finding rules that describe large portions of the data.
   • Apriori Algorithm: Identifies frequent itemsets and derives association rules.
   • Eclat Algorithm: Similar to Apriori but uses a depth-first search.

3. Dimensionality Reduction:
   • PCA: Reduces the dimensionality of data while preserving as much variance as possible.
   • t-SNE (t-distributed Stochastic Neighbor Embedding): Visualizes high-dimensional data in lower dimensions (typically 2D or 3D).
   • Autoencoders: Neural networks used for learning efficient encodings of data.
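To illustrate the clustering idea, here is a small K-Means sketch in plain Python. It alternates an assignment step and a centroid-update step until the centroids stop moving. The function name `kmeans`, its parameters, and the sample points are assumptions made for this example; a library such as scikit-learn would normally be used in practice.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Cluster 2D (or nD) tuples into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct starting points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels

# Two hypothetical, well-separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

The first three points end up in one cluster and the last three in the other; which cluster gets index 0 depends on the random starting centroids.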

• Evaluation Metrics:
   • Silhouette Score: Measures how similar a data point is to its own cluster compared to other clusters. It ranges from -1 to 1, and higher values indicate better-separated clusters.
   • Dunn Index: Ratio of the minimum inter-cluster distance to the maximum intra-cluster distance.
   • Elbow Method: Used with K-Means to find the optimal number of clusters by plotting the explained variance as a function of the number of clusters and choosing the point where the curve bends (the "elbow").
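The silhouette score can be computed directly from its definition: for each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to any other cluster, and the point's score is (b - a) / max(a, b). The sketch below is a plain-Python illustration under that definition; the function name and the sample data are assumptions.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points."""
    scores = []
    for i, p in enumerate(points):
        # Collect distances from p to every other point, grouped by cluster.
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.get(labels[i])
        others = [sum(d) / len(d) for lab, d in dists.items() if lab != labels[i]]
        if not own or not others:
            scores.append(0.0)  # convention for singleton clusters / one cluster
            continue
        a = sum(own) / len(own)   # cohesion: mean intra-cluster distance
        b = min(others)           # separation: nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])  # natural grouping
bad = silhouette_score(points, [0, 1, 0, 1, 0, 1])   # groups mixed together
```

A labeling that follows the natural groups scores close to 1, while a labeling that mixes the groups scores much lower.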

• Applications:
   • Anomaly Detection: Identifying unusual data points that do not fit the general pattern.
   • Market Basket Analysis: Finding associations between products purchased together.
   • Customer Segmentation: Grouping customers based on purchasing behavior or other characteristics.
   • Dimensionality Reduction for Visualization: Reducing data to 2D or 3D to visualize patterns.
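Market basket analysis, mentioned above, can be sketched with a simplified Apriori-style pass: count single items first, discard the infrequent ones, then count only pairs of the surviving items. This covers just the first two levels of Apriori; the transactions, function names, and the 0.4 support threshold are made-up values for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs meeting min_support."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    # Apriori pruning: a pair can only be frequent if both items are frequent.
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t & frequent_items), 2):
            pair_counts[pair] += 1
    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    with_antecedent = [t for t in transactions if antecedent in t]
    return sum(consequent in t for t in with_antecedent) / len(with_antecedent)

pairs = frequent_pairs(transactions, min_support=0.4)
butter_to_bread = confidence(transactions, "butter", "bread")
bread_to_butter = confidence(transactions, "bread", "butter")
```

Here every basket containing butter also contains bread, so the rule butter -> bread has higher confidence than bread -> butter.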

8. What is SciPy? Why is it needed for ML? Explain its features.

Ans: SciPy is a Python library for solving mathematical problems and running scientific algorithms.

• It is built on top of the NumPy library and extends it with scientific and mathematical routines such as matrix rank, matrix inverse, polynomial equations, LU decomposition, etc.
• Using its high-level functions significantly reduces the complexity of the code and helps in analyzing the data.

Installation of SciPy: pip install scipy
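A short sketch of the operations named above, assuming NumPy and SciPy are installed. Note that the rank is computed with NumPy's `numpy.linalg.matrix_rank`, while the inverse and LU decomposition come from `scipy.linalg`. The matrix `A` is an arbitrary example.

```python
import numpy as np
from scipy import linalg

# An arbitrary invertible 2x2 matrix used only for illustration.
A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

rank = np.linalg.matrix_rank(A)  # number of linearly independent rows/columns
A_inv = linalg.inv(A)            # matrix inverse
P, L, U = linalg.lu(A)           # LU decomposition with pivoting: A = P @ L @ U
```

Multiplying `A` by `A_inv` gives the identity matrix, and multiplying `P`, `L`, and `U` back together reproduces `A`.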

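Need for SciPy in ML:

SciPy is a versatile and highly capable scientific computing library that is widely used across the scientific computing community. ML code relies on it for optimization, linear algebra, and statistics routines, and libraries such as scikit-learn are built on top of it.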

Features:

• Basic Numerical Functions
• Linear Algebra (scipy.linalg)
• Optimization (scipy.optimize)
• Data Visualization (together with Matplotlib)
• Statistics (scipy.stats)
• Image Processing (scipy.ndimage)

9. How to handle missing values and outliers? Explain with an example.

Ans:

10. Explain the process of getting data?

Ans:

11. Explain the difference between Regression and Classification.

Ans:

12. Explain the limitations of K-Means clustering.

Ans:

SECTION-C
III. 4*8=32

13. Explain the main challenges of ML?

Ans:

14. How does Semi-Supervised ML work? Explain with an example.

Ans: Semi-supervised learning is a type of machine learning that occupies the middle ground between supervised and unsupervised learning. It uses a combination of labeled and unlabeled data during training, typically a small labeled set together with a much larger unlabeled set.
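One common way to combine the two kinds of data is self-training (pseudo-labeling): start from the few labeled points, repeatedly give the closest unlabeled point the label of its nearest labeled neighbour, and treat it as labeled from then on. The sketch below is one possible illustration of that idea, not the only semi-supervised method; the function name `self_train` and the sample points are assumptions.

```python
import math

def self_train(labeled, unlabeled):
    """Propagate labels to unlabeled points, most confident (nearest) first.

    labeled:   list of (point, label) pairs
    unlabeled: list of points
    Returns the full list of (point, label) pairs.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Find the unlabeled point that lies closest to any labeled point.
        _, idx, label = min(
            (
                (math.dist(u, p), idx, lab)
                for idx, u in enumerate(remaining)
                for p, lab in labeled
            ),
            key=lambda item: item[0],
        )
        # Adopt the neighbour's label and move the point to the labeled set.
        labeled.append((remaining.pop(idx), label))
    return labeled

# Two labeled examples and five unlabeled ones (hypothetical data).
seed_labels = [((1.0, 1.0), "A"), ((9.0, 9.0), "B")]
unlabeled = [(1.5, 1.2), (2.5, 2.0), (8.0, 8.5), (7.0, 7.5), (4.0, 3.5)]

result = dict(self_train(seed_labels, unlabeled))
```

The point (4.0, 3.5) is far from both labeled seeds, but it receives label "A" because labels have already spread to (2.5, 2.0) by the time it is processed. This is how the unlabeled data helps.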

15. a. How to create a Test Set?

b. Why is data reduction important in ML?

Ans: (a) Test set: a subset of the data used to put the trained model to the test.

A test set is created by splitting the original dataset into distinct partitions, so that the model is evaluated on data it did not see during training.
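A minimal sketch of such a split in plain Python: shuffle the rows with a fixed seed, then hold out a fraction as the test set. The function name, the 20% ratio, and the seed are illustrative assumptions; scikit-learn offers a ready-made `train_test_split` for the same purpose.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle rows reproducibly and return (train, test) partitions."""
    rng = random.Random(seed)   # fixed seed so the split is repeatable
    shuffled = rows[:]          # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))          # stand-in for ten dataset rows
train, test = train_test_split(data)
```

The two partitions are disjoint and together contain every original row, so no test example leaks into training.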

(b)

16. a. What is Logistic Regression? Explain how it works.

b. Write a program in Python for spam email detection using the Naive Bayes classification algorithm.

Ans: (a)

"OR"

(b)

17. a. Write the decision tree algorithm. Explain how it works.

b. Explain how a cluster is formed in the DBSCAN clustering algorithm.

Ans:

(b)

18. a. Explain the types of clustering methods or techniques.

Ans:

(b) Write Python code to use clustering in Semi-Supervised ML.

Ans:

