
Ballari Institute of Technology and Management

Department of Artificial Intelligence and Machine Learning


Course: Machine Learning        Sem: 5th Sem (A, B, C)
Assignment 3
1. Show how SVMs make predictions using quadratic programming and kernelized SVMs.
SVMs make predictions by solving an optimization problem using quadratic programming (QP), and they use kernels to handle nonlinear decision boundaries.
The goal of an SVM is to find the hyperplane that best separates the data points of the different classes by maximizing the margin.


The figure shows the iris dataset. The two classes are easily separated with a straight line. The left plot shows the decision boundaries of three possible linear classifiers. The dashed line does not separate the classes properly. The solid line on the right represents the decision boundary of an SVM classifier; it stays as far away from the training instances as possible while fitting the widest possible street between the classes. This is called large margin classification.
The hard margin and soft margin problems are both convex optimization problems with linear constraints, known as quadratic programming (QP) problems. The general problem formulation is given below.
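For reference, the hard-margin linear SVM objective is commonly written as:

minimize (1/2) ||w||²  over w, b
subject to  t(i) · (w · x(i) + b) ≥ 1   for i = 1, …, m

where t(i) = 1 for positive instances and t(i) = −1 for negative instances. The soft-margin version adds slack variables ζ(i) ≥ 0 to the constraints and the penalty term C · Σ ζ(i) to the objective; both fit the general QP form and can be handed to an off-the-shelf QP solver. Once w and b are found, the classifier predicts the positive class if w · x + b ≥ 0 and the negative class otherwise. In the kernelized case, the decision function becomes a sum over the support vectors of α(i) t(i) K(x(i), x) + b, so predictions only require kernel evaluations against the support vectors.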
To apply a 2nd-degree polynomial transformation to a two-dimensional training set, train a linear SVM classifier on the transformed training set. A 2nd-degree polynomial mapping function is shown below.
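A common choice for this mapping (the standard 2nd-degree example) is:

φ(x) = φ((x1, x2)) = (x1², √2·x1·x2, x2²)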

The dot product of transformed vectors is equal to the square of the dot
product of the original vectors.
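A quick numeric check of this identity (a minimal sketch; the vectors a and b are arbitrary examples):

import numpy as np

def phi(x):
    # 2nd-degree polynomial mapping of a 2D vector (x1, x2)
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])
print(phi(a) @ phi(b))   # 16.0
print((a @ b) ** 2)      # 16.0 -- same value, without computing phi explicitly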
2. Discuss non-linear SVM classification. How can you use the polynomial kernel and the Gaussian RBF kernel?
For a nonlinear dataset, we can add features to transform it into a linearly separable dataset. For example, for a 1D dataset with a single feature x1, adding a second feature x2 = (x1)² can make the resulting 2D dataset perfectly linearly separable.
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=100, noise=0.15)   # moons dataset
polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),
    ("scaler", StandardScaler()),
    ("svm_clf", LinearSVC(C=10, loss="hinge"))
])
polynomial_svm_clf.fit(X, y)

Adding polynomial features at a low degree cannot handle very complex datasets, while a high degree creates a huge number of features and makes the model slow. With SVMs you can instead apply the kernel trick, which gives the same result as adding many polynomial features without actually adding them. The following code applies a polynomial kernel to the moons dataset.
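A minimal sketch (reusing the X, y moons data from above; the degree, coef0 and C values are illustrative):

from sklearn.svm import SVC

poly_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5))
])
poly_kernel_svm_clf.fit(X, y)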

Another technique to handle nonlinear problems is to add similarity features, computed with a similarity function that measures how much each instance resembles a particular landmark. Take the similarity function to be the Gaussian Radial Basis Function (RBF) with γ = 0.3:

φγ(x, ℓ) = exp(−γ · ||x − ℓ||²)
It is a bell-shaped function varying from 0 (very far away from the
landmark) to 1 (at the landmark). Now we are ready to compute the new
features. For example, let’s look at the instance x1 = –1: it is located at a
distance of 1 from the first landmark, and 2 from the second landmark.
Therefore its new features are x2 = exp(–0.3 × 1²) ≈ 0.74 and x3 = exp(–0.3 × 2²) ≈ 0.30. The plot on the right of the figure shows the transformed
dataset (dropping the original features). It is now linearly
separable.

The following code trains an SVM classifier with the Gaussian RBF kernel using the SVC class:

from sklearn.svm import SVC

rbf_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001))
])
rbf_kernel_svm_clf.fit(X, y)
The plots show models trained with different values of the
hyperparameters gamma (γ) and C. Increasing gamma makes the bell-shaped
curve narrower (left plot), and as a result each instance’s range of influence is
smaller: the decision boundary ends up being more irregular, wiggling around
individual instances. Conversely, a small gamma value makes the bell-shaped
curve wider, so instances have a larger range of influence, and the decision
boundary ends up smoother. So γ acts like a regularization hyperparameter: if
your model is overfitting, you should reduce it, and if it is underfitting, you
should increase it.

3.​ Explain how decision trees are trained, visualized and used in making
predictions.
Decision Trees are versatile Machine Learning algorithms and the fundamental components of some of the most powerful ensemble methods (such as Random Forests). Below we train, visualize, and use a Decision Tree to make predictions.
Figure 6-1. Iris Decision Tree
The following code trains a DecisionTreeClassifier on the iris dataset
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
iris = load_iris()
X = iris.data[:, 2:] # petal length and width
y = iris.target
tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)
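To visualize the trained tree, one option (a sketch assuming Graphviz is installed; the output filename is arbitrary) is to export it to a .dot file with export_graphviz:

from sklearn.tree import export_graphviz

export_graphviz(
    tree_clf,
    out_file="iris_tree.dot",                # arbitrary output path
    feature_names=iris.feature_names[2:],    # petal length and width
    class_names=iris.target_names,
    rounded=True,
    filled=True
)
# Convert the .dot file to an image with the Graphviz command line, e.g.:
#   dot -Tpng iris_tree.dot -o iris_tree.png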
Figure 6-1 shows how the tree classifies an iris flower. Start at the root node
(depth 0, at the top): this node asks whether the flower’s petal length is
smaller than 2.45 cm. If it is, you move down to the root’s left child
node (depth 1, left). In this case, it is a leaf node (i.e., it does not have any
children), so it does not ask any questions: you simply look at the
predicted class for that node, and the Decision Tree predicts that the flower is an
Iris-Setosa (class=setosa).
Now suppose you find another flower, but this time the petal length is
greater than 2.45 cm. You move down to the root’s right child node
(depth 1, right), which is not a leaf node, so it asks another question: is
the petal width smaller than 1.75 cm? If it is, then the flower is most likely
an Iris-Versicolor (depth 2, left). If not, it is likely an Iris-Virginica (depth
2, right).
For example, the depth-2 left node has a gini score equal to 1 – (0/54)² – (49/54)² – (5/54)² ≈ 0.168.
Suppose we find a flower whose petals are 5 cm long and 1.5 cm wide. The corresponding
leaf node is the depth-2 left node, so the Decision Tree should output the
following probabilities: 0% for Iris-Setosa (0/54), 90.7% for Iris-Versicolor (49/54),
and 9.3% for Iris-Virginica (5/54). If you ask it to predict the class, it
should output Iris-Versicolor (class 1), since that has the highest probability. Let’s check
this:

>>> tree_clf.predict_proba([[5, 1.5]])
array([[0. , 0.90740741, 0.09259259]])
>>> tree_clf.predict([[5, 1.5]])
array([1])

4.​ Explain Bagging and pasting with an example.


One way to get a diverse set of classifiers is to use very different training
algorithms. Another approach is to use the same training algorithm for
every predictor, but to train them on different random subsets of the training set.
When sampling is performed with replacement, this method is called bagging
(short for bootstrap aggregating). When sampling is performed without
replacement, it is called pasting. Both bagging and pasting allow training
instances to be sampled several times across multiple predictors, but only
bagging allows training instances to be sampled several times for the same
predictor. This sampling and training process is represented in Figure 7-4.
Once all predictors are trained, the ensemble can make a prediction for a new
instance by simply aggregating the predictions of all predictors. The aggregation
function is typically the statistical mode (i.e., the most frequent prediction, just
like a hard voting classifier) for classification, or the average for regression.
Each individual predictor has a higher bias than if it were trained on the original
training set, but aggregation reduces both bias and variance. Generally, the net
result is that the ensemble has a similar bias but a lower variance than a single
predictor trained on the original training set.
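A minimal sketch of both variants with Scikit-Learn’s BaggingClassifier (assuming X_train, y_train are an existing training set, e.g. a moons split; the hyperparameter values are illustrative):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Bagging: each predictor sees a bootstrap sample (sampling WITH replacement)
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1)
bag_clf.fit(X_train, y_train)

# Pasting: identical setup, but sampling WITHOUT replacement
paste_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=False, n_jobs=-1)
paste_clf.fit(X_train, y_train)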

5. Explain the CART algorithm and regularization hyperparameters in Decision Trees.
Scikit-Learn uses the Classification And Regression Tree (CART) algorithm to train
Decision Trees (also called “growing” trees). The idea is quite simple: the
algorithm first splits the training set into two subsets using a single feature
k and a threshold tk (e.g., “petal length ≤ 2.45 cm”).
The cost function that the algorithm tries to minimize is given by
Equation 6-2.
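For reference, the usual textbook form of this cost function (for classification) is:

J(k, tk) = (m_left / m) · G_left + (m_right / m) · G_right

where G_left / G_right measure the impurity of the left/right subset and m_left / m_right are the number of instances in the left/right subset.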
To avoid overfitting the training data, you need to restrict the Decision
Tree’s freedom during training. As you know by now, this is called
regularization. The figure shows two Decision Trees trained on the moons
dataset (introduced in Chapter 5). On the left, the Decision Tree is trained
with the default hyperparameters (i.e., no restrictions), and on the right
the Decision Tree is trained with min_samples_leaf=4. It is quite obvious
that the model on the left is overfitting, and the model on the right will
probably generalize better.

6. What is Boosting? Explain AdaBoost and Gradient Boosting.


Boosting refers to any Ensemble method that can combine several weak
learners into a strong learner. The general idea of most boosting methods is to
train predictors sequentially, each trying to correct its predecessor. There are
many boosting methods available, but by far the most popular are AdaBoost
(short for Adaptive Boosting) and Gradient Boosting. One way for a new predictor to correct its predecessor is to pay a bit
more attention to the training instances that the predecessor underfitted. This
results in new predictors focusing more and more on the hard cases. This is the
technique used by AdaBoost.
For example, to build an AdaBoost classifier, a first base classifier (such as a
Decision Tree) is trained and used to make predictions on the training set. The
relative weight of misclassified training instances is then increased. A second
classifier is trained using the updated weights and again it makes predictions on
the training set, weights are updated, and so on (see Figure 7-7).
Figure 7-8 shows the decision boundaries of five consecutive predictors on the
moons dataset (in this example, each predictor is a highly regularized SVM
classifier with an RBF kernel). The first classifier gets many instances wrong,
so their weights get boosted. The second classifier therefore does a better job on
these instances, and so on. The plot on the right represents the same sequence of
predictors except that the learning rate is halved (i.e., the misclassified instance
weights are boosted half as much at every iteration). As you can see, this
sequential learning technique has some similarities with Gradient Descent,
except that instead of tweaking a single predictor’s parameters to minimize a
cost function, AdaBoost adds predictors to the ensemble, gradually making it
better.

Once all predictors are trained, the ensemble makes predictions very much like
bagging or pasting. The following code trains an AdaBoost classifier based on 200 Decision Stumps:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=200,
    algorithm="SAMME.R", learning_rate=0.5)
ada_clf.fit(X_train, y_train)
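Gradient Boosting also adds predictors sequentially, but instead of re-weighting instances at every iteration, each new predictor is fit to the residual errors made by the previous one. A minimal sketch (assuming X, y are a regression training set; the hyperparameter values are illustrative):

from sklearn.ensemble import GradientBoostingRegressor

gbrt = GradientBoostingRegressor(max_depth=2, n_estimators=3, learning_rate=1.0)
gbrt.fit(X, y)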

7. What is Bayes' theorem?

Refer to the answer given in class.

8.​ Discuss the minimum description length algorithm.

9.​ Explain the steps in Gibbs Algorithm


10. Write the EM algorithm and explain it in detail.
11. Explain the Naïve Bayes classifier with an example.
12.​Implement a Support Vector Machine (SVM) model to classify a dataset
with multiple classes. Explain the steps taken to preprocess the data, train
the model, and optimize its performance. Include the methods used for
hyperparameter tuning and evaluation of the final model.
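One possible outline (a sketch only; the dataset, parameter grid, and split below are illustrative assumptions, using the multi-class iris dataset):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# 1. Load a multi-class dataset and hold out a test set
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# 2. Preprocess (feature scaling) and train an SVC inside a pipeline
svm_pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC(kernel="rbf"))
])

# 3. Hyperparameter tuning: grid search over C and gamma with cross-validation
param_grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": [0.01, 0.1, 1]}
grid = GridSearchCV(svm_pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

# 4. Evaluate the tuned model on the held-out test set
print(grid.best_params_)
print(classification_report(y_test, grid.predict(X_test)))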

13.​What are the main differences between linear and nonlinear Support
Vector Machines?
The fundamental idea behind SVMs is best explained with some pictures.
The figure shows part of the iris dataset. The two classes can clearly be separated
easily with a straight line (they are linearly separable). The left plot
shows the decision boundaries of three possible linear classifiers. The
model whose decision boundary is represented by the dashed line is so
bad that it
does not even separate the classes properly. The other two models work
perfectly on this training set, but their decision boundaries come so close
to the instances that these models will probably not perform as well on
new instances. In contrast, the solid line in the plot on the right represents
the decision boundary of an SVM classifier; this line not only separates
the two classes but also stays as far away from the closest training
instances as possible. You can think of an SVM classifier as fitting the
widest possible street (represented by the parallel dashed lines) between
the classes.
This is called large margin classification.

Soft Margin Classification


If we strictly impose that all instances be off the street and on the right
side, this is called hard margin classification. There are two main issues
with hard margin classification. First, it only works if the data is linearly
separable, and second it is quite sensitive to outliers. Figure 5-3 shows the
iris dataset with just one additional outlier: on the left, it is impossible to
find a hard margin, and on the right the decision boundary ends up very
different from the one we saw in Figure 5-1 without the outlier, and it
will probably not generalize as well.

To avoid these issues it is preferable to use a more flexible model. The
objective is to find a good balance between keeping the street as large as
possible and limiting the margin violations (i.e., instances that end up in
the middle of the street or even on the wrong side). This is called soft
margin classification. In Scikit-Learn’s SVM classes, you can control this
balance using the C hyperparameter: a smaller C value leads to a wider
street but more margin violations. Figure 5-4 shows the decision
boundaries and margins of two soft margin SVM classifiers on a
nonlinearly separable dataset. On the left, using a low C value the margin
is quite large, but many instances end up on the street. On the right, using
a high C value the classifier makes fewer margin violations but ends up
with a smaller margin. However, it seems likely that the first classifier
will generalize better: in fact even on this training set it makes fewer
prediction errors, since most of the margin violations are actually on the
correct side of the decision boundary

The following Scikit-Learn code loads the iris dataset, scales the features,
and then trains a linear SVM model (using the LinearSVC class with C =
1 and the hinge loss function, described shortly) to detect Iris-Virginica
flowers. The resulting model is represented on the left of Figure 5-4.
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
iris = datasets.load_iris()
X = iris["data"][:, (2, 3)] # petal length, petal width
y = (iris["target"] == 2).astype(np.float64) # Iris-Virginica
svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])
svm_clf.fit(X, y)
You can then use the model to make predictions:
>>> svm_clf.predict([[5.5, 1.7]])
array([1.])
Nonlinear SVM Classification
Although linear SVM classifiers are efficient and work surprisingly well
in many cases, many datasets are not even close to being linearly
separable. One approach to handling nonlinear datasets is to add more
features, such as polynomial features (as you did in Chapter 4); in some
cases this can result in a linearly separable dataset.
Consider the left plot in Figure 5-5: it represents a simple dataset with
just one feature x1. This dataset is not linearly separable, as you can see.
But if you add a second feature x2 = (x1)², the resulting 2D dataset is
perfectly linearly separable.

from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=100, noise=0.15)   # moons dataset
polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),
    ("scaler", StandardScaler()),
    ("svm_clf", LinearSVC(C=10, loss="hinge"))
])
polynomial_svm_clf.fit(X, y)

14.​Explain the concept of GINI impurity and how it is used in decision tree
algorithms.
Gini impurity is a measure of impurity: a metric used in decision tree algorithms to measure how “pure” a node is in terms of its class distribution. A pure node means all data points in that node belong to a single class, while an impure node contains data points from multiple classes.

Formula for Gini Impurity

For a given node i, the Gini impurity is calculated as:

G_i = 1 − Σ (k = 1 to n) (p_i,k)²

Where:
• n: the number of classes.
• p_i,k: the proportion of class-k instances among the training instances in node i.

The Gini impurity ranges from 0 (a pure node) up to a maximum of 1 − 1/n, which depends on the number of classes n.

Interpreting Gini Impurity


​ •​ Gini Impurity = 0: All samples in the node belong to one
class (pure node).
​ •​ Higher Gini Impurity: Indicates a more mixed distribution of
classes.

For example:
​ •​ If all samples in a node are of the same class, Gini=0
​ •​ If there are two classes with equal proportions, Gini=0.5

How Gini Impurity is Used in Decision Trees


​ 1.​ Splitting Nodes:
​ •​ Decision trees aim to split nodes such that the resulting child
nodes are as pure as possible.
​ •​ At each split, the algorithm evaluates the Gini impurity for
potential splits and chooses the one that minimizes the weighted average
impurity of the child nodes.
​ 2.​ Weighted Gini Impurity After a Split:
The weighted Gini impurity (the quantity CART minimizes) is computed as J(k, tk) = (m_left / m) · G_left + (m_right / m) · G_right, i.e. each child node’s impurity weighted by its share of the instances.
Once it has successfully split the training set in two, it splits the subsets
using the same logic, then the sub-subsets and so on, recursively. It stops
recursing once it reaches the maximum depth (defined by the max_depth
hyperparameter), or if it cannot find a split that will reduce impurity.
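A small numeric check of the formula (a sketch, reusing the depth-2 node class counts [0, 49, 5] from question 3):

# Gini impurity of a node containing 0 + 49 + 5 = 54 training instances
counts = [0, 49, 5]
m = sum(counts)
gini = 1 - sum((c / m) ** 2 for c in counts)
print(round(gini, 3))   # 0.168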

15. Evaluate the performance of bagging and boosting, as well as their combination.
One way to get a diverse set of classifiers is to use very different training
algorithms, as just discussed. Another approach is to use the same
training algorithm for every predictor, but to train them on different
random subsets of the training set. When sampling is performed with
replacement, this method is called bagging (short for bootstrap
aggregating). When sampling is performed without replacement, it is
called pasting. In other words, both bagging and pasting allow training
instances to be sampled several times across multiple predictors, but only
bagging allows training instances to be sampled several times for the
same predictor. This sampling and training process is represented in
Figure 7-4.
Once all predictors are trained, the ensemble can make a prediction for a
new
instance by simply aggregating the predictions of all predictors. The
aggregation function is typically the statistical mode (i.e., the most
frequent prediction, just like a hard voting classifier) for classification, or
the average for regression. Each individual predictor has a higher bias
than if it were trained on the original training set, but aggregation reduces
both bias and variance. Generally, the net result is that the ensemble has
a similar bias but a lower variance than a single predictor trained on the
original training set.

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1)
bag_clf.fit(X_train, y_train)
y_pred = bag_clf.predict(X_test)
One way for a new predictor to correct its predecessor is to pay a bit more
attention to the training instances that the predecessor underfitted. This
results in new predictors focusing more and more on the hard cases. This
is the technique used by AdaBoost.
For example, to build an AdaBoost classifier, a first base classifier (such
as a DecisionTree) is trained and used to make predictions on the training
set. The relative weight of misclassified training instances is then
increased. A second classifier is trained using the updated weights and
again it makes predictions on the training set, weights are updated, and so
on
from sklearn.ensemble import AdaBoostClassifier

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=200,
    algorithm="SAMME.R", learning_rate=0.5)
ada_clf.fit(X_train, y_train)

16. Discuss the results in terms of accuracy, robustness, and computational cost.
17. Explain the Maximum Likelihood Estimation (MLE) method and its
significance in parameter estimation

18. Describe the Bayes Optimal Classifier and its theoretical importance in
classification problems.
Answer given in class
19. What is a Bayesian Belief Network, and how does it represent
probabilistic relationships between variables?
Answer given in class
20. Construct a regression model using the following data, which consists of 10 data instances and three attributes: ‘Assessment’, ‘Assignment’ and ‘Project’.
