Software Mining (ML, Testing) Notes Unit 2, 3

The document discusses several machine learning algorithms and concepts. It covers linear regression, its assumptions, and how to check for normal distribution. It also discusses root mean squared error, mean squared log error and naive bayes classification.

ANOVA IS NOT IN THE MINING SYLLABUS

//////////////////////////////////////////////////////////////////////////////////////////////////////////
NOT DONE BY MAM

/////////////////////////////////////////////////////////////////////////////////////////////////////////
MACHINE LEARNING ALGORITHM
5. Root Mean Squared Error

RMSE is the square root of the Mean Squared Error (MSE), i.e. the square root of the
mean of the squared differences between actual and predicted values.

RMSE = sqrt( (1/N) * Σ(i=1..N) (Yi - Ŷi)^2 )

Here, N = total number of data points, Yi = actual value, Ŷi = predicted value.

The higher the RMSE, the larger the deviation between actual and predicted values. The
lower the RMSE, the better the model's predictions.
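
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rmse` and the sample values are made up for the example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each difference, average the squares, then take the square root.
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical data: two models predicting the same four targets.
actual = [3.0, 5.0, 7.0, 9.0]
closer_model = [2.0, 6.0, 6.0, 10.0]   # every prediction is off by 1
farther_model = [0.0, 8.0, 4.0, 12.0]  # every prediction is off by 3

print(rmse(actual, closer_model))   # 1.0
print(rmse(actual, farther_model))  # 3.0
```

The model whose predictions sit closer to the actual values gets the lower RMSE, and the result is in the same unit as the target.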

Advantages of RMSE:

i) RMSE is expressed in the same unit as the output variable, which makes the loss
easy to interpret.

Disadvantages of RMSE:

i) Not robust to outliers: errors are squared before averaging, so a few large errors
dominate the result.

6. Mean Squared Log Error (MSLE)


MSLE is a variation of Mean Squared Error. Use MSLE when you do not want large
differences between actual and predicted values to be penalized heavily, for example
when the target values span several orders of magnitude.

The logarithm is used so that the error reflects the relative difference between actual
and predicted values. To avoid taking the natural log of 0, 1 is added to both the
actual and the predicted value before taking the logarithm.

MSLE = (1/N) * Σ(i=1..N) (log(Yi + 1) - log(Ŷi + 1))^2

Here, N = total number of data points, Yi = actual value, Ŷi = predicted value.
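
A minimal sketch of this calculation in Python, assuming non-negative values. The function name `msle` is chosen for the example; `math.log1p(v)` computes log(1 + v), which is the "add 1 before taking the log" step described above.

```python
import math


def msle(actual, predicted):
    """Mean squared log error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        # log1p(v) == log(1 + v), so an actual or predicted value of 0 is safe.
        total += (math.log1p(y) - math.log1p(y_hat)) ** 2
    return total / len(actual)


print(msle([0.0, 9.0], [0.0, 9.0]))  # 0.0, perfect predictions
print(msle([0.0, 9.0], [1.0, 4.0]))  # positive, predictions are off
```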

Advantages of MSLE:

i) Treats a small difference between small actual and predicted values the same as a
big difference between large actual and predicted values, as long as the relative
difference is similar.

Disadvantages of MSLE:

i) Penalizes underestimates more than the overestimates.
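
Both properties can be checked numerically. The sketch below uses made-up numbers and a hypothetical helper `squared_log_error` for a single data point; it is only meant to illustrate the behaviour, not to prove it.

```python
import math


def squared_log_error(actual, predicted):
    """Squared log error for one data point (the term inside the MSLE sum)."""
    return (math.log1p(actual) - math.log1p(predicted)) ** 2


# Same relative error (prediction is 50% too high) at two very different scales.
small_scale = squared_log_error(10, 15)
large_scale = squared_log_error(1000, 1500)
print(small_scale, large_scale)  # similar values
print((10 - 15) ** 2, (1000 - 1500) ** 2)  # squared error differs hugely

# Same absolute error (50) below and above the actual value of 100.
underestimate = squared_log_error(100, 50)
overestimate = squared_log_error(100, 150)
print(underestimate, overestimate)  # the underestimate costs more
```

The two log errors at different scales stay close to each other while the plain squared errors differ by a factor of 10,000, and the underestimate is penalized more than the overestimate of the same size.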

Linear Regression:
Linear Regression is a supervised machine learning algorithm that performs regression by
fitting the straight line that best fits the data points.

Y = b0 + b1X1 + b2X2 + ... + bnXn
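
For the single-feature case (Y = b0 + b1X1), the best-fit line can be found with the ordinary least squares closed form. The sketch below assumes that fitting criterion, which the notes do not state explicitly, and uses a hypothetical function name and toy data.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Slope = covariance(x, y) / variance(x); the 1/N factors cancel out.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x  # the line passes through (mean_x, mean_y)
    return b0, b1


# Toy data generated from y = 1 + 2x, so the fit should recover b0=1, b1=2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = fit_simple_linear_regression(x, y)
print(b0, b1)  # 1.0 2.0
```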

Assumptions:
• Assumes a linear relationship between the independent variables 'x' and the dependent variable 'y' (Linearity)
• Assumes no correlation between the independent variables 'x' (no Multicollinearity)
• Assumes residuals have constant variance at every level of x (Homoscedasticity)
• Assumes residuals of the model are normally distributed (Normality)
• Assumes no pattern is formed when residuals are plotted (residuals are independent of each other)
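
One simple way to screen for multicollinearity is to compute the Pearson correlation between pairs of independent variables. The sketch below is an illustration with made-up feature names and values; what counts as "too correlated" is a judgment call that the notes do not specify.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical features: the first two carry the same information in
# different units, the third is largely unrelated to them.
area_sqft = [500.0, 750.0, 1000.0, 1250.0, 1500.0]
area_sqm = [v * 0.0929 for v in area_sqft]
age_years = [12.0, 3.0, 8.0, 15.0, 5.0]

r_redundant = pearson_correlation(area_sqft, area_sqm)
r_unrelated = pearson_correlation(area_sqft, age_years)
print(r_redundant)  # very close to 1: keep only one of the two features
print(r_unrelated)  # close to 0: no sign of multicollinearity
```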
Advantages                                    | Disadvantages
Simple implementation                         | Prone to underfitting
Performs best on linear data                  | Sensitive to outliers
Overfitting can be reduced by regularization  | Assumes that data is independent

Check for normal distribution

• Chi-square method
• Kolmogorov-Smirnov
• Shapiro-Wilk
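
As an illustration of one of these tests, the sketch below computes the Kolmogorov-Smirnov statistic D, the largest gap between the sample's empirical distribution and a given normal distribution. It is a simplified, standard-library-only example: the function name is hypothetical, the two samples are idealised (built from exact quantiles rather than real data), and it returns only the statistic, not a p-value. In practice a statistics library would normally be used for the full test.

```python
import math
from statistics import NormalDist


def ks_statistic_normal(sample, mu, sigma):
    """Kolmogorov-Smirnov statistic D of `sample` against Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, value in enumerate(ordered, start=1):
        cdf = dist.cdf(value)
        # The empirical CDF jumps from (i-1)/n to i/n at this value;
        # check the gap on both sides of the jump.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


n = 50
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Idealised normal sample: exact quantiles of the standard normal.
normal_like = [NormalDist(0, 1).inv_cdf(p) for p in probs]
# Idealised skewed sample: exact quantiles of an exponential(1) distribution,
# which has mean 1 and standard deviation 1.
skewed = [-math.log(1 - p) for p in probs]

d_normal = ks_statistic_normal(normal_like, 0, 1)
d_skewed = ks_statistic_normal(skewed, 1, 1)
print(d_normal)  # small: the sample follows the normal curve closely
print(d_skewed)  # much larger: the sample deviates from the normal curve
```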

Graphical methods to check for normal distribution

• Histogram
• Quantile-Quantile (Q-Q) Plot
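
The points of a Quantile-Quantile plot can be computed without any plotting library. The sketch below is one possible construction, assuming plotting positions (i - 0.5)/n and a normal distribution fitted from the sample's own mean and standard deviation; the function name and the data are made up.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Normal distribution with the same mean and standard deviation as the sample.
    fitted = NormalDist(mean(ordered), stdev(ordered))
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical, roughly bell-shaped sample.
sample = [2.0, 4.0, 4.0, 5.0, 5.0, 5.0, 6.0, 6.0, 8.0]
points = qq_points(sample)
for expected, observed in points:
    print(f"{expected:6.2f}  {observed:6.2f}")
```

If the data is approximately normal, the pairs lie close to the line y = x when plotted; strong curvature or points far off the line at the ends suggest skew or heavy tails.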

μ (mu) = mean

σ² (sigma squared) = variance

σ (sigma) = standard deviation

Assumption

Features are continuous
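
The mean and variance symbols and the continuous-feature assumption match what is usually called Gaussian Naive Bayes, where each feature is modelled per class with a normal distribution. The sketch below assumes that is the variant meant here. All function names and the toy data are hypothetical, and it assumes every class has at least two samples with non-zero variance in each feature.

```python
import math
from statistics import mean, variance


def fit_gaussian_nb(rows, labels):
    """Estimate prior, per-feature mean and variance for each class."""
    model = {}
    for label in set(labels):
        members = [row for row, lab in zip(rows, labels) if lab == label]
        columns = list(zip(*members))  # one tuple of values per feature
        model[label] = {
            "prior": len(members) / len(rows),
            "mu": [mean(col) for col in columns],
            "sigma_sq": [variance(col) for col in columns],
        }
    return model


def log_gaussian(x, mu, sigma_sq):
    """Log of the normal density with mean mu and variance sigma_sq at x."""
    return -0.5 * math.log(2 * math.pi * sigma_sq) - (x - mu) ** 2 / (2 * sigma_sq)


def predict(model, row):
    """Return the class with the highest log prior + summed log likelihoods."""
    def score(params):
        total = math.log(params["prior"])
        for x, mu, sigma_sq in zip(row, params["mu"], params["sigma_sq"]):
            total += log_gaussian(x, mu, sigma_sq)  # "naive": features independent
        return total
    return max(model, key=lambda label: score(model[label]))


# Toy data: two continuous features, two well-separated classes.
rows = [(1.0, 2.1), (1.2, 1.9), (0.8, 2.0), (3.0, 4.1), (3.2, 3.9), (2.8, 4.0)]
labels = ["low", "low", "low", "high", "high", "high"]
model = fit_gaussian_nb(rows, labels)

print(predict(model, (1.1, 2.0)))  # low
print(predict(model, (2.9, 4.2)))  # high
```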

Other variants of Naive Bayes are

Bernoulli (Bernoulli distribution), Multinomial (multinomial distribution)
