A. Model Selection
Choice of Algorithm:
Definition: Selecting an appropriate algorithm for building the machine learning model based
on the nature of the problem and the data characteristics.
Factors Influencing Algorithm Choice:
Nature of the Problem:
Classification: If the target variable is categorical (e.g., spam detection, image recognition).
Regression: If the target variable is continuous (e.g., house price prediction, stock price
forecasting).
Data Characteristics:
Size of the dataset: Some algorithms work better with large datasets (e.g., deep learning),
while others are better for smaller datasets (e.g., decision trees).
Data complexity: Linear models may work for simpler problems, while more complex
relationships require non-linear models like support vector machines or neural networks.
Feature types: For categorical data, tree-based algorithms or Naive Bayes might be preferred.
For numerical data, linear regression or neural networks might be better.
B. Model Training
1. Splitting Data:
Definition: The dataset is divided into multiple subsets to train, validate, and test the model.
Training Set: A portion of the data used to train the model.
Testing Set: A separate portion used to evaluate the model’s performance after training.
Validation Set: Sometimes used during training to tune the hyperparameters of the model.
Common Split Ratios: 80/20 or 70/30 for training/testing, or roughly 70/15/15 when a separate validation set is used.
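As an illustration, a minimal Python sketch of these splits using scikit-learn (the library choice, array contents, and ratios are assumptions for demonstration, not part of the notes):

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples, 3 features, standing in for a real dataset.
X = np.random.rand(100, 3)
y = np.random.rand(100)

# Hold out 20% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Optionally carve a validation set out of the training data
# (0.25 of the remaining 80% = 20% of the original data).
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)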
2. Loss Function:
Definition: A function that measures the difference between the model's predicted output and
the actual output (ground truth).
Purpose: The goal during training is to minimize the loss function, which means the model’s
predictions are as close as possible to the true values.
Common Loss Functions:
For Classification: Cross-entropy loss (used for binary and multi-class classification).
For Regression: Mean Squared Error (MSE) or Mean Absolute Error (MAE).
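A minimal NumPy sketch of these losses on illustrative arrays (the numbers are made up; binary cross-entropy is shown for the classification case):

import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.4])

mse = np.mean((y_true - y_pred) ** 2)          # Mean Squared Error
mae = np.mean(np.abs(y_true - y_pred))         # Mean Absolute Error

# Binary cross-entropy: labels in {0, 1}, predicted probabilities in (0, 1).
labels = np.array([1, 0, 1])
probs = np.array([0.9, 0.2, 0.7])
bce = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))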
3. Optimization:
Definition: The process of adjusting the model's parameters (weights in the case of neural
networks, or coefficients in linear models) to minimize the loss function.
Goal: To find the set of parameters that result in the lowest possible value for the loss
function.
Optimization Algorithms:
Gradient Descent: A widely used optimization algorithm that updates the model parameters
iteratively by calculating the gradient (or slope) of the loss function.
Variants: Stochastic Gradient Descent (SGD), Mini-Batch Gradient Descent, Adam, etc.
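A minimal sketch of batch gradient descent for a one-feature linear model, minimizing MSE; the data, learning rate, and iteration count are illustrative assumptions, not a production optimizer:

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])   # single feature
y = np.array([3.0, 5.0, 7.0, 9.0])   # target (roughly y = 2x + 1)

w, b = 0.0, 0.0                       # parameters to learn
lr = 0.01                             # learning rate

for _ in range(2000):
    y_pred = w * X + b
    error = y_pred - y
    grad_w = 2 * np.mean(error * X)   # gradient of MSE with respect to w
    grad_b = 2 * np.mean(error)       # gradient of MSE with respect to b
    w -= lr * grad_w                  # step against the gradient
    b -= lr * grad_b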
C. Model Evaluation
1. Metrics:
For Classification:
Accuracy: The proportion of correct predictions over the total predictions.
Precision: The proportion of true positives out of all predicted positives (useful in
imbalanced datasets).
Recall: The proportion of true positives out of all actual positives (useful when false
negatives are critical).
F1 Score: The harmonic mean of precision and recall, providing a balance between the two.
For Regression:
Mean Squared Error (MSE): Measures the average squared difference between predicted
and actual values.
R-squared (R²): Represents the proportion of variance in the dependent variable that is
predictable from the independent variables.
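A sketch of computing these metrics with scikit-learn, assuming small illustrative arrays of true and predicted values:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import mean_squared_error, r2_score

# Classification metrics (binary labels shown for illustration).
y_true_cls = [1, 0, 1, 1, 0]
y_pred_cls = [1, 0, 0, 1, 0]
print(accuracy_score(y_true_cls, y_pred_cls))
print(precision_score(y_true_cls, y_pred_cls))
print(recall_score(y_true_cls, y_pred_cls))
print(f1_score(y_true_cls, y_pred_cls))

# Regression metrics.
y_true_reg = [2.0, 3.5, 4.0]
y_pred_reg = [2.2, 3.4, 3.6]
print(mean_squared_error(y_true_reg, y_pred_reg))
print(r2_score(y_true_reg, y_pred_reg))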
2. Cross-Validation:
Definition: A technique used to assess how the model performs on different subsets of the
data to ensure that the model generalizes well and is not overfitting.
K-Fold Cross-Validation: The dataset is split into K equal-sized folds. The model is trained
on K-1 folds and tested on the remaining fold. This process is repeated K times, with each
fold serving as the test set once.
Advantages:
Provides a more reliable estimate of model performance by using multiple data splits.
Helps in reducing bias in the evaluation process.
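A minimal K-fold cross-validation sketch with scikit-learn; the estimator, synthetic dataset, and K=5 are illustrative assumptions:

from sklearn.model_selection import cross_val_score, KFold
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # 5 folds, each used once as the test set
scores = cross_val_score(model, X, y, cv=cv)           # one score per fold
print(scores.mean(), scores.std())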
The modeling and evaluation steps ensure that a machine learning model not only performs
well on the training data but also generalizes well to unseen data. Proper evaluation using
relevant metrics and cross-validation ensures that the model is both accurate and robust for
real-world deployment.
V. Assumptions in Regression Analysis
A. Linearity
Assumption: The relationship between the independent and dependent variables is linear.
Explanation: Linear regression assumes that the dependent variable (y) has a linear relationship
with the independent variables (x1, x2, ..., xn). This means that for each independent variable, a
unit change produces a constant change in y.
Implication: If the relationship is not linear, the model may not be suitable, and the predictions
or inferences could be misleading. In such cases, transformations (e.g., log, square root) or non-
linear models may be more appropriate.
Visualization: A scatter plot of the independent variable vs. the dependent variable should show
an approximately linear pattern.
B. Independence
Assumption: The residuals (errors) are independent of one another.
Explanation: Residuals, the differences between the observed and predicted values, should not
show any patterns or correlations. Each data point should be independent of others, and there
should be no systematic relationship between the residuals for one observation and the residuals
for another.
Implication: If residuals are not independent (i.e., autocorrelation exists), it can indicate that the
model is misspecified or that important variables or time dependencies are not accounted for.
Test: The Durbin-Watson test can be used to check for autocorrelation in residuals.
C. Homoscedasticity
Assumption: Residuals have constant variance across all levels of the independent variable(s).
Explanation: Homoscedasticity means that the spread or variability of the residuals is consistent
across all values of the independent variable(s). In other words, the variance of the prediction
errors should not change as the value of the independent variable(s) changes.
Implication: If the residuals are heteroscedastic (non-constant variance), this can lead to
inefficient estimates and biased statistical tests. For example, large values of the independent
variable could result in large prediction errors, and small values in small errors.
Visualization: A scatter plot of residuals versus predicted values should show a random scatter
with no discernible pattern. If there is a "fan-shaped" or "cone-shaped" pattern, heteroscedasticity
is likely present.
Test: The Breusch-Pagan test or White test can be used to detect heteroscedasticity.
D. Normality of Residuals
Assumption: The residuals are normally distributed.
Explanation: The residuals should follow a normal distribution for valid hypothesis testing and
reliable confidence intervals. While linear regression can still produce unbiased estimates even
when the normality assumption is violated, statistical tests (e.g., t-tests, F-tests) rely on this
assumption.
Implication: If the residuals are not normally distributed, it may affect the validity of confidence
intervals and significance tests for the regression coefficients. Non-normality can also indicate
model misspecification or the presence of outliers.
Test: The Shapiro-Wilk test or Kolmogorov-Smirnov test can be used to test for normality.
Additionally, a Q-Q plot (quantile-quantile plot) can visually assess normality by comparing the
quantiles of the residuals against those of a theoretical normal distribution.
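A sketch of these diagnostic checks (Durbin-Watson, Breusch-Pagan, Shapiro-Wilk) using statsmodels and SciPy; the libraries and the synthetic data are assumptions made for illustration:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=100)

X_const = sm.add_constant(X)               # add intercept term
model = sm.OLS(y, X_const).fit()
resid = model.resid

print(durbin_watson(resid))                # values near 2 suggest no autocorrelation
print(het_breuschpagan(resid, X_const))    # Breusch-Pagan test for heteroscedasticity
print(stats.shapiro(resid))                # Shapiro-Wilk test for normality of residuals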
Linearity: The relationship between independent and dependent variables should be linear.
Independence: Residuals should be independent of one another (no autocorrelation).
Homoscedasticity: Residuals should have constant variance across all levels of the independent
variables.
Normality of Residuals: Residuals must be normally distributed for valid hypothesis testing.
These assumptions are critical to the validity of the regression model. Violations of these
assumptions can lead to inaccurate predictions and unreliable inferences, highlighting the
importance of checking them before relying on the model's results.
A. Feature Scaling
Explanation: Feature scaling involves transforming the independent variables so they have
similar ranges, typically between 0 and 1 or with zero mean and unit variance. This is especially
important for algorithms that rely on distance or gradient-based optimization (like linear
regression trained with gradient descent).
1. Min-Max Scaling: Rescales each feature to the [0, 1] range using x' = (x - min) / (max - min).
Why it's important: Linear regression can be sensitive to the scale of the features. If features
are not scaled properly, the algorithm may give disproportionate importance to variables with
larger magnitudes. Scaling ensures all features contribute equally to the model.
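A minimal sketch of both scaling approaches mentioned above (min-max to [0, 1], and standardization to zero mean and unit variance) using scikit-learn; the array values are illustrative:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 2000.0], [2.0, 3000.0], [3.0, 5000.0]])

X_minmax = MinMaxScaler().fit_transform(X)   # each column rescaled to [0, 1]
X_std = StandardScaler().fit_transform(X)    # each column to zero mean, unit variance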
B. Feature Engineering
Explanation: Feature engineering involves transforming or creating new features that may
provide more useful information to the model. Well-crafted features can significantly improve
model performance.
Examples:
1. Polynomial Features:
Example: If the relationship between the size of a house and its price is quadratic, adding a
feature for the square of size (e.g., size²) can improve the model.
2. Interaction Features:
Create features that represent the interaction between two or more variables. For example, if the
house price depends on both the size and the number of bedrooms, adding a feature for
size x bedrooms can capture that combined effect.
3. Log Transformation:
For features with skewed distributions (e.g., income, population), applying a log transformation
can reduce the impact of extreme values and normalize the data.
Why it's important: Properly engineered features can improve the accuracy of the model by
making the relationship between predictors and the target variable more understandable. It can
also help the model learn better patterns and handle non-linearities in the data.
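A sketch of the three transformations above (polynomial, interaction, and log features) using NumPy and scikit-learn; the size, bedrooms, and income arrays are made-up illustrative values:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

size = np.array([1000.0, 1500.0, 2400.0])
bedrooms = np.array([2.0, 3.0, 4.0])
income = np.array([30000.0, 55000.0, 250000.0])

size_sq = size ** 2                   # polynomial feature: square of size
size_x_bedrooms = size * bedrooms     # interaction feature: size x bedrooms
log_income = np.log1p(income)         # log transformation for a skewed feature

# Polynomial and interaction terms can also be generated automatically.
X = np.column_stack([size, bedrooms])
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)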
C. Outlier Removal
Explanation: Outliers are data points that are significantly different from the rest of the data and
can negatively impact the performance of the model. They may disproportionately influence the
estimated coefficients and predictions.
Detecting Outliers:
1. Box Plot:
A box plot can help identify outliers by displaying the interquartile range (IQR). Data points
lying beyond 1.5 x IQR from the quartiles are typically flagged as outliers.
2. Z-Score:
The Z-score measures how many standard deviations a data point is from the mean. Data points
with a Z-score greater than 3 or less than -3 are often considered outliers.
3. Scatter Plot:
In some cases, visualizing the data using a scatter plot can help identify extreme values that
deviate clearly from the overall pattern.
Handling Outliers:
1. Removing Outliers:
Simply removing outliers can sometimes improve model performance, but this should be done
with caution, as dropping too many points can discard valuable information.
2. Capping or Winsorization:
This technique involves replacing extreme outliers with the maximum or minimum values within
an acceptable range.
3. Transformation:
Applying transformations like log or square root to the feature can reduce the influence of
outliers.
Why it's important: Outliers can distort the regression model, affecting its ability to generalize.
By identifying and appropriately handling outliers, the model becomes more robust and accurate.
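A sketch of the IQR and Z-score detection rules and of winsorization using NumPy; the data vector and thresholds are illustrative assumptions:

import numpy as np

x = np.array([10.0, 12.0, 11.0, 13.0, 12.5, 95.0])   # 95.0 is an obvious outlier

# IQR rule (box-plot criterion): flag points beyond 1.5 * IQR from the quartiles.
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
iqr_outliers = (x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)

# Z-score rule: flag points more than 3 standard deviations from the mean.
z = (x - x.mean()) / x.std()
z_outliers = np.abs(z) > 3

# Winsorization: cap extreme values at chosen percentiles instead of dropping them.
capped = np.clip(x, np.percentile(x, 5), np.percentile(x, 95))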
Feature Scaling ensures that the model treats all features equally by standardizing their ranges.
Feature Engineering helps create new, more informative features that can improve model
performance.
Outlier Removal reduces the impact of extreme values, making the model more robust and
accurate.
These techniques collectively enhance the performance of a linear regression model by making
the data more suitable for analysis, improving the model's generalization, and ensuring the most
relevant information is captured.
Ridge Regression
A. Principle
Objective: Ridge regression introduces a regularization term to the linear regression model to
prevent overfitting.
Overfitting occurs when a model becomes too complex, capturing not only the underlying
patterns in the data but also the noise, leading to poor generalization to new, unseen data. Ridge
regression addresses this by adding a penalty proportional to the sum of the squared coefficients
of the features to the cost function.
Effect of Regularization:
When λ=0, Ridge regression becomes equivalent to ordinary linear regression (no
regularization).
When λ is large, the regularization term dominates, and the coefficients are forced to be close to
zero, which makes the model simpler and less likely to overfit.
If λ is too large, the model may become underfit and fail to capture important patterns in the
data.
1. Prevents Overfitting: The regularization term reduces the model's complexity and helps
prevent overfitting, especially when dealing with a large number of features.
2. Improves Stability: Ridge regression helps stabilize the estimation of the coefficients,
especially when the independent variables are highly correlated (multicollinearity), which can
otherwise make ordinary least squares estimates unreliable.
3. Works Well for High-Dimensional Data: In datasets with many features (high-
dimensional data), Ridge regression can improve model performance by keeping the coefficients
small and controlled.
The value of λ controls the strength of regularization. It can be tuned using methods such as
cross-validation or grid search.
A common approach is to use cross-validation to evaluate the model performance for different
values of λ and select the one that minimizes the validation error.
Summary:
Ridge regression adds an L2 regularization term to the linear regression cost function to
penalize large coefficients.
The regularization parameter λ controls the strength of the penalty, helping to balance the
trade-off between underfitting and overfitting.
Benefits include better generalization, prevention of overfitting, and improved stability in high-
dimensional or multicollinear settings.
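A sketch of Ridge regression with cross-validated selection of the regularization strength in scikit-learn (where λ is exposed as the alpha parameter); the synthetic data and candidate alphas are illustrative assumptions:

from sklearn.linear_model import Ridge, RidgeCV
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=20, noise=10.0, random_state=0)

# Try several regularization strengths and pick the best via cross-validation.
ridge_cv = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 100.0], cv=5).fit(X, y)
print(ridge_cv.alpha_)                 # selected λ (alpha)

model = Ridge(alpha=ridge_cv.alpha_).fit(X, y)
print(model.coef_)                     # shrunken, but generally nonzero, coefficients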
Lasso Regression
A. Principle
Objective: Lasso regression (Least Absolute Shrinkage and Selection Operator) introduces
a regularization term that not only penalizes the magnitude of the coefficients but also
performs feature selection.
Sparsity: The Lasso technique encourages some of the model's coefficients to become exactly
zero, effectively performing feature selection by removing irrelevant or less important features.
L1 Regularization: Lasso regression uses L1 regularization, which adds the sum of the
absolute values of the coefficients to the cost function. This is in contrast to Ridge regression,
which penalizes the sum of the squared coefficients (L2).
Effect of Regularization:
When λ=0, Lasso regression becomes equivalent to ordinary linear regression (no
regularization).
As λ increases, the penalty on the coefficients becomes stronger, leading to more coefficients
being shrunk toward zero.
When λ is large, many coefficients are driven to exactly zero, resulting in a sparse model with
fewer features.
1. Feature Selection: Lasso has the unique ability to set some of the coefficients exactly to
zero, effectively removing the corresponding features from the model. This is particularly useful
when dealing with datasets with a large number of features, as it helps in identifying the most
important features.
2. Sparsity: The L1 penalty encourages a sparse solution, where only the most important
features are kept, and the less important ones are discarded (set to zero).
3. Better for High-Dimensional Data: Lasso regression is highly effective when there are
many features, especially when some of them may not be informative. It can help prevent
overfitting by discarding uninformative features.
Like Ridge regression, the regularization parameter λ needs to be tuned to achieve the best
performance.
A common approach to determine the optimal λ is cross-validation, where the model is trained
on different values of λ, and the value that minimizes the validation error is selected.
Ridge Regression: The L2 regularization used in Ridge regression reduces the size of the
coefficients but does not set them exactly to zero.
Lasso Regression: The L1 regularization used in Lasso regression not only reduces the size of
coefficients but also eliminates some coefficients entirely by setting them to zero, thus
performing feature selection.
Elastic Net: Combines both L1 and L2 regularization (Ridge + Lasso), allowing for both
coefficient shrinkage and feature selection.
Summary:
Lasso regression adds an L1 regularization term to the linear regression cost function, which
penalizes the absolute values of the coefficients, encouraging sparsity and performing feature
selection.
The regularization parameter λ controls the strength of the penalty, and as λ increases, more
coefficients are driven to zero.
Benefits: Lasso regression is particularly useful when dealing with high-dimensional datasets
with many irrelevant features.
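A sketch of Lasso with cross-validated λ selection in scikit-learn, showing how some coefficients are driven exactly to zero; the synthetic dataset (only 5 of 20 features informative) is an illustrative assumption:

import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)

lasso = LassoCV(cv=5, random_state=0).fit(X, y)
print(lasso.alpha_)                   # selected regularization strength
print(np.sum(lasso.coef_ != 0))       # number of features kept (nonzero coefficients)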
Confusion Matrix Terms:
True positives: The number of positive observations the model correctly
predicted as positive.
False-positive: The number of negative observations the model
incorrectly predicted as positive.
True negative: The number of negative observations the model correctly
predicted as negative.
False-negative: The number of positive observations the model
incorrectly predicted as negative.
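These four counts can be read directly off a confusion matrix; a minimal scikit-learn sketch with illustrative binary labels:

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, tn, fn)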
Support Vector Machine (SVM)
Hyperplane
A hyperplane is the decision boundary that separates the classes in the feature space; SVM
selects the hyperplane that segregates the classes best.
Margin
A margin is the gap between the two parallel lines drawn through the closest points of the
two classes.
This is calculated as the perpendicular distance from the line to support vectors or
closest points.
If the margin between the classes is larger, it is considered a good margin; a smaller margin
is a bad margin.
The main objective is to segregate the given dataset in the best possible way.
The distance between the nearest points of either class is known as the margin.
The objective is to select a hyperplane with the maximum possible margin between the support
vectors in the given dataset. SVM searches for this maximum-margin hyperplane in the following
steps:
1. Generate hyperplanes that segregate the classes in the best way. The left-hand
figure shows three hyperplanes: black, blue, and orange. Here, the blue
and orange have higher classification error, but the black separates the two
classes correctly.
2. Select the right hyperplane with the maximum segregation from the nearest data points of
either class.
SVM uses a kernel trick to transform the input space into a higher-dimensional space.
The data points are plotted on the x-axis and z-axis (z is the squared sum of both x
and y: z = x² + y²).
SVM Kernels
Here, the kernel takes a low-dimensional input space and transforms it into a higher
dimensional space.
In other words, you can say that it converts a non-separable problem into a separable
problem by adding more dimensions to it.
It is most useful in non-linear separation problems. The kernel trick helps you to build a
more accurate classifier.
Linear Kernel
A linear kernel can be used as a normal dot product of any two given observations.
The product between two vectors is the sum of the multiplication of each pair of
input values.
Polynomial Kernel
A polynomial kernel is a more generalized form of the linear kernel: K(x, y) = (x · y + 1)^d,
where d is the degree of the polynomial. d = 1 is similar to the linear transformation.
The Radial Basis Function (RBF) kernel is a popular kernel function commonly used in SVM
classification; it can map the input space into an infinite-dimensional space.
A higher value of gamma will perfectly fit the training dataset, which causes over-
fitting.
Advantages
SVM classifiers offer good accuracy and perform faster prediction compared to the Naive
Bayes algorithm.
They also use less memory because they use a subset of training points in the
decision phase.
SVM works well with a clear margin of separation and with high dimensional space.
Disadvantages
SVM is not suitable for large datasets because of its high training time, and it also takes more
time to train compared to Naive Bayes.
It works poorly with overlapping classes and is also sensitive to the type of kernel
used.
SVM hyperplane: for all positive points, w·x + b ≥ +1; for all negative points, w·x + b ≤ -1.
Dimensionality Reduction
The number of independent features or variables in a dataset is known as its dimension.
Dimensionality Reduction refers to the techniques that reduce the number of input
features/variables in a dataset.
Feature selection – In this, we are interested in finding k of the d dimensions that give us
the most information, and we discard the other (d − k) dimensions.
Feature extraction – In this, we are interested in finding a new set of k dimensions that are
combinations of the original d dimensions
Principal Component Analysis (PCA)
PCA is a dimensionality reduction technique that enables you to identify correlations and
patterns in a dataset so that it can be transformed into a dataset of significantly lower
dimension without losing any important information.
Example applications:
Face recognition
Image compression
Steps
Original Data
Normalize the original data (mean = 0, variance = 1)
Calculating the covariance matrix
Calculating eigenvalues, eigenvectors, and normalized eigenvectors
Calculating Principal Component (PC)
Plot the graph for orthogonality between PCs
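A sketch of these steps with NumPy and scikit-learn; the data is synthetic and purely illustrative, and sklearn's PCA performs the covariance/eigen decomposition internally:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

X_std = StandardScaler().fit_transform(X)        # normalize: mean = 0, variance = 1

# Manual covariance and eigen decomposition (steps 3-4 above).
cov = np.cov(X_std, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Equivalent, more convenient route: sklearn PCA (steps 3-5).
pca = PCA(n_components=2)
X_pc = pca.fit_transform(X_std)
print(pca.explained_variance_ratio_)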
Advantages
Used for Dimensionality Reduction
PCA helps eliminate correlated features, a problem sometimes referred to as multi-
collinearity.
The time required to train the model is substantially shorter because of PCA’s
reduction in the number of features.
PCA aids in overcoming overfitting by eliminating the extraneous features from your
dataset.
Disadvantages
Useful for quantitative data but not effective with qualitative data.
Principal components are difficult to interpret in terms of the original features.
Hierarchical Clustering
Hierarchical clustering is an unsupervised machine learning technique that divides the population
into several clusters such that data points in the same cluster are more similar and data points in
different clusters are dissimilar.
The agglomerative (bottom-up) method starts with each object as its own cluster and merges the
closest clusters iteratively. On the other hand, the divisive method starts with one cluster with all
given objects and then splits it iteratively to form smaller clusters.
Pros
No assumption of a particular number of clusters (unlike k-means)
May correspond to meaningful taxonomies
Cons
Once a decision is made to combine two clusters, it can’t be undone
Too slow for large data sets, O(n² log(n))
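A sketch of agglomerative (bottom-up) clustering with scikit-learn on toy data; the dataset, number of clusters, and linkage choice are illustrative assumptions:

from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Bottom-up clustering: repeatedly merge the closest pair of clusters.
agg = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = agg.fit_predict(X)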
DBSCAN
There are different approaches and algorithms to perform clustering tasks
which can be divided into three sub-categories:
Partition-based clustering: E.g. k-means, k-median
Hierarchical clustering: E.g. Agglomerative, Divisive
Density-based clustering: E.g. DBSCAN
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. It
is able to find arbitrary shaped clusters and clusters with noise (i.e. outliers).
In DBSCAN, instead of guessing the number of clusters, we define two hyperparameters,
epsilon and minPoints, to arrive at clusters.
Epsilon (ε): The distance that specifies the neighborhoods. Two points are considered
to be neighbors if the distance between them is less than or equal to epsilon.
minPoints(n): Minimum number of data points to define a cluster.
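A minimal DBSCAN sketch with scikit-learn, where eps and min_samples correspond to epsilon and minPoints above; the dataset and hyperparameter values are illustrative assumptions:

from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)   # eps = epsilon, min_samples = minPoints
labels = db.labels_                          # label -1 marks noise points (outliers)
print(set(labels))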
Q-Learning:
Q-learning is an off-policy RL algorithm used for temporal difference
learning. Temporal difference learning methods are a way of comparing
temporally successive predictions.
It learns the value function Q(s, a), which measures how good it is to take action "a" in a
particular state "s".
SARSA stands for State-Action-Reward-State-Action, which is an on-policy temporal
difference learning method. The on-policy control method selects the action for each
state while learning using a specific policy.
The goal of SARSA is to calculate Qπ(s, a) for the selected current policy π and
all state-action pairs (s, a).
The main difference between the Q-learning and SARSA algorithms is that, unlike in Q-
learning, the maximum Q-value of the next state is not required for updating the
Q-value in the table.
In SARSA, new action and reward are selected using the same policy, which has
determined the original action.
SARSA is so named because it uses the quintuple Q(s, a, r, s', a'), where:
s: original state
a: Original action
r: reward observed while following the states
s' and a': New state, action pair.
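A sketch of the two update rules side by side, assuming a tabular Q array, a discount factor gamma, and a learning rate alpha (all names and values are illustrative, not from the notes):

import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9

def q_learning_update(s, a, r, s_next):
    # Off-policy: bootstrap with the maximum Q-value of the next state.
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def sarsa_update(s, a, r, s_next, a_next):
    # On-policy: bootstrap with the Q-value of the action actually chosen next
    # by the same policy that chose the original action.
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])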