
UNIT – I

Introduction to Machine Learning – SCSB4009


UNIT 1 INTRODUCTION TO MACHINE LEARNING

Machine learning - basic concepts in machine learning - types of machine learning -
examples of machine learning - applications - the bias-variance - data
preprocessing - noise removal - normalization.

Introduction
Machine Learning is a field of study that gives computers the ability to learn
without being explicitly programmed (Arthur Samuel). "A computer program is said
to learn from experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves with
experience E." (Tom Mitchell)

Learning = improving with experience at some task
- Improve over task T,
- with respect to performance measure P,
- based on experience E.
E.g., learn to play checkers:
- T: play checkers
- P: % of games won in the world tournament
- E: opportunity to play against itself
Model
A machine learning model is a program (or set of programs) that can be used to
find patterns and make decisions from an unseen dataset. It can take any one of
the following forms:
- Mathematical equations
- Relational diagrams such as graphs/trees
- Logical if/else rules
- Groupings called clusters
Training set, Test set and Validation set

• Divide the total dataset into three subsets:

– Training data is used for learning the parameters of the model.

– Validation data is not used for learning, but is used for deciding what type of
model and what amount of regularization works best.

– Test data is used to get a final, unbiased estimate of how well the network
works. We expect this estimate to be worse than on the validation data.

We could then re-divide the total dataset to get another unbiased estimate of the
true error rate.

DIFFERENCE BETWEEN TRADITIONAL PROGRAMMING AND MACHINE LEARNING

In traditional programming, a programmer writes explicit rules (the program) that
are applied to input data to produce the output. In machine learning, the input
data together with the expected outputs are fed to a learning algorithm, which
produces the program (the model); that model can then be applied to new data to
produce outputs.

Need for Machine Learning

- The Data-Information-Knowledge-Wisdom (DIKW) pyramid illustrates the progression
from raw data to valuable insights. It gives you a framework to discuss the level
of meaning and utility within data. Each level of the pyramid builds on the lower
levels, and to effectively make data-driven decisions, you need all four levels.
- Wisdom is the ability to make well-informed decisions and take
effective action based on understanding of the underlying knowledge.
- Knowledge is the result of analyzing and interpreting information to
uncover patterns, trends, and relationships. It provides an understanding of
"how" and "why" certain phenomena occur.
- Information is organized, structured, and contextualized data.
Information is useful for answering basic questions like "who," "what,"
"where," and "when."
- Data refers to raw, unprocessed facts and figures without context. It is
the foundation for all subsequent layers but holds limited value in isolation.
Types of Machine Learning

Supervised learning
Supervised learning is a type of machine learning in which a model is trained on a
"labelled dataset". Labelled datasets contain both the input and the output
parameters. Supervised learning algorithms learn to map inputs to the correct
outputs; both the training and validation datasets are labelled.

Example: Consider a scenario where you have to build an image classifier to
differentiate between cats and dogs. If you feed labelled images of dogs and cats
to the algorithm, the machine will learn to distinguish a dog from a cat using
these labelled images. When we input new dog or cat images that it has never seen
before, it will use the learned model to predict whether the image shows a dog or
a cat. This is how supervised learning works, and this is an example of image
classification.
There are two main categories of supervised learning that are mentioned below:

Classification - deals with predicting categorical target variables, which
represent discrete classes or labels. For instance, classifying emails as spam or
not spam, or predicting whether a patient has a high risk of heart disease.
Regression - deals with predicting continuous target variables, which represent
numerical values. For example, predicting the price of a house based on its size,
location, and amenities, or forecasting the sales of a product.
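As a concrete illustration of supervised classification, here is a minimal sketch
in Python. It assumes scikit-learn is installed and uses the built-in iris dataset
as a stand-in for the labelled cat/dog images described above: X holds the input
features and y the known class labels.

    # Minimal supervised-learning sketch (assumes scikit-learn is installed).
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)     # labelled dataset: inputs and outputs
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    clf = LogisticRegression(max_iter=200)   # a simple classification model
    clf.fit(X_train, y_train)                # learn the input -> output mapping
    print("test accuracy:", clf.score(X_test, y_test))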

Unsupervised learning
Unsupervised learning is a type of machine learning technique in which an
algorithm discovers patterns and relationships using unlabeled data.
Unlike supervised learning, unsupervised learning doesn’t involve providing the
algorithm with labeled target outputs.
The primary goal of Unsupervised learning is often to discover hidden patterns,
similarities, or clusters within the data, which can then be used for various
purposes, such as data exploration, visualization, dimensionality reduction, and
more.

Example: Consider that you have a dataset that contains information about the
purchases you made from a shop. Through clustering, the algorithm can group
customers with similar purchasing behaviour, revealing customer segments without
predefined labels. This type of information can help businesses target customers
as well as identify outliers.
There are two main categories of unsupervised learning that are mentioned
below:

Clustering - Clustering is the process of grouping data points into clusters based
on their similarity. This technique is useful for identifying patterns and
relationships in data without the need for labeled examples.
Association - Association rule learning is a technique for discovering
relationships between items in a dataset. It identifies rules indicating that the
presence of one item implies the presence of another item with a specific
probability.
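The clustering idea above can be sketched in a few lines of Python. This is only
an illustration: the synthetic "purchase" data, the number of clusters, and the
use of scikit-learn's KMeans are assumptions, not part of the original notes.

    # Minimal clustering sketch (assumes NumPy and scikit-learn are installed).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # two artificial groups of customers: low spenders and high spenders
    purchases = np.vstack([rng.normal(20, 5, size=(50, 2)),
                           rng.normal(80, 5, size=(50, 2))])

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(purchases)
    print(kmeans.labels_[:10])        # cluster assignment of the first 10 customers
    print(kmeans.cluster_centers_)    # typical purchasing behaviour per cluster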

Semi-supervised learning

o It is a combination of the supervised and unsupervised learning models.

o The training data includes only a few of the desired outputs; most of it is
unlabelled.

Reinforcement learning

A reinforcement learning algorithm is a learning method that interacts with the
environment by producing actions and discovering errors.

Trial and error, together with delayed reward, are the most relevant
characteristics of reinforcement learning. In this technique, the model keeps
improving its performance using reward feedback to learn the behaviour or pattern.

These algorithms are often specific to a particular problem, e.g. the Google
self-driving car, or AlphaGo, where a bot competes with humans and even with
itself to become a better and better player of the game of Go.

Each time we feed in data, the agent learns and adds the data to its knowledge,
which becomes its training data. So, the more it learns, the better trained and
hence more experienced it becomes.

Example: Consider that you are training an AI agent to play a game like chess.
The agent explores different moves and receives positive or negative feedback
based on the outcome. Reinforcement learning also finds applications in areas
such as robotics, where agents learn to perform tasks by interacting with their
surroundings.
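To make the reward-feedback idea concrete, below is a toy tabular Q-learning
sketch in Python (NumPy only). The environment, a hypothetical 1-D corridor of
five states with a reward only at the goal, is invented purely for illustration.

    # Toy Q-learning sketch: actions are 0 = left, 1 = right.
    import numpy as np

    n_states, n_actions = 5, 2
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.9, 0.1     # learning rate, discount, exploration
    rng = np.random.default_rng(0)

    for episode in range(500):
        s = 0
        while s != n_states - 1:                       # episode ends at the goal
            a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
            s_next = max(s - 1, 0) if a == 0 else min(s + 1, n_states - 1)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # reward feedback updates the value estimate of the chosen action
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next

    print(Q)   # the learned values favour moving right in every state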

Examples of machine learning

• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual sequences of credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant, or an unusual
sound in your car engine
• Prediction:
– Future stock prices or currency exchange rates
• The web contains a lot of data; tasks with very big datasets often use machine
learning, especially if the data is noisy or non-stationary.
• Spam filtering, fraud detection
• Recommendation systems:
– Lots of noisy data

Applications of machine learning

1. Image Recognition

Image recognition is one of the most common applications of machine learning.
It is used to identify objects, persons, places, digital images, etc. A popular
use case of image recognition and face detection is the automatic friend tagging
suggestion:
Facebook provides a feature of automatic friend tagging suggestions. Whenever we
upload a photo with our Facebook friends, we automatically get a tagging
suggestion with names, and the technology behind this is machine learning's face
detection and recognition algorithm.

2. Speech Recognition

Speech recognition is the process of converting voice instructions into text; it
is also known as "speech to text" or "computer speech recognition". At present,
machine learning algorithms are widely used in various speech recognition
applications. Google Assistant, Siri, Cortana, and Alexa use speech recognition
technology to follow voice instructions.
3. Traffic prediction

Google Maps predicts traffic conditions, such as whether traffic is clear,
slow-moving, or heavily congested, with the help of two sources:
- the real-time location of vehicles from the Google Maps app and road sensors
- the average time taken on past days at the same time of day
Everyone who uses Google Maps is helping to make the app better. It takes
information from the user and sends it back to its database to improve
performance.

4. Email Spam and Malware Filtering

Whenever we receive a new email, it is automatically filtered as important,
normal, or spam. Important mail arrives in our inbox marked with the important
symbol, while spam emails go to our spam box; the technology behind this is
machine learning. Below are some spam filters used by Gmail:
1. Content Filter
2. Header filter
3. General blacklists filter
4. Rules-based filters
5. Permission filters

5. Product recommendations

Machine learning is widely used by various e-commerce and entertainment companies
such as Amazon, Netflix, etc., for recommending products to the user. Whenever we
search for a product on Amazon, we start getting advertisements for the same
product while surfing the internet in the same browser, and this is because of
machine learning.
Bias and Variance

Bias
Bias is the inability of a model to capture the true relationship in the data,
because of which there is some difference or error between the model's predicted
value and the actual value. These differences between the actual (or expected)
values and the predicted values are known as bias error, or error due to bias.
Bias is a systematic error that occurs due to wrong assumptions made in the
machine learning process.

Let Y be the true value of a parameter, and let Ŷ be an estimator of Y based on a
sample of data. Then the bias of the estimator is given by:

    Bias(Ŷ) = E[Ŷ] - Y

where E[Ŷ] is the expected value of the estimator Ŷ. Bias measures how well the
model fits the data.

Low Bias: Low bias value means fewer assumptions are taken to build the target
function. In this case, the model will closely match the training dataset.
High Bias: High bias value means more assumptions are taken to build the target
function. In this case, the model will not match the training dataset closely.

Variance
Variance is the measure of spread in data from its mean position. In machine
learning, variance is the amount by which the performance of a predictive model
changes when it is trained on different subsets of the training data. More
specifically, variance is the variability of the model: how sensitive it is to a
different subset of the training dataset, i.e. how much it adjusts when trained on
a new subset of the training data.

Let Y be the actual values of the target variable, and let Ŷ be the predicted
values. The variance of a model is the expected value of the squared difference
between the predicted values and the expected value of the predicted values:

    Var(Ŷ) = E[(Ŷ - E[Ŷ])²]

where E[Ŷ] is the expected value of the predicted values, averaged over all the
training data.
Variance errors are either low-variance or high-variance errors.

Low variance: Low variance means that the model is less sensitive to changes in
the training data and can produce consistent estimates of the target function with
different subsets of data from the same distribution. Combined with high bias,
this is the case of underfitting, where the model fails to generalize on both
training and test data.
High variance: High variance means that the model is very sensitive to changes
in the training data and can result in significant changes in the estimate of the
target function when trained on different subsets of data from the same
distribution. This is the case of overfitting, where the model performs well on
the training data but poorly on new, unseen test data: it fits the training data
so closely that it fails on new data.
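The two formulas above can be checked numerically. The following sketch (NumPy
only) uses a made-up true value and the sample mean as the estimator; the
distribution and sample size are illustrative assumptions.

    # Estimate bias and variance of an estimator over many training samples.
    import numpy as np

    rng = np.random.default_rng(0)
    true_value = 5.0                        # Y, the quantity being estimated
    estimates = []
    for _ in range(2000):                   # many different training samples
        sample = rng.normal(true_value, 2.0, size=20)
        estimates.append(sample.mean())     # Y_hat on this sample
    estimates = np.array(estimates)

    bias = estimates.mean() - true_value                      # E[Y_hat] - Y
    variance = ((estimates - estimates.mean()) ** 2).mean()   # E[(Y_hat - E[Y_hat])^2]
    print(f"bias = {bias:.4f}, variance = {variance:.4f}")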

Data preprocessing

In the real world, the available data is often:
1. Incomplete data
2. Inaccurate data
3. Outlier data
4. Data with missing values
5. Data with inconsistent values
6. Duplicate data
Data preprocessing improves the quality of the data and hence the results of data
mining and machine learning techniques. The raw data must be preprocessed to give
accurate results. The process of detecting and removing errors in data is called
data cleaning. Data wrangling means making the data processable for machine
learning algorithms. Some data errors include human errors such as typographical
errors or incorrect measurements, and structural errors like improper data
formats. Data errors can also arise from omission and duplication of attributes.
Noise is a random component and involves distortion of a value or the introduction
of spurious objects. The term noise is often used when the data has a spatial or
temporal component. Certain deterministic distortions, such as those in the form
of a streak, are known as artifacts.
Data preprocessing involves the following steps:
1. Getting the dataset
2. Importing libraries
3. Importing datasets
4. Finding Missing Data
5. Encoding Categorical Data
6. Splitting dataset into training and test set
7. Feature scaling

1. Get the Dataset

To create a machine learning model, the first thing we require is a dataset, as a
machine learning model works entirely on data. The data collected for a particular
problem in a proper format is known as the dataset.
Datasets come in different formats for different purposes: for example, the
dataset needed to create a machine learning model for a business purpose will be
different from the dataset required for a medical problem such as liver-patient
diagnosis. So each dataset is different from another dataset. To use the dataset
in our code, we usually put it into a CSV file. However, sometimes we may also
need to use an HTML or xlsx file.
2. Importing libraries
In order to perform data preprocessing using Python, we need to import some
predefined Python libraries. These libraries are used to perform some specific
jobs. There are three specific libraries that we will use for data preprocessing,
which are:

i) Numpy
ii) Matplotlib
iii) Pandas
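In code, these three libraries are typically imported with their conventional
aliases:

    import numpy as np                 # numerical arrays and linear algebra
    import matplotlib.pyplot as plt    # plotting and visualisation
    import pandas as pd                # tabular data handling (DataFrames)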

3. Importing dataset
We need to import the datasets which we have collected for our machine learning
project. But before importing a dataset, we need to set the current directory as a
working directory. To set a working directory in Spyder IDE, we need to follow
the below steps:

a) Save your Python file in the directory which contains the dataset.
b) Go to the File explorer option in Spyder IDE, and select the required directory.
c) Click the F5 button or the run option to execute the file.
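A minimal sketch of loading the dataset with pandas is shown below; the file name
"Data.csv" and the assumption that the last column holds the target are made up
for illustration.

    import pandas as pd

    dataset = pd.read_csv("Data.csv")   # read the CSV from the working directory
    X = dataset.iloc[:, :-1].values     # independent variables (all but last column)
    y = dataset.iloc[:, -1].values      # dependent variable (last column)
    print(dataset.head())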

4. Finding missing data


If our dataset contains some missing data, then it may create a huge problem for
our machine learning model. Hence it is necessary to handle missing values
present in the dataset. There are mainly two ways to handle missing data, which
are:
By deleting the particular row: The first way is commonly used to deal with null
values. In this way, we simply delete the specific row or column that contains
null values.
By calculating the mean: In this way, we calculate the mean of the column or row
that contains the missing value and put it in place of the missing value.
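Both ways can be sketched with pandas and scikit-learn; the column names "Age" and
"Salary" are hypothetical examples, not taken from a real dataset.

    import pandas as pd
    from sklearn.impute import SimpleImputer

    df = pd.read_csv("Data.csv")

    # Option 1: delete every row that contains a null value
    df_dropped = df.dropna()

    # Option 2: replace missing numeric values with the column mean
    imputer = SimpleImputer(strategy="mean")
    df[["Age", "Salary"]] = imputer.fit_transform(df[["Age", "Salary"]])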

5. Encoding Categorical data


Categorical data is data that has categories; for example, in our dataset there
are two categorical variables, Country and Purchased.
Since a machine learning model works entirely on mathematics and numbers, a
categorical variable in the dataset may create trouble while building the model.
So it is necessary to encode these categorical variables into numbers.
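A short sketch of encoding the Country and Purchased columns, assuming pandas and
scikit-learn are available and that Purchased takes only two values:

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder

    df = pd.read_csv("Data.csv")

    # one-hot encode the multi-class Country column
    df = pd.get_dummies(df, columns=["Country"])

    # label-encode the binary Purchased column (e.g. No -> 0, Yes -> 1)
    df["Purchased"] = LabelEncoder().fit_transform(df["Purchased"])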

6. Splitting the Dataset into the Training set and Test set
In machine learning data preprocessing, we divide our dataset into a training set
and a test set. This is one of the crucial steps of data preprocessing, as it lets
us check how well our machine learning model generalizes. Suppose we train our
machine learning model on one dataset and then test it on a completely different
dataset; the model will then have difficulty understanding the correlations
between them.

If we train our model very well and its training accuracy is very high, but its
performance decreases when we provide it with a new dataset, the model is not
generalizing. So we always try to build a machine learning model that performs
well with the training set and also with the test dataset. Here, we can define
these datasets as:

Training set: a subset of the dataset used to train the machine learning model;
we already know the output.
Test set: a subset of the dataset used to test the machine learning model; by
using the test set, the model predicts the output.
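In practice the split is usually done with scikit-learn; the sketch below reuses
the X and y arrays from the earlier loading example, and the 80/20 split is a
common, assumed choice rather than a fixed rule.

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)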
7. Feature Scaling
Feature scaling is the final step of data preprocessing in machine learning. It is
a technique to standardize the independent variables of the dataset to a specific
range. In feature scaling, we put our variables in the same range and on the same
scale so that no variable dominates the others.
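A minimal feature-scaling sketch with scikit-learn: the scaler is fitted on the
training data only and the same transformation is then applied to the test data.

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)   # learn mean/std from training set
    X_test_scaled = scaler.transform(X_test)         # reuse them on the test set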

Noise removal

Noise is a random error or variance in a measured value. It results from
inaccurate measurements, inaccurate data collection, or irrelevant information.
The following can be reasons for noisy data:
i) Errors in data collection, such as malfunctioning sensors or human error during
data entry, can introduce noise into the data.
ii) Noise can also be introduced by measurement mistakes, such as inaccurate
instruments or environmental conditions.
iii) Another form of noise in data is inherent variability resulting from either
natural fluctuations or unforeseen events.
iv) If data pretreatment operations like normalization or transformation are not
done appropriately, they may unintentionally add noise.
v) Inaccurate data point labeling or annotation can introduce noise and affect the
learning process.
Noise can be reduced by using binning, a method where the given data values are
sorted and distributed into equal-frequency bins, which are also called buckets.
The binning method then uses the neighbouring values to smooth the noisy data.
Some of the techniques commonly used are 'smoothing by bin means', where the mean
of the bin replaces the values in the bin; 'smoothing by bin medians', where the
bin median replaces the bin values; and 'smoothing by bin boundaries', where each
bin value is replaced by the closest bin boundary (the maximum and minimum values
of a bin are called its bin boundaries). Binning methods may also be used as a
discretization technique.
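Smoothing by bin means can be sketched as follows; the twelve sorted values and
the choice of three bins are illustrative assumptions.

    import numpy as np

    values = np.array([4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34])  # sorted data
    bins = np.split(values, 3)                  # three equal-frequency bins
    # replace every value by the mean of its bin (smoothing by bin means)
    smoothed = np.concatenate([np.full(len(b), b.mean()) for b in bins])
    print(smoothed)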

Noise removal techniques:


Data preprocessing: It consists of methods to improve the quality of the data and
lessen noise from errors or inconsistencies, such as data cleaning, normalization,
and outlier elimination.
Fourier Transform: The Fourier Transform is a mathematical technique used to
transform signals from the time or spatial domain to the frequency domain. In the
context of noise removal, it can help identify and filter out noise by
representing the signal as a combination of different frequencies. Relevant
frequencies can be retained while noise frequencies are filtered out (a small
sketch is given after these techniques).
Constructive Learning: Constructive learning involves training a machine learning
model to distinguish between clean and noisy data instances. This approach
typically requires labeled data where the noise level is known. The model learns
to classify instances as either clean or noisy, allowing for the removal of noisy
data points from the dataset.
Autoencoders: Autoencoders are neural network architectures that consist of an
encoder and a decoder. The encoder compresses the input data into a lower-
dimensional representation, while the decoder reconstructs the original data from
this representation. Autoencoders can be trained to reconstruct clean signals while
effectively filtering out noise during the reconstruction process.
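The Fourier Transform approach mentioned above can be sketched with NumPy on a
synthetic 1-D signal; the signal, the noise level, and the 10 Hz cut-off are
assumptions made for illustration.

    import numpy as np

    t = np.linspace(0, 1, 500, endpoint=False)
    clean = np.sin(2 * np.pi * 5 * t)                  # 5 Hz signal of interest
    noisy = clean + 0.5 * np.random.default_rng(0).normal(size=t.size)

    spectrum = np.fft.rfft(noisy)
    freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
    spectrum[freqs > 10] = 0                           # crude low-pass filter at 10 Hz
    denoised = np.fft.irfft(spectrum, n=t.size)
    print(np.abs(denoised - clean).mean())             # smaller than for the noisy signal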
Normalization

Normalization is an essential step in the preprocessing of data for machine
learning models, and it is a feature scaling technique. Normalization is
especially crucial for data manipulation: scaling the range of data down or up
before it is used in subsequent stages, in fields such as soft computing, cloud
computing, etc.
Data normalization improves the consistency and comparability of
different predictive models by standardizing the range of independent
variables or features within a dataset, leading to more steady and
dependable results.
Although there are many feature normalization techniques in machine learning, a
few of them are most frequently used. These are as follows:

Min-Max Scaling:
This technique is also simply referred to as scaling. The Min-Max scaling method
shifts and rescales the values of the attributes so that they end up ranging
between 0 and 1.
Standardization scaling:
Standardization scaling is also known as Z-score normalization, in which values
are centred around the mean with a unit standard deviation: the mean of the
attribute becomes zero and the resulting distribution has a unit standard
deviation. Mathematically, we calculate the standardized value by subtracting the
mean from the feature value and dividing by the standard deviation.
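Both techniques can be written out directly with NumPy; the small array of values
is purely illustrative.

    import numpy as np

    x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

    min_max = (x - x.min()) / (x.max() - x.min())   # rescales values into [0, 1]
    z_score = (x - x.mean()) / x.std()              # zero mean, unit standard deviation
    print(min_max)    # [0.   0.25 0.5  0.75 1.  ]
    print(z_score)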

There are several reasons for the need for data normalization as follows:

i) Normalisation is essential to machine learning for a number of reasons.


Throughout the learning process, it guarantees that every feature
contributes equally, preventing larger-magnitude features from
overshadowing others.
ii) It enables faster convergence of algorithms for optimisation, especially
those that depend on gradient descent. Normalisation improves the
performance of distance-based algorithms like k-Nearest Neighbours.
iii) Normalisation improves overall performance by addressing model
sensitivity problems in algorithms such as Support Vector Machines and
Neural Networks.
iv) Because it assumes uniform feature scales, it also supports the use of
regularisation techniques like L1 and L2 regularisation.
v) In general, normalisation is necessary when working with attributes that
have different scales; otherwise, the effectiveness of a significant attribute
that is equally important (on a lower scale) could be diluted due to other
attributes having values on a larger scale.

Advantages of Data Normalization


1. More clustered indexes could potentially be produced.
2. Index searching is accelerated, which leads to quicker data retrieval.
3. Quicker data modification commands.
4. The removal of redundant and null values to produce more compact data.
5. Reduction of anomalies resulting from data modification.
6. Conceptual clarity and simplicity of upkeep, enabling simple adaptations
to changing needs.
7. Because more rows can fit on a data page with narrower tables,
searching, sorting, and index creation are more efficient.

Disadvantages of Data Normalization


1. It gets harder to link tables together when the information is spread across
multiple tables, and it becomes more difficult to understand the database as a
whole.
2. Because normalized data is stored as coded values rather than the actual data,
tables contain codes instead of real values, so you have to keep consulting the
lookup (query) tables.
3. This information model is hard to query because it is designed for programs,
not for ad hoc queries; query tools built on top of it, composed of SQL
accumulated over time, frequently perform this function. If you do not first
understand the needs of the client, it may be challenging to demonstrate knowledge
and understanding.
4. A comprehensive understanding of the various normal forms is essential to
completing the normalization process successfully. Careless use can lead to a
poor design with significant anomalies and inconsistent data.
