Basics of ML

Sample, Population, Matrices

Uploaded by

Tamilarasi Suresh

Class 2

Sample
• A sample statistic is a piece of information you get from a fraction of a
population.
• A sample is just a part of a population.
• For example, let’s say your population was every American, and you
wanted to find out how much the average person earns. Time and
finances stop you from knocking on every door in America, so you
choose to ask 1,000 random people. These 1,000 people are
your sample.
• Once you have your sample, you can compute a statistic from it. A statistic
is really just a piece of information; in this example, average
earnings.
Population
• A population is a whole: it’s every member
of a group.
• A population is the opposite of a sample,
which is a fraction or percentage of a group.
• Sometimes it’s possible to survey every
member of a group.
• A classic example is the Census, where it’s
the law that you have to respond. Note: if
you do manage to survey everyone, the
survey is called a census.
• If you go into a candy store, the owner might have samples of their
products on display.
• It wouldn’t be possible for you to sample everything in the store.
• Financially, the owner wouldn’t want you to taste everything for free.
And you probably wouldn’t want to eat a sample of candy from a
couple hundred jars, or you might get sick to your stomach.
• So, you might base your opinion of the entire store’s candy line
on the samples they have to offer.
• The same logic holds true for most surveys in stats: you’re only going
to want to take a sample of the whole population (“population” in
this example would be the entire candy line).
• The result is a statistic about that population.
• Statistics are values you compute from sample data rather than from the whole population.
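The earnings example can be sketched in a few lines of Python. This is only an illustration: the population below is made-up data (a hypothetical list of 100,000 earnings drawn from a log-normal distribution), and the names `population` and `sample` are chosen for this sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 made-up yearly earnings (not real data).
population = [random.lognormvariate(10.5, 0.6) for _ in range(100_000)]

# Ask 1,000 people chosen at random, without replacement.
sample = random.sample(population, 1_000)

# In practice the population mean is usually unknown; we compute it here
# only to compare it with the statistic obtained from the sample.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.0f}")
print(f"sample mean:     {sample_mean:.0f}")
```

The sample mean is the statistic. It will usually be close to, but not exactly equal to, the population mean.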
Scalar Multiplication

• Input: mat[][] = {{2, 3}, {5, 4}}
• k = 5
• Output: 10 15
          25 20
We multiply every element of the matrix by 5.
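A minimal Python sketch of the same operation, using plain nested lists. The function name `scalar_multiply` is an assumption made for this example.

```python
def scalar_multiply(mat, k):
    """Return a new matrix with every element of mat multiplied by k."""
    return [[k * value for value in row] for row in mat]


mat = [[2, 3], [5, 4]]
k = 5
result = scalar_multiply(mat, k)

# Print one row per line, matching the layout of the example above.
for row in result:
    print(*row)
```

The function builds a new matrix and leaves `mat` unchanged.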
4 Types of Distance Metrics in Machine Learning
• These distance metrics are used in both supervised and unsupervised
learning, generally to calculate the similarity between data points.
• Euclidean Distance
• Manhattan Distance
• Minkowski Distance
• Hamming Distance
1. Euclidean Distance
• Euclidean Distance is the straight-line (shortest) distance between two
points.
• Many machine learning algorithms, including K-Means, use this
distance metric to measure the similarity between observations.
[Figure: Sample Points]
Formula for Euclidean distance:
2-dimensional space, for points p = (p_1, p_2) and q = (q_1, q_2):
$$d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2}$$
N-dimensional space:
$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
• Where,
• n = number of dimensions
• p_i, q_i = the i-th coordinates of the two data points
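The n-dimensional Euclidean distance can be sketched in Python as follows. The points (1, 2) and (4, 6) are hypothetical values chosen for illustration, not the sample points from the slide.

```python
import math


def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Hypothetical 2-dimensional points.
p = (1, 2)
q = (4, 6)
print(euclidean_distance(p, q))  # sqrt(3**2 + 4**2) = 5.0
```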
Manhattan Distance
• Manhattan Distance is the sum of absolute differences between two
points across all the dimensions.
• In a 2-dimensional representation, we take the sum of the absolute
differences in the x and y directions.
• So, the Manhattan distance in a 2-dimensional space, for points
p = (p_1, p_2) and q = (q_1, q_2), is given as:
$$d(p, q) = |p_1 - q_1| + |p_2 - q_2|$$
And the generalized formula for an n-dimensional space is given as:
$$d(p, q) = \sum_{i=1}^{n} |p_i - q_i|$$
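A short Python sketch of the Manhattan distance for any number of dimensions. The two points are hypothetical values chosen for illustration.

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences across all dimensions."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical 2-dimensional points.
p = (1, 2)
q = (4, 6)
print(manhattan_distance(p, q))  # |1 - 4| + |2 - 6| = 7
```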
Minkowski Distance
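• Minkowski Distance is the generalized form of Euclidean and
Manhattan Distance.
• The formula for Minkowski Distance is given as:
$$D(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$
Here, x_i and y_i are the i-th coordinates of the two points (written p_i and q_i in the
Euclidean formula), n is the number of dimensions, and p represents the order of the norm.
For example, the Minkowski Distance of order 3 is
$$D(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^3 \right)^{1/3}$$
When the order p is 1, the formula gives the Manhattan Distance, and when p is 2,
it gives the Euclidean Distance.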
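A minimal Python sketch of the Minkowski distance, including an order-3 calculation. The points are hypothetical values chosen for illustration, and the parameter is named `order` here to avoid clashing with a point named p.

```python
def minkowski_distance(x, y, order):
    """Minkowski distance of the given order (order >= 1) between x and y."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    if order < 1:
        raise ValueError("order must be at least 1")
    return sum(abs(a - b) ** order for a, b in zip(x, y)) ** (1 / order)


# Hypothetical 2-dimensional points.
x = (1, 2)
y = (4, 6)

print(minkowski_distance(x, y, 3))  # order 3: (27 + 64) ** (1/3)
print(minkowski_distance(x, y, 1))  # order 1: same value as Manhattan
print(minkowski_distance(x, y, 2))  # order 2: same value as Euclidean
```

With these points, order 1 gives 7 and order 2 gives 5, matching the Manhattan and Euclidean distances between the same pair.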
Hamming Distance
Hamming Distance measures how different two strings of the same
length are. It is the number of positions at which the corresponding
characters are different.
• Let’s understand the concept using an example. Let’s say we have two
strings:
• “euclidean” and “manhattan”
Since the lengths of these strings are equal, we can calculate the
Hamming Distance. We go character by character and compare the
strings. The first characters of the two strings (e and m respectively) are
different. Similarly, the second characters (u and a) are
different, and so on.
• Look carefully: seven characters are different, whereas two
characters (the last two, a and n) are the same.

Hence, the Hamming Distance here is 7. Note that the larger the Hamming Distance
between two strings, the more dissimilar those strings are (and vice versa).
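The character-by-character comparison can be sketched in Python as follows. The function name `hamming_distance` is an assumption made for this example.

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("strings must have the same length")
    # True counts as 1 when summed, so this counts the mismatches.
    return sum(a != b for a, b in zip(s, t))


print(hamming_distance("euclidean", "manhattan"))  # 7
```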
