Chapter 2 - Image Classification

Image Classification

Notes based on
Machine Learning for the Web by YiNing at ITP, NYU,
How to build a Teachable Machine with TensorFlow.js by Nikhil Thorat from Google Brain, PAIR,
and Daniel Shiffman from the Coding Train

9/10/2023 Zafri Baharuddin EEEB4023/ECEB463 AI 1 The Energy University


Classification: Basic Human task

Input: image Output:

cat
bird
deer
dog
truck

Resolution:
800 x 600 x 3 (RGB)

All images from https://unsplash.com/s/photos/cat



Classification: Basic AI task
Input: image → AI "Magic Box" → Output:

Class   Probability
cat     0.82
bird    0.02
deer    0.04
dog     0.10
truck   0.02

Resolution:
800 x 600 x 3 (RGB)
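
As a small sketch of what this slide means in code: an 800 x 600 RGB image is an array of 800 × 600 × 3 = 1,440,000 numbers, and the classifier's answer is the class with the largest probability. The (height, width, channels) array layout and the variable names are assumptions for illustration; the probabilities are the ones shown on the slide.

```python
import numpy as np

# Assumed layout: (height, width, channels) with 8-bit values.
image = np.zeros((600, 800, 3), dtype=np.uint8)
num_values = image.size  # 600 * 800 * 3 numbers describe one image

# Class probabilities from the slide.
classes = ["cat", "bird", "deer", "dog", "truck"]
probabilities = np.array([0.82, 0.02, 0.04, 0.10, 0.02])

# The predicted label is the class with the highest probability.
predicted_label = classes[int(np.argmax(probabilities))]
print(num_values, predicted_label)
```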

All images from https://unsplash.com/s/photos/cat



AI Classification: Challenges
• Change view angle

• Variation within class

• Sub-classes (different cat breeds)

Examples: Maine Coon, British Shorthair, Grumpy Cat (mixed breed, 2012-2019)

• Background blending / camouflage

• Lighting change

• Deformation

• Occlusion



Image Classification: Very Useful!
Medical Imaging (Levy et al, 2016; figure reproduced with permission)
Galaxy Classification (Dieleman et al, 2014; images from left to right: public domain by NASA, usage permitted by ESA/Hubble, public domain by NASA, and public domain)
Whale recognition (Kaggle Challenge; this image by Christin Khan is in the public domain and originally came from the U.S. NOAA)



Image Classification: Building Block for other tasks!
Example: Object Detection

Person

Horse
This image is free to use under the Pexels license



Image Classification: Building Block for other tasks!
Example: Object Detection

Background
Horse
Person
Car
Truck
This image is free to use under the Pexels license



Image Classification: Building Block for other tasks!
Example: Image Captioning

What word to say next?
Candidate words: riding, cat, horse, man, when, <STOP>
Caption: Man riding horse

This image is free to use under the Pexels license

Image Classification: Building Blocks for other tasks.
• Object detection

Output: bounding boxes and class (in the example image: Car, Car, Bike)

• Object segmentation

Output: segments and class

• Image captioning

Output: description: There are two cars and a bike on the road

• Playing go

What’s the next move?
Classes: board positions (1, 1), (1, 2), ..., (1, 19), ..., (19, 19)



An Image Classifier

def classify_image(image):
    class_label = None  # placeholder: some AI magic goes here
    return class_label

Unlike, e.g., sorting a list of numbers, there is no obvious way to hard-code the algorithm for recognizing a cat, or other classes.



You can try edge detection…

Find edges Find corners
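
To make the "find edges" step concrete, here is a minimal sketch of a crude edge detector built from brightness differences between neighbouring pixels. The function name `find_edges`, the threshold value, and the toy image are assumptions for illustration, not the method used on the slide.

```python
import numpy as np

def find_edges(gray, threshold=0.25):
    """Mark pixels where brightness changes sharply (a crude edge detector)."""
    gray = np.asarray(gray, dtype=float)
    # Brightness differences along rows (dy) and along columns (dx).
    dy, dx = np.gradient(gray)
    magnitude = np.hypot(dx, dy)
    return magnitude > threshold

# Toy grayscale image: dark left half, bright right half.
toy = np.zeros((4, 6))
toy[:, 3:] = 1.0
edges = find_edges(toy)
print(edges.astype(int))
```

The result says where brightness changes, not what the object is. Turning edges and corners into a reliable rule for "cat" is the part that is hard to write by hand.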



Machine Learning: Data driven approach
1. Dataset – Collect a dataset of images with labels
2. Train – Use ML to train a classifier
3. Test/deploy – Evaluate the classifier on new images
Example training set

def train(images, labels):
    model = None  # placeholder: machine learning happens here
    return model

def predict(model, test_image):
    predicted_label = None  # placeholder: use model to predict label
    return predicted_label



Image Classification Datasets: MNIST
10 classes: Digits 0 to 9
28x28 grayscale images
60k training images
10k test images

Results from MNIST often do not hold on more complex datasets!



Image Classification Datasets: Fashion MNIST
10 classes
28x28 grayscale images
60k training images
10k test images

https://github.com/zalandoresearch/fashion-mnist

Results from MNIST often do not hold on more complex datasets!



Image Classification Datasets: CIFAR10

10 classes
50k training images (5k per class)
10k testing images (1k per class)
32x32 RGB images

https://www.cs.toronto.edu/~kriz/cifar.html



Image Classification Datasets: CIFAR100
100 classes
50k training images (500 per class)
10k testing images (100 per class)
32x32 RGB images

20 superclasses with 5 classes each:

Aquatic mammals:
beaver, dolphin, otter, seal, whale

Trees:
maple, oak, palm, pine, willow
https://www.cs.toronto.edu/~kriz/cifar.html



Image Classification Datasets: ImageNet

1000 classes
~1.3M training images (~1.3K per class)
50K validation images (50 per class)
100K test images (100 per class)

Performance metric: Top 5 accuracy
Algorithm predicts 5 labels for each image; one of them needs to be right
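
The top-5 metric can be sketched as a top-k accuracy function. The helper name `top_k_accuracy` and the toy scores are assumptions for illustration; the toy example has only 4 classes, so it uses k = 1 and k = 2 instead of k = 5.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=5):
    """Fraction of images whose true label is among the k highest scores."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row (order within the k is irrelevant).
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

scores = np.array([
    [0.10, 0.50, 0.20, 0.20],  # true label 1 has the top score
    [0.40, 0.10, 0.30, 0.20],  # true label 2 has the second-best score
    [0.60, 0.20, 0.15, 0.05],  # true label 3 has the lowest score
])
labels = np.array([1, 2, 3])
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```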



Image Classification Datasets: MIT Places

365 classes of different scene types
~8M training images
18.25K val images (50 per class)
328.5K test images (900 per class)

Images have variable size, often resized to 256x256 for training



A Large-scale Mechanical Components Benchmark for Deep Neural Networks
• https://mechanical-components.herokuapp.com/
• https://www.youtube.com/watch?v=EaKo7ky15uA



Datasets
1. MNIST
2. CIFAR10
3. CIFAR100
4. ImageNet
5. MIT Places



Datasets: Number of Training Pixels
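
The comparison can be approximated with simple arithmetic: training images × height × width × channels. This is a rough sketch under stated assumptions: MNIST uses the 60k images of its official training split, and ImageNet and MIT Places images are assumed to be resized to 256x256 RGB, so the large numbers are order-of-magnitude estimates only.

```python
# (training images, height, width, channels)
datasets = {
    "MNIST": (60_000, 28, 28, 1),            # official training split
    "CIFAR10": (50_000, 32, 32, 3),
    "CIFAR100": (50_000, 32, 32, 3),
    "ImageNet": (1_300_000, 256, 256, 3),    # assumes resizing to 256x256
    "MIT Places": (8_000_000, 256, 256, 3),  # assumes resizing to 256x256
}

# Count every stored number (one per pixel per channel).
training_values = {
    name: n * h * w * c for name, (n, h, w, c) in datasets.items()
}

for name, count in training_values.items():
    print(f"{name:<11} ~{count:.2e}")
```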



Your first classifier: Nearest Neighbour

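def train(images, labels):
    # Memorize all data and labels
    model = (images, labels)
    return model

def predict(model, test_image):
    # Predict the label of the most similar training image
    # (assumes flat pixel sequences and L1 distance)
    images, labels = model
    dists = [sum(abs(a - b) for a, b in zip(img, test_image)) for img in images]
    return labels[dists.index(min(dists))]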



Distance Metric to compare images



Nearest Neighbor Classifier

Memorize training data

For each test image:
    Find nearest training image
    Return label of nearest image

Q: With N examples, how fast is training?
A: O(1)

Q: With N examples, how fast is testing?
A: O(N)

This is bad: We can afford slow training, but we need fast testing!
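
A minimal NumPy sketch shows where the cost lands. The class name `NearestNeighbor`, the L1 distance, and the toy data are assumptions for illustration; images are assumed to be already flattened into rows of a float array.

```python
import numpy as np

class NearestNeighbor:
    def train(self, images, labels):
        # O(1): only store references; images has shape (N, D), labels shape (N,).
        self.images = images
        self.labels = labels

    def predict(self, test_image):
        # O(N): compare the test image with every stored training image.
        distances = np.abs(self.images - test_image).sum(axis=1)  # L1 distance
        return self.labels[np.argmin(distances)]

train_images = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 9.0]])
train_labels = np.array(["cat", "dog", "bird"])

clf = NearestNeighbor()
clf.train(train_images, train_labels)
prediction = clf.predict(np.array([9.0, 8.0]))
print(prediction)
```

Every call to `predict` touches all N stored rows, so the work per test image grows with the size of the training set.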
There are many methods for fast / approximate nearest neighbors; e.g. see
https://github.com/facebookresearch/faiss



What does this look like?



Nearest Neighbor Decision Boundaries

Nearest neighbors in two dimensions (axes: x0, x1)

Points are training examples; colors give training labels

Background colors give the category a test point would be assigned

Decision boundary is the boundary between two classification regions

Decision boundaries can be noisy; affected by outliers

How to smooth out decision boundaries? Use more neighbors!
K-Nearest Neighbors
Instead of copying label from nearest neighbor, take majority vote from K closest points

(Figure panels: K=1, K=3)
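
The majority vote can be sketched as follows. The function name `knn_predict`, the L2 distance, the toy points, and the tie-breaking rule (prefer the tied class that has the closest member) are all assumptions for illustration.

```python
from collections import Counter

import numpy as np

def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to query (L2 distance)."""
    distances = np.linalg.norm(train_points - query, axis=1)
    nearest = np.argsort(distances)[:k]  # indices, closest first
    votes = Counter(train_labels[i] for i in nearest)
    best = max(votes.values())
    # Assumed tie-break: among the classes with the most votes,
    # return the one whose member is closest to the query.
    for i in nearest:
        if votes[train_labels[i]] == best:
            return train_labels[i]

# Three "a" points surround a single "b" outlier; two more "b" points lie far away.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [0.4, 0.4], [5.0, 5.0], [6.0, 5.0]])
labels = ["a", "a", "a", "b", "b", "b"]
query = np.array([0.5, 0.5])

label_k1 = knn_predict(points, labels, query, k=1)
label_k3 = knn_predict(points, labels, query, k=3)
print(label_k1, label_k3)
```

With K=1 the outlier decides the label; with K=3 it is outvoted by its two nearest "a" neighbours.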

• Using more neighbors helps smooth out rough decision boundaries
• Using more neighbors helps reduce the effect of outliers
• When K > 1 there can be ties between classes. Need to break somehow!


K-Nearest Neighbors: Distance Metric
L1 (Manhattan) distance L2 (Euclidean) distance
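
For reference, the L1 distance is the sum of absolute differences, d1(a, b) = Σ |a_i − b_i|, and the L2 distance is the square root of the sum of squared differences, d2(a, b) = sqrt(Σ (a_i − b_i)²). A short sketch, with function names chosen here for illustration:

```python
import numpy as np

def l1_distance(a, b):
    """Manhattan distance: sum of absolute differences."""
    return float(np.abs(a - b).sum())

def l2_distance(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return float(np.sqrt(((a - b) ** 2).sum()))

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
d1 = l1_distance(a, b)  # 3 + 4
d2 = l2_distance(a, b)  # sqrt(9 + 16)
print(d1, d2)
```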



K-Nearest Neighbors: Distance Metric

With the right choice of distance metric, we can apply K-Nearest Neighbor to any type of data!

Example: Compare research papers using tf-idf similarity

http://www.arxiv-sanity.com/search?q=mesh+r-cnn



K-Nearest Neighbors: Web Demo

Interactively move points around and see decision boundaries change

Play with L1 vs L2 metrics

Play with changing number of training points, value of K

http://vision.stanford.edu/teaching/cs231n-demos/knn/



Hyperparameters
• What is the best value of K to use?
• What is the best distance metric to use?

These are examples of hyperparameters: choices about our learning algorithm that we don’t learn from the training data; instead we set them at the start of the learning process.

Very problem-dependent. In general, we need to try them all and see what works best for our data / task.



Setting Hyperparameters
Idea #1: Choose hyperparameters that work best on the data
BAD: K = 1 always works perfectly on training data
[ Your dataset ]

Idea #2: Split data into train and test, choose hyperparameters that work best on test data
BAD: No idea how algorithm will perform on new data
[ train | test ]

Idea #3: Split data into train, val, and test; choose hyperparameters on val and evaluate on test
Better!
[ train | validate | test ]
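
Idea #3 can be sketched end to end on synthetic data. Everything here is an assumption for illustration: the two-blob dataset, the 60% / 20% / 20% split, the candidate values of K, and the helper names `knn_predict` and `accuracy`.

```python
import numpy as np

def knn_predict(train_x, train_y, queries, k):
    """Predict integer labels for each query by majority vote of the k nearest (L2)."""
    # Pairwise squared distances, shape (num_queries, num_train).
    d2 = ((queries[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]  # shape (num_queries, k)
    return np.array([np.bincount(v).argmax() for v in votes])

def accuracy(pred, y):
    return float((pred == y).mean())

rng = np.random.default_rng(0)
# Synthetic two-class data: two overlapping Gaussian blobs.
x = np.concatenate([rng.normal(0.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
order = rng.permutation(len(x))
x, y = x[order], y[order]

# Split into train / validation / test.
train_x, val_x, test_x = x[:120], x[120:160], x[160:]
train_y, val_y, test_y = y[:120], y[120:160], y[160:]

# Idea #1 pitfall: K = 1 is always perfect on the data it memorized.
train_acc_k1 = accuracy(knn_predict(train_x, train_y, train_x, 1), train_y)

# Choose K on the validation set only.
candidates = [1, 3, 5, 7, 9]
val_acc = {k: accuracy(knn_predict(train_x, train_y, val_x, k), val_y) for k in candidates}
best_k = max(val_acc, key=val_acc.get)

# Touch the test set once, at the very end.
test_acc = accuracy(knn_predict(train_x, train_y, test_x, best_k), test_y)
print(train_acc_k1, best_k, test_acc)
```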



K-Nearest Neighbor: Universal Approximation
As the number of training samples goes to infinity, nearest
neighbor can represent any(*) function!

(*) Subject to many technical conditions. Only continuous functions on a compact domain; need to make assumptions about spacing of training points; etc.
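
A small numerical sketch of this idea, under stated assumptions: the target is the continuous function sin on the compact interval [0, 2π], training points are evenly spaced, and the prediction at a query is the value stored at its single nearest training point. The helper names are illustrative.

```python
import numpy as np

def nn_regress(train_x, train_y, query_x):
    """1-nearest-neighbor prediction for 1-D inputs."""
    nearest = np.abs(query_x[:, None] - train_x[None, :]).argmin(axis=1)
    return train_y[nearest]

def max_error(num_train):
    # Evenly spaced samples of sin on the compact interval [0, 2*pi].
    train_x = np.linspace(0.0, 2 * np.pi, num_train)
    train_y = np.sin(train_x)
    query_x = np.linspace(0.0, 2 * np.pi, 1000)
    predictions = nn_regress(train_x, train_y, query_x)
    return float(np.abs(predictions - np.sin(query_x)).max())

# The worst-case error shrinks as the training set grows.
errors = {n: max_error(n) for n in (5, 50, 500)}
print(errors)
```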



K-Nearest Neighbor on raw pixels is seldom used
- Very slow at test time
- Distance metrics on pixels are not informative
Example images: Original, Boxed, Shifted, Tinted
(all 3 modified images have the same L2 distance to the original on the left)
Original image is CC0 public domain
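
The effect can be reproduced on a synthetic array: two very different edits end up at the same L2 distance from the original. This is only a sketch; the random 32x32 "image", the patch size, and the way the tint is scaled are assumptions, not the images from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
original = rng.uniform(0.0, 1.0, size=(32, 32))  # stand-in for a grayscale image

# "Boxed": a large change confined to a small 4x4 patch (values are not clipped).
boxed = original.copy()
boxed[:4, :4] += 1.0

# "Tinted": a small uniform brightness change, scaled to the same L2 norm.
patch_norm = np.linalg.norm(boxed - original)  # sqrt(16 * 1.0**2) = 4
tinted = original + patch_norm / np.sqrt(original.size)

d_boxed = float(np.linalg.norm(boxed - original))
d_tinted = float(np.linalg.norm(tinted - original))
print(d_boxed, d_tinted)
```

The two edits look nothing alike, yet pixel-wise L2 distance cannot tell them apart.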




Nearest Neighbor with ConvNet features works well!
Example: Image Captioning with Nearest Neighbor

Devlin et al, “Exploring Nearest Neighbor Approaches for Image Captioning”, 2015



Summary
In image classification we start with a training set of images and labels, and must predict labels on the test set.
Image classification is challenging due to the semantic gap: we need invariance to occlusion, deformation, lighting, intraclass variation, etc.
Image classification is a building block for other vision tasks.
The K-Nearest Neighbors classifier predicts labels based on the nearest training examples.
Distance metric and K are hyperparameters.
Choose hyperparameters using the validation set; only run on the test set once at the very end!
Next time: Linear Classifiers
