Emotion AI Slides PDF

The document describes building an emotion AI system using deep learning models to analyze facial expressions in images. It involves training two separate models: 1) A convolutional neural network model to detect facial keypoints. 2) A classifier to identify one of five emotions (anger, disgust, sadness, happiness, surprise) based on the facial expression. The system will take an input image, detect the keypoints, identify the emotion, and return the combined prediction to automatically monitor and classify people's emotions in images.


INTRODUCTION TO EMOTION AI

• Artificial Emotional Intelligence, or Emotion AI, is a branch of AI that allows computers to understand human non-verbal cues such as body language and facial expressions.
• Affectiva offers cutting-edge Emotion AI technology: https://www.affectiva.com/

Photo Credit: https://en.wikipedia.org/wiki/File:11vRwNSw.jpg


Photo Credit: https://www.flickr.com/photos/jurvetson/49352718206
PROJECT OVERVIEW
• The aim of this project is to classify people’s emotions based on their face images.
• In this case study, we will assume that you work as an AI/ML consultant.
• You have been hired by a startup in San Diego to build, train, and deploy a system that automatically monitors people’s emotions and expressions.
• The team has collected more than 20,000 facial images with their associated facial expression labels, and around 2,000 images with their facial key-point annotations.
[Figure: an input image is passed to the AI model, which returns prediction #1, the facial expression (emotion), here happiness, and prediction #2, the key facial points.]

PROJECT OVERVIEW: PIPELINE TO DETECT KEYPOINTS & EMOTIONS
[Figure: the input image is fed to (1) the facial key points detection model and (2) the facial expression (emotion) detection model; their outputs are combined into a single prediction, e.g. emotion = happiness together with the detected key facial points.]
PART 1. KEY FACIAL POINTS DETECTION
• In part #1, we will create a deep learning model based on Convolutional Neural Networks and Residual Blocks to predict facial key-points.

Data Source: https://www.kaggle.com/c/facial-keypoints-detection/data


PART 1. KEY FACIAL POINTS DETECTION
• The dataset consists of the x and y coordinates of 15 facial key points.
• Input images are 96 x 96 pixels.
• Images consist of only one color channel (grayscale images).

[Figure: a 96 x 96 grayscale input image is passed to the trained key facial points detector model, which outputs the key facial point coordinates.]
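As a hedged illustration of how this data might be loaded, here is a minimal Python sketch. It assumes the Kaggle facial-keypoints-detection "training.csv" layout (30 keypoint coordinate columns plus an "Image" column of space-separated pixel values); the file path is a placeholder, not part of the slides.

```python
import numpy as np
import pandas as pd

# Assumed layout: 30 keypoint coordinate columns plus an "Image" column storing each
# 96 x 96 grayscale image as space-separated pixel values. The path is a placeholder.
df = pd.read_csv("data/training.csv").dropna()

# Convert the pixel strings into a (num_samples, 96, 96, 1) float array scaled to [0, 1].
X = np.stack(df["Image"].apply(lambda s: np.array(s.split(), dtype="float32")).to_numpy())
X = X.reshape(-1, 96, 96, 1) / 255.0

# The remaining columns are the x and y coordinates of the 15 facial key points.
y = df.drop(columns=["Image"]).to_numpy(dtype="float32")

print(X.shape, y.shape)  # e.g. (N, 96, 96, 1) and (N, 30)
```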
PART 2. FACIAL EXPRESSION
(EMOTION) DETECTION
• The second model will classify people’s emotions.
• The data contains images that belong to 5 categories:
  o 0 = Angry
  o 1 = Disgust
  o 2 = Sad
  o 3 = Happy
  o 4 = Surprise

[Figure: sample images for the anger, disgust, sadness, happiness, and surprise classes.]

Data is sourced from Kaggle: https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data
PART 2. FACIAL EXPRESSION
(EMOTION) DETECTION
• This dataset consists of 5 different classes.
• Images are 48 x 48 pixels.

[Figure: a 48 x 48 input image is passed to the classifier (deep learning model), which outputs one of the target classes: Angry, Disgust, Sad, Happiness, Surprise.]
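A small sketch of how the numeric labels above might be mapped to emotion names and one-hot encoded for a 5-class classifier; the example label array is purely illustrative.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Class indices as listed on the slide.
label_to_emotion = {0: "Angry", 1: "Disgust", 2: "Sad", 3: "Happy", 4: "Surprise"}

# Hypothetical integer labels for a small batch of images.
y = np.array([3, 0, 4, 2])

# One-hot encode the 5 emotion classes for a softmax classifier.
y_one_hot = to_categorical(y, num_classes=5)
print(y_one_hot.shape)                      # (4, 5)
print([label_to_emotion[i] for i in y])     # ['Happy', 'Angry', 'Surprise', 'Sad']
```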
NEURON MATHEMATICAL MODEL
• The brain has over 100 billion neurons communicating through electrical & chemical signals. Neurons communicate with each other and help us see, think, and generate ideas.
• The human brain learns by creating connections among these neurons. ANNs are information processing models inspired by the human brain.
• A neuron collects signals from input channels named dendrites, processes the information in its nucleus, and then generates an output along a long thin branch called the axon.

[Figure: a human neuron alongside an artificial neuron.]

• Photo Credit: https://en.wikipedia.org/wiki/File:Neuron-no_labels2.png


• Photo Credit: https://www.flickr.com/photos/alansimpsonme/34752491090
NEURON MATHEMATICAL MODEL:
EXAMPLE
• The bias allows the activation function curve to be shifted up or down.
• Number of adjustable parameters = 4 (3 weights and 1 bias).
• The weighted sum of the inputs is passed through an activation function “F”.
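As a worked sketch of this example, assuming illustrative input values and choosing a sigmoid for the activation function “F” (the slides do not fix a specific F here):

```python
import numpy as np

# Illustrative single neuron with 3 weights and 1 bias (4 adjustable parameters).
x = np.array([0.5, -1.0, 2.0])   # hypothetical inputs
w = np.array([0.2, 0.8, -0.5])   # weights
b = 0.1                          # bias shifts the activation curve up or down

def F(z):
    # Sigmoid chosen here as the activation function "F" (an assumption).
    return 1.0 / (1.0 + np.exp(-z))

y = F(np.dot(w, x) + b)          # y = F(x1*w1 + x2*w2 + x3*w3 + b)
print(y)
```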
MULTI-LAYER PERCEPTRON NETWORK
• Let’s connect multiple of these neurons in a multi-layer fashion.
• The more hidden layers, the “deeper” the network gets.

Node $(n+1, i)$ representation, with a non-linear sigmoid activation function $\sigma(x) = \dfrac{1}{1 + e^{-x}}$:

$$a_i^{(n+1)} = \sigma\!\left( \sum_{j=1}^{N} W_{i,j}\, p_j + b_i \right)$$

where the weight matrix connecting the $N$ inputs $p_1, \dots, p_N$ to the hidden layer is

$$W = \begin{bmatrix} W_{1,1} & W_{1,2} & \cdots & W_{1,N} \\ \vdots & \ddots & & \vdots \\ W_{m,1} & W_{m,2} & \cdots & W_{m,N} \end{bmatrix}$$

m: number of neurons in the hidden layer
N: number of inputs
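A minimal numpy sketch of this forward pass through one hidden layer; the layer sizes and random values below are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(z):
    # Non-linear sigmoid activation applied element-wise.
    return 1.0 / (1.0 + np.exp(-z))

N, m = 3, 4                     # N inputs, m neurons in the hidden layer (arbitrary)
p = np.random.rand(N)           # input vector p1..pN
W = np.random.randn(m, N)       # weight matrix, shape (m, N)
b = np.random.randn(m)          # one bias per hidden neuron

a = sigmoid(W @ p + b)          # hidden-layer activations, shape (m,)
print(a.shape)                  # (4,)
```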
ANNs TRAINING & TESTING PROCESSES
• Training: the training inputs X are fed through the network to produce a predicted output ŷ, which is compared against the desired (true) output y. The performance measure is the mean squared error between y and ŷ, and the resulting error is used to update the network weights.
• Testing: freeze the network weights and test the network with new data that the model has never seen before, comparing its predicted output ŷ against the true output.
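In Keras, this train-then-freeze-and-evaluate flow might look like the following sketch. The tiny model, random data, and shapes are placeholders to show the flow only, not the project's actual architecture.

```python
import numpy as np
from tensorflow.keras import layers, models

# Placeholder regression model and random data, only to illustrate the flow.
model = models.Sequential([
    layers.Input(shape=(10,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")  # mean squared error as the performance measure

X_train, y_train = np.random.rand(100, 10), np.random.rand(100, 1)
X_test,  y_test  = np.random.rand(20, 10),  np.random.rand(20, 1)

# Training: predictions are compared to the desired (true) outputs and weights are updated.
model.fit(X_train, y_train, epochs=5, verbose=0)

# Testing: the weights are frozen (no further fit); evaluate on data the model has never seen.
print(model.evaluate(X_test, y_test, verbose=0))
```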
DIVIDE DATA INTO TRAINING AND TESTING
• The data set is generally divided into 80% for training and 20% for testing.
• Sometimes, we might include a cross-validation dataset as well, and then divide the data into 60%, 20%, 20% segments for training, validation, and testing, respectively (the numbers may vary); see the sketch after this list.
1. Training set: used for gradient calculation and weight updates.
2. Validation set:
   o Used for cross-validation to assess training quality as training proceeds.
   o Cross-validation is implemented to overcome over-fitting, which occurs when the algorithm focuses on training-set details at the cost of losing generalization ability.
3. Testing set: used for testing the trained network.

[Figure: pie charts of an 80% training / 20% testing split and a 60% training / 20% validation / 20% testing split.]
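A minimal sketch of such a 60/20/20 split using scikit-learn; the placeholder arrays and random seed are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.random.rand(1000, 10), np.random.rand(1000)   # placeholder data

# First hold out 20% for testing, then carve 20% of the total (25% of the
# remaining 80%) out as a validation set, leaving 60% for training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```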
GRADIENT DESCENT
• Gradient descent is an optimization algorithm used to obtain the optimized network weight and bias values.
• It works by iteratively trying to minimize the cost function.
• It calculates the gradient of the cost function and moves in the negative gradient direction until the local/global minimum is reached.
• If we move in the positive gradient direction instead, a local/global maximum is reached.
• The size of the steps taken is called the learning rate.
• If the learning rate increases, the area covered in the search space increases, so we might reach the global minimum faster; however, we can also overshoot the target.
• For small learning rates, training will take much longer to reach the optimized weight values.

Photo Credit: https://commons.wikimedia.org/wiki/File:Gradient_descent_method.png


Photo Credit: https://commons.wikimedia.org/wiki/File:Gradient_descent.png
GRADIENT DESCENT
• Let’s assume that we want to obtain the optimal values for the parameters ‘m’ and ‘b’ of a simple model $\hat{y} = m x + b$, given training data (inputs $x$ and their true outputs $y$).
• The machine learning model produces a predicted output $\hat{y}$; the error for each sample is $\hat{y} - y$, and the goal is to find the best parameters by repeatedly updating the weights (parameters).
• We need to first formulate a loss function as follows:

$$\mathrm{Loss}(m, b) = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2$$
GRADIENT DESCENT

$$\mathrm{Loss}(m, b) = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2 \quad \text{(sum of squared residuals)}$$

GRADIENT DESCENT WORKS AS FOLLOWS:
1. Calculate the gradient (derivative) of the loss function, $\dfrac{\partial \mathrm{Loss}}{\partial w}$.
2. Pick random values for the weights (m, b) and substitute them in.
3. Calculate the step size (how much are we going to update the parameters?):
   $$\text{step size} = \text{learning rate} \times \text{gradient} = \eta \cdot \frac{\partial \mathrm{Loss}}{\partial w}$$
4. Update the parameters and repeat:
   $$w_{\text{new}} = w_{\text{old}} - \text{step size} = w_{\text{old}} - \eta \cdot \frac{\partial \mathrm{Loss}}{\partial w}$$

[Figure: the loss plotted against the parameters (m, b), with the optimal point at the global minimum.]
*Note: in reality, this graph is 3D and has three axes, one each for m, b, and the sum of squared residuals.
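A minimal numpy sketch of these four steps for fitting m and b; the synthetic data, learning rate, and iteration count are illustrative assumptions.

```python
import numpy as np

# Synthetic training data generated from a known line plus noise (illustrative only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 3.0 * x + 2.0 + rng.normal(0, 1, 200)

m, b = rng.random(), rng.random()   # step 2: random initial parameter values
learning_rate = 0.01

for _ in range(1000):
    y_hat = m * x + b
    # Step 1: gradients of Loss(m, b) = (1/N) * sum((y_hat - y)^2)
    grad_m = (2.0 / len(x)) * np.sum((y_hat - y) * x)
    grad_b = (2.0 / len(x)) * np.sum(y_hat - y)
    # Steps 3-4: step size = learning rate * gradient; move in the negative direction.
    m -= learning_rate * grad_m
    b -= learning_rate * grad_b

print(m, b)  # should approach 3.0 and 2.0
```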
CONVOLUTIONAL NEURAL NETWORKS: ENTIRE NETWORK OVERVIEW
[Figure: the input image passes through a convolutional layer (kernels/filters acting as feature detectors), pooling layers (downsampling), and a flattening step, followed by fully connected layers that output either the facial key points prediction or the emotion prediction (angry, happy, sad, ...).]

Photo Credit: https://commons.wikimedia.org/wiki/File:Artificial_neural_network.svg
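A minimal Keras sketch of such a convolution → pooling → flatten → dense pipeline, here with a 5-class softmax head for the emotion classifier. The filter counts and layer sizes are placeholder choices, not the slides' final architecture.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),               # 48 x 48 grayscale input
    layers.Conv2D(32, (3, 3), activation="relu"),  # convolution: kernels / feature detectors
    layers.MaxPooling2D((2, 2)),                   # pooling layer (downsampling)
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                              # flattening
    layers.Dense(128, activation="relu"),
    layers.Dense(5, activation="softmax"),         # emotion prediction: 5 classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```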


RESNET (RESIDUAL NETWORK)
• As CNNs grow deeper, vanishing gradients tend to occur, which negatively impacts network performance.
• The vanishing gradient problem occurs when the gradient is back-propagated to earlier layers, which results in a very small gradient.
• A Residual Neural Network includes a “skip connection” feature, which enables training of 152-layer networks without vanishing gradient issues.
• ResNet works by adding “identity mappings” on top of the CNN.
• ImageNet contains 11 million images and 11,000 categories.
• ImageNet is used to train the ResNet deep network.
RES-BLOCK AND FINAL MODEL

RES-BLOCK (input → output):
• Convolution block
• Identity block
• Identity block

FINAL MODEL (input → output):
• Zero padding
• Conv2D
• BatchNorm, ReLU
• MaxPool2D
• Res-block
• Res-block
• AveragePooling2D
• Flatten()
• Dense layer, ReLU, Dropout
• Dense layer, ReLU, Dropout
• Dense layer
• Output: key-points or emotion

CONVOLUTION BLOCK (input → output):
• Main path: Conv2D → MaxPool2D → BatchNorm, ReLU → Conv2D (3x3 kernel) → BatchNorm, ReLU → Conv2D → BatchNorm.
• Short path: Conv2D → MaxPool2D → BatchNorm applied to the block input.
• The two paths are added (+) and passed through a ReLU.

IDENTITY BLOCK (input → output):
• Main path: Conv2D → BatchNorm, ReLU → Conv2D (3x3 kernel) → BatchNorm, ReLU → Conv2D → BatchNorm.
• Short path: the block input is passed through unchanged (skip connection).
• The two paths are added (+) and passed through a ReLU.

A hedged Keras sketch of these two blocks follows below.
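A sketch of how such blocks might be written with the Keras functional API. The filter counts, the 1x1 kernel choices, and the exact layer ordering are assumptions read off the diagram above, not the course's published code.

```python
from tensorflow.keras import layers

def convolution_block(x, filters):
    """Res-block component with a projected (Conv2D/MaxPool2D) shortcut."""
    shortcut = x
    # Main path
    x = layers.Conv2D(filters, (1, 1))(x)
    x = layers.MaxPool2D((2, 2))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(filters, (3, 3), padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(filters, (1, 1))(x)
    x = layers.BatchNormalization()(x)
    # Short path: project the input so its shape matches the main path
    shortcut = layers.Conv2D(filters, (1, 1))(shortcut)
    shortcut = layers.MaxPool2D((2, 2))(shortcut)
    shortcut = layers.BatchNormalization()(shortcut)
    # Add the two paths, then ReLU
    x = layers.Add()([x, shortcut])
    return layers.Activation("relu")(x)

def identity_block(x, filters):
    """Res-block component whose shortcut is the unmodified input (skip connection).
    `filters` must match the channel count of `x` so the addition works."""
    shortcut = x
    x = layers.Conv2D(filters, (1, 1))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(filters, (3, 3), padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(filters, (1, 1))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, shortcut])
    return layers.Activation("relu")(x)

def res_block(x, filters):
    # Res-block = convolution block followed by two identity blocks, as in the diagram.
    x = convolution_block(x, filters)
    x = identity_block(x, filters)
    x = identity_block(x, filters)
    return x
```

Stacking these per the FINAL MODEL list (zero padding, an initial Conv2D/BatchNorm/ReLU/MaxPool2D stem, two res-blocks, average pooling, flattening, and the dense head) would give a model that outputs either the key-points or the emotion class.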
CONFUSION MATRIX
• Rows: predictions (+ / -). Columns: true class (+ / -).
• Predicted + / True + : TRUE POSITIVE
• Predicted + / True - : FALSE POSITIVE (Type I error)
• Predicted - / True + : FALSE NEGATIVE (Type II error)
• Predicted - / True - : TRUE NEGATIVE
DEFINITIONS AND KPIS
• A confusion matrix is used to describe the performance of a classification model:
  o True positives (TP): cases when the classifier predicted TRUE (they have the disease), and the correct class was TRUE (the patient has the disease).
  o True negatives (TN): cases when the model predicted FALSE (no disease), and the correct class was FALSE (the patient does not have the disease).
  o False positives (FP) (Type I error): the classifier predicted TRUE, but the correct class was FALSE (the patient did not have the disease).
  o False negatives (FN) (Type II error): the classifier predicted FALSE (the patient does not have the disease), but they actually do have the disease.
  o Classification Accuracy = (TP + TN) / (TP + TN + FP + FN)
  o Misclassification Rate (Error Rate) = (FP + FN) / (TP + TN + FP + FN)
  o Precision = TP / Total TRUE Predictions = TP / (TP + FP) (when the model predicted the TRUE class, how often was it right?)
  o Recall = TP / Actual TRUE = TP / (TP + FN) (when the class was actually TRUE, how often did the classifier get it right?)

These KPIs are computed in the short sketch below.
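A small sketch computing these KPIs directly from the counts, applied to the cancer-screening example on the next slide (TP = 1, FP = 1, FN = 8, TN = 90); the function name is illustrative.

```python
def classification_kpis(tp, tn, fp, fn):
    # KPIs exactly as defined above.
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# Numbers from the precision vs. recall example: 100 patients, 9 of whom have cancer.
print(classification_kpis(tp=1, tn=90, fp=1, fn=8))
# {'accuracy': 0.91, 'error_rate': 0.09, 'precision': 0.5, 'recall': 0.111...}
```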
PRECISION Vs. RECALL EXAMPLE
FACTS: 100 patients total; 91 patients are healthy; 9 patients have cancer.

Confusion matrix (predictions vs. true class): TP = 1, FP = 1, FN = 8, TN = 90.

• Accuracy alone is generally misleading and is not enough to assess the performance of a classifier.
• Recall is an important KPI in situations where the dataset is highly imbalanced, e.g. when you have a small number of cancer patients compared to healthy ones.

o Classification Accuracy = (TP + TN) / (TP + TN + FP + FN) = 91%
o Precision = TP / Total TRUE Predictions = TP / (TP + FP) = 1/2 = 50%
o Recall = TP / Actual TRUE = TP / (TP + FN) = 1/9 = 11%
MODEL DEPLOYMENT USING TENSORFLOW SERVING:
• Let’s assume that we have already trained our model and it is generating good results on the testing data.
• Now, we want to integrate our trained TensorFlow model into a web app and deploy the model in a production-level environment.
• This can be achieved using TensorFlow Serving. TensorFlow Serving is a high-performance serving system for machine learning models, designed for production environments.
• With the help of TensorFlow Serving, we can easily deploy new algorithms to make predictions.
• In order to serve the trained model using TensorFlow Serving, we need to save the model in a format that is suitable for serving with TensorFlow Serving.
• The model will have a version number and will be saved in a structured directory.
• After the model is saved, we can use TensorFlow Serving to start making inference requests using a specific version of our trained model, called a "servable".
RUNNING TENSORFLOW SERVING:
• There are some important parameters:
  o rest_api_port: the port that you'll use for REST requests.
  o model_name: you'll use this in the URL of REST requests; you can choose any name.
  o model_base_path: this is the path to the directory where you've saved your model.
• For more information regarding REST, check this out: https://www.codecademy.com/articles/what-is-rest
• REST is a style of using HTTP in which the HTTP commands carry semantic meaning.

A sketch of launching the server with these parameters follows below.
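A hedged sketch of launching the server with the three parameters above, assuming the TensorFlow Serving binary is installed and available on PATH as tensorflow_model_server; the port, model name, and path are illustrative values.

```python
import subprocess

# Launch TensorFlow Serving with the parameters described above (illustrative values).
subprocess.Popen([
    "tensorflow_model_server",
    "--rest_api_port=8501",                       # port for REST requests
    "--model_name=emotion_model",                 # name used in the request URL
    "--model_base_path=/path/to/emotion_model",   # directory containing versioned SavedModels
])
```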
MAKING REQUESTS IN TENSORFLOW SERVING:
• In order to make a prediction using TensorFlow Serving, we need to pass the inference request (image data) as a JSON object.
• Then, we use the Python requests library to make a POST request to the deployed model, passing in the JSON object containing the inference request (image data).
• Finally, we get the prediction from the POST request made to the deployed model and then use the argmax function to find the predicted class.
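A sketch of such a request against the TensorFlow Serving REST endpoint; the host, port, model name, and image shape are illustrative assumptions consistent with the earlier examples.

```python
import json
import numpy as np
import requests

# Hypothetical batch of one 48 x 48 grayscale image, normalized to [0, 1].
image = np.random.rand(1, 48, 48, 1)

# Package the inference request (image data) as a JSON object.
payload = json.dumps({"signature_name": "serving_default",
                      "instances": image.tolist()})

# POST request to the deployed model served by TensorFlow Serving.
url = "http://localhost:8501/v1/models/emotion_model:predict"
response = requests.post(url, data=payload,
                         headers={"content-type": "application/json"})

# The response contains one prediction (class scores) per instance;
# argmax gives the predicted emotion class.
predictions = np.array(response.json()["predictions"])
predicted_class = np.argmax(predictions, axis=1)
print(predicted_class)
```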
