CM20315 05 Loss
Regression
• Model $\text{f}[\mathbf{x}, \boldsymbol{\phi}]$ maps inputs $\mathbf{x}$ to outputs $\mathbf{y}$
Loss function
• Training dataset of $I$ pairs of input/output examples: $\{\mathbf{x}_i, \mathbf{y}_i\}_{i=1}^{I}$
• Loss $L[\{\mathbf{x}_i, \mathbf{y}_i\}, \text{f}[\bullet, \boldsymbol{\phi}]]$, or for short: $L[\boldsymbol{\phi}]$
• Two rules: it returns a scalar, and that scalar is smaller when the model maps inputs to outputs better
Training
• Loss function: $L[\boldsymbol{\phi}]$ returns a scalar that is smaller when the model maps inputs to outputs better
• Training means finding the parameters that minimize the loss: $\hat{\boldsymbol{\phi}} = \underset{\boldsymbol{\phi}}{\mathrm{argmin}}\, L[\boldsymbol{\phi}]$
Loss function:
• Maximum likelihood: $\hat{\boldsymbol{\phi}} = \underset{\boldsymbol{\phi}}{\mathrm{argmax}} \prod_{i=1}^{I} Pr(\mathbf{y}_i | \mathbf{x}_i)$
• The log is monotonic, so we can equivalently maximize the log likelihood $\sum_{i=1}^{I} \log Pr(\mathbf{y}_i | \mathbf{x}_i)$
• Now it’s a sum of terms, so it doesn’t matter so much if the individual terms are small
Minimizing negative log likelihood
• By convention, we minimize things (i.e., a loss): $L[\boldsymbol{\phi}] = -\sum_{i=1}^{I} \log Pr(\mathbf{y}_i | \mathbf{x}_i)$, and $\hat{\boldsymbol{\phi}} = \underset{\boldsymbol{\phi}}{\mathrm{argmin}}\, L[\boldsymbol{\phi}]$
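A minimal NumPy sketch (illustrative, not from the slides; the likelihood values are synthetic) of why we sum log probabilities rather than multiplying the raw probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example likelihoods Pr(y_i | x_i): 1000 small probabilities.
likelihoods = rng.uniform(0.01, 0.2, size=1000)

# The direct product underflows to 0.0 in float64 -- the maximum is lost.
product = np.prod(likelihoods)               # 0.0 (underflow)

# The sum of logs is a perfectly ordinary negative number.
log_likelihood = np.sum(np.log(likelihoods))  # roughly -2600

# By convention we minimize, so the loss is the negative log likelihood.
loss = -log_likelihood
print(product, log_likelihood, loss)
```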
Inference
• But now we predict a probability distribution $Pr(\mathbf{y} | \mathbf{x})$
• We need an actual prediction (point estimate)
• Find the peak of the probability distribution: $\hat{\mathbf{y}} = \underset{\mathbf{y}}{\mathrm{argmax}}\, Pr(\mathbf{y} | \text{f}[\mathbf{x}, \boldsymbol{\phi}])$ (i.e., the mean for a normal distribution)
Loss functions
• Maximum likelihood
• Recipe for loss functions
• Example 1: univariate regression
• Example 2: binary classification
• Example 3: multiclass classification
• Other types of data
• Multiple outputs
• Cross entropy
Recipe for loss functions
1. Choose a probability distribution $Pr(\mathbf{y} | \boldsymbol{\theta})$ defined over the domain of the outputs $\mathbf{y}$, with distribution parameters $\boldsymbol{\theta}$
2. Set the machine learning model to predict those parameters: $\boldsymbol{\theta} = \text{f}[\mathbf{x}, \boldsymbol{\phi}]$
3. Train by minimizing the negative log likelihood loss: $L[\boldsymbol{\phi}] = -\sum_{i=1}^{I} \log Pr(\mathbf{y}_i | \boldsymbol{\theta}_i)$
4. For inference, return the full distribution $Pr(\mathbf{y} | \text{f}[\mathbf{x}, \boldsymbol{\phi}])$ or the point estimate at its maximum
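A schematic sketch of the recipe in code (an outline under stated assumptions, not from the slides; `model` and `log_prob` are hypothetical callables standing in for steps 2 and 3):

```python
def nll_loss(xs, ys, model, params, log_prob):
    """Negative log likelihood loss built by the recipe:
    step 2 -- the model predicts distribution parameters theta,
    step 3 -- the loss accumulates -log Pr(y | theta) over the data."""
    total = 0.0
    for x, y in zip(xs, ys):
        theta = model(x, params)      # distribution parameters for this input
        total -= log_prob(y, theta)   # log likelihood of the observed output
    return total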
Example 1: univariate regression
• Domain: $y \in \mathbb{R}$
• Normal distribution
• Two parameters: mean $\mu$ and variance $\sigma^2$
• Model predicts the mean, $\mu = \text{f}[x, \boldsymbol{\phi}]$; with fixed $\sigma^2$, the negative log likelihood reduces to the least squares loss $L[\boldsymbol{\phi}] = \sum_{i=1}^{I} (y_i - \text{f}[x_i, \boldsymbol{\phi}])^2$
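A minimal sketch, assuming the model predicts the mean of a normal with fixed variance $\sigma^2 = 1$ (the target and prediction arrays are hypothetical), showing that the negative log likelihood and least squares differ only by scale and an additive constant:

```python
import numpy as np

def normal_nll(y, mu, sigma=1.0):
    # -log Pr(y | mu, sigma^2) for a univariate normal distribution
    return 0.5 * np.log(2 * np.pi * sigma**2) + (y - mu)**2 / (2 * sigma**2)

# Hypothetical targets y_i and model predictions mu_i = f[x_i, phi].
y = np.array([1.2, 0.7, 2.3])
mu = np.array([1.0, 1.0, 2.0])

nll = normal_nll(y, mu).sum()
least_squares = np.sum((y - mu)**2)

# With sigma fixed, NLL = 0.5 * least_squares + constant, so both
# criteria are minimized by the same parameters phi.
print(nll, 0.5 * least_squares + 1.5 * np.log(2 * np.pi))
```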
Example 2: binary classification
• Domain: $y \in \{0, 1\}$
• Bernoulli distribution
• One parameter $\lambda \in [0, 1]$
Problem:
• Output of the neural network can be anything
• Parameter $\lambda$ must lie in [0, 1]
Solution:
• Pass through the logistic sigmoid function, which maps “anything” to [0, 1]:
• $\text{sig}[z] = \dfrac{1}{1 + \exp[-z]}$
Example 2: binary classification
• Likelihood: $Pr(y \,|\, \lambda) = \lambda^{y} (1 - \lambda)^{1 - y}$, where $\lambda = \text{sig}[\text{f}[\mathbf{x}, \boldsymbol{\phi}]]$
• Negative log likelihood (binary cross-entropy loss): $L[\boldsymbol{\phi}] = -\sum_{i=1}^{I} \big( y_i \log \lambda_i + (1 - y_i) \log(1 - \lambda_i) \big)$
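A minimal sketch of the binary case (hypothetical network outputs and labels): the logistic sigmoid produces the Bernoulli parameter, and the loss is the negative log likelihood:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number to (0, 1) -- the Bernoulli parameter lambda.
    return 1.0 / (1.0 + np.exp(-z))

def bernoulli_nll(y, lam):
    # -log Pr(y | lambda) for y in {0, 1}: binary cross-entropy.
    return -(y * np.log(lam) + (1 - y) * np.log(1 - lam))

# Hypothetical raw network outputs (can be any real number) and labels.
z = np.array([2.0, -1.0, 0.5])
y = np.array([1, 0, 1])

lam = sigmoid(z)                  # squash into [0, 1]
loss = bernoulli_nll(y, lam).sum()
print(lam, loss)
```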
Example 3: multiclass classification
• Domain: $y \in \{1, 2, \ldots, K\}$
• Categorical distribution
• $K$ parameters $\lambda_k \in [0, 1]$
• Sum of all parameters = 1
Problem:
• Output of the neural network can be anything
• Parameters must lie in [0, 1] and sum to one
Solution:
• Pass through the softmax function, which maps “anything” to [0, 1] and sums to one:
• $\text{softmax}_k[\mathbf{z}] = \dfrac{\exp[z_k]}{\sum_{k'=1}^{K} \exp[z_{k'}]}$
Example 3: multiclass classification
• Likelihood: $Pr(y = k \,|\, \boldsymbol{\lambda}) = \lambda_k$, where $\boldsymbol{\lambda} = \text{softmax}[\text{f}[\mathbf{x}, \boldsymbol{\phi}]]$
• Negative log likelihood (multiclass cross-entropy loss): $L[\boldsymbol{\phi}] = -\sum_{i=1}^{I} \log \lambda_{y_i}$
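A minimal sketch of the multiclass case (hypothetical network outputs and class index): softmax produces valid categorical parameters, and the loss is the negative log likelihood of the true class:

```python
import numpy as np

def softmax(z):
    # Maps any real vector to K values in [0, 1] that sum to one.
    z = z - np.max(z)             # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical raw network outputs for K = 3 classes, and the true class.
z = np.array([2.0, -1.0, 0.5])
y = 0                             # correct class index

lam = softmax(z)                  # categorical parameters lambda_1..lambda_K
loss = -np.log(lam[y])            # negative log likelihood of the true class
print(lam, lam.sum(), loss)
```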
Cross entropy
• Cross entropy measures the degree of mismatch between two distributions: $H[p, q] = -\sum_y p(y) \log q(y)$
• The minimum negative log likelihood and the minimum cross entropy occur at the same parameters
Cross entropy in machine learning
• In machine learning: minimizing the negative log likelihood is equivalent to minimizing the cross entropy between the empirical distribution of the training data and the model’s predicted distribution $Pr(\mathbf{y} | \text{f}[\mathbf{x}, \boldsymbol{\phi}])$
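A small numerical check (hypothetical predicted probabilities and labels) that the mean negative log likelihood equals the cross entropy between the empirical one-hot distributions and the model’s predictions:

```python
import numpy as np

# Predicted class probabilities q (e.g., from a softmax) for three
# examples, and the observed labels.
q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])
labels = np.array([0, 1, 2])

# Negative log likelihood of the observed labels under the model.
nll = -np.log(q[np.arange(3), labels]).mean()

# Cross entropy H[p, q] = -sum_k p_k log q_k, with p the empirical
# one-hot distribution of each label -- the same quantity.
p = np.eye(3)[labels]
cross_entropy = -(p * np.log(q)).sum(axis=1).mean()

print(nll, cross_entropy)         # identical values
```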
Next up
• We have models with parameters!
• We have loss functions!
• Now let’s find the parameters that give the smallest loss
• Training, learning, or fitting the model
Feedback