
ARTIFICIAL INTELLIGENCE
DR. MANAL TANTAWI
ASSOCIATE PROFESSOR IN SCIENTIFIC COMPUTING DEPARTMENT
FACULTY OF COMPUTER & INFORMATION SCIENCES
AIN SHAMS UNIVERSITY

PART 10

➢ INSTANCE BASED LEARNING (LAZY LEARNING)
➢ INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS (ANN)
INSTANCE BASED LEARNING (LAZY LEARNING)
INSTANCE BASED LEARNING (LAZY LEARNING)
➢ In machine learning, lazy learning is a learning method in which generalization of the training data is, in theory, delayed until a query is made to the system, as opposed to eager learning, where the system tries to generalize the training data before receiving queries.

➢ Examples of eager learning: neural networks, SVM and decision trees.

➢ Example of lazy learning: K-Nearest Neighbors (K-NN), which simply stores all the training points and then uses that data when a query is made to it.
K-NEAREST NEIGHBORS (K-NN) CLASSIFIER

➢ K-Nearest Neighbors is one of the simplest supervised machine learning algorithms.

➢ The K-NN algorithm measures the similarity between a new data example and the available training data, and puts the new example into the class it most resembles.

➢ The K-NN algorithm stores all the training data and classifies a new data example based on similarity. This means that when a new data example appears, it can easily be assigned to a well-suited class using the K-NN algorithm.
K-NEAREST NEIGHBORS (K-NN) CLASSIFIER

➢ The K-NN algorithm can be used for regression as well as for classification, but it is mostly used for classification problems.

➢ K-NN is a non-parametric algorithm, which means it makes no assumption about the underlying data.
STEPS FOR K-NN
After defining the number K of neighbors:

• Step-1: Calculate the similarity between the new example and all examples in the training dataset.

• Step-2: Take the K nearest neighbors as per the calculated similarity.

• Step-3: Among these K neighbors, count the number of data examples in each class.

• Step-4: Assign the new example to the class with the maximum number of neighbors.
SIMILARITY MEASURE

• EUCLIDEAN DISTANCE
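For two examples a = (a1, a2, …, an) and b = (b1, b2, …, bn), the Euclidean distance is:

d(a, b) = sqrt[(a1 − b1)² + (a2 − b2)² + … + (an − bn)²]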
EXAMPLE FOR 3-NN
Training data

X1  X2  Y
1   2   A
1   1   A
3   2   B
3   0   A
4   3   B

New example (3, 4): what will be its label?

EXAMPLE FOR 3-NN (USING EUCLIDEAN DISTANCE)
Training data

X1  X2  Y
1   2   A
1   1   A
3   2   B
3   0   A
4   3   B

New example (3, 4): what will be its label?
Find the Euclidean distance between (3, 4) and all the other data points:
With (1, 2) -> sqrt[(3−1)² + (4−2)²] = 2.83
With (1, 1) -> sqrt[(3−1)² + (4−1)²] = 3.61
With (3, 2) -> sqrt[(3−3)² + (4−2)²] = 2
With (3, 0) -> sqrt[(3−3)² + (4−0)²] = 4
With (4, 3) -> sqrt[(3−4)² + (4−3)²] = 1.41

The three nearest neighbors are (4, 3), (3, 2) and (1, 2).
EXAMPLE FOR 3-NN (USING EUCLIDEAN DISTANCE)
New example (3, 4): what will be its label?

Find the class with the maximum number of examples among the 3 nearest neighbors:
(4, 3) -> B
(3, 2) -> B
(1, 2) -> A

Thus, with 2 votes out of 3, the winning class is B, and the new example (3, 4) is labeled B.
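To make the procedure concrete, here is a minimal Python sketch of 3-NN that reproduces this worked example (plain Python, no libraries assumed; the function names are illustrative):

    import math
    from collections import Counter

    # Training data from the example: (X1, X2) -> label Y
    train = [((1, 2), 'A'), ((1, 1), 'A'), ((3, 2), 'B'),
             ((3, 0), 'A'), ((4, 3), 'B')]

    def euclidean(a, b):
        # square root of the sum of squared coordinate differences
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def knn_classify(query, train, k=3):
        # Steps 1-2: sort training examples by distance, take the k nearest
        neighbors = sorted(train, key=lambda ex: euclidean(query, ex[0]))[:k]
        # Steps 3-4: majority vote among the k nearest labels
        votes = Counter(label for _, label in neighbors)
        return votes.most_common(1)[0][0]

    print(knn_classify((3, 4), train))   # prints 'B', matching the slide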
FINDING THE BEST K

➢ There is no analytical way to determine the best value for "K", so we need to try some values and pick the best among them. The most preferred values for K are 3 or 5.

➢ A very low value for K, such as K=1 or K=2, can be noisy and expose the model to the effects of outliers.

➢ Large values for K are also not preferable, since the neighborhood starts to include many points from other classes. A common way to choose K is sketched below.
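In practice, K is often chosen by cross-validation. A minimal sketch, assuming scikit-learn is available (the iris dataset here is only an illustrative stand-in for your own data):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)

    # Evaluate odd values of K with 5-fold cross-validation and keep the best
    scores = {}
    for k in [1, 3, 5, 7, 9, 11]:
        model = KNeighborsClassifier(n_neighbors=k)
        scores[k] = cross_val_score(model, X, y, cv=5).mean()

    best_k = max(scores, key=scores.get)
    print(best_k, scores[best_k])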


K-NN ADVANTAGES & DISADVANTAGES

➢ ADVANTAGES
1) SIMPLE
2) NON-PARAMETRIC
➢ DISADVANTAGES
1) HIGH COMPUTATION OVERHEAD AND TIME CONSUMPTION, ESPECIALLY FOR LARGE DATASETS
2) HIGH MEMORY USAGE (SAVING ALL TRAINING DATA)
ARTIFICIAL NEURAL NETWORKS (ANN)

WHY ANN?

• ANN IS A POWERFUL SOLUTION FOR COMPLEX NONLINEAR PROBLEMS
ARTIFICIAL NEURAL NETWORKS
➢ Try to mimic the brain.

➢ They are based on the interaction of many connected processing elements.

➢ They can learn from experience to improve their performance.

➢ They can deal with incomplete information.

➢ They are the state-of-the-art technique for many applications.
CENTRAL NERVOUS SYSTEM
(BRAIN AND SPINAL CORD)
➢ In short, our nervous systems detect what is going on around us and inside of us; they decide how we should act, alter the state of internal organs (heart rate changes, for instance), and allow us to think about and remember what is going on. To do this, the nervous system relies on a sophisticated network of neurons.

➢ There are billions of neurons in the brain; each neuron is connected to some other neurons (e.g., 1000 neurons), creating an incredibly complex network of communication. Neurons are considered the basic units of the nervous system.

➢ A neuron is the basic working unit of the brain, a specialized cell designed to transmit information to other nerve cells, muscle, or gland cells.
BRAIN NEURON
➢ Neurons can only be seen using a microscope and can be split into three parts:

Dendrites: these thin filaments carry information from other neurons to the soma. They are the "input" part of the cell.

Soma (cell body): this portion of the neuron receives information. It contains the cell's nucleus.

Axon: this long projection carries information from the soma and sends it off to other cells. This is the "output" part of the cell. It normally ends with synapses connecting to the dendrites of other neurons.
TYPES OF NEURONS

Afferent neurons (sensory neurons): these take messages from the rest of the body and deliver them to the central nervous system.

Interneurons: these relay messages between neurons in the central nervous system.

Efferent neurons (e.g., motor neurons): these take messages from the central nervous system (brain and spinal cord) and deliver them to cells in other parts of the body.
HOW DO NEURONS CARRY A MESSAGE?

➢ If a neuron receives many inputs from other neurons, these signals add up until they exceed a particular threshold.

➢ Once this threshold is exceeded, the neuron is triggered to send an impulse along its axon; this is called an action potential.

➢ An action potential is created by the movement of electrically charged atoms (ions) across the axon's membrane.
ARTIFICIAL NEURAL NETWORKS (ANN)

ANN architectures
➢ Single Layer Perceptron (one output layer, no hidden layers)

➢ Multi Layer Perceptron (one or more hidden layers)

➢ Recurrent Networks (feedback) (out of scope)
ARTIFICIAL NEURAL NETWORKS (ANN)
ANN architectures

[Figure: diagrams of the Single Layer Perceptron (SLP) and the Multi-Layer Perceptron (MLP)]
LINEARLY & NON-LINEARLY SEPARABLE PROBLEMS

[Figure: a linearly separable problem can be handled by a single-layer perceptron; a non-linearly separable problem requires a multi-layer perceptron]
ARTIFICIAL NEURON

[Figure: an artificial neuron. The inputs (dendrites) plus a bias neuron X0 are multiplied by the weights Ꝋ0, Ꝋ1, Ꝋ2, Ꝋ3 and summed in the computation unit (cell body); the output (axon) applies the sigmoid function:]

h(x) = 1 / (1 + e^(−θᵀx))
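A minimal Python sketch of this neuron's forward pass (the function names and weight values are illustrative):

    import math

    def sigmoid(z):
        # squashes any real number into the range (0, 1)
        return 1.0 / (1.0 + math.exp(-z))

    def neuron(x, theta):
        # x: inputs [x1, x2, x3]; theta: [theta0 (bias), theta1, theta2, theta3]
        # prepend the bias input x0 = 1, then compute sigmoid(theta^T x)
        x = [1.0] + list(x)
        z = sum(t * xi for t, xi in zip(theta, x))
        return sigmoid(z)

    print(neuron([0.5, 0.2, 0.1], [-1.0, 2.0, 1.0, 0.5]))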
ACTIVATION FUNCTIONS

• ACTIVATION FUNCTIONS (ALSO NAMED TRANSFER FUNCTIONS OR SQUASHING FUNCTIONS) PLAY A CRUCIAL ROLE IN NEURAL NETWORKS, PERFORMING A VITAL FUNCTION TO SOLVE COMPLEX PROBLEMS.
ACTIVATION FUNCTIONS

➢ Activation functions are necessary for neural networks because, without them, the output of the model would simply be a linear function of the input, no matter how many layers it had. In other words, it wouldn't be able to model complex, non-linear relationships in the data (a short derivation follows at the end of this slide).

➢ Activation functions can generally be classified into three main categories: binary step, linear, and non-linear, with numerous subcategories, derivatives, variations, and other calculations now being used in neural networks.
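To see why the no-activation case stays linear, consider two layers with weight matrices W1, W2 and biases b1, b2 and no activation function between them; the composition collapses into a single linear map:

y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)

which is again just a linear function of x, so the stacked network is no more expressive than a single layer.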
ACTIVATION FUNCTIONS
➢ Binary step is the simplest type of activation function, where the output is binary based on whether the input is above or below a certain threshold. Linear functions are also relatively simple, where the output is proportional to the input. Non-linear functions are more complex and introduce non-linearity into the model; examples are Sigmoid, Tanh, Softmax and the Rectified Linear Unit (ReLU) (used in deep learning models).

➢ Non-linear activation functions are the ones commonly used: their non-linearity enables every node to model more complex functions, which allows the neural network to learn more effectively.
ACTIVATION FUNCTIONS

[Figure: plots of the Sigmoid function, the Tanh (hyperbolic) function, and the ReLU function (used in deep learning)]
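A minimal Python sketch of these three functions (the definitions are standard; numpy is assumed to be available):

    import numpy as np

    def sigmoid(z):
        # maps any real input into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def tanh(z):
        # maps any real input into (-1, 1)
        return np.tanh(z)

    def relu(z):
        # passes positive inputs through, zeroes out negative ones
        return np.maximum(0.0, z)

    z = np.array([-2.0, 0.0, 2.0])
    print(sigmoid(z), tanh(z), relu(z))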
ACTIVATION FUNCTIONS
➢ The SoftMax function is used in the neurons of the output layer of a multiclass network.

➢ It turns logit values into probabilities by taking the exponent of each output and then normalizing each number by the sum of those exponents, so that the entire output vector adds up to one. Logits are the raw score values produced by the last layer of the neural network before any activation function is applied.

➢ The equation of the SoftMax function can be given as:
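softmax(zᵢ) = e^(zᵢ) / Σⱼ e^(zⱼ)

A minimal Python sketch (numpy assumed; subtracting the maximum logit before exponentiating is a standard trick for numerical stability):

    import numpy as np

    def softmax(z):
        # shift by the max logit so the exponentials cannot overflow
        e = np.exp(z - np.max(z))
        # normalize so the outputs sum to one
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.1])
    print(softmax(logits))        # approx. [0.659 0.242 0.099]
    print(softmax(logits).sum())  # 1.0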


• BIAS CAN BE DEFINED AS THE CONSTANT WHICH IS ADDED TO THE PRODUCT OF FEATURES AND WEIGHTS. IT IS USED TO OFFSET THE RESULT, AND IT HELPS THE MODEL SHIFT THE ACTIVATION FUNCTION TOWARDS THE POSITIVE OR NEGATIVE SIDE.
BIAS OF EACH NEURON IN ANN (Ꝋ0)
For example: the sigmoid activation function

[Figure, left: varying the values of the weights Ꝋ while keeping the bias Ꝋ0 = 0 changes the shape of the sigmoid curve. Right: varying the values of the weights Ꝋ and also varying the bias Ꝋ0 additionally shifts the curve.]
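As a quick numeric illustration (the weight value 20 is chosen arbitrarily): with no bias, h(x) = g(20x) crosses 0.5 at x = 0; adding a bias Ꝋ0 = −10 gives h(x) = g(20x − 10), which crosses 0.5 at x = 0.5. The bias has shifted the activation threshold to the right.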
ARTIFICIAL NEURON (SECOND VISIT)

[Figure: the same artificial neuron as before. The bias neuron X0 with weight Ꝋ0, the inputs (dendrites) with weights Ꝋ1, Ꝋ2, Ꝋ3, the computation unit (cell body), and the output (axon) applying the sigmoid:]

h(x) = 1 / (1 + e^(−θᵀx))
ANN ARCHITECTURES

[Figure: diagrams of the Single Layer perceptron and the Multi-Layer perceptron]
SINGLE LAYER PERCEPTRON
(LINEARLY SEPARABLE PROBLEMS)
ONE OUTPUT

AND FUNCTION

Weights: θ0 = −30, θ1 = 20, θ2 = 20 (bias input X0 = 1)

hθ(x) = g(−30·X0 + 20·X1 + 20·X2)

X1  X2  hθ(x)
0   0   g(−30) ≈ 0
0   1   g(−10) ≈ 0
1   0   g(−10) ≈ 0
1   1   g(10) ≈ 1
OR FUNCTION

Weights: θ0 = −10, θ1 = 20, θ2 = 20 (bias input X0 = 1)

hθ(x) = g(−10·X0 + 20·X1 + 20·X2)

X1  X2  hθ(x)
0   0   g(−10) ≈ 0
0   1   g(10) ≈ 1
1   0   g(10) ≈ 1
1   1   g(30) ≈ 1
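A minimal Python check of both truth tables (at these weight magnitudes the sigmoid saturates to approximately 0 or 1, so rounding recovers the binary output):

    import math

    def g(z):
        # sigmoid activation
        return 1.0 / (1.0 + math.exp(-z))

    def perceptron(x1, x2, theta):
        # theta = [theta0 (bias), theta1, theta2]; bias input x0 = 1
        return g(theta[0] + theta[1] * x1 + theta[2] * x2)

    AND = [-30, 20, 20]
    OR = [-10, 20, 20]
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2,
                  round(perceptron(x1, x2, AND)),
                  round(perceptron(x1, x2, OR)))
    # prints: 0 0 0 0 / 0 1 0 1 / 1 0 0 1 / 1 1 1 1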

To be continued …..
CREDITS

• MACHINE LEARNING – TOM M. MITCHELL

THANK YOU
