
CHAPTER 7 | Neural Networks and Neural Language Models

“[M]achines of this character can behave in a very complicated manner when the number of units is large.”
                Alan Turing (1948) “Intelligent Machinery”, page 6

Introduction

They are called neural because:

 Their origins lie in the McCulloch-Pitts neuron
 A simplified model of the human neuron as a kind of computing element
 Described in terms of propositional logic
Introduction
A neural network is a
  Network of small computing units
  Each of which takes a vector of input values
  And produces a single output value
  Called a feedforward network
 • Because the computation proceeds iteratively from one layer of units to the next
Unit
 Takes a weighted sum of its inputs
 Plus one additional bias term b

        z = b + Σ_i w_i x_i

 Using vector notation:

        z = w · x + b
 [w: the weight vector, b: a scalar bias, x: the input vector]
Activation

 Apply a non-linear function f to z
 The output of this function is the activation value, a

        y = a = f(z)

 Different activation functions:
 • Sigmoid
 • Tanh
 • Rectified linear unit (ReLU)
Sigmoid
 Maps the output into the range (0, 1)

        σ(z) = 1 / (1 + e^(−z))

 The output of a neural unit:

        y = σ(w · x + b) = 1 / (1 + exp(−(w · x + b)))

 Example:
     weight vector w = [0.2, 0.3, 0.9], bias b = 0.5, and input x = [0.5, 0.6, 0.1]
     z = w · x + b = 0.1 + 0.18 + 0.09 + 0.5 = 0.87, so y = σ(0.87) ≈ 0.70

Sigmoid

 Used in the output layer for binary classification

 Disadvantage:
 • Its output is not zero-centered
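
A minimal sketch of the worked example above in Python (assuming NumPy is available; the numbers are the ones given on the slide):

    import numpy as np

    def sigmoid(z):
        # squashes any real value into the range (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    w = np.array([0.2, 0.3, 0.9])    # weight vector from the slide
    b = 0.5                          # bias
    x = np.array([0.5, 0.6, 0.1])    # input vector

    z = np.dot(w, x) + b             # weighted sum plus bias: 0.87
    y = sigmoid(z)                   # activation value: about 0.70
    print(z, y)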
Tanh

        tanh(z) = (e^z − e^(−z)) / (e^z + e^(−z))

 Advantage: the mean of the activations is closer to zero
ReLU (rectified linear unit)

        y = ReLU(z) = max(z, 0)

 Advantages:
 • Avoids the vanishing gradient problem
 • Does not saturate for large positive values of z
Vanishing gradient:
A network is trained by
 Propagating an error signal backwards through the layers

Now, with saturating activations such as sigmoid or tanh,
 Gradients that are almost 0 cause the error signal to become too small to be useful for training
 This is called the vanishing gradient problem
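
A small illustrative check (not from the slides): the sigmoid derivative σ(z)(1 − σ(z)) shrinks toward 0 as z grows, while the ReLU derivative stays at 1 for any positive z.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for z in [0.0, 2.0, 5.0, 10.0]:
        s = sigmoid(z)
        sigmoid_grad = s * (1.0 - s)        # derivative of sigmoid at z
        relu_grad = 1.0 if z > 0 else 0.0   # derivative of ReLU at z (taken as 0 at z = 0)
        print(z, round(sigmoid_grad, 5), relu_grad)
    # the sigmoid gradient falls from 0.25 toward 0; the ReLU gradient stays 1 for z > 0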
The XOR problem
Can neural units compute simple functions of input?
AND              OR               XOR
x1  x2  y        x1  x2  y        x1  x2  y
 0   0  0         0   0  0         0   0  0
 0   1  0         0   1  1         0   1  1
 1   0  0         1   0  1         1   0  1
 1   1  1         1   1  1         1   1  0
Perceptrons

A very simple neural unit
• Binary output (0 or 1)
• No non-linear activation function
Easy to build AND or OR with perceptrons

AND (for example, with weights w1 = w2 = 1 and bias b = −1; the unit outputs 1 if w·x + b > 0, otherwise 0):

x1  x2    w·x + b    output
 0   0      −1         0
 0   1       0         0
 1   0       0         0
 1   1       1         1

OR (for example, with weights w1 = w2 = 1 and bias b = 0):

x1  x2    w·x + b    output
 0   0       0         0
 0   1       1         1
 1   0       1         1
 1   1       2         1
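
A minimal sketch of these two perceptrons in Python (assuming NumPy; the unit weights of 1 are the conventional choice rather than values stated explicitly on the slides):

    import numpy as np

    def perceptron(w, b, x):
        # binary threshold unit: no non-linear activation beyond the step
        return 1 if np.dot(w, x) + b > 0 else 0

    w = np.array([1.0, 1.0])
    for x1 in (0, 1):
        for x2 in (0, 1):
            x = np.array([x1, x2])
            print(x1, x2,
                  "AND:", perceptron(w, -1.0, x),   # bias -1 implements AND
                  "OR:",  perceptron(w,  0.0, x))   # bias 0 implements OR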
Not possible to capture XOR with perceptrons!

 The perceptron equation, given inputs x1 and x2, is the equation of a line:

        w1·x1 + w2·x2 + b = 0

 In standard linear form: x2 = (−w1/w2)·x1 + (−b/w2)

 This line acts as a decision boundary
 • Output 0 if the input is on one side of the line
 • Output 1 if it is on the other side
Decision boundaries
[Figure: decision boundaries in the (x1, x2) plane for a) AND, b) OR, c) XOR; a single line separates the classes for AND and OR, but no such line exists for XOR (marked with a “?”).]
 Filled circles represent inputs for which the perceptron should output 1,
 white circles, inputs for which it should output 0
 There is no way to draw a single line that correctly separates the two categories for XOR
The solution: neural networks
 XOR can be computed by a layered network of units

 For example, by two layers of ReLU-based units
 • The middle (hidden) layer, called h, has two units
 • The output layer, called y, has one unit
The solution: neural networks
        0                       0
                           0
                 0
                                       0  input x = [0, 0]
                                  0                                           In hidden layer,  [0, -1]
        0                        0        -1 =>[0]  After Relu  h layer as
                                  0  [0, 0]
                                 -1  Final output 0
The solution: neural networks
        0                        0
                           1
                 1
                                       1  input x = [0, 1]
                                  0                                           In hidden layer,  [1, 0]
        1                        0                 0   After Relu  h layer as
                                  1  [1, 0]
                                 -1  Final output 1
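
A minimal sketch of this two-layer ReLU network in Python; the specific weight values below are the standard textbook solution for XOR (the slides show only the resulting hidden values and outputs, so treat the weights themselves as an assumption):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0)

    # Hidden-layer weights/bias and output-layer weights for the XOR network
    W = np.array([[1.0, 1.0],
                  [1.0, 1.0]])
    b = np.array([0.0, -1.0])
    U = np.array([1.0, -2.0])

    for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
        x = np.array(x, dtype=float)
        h = relu(W @ x + b)   # hidden representation
        y = U @ h             # single output unit
        print(x, h, y)
    # prints h = [0,0], [1,0], [1,0], [2,1] and y = 0, 1, 1, 0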
The hidden representation h
 The two inputs x = [0, 1] and x = [1, 0] are merged into the same hidden representation h = [1, 0]
 This merger makes it easy to linearly separate the positive and negative cases of XOR
Feedforward Neural Networks
 The simplest kind of neural network
 A multilayer network
 Units are connected with no cycles
 Outputs from units in each layer are passed to units in the next higher layer
 No outputs are passed back to lower layers
 Sometimes called multi-layer perceptrons (MLPs)
Feedforward Neural Networks
 Feedforward networks have three kinds of nodes
 • Input units, hidden units, and output units
 Layers are fully-connected
 • Each unit takes as input the outputs of all units in the previous layer
 • There is a link between every pair of units from two adjacent layers
Hidden layer computation
Three steps: multiplying the weight matrix W by the input vector x, adding the
bias vector b, and applying the activation function g (here the sigmoid σ)

        h = σ(Wx + b)

Where the activation function is applied elementwise:
        g([z1, z2, z3]) = [g(z1), g(z2), g(z3)]
Output layer computation
 Weight matrix U, with U ∈ ℝ^(n2×n1)
 Input vector: the hidden layer h
 Intermediate output z, with z ∈ ℝ^n2

        z = Uh

 z is a vector of real-valued numbers
 It can't be the output of the classifier as-is; it must first be normalized into a probability distribution
Softmax
 Normalizes a vector of real values into a vector that encodes a probability distribution
 • Each value lies between 0 and 1
 • The values sum to 1
 Used for multiclass classification

        softmax(z_i) = exp(z_i) / Σ_j exp(z_j)
Final Equation

        h = σ(Wx + b)
        z = Uh
        y = softmax(z)

Where x ∈ ℝ^n0, h ∈ ℝ^n1, b ∈ ℝ^n1, W ∈ ℝ^(n1×n0), U ∈ ℝ^(n2×n1), and the output vector y ∈ ℝ^n2
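
Putting the three equations together, a minimal NumPy sketch of a full forward pass; the layer sizes n0 = 3, n1 = 4, n2 = 2 and the random weights are illustrative placeholders, not values from the slides:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def softmax(z):
        # subtract the max for numerical stability before exponentiating
        e = np.exp(z - np.max(z))
        return e / e.sum()

    rng = np.random.default_rng(0)
    n0, n1, n2 = 3, 4, 2            # input, hidden, and output sizes (illustrative)

    W = rng.normal(size=(n1, n0))   # hidden-layer weights, W in R^(n1 x n0)
    b = rng.normal(size=n1)         # hidden-layer bias,    b in R^n1
    U = rng.normal(size=(n2, n1))   # output-layer weights, U in R^(n2 x n1)

    x = np.array([0.5, 0.6, 0.1])   # input vector x in R^n0
    h = sigmoid(W @ x + b)          # h = sigmoid(Wx + b)
    z = U @ h                       # z = Uh
    y = softmax(z)                  # y = softmax(z), a probability distribution

    print(h, z, y, y.sum())         # y.sum() is 1.0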
