AI2025 Lecture06 Recording Slide
[Diagram] Input $\boldsymbol{x}$ (features) $\rightarrow$ Model $\rightarrow$ Output $\hat{y}$ (prediction), where $\hat{y} \approx y$
Training?
$D = \{(\boldsymbol{x}^{(1)}, y^{(1)}), (\boldsymbol{x}^{(2)}, y^{(2)}), \ldots, (\boldsymbol{x}^{(m)}, y^{(m)})\}$
Training means building a model so that it can predict the labels $y$ from the training data $D$.
Each row of data is called an observation or a tuple.
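As a concrete illustration (not from the slides), a tiny labeled training set might look like the following sketch; the feature values and labels are invented purely so the code runs.

import numpy as np

# A toy training set D = {(x^(i), y^(i))}_{i=1..m}: each observation pairs
# a feature vector x with its label y (values invented for illustration).
X = np.array([[0.5, 1.2],   # x^(1)
              [1.0, 0.3],   # x^(2)
              [0.2, 0.8]])  # x^(3)
y = np.array([1, 0, 1])     # y^(1), y^(2), y^(3)

m, n = X.shape              # m = 3 observations, n = 2 features each
print(m, n)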
Review: Classification vs Regression
Q1. Classification or Regression? (the model outputs a predicted rating on a numeric scale)
Q2. Classification or Regression? (the model outputs a class label such as "Cat")
[Diagram] Logistic regression as a computation graph:
Features $\boldsymbol{x}$ and parameters $\boldsymbol{w}, b$ $\rightarrow$ $z = \boldsymbol{w}^{T}\boldsymbol{x} + b$ $\rightarrow$ $\hat{y} = a = \sigma(z)$ $\rightarrow$ $L(\hat{y}, y)$
Compute the loss, then update the parameters to minimize the loss.
▪ Example
▪ $\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, $\boldsymbol{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$
▪ $z = \boldsymbol{w}^{T}\boldsymbol{x} + b = w_1 x_1 + w_2 x_2 + b$, where $\sigma(z) = \dfrac{1}{1 + e^{-z}}$
▪ $\hat{y} = a = \sigma(z) = \sigma(w_1 x_1 + w_2 x_2 + b)$
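A minimal sketch of this forward computation in NumPy; the weight, bias, and feature values below are arbitrary, chosen only to make the example runnable:

import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary example values for w, b, and a two-feature input x
w = np.array([0.4, -0.7])
b = 0.1
x = np.array([1.0, 2.0])

z = w @ x + b          # z = w^T x + b = w1*x1 + w2*x2 + b
y_hat = sigmoid(z)     # y_hat = a = sigma(z), a probability in (0, 1)
print(z, y_hat)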
[Diagram] The logistic regression computation graph from above, with the parameters $\theta = (\boldsymbol{w}, b)$ trained by gradient descent.
Repeatedly update
$\theta^{*} = \theta - \eta \cdot \nabla_{\theta} J(\theta)$
[Figure: gradient descent steps on the loss surface, converging toward the optimum $\theta^{*}$]
Computing the Parameters with Gradient Descent
$\hat{y} = \sigma(\boldsymbol{w}^{\top}\boldsymbol{x} + b)$

$J(\boldsymbol{w}, b) = \dfrac{1}{m} \sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)})$

$\theta^{*} = \theta - \eta \cdot \nabla_{\theta} J(\theta)$

$\theta = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \\ b \end{bmatrix}, \qquad \nabla_{\theta} J(\theta) = \begin{bmatrix} \partial J(\theta)/\partial w_1 \\ \partial J(\theta)/\partial w_2 \\ \vdots \\ \partial J(\theta)/\partial w_n \\ \partial J(\theta)/\partial b \end{bmatrix}$
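A minimal sketch of this training loop for logistic regression. The slides do not spell out $L$ here, so the binary cross-entropy loss (whose gradients are used below) and the hyperparameters eta and steps are assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, eta=0.1, steps=1000):
    """Gradient descent for logistic regression.
    X: (m, n) feature matrix, y: (m,) labels in {0, 1}.
    Assumes binary cross-entropy loss, for which
    dJ/dw = (1/m) X^T (y_hat - y) and dJ/db = mean(y_hat - y)."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(steps):
        y_hat = sigmoid(X @ w + b)    # forward pass for all m examples
        dw = X.T @ (y_hat - y) / m    # gradient of J w.r.t. w
        db = np.mean(y_hat - y)       # gradient of J w.r.t. b
        w -= eta * dw                 # theta <- theta - eta * grad
        b -= eta * db
    return w, b

# Example usage on a tiny made-up dataset:
X = np.array([[0.5, 1.2], [1.0, 0.3], [0.2, 0.8], [1.5, 1.1]])
y = np.array([1, 0, 1, 0])
w, b = gradient_descent(X, y)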
▪ In our previous assignment, we saw that logistic regression does not do a good job on XOR-type datasets.
$\hat{y} = \sigma(\boldsymbol{w}^{\top}\boldsymbol{x} + b)$ where $\sigma(z) = \dfrac{1}{1 + e^{-z}}$

The decision boundary is the line $\boldsymbol{w}^{\top}\boldsymbol{x} + b = 0$, i.e., where $\sigma(\boldsymbol{w}^{\top}\boldsymbol{x} + b) = 0.5$.

Predict $y = 1$ when $P(y = 1 \mid \boldsymbol{x}) = \hat{y} = \sigma(\boldsymbol{w}^{\top}\boldsymbol{x} + b) > 0.5 \Leftrightarrow \boldsymbol{w}^{\top}\boldsymbol{x} + b > 0$, and $y = 0$ otherwise.

[Figure: a linear decision boundary in the $(x_1, x_2)$ plane separating the points labeled $y = 0$ from those labeled $y = 1$]
[Figure: the OR, AND, and XOR functions plotted on the $(x_1, x_2)$ plane]

 x1  x2 |  OR  AND  XOR
  0   0 |   0    0    0
  0   1 |   1    0    1
  1   0 |   1    0    1
  1   1 |   1    1    0

OR and AND are linearly separable, but no single line can separate the two classes of XOR.
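A small brute-force sanity check of that claim (not from the slides): search a grid of candidate $(w_1, w_2, b)$ values and see whether any linear threshold $\boldsymbol{w}^{\top}\boldsymbol{x} + b > 0$ reproduces each truth table. The grid range and resolution are arbitrary choices for illustration.

import itertools
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = {
    "or":  np.array([0, 1, 1, 1]),
    "and": np.array([0, 0, 0, 1]),
    "xor": np.array([0, 1, 1, 0]),
}

def linearly_separable(y, grid=np.linspace(-2, 2, 21)):
    # Try every (w1, w2, b) on a coarse grid and check whether the
    # linear rule w^T x + b > 0 reproduces the labels y exactly.
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(pred, y):
            return True
    return False

for name, y in targets.items():
    print(name, "separable:", linearly_separable(y))
# OR and AND have linear separators on this grid; XOR has none for any linear rule.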
[Neuron image source: https://en.wikipedia.org/wiki/Neuron]
What is a Neural Network?
[Diagram] A single neuron: inputs $x_1, x_2, x_3$ feed into one neuron (with bias $\boldsymbol{b}^{[1]}$) that produces the output $\hat{y}$.
What is a Neural Network?
[Diagram] A two-layer network: inputs $x_1, x_2, x_3$ feed into a 1st layer of neurons, whose outputs feed a 2nd layer that produces $\hat{y}$, compared against the label $y$. The 1st layer has parameters $\boldsymbol{W}^{[1]}, \boldsymbol{b}^{[1]}$ and the 2nd layer has parameters $\boldsymbol{w}^{[2]}, b^{[2]}$.
Sample Calculation
[Diagram] The two-layer network (1st layer, 2nd layer) with inputs $x_1, x_2, x_3$, output $\hat{y}$, and label $y$. The computation proceeds as:

$\boldsymbol{z}^{[1]} = \boldsymbol{W}^{[1]}\boldsymbol{x} + \boldsymbol{b}^{[1]} \quad \boldsymbol{a}^{[1]} = \sigma(\boldsymbol{z}^{[1]}) \quad z^{[2]} = \boldsymbol{w}^{[2]\top}\boldsymbol{a}^{[1]} + b^{[2]} \quad a^{[2]} = \sigma(z^{[2]}) \quad L(a^{[2]}, y)$
[Diagram] Input layer, hidden layer, output layer:
Input layer: $\boldsymbol{a}^{[0]} = \boldsymbol{x} = (x_1, x_2, x_3)$
Hidden layer: $\boldsymbol{a}^{[1]} = (a^{[1]}_1, a^{[1]}_2, a^{[1]}_3, a^{[1]}_4) \in \mathbb{R}^{4}$, with parameters $\boldsymbol{W}^{[1]}, \boldsymbol{b}^{[1]}$
Output layer: $\hat{y} = a^{[2]}$, with parameters $\boldsymbol{W}^{[2]}, \boldsymbol{b}^{[2]}$
It can do a more complex job than logistic regression!
Shallow NN vs. Deep NN (DNN)
[Diagram] A shallow NN with three inputs and four hidden neurons. Each hidden neuron $i$ computes its own pre-activation and activation (see the vectorized sketch after these equations):

$z^{[1]}_1 = \boldsymbol{w}^{[1]\top}_1 \boldsymbol{x} + b^{[1]}_1, \quad a^{[1]}_1 = \sigma(z^{[1]}_1)$
$z^{[1]}_2 = \boldsymbol{w}^{[1]\top}_2 \boldsymbol{x} + b^{[1]}_2, \quad a^{[1]}_2 = \sigma(z^{[1]}_2)$
$z^{[1]}_3 = \boldsymbol{w}^{[1]\top}_3 \boldsymbol{x} + b^{[1]}_3, \quad a^{[1]}_3 = \sigma(z^{[1]}_3)$
$z^{[1]}_4 = \boldsymbol{w}^{[1]\top}_4 \boldsymbol{x} + b^{[1]}_4, \quad a^{[1]}_4 = \sigma(z^{[1]}_4)$

where $\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$.
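A short check (not from the slides) that stacking the four per-neuron dot products $\boldsymbol{w}^{[1]\top}_i \boldsymbol{x} + b^{[1]}_i$ gives exactly the matrix form $\boldsymbol{W}^{[1]}\boldsymbol{x} + \boldsymbol{b}^{[1]}$ used next; the sizes (3 inputs, 4 hidden units) match this example and the parameter values are random placeholders.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)          # x in R^3
W1 = rng.normal(size=(4, 3))    # row i of W1 is w_i^[1]^T
b1 = rng.normal(size=4)         # b^[1] in R^4

# Per-neuron computation: z_i^[1] = w_i^[1]^T x + b_i^[1]
z_per_neuron = np.array([W1[i] @ x + b1[i] for i in range(4)])

# Vectorized computation: z^[1] = W^[1] x + b^[1]
z_vectorized = W1 @ x + b1

assert np.allclose(z_per_neuron, z_vectorized)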
Input: $\boldsymbol{x} \in \mathbb{R}^{n}$
Parameters: $\boldsymbol{W}^{[1]} \in \mathbb{R}^{h \times n}$, $\boldsymbol{b}^{[1]} \in \mathbb{R}^{h}$, $\boldsymbol{w}^{[2]} \in \mathbb{R}^{h}$, $b^{[2]} \in \mathbb{R}$

Forward path:
Hidden layer: $\boldsymbol{z}^{[1]} = \boldsymbol{W}^{[1]}\boldsymbol{x} + \boldsymbol{b}^{[1]}$, $\quad \boldsymbol{a}^{[1]} = \sigma(\boldsymbol{z}^{[1]})$
Output layer: $z^{[2]} = \boldsymbol{w}^{[2]\top}\boldsymbol{a}^{[1]} + b^{[2]}$, $\quad \hat{y} = a^{[2]} = \sigma(z^{[2]})$

In this example, $n = 3$ and $h = 4$ (the number of hidden neurons).
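A minimal NumPy sketch of this forward path for $n = 3$, $h = 4$; the parameters are randomly initialized here purely so the code runs:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, h = 3, 4                          # input size and hidden-layer width
rng = np.random.default_rng(1)
W1 = rng.normal(size=(h, n))         # W^[1] in R^{h x n}
b1 = rng.normal(size=h)              # b^[1] in R^h
w2 = rng.normal(size=h)              # w^[2] in R^h
b2 = rng.normal()                    # b^[2] in R

def forward(x):
    z1 = W1 @ x + b1                 # z^[1] = W^[1] x + b^[1]
    a1 = sigmoid(z1)                 # a^[1] = sigma(z^[1])   (hidden layer)
    z2 = w2 @ a1 + b2                # z^[2] = w^[2]^T a^[1] + b^[2]
    y_hat = sigmoid(z2)              # y_hat = a^[2] = sigma(z^[2])  (output layer)
    return y_hat

print(forward(np.array([1.0, 0.0, -1.0])))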
Input: $\boldsymbol{x} \in \mathbb{R}^{n}$
Parameters: $\boldsymbol{W}^{[1]} \in \mathbb{R}^{h \times n}$, $\boldsymbol{b}^{[1]} \in \mathbb{R}^{h}$, $\boldsymbol{w}^{[2]} \in \mathbb{R}^{h}$, $b^{[2]} \in \mathbb{R}$

Forward path (hidden units use activation $g$, the output unit uses $\sigma$):
$\boldsymbol{z}^{[1]} = \boldsymbol{W}^{[1]}\boldsymbol{x} + \boldsymbol{b}^{[1]}$
$\boldsymbol{a}^{[1]} = g(\boldsymbol{z}^{[1]})$ where $g(\cdot)$ is an activation function
$z^{[2]} = \boldsymbol{w}^{[2]\top}\boldsymbol{a}^{[1]} + b^{[2]}$
$\hat{y} = a^{[2]} = \sigma(z^{[2]})$
Why do we need a non-linear activation function? If $g$ is removed (or is linear), the two layers collapse into a single linear model:

$\boldsymbol{z}^{[2]} = \boldsymbol{W}^{[2]}(\boldsymbol{W}^{[1]}\boldsymbol{x} + \boldsymbol{b}^{[1]}) + \boldsymbol{b}^{[2]} = \boldsymbol{W}^{[2]}\boldsymbol{W}^{[1]}\boldsymbol{x} + (\boldsymbol{W}^{[2]}\boldsymbol{b}^{[1]} + \boldsymbol{b}^{[2]}) = \boldsymbol{W}'\boldsymbol{x} + \boldsymbol{b}'$

where $\boldsymbol{W}' = \boldsymbol{W}^{[2]}\boldsymbol{W}^{[1]}$ and $\boldsymbol{b}' = \boldsymbol{W}^{[2]}\boldsymbol{b}^{[1]} + \boldsymbol{b}^{[2]}$.
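A quick numerical check (not from the slides) that two linear layers with no activation in between are equivalent to one linear layer; the shapes and random values are arbitrary, chosen just for the demo:

import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

# Two linear layers applied in sequence (no activation in between)
z2 = W2 @ (W1 @ x + b1) + b2

# The equivalent single linear layer: W' = W2 W1, b' = W2 b1 + b2
W_prime = W2 @ W1
b_prime = W2 @ b1 + b2

assert np.allclose(z2, W_prime @ x + b_prime)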
Common activation functions: Sigmoid, tanh, ReLU, Leaky ReLU
[Figure: plots of each activation function]
Sigmoid
▪ $\sigma(x) = \dfrac{1}{1 + e^{-x}}$
▪ Range: $(0, 1)$
▪ Derivative: $\dfrac{d\sigma(x)}{dx} = \sigma(x)\,(1 - \sigma(x))$
▪ $0 < \dfrac{d\sigma(x)}{dx} \le 0.25$
tanh
▪ $\tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\sigma(2x) - 1$
▪ Range: $(-1, 1)$
▪ Derivative: $\dfrac{d\tanh(x)}{dx} = 1 - \tanh^{2}(x)$
▪ $0 < \dfrac{d\tanh(x)}{dx} \le 1$
Leaky ReLU
▪ $f(x) = \max(ax, x)$ with $a \ll 1$ (e.g., $a = 0.01$)
▪ Range: $(-\infty, \infty)$
▪ Derivative: $\dfrac{df(x)}{dx} = \begin{cases} a & \text{if } x < 0 \\ 1 & \text{if } x > 0 \\ \text{undefined} & \text{if } x = 0 \end{cases}$
Summary of activation functions:
▪ Sigmoid: $a = \dfrac{1}{1 + e^{-x}}$
▪ tanh: $a = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
▪ ReLU: $a = \max(0, x)$. The most popular one; it suffers less from the vanishing-gradient problem and is very simple to implement in hardware.
▪ Leaky ReLU: $a = \max(ax, x)$. Sometimes better than ReLU at avoiding the dying-ReLU issue, but it takes more calculation than ReLU and is less popular.
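A compact NumPy sketch (not from the slides) of these four activations and their derivatives, matching the formulas above; the Leaky ReLU slope a = 0.01 follows the example value given earlier, and the convention used at x = 0 is an implementation choice:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)                  # lies in (0, 0.25]

def tanh(x):
    return np.tanh(x)

def d_tanh(x):
    return 1.0 - np.tanh(x) ** 2          # lies in (0, 1]

def relu(x):
    return np.maximum(0.0, x)

def d_relu(x):
    return np.where(x > 0, 1.0, 0.0)      # undefined at 0; 0 used here by convention

def leaky_relu(x, a=0.01):
    return np.maximum(a * x, x)

def d_leaky_relu(x, a=0.01):
    return np.where(x > 0, 1.0, a)        # undefined at 0; a used here by convention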
[Diagram] The network with activation $g$ in the hidden layer:
$\boldsymbol{z}^{[1]} = \boldsymbol{W}^{[1]}\boldsymbol{x} + \boldsymbol{b}^{[1]}$
$\boldsymbol{a}^{[1]} = g(\boldsymbol{z}^{[1]})$ where $g(\cdot)$ is an activation function