Different Deep CNN Architectures - LeNet, AlexNet, VGG

Uploaded by

Rani

16 September 2024 08:49 PM

LeNet 5 Architecture
➢ In the 1990s, Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner proposed the
LeNet-5 neural network design for recognising handwritten and machine-printed
characters.

➢ The LeNet-5 architecture is the ‘Hello World’ of Convolutional Neural Networks.

What is LeNet 5?
➢ LeNet is a convolutional neural network that Yann LeCun introduced in 1989.

➢ LeNet is a common term for LeNet-5, a simple convolutional neural network.

➢ The LeNet-5 signifies CNN’s emergence and outlines its core components.

➢ However, it was not popular at the time due to a lack of hardware, especially GPUs (Graphics
Processing Units, specialised electronic circuits designed to rapidly manipulate memory to
accelerate the creation of images in a frame buffer intended for output to a display device),
and because alternative algorithms, such as SVMs, could achieve results similar to or even
better than those of LeNet.

Features of LeNet-5
• Every convolutional layer includes three parts: convolution, pooling, and nonlinear
activation functions
• Using convolution to extract spatial features
• The average pooling layer is used for subsampling.
• ‘tanh’ is used as the activation function
• Using Multi-Layered Perceptron or Fully Connected Layers as the last classifier

➢ The LeNet-5 CNN architecture has seven layers: three convolutional layers, two subsampling
layers, and two fully connected layers.

LeNet-5 Architecture

First Layer
➢ A 32x32 grayscale image serves as the input for LeNet-5 and is processed by the first
convolutional layer (C1), which applies six 5×5 filters (one per feature map) with a stride of
one. The dimensions change from 32x32x1 to 28x28x6.

Second Layer
➢ LeNet-5 then applies an average pooling (sub-sampling) layer with a filter size of 2×2 and a
stride of 2. The output is reduced to 14x14x6.

Third Layer
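➢ Next comes a second convolutional layer (C3) with 16 feature maps, 5×5 filters, and a stride
of 1, giving a 10x10x16 output. Not every feature map in this layer is connected to all six
feature maps in the layer below: in the original design only one of the 16 maps reads all six,
while the others read subsets of three or four, as laid out in the connection table of the
original LeNet-5 paper.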


➢ The primary goal is to break the network’s symmetry while keeping the number of connections
manageable. As a result, this layer has 1,516 trainable parameters instead of 2,400, and
151,600 connections instead of 240,000.
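
The parameter and connection counts above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the connection scheme from the original LeNet-5 paper (six maps reading 3 inputs, nine reading 4, one reading all 6), which these notes do not spell out, and the names `C3_GROUPS` and `c3_counts` are made up for this example.

```python
# Assumed connection scheme (from the LeNet-5 paper, not from these notes):
# each pair is (number of C3 maps, number of S2 maps each of them reads).
C3_GROUPS = [(6, 3), (6, 4), (3, 4), (1, 6)]
KERNEL_WEIGHTS = 5 * 5    # weights per connected input map (5x5 filter)
OUT_POSITIONS = 10 * 10   # each C3 feature map is 10x10

def c3_counts(groups):
    """Return (maps, trainable parameters, connections) for a C3-style layer."""
    maps = sum(n for n, _ in groups)
    weights = sum(n * inputs for n, inputs in groups) * KERNEL_WEIGHTS
    params = weights + maps              # one bias per feature map
    connections = params * OUT_POSITIONS  # every parameter is reused at each position
    return maps, params, connections

sparse = c3_counts(C3_GROUPS)   # partial connectivity
full = c3_counts([(16, 6)])     # hypothetical fully connected variant

print(sparse)  # (16, 1516, 151600)
print(full)    # (16, 2416, 241600)
```

The fully connected variant comes out at 2,416 parameters when biases are included; the round figures of 2,400 and 240,000 quoted above count the weights only.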

Fourth Layer
➢ The fourth layer (S4) is again an average pooling layer with a filter size of 2×2 and a stride
of 2. It is identical to the second layer (S2) except that it has 16 feature maps, so the
output is reduced to 5x5x16.

Fifth Layer
➢ The fifth layer (C5) is a convolutional layer with 120 feature maps, each of size 1 x 1.
Because its 5×5 filters cover the whole 5x5 input, it is effectively fully connected: all 400
nodes (5x5x16) in layer four, S4, are connected to each of the 120 units in C5.

Sixth Layer
➢ A fully connected layer (F6) with 84 units makes up the sixth layer.

Output Layer
➢ The SoftMax output layer, which has 10 potential values and corresponds to the digits 0 to 9, is
the last layer.
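
The layer-by-layer sizes described above follow from the usual output-size formula, out = (in + 2·padding − kernel) / stride + 1. The following is a minimal sketch that walks a 32x32 input through C1 to C5; the helper name `conv_out` and the layer table are illustrative, and the 5×5 / 2×2 filter sizes are the ones given in the layer descriptions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# (layer name, filter size, stride, number of feature maps)
LAYERS = [
    ("C1", 5, 1, 6),
    ("S2", 2, 2, 6),
    ("C3", 5, 1, 16),
    ("S4", 2, 2, 16),
    ("C5", 5, 1, 120),
]

size = 32
shapes = [("input", size, 1)]
for name, kernel, stride, maps in LAYERS:
    size = conv_out(size, kernel, stride)
    shapes.append((name, size, maps))

for name, side, maps in shapes:
    print(f"{name}: {side}x{side}x{maps}")
```

After C5 the 120 values feed the 84-unit layer F6 and then the 10-way output layer.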


Summary of LeNet-5 Architecture

Implementation: https://medium.com/@siddheshb008/lenet-5-architecture-explained-3b559cb2d52b (for Implementation of LeNet)

AlexNet Architecture
➢ The convolutional neural network (CNN) architecture known as AlexNet was created by Alex
Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, who served as Krizhevsky’s PhD advisor.


➢ It was one of the first widely known architectures to use GPUs to speed up training.

➢ AlexNet consists of 5 convolution layers, 3 max-pooling layers, 2 normalization layers, 2 fully
connected layers and 1 SoftMax layer.

➢ Each convolution layer consists of a convolution filter and a non-linear activation function
called “ReLU”.

➢ The pooling layers are used to perform the max-pooling function and the input size is fixed due
to the presence of fully connected layers.

➢ The input size is often quoted as 224x224x3, but the arithmetic of the first convolutional
layer only works out exactly for an input of 227x227x3.
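
A quick check of the input-size question: the sketch below assumes the first AlexNet convolution uses 11×11 filters with a stride of 4 and no padding (values taken from the AlexNet paper, not stated in these notes) and tests which input size yields a whole number of output positions. The helper `conv_out_exact` is a made-up name for this example.

```python
def conv_out_exact(size, kernel, stride, padding=0):
    """Output size if the filter tiles the input exactly, otherwise None."""
    span = size + 2 * padding - kernel
    if span % stride != 0:
        return None  # the last filter position would run past the image edge
    return span // stride + 1

# Assumed first-layer hyperparameters: 11x11 filters, stride 4, no padding.
out_224 = conv_out_exact(224, kernel=11, stride=4)
out_227 = conv_out_exact(227, kernel=11, stride=4)

print(out_224)  # None
print(out_227)  # 55
```

Under these assumptions only the 227-pixel input gives a clean 55x55 output map.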

➢ In total, AlexNet has over 60 million parameters.

Key Features:
• ‘ReLU’ is used as an activation function rather than ‘tanh’
• Batch size of 128
• SGD Momentum is used as a learning algorithm
• Data augmentation is carried out, such as flipping, jittering, cropping, colour
normalization, etc.
➢ AlexNet was trained on a GTX 580 GPU with only 3 GB of memory which couldn’t fit the
entire network. So the network was split across 2 GPUs, with half of the neurons(feature maps)
on each GPU.

Max Pooling
Max Pooling is commonly built into Convolutional Neural Network (CNN) architectures. The main
idea behind a pooling layer is to “accumulate” features from maps generated by convolving a
filter over an image. Formally, its function is to progressively reduce the spatial size of the
representation, which reduces the amount of computation (and the number of parameters in
later layers) in the network. The most common form of pooling is max pooling.


➢ Max pooling is done in part to help reduce over-fitting by providing an abstracted form of the
representation. It also reduces the computational cost by shrinking the representation passed
to later layers, and provides basic translation invariance to the internal representation. Max
pooling is done by applying a max filter to (usually) non-overlapping sub-regions of the initial
representation.

➢ The authors of AlexNet used overlapping pooling windows of size 3×3 with a stride of 2 between
adjacent windows. Compared with non-overlapping pooling windows of size 2×2 with a stride of 2,
which produce output of the same dimensions, this overlapping scheme reduced the top-1 error
rate by 0.4% and the top-5 error rate by 0.3%.
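
To make the comparison concrete, here is a minimal NumPy sketch of max pooling on a single 2-D feature map. It is a plain loop written for readability, not the AlexNet implementation; the function name `max_pool2d` and the 7x7 toy input are assumptions for illustration.

```python
import numpy as np

def max_pool2d(x, window, stride):
    """Max pooling over a 2-D array with a square window and no padding."""
    h, w = x.shape
    out_h = (h - window) // stride + 1
    out_w = (w - window) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + window, c:c + window].max()
    return out

x = np.arange(49).reshape(7, 7)                      # toy 7x7 feature map
overlapping = max_pool2d(x, window=3, stride=2)      # windows share a row/column
non_overlapping = max_pool2d(x, window=2, stride=2)  # windows are disjoint

print(overlapping.shape, non_overlapping.shape)  # (3, 3) (3, 3)
```

Both settings produce a 3x3 output here, but each overlapping window looks at 9 values instead of 4, so neighbouring outputs share part of their input.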

ReLU Non-Linearity
➢ The AlexNet paper demonstrates that deep CNNs train much more quickly with the non-saturating
ReLU than with saturating activation functions like Tanh or Sigmoid. In the authors’ experiment
on the CIFAR-10 dataset, a four-layer CNN with ReLUs (solid curve) reaches a training error
rate of 25% six times faster than the same network with tanh (dotted curve).
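
One way to see why saturation matters is to compare gradients. The sketch below is only an illustration of the two activation functions, not a reproduction of the CIFAR-10 experiment; the sample inputs are arbitrary.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is 1 for positive inputs and 0 otherwise.
    return (x > 0).astype(float)

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2, which shrinks towards 0 for large |x|.
    return 1.0 - np.tanh(x) ** 2

x = np.array([-5.0, -1.0, 0.5, 5.0])
print(relu_grad(x))  # [0. 0. 1. 1.]
print(tanh_grad(x))  # tiny values at -5 and 5, where tanh saturates
```

For large positive inputs the ReLU gradient stays at 1, while the tanh gradient becomes very small, so weight updates that pass through a saturated tanh unit are heavily damped.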

Data Augmentation
➢ Overfitting can be reduced by showing the neural net different variations of the same image.
This also produces more training data and pushes the neural net to learn the essential
features rather than memorise individual images.

• Augmentation by Mirroring
Suppose our training set contains a picture of a cat. Its mirror image is also a valid picture
of a cat. This means that by simply flipping each image about the vertical axis, we can double
the size of the training dataset.

• Augmentation by Random Cropping of Images

➢ Randomly cropping the original image will also produce additional data that is simply the
original data shifted.

➢ For the network’s inputs, the creators of AlexNet took random crops of 224 by 224 (and their
horizontal reflections) from within the 256 by 256 images. This increased the size of the
training set by a factor of 2048.
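
Both augmentations are simple array operations. The following is a minimal NumPy sketch, assuming images are stored as height x width x channels arrays; the function names and the random test image are illustrative, and the crop size is a parameter (224, as in the AlexNet paper, is used here) so it can be set to whatever input size the network expects.

```python
import numpy as np

def random_crop(image, size, rng):
    """Cut a random size x size patch out of an H x W x C image."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)   # upper bound is exclusive
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

def mirror(image):
    """Flip an image about its vertical axis (left-right)."""
    return image[:, ::-1]

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)  # stand-in image

crop = random_crop(image, 224, rng)
flipped = mirror(crop)
print(crop.shape, flipped.shape)
```

Applying `random_crop` and `mirror` afresh each epoch means the network rarely sees exactly the same pixels twice, at almost no storage cost.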

Dropout
➢ During dropout, each neuron is removed from the neural network with a probability of 0.5. A
neuron that is dropped does not make any contribution to either forward or backward
propagation. As a result, each input is effectively processed by a different network
architecture, so neurons cannot rely on the presence of particular other neurons. The learned
weights are therefore more robust and less prone to overfitting.
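
A minimal NumPy sketch of dropout follows. It assumes the scheme in which units are zeroed during training and all outputs are scaled by the keep probability at test time; many modern libraries instead rescale during training ("inverted dropout"). The function names are made up for this example.

```python
import numpy as np

def dropout_forward(activations, p_drop, rng, training):
    """Zero each unit with probability p_drop while training."""
    if training:
        mask = rng.random(activations.shape) >= p_drop  # True = unit is kept
        return activations * mask, mask
    # At test time every unit is active, so scale by the keep probability.
    return activations * (1.0 - p_drop), None

def dropout_backward(grad_output, mask):
    """Dropped units receive no gradient."""
    return grad_output * mask

rng = np.random.default_rng(0)
acts = np.ones((4, 8))

train_out, mask = dropout_forward(acts, p_drop=0.5, rng=rng, training=True)
test_out, _ = dropout_forward(acts, p_drop=0.5, rng=rng, training=False)
grads = dropout_backward(np.ones_like(acts), mask)
```

Because a fresh mask is drawn for every input, each training example sees a different thinned network, which is the effect described above.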


AlexNet Summary

Implementation: https://medium.com/@siddheshb008/alexnet-architecture-explained-b6240c528bd5 (for Implementation of AlexNet)

VGG Architecture
➢ VGGNet was created by the Visual Geometry Group at the University of Oxford.

➢ At ILSVRC 2014, GoogLeNet won the classification task; VGGNet was the runner-up in
classification and came first in the localisation task.

➢ Understanding VGGNet is important since many contemporary image classification models are
constructed on top of it.

What is VGG-Net?
➢ It is a typical deep Convolutional Neural Network (CNN) design with numerous layers, and the
abbreviation VGG stands for Visual Geometry Group. The term “deep” describes the number
of layers, with VGG-16 and VGG-19 having 16 and 19 weight layers (convolutional plus fully
connected), respectively.

➢ State-of-the-art object recognition models have been built on the VGG architecture. As a
deep neural network, VGGNet outperforms baselines on a variety of tasks and datasets
beyond ImageNet. It also remains one of the most widely used image recognition architectures
today.

Distinguish Between AlexNet (above) and VGGNet

➢ VGG-16: The VGG model (VGGNet) with 16 weight layers is known as VGG16.

➢ It was developed by A. Zisserman and K. Simonyan from the University of Oxford.

➢ The research paper titled “Very Deep Convolutional Networks for Large-Scale Image
Recognition” contains the model that these researchers released.

➢ On ImageNet, the VGG16 model achieves a top-5 test accuracy of about 92.7 per cent. ImageNet
is a dataset of over 14 million images; the ILSVRC subset used for this benchmark covers 1000
classes. VGG16 was among the best-known models submitted to ILSVRC-2014. It significantly
outperforms AlexNet by replacing the large kernel-sized filters with stacks of several 3x3
filters. The VGG16 model was trained over several weeks on Nvidia Titan Black GPUs.
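
The benefit of stacking small filters can be checked with simple arithmetic. The sketch below assumes stride-1 convolutions with the same number of input and output channels and ignores biases; the helper names and the channel count of 256 are illustrative choices.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)

def conv_weights(kernel, channels, num_layers=1):
    """Weights (no biases) for num_layers convs with `channels` in and out."""
    return num_layers * kernel * kernel * channels * channels

channels = 256
rf_two = stacked_receptive_field(2)    # same view as one 5x5 filter
rf_three = stacked_receptive_field(3)  # same view as one 7x7 filter

stack_weights = conv_weights(3, channels, num_layers=3)
single_weights = conv_weights(7, channels)

print(rf_two, rf_three)               # 5 7
print(stack_weights, single_weights)  # 1769472 3211264
```

Three stacked 3x3 layers cover the same 7x7 region as a single large filter with roughly 45% fewer weights, and they apply three non-linearities instead of one.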

➢ VGGNet-16 has 16 layers and can classify images into 1000 different object categories,
including keyboard, animals, pencil, mouse, etc., as discussed above. The model accepts
images with a resolution of 224 by 224.


➢ VGG-19: The VGG19 model (also known as VGGNet-19) follows the same basic idea as the
VGG16 model, except that it has 19 weight layers.

➢ The numbers “16” and “19” refer to the model’s weight layers (convolutional and fully
connected). In comparison to VGG16, VGG19 contains three extra convolutional layers. The
features of the VGG16 and VGG19 networks are covered in greater detail in the final section
of these notes.

VGG-Net Architecture
➢ Very tiny convolutional filters are used in the construction of the VGG network. Thirteen
convolutional layers and three fully connected layers make up the VGG-16.

Let’s quickly examine VGG’s architecture:

• Inputs: VGGNet accepts 224×224-pixel images as input. To maintain a consistent input
size for the ImageNet competition, the model’s developers cropped out the central 224×224
patch of each image.
• Convolutional Layers: VGG’s convolutional layers use the smallest feasible receptive
field, 3×3, which is the smallest size that still captures left/right and up/down. In some
configurations, 1×1 convolution filters are additionally used to transform the input linearly.
Each convolution is followed by a ReLU unit, an innovation carried over from AlexNet that
shortens training time. The rectified linear unit activation function, or ReLU, is a piecewise
linear function that outputs the input if it is positive and zero otherwise. The convolution
stride is fixed at 1 pixel which, together with 1-pixel padding, keeps the spatial resolution
preserved after convolution (stride is the number of pixels the filter shifts over the input
matrix at each step).
• Hidden Layers: The VGG network’s hidden layers all make use of ReLU. Local Response
Normalization (LRN) is typically not used with VGG as it increases memory usage and
training time. Furthermore, it doesn’t increase overall accuracy.
• Fully Connected Layers: VGGNet contains three fully connected layers. The first two
layers each have 4096 channels, while the third layer has 1000 channels, one for each
class.

Understanding VGG-16
➢ The number 16 in the name VGG16 refers to the network’s 16 weight layers. The VGG16 network
is quite large, with a total of over 138 million parameters. Even by today’s standards, it is
a sizable network. Nevertheless, the simplicity of the VGGNet16 architecture makes the network
appealing: its architecture is highly uniform.

➢ A few convolution layers are followed by a pooling layer that halves the height and width.
The first block uses 64 filters; the number then doubles to 128 and again to 256, and the last
blocks use 512 filters.
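
The parameter total can be checked by counting layer by layer. The sketch below assumes the standard VGG-16 layout (3x3 convolutions that preserve the spatial size, 2x2 pooling that halves it, and fully connected layers of 4096, 4096 and 1000 units); the list `CFG` and the function `vgg16_params` are illustrative names, not part of any library.

```python
# Filters per convolutional layer; "M" marks a 2x2 max-pooling layer.
CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]

def vgg16_params(cfg, in_channels=3, image_size=224, fc=(4096, 4096, 1000)):
    """Return (total parameters, number of conv layers, final feature-map size)."""
    total, conv_layers, size = 0, 0, image_size
    for item in cfg:
        if item == "M":
            size //= 2                                   # pooling halves height and width
        else:
            total += 3 * 3 * in_channels * item + item   # weights + biases
            in_channels = item
            conv_layers += 1
    features = in_channels * size * size                 # flattened input to first FC layer
    for units in fc:
        total += features * units + units
        features = units
    return total, conv_layers, size

total, conv_layers, final_size = vgg16_params(CFG)
print(total, conv_layers, final_size)  # 138357544 13 7
```

Most of the 138 million parameters sit in the first fully connected layer (7 x 7 x 512 inputs to 4096 units), not in the thirteen convolutional layers.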

VGG-16 Summary


Implementation: https://medium.com/@siddheshb008/vgg-net-architecture-explained-71179310050f (for Implementation of VGG)
