
Capsule Network - Kumar Shaswat

1. Capsule networks aim to address a key limitation of CNNs: they do not capture spatial relationships between simple and complex objects.
2. Capsule networks introduce the idea of capsules, which are groups of neurons that represent different properties of the same entity.
3. The capsule network architecture includes an encoder that learns to encode input images into vectors, and a decoder that learns to reconstruct the input image from these vectors.


Capsule Networks
Kumar Shaswat (E18SOE802)
Bennett University
What we will Cover
CNNs Over-Simplified
1. Early CNN layers learn to detect edges and colour gradients.
2. Deeper layers learn to detect more complex features.
3. Finally, dense layers combine very high-level features and give classification predictions.
CNN's Drawback
The internal data representation of a convolutional neural network does not take into account important spatial hierarchies between simple and complex objects.
CNN's Band-Aid
The CNN approach to this issue is to use max pooling or successive convolutional layers that reduce the spatial size of the data flowing through the network, and therefore increase the "field of view" of higher layers' neurons, allowing them to detect higher-order features in a larger region of the input image.

Hinton: "The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster."
Our Brain Accepts all these as Same
Hinton's Idea
Hinton argues that the brain in fact does the opposite of rendering, i.e. inverse graphics: from the visual information received by the eyes, it deconstructs a hierarchical representation of the world around us and tries to match it with already-learned patterns and relationships stored in the brain.
Visual Representation of Requirement
CapsNet (Capsule Network)
Capsule: A capsule is a group of neurons whose outputs represent different properties of the same entity.

Capsule Network: The idea is to add capsules to a conventional neural network and to reuse the output from several of those capsules.
CapsNet Architecture
Encoder (Convolution Layer → PrimaryCaps Layer → DigitCaps Layer):
◦ The encoder takes as input a 28x28 MNIST digit image and learns to encode it into a 16-dimensional vector of instantiation parameters.
◦ The output of the network during prediction is a 10-dimensional vector of the lengths of the DigitCaps' outputs.

Decoder (Fully Connected #1 → Fully Connected #2 → Fully Connected #3):
◦ The decoder takes the 16-dimensional vector from the correct DigitCap and learns to decode it into an image of a digit.
◦ The decoder forces capsules to learn features that are useful for reconstructing the original image.
Convolution Layer
Input: 28x28 image (one colour channel).
Kernels: 256, size 9x9x1, stride 1.
Output: 20x20x256 tensor.
Number of parameters: (9*9 + 1) * 256 = 20992.
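As a sanity check, the parameter count and output size above follow from the usual conv-layer formulas; a short sketch (helper names are illustrative):

```python
def conv_params(n_kernels, k, in_ch):
    # Each kernel has k*k*in_ch weights plus one bias.
    return n_kernels * (k * k * in_ch + 1)

def conv_out_size(n_in, k, stride):
    # Output spatial size with no padding.
    return (n_in - k) // stride + 1

print(conv_params(256, 9, 1))    # 20992
print(conv_out_size(28, 9, 1))   # 20
```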
PrimaryCaps Layer
Input: 20x20x256 tensor.
Number of capsules: 32; each capsule applies 8 convolutional kernels of size 9x9x256 with stride 2.
Output: 6x6x8x32 tensor.
Number of parameters: 32 * 8 * (9*9*256 + 1) = 5308672.
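The same bookkeeping verifies the PrimaryCaps numbers, including the 6x6 spatial size that the stride-2 convolution produces:

```python
# 32 capsule channels, each applying 8 conv kernels of size 9x9x256, stride 2.
params = 32 * 8 * (9 * 9 * 256 + 1)   # +1 bias per kernel
spatial = (20 - 9) // 2 + 1           # stride-2 conv over the 20x20 input, no padding
print(params, spatial)                # 5308672 6

n_capsule_outputs = spatial * spatial * 32
print(n_capsule_outputs)              # 1152 eight-dimensional capsule outputs
```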
How Capsule works
DigitCaps Layer
Input: 6x6x8x32 tensor = 1152 eight-dimensional vectors.
10 digit capsules, one for each digit.
Weight matrix for each input vector: 8x16.
1152 c coefficients and 1152 b coefficients per digit capsule.
Output: 16x10 matrix.
Number of parameters: (1152*8*16 + 1152 + 1152) * 10 = 1497600.
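The DigitCaps count above can be reproduced directly (note the slide includes the c and b routing coefficients in the total, even though they are computed during routing rather than learned):

```python
# 1152 lower-level capsule outputs (6*6*32), each 8-D; each is multiplied by an
# 8x16 weight matrix to produce a prediction for each 16-D digit capsule.
n_lower = 6 * 6 * 32               # 1152
weights = n_lower * 8 * 16         # one 8x16 matrix per lower capsule, per digit capsule
coeffs = n_lower + n_lower         # c and b routing coefficients, as counted on the slide
total = (weights + coeffs) * 10
print(n_lower, total)              # 1152 1497600
```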
CapsNet Loss Function
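The slide's figure is not recoverable here; the loss used in reference [1] is a per-digit margin loss on the DigitCaps output lengths, with m+ = 0.9, m− = 0.1 and λ = 0.5. A minimal NumPy sketch:

```python
import numpy as np

def margin_loss(lengths, one_hot, m_pos=0.9, m_neg=0.1, lam=0.5):
    # lengths: (batch, 10) DigitCaps output lengths; one_hot: (batch, 10) targets.
    present = one_hot * np.maximum(0.0, m_pos - lengths) ** 2
    absent = lam * (1.0 - one_hot) * np.maximum(0.0, lengths - m_neg) ** 2
    return (present + absent).sum(axis=-1).mean()

# A confident, correct prediction incurs zero loss:
lengths = np.full((1, 10), 0.05)
lengths[0, 3] = 0.95
target = np.zeros((1, 10))
target[0, 3] = 1.0
print(margin_loss(lengths, target))  # 0.0
```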
Fully Connected Layers
Fully Connected Layer #1: Input: 16x10. Output: 512. No. of parameters: (16*10 + 1) * 512 = 82432.
Fully Connected Layer #2: Input: 512. Output: 1024. No. of parameters: (512 + 1) * 1024 = 525312.
Fully Connected Layer #3: Input: 1024. Output: 784 (the reconstructed 28x28 image). No. of parameters: (1024 + 1) * 784 = 803600.
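The per-layer arithmetic for the decoder (layer sizes 512 → 1024 → 784, as in reference [1]) can be checked directly:

```python
def fc_params(n_in, n_out):
    # Weight matrix plus one bias per output unit.
    return (n_in + 1) * n_out

print(fc_params(16 * 10, 512))   # 82432
print(fc_params(512, 1024))      # 525312
print(fc_params(1024, 784))      # 803600
```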
Steps in CapsNet
Input → Convolutions → ReLU → PrimaryCaps → Squashing → Routing by Agreement → DigitCaps
Routing by Agreement
A lower-level capsule sends its output to the higher-level capsule whose output "agrees" with that lower-level capsule's prediction.
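The squashing and routing steps above can be sketched in a few lines of NumPy (shapes reduced for illustration; names follow reference [1], and this is a simplified sketch rather than the full batched implementation):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    # Shrinks short vectors toward 0 and long vectors toward unit length.
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def routing(u_hat, iterations=3):
    # u_hat: (n_lower, n_upper, dim) prediction vectors from lower capsules.
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))                          # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # softmax over upper capsules
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum per upper capsule
        v = squash(s)                                         # (n_upper, dim)
        b += (u_hat * v[None, :, :]).sum(axis=-1)             # reward agreement
    return v

v = routing(np.random.randn(8, 10, 16))
print(v.shape)                                   # (10, 16)
print(np.linalg.norm(v, axis=-1).max() < 1.0)    # True
```

The agreement update is the key step: a prediction vector that points in the same direction as an upper capsule's output gets a larger dot product, which raises its routing logit and hence its coupling coefficient on the next iteration.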
References
1. Sabour, S., Frosst, N. and Hinton, G.E., 2017. Dynamic routing between capsules. In Advances in Neural Information Processing Systems (pp. 3856-3866).
2. Hinton, G.E., Krizhevsky, A. and Wang, S.D., 2011. Transforming auto-encoders. In International Conference on Artificial Neural Networks (pp. 44-51). Springer, Berlin, Heidelberg.
3. Pechyonkin, M. Understanding Hinton's Capsule Networks.
4. Bourdakos, N. Understanding Capsule Networks — AI's Alluring New Architecture.
