Published in Towards Data Science

Hugo Tessier

Sep 9, 2021 · 22 min read

Neural Network Pruning 101

All you need to know not to get lost

Whether it is in computer vision, natural language processing or image generation, deep neural networks yield the state of the art. However, their cost in terms of computational power, memory or energy consumption can be prohibitive, making some of them downright unaffordable for most limited hardware. Yet, many domains would benefit from neural networks, hence the need to reduce their cost while maintaining their performance.

That is the whole point of neural network compression. This field comprises multiple families of methods, such as quantization [11], factorization [13], distillation [32] or, the focus of this post, pruning.

Neural network pruning is a method that revolves around the intuitive idea of removing superfluous parts of a network that performs well but costs a lot of resources. Indeed, even though large neural networks have proven countless times how well they can learn, it turns out that not all of their parts are still useful once training is over. The idea is to eliminate these parts without impacting the network’s performance.

Unfortunately, the dozens, if not hundreds, of papers published each year reveal the hidden complexity of a supposedly straightforward idea. Indeed, a quick overview of the literature yields countless ways of identifying said useless parts or of removing them before, during or after training; it even turns out that not all kinds of pruning actually allow for accelerating neural networks, which is supposed to be the whole point.

The goal of this post is to provide a solid foundation to tackle the intimidatingly wild literature around neural network pruning. We will successively review three questions that seem to be at the core of the whole domain: “What kind of part should I prune?”, “How to tell which parts can be pruned?” and “How to prune parts without harming the network?”. To sum it up, we will detail pruning structures, pruning criteria and pruning methods.

1 — Pruning structures

1.1 — Unstructured pruning


When talking about the cost of neural networks, the parameter count is surely one of the most widely used metrics, along with FLOPs (floating-point operations). It is indeed intimidating to see networks displaying astronomical amounts of weights (up to billions for some), often correlated with stellar performance. Therefore, it is quite intuitive to aim at directly reducing this count by removing the parameters themselves. Actually, pruning connections is one of the most widespread paradigms in the literature, enough to be considered the default framework when dealing with pruning. The seminal work of Han et al. [26] presented this kind of pruning and served as a basis for numerous contributions [18, 21, 25].

Directly pruning parameters has many advantages. First, it is simple, since replacing the value of a weight with zero, within the parameter tensors, is enough to prune a connection. Widespread deep learning frameworks, such as PyTorch, make it easy to access all the parameters of a network, which makes the method extremely simple to implement. Still, the greatest advantage of pruning connections remains that they are the smallest, most fundamental elements of networks and, therefore, numerous enough to be pruned in large quantities without impacting performance. Such a fine granularity allows pruning very subtle patterns, down to parameters within convolution kernels, for example. As pruning weights is not limited by any constraint and is the finest way to prune a network, this paradigm is called unstructured pruning.
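To make this concrete, here is a minimal PyTorch sketch of unstructured pruning through masking; the helper name and the random masks are mine, purely illustrative, and during any subsequent training the masks would have to be re-applied after every optimizer step so that pruned connections stay at zero.

```python
import torch
import torch.nn as nn

def apply_unstructured_masks(model: nn.Module, masks: dict) -> None:
    """Zero out individual connections in-place, given one boolean
    mask per parameter tensor (True = keep, False = prune)."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name].to(param.dtype))

# Toy example with arbitrary random masks, just to show the mechanics.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
masks = {name: torch.rand_like(p) > 0.5  # keep roughly half of the weights
         for name, p in model.named_parameters() if "weight" in name}
apply_unstructured_masks(model, masks)
```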

However, this method presents a major, fatal drawback: most frameworks and hardware cannot accelerate sparse matrix computation, meaning that no matter how many zeros you fill the parameter tensors with, it will not impact the actual cost of the network. What does impact it, however, is pruning in a way that directly alters the very architecture of the network, which any framework can handle.

Difference between unstructured (left) and structured (right) pruning: structured pruning removes both convolution filters and rows of kernels instead of just pruning connections. This leads to fewer feature maps within intermediate representations. (image by author)

1.2 — Structured pruning


This is the reason why many works have focused on pruning larger structures, such as whole neurons [36] or, their direct equivalent in the more modern deep convolutional networks, convolution filters [40, 41, 66]. Filter pruning allows for an exploitable and yet fine enough granularity, as large networks tend to include numerous convolution layers, each counting up to hundreds or thousands of filters. Not only does removing such structures result in sparse layers that can be directly instantiated as thinner ones, but doing so also eliminates the feature maps that are the outputs of these filters.

Therefore, not only are such networks lighter to store, due to fewer parameters, but they also require fewer computations and generate lighter intermediate representations, hence needing less memory at runtime. Actually, it is sometimes more beneficial to reduce bandwidth than the parameter count. Indeed, for tasks that involve large images, such as semantic segmentation or object detection, intermediate representations may be prohibitively memory-consuming, far more than the network itself. For these reasons, filter pruning is now seen as the default kind of structured pruning.

Yet, when applying such pruning, one should pay attention to the following aspects. Let’s consider how a convolution layer is built: for Cin input channels and Cout output ones, a convolution layer is made of Cout filters, each counting Cin kernels; each filter outputs one feature map and, within each filter, one kernel is dedicated to each input channel. Considering this architecture, and acknowledging that a regular convolutional network basically stacks convolution layers, when pruning whole filters one may observe that pruning a filter, and hence the feature map it outputs, actually results in pruning the corresponding kernels in the ensuing layer too. That means that, when pruning filters, one may actually prune twice the amount of parameters thought to be removed in the first place, as the sketch below illustrates.
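As a sketch of that double counting, here is a hedged example on two consecutive convolutions; it assumes plain convolutions with no batch normalization or residual connection in between, and the helper name is mine:

```python
import torch
import torch.nn as nn

def prune_filter(conv: nn.Conv2d, next_conv: nn.Conv2d, filter_idx: int):
    """Physically remove one filter from `conv` and the kernels that consumed
    its feature map in `next_conv`; return the new layers and the number of
    parameters actually removed (roughly twice the first layer's share alone)."""
    keep = [i for i in range(conv.out_channels) if i != filter_idx]

    new_conv = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                         conv.stride, conv.padding, bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep].clone()

    new_next = nn.Conv2d(len(keep), next_conv.out_channels, next_conv.kernel_size,
                         next_conv.stride, next_conv.padding,
                         bias=next_conv.bias is not None)
    new_next.weight.data = next_conv.weight.data[:, keep].clone()  # drop matching kernels
    if next_conv.bias is not None:
        new_next.bias.data = next_conv.bias.data.clone()

    removed = (sum(p.numel() for p in conv.parameters())
               + sum(p.numel() for p in next_conv.parameters())
               - sum(p.numel() for p in new_conv.parameters())
               - sum(p.numel() for p in new_next.parameters()))
    return new_conv, new_next, removed
```

In a real network, batch-normalization parameters and anything else consuming the pruned feature map (a residual branch, for instance) would have to be adjusted as well.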

Consider too that, when a whole layer happens to get pruned (which tends to happen because of layer collapse [62] but, depending on the architecture, does not always break the network), the previous layer’s outputs are now totally unconnected and hence pruned too: pruning a whole layer may actually prune all its previous layers whose outputs are not somehow connected elsewhere (through residual connections [28] or whole parallel paths [61]). Therefore, when pruning filters, one should take care to compute the exact number of actually pruned parameters. Indeed, pruning the same number of filters may, depending on their distribution within the architecture, not lead to the same actual amount of pruned parameters, making any results impossible to compare.

Before changing topic, let’s just mention that, albeit a minority, some works focus on pruning convolution kernels, intra-kernel structures [2, 24, 46] or even specific parameter-wise structures. However, such structures need special implementations to lead to any kind of speedup (as for unstructured pruning). Another kind of exploitable structure, though, is to turn convolutions into “shift layers” by pruning all but one parameter in each kernel, which can then be summed up as a combination of a shifting operation and a 1 × 1 convolution [24].

The danger of structured pruning: altering the input and output dimensions of layers can lead to discrepancies. While, on the left, both layers output the same number of feature maps, which can therefore be summed afterward, their pruned counterparts on the right produce intermediate representations of different dimensions that cannot be summed without further processing. (image by author)

2 — Pruning criteria

Once one has decided what kind of structure to prune, the next question is: “Now, how do I figure out which ones to keep and which ones to prune?”. To answer it, one needs a proper pruning criterion that will rank the relative importance of the parameters, filters or other structures.

2.1 — Weight magnitude criterion


One criterion that is quite intuitive and surprisingly effective is pruning the weights whose absolute value (or “magnitude”) is the smallest. Indeed, under the constraint of weight decay, the weights that do not contribute significantly to the function are expected to see their magnitude shrink during training. Therefore, the superfluous weights are expected to be those of lesser magnitude [8]. Notwithstanding its simplicity, the magnitude criterion is still widely used in modern works [21, 26, 58], making it a staple of the domain.
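As a sketch of how this criterion translates into pruning masks (the function name is mine, and the threshold here is computed globally over the whole network), one can simply compare each weight’s absolute value to the k-th smallest one:

```python
import torch
import torch.nn as nn

def magnitude_masks(model: nn.Module, pruning_rate: float) -> dict:
    """Boolean keep-masks that prune the `pruning_rate` fraction of weights
    with the smallest absolute value, measured over the whole model."""
    scores = torch.cat([p.detach().abs().flatten()
                        for n, p in model.named_parameters() if "weight" in n])
    k = int(pruning_rate * scores.numel())
    threshold = scores.kthvalue(k).values if k > 0 else scores.min() - 1
    return {n: p.detach().abs() > threshold
            for n, p in model.named_parameters() if "weight" in n}
```

Combined with a masking step like the one sketched in section 1.1, this is essentially the core of magnitude-based unstructured pruning.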

However, although this criterion seems trivial to implement in the case of unstructured pruning, one may wonder how to adapt it to structured pruning. One straightforward way is to order filters according to their norm (L1 or L2, for example) [40, 70]. Beyond this simple method, one may desire to encapsulate multiple sets of parameters within one measure: for example, a convolution filter, its bias and its batch-normalization parameters together, or even the corresponding filters within parallel layers whose outputs are then fused and whose channels we would like to reduce.
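For the structured case, ranking filters by their L1 norm can be sketched as follows (the function name is mine):

```python
import torch
import torch.nn as nn

def filters_to_prune(conv: nn.Conv2d, pruning_rate: float) -> list:
    """Indices of the filters with the smallest L1 norm in a convolution layer."""
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one norm per filter
    n_pruned = int(pruning_rate * conv.out_channels)
    return torch.argsort(norms)[:n_pruned].tolist()
```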

One way to do that, without having to compute the combined norm of these parameters, is to insert a learnable multiplicative parameter, or gate, for each feature map after each set of layers to prune. When reduced to zero, this gate effectively prunes the whole set of parameters responsible for this channel, and its magnitude accounts for the importance of all of them. The method hence consists in pruning the gates of lesser magnitude [36, 41].
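A minimal sketch of such a gate (module and attribute names are mine): a learnable per-channel multiplier inserted after the set of layers to prune, whose smallest-magnitude entries designate the channels, and every parameter feeding them, to remove.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Learnable multiplicative gate per feature map; a gate driven to zero
    effectively prunes the whole set of parameters behind that channel."""
    def __init__(self, num_channels: int):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, height, width)
        return x * self.gate.view(1, -1, 1, 1)

block = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1),
                      nn.BatchNorm2d(64),
                      ChannelGate(64),
                      nn.ReLU())
```

In practice, a sparsity-inducing penalty (typically L1) is often added on the gates during training so that unimportant ones are pushed toward zero before being pruned.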

2.2 — Gradient magnitude pruning


The magnitude of the weight is not the only popular criterion (or family of criteria). Actually, the other main criterion to have lasted up to now is the magnitude of the gradient. Indeed, back in the 1980s, some fundamental works [37, 53] theorized, through a Taylor decomposition of the impact of removing a parameter on the loss, that some metrics derived from the back-propagated gradient may provide a good way to determine which parameters could be pruned without damaging the network.

More modern implementations of this criterion [4, 50] accumulate gradients over a minibatch of training data and prune on the basis of the product between this gradient and the corresponding weight of each parameter. This criterion can be applied to the aforementioned gates too [49].
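Such a criterion can be sketched as follows (the function name is mine, and a single minibatch stands in for the gradient accumulation of [4, 50]): each parameter is scored by the magnitude of the product between its value and its gradient.

```python
import torch
import torch.nn as nn

def taylor_scores(model: nn.Module, inputs, targets, loss_fn) -> dict:
    """First-order, Taylor-style importance: |weight * gradient|,
    estimated here from a single minibatch for brevity."""
    model.zero_grad()
    loss_fn(model(inputs), targets).backward()
    return {name: (param.detach() * param.grad).abs()
            for name, param in model.named_parameters() if param.grad is not None}
```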

2.3 — Global or local pruning


One final aspect to take into consideration is whether the chosen criterion is applied globally, to all parameters or filters of the network, or computed independently for each layer. While global pruning has repeatedly been shown to yield better results, it can lead to layer collapse [62]. A simple way to avoid this problem, when the method used cannot prevent layer collapse, is to resort to layer-wise local pruning, namely pruning at the same rate within each layer.

Difference between local pruning (left) and global pruning (right): local pruning applies the same rate to each layer while global pruning applies it to the whole network at once. (image by author)
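In code, the difference boils down to where the threshold is computed: the global variant was sketched in section 2.1, and a layer-wise local counterpart (helper name mine) computes one threshold per parameter tensor.

```python
import torch
import torch.nn as nn

def local_magnitude_masks(model: nn.Module, pruning_rate: float) -> dict:
    """Local pruning: the same rate is applied independently within each layer."""
    masks = {}
    for name, p in model.named_parameters():
        if "weight" in name:
            scores = p.detach().abs().flatten()
            k = int(pruning_rate * scores.numel())
            threshold = scores.kthvalue(k).values if k > 0 else scores.min() - 1
            masks[name] = p.detach().abs() > threshold
    return masks
```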

3 — Pruning methods

Now that we have our pruning structure and criterion, the only choice left is the method we should use to prune the network. This is actually the topic on which the literature can be the most confusing, as each paper brings its own quirks and gimmicks, so much that one may get lost between what is methodologically relevant and what is just a specificity of a given paper.

This is why we will thematically overview some of the most popular families of methods for pruning neural networks, in an order that highlights the evolution of the use of sparsity during training.

3.1 — The classic framework: train, prune and fine-tune

The first basic framework to know is the train, prune and fine-tune method, which involves 1) training the network, 2) pruning it by setting to 0 all the parameters targeted by the pruning structure and criterion (these parameters cannot recover afterwards) and 3) training the network for a few extra epochs, at the lowest learning rate, to give it a chance to recover from the loss in performance induced by pruning. Usually, these last two steps can be iterated, with the pruning rate growing each time.
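Reusing the helpers sketched in the previous sections, the whole framework can be summarized as the loop below; the model, the data loader, the rates and the epoch counts are arbitrary placeholders, not the exact recipe of [26].

```python
import torch
import torch.nn as nn

def fine_tune(model, loader, masks, epochs, lr):
    """A few extra epochs at a low learning rate; the masks are re-applied
    after every step so that pruned weights cannot recover."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss_fn(model(inputs), targets).backward()
            optimizer.step()
            apply_unstructured_masks(model, masks)

# 1) train the dense network as usual (not shown), then iterate 2) and 3):
for rate in (0.5, 0.75, 0.9):                            # growing pruning rate
    masks = magnitude_masks(model, rate)                 # 2) prune
    apply_unstructured_masks(model, masks)
    fine_tune(model, loader, masks, epochs=10, lr=1e-3)  # 3) recover
```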

The method proposed by Han et al. [26] applies this framework, with 5 iterations between pruning and fine-tuning, to weight magnitude pruning. Iterating has been shown to improve performance, at the cost of extra computation and training time. This simple framework serves as a basis for many works [26, 40, 41, 50, 66] and can be seen as the default method upon which all the others have built.

3.2 — Extending the classic framework


While not straying too far, some methods have brought significant modifications to the aforementioned classic framework of Han et al. [26]. Gale et al. [21] have pushed the principle of iterations further by progressively removing an increasing amount of weights all along the training process, which makes it possible to benefit from the advantages of iterating while removing the whole fine-tuning step. He et al. [29] set prunable filters to 0 at each epoch, while still allowing them to learn and be updated afterward, in order to let their weights grow back after pruning while enforcing sparsity during training.
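One hedged way to implement such progressive pruning (a simple linear ramp, not necessarily the exact schedule used in [21]) is to recompute the magnitude masks at regular intervals for a growing target sparsity:

```python
def sparsity_at(step: int, total_steps: int, final_rate: float = 0.9) -> float:
    """Fraction of weights pruned at a given training step; the linear
    ramp-up here is an arbitrary choice of schedule."""
    return final_rate * min(1.0, step / total_steps)

# During training, every few hundred steps:
#   masks = magnitude_masks(model, sparsity_at(step, total_steps))
# then re-apply the masks after each optimizer step, so that sparsity grows
# progressively instead of appearing all at once at the end of training.
```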

Finally, the method of Renda et al. [58] involves fully retraining a network once it is pruned. Unlike fine-tuning, which is performed at the lowest learning rate, retraining follows the same learning-rate schedule as the original training, hence its name: “Learning Rate Rewinding”. This retraining has been shown to yield better performance than mere fine-tuning, at a significantly higher cost.

3.3 — Pruning at initialization


In order to speed up training, avoid fine-tuning and prevent any alteration of the architecture during or after training, multiple works have focused on pruning before training. In the wake of SNIP [39], many works have studied the use of the work of Le Cun et al. [37] or of Mozer and Smolensky [53] to prune at initialization [12, 64], including intensive theoretical studies [27, 38, 62]. However, Optimal Brain Damage [37] relies on multiple approximations, including an “extremal” approximation that “assumes that parameter deletion will be performed after training has converged” [37]; this fact is rarely mentioned, even among the works that are based on it. Some works have raised reservations about the ability of such methods to generate masks whose relevance outshines that of random masks with a similar per-layer distribution [20].

Another family of methods that study the relationship between pruning and initialization gravitates around the “Lottery Ticket Hypothesis” [18]. This hypothesis states that “a randomly-initialized, dense neural network contains a subnetwork that is initialized such that — when trained in isolation — it can match the test accuracy of the original network after training for at most the same number of iterations”. In practice, this literature studies how well a pruning mask, defined using an already converged network, can be applied to that network back when it was just initialized. Multiple works have expanded, stabilized or studied this hypothesis [14, 19, 45, 51, 69]. However, once again, multiple works tend to question the validity of the hypothesis and of the method used to study it [21, 42], and some even tend to show that its benefits rather come from the principle of fully training with the definitive mask instead of from a hypothetical “winning ticket” [58].

Comparison between the classic “train, prune and fine-tune” framework [26], the lottery ticket experiment [18] and learning rate rewinding [58]. (image by author)
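In code, the lottery ticket experiment amounts to something like the sketch below, reusing the earlier helpers; `train` stands for an ordinary (placeholder) training loop and the 90% pruning rate is arbitrary.

```python
import copy

# 1) Save the freshly initialized weights.
initial_state = copy.deepcopy(model.state_dict())

# 2) Train to convergence and derive a pruning mask from the converged weights.
train(model, loader)
masks = magnitude_masks(model, 0.9)

# 3) Rewind to the initial weights, apply the mask, and train the subnetwork
#    in isolation (re-applying the masks after each optimizer step).
model.load_state_dict(initial_state)
apply_unstructured_masks(model, masks)
train(model, loader)
```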

3.4 — Sparse training


The previous methods are linked by a seemingly shared underlying theme: training under sparsity constraints. This principle is at the core of a family of methods called sparse training, which consists in enforcing a constant rate of sparsity during training while its distribution varies and is progressively adjusted. Introduced by Mocanu et al. [47], it involves: 1) initializing the network with a random mask that prunes a certain proportion of the network, 2) training this pruned network for one epoch, 3) pruning a certain amount of the weights of lowest magnitude and 4) regrowing the same amount of random weights.
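A sketch of one prune-and-regrow update on a single parameter tensor (names are mine; real implementations such as [47] apply this layer-wise or globally, once per epoch):

```python
import torch

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor, n: int) -> torch.Tensor:
    """Drop the n surviving weights of smallest magnitude, then regrow n
    connections at random among the pruned ones, keeping sparsity constant."""
    flat_mask = mask.flatten().clone()
    flat_weight = weight.detach().flatten()

    # 3) prune: among surviving weights, disable the n of lowest magnitude
    alive = flat_mask.nonzero().squeeze(1)
    drop = alive[torch.argsort(flat_weight[alive].abs())[:n]]
    flat_mask[drop] = False

    # 4) regrow: re-enable n connections picked at random among the pruned ones
    dead = (~flat_mask).nonzero().squeeze(1)
    grow = dead[torch.randperm(dead.numel())[:n]]
    flat_mask[grow] = True

    return flat_mask.view_as(mask)
```

Regrown connections are typically re-initialized (often at zero), and the updated mask is re-applied to the weights after every optimizer step.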

That way, the pruning mask, at first random, is progressively adjusted to target the least important weights, while sparsity is enforced all throughout training. The sparsity level can be the same for each layer [47] or global [52]. Other methods have extended sparse training by using a certain criterion to regrow weights instead of choosing them randomly [15, 17].

Sparse training cuts and grows different weights periodically during training, which leads to a progressively adjusted pruning mask. (image by author)