

DATA SCIENCE

Stacked Autoencoders.
Extract important features from data using deep learning.
Rajas Bakshi
Jun 28, 2021 6 min read

Photo by Mika Baumeister on Unsplash

Dimensionality reduction

While solving a data science problem, have you ever come across a dataset with hundreds of features, or perhaps a thousand? If not, you may not know how challenging it can be to develop an efficient model. Dimensionality reduction, for those unfamiliar with it, is an approach to filtering the essential features from the data.

Having more input features makes the task of predicting the dependent feature more challenging. A large number of features can sometimes cause a model to perform poorly, because the model may try to find relationships between the feature vector and the output vector that are very weak or nonexistent. There are various methods for reducing the dimensions of the data, and a comprehensive guide can be found at the link below.

https://www.analyticsvidhya.com/blog/2018/08/dimensionality-reduction-techniques-python/

Principal Component Analysis (PCA)

PCA is one of the most popular approaches to dimensionality reduction. PCA helps you find a smaller set of features that captures most of the information in the data; these new features are called principal components. The first principal component is extracted so that it explains the most variation in the dataset. The second principal component, which is uncorrelated with the first, attempts to explain the remaining variation in the dataset. The third principal component tries to explain the variation that the previous two principal components cannot, and so on. Although this approach helps us reduce the dimensions, PCA is only efficient when the relationship between the dependent and independent features is linear. For a deeper understanding of PCA, visit the link below.

https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c

Autoencoder

Autoencoders are used to reduce the dimensions of data when a nonlinear function describes the relationship between the dependent and independent features. Autoencoders are a type of unsupervised artificial neural network used for automatic feature extraction from data. They are among the most promising feature extraction tools for applications such as speech recognition, self-driving cars, face alignment, and human gesture detection. The architecture of an autoencoder is shown in the figure below.
Autoencoder architecture. Source: Introduction to autoencoders.

As seen in the figure above, an autoencoder architecture is divided into three parts: the encoder, the bottleneck, and the decoder. The encoder picks out the crucial features from the data, while the decoder attempts to recreate the original data using those critical features. By retaining only the characteristics needed to reconstruct the data, autoencoders reduce the data's dimension. Autoencoders are a type of feed-forward network and may be trained using the same procedures as feed-forward networks. The output of an autoencoder is the same as its input, with some loss; thus, autoencoders are also called a lossy compression technique. Moreover, an autoencoder behaves like PCA if the encoder and decoder each consist of a single dense layer with a linear activation function.
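
As an illustration of that last point, here is a minimal sketch (not from the original article) of a one-layer linear autoencoder in Keras. The data array X, the 50-feature width, and the 10-unit bottleneck are placeholders chosen only for the example.

import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Hypothetical data: 1000 samples with 50 features, scaled to [0, 1]
X = np.random.rand(1000, 50)

inp = Input(shape=(50,))
code = Dense(10, activation="linear")(inp)   # linear bottleneck, analogous to keeping 10 principal components
out = Dense(50, activation="linear")(code)   # linear reconstruction of the input

linear_ae = Model(inputs=inp, outputs=out)
linear_ae.compile(loss="mean_squared_error", optimizer="adam")
linear_ae.fit(X, X, epochs=10, batch_size=32)   # minimizing reconstruction error, as PCA does

With linear activations and a mean-squared-error loss, such a network learns to project the data onto the same low-dimensional subspace that PCA would find.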

Stacked Autoencoder

Some datasets have a complex relationship within their features, and using only one autoencoder is not sufficient: a single autoencoder might be unable to reduce the dimensionality of the input features. For such use cases, we use stacked autoencoders. Stacked autoencoders are, as the name suggests, multiple autoencoders stacked on top of one another. A stacked autoencoder with three autoencoders stacked on top of each other is shown in the following figure.
Image by author

According to the architecture shown in the figure above, the input data is first given to autoencoder 1. The output of autoencoder 1, together with the input of autoencoder 1, is then given as input to autoencoder 2. Similarly, the output of autoencoder 2 and the input of autoencoder 2 are given as input to autoencoder 3. Because the reconstructions are concatenated with the original samples, autoencoder 3 is trained on twice as many samples as autoencoder 2. This technique also helps to solve the problem of insufficient data to some extent.
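
As a rough sketch of that bookkeeping (with made-up shapes, not taken from the article), concatenating the reconstructions with the original inputs along the sample axis doubles the number of training rows while keeping the feature dimension fixed:

import numpy as np

originals = np.random.rand(500, 4000)        # hypothetical: 500 samples, 4000 FFT points each
reconstructions = np.random.rand(500, 4000)  # stands in for autoencoder_1.predict(originals)

stacked = np.concatenate((reconstructions, originals))  # concatenates along axis 0 by default
print(stacked.shape)                          # (1000, 4000): twice the samples, same feature length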

Implementing stacked autoencoders using Python

To demonstrate a stacked autoencoder, we use the Fast Fourier Transform (FFT) of a vibration signal. FFTs of vibration signals are used for fault diagnostics and many other applications. The data has very complex patterns, and thus a single autoencoder is unable to reduce the dimensions of the data. The figure below is a plot of the FFT waveform. The amplitude of the FFT is scaled to be between 0 and 1.

Image by author

To get a better visual understanding, we reshape the signal into a 63×63 matrix and plot it (as it is a vibration signal converted to an image, take it with a grain of salt). The figure below is the image representation of the vibration signal.
Image by author

I know it isn't easy to see a lot in this image. However, we can still make out a few features. The bright white spot at approximately (0, 15) is the peak seen in the FFT of the vibration signal.
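
For reference, a minimal sketch of that scaling and plotting step; the variable fft_signal and its length are assumptions for illustration, as the article does not show this code:

import numpy as np
import matplotlib.pyplot as plt

fft_signal = np.abs(np.random.randn(3969))   # hypothetical FFT amplitudes, 63*63 points

# Scale the amplitude to [0, 1], as described above
fft_scaled = (fft_signal - fft_signal.min()) / (fft_signal.max() - fft_signal.min())

# Reshape into a 63x63 matrix and show it as an image
plt.imshow(fft_scaled.reshape(63, 63), cmap="gray")
plt.show()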

Now we can start creating our autoencoder.

import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras import regularizers

batch_size = 32
input_dim = x_train[0].shape[0]  # number of predictor variables
learning_rate = 1e-5

# Input layer
input_layer = Input(shape=(input_dim,), name="input")

# Encoder's first dense layer
encoder = Dense(2000, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(input_layer)

# Encoder's second dense layer
encoder = Dense(1000, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(encoder)

# Encoder's third dense layer
encoder = Dense(500, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(encoder)

# Code layer
encoder = Dense(200, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(encoder)

# Decoder's first dense layer
decoder = Dense(500, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(encoder)

# Decoder's second dense layer
decoder = Dense(1000, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(decoder)

# Decoder's third dense layer
decoder = Dense(2000, activation="relu",
                activity_regularizer=regularizers.l1(learning_rate))(decoder)

# Output layer
decoder = Dense(input_dim, activation="sigmoid",
                activity_regularizer=regularizers.l1(learning_rate))(decoder)

The autoencoder designed above has three dense layers on each side, in the encoder and in the decoder. Notice that the numbers of neurons in the decoder layers match those in the encoder: the decoder is a mirror image of the encoder.

The FFT signal has 4000 data points; thus, our input and output layers have 4000 neurons. As we go deeper into the network, the number of neurons decreases, until at the code layer we have only 200 neurons. This autoencoder therefore tries to reduce the number of features from 4000 to 200.
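
Once the model has been trained (as shown next), that 200-dimensional representation can be read out with an encoder-only model. The following is a sketch of one way to do it, not code from the article; it reuses the input_layer and encoder tensors defined above.

# Model that stops at the 200-neuron code layer ('encoder' refers to that layer's output above)
encoder_model = Model(inputs=input_layer, outputs=encoder)

# 200-dimensional features for the training signals
compressed_features = encoder_model.predict(x_train)
print(compressed_features.shape)  # (num_samples, 200)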

Now we build the model, compile it, and fit it on our training data. As the autoencoder's target output is the same as its input, we pass x_train as both the input and the target.

autoencoder_1 = Model(inputs=input_layer, outputs=decoder)
autoencoder_1.compile(metrics=['accuracy'], loss='mean_squared_error', optimizer='adam')
stack_1 = autoencoder_1.fit(x_train, x_train, epochs=200, batch_size=batch_size)

Once we have trained our first autoencoder, we concatenate the output and the input of the first autoencoder.

autoencoder_2_input = autoencoder_1.predict(x_train)

autoencoder_2_input = np.concatenate((autoencoder_2_input , x_train))

Now the input for autoencoder 2 is ready. Thus, we build, compile and train autoencoder 2 on our new dataset.

autoencoder_2 = Model(inputs=input_layer, outputs=decoder)
autoencoder_2.compile(metrics=['accuracy'], loss='mean_squared_error', optimizer='adam')
stack_2 = autoencoder_2.fit(autoencoder_2_input, autoencoder_2_input, epochs=100, batch_size=batch_size)

Once we have trained autoencoder 2, we move on to training our third autoencoder. As we did for the second autoencoder, the input to the third autoencoder is a concatenation of the output and the input of our second autoencoder.

autoencoder_3_input = autoencoder_2.predict(autoencoder_2_input)

autoencoder_3_input = np.concatenate((autoencoder_3_input, autoencoder_2_input))

And now, lastly, we train our third autoencoder. As we did for the last two autoencoders, we build, compile, and train it on our new data.

autoencoder_3 = Model(inputs=input_layer, outputs=decoder)
autoencoder_3.compile(metrics=['accuracy'], loss='mean_squared_error', optimizer='adam')
stack_3 = autoencoder_3.fit(autoencoder_3_input, autoencoder_3_input, epochs=50, batch_size=16)

After training our stacked autoencoder, we achieve an accuracy of approximately 90%, meaning that the stacked autoencoder can recreate the original input signal with about 90% accuracy. The original and recreated signals are shown in the image below.

Image by author
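
A comparison plot like the one above could be produced along these lines; this is a hedged sketch rather than the article's own plotting code, and the choice of sample index 0 is an assumption.

import matplotlib.pyplot as plt

# Reconstruct the training signals with the final trained model
reconstructed = autoencoder_3.predict(x_train)

# Compare one original FFT signal with its reconstruction
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(x_train[0])
ax1.set_title("Original")
ax2.plot(reconstructed[0])
ax2.set_title("Recreated")
plt.show()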

· · ·

WRITTEN BY

Rajas Bakshi

Topics:

Data Analysis Data Science Deep Learning

Dimensionality Reduction Feature Engineering
