
Batch: B4    Roll No.: 16010122812

Experiment No. 01

Grade: AA / AB / BB / BC / CC / CD / DD

Signature of the Staff In-charge with date

Title: Activation functions.


___________________________________________________________________________________

Objective: To implement activation functions of Neural Network.

___________________________________________________________________________________

Expected Outcome of Experiment:

CO1 : Identify and describe soft computing techniques and their roles
____________________________________________________________________________________
Books/ Journals/ Websites referred:

• J.S.R. Jang, C.T. Sun and E. Mizutani, “Neuro-Fuzzy and Soft Computing”, PHI / Pearson Education, 2004.
• David E. Goldberg, “Genetic Algorithms: Search, Optimization and Machine Learning”, Addison-Wesley, N.Y., 1989.
____________________________________________________________________________________
Pre Lab/ Prior Concepts:

Neural networks, sometimes referred to as connectionist models, are parallel-distributed models that
have several distinguishing features:
1) A set of processing units;
2) An activation state for each unit, which is equivalent to the output of the unit;
3) Connections between the units. Generally each connection is defined by a weight wjk that
determines the effect that the signal of unit j has on unit k;
4) A propagation rule, which determines the effective input of the unit from its external inputs;
5) An activation function, which determines the new level of activation based on the effective
input and the current activation;
6) An external input (bias, offset) for each unit;
7) A method for information gathering (learning rule);

8) An environment within which the system can operate, provide input signals and, if necessary,
error signals.
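
To make items 4) and 5) above concrete, the short sketch below is illustrative only (it is not part of the prescribed experiment, and the function and variable names are our own): it computes a single unit's effective input as a weighted sum plus a bias, then applies an activation function to obtain the unit's activation.

import numpy as np

def net_input(signals, weights, bias):
    # Propagation rule: effective input = weighted sum of incoming signals plus the bias
    return np.dot(weights, signals) + bias

def unit_activation(signals, weights, bias, activation=np.tanh):
    # Activation function maps the effective input to the unit's new activation (its output)
    return activation(net_input(signals, weights, bias))

signals = np.array([0.5, -1.0, 2.0])   # outputs of units j feeding unit k
weights = np.array([0.8, 0.2, -0.4])   # connection weights wjk
bias = 0.1                             # external input (bias / offset)
print(unit_activation(signals, weights, bias))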
____________________________________________________________________________________
Implementation Details:

Most units in a neural network transform their net inputs by using a scalar-to-scalar function called an
activation function, yielding a value called the unit's activation. Except possibly for output units, the
activation value is fed to one or more other units. Activation functions with a bounded range are
often called squashing functions. Some of the most commonly used activation functions are: -

1) Identity function: -
In soft computing, the identity function is a fundamental concept used in various areas such as
artificial intelligence, machine learning, and neural networks. The identity function is a simple
mathematical function that takes an input and returns the same value as output. In other words, the
function maps each input element to itself without any alteration.

The identity function is commonly denoted as "f(x) = x" or "y = x," where 'x' represents the input
value, and 'y' represents the output value. Regardless of the value of 'x', the output 'y' will always be
the same as 'x' in the identity function.

Mathematically, the identity function can be represented as:

f(x) = x

The identity function is particularly useful in various soft computing algorithms and techniques,
including neural networks. In neural networks, the identity function is often used as the activation
function in some layers to maintain the linearity of the data transformation. For instance, in a multi-
layer perceptron (MLP) neural network, the identity function can be used in the output layer to
perform regression tasks, where the model is required to predict continuous values without any
specific transformation.

In summary, the identity function in soft computing is a simple and essential concept that helps
maintain the original values of inputs, playing a significant role in various applications within the
field of artificial intelligence and machine learning.

2) Binary step function: -


The binary step function is a type of activation function used in soft computing, particularly in
artificial neural networks and other machine learning models. It is one of the simplest activation
functions, and its output is binary, i.e., it takes only two possible values.

The binary step function is defined as follows:



f(x) = {
0, if x < 0
1, if x >= 0
}

In other words, when the input 'x' is less than zero, the function returns a value of 0, and when the
input is greater than or equal to zero, the function returns a value of 1. This function creates a hard
threshold, where any input below zero is mapped to 0, and any input equal to or above zero is
mapped to 1.

The binary step function is commonly used in binary classification problems, where the goal is to
categorize inputs into two classes (e.g., positive and negative, yes and no, etc.). The function acts as
a simple decision maker, assigning one of the two classes to the input based on the threshold at zero.

However, one limitation of the binary step function is that it is not differentiable at the threshold
point (x = 0). This lack of differentiability makes it unsuitable for some optimization algorithms used
in training neural networks, such as gradient descent. As a result, the binary step function is not
commonly used in modern neural networks for training purposes.

Instead, other activation functions like the sigmoid function, ReLU (Rectified Linear Unit), or the
softmax function are preferred in most cases because they are differentiable and can better facilitate
the learning process during training. Nevertheless, the binary step function remains an essential
concept in the study of activation functions and their role in soft computing and artificial neural
networks.

3) Sigmoid function: -
The sigmoid function is a commonly used activation function in soft computing, especially in
artificial neural networks and logistic regression. It is valued for its smooth and continuously
differentiable nature, which allows for efficient optimization during training.

The standard sigmoid function is also known as the logistic function and is defined as follows:

f(x) = 1 / (1 + exp(-x))

In this equation, 'x' is the input value, 'exp' represents the exponential function, and 'f(x)' is the output
value of the sigmoid function.

Properties of the sigmoid function:


1. Range: The output of the sigmoid function lies between 0 and 1, which is useful for problems
where we want to interpret the output as a probability. The function maps any real-valued input to a
probability value in the range [0, 1].

2. S-shaped curve: The sigmoid function has an S-shaped curve, which means that the output
increases monotonically as the input increases. This property helps in introducing non-linearity to the
neural network, allowing it to learn complex relationships in the data.

3. Differentiability: The sigmoid function is differentiable for all input values, making it suitable for
optimization algorithms like gradient descent during the training process. This differentiability
allows the model to update its weights and learn from the training data effectively.

4. Saturated behavior: The sigmoid function can suffer from vanishing gradients when the input
values become very large or very small. This can lead to slow or stalled learning during training in
deep neural networks, known as the vanishing gradient problem.

While the sigmoid function was commonly used as the default activation function in the early days
of neural networks, it has some limitations, particularly the vanishing gradient problem. As a result,
other activation functions like ReLU (Rectified Linear Unit) and its variants have gained popularity
in modern deep learning architectures. These alternative functions help mitigate the vanishing
gradient problem and facilitate faster convergence during training.
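
To illustrate the saturation described in point 4 above, the short sketch below (illustrative, not part of the prescribed code) evaluates the sigmoid derivative f'(x) = f(x)(1 - f(x)) at a few points; the derivative peaks at 0.25 for x = 0 and becomes vanishingly small for large |x|.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # f'(x) = f(x) * (1 - f(x)); this is the factor backpropagation multiplies gradients by
    s = sigmoid(x)
    return s * (1 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}  derivative = {sigmoid_derivative(x):.6f}")
# Expected: 0.250000, 0.104994, 0.006648, 0.000045 - the gradient vanishes as |x| grows.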

However, the sigmoid function is still valuable in certain scenarios, such as binary classification
problems or as an output activation for models where the outputs are interpreted as probabilities.

4) Bipolar sigmoid function: -


The bipolar sigmoid function, also known as the hyperbolic tangent function or tanh function, is
another type of sigmoid activation function used in soft computing and artificial neural networks. It
is an extension of the standard sigmoid function and is valued for its ability to map input values to a
range from -1 to 1, providing a center around zero.

The bipolar sigmoid function is defined as follows:

f(x) = (2 / (1 + exp(-2x))) - 1

In this equation, 'x' represents the input value, 'exp' denotes the exponential function, and 'f(x)' is the
output value of the bipolar sigmoid function.
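
As a quick check of the tanh relationship mentioned above, the following sketch (illustrative only) verifies numerically that 2 / (1 + exp(-2x)) - 1 and the hyperbolic tangent agree.

import numpy as np

def bipolar_sigmoid(x):
    return (2 / (1 + np.exp(-2 * x))) - 1

x = np.linspace(-5, 5, 11)
print(np.allclose(bipolar_sigmoid(x), np.tanh(x)))  # prints True: the two forms coincide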

Properties of the bipolar sigmoid function:


1. Range: The output of the bipolar sigmoid function lies between -1 and 1, which makes it suitable
for problems where the input data is centered around zero. This range provides a more balanced
output for both positive and negative input values.

2. S-shaped curve: Similar to the standard sigmoid function, the bipolar sigmoid function also has
an S-shaped curve, introducing non-linearity to neural networks and allowing them to learn complex
patterns in the data.

3. Differentiability: The bipolar sigmoid function is differentiable for all input values, facilitating
efficient optimization during the training process using methods like gradient descent.

4. Centered at zero: One of the main advantages of the bipolar sigmoid function over the standard
sigmoid function is that it is centered at zero. This means that when the input is zero, the output of
the function is also zero, making it more suitable for data with mean-centered distributions.

The bipolar sigmoid function is commonly used as an activation function in neural networks,
particularly in hidden layers. It helps to introduce non-linearity and allows the network to learn and
approximate complex functions effectively.

While the bipolar sigmoid function has some benefits, it still suffers from the vanishing gradient
problem when input values are very large or very small. In deep neural networks, this issue can
hinder learning and slow down training.

5) ReLU function: -


The Rectified Linear Unit (ReLU) function is a popular activation function used in artificial neural
networks and deep learning models. It is a simple and computationally efficient function that
introduces non-linearity to the network, allowing it to learn complex relationships in the data.

The ReLU function is defined as follows:

f(x) = max(0, x)

In this equation, 'x' represents the input value, 'max' denotes the maximum function, and 'f(x)' is the
output value of the ReLU function.

Properties of the ReLU function:


1. Non-linearity: The ReLU function is a non-linear activation function. It returns the input value 'x'
if 'x' is non-negative (greater than or equal to zero) and returns zero if 'x' is negative. This non-linearity
enables neural networks to approximate more complex functions and better capture intricate
patterns in the data.

2. Sparsity: One interesting property of the ReLU function is that it introduces sparsity in the
network. When the input is negative, the output is zero, effectively "deactivating" the neuron and
making it less computationally expensive during forward and backward propagation. This sparsity
can lead to more efficient and faster training of deep neural networks.

3. Avoiding vanishing gradient: The ReLU function helps mitigate the vanishing gradient problem,
which can occur when using sigmoid or tanh functions. This issue happens because these traditional
sigmoidal functions saturate for large positive and negative inputs, resulting in very small gradients.
The ReLU function, on the other hand, avoids this saturation for positive inputs, leading to more
stable and faster convergence during training.

However, the ReLU function also has a potential drawback known as the "dying ReLU" problem.
This occurs when a neuron's weighted input stays negative (for example, after a large weight update),
so the neuron always outputs zero. If a neuron gets stuck in this state, it receives zero gradient,
becomes inactive and does not contribute to the learning process. To address this
problem, variants of the ReLU function have been proposed, such as the Leaky ReLU and
Parametric ReLU, which allow small negative values to pass through to prevent neurons from
dying.
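
As one example of such a variant, the sketch below (illustrative; alpha is a commonly used small slope, not a value prescribed by this experiment) implements a Leaky ReLU, which keeps a small non-zero output and gradient for negative inputs so that neurons are less likely to die.

import numpy as np

def leaky_relu(x, alpha=0.01):
    # Pass non-negative inputs through unchanged; scale negative inputs by a small slope alpha
    return np.where(x >= 0, x, alpha * x)

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(leaky_relu(x))  # [-0.03 -0.01  0.    2.  ]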

In summary, the ReLU function is a widely used activation function in deep learning due to its
simplicity, non-linearity, and effectiveness in mitigating the vanishing gradient problem. While it
may have some limitations, various modifications and variants have been introduced to address
those issues and make the ReLU family of functions an essential component of modern neural
network architectures.

1. Identity function: -
Code: -
import numpy as np
import matplotlib.pyplot as plt

# Identity activation function
def identity_activation(x):
    return x


# Generate input values
input_values = np.linspace(-10, 10, 20)

# Apply identity activation function to input values
output_values = identity_activation(input_values)

# Plotting
plt.figure(figsize=(8, 6))
plt.plot(input_values, output_values, label='Identity Function', color='blue')
plt.xlabel('Input')
plt.ylabel('Output')
plt.title('Identity Activation Function')
plt.legend()
plt.grid(True)
plt.show()

# Print the output values
print("Input Values:\n", input_values)
print("\nOutput Values:\n", output_values)

Outputs: -


2. Binary Step function: -


Code: -
import numpy as np
import matplotlib.pyplot as plt

# Binary step activation function
def binary_step(x):
    return np.where(x < 0, 0, 1)

# Input values
input_values = np.array([-3, -1, 0, 2, 4])

# Applying the binary step activation function to the input values
output_values = binary_step(input_values)

# Printing the output values
print("Input Values:", input_values)
print("Output Values:", output_values)

# Plotting the input and output values
plt.figure(figsize=(8, 6))
plt.plot(input_values, output_values, marker='o')
plt.title("Binary Step Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.xticks(input_values)
plt.yticks([0, 1])
plt.grid(True)

plt.show()

Outputs: -

3. Sigmoid function: -
Code: -
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Generate an array of input values
input_values = np.linspace(-10, 10, 20)  # Creating 20 values between -10 and 10

# Apply the sigmoid function to the input values
output_values = sigmoid(input_values)

# Print the input and output values
for x, y in zip(input_values, output_values):
    print(f"Input: {x:.2f}, Output: {y:.4f}")

# Plot the input values against the output values
plt.figure(figsize=(8, 6))
plt.plot(input_values, output_values, label='Sigmoid Function')
plt.title('Sigmoid Activation Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.grid()
plt.show()

Outputs: -


4. Bipolar sigmoid function: -


Code: -
import numpy as np
import matplotlib.pyplot as plt

def bipolar_sigmoid(x):
    return (2 / (1 + np.exp(-2 * x))) - 1

# Create an array of input values
input_values = np.linspace(-10, 10, 50)

# Apply the bipolar sigmoid activation function to the input values
output_values = bipolar_sigmoid(input_values)

# Print the input and output values
for i in range(len(input_values)):
    print(f"Input: {input_values[i]:.2f}\tOutput: {output_values[i]:.2f}")

# Plot the input and output values
plt.figure(figsize=(8, 6))
plt.plot(input_values, output_values, label='Bipolar Function')
plt.title('Bipolar Sigmoid Activation Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.grid()
plt.show()

Outputs: -


5. ReLU function: -
Code: -
import numpy as np
import matplotlib.pyplot as plt

# Define the ReLU activation function
def relu(x):
    return np.maximum(0, x)

# Input values as an array
input_values = np.array([-2, -1, 0, 1, 2, 3, 4, 5])

# Apply ReLU activation function to the input values
output_values = relu(input_values)

# Print the input and output values
for i in range(len(input_values)):
    print(f"Input: {input_values[i]}, Output: {output_values[i]}")

# Plot the input and output values
plt.figure(figsize=(8, 6))
plt.plot(input_values, output_values, marker='o')
plt.xlabel('Input')
plt.ylabel('Output')
plt.title('ReLU Activation Function')
plt.grid(True)
plt.show()

Outputs: -


Conclusion: - Thus, we have successfully implemented five activation functions of a Neural Network: identity, binary step, sigmoid, bipolar sigmoid, and ReLU.

Post Lab Descriptive Questions: -

1. Explain the concept behind using Activation function.


Answer: -
Activation functions play a crucial role in artificial neural networks and deep learning models. They
introduce non-linearity to the network, which allows it to learn and approximate complex
relationships in the data. The concept behind using activation functions can be understood from the
following points:

1. Introducing non-linearity: Without activation functions, the neural network would be limited to
performing only linear transformations of the input data. A combination of linear operations would
still result in a linear operation. However, real-world data is often non-linear, and complex patterns
cannot be represented by a series of linear transformations alone. Activation functions introduce non-
linearity to the network, enabling it to learn and represent non-linear relationships between the input
and output.

2. Learning complex representations: By introducing non-linearity, activation functions enable neural
networks to learn complex features and representations from the data. These features are learned in
the hidden layers of the network, where each neuron applies the activation function to its weighted
sum of inputs. The activation function decides whether a neuron should "fire" (activate) or remain
inactive based on its input, thus capturing higher-order patterns and abstractions.

3. Decision-making and output mapping: In many tasks, neural networks are used for decision-
making, such as classifying objects in an image or predicting a target value. Activation functions play
a crucial role in these tasks by mapping the output of the neurons to specific ranges or binary values,
making them suitable for classification or regression problems.

4. Gradient propagation during backpropagation: During the training process, the neural network
adjusts its weights and biases to minimize the error between predicted and actual output. Activation
functions also influence how gradients are propagated backward through the network during the
backpropagation algorithm. Different activation functions have different gradient properties, which
can impact the speed and stability of learning. Activation functions like ReLU help alleviate the
vanishing gradient problem, which can occur in deeper networks with traditional sigmoidal
activation functions.


5. Computational efficiency: Activation functions should also be computationally efficient, as they
are applied element-wise to the neurons' outputs. Simple and efficient activation functions enable
faster training and inference times for the neural network.

In summary, activation functions are an essential part of neural networks because they introduce non-
linearity, allowing the network to learn complex patterns and make decisions. They play a vital role
in the learning process, ensuring that the network can effectively learn from data, generalize to new
examples, and perform well on various tasks, including classification, regression, and other machine
learning problems.
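
As an illustration of the non-linearity argument above, the small sketch below (with arbitrary random weights, purely for demonstration) shows that two linear layers applied one after the other collapse into a single equivalent linear layer, which is why an activation function is needed between layers.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # first "layer" weights
W2 = rng.standard_normal((2, 4))   # second "layer" weights
x = rng.standard_normal(3)         # an input vector

two_layers_no_activation = W2 @ (W1 @ x)   # stacking linear layers without an activation
single_linear_layer = (W2 @ W1) @ x        # one linear layer with the combined weight matrix
print(np.allclose(two_layers_no_activation, single_linear_layer))  # True: depth adds nothing here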

2. Explain the different properties of activation functions.


Answer: -
Activation functions used in neural networks have various properties that influence their performance
and behavior. Understanding these properties is essential for selecting the appropriate activation
function for a specific task. Here are the different properties of activation functions:

1. Non-linearity: One of the fundamental properties of activation functions is non-linearity. Non-
linear activation functions introduce non-linearities to the network, enabling it to learn and
approximate complex relationships in the data. Without non-linearity, the neural network would be
limited to performing only linear transformations of the input data, which is insufficient for modeling
real-world data with complex patterns.

2. Range of output: The range of the activation function's output refers to the values the function can
take as an output. Ideally, an activation function should map the input to a specific range that is
suitable for the task at hand. For example, the sigmoid and softmax functions map the input to the
range (0, 1), making them suitable for classification tasks where outputs represent probabilities. On
the other hand, ReLU and its variants map the input to the range [0, ∞), which is useful for most
hidden layers of a neural network.

3. Continuity and Differentiability: Activation functions should be continuous and differentiable (or
at least piecewise differentiable) for efficient gradient-based optimization algorithms like
backpropagation to work effectively. The continuity ensures that small changes in input produce
small changes in output, while differentiability is essential for calculating gradients during the
training process. Functions like ReLU are continuous but not differentiable at zero; in practice this
single point is handled by fixing the derivative there (to 0 or 1), and leaky or smoother variants are also used.

4. Sparsity: Some activation functions, like ReLU, introduce sparsity in the network. When the input
is negative, the output is zero, effectively "deactivating" the neuron. This sparsity can lead to more
efficient and faster training of deep neural networks, as fewer neurons need to be updated during
backpropagation.


5. Vanishing and Exploding Gradients: The problem of vanishing and exploding gradients refers to
the issue of gradients becoming extremely small or extremely large during the training process.
Activation functions can influence the occurrence of these problems. For example, the sigmoid and
tanh functions can suffer from the vanishing gradient problem when inputs are very large or very
small. ReLU helps mitigate the vanishing gradient problem, but it can still lead to the exploding
gradient problem for very high learning rates.

6. Computational Efficiency: Activation functions should be computationally efficient, especially
when applied to large-scale datasets and deep neural networks. Efficient activation functions help
speed up training and inference times.

7. Saturation and Zero-Centering: Activation functions like sigmoid and tanh can suffer from
saturation when inputs are very large or very small, leading to vanishing gradients. Additionally, the
sigmoid is not zero-centered, which may cause optimization challenges in certain scenarios; tanh is
zero-centered but still saturates. ReLU and its variants avoid saturation for positive inputs, although
the standard ReLU output is not zero-centered either.

In summary, understanding the properties of activation functions is crucial for selecting appropriate
activation functions for specific tasks and designing effective neural network architectures. Different
activation functions possess unique characteristics that can impact the performance, stability, and
convergence of the neural network during training and inference.
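
As a companion to point 2 above, which mentions the softmax function, the sketch below (with illustrative scores, not part of the prescribed experiment) shows a numerically stable softmax that maps a vector of scores to values in (0, 1) summing to 1, so they can be read as class probabilities.

import numpy as np

def softmax(x):
    shifted = x - np.max(x)      # subtract the maximum for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs, probs.sum())        # approximately [0.659 0.242 0.099] and 1.0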

Date: - 06-08-2023 Signature of faculty in-charge
