DL Unit 6
Abbreviations
NN: Neural Network
📌 1.2. Architecture
A deep neural network consists of an input layer, multiple hidden layers, and an output layer.
Activation Functions:
o ReLU (Rectified Linear Unit): Does not saturate for positive inputs, which helps in faster convergence.
o Sigmoid/Tanh: Useful for probabilistic outputs but suffer from the vanishing gradient problem.
o Softmax: For multi-class classification in the output layer.
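A minimal NumPy sketch of the activation functions listed above; the function names and the toy input are illustrative choices, not taken from any textbook:

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x); does not saturate for positive inputs
    return np.maximum(0.0, x)

def sigmoid(x):
    # Sigmoid: squashes values into (0, 1); saturates for large |x|
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Tanh: squashes values into (-1, 1); zero-centred but also saturates
    return np.tanh(x)

def softmax(z):
    # Softmax: turns a score vector into a probability distribution over classes
    z = z - np.max(z)              # subtract the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("relu:   ", relu(x))
print("sigmoid:", sigmoid(x))
print("tanh:   ", tanh(x))
print("softmax:", softmax(x))
```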
📌 1.3. Forward and Backward Propagation
Forward Propagation: Computes the network output from the input by passing activations through the layers.
Backward Propagation: Propagates the gradient of the loss backward through the layers and updates the weights using Gradient Descent to minimize the loss function.
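A minimal NumPy sketch of one training loop with explicit forward and backward passes on a one-hidden-layer network; the XOR toy data, layer sizes, learning rate, and epoch count are assumptions made only to keep the example self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset (XOR), chosen only so the example runs end to end
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer of 4 tanh units, one sigmoid output unit
W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(5000):
    # ---- forward propagation: input -> hidden -> output ----
    h = np.tanh(X @ W1 + b1)          # hidden activations
    y_hat = sigmoid(h @ W2 + b2)      # predicted probabilities

    # Binary cross-entropy loss
    loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

    # ---- backward propagation: apply the chain rule layer by layer ----
    dZ2 = (y_hat - y) / len(X)        # gradient at the output pre-activation
    dW2 = h.T @ dZ2
    db2 = dZ2.sum(axis=0)
    dH  = dZ2 @ W2.T                  # gradient flowing back into the hidden layer
    dZ1 = dH * (1.0 - h ** 2)         # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)

    # ---- gradient descent update ----
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", loss)
print("predictions:", y_hat.round(3).ravel())
```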
3.4 Regularization
📌 2.1. Need for Regularization
Regularization techniques are used to reduce overfitting by introducing a penalty on complex models, encouraging simpler models with better generalization.
📌 2.2. Types of Regularization
L1 Regularization (Lasso): Adds a penalty proportional to the absolute value of the weights. Promotes sparsity.
L2 Regularization (Ridge): Adds a penalty proportional to the square of the weights. Helps in weight decay. (Both penalties are illustrated in the first sketch after this list.)
Dropout: Randomly drops neurons during training, forcing the network to avoid over-reliance on specific paths (see the dropout sketch after this list).
Early Stopping: Monitors the validation loss and stops training when it stops improving (see the early-stopping sketch after this list).
Data Augmentation: Introduces variability in the training data (e.g., image rotations, flips) to improve robustness (see the augmentation sketch after this list).
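A short sketch of how L1 and L2 penalties can be added to an existing data loss; the helper name regularized_loss and the penalty strengths are illustrative assumptions:

```python
import numpy as np

def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    # Adds an L1 penalty (sum of |w|, promotes sparsity) and/or an L2 penalty
    # (sum of w^2, shrinks weights) to the data loss; l1 and l2 are hyperparameters.
    l1_penalty = l1 * sum(np.sum(np.abs(W)) for W in weights)
    l2_penalty = l2 * sum(np.sum(W ** 2) for W in weights)
    return data_loss + l1_penalty + l2_penalty

# Two weight matrices standing in for a small network
weights = [np.array([[0.5, -1.0], [0.0, 2.0]]), np.array([[1.5], [-0.5]])]
print(regularized_loss(0.30, weights, l1=0.01))   # Lasso-style penalty
print(regularized_loss(0.30, weights, l2=0.01))   # Ridge-style penalty
```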
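A short sketch of inverted dropout, one common way dropout is implemented; the drop probability and the toy activations are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, training=True):
    # During training, zero a random fraction p_drop of the units and rescale the
    # survivors by 1 / (1 - p_drop) so the expected activation is unchanged at test time.
    if not training or p_drop == 0.0:
        return activations
    mask = (rng.random(activations.shape) >= p_drop) / (1.0 - p_drop)
    return activations * mask

h = np.ones((2, 6))                # pretend hidden-layer activations
print(dropout(h, p_drop=0.5))      # roughly half the units zeroed, the rest scaled by 2
print(dropout(h, training=False))  # unchanged at inference time
```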
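A short sketch of early-stopping logic driven by a patience counter; the simulated validation-loss curve is made up purely for illustration:

```python
def early_stopping(val_losses, patience=3):
    # Stop once the validation loss has not improved for `patience` consecutive epochs;
    # report the stopping epoch and the epoch whose weights should be restored.
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch, best
    return len(val_losses) - 1, best_epoch, best

# Simulated curve: the loss improves, then the model starts to overfit
val_losses = [0.90, 0.62, 0.48, 0.41, 0.39, 0.40, 0.43, 0.47, 0.52]
stop, best_epoch, best = early_stopping(val_losses, patience=3)
print(f"stop at epoch {stop}, restore weights from epoch {best_epoch} (val loss {best})")
```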
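A short sketch of simple image augmentation with NumPy (horizontal flips and 90-degree rotations); the flip probability and the tiny stand-in image are illustrative, and a real pipeline would add crops, small-angle rotations, colour jitter, etc.:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    # Randomly flip the image horizontally, then rotate by a random multiple of 90 degrees
    if rng.random() < 0.5:
        image = np.fliplr(image)
    return np.rot90(image, k=int(rng.integers(0, 4)))

image = np.arange(9).reshape(3, 3)   # stand-in for a tiny grayscale image
print(augment(image))
```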