
II SEMESTER M.TECH (DATA SCIENCE) MAKEUP EXAMINATIONS, JULY 2023

SUBJECT: DEEP LEARNING [DSE 5251]
REVISED CREDIT SYSTEM
Time: 3 hours        Date: 06/07/2023        MAX. MARKS: 50

Instructions to Candidates:
❖ Answer ALL the questions.
❖ Missing data, if any, may be suitably assumed.

Marks (M), Course Outcome (CO) and Bloom's Level (BL) are shown against each question as [M | CO | BL].

1A. With necessary graphs and equations, explain the following for neural networks:
    i) Sigmoid activation function
    ii) ReLU activation function
    iii) Bias-variance trade-off
    iv) Momentum-based gradient descent
    [5 M | CO1 | BL 2]
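As a quick reference for the definitions asked about in 1A, a minimal NumPy sketch of the sigmoid and ReLU functions and one momentum-based update step (the function names and hyperparameter values here are illustrative choices, not part of the question):

    import numpy as np

    def sigmoid(z):
        # squashes any real input into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def relu(z):
        # passes positive inputs through, zeroes out negative ones
        return np.maximum(0.0, z)

    def momentum_update(w, v, grad, lr=0.01, beta=0.9):
        # momentum-based gradient descent: v accumulates an exponentially
        # decaying sum of past gradients; the weights move along v
        v = beta * v - lr * grad
        return w + v, v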

1B. You are training a neural network on the ImageNet dataset, and you are thinking of using gradient descent as your optimization function. Which of the following statements is true? Justify your answer.
    i) It is possible for Stochastic Gradient Descent to converge faster than Batch Gradient Descent.
    ii) It is possible for Mini-Batch Gradient Descent to converge faster than Stochastic Gradient Descent.
    [3 M | CO3 | BL 3]

1C. In a Deep Neural Network, explain why a constant learning rate is not ideal. What are the learning rate schedule strategies that can be adopted?
    [2 M | CO1 | BL 3]
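Two common learning rate schedules relevant to 1C, sketched as plain Python (the decay constants are arbitrary illustrative values):

    import math

    def step_decay(lr0, epoch, drop=0.5, every=10):
        # piecewise-constant schedule: multiply the rate by `drop` every `every` epochs
        return lr0 * (drop ** (epoch // every))

    def exponential_decay(lr0, epoch, k=0.05):
        # smooth exponential decay of the initial rate lr0
        return lr0 * math.exp(-k * epoch)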

2A. Explain the training procedure in a Neural Network containing an input layer with 2 neurons, 1 hidden layer with 3 neurons and an output layer with 2 neurons. Choose appropriate activation and loss functions with necessary justification. Assume random values for inputs, weights and biases.
    [5 M | CO2 | BL 3]
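A minimal NumPy sketch of one forward and backward pass for the 2-3-2 network in 2A, assuming a sigmoid hidden layer with a softmax output and cross-entropy loss (one reasonable choice, not the only valid one); inputs, weights and biases are drawn at random as the question allows:

    import numpy as np

    rng = np.random.default_rng(0)
    x  = rng.normal(size=2)                           # input layer: 2 neurons
    W1 = rng.normal(size=(3, 2)); b1 = np.zeros(3)    # hidden layer: 3 neurons
    W2 = rng.normal(size=(2, 3)); b2 = np.zeros(2)    # output layer: 2 neurons
    y  = np.array([1.0, 0.0])                         # one-hot target for 2 classes

    # forward pass
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))          # sigmoid hidden activations
    logits = W2 @ h + b2
    p = np.exp(logits - logits.max()); p /= p.sum()   # softmax probabilities
    loss = -np.sum(y * np.log(p))                     # cross-entropy loss

    # backward pass (chain rule) and one gradient-descent step
    dlogits = p - y
    dW2, db2 = np.outer(dlogits, h), dlogits
    dz1 = (W2.T @ dlogits) * h * (1 - h)
    dW1, db1 = np.outer(dz1, x), dz1
    lr = 0.1
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2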

2B. Explain the following weight initialization strategies in neural networks:
    i) Unsupervised Pre-training.
    ii) Xavier and He Initialization.
    [3 M | CO1 | BL 2]
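A short NumPy sketch of the two initializers named in 2B(ii), assuming the usual Glorot/He variance formulas:

    import numpy as np

    def xavier_init(fan_in, fan_out):
        # Xavier/Glorot: variance 2 / (fan_in + fan_out), suited to tanh/sigmoid units
        return np.random.randn(fan_out, fan_in) * np.sqrt(2.0 / (fan_in + fan_out))

    def he_init(fan_in, fan_out):
        # He: variance 2 / fan_in, suited to ReLU units
        return np.random.randn(fan_out, fan_in) * np.sqrt(2.0 / fan_in)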

2C. How does batch normalization improve the performance of a deep learning model?
    [2 M | CO1 | BL 2]
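For 2C, a minimal sketch of the batch-normalization forward pass over a mini-batch of shape (N, D) (training-time statistics only; the running averages used at inference are omitted):

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        # normalize each feature over the mini-batch, then rescale and shift
        mu, var = x.mean(axis=0), x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma * x_hat + beta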

3A. Consider a Convolutional Neural Network (CNN) denoted by the layers in the first column of Table Q3A (given below). The CNN is being used to classify a given image into any one of 10 classes. Complete Table Q3A by filling in the shape of the output volume (activation shape) and the number of parameters at each layer (including the bias). You can write the activation shape in the format H x W x C, where H, W and C are the height, width and channel dimensions, respectively. Clearly explain all the computations.
    [5 M | CO2 | BL 3]

Notations:

* CONV F-K denotes a convolutional layer with K filters with height and width
equal to F.

* Pool-R denotes an R x R max-pooling layer with stride of R and 0-padding.

* FLATTEN flattens its inputs.

* FC-N denotes a fully-connected layer with N neurons.

* Softmax denotes the output layer.


Table Q3A:

    Layer                              Activation Shape    Number of Parameters
    --------------------------------------------------------------------------
    Input                              16 x 16 x 5
    CONV 5-6 (Padding=2, Stride=1)
    ReLU
    Pool-2
    CONV 5-16 (Padding=0, Stride=1)
    ReLU
    Pool-2
    FLATTEN
    FC-120
    FC-84
    Softmax
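The computations asked for in 3A follow from the standard output-size and parameter-count formulas; a small sketch, applied here only to the first CONV layer of Table Q3A as an illustration:

    def conv_output_size(n, f, p, s):
        # spatial output size of a convolution or pooling layer
        return (n + 2 * p - f) // s + 1

    def conv_params(f, c_in, k):
        # each of the K filters has F*F*C_in weights plus one bias
        return (f * f * c_in + 1) * k

    # CONV 5-6 on the 16 x 16 x 5 input with padding 2, stride 1:
    print(conv_output_size(16, 5, 2, 1))   # 16  -> activation shape 16 x 16 x 6
    print(conv_params(5, 5, 6))            # 756 parameters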



3B. Explain the following in the context of CNNs:
    i) Sparse Connectivity
    ii) Shared Weights
    iii) Equivariance to translation
    [3 M | CO3 | BL 3]

3C. Explain the role of skip connections in the ResNet CNN architecture.
    [2 M | CO2 | BL 2]
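A minimal sketch of the skip connection referred to in 3C: the block learns a residual mapping F(x) and adds the input back through an identity path (the residual mapping below is a stand-in for the usual pair of convolutional layers):

    import numpy as np

    def residual_block(x, residual_fn):
        # output = ReLU(F(x) + x); the identity path lets gradients flow through unchanged
        return np.maximum(0.0, residual_fn(x) + x)

    y = residual_block(np.ones(4), lambda x: 0.1 * x)   # toy shape-preserving residual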

4A. For each of the following tasks, design an encoder-decoder based model, draw a neat diagram and explain the computations:
    i) Given a video as input, produce a caption for the video as an output by understanding the action and the event in the video.
    ii) Given an image and a query based on the image as input, produce an answer to the query as an output.
    [5 M | CO2 | BL 2]

4B. With the help of a neat diagram and necessary computations, explain the working of LSTM-based recurrent neural networks.
    [3 M | CO2 | BL 2]
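For 4B, a minimal NumPy sketch of a single LSTM time step with the gate parameters stacked into W, U and b (a common but not the only parameterization):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        H = h_prev.shape[0]
        z = W @ x_t + U @ h_prev + b        # stacked gate pre-activations, shape (4H,)
        i = sigmoid(z[0:H])                 # input gate
        f = sigmoid(z[H:2*H])               # forget gate
        o = sigmoid(z[2*H:3*H])             # output gate
        g = np.tanh(z[3*H:4*H])             # candidate cell state
        c_t = f * c_prev + i * g            # new cell state
        h_t = o * np.tanh(c_t)              # new hidden state
        return h_t, c_t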

4C. Explain the use of Truncated Backpropagation Through Time in the context of recurrent neural network training.
    [2 M | CO3 | BL 2]

5A. With the help of a neat diagram and necessary computations, explain how the encoder part of the Transformer architecture is used for machine translation.
    [5 M | CO2 | BL 2]
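The core computation inside each Transformer encoder layer in 5A is scaled dot-product self-attention; a minimal NumPy sketch (Q, K and V are linear projections of the same source embeddings, and the residual connections, layer normalization and feed-forward sublayer are omitted):

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)      # pairwise similarity between tokens
        return softmax(scores, axis=-1) @ V  # attention-weighted sum of values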

5B. What are the different approaches for regularization in autoencoders? Explain how regularization is ensured in sparse and contractive autoencoders.
    [3 M | CO1 | BL 2]
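For 5B, sketches of the two penalty terms usually added to the reconstruction loss, assuming a sigmoid encoder with activations h of shape (N, H) and encoder weights W of shape (H, D):

    import numpy as np

    def sparsity_penalty(h, rho=0.05):
        # sparse autoencoder: KL divergence pushing each unit's mean activation towards rho
        rho_hat = h.mean(axis=0)
        return np.sum(rho * np.log(rho / rho_hat)
                      + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

    def contractive_penalty(h, W):
        # contractive autoencoder: squared Frobenius norm of the encoder Jacobian
        dh_sq = (h * (1 - h)) ** 2                 # derivative of sigmoid, squared
        return np.sum(dh_sq @ (W ** 2).sum(axis=1))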

5C. With the help of a neat diagram, explain the working of Generative Adversarial Networks.
    [2 M | CO2 | BL 2]
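For 5C, the two objectives that drive the adversarial game, written as NumPy losses on the discriminator's outputs (d_real and d_fake are its probabilities on real and generated samples; the non-saturating generator loss is used here):

    import numpy as np

    def discriminator_loss(d_real, d_fake):
        # the discriminator maximizes log D(x) + log(1 - D(G(z))); minimize the negative
        return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

    def generator_loss(d_fake):
        # non-saturating generator objective: maximize log D(G(z))
        return -np.mean(np.log(d_fake))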

