Eng21cs0302 - Sgan


SEQUENCE NETWORKS AND GAN ASSIGNMENT

Name-Prajwal MJ
Class-7D
USN-ENG21CS0302

INTRODUCTION
This report outlines the step-by-step development of a Convolutional Neural
Network (CNN) to solve the MNIST handwritten digit classification problem.
The MNIST dataset, widely used in computer vision, includes 60,000 training
images and 10,000 test images of single-digit numbers (0–9). Each digit is
represented as a grayscale image of 28x28 pixels. The goal of this project is to
classify these images accurately by building a CNN model from scratch,
evaluating it thoroughly, optimizing its performance, and deploying it for
prediction.

OBJECTIVES
1. Build and train a CNN model to classify handwritten digits in the MNIST
dataset.
2. Establish a reliable model evaluation framework using techniques like k-fold
cross-validation.
3. Optimize model architecture and parameters to improve performance.
4. Finalize and save the model for real-world application and prediction tasks.

STEPS
1. Setting Up the Development Environment
 The tutorial uses Python 3 with Keras running on top of TensorFlow, a
popular framework for deep learning.
 Anaconda is recommended for environment setup, simplifying the
installation and management of required packages.
 The development environment requires the following libraries:
 TensorFlow/Keras: To build and train the CNN.
 NumPy: For numerical operations and data manipulation.
 Matplotlib: For visualizing data and model learning curves.
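
Before starting, it can help to confirm that the listed libraries are visible to the active Python environment. The following is a minimal sketch using only the standard library; the helper name `check_packages` is hypothetical and not part of the tutorial.

```python
import importlib.util

# Import names of the libraries listed above.
REQUIRED = ["tensorflow", "numpy", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in check_packages(REQUIRED).items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

Any package reported as missing can then be installed into the Anaconda environment before moving on.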

2. Understanding and Preprocessing the MNIST Dataset
 The MNIST dataset contains grayscale images of digits (0-9), each in a
28x28 pixel format.
 Images are preprocessed as follows:
 Reshape the images to include a single color channel, making them
compatible with CNN input requirements (28x28x1).
 Normalize pixel values from their original range of [0, 255] to [0, 1].
This helps stabilize the learning process and accelerates convergence.
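
The two preprocessing steps can be sketched with NumPy as follows. To keep the example runnable without downloading MNIST, it uses random synthetic images as a stand-in. The function names `prep_pixels` and `one_hot` are illustrative, and the one-hot label encoding is an assumption implied by the categorical cross-entropy loss used later rather than something stated in this section.

```python
import numpy as np


def prep_pixels(images):
    """Reshape (n, 28, 28) uint8 images to (n, 28, 28, 1) floats in [0, 1]."""
    images = images.reshape((images.shape[0], 28, 28, 1))
    return images.astype("float32") / 255.0


def one_hot(labels, num_classes=10):
    """Encode integer labels 0-9 as one-hot rows (assumed label format)."""
    return np.eye(num_classes, dtype="float32")[labels]


# Synthetic stand-in for MNIST: 4 random "images" with made-up labels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 1])

X = prep_pixels(fake_images)
Y = one_hot(fake_labels)
print(X.shape, X.dtype, float(X.min()), float(X.max()))
print(Y.shape)
```

With the real dataset, the same two functions would be applied to the training and test arrays alike, so both splits are scaled identically.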

3. Developing the Baseline CNN Model


 The baseline model is designed with a simple architecture suitable for
image classification:
 Input Layer: The input shape is (28, 28, 1), corresponding to the reshaped
grayscale images.
 Convolutional Layer: A single convolutional layer with 32 filters and a
kernel size of 3x3. The activation function is ReLU, and the kernel
initializer is "he_uniform."
 Pooling Layer: A max-pooling layer reduces the spatial dimensions of the
feature maps, helping to reduce computational load and capture important
features.
 Flattening Layer: Converts the 2D feature maps into a 1D array to prepare
for the dense layer.
 Dense Layer: A dense layer with 100 neurons and ReLU activation to
interpret the features extracted by the convolutional layer.
 Output Layer: Uses softmax activation with 10 neurons (one for each
digit class) to output probabilities for each class.
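
To make the architecture concrete, the arithmetic below traces the tensor shape through each layer and counts the trainable parameters. It is a sketch that assumes a stride of 1 with no padding for the convolution and a 2x2 pooling window with stride 2; the helper `valid_output_size` is introduced here for illustration.

```python
def valid_output_size(size, window, stride=1):
    """Spatial size after a convolution or pooling step with no padding."""
    return (size - window) // stride + 1


side = 28       # input height and width
channels = 1    # single grayscale channel
filters = 32

# Convolution: 3x3 kernels, stride 1, no padding (assumed)
conv_side = valid_output_size(side, 3)
conv_params = (3 * 3 * channels + 1) * filters   # weights + one bias per filter

# Max pooling: 2x2 window, stride 2 (assumed); no trainable parameters
pool_side = valid_output_size(conv_side, 2, stride=2)

# Flatten, then the two dense layers
flat = pool_side * pool_side * filters
dense_params = (flat + 1) * 100
output_params = (100 + 1) * 10

total_params = conv_params + dense_params + output_params
print(f"conv output: {conv_side}x{conv_side}x{filters}")
print(f"pool output: {pool_side}x{pool_side}x{filters}")
print(f"flattened features: {flat}")
print(f"trainable parameters: {total_params}")
```

Almost all of the parameters sit in the first dense layer, which is why adding pooling before flattening matters for the computational load.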

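# define cnn model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD

def define_model():
    model = Sequential()
    # feature extraction: one convolutional layer followed by max pooling
    model.add(Conv2D(32, (3, 3), activation='relu',
                     kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    # classifier: flatten, one hidden dense layer, softmax over 10 digits
    model.add(Flatten())
    model.add(Dense(100, activation='relu',
                    kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model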

The baseline model is compiled with:


 Optimizer: Stochastic Gradient Descent (SGD) with a learning rate of
0.01 and a momentum of 0.9.
 Loss Function: Categorical Cross-Entropy, as it is suited for multi-class
classification.
 Metric: Classification accuracy to assess model performance.
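
The loss and optimizer settings above can be illustrated on scalars. The sketch below computes the categorical cross-entropy for one sample and applies two SGD-with-momentum updates using the velocity form described in the Keras SGD documentation. The example probabilities, weight, and gradient are made-up values chosen only to show the arithmetic.

```python
import math


def categorical_cross_entropy(one_hot_target, probs, eps=1e-7):
    """Loss for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probs))


def sgd_momentum_step(weight, grad, velocity, learning_rate=0.01, momentum=0.9):
    """One update: velocity = momentum * velocity - lr * grad; weight += velocity."""
    velocity = momentum * velocity - learning_rate * grad
    return weight + velocity, velocity


# Illustrative values: the true class is index 2 and the model gives it 0.7.
loss = categorical_cross_entropy([0, 0, 1], [0.1, 0.2, 0.7])
print(f"loss = {loss:.4f}")

# Two updates with the same gradient: the second step is larger
# because the velocity carries over from the first.
w, v = 1.0, 0.0
w1, v1 = sgd_momentum_step(w, grad=0.5, velocity=v)
w2, v2 = sgd_momentum_step(w1, grad=0.5, velocity=v1)
print(w1, w2)
```

Only the probability assigned to the true class contributes to the loss, so a confident wrong prediction is penalized heavily.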
4. Model Evaluation Methodology
 K-Fold Cross-Validation:
 The baseline model is evaluated using 5-fold cross-validation: the
training dataset is split into 5 parts, and in each fold the model is
trained on 4 parts and validated on the remaining part.
 Cross-validation gives a more reliable performance estimate than a single
train/validation split, indicating how well the model generalizes across
different subsets of the data.
 Learning Curves:
 During each training run, the accuracy and loss are tracked and plotted to
create learning curves.
 These curves provide insights into whether the model is underfitting,
overfitting, or has an appropriate fit for the data.
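
The 5-fold split described above can be sketched with NumPy alone. This is an illustration of how the index sets are formed, not the tutorial's exact code; the generator name `kfold_indices` and the seed are assumptions. A library utility such as scikit-learn's `KFold` serves the same purpose.

```python
import numpy as np


def kfold_indices(n_samples, n_folds=5, seed=1):
    """Yield (train_idx, val_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    folds = np.array_split(indices, n_folds)
    for i in range(n_folds):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx


# 60,000 matches the size of the MNIST training set.
splits = list(kfold_indices(60000))
for fold, (train_idx, val_idx) in enumerate(splits, start=1):
    print(f"fold {fold}: train={len(train_idx)}, val={len(val_idx)}")
```

In each fold a fresh model would be trained on `train_idx` and scored on `val_idx`, and the per-epoch accuracy and loss recorded during that run give the learning curves for the fold.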

5. Model Improvement Techniques

 Batch Normalization:
Batch normalization is applied to stabilize learning and improve
performance. It is added after each convolutional and dense layer to
normalize the output distribution, accelerating convergence.
 Increasing Model Depth:
The depth of the CNN is increased by adding additional convolutional
and pooling layers, following a structure similar to VGG-like
architectures.
 Each convolutional layer block is followed by a pooling layer, with more
filters (e.g., 64 filters) added in deeper layers to capture more complex
patterns.
 Hyperparameter Tuning:
Tuning the learning rate and exploring different optimizer configurations
help optimize model performance.
 Adjusting the batch size and experimenting with deeper network structures
can further improve accuracy.
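
Of the techniques above, batch normalization is the easiest to show in isolation. The sketch below implements only the training-time forward pass in NumPy: each feature is standardized over the batch, then scaled and shifted by the learnable parameters `gamma` and `beta`. The running statistics used at inference time are omitted, and the input activations are synthetic.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-3):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


# Synthetic activations: batch of 64 samples, 100 features,
# deliberately far from zero mean and unit variance.
rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=(64, 100))

normalized = batch_norm_forward(activations, gamma=np.ones(100), beta=np.zeros(100))
print(float(normalized.mean()), float(normalized.std()))
```

After normalization every feature has a mean close to 0 and a standard deviation close to 1, regardless of the scale of the incoming activations.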
6. Finalizing and Saving the Model
 After refining the model, the final architecture is trained on the entire
training dataset.
 The model is saved in HDF5 format (an .h5 file), enabling easy loading and
deployment in future applications.
 Evaluation on Test Set:
The final model is evaluated on the reserved test dataset to obtain a
realistic estimate of its performance on unseen data.
 Practical Application:
Saving the model allows it to be reused for predictions in a real-world
setting, making it possible to classify new handwritten digit images.
# fit model
model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)

# save model
model.save('final_model.h5')

7. Making Predictions on New Data

 The document includes a demonstration of how to use the saved model to
classify a new image.
 Steps for prediction:
 Load and preprocess the image: Convert it to grayscale, resize it to 28x28
pixels, and normalize the pixel values.
 Predict using the model: The image is passed through the model, which
outputs a probability distribution across the 10 digit classes.
 Interpret the result: The class with the highest probability is selected as
the predicted digit.
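
The prediction steps can be sketched as below. The example assumes the image has already been loaded as a 28x28 grayscale array (loading and resizing are left out), and it uses a hypothetical `StubModel` in place of the trained network so that it runs without the saved file. In practice the model would be loaded with `tensorflow.keras.models.load_model('final_model.h5')` and passed to the same function.

```python
import numpy as np


def preprocess_image(gray_image):
    """Scale a 28x28 grayscale image (values 0-255) to shape (1, 28, 28, 1)."""
    arr = np.asarray(gray_image, dtype="float32")
    if arr.shape != (28, 28):
        raise ValueError("expected a 28x28 grayscale image; resize it first")
    return arr.reshape(1, 28, 28, 1) / 255.0


def predict_digit(model, gray_image):
    """Return (predicted digit, class probabilities) for one image."""
    probs = model.predict(preprocess_image(gray_image))[0]
    return int(np.argmax(probs)), probs


class StubModel:
    """Hypothetical stand-in for the trained CNN; always favors class 7."""

    def predict(self, batch):
        probs = np.full((batch.shape[0], 10), 0.02, dtype="float32")
        probs[:, 7] = 0.82
        return probs


digit, probs = predict_digit(StubModel(), np.zeros((28, 28), dtype=np.uint8))
print(digit)
```

The preprocessing here must match what was applied during training; a model trained on inputs in [0, 1] will give poor predictions on raw 0-255 pixel values.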

Robust Model Evaluation:


 Cross-validation and learning curves are essential for assessing the
model’s ability to generalize.
 By analyzing learning curves, we can determine if a model is overfitting
or underfitting, allowing for targeted adjustments to the model structure.

Preprocessing and Parameter Tuning:

 Preprocessing steps like reshaping and normalization have a significant
impact on the model’s training efficiency and accuracy.
 Techniques like batch normalization and increased model depth can
improve the model’s ability to capture complex patterns in the data.

Real-World Application Potential:


 The CNN, though trained on the MNIST dataset, demonstrates a
methodology applicable to other image classification tasks.
 This model can be adapted to different datasets with minimal
modifications to preprocessing and architecture, supporting a wide range
of applications in computer vision.

CONCLUSION

The tutorial provides a thorough approach to building, evaluating, and
deploying a CNN for handwritten digit classification. By following these steps,
developers gain a robust understanding of CNNs and practical techniques for
optimizing performance. The methodology demonstrated here can be easily
adapted to other datasets and classification problems, making it a valuable
learning resource for machine learning practitioners.
