First Phase Final
Presentation Flow
• Introduction
• Literature survey
• Aim of the Project
• Objectives
• Block diagram
• Methodology
• Hardware/Software requirement
• Expected Results
• Action Plan
• References
National Education Society®
J N N College of Engineering, Shivamogga
Department of Electronics & Communication Engineering
Introduction
• Optimizing a neural network for identifying scene images involves improving its
accuracy, efficiency, and generalization capability.
• Convolutional Neural Networks (CNNs) are deep learning algorithms that, when
applied to image samples, often produce better outcomes than other
machine learning algorithms.
• Deep learning is one of the techniques used for predicting values.
• The dataset consists of images taken from different types of locations.
Literature survey
Paper 1 : Robust Indoor and Outdoor classification of a Scene.
Paper 2 : Automated analysis of Scene using Image Feature Extraction.
Authors and year: S. Omar Gilani, Mohsin Jamil, Zahra Fazal, Muhammad Samran Navid, Rabeil Sakina (published 2016)
Methodology:
• Gist is used to categorize contextual features.
• PCA is used for dimensionality reduction.
• Clustering is used to classify similar data items into groups.
Results obtained: The method allows fast and efficient scene categorization with few errors. Images of the same category were correctly clustered, and only a few images, those farthest from the cluster centroids, were wrongly categorized.
Advantages:
• Human-like scene understanding
• Scalability
• Dimensionality reduction
• High accuracy
Disadvantages:
• Limitations in scene similarity
• Dependence on global features
Paper 3 : Alex-Net model based Scene image classification technique
Authors and year: Jing Sun, Xibiao Cai, Fuming Sun, Jianguo Zhang (published 2016)
Methodology:
• AlexNet
• Deep CNN
Results obtained: The paper extracts the last-layer features of the deep convolutional neural network for scene classification by using the AlexNet model in the deep learning framework.
Advantages:
• High classification performance
• Efficiency in training
• End-to-end learning
Disadvantages:
• Computational complexity
• Dependence on data augmentation
• Limited interpretability
Paper 4 : Computer network connection enhancement optimization algorithm based on CNN
Authors and year: Hanju Li, Sichuan University, Chengdu (published 2021)
Methodology:
• Convolutional neural network (CNN)
Results obtained: The proposed CNN-based algorithm enhances computer network connections by fusing multiple CNNs, improving stability and recognition accuracy. The average field neural network algorithm proves efficient and realistic for optimization, offering good economic feasibility.
Advantages:
• Enhanced feature extraction capability
• Improved generalization performance
• Dynamic optimization
Disadvantages:
• High computational complexity
• Dependence on training data quality
• Limited comparison with other methods
Paper 5 : Research on Image Classification Improvement Based on Convolutional Neural Networks with Mixed Training
Objectives
• To collect the dataset of images.
• To preprocess the collected dataset.
• To train and test the model on the augmented dataset.
• To evaluate the trained model and deploy it in applications.
Block diagram
Methodology
1. Data Collection -
Dataset: 2080 scene images in total
Categories: 6 classes {Mountain, Beach, Desert, Pond, Forest, Farmland}
2. Preprocessing -
Resizing: Standardise all images to a fixed size (128x128).
Augmentation: Apply transformations (rotation, flipping, zooming) to reduce
overfitting.
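The resizing step can be illustrated with a minimal NumPy sketch. This is not the project's actual preprocessing code: the helper name `resize_nearest`, the nearest-neighbour sampling, and the random stand-in image are assumptions made for illustration. Only the 128x128 target size comes from the slides.

```python
import numpy as np

TARGET_SIZE = (128, 128)  # fixed size stated on the slide


def resize_nearest(image, size=TARGET_SIZE):
    """Resize an H x W x C image to `size` with nearest-neighbour style sampling."""
    height, width = image.shape[:2]
    new_h, new_w = size
    # Map every output row/column back to a source row/column index.
    rows = np.arange(new_h) * height // new_h
    cols = np.arange(new_w) * width // new_w
    return image[rows[:, None], cols]


# Hypothetical input: a random 300 x 400 RGB array standing in for a scene photo.
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
resized = resize_nearest(photo)
print(resized.shape)  # (128, 128, 3)
```

In practice a library resize with interpolation would normally be preferred; the sketch only shows what "standardise to a fixed size" means for the array shape.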
• Model Architecture: Two CNN architectures are employed
VGG-16:
- 13 convolutional layers + 3 fully connected layers.
- Uses small 3×3 kernels and max-pooling for hierarchical feature extraction.
• Training :
Transfer Learning: Initialise with pre-trained weights
Loss Function: Categorical cross-entropy for multi-class classification
Regularization: To prevent overfitting.
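As a rough check of the VGG-16 description above, the sketch below walks a 128x128 input through the five convolutional blocks of the standard VGG-16 layout. The block configuration is the published VGG-16 design; the helper name `vgg16_feature_shapes` is hypothetical, and the sketch assumes 'same' padding for the 3×3 convolutions and a 2×2 max-pool closing each block.

```python
# Standard VGG-16 layout: (number of 3x3 conv layers, output channels) per block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def vgg16_feature_shapes(input_size=128):
    """Return (height, width, channels) after each block's 2x2 max-pool."""
    shapes = []
    size = input_size
    for _, channels in VGG16_BLOCKS:
        # 'same'-padded 3x3 convs keep the spatial size; the pool halves it.
        size //= 2
        shapes.append((size, size, channels))
    return shapes


conv_layers = sum(count for count, _ in VGG16_BLOCKS)
print(conv_layers)  # 13
print(vgg16_feature_shapes())
# [(64, 64, 64), (32, 32, 128), (16, 16, 256), (8, 8, 512), (4, 4, 512)]
```

With a 128x128 input, the final feature map is 4x4x512 before the three fully connected layers.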
• Evaluation:
Evaluate the performance of the trained model on a held-out test set. This will
give an estimate of how well the model will perform on new data.
• Deployment:
Integrate the trained model into production.
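The evaluation step can be sketched as computing accuracy and confusion counts on held-out labels. The labels below are made up for illustration and are not project results; the function name `evaluate` is hypothetical.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Return overall accuracy and a Counter of (true, predicted) label pairs."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    accuracy = correct / len(true_labels)
    confusion = Counter(zip(true_labels, predicted_labels))
    return accuracy, confusion


# Hypothetical held-out labels and model predictions.
y_true = ["Beach", "Beach", "Forest", "Pond", "Desert"]
y_pred = ["Beach", "Desert", "Forest", "Pond", "Desert"]

accuracy, confusion = evaluate(y_true, y_pred)
print(accuracy)                          # 0.8
print(confusion[("Beach", "Desert")])    # 1
```

The confusion counts show which classes are mistaken for each other, which overall accuracy alone does not reveal.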
Software requirement
• TensorFlow - An open-source deep learning framework used for building and
training neural networks
• NumPy/SciPy - numerical operations.
• Python (with TensorFlow 2.18.0) - development environment
Hardware requirement
• GPU - Minimum NVIDIA GTX 1080 (8GB VRAM) for small-scale training.
• CPU & RAM:
CPU: Intel i7/Xeon or AMD Ryzen 7.
RAM: 8GB+
• Storage: SSD
Result obtained:
Objective 1:
• The dataset was collected from Kaggle and consists of 2080 images belonging to 6
classes.
• The classes are
-Pond
-Beach
-Farmland
-Desert
-Mountain
-Forest
Objective 2:
PRE-PROCESSING STEPS
1. Initialisation:
• The image size is set to 128x128, with batch size = 32, total batches = 65, and
epochs = 25.
2. Normalization:
• Normalization takes each image batch, with its image size and RGB values, as input.
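The initialisation values and the normalization step can be checked with a short sketch. The image size, batch size and dataset size are taken from the slides; the random batch is a hypothetical stand-in for real images, and scaling by 1/255 is assumed to be the normalization used.

```python
import math

import numpy as np

IMAGE_SIZE = 128
BATCH_SIZE = 32
TOTAL_IMAGES = 2080

# 2080 images in batches of 32 gives the 65 batches quoted on the slide.
total_batches = math.ceil(TOTAL_IMAGES / BATCH_SIZE)
print(total_batches)  # 65

# Hypothetical batch of RGB images with pixel values in [0, 255].
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3), dtype=np.uint8)

# Normalization: scale every RGB value into the range [0, 1].
normalized = batch.astype(np.float32) / 255.0
print(normalized.shape)  # (32, 128, 128, 3)
```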
3. Augmentation
• Augmentation applies horizontal and vertical flipping and rotation.
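The flipping and rotation operations can be sketched on a tiny array. The helper names are hypothetical, and the rotation is simplified to 90-degree turns for illustration; it does not reproduce arbitrary-angle rotation.

```python
import numpy as np


def horizontal_flip(image):
    """Mirror an H x W x C image left-to-right."""
    return image[:, ::-1]


def vertical_flip(image):
    """Mirror an H x W x C image top-to-bottom."""
    return image[::-1, :]


def rotate_90(image, turns=1):
    """Rotate an H x W x C image counter-clockwise in 90-degree steps."""
    return np.rot90(image, k=turns, axes=(0, 1))


# Tiny 2 x 3 single-channel "image" so the effect is easy to read.
image = np.arange(6).reshape(2, 3, 1)
print(horizontal_flip(image)[0, :, 0])  # [2 1 0]
print(vertical_flip(image)[:, 0, 0])    # [3 0]
print(rotate_90(image).shape)           # (3, 2, 1)
```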
Objective 3:
1. Training data with augmentation
• rescale=1.0/255: Normalizes pixel values from the range [0, 255] to [0, 1]
• validation_split=0.2: Splits 20% of the dataset for validation and keeps 80% for training
• rotation_range=20: Randomly rotates images up to 20 degrees
• width_shift_range=0.2: Randomly shifts the image horizontally (left or right) by up to
20% of the image width
• height_shift_range=0.2: Randomly shifts the image vertically (up or down) by up to 20%
of the image height
• shear_range=0.2: Applies a slanting (shear) transformation.
• zoom_range=0.2: Randomly zooms in or out by up to 20%.
• horizontal_flip=True: Randomly flips images left-to-right.
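The augmentation parameters listed above match the keyword arguments of Keras's `ImageDataGenerator`, so they could be wired up roughly as follows. This is a sketch, not the project's code: it assumes the legacy `tensorflow.keras.preprocessing.image.ImageDataGenerator` class is the one used, and `build_generators` and its `data_dir` argument (a folder with one sub-folder per class) are hypothetical.

```python
# Augmentation settings copied from the slides.
AUGMENTATION_KWARGS = {
    "rescale": 1.0 / 255,
    "validation_split": 0.2,
    "rotation_range": 20,
    "width_shift_range": 0.2,
    "height_shift_range": 0.2,
    "shear_range": 0.2,
    "zoom_range": 0.2,
    "horizontal_flip": True,
}


def build_generators(data_dir, image_size=(128, 128), batch_size=32):
    """Return (training, validation) iterators over a hypothetical image folder."""
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION_KWARGS)
    common = dict(target_size=image_size, batch_size=batch_size, class_mode="categorical")
    train = datagen.flow_from_directory(data_dir, subset="training", **common)
    val = datagen.flow_from_directory(data_dir, subset="validation", **common)
    return train, val
```

Sharing one generator between both subsets also augments the validation images; a separate generator with only `rescale` and `validation_split` would avoid that.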
2. Model compilation:
• Adam optimizer - An efficient, adaptive method that adjusts learning rates during
training to help the model converge faster and more reliably.
• Loss function - Categorical cross-entropy, which measures how far off the model’s
predictions are from the actual class labels.
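What the loss function measures can be shown with a plain-Python sketch of softmax followed by categorical cross-entropy for the 6-class problem. The logits are invented for illustration, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


target = [0, 0, 1, 0, 0, 0]  # true class is index 2 of 6

# An undecided model (equal scores) pays ln(6) for every image.
loss_uniform = categorical_cross_entropy(target, softmax([0.0] * 6))
print(round(loss_uniform, 4))  # 1.7918

# A model that strongly favours the correct class pays much less.
loss_confident = categorical_cross_entropy(target, softmax([0.0, 0.0, 10.0, 0.0, 0.0, 0.0]))
print(loss_confident < loss_uniform)  # True
```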
3. Training and Validation:
• 1664 images are used for training.
• 416 images are used for validation.
• The table below shows the accuracy and loss obtained.
PARAMETER PERCENTAGE
TRAIN ACCURACY 87.46%
TEST ACCURACY 86%
LOSS 36.55%
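The training and validation counts follow from the 20% validation split applied to the 2080 images. The arithmetic can be checked with a short sketch; the variable names are illustrative.

```python
import math

TOTAL_IMAGES = 2080
VALIDATION_SPLIT = 0.2
BATCH_SIZE = 32

validation_images = round(TOTAL_IMAGES * VALIDATION_SPLIT)
training_images = TOTAL_IMAGES - validation_images
print(training_images, validation_images)  # 1664 416

# Batches per epoch for each subset at batch size 32.
training_batches = math.ceil(training_images / BATCH_SIZE)
validation_batches = math.ceil(validation_images / BATCH_SIZE)
print(training_batches, validation_batches)  # 52 13
```

The 52 training batches and 13 validation batches add up to the 65 total batches quoted in the initialisation step.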
Conclusion
• Optimization of neural networks to identify scene images is crucial to the development of
visual recognition automation systems. Applying a robust CNN design with appropriate data
preprocessing and augmentation, and tuning hyperparameters such as the learning rate, batch
size, and number of epochs, can significantly improve the performance of the model.
Methods like data normalization, image data augmentation, and dropout are
utilized to prevent overfitting and improve generalizability across multiple environments.
• The application of the Adam optimizer with the categorical cross-entropy loss function,
coupled with a well-selected activation strategy (e.g., ReLU and Softmax), enables effective
training and correct multi-class classification. The accuracy metrics illustrate that with correct
optimization, the model can effectively classify multiple scene classes (e.g., beach, mountain,
forest) even under complicated visual scenarios.
• This work highlights the capability of deep learning to comprehend scenes and
provides a firm groundwork for its application to autonomous systems,
intelligent surveillance, and content-based image retrieval.
Action Plan:
Sl. No.   Tasks to be performed   Deadline (expected date/week of completion)
4.        Objective 1             30/03/2025
5.        Objective 2             10/04/2025
6.        Objective 3             04/05/2025
7.        Objective 4
References
1. Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent
Vanhoucke, and Andrew Rabinovich. (2015) "Going deeper with convolutions." In Proceedings of the IEEE conference on
computer vision and pattern recognition
2. Gao, Jingyu, Jinfu Yang, Guanghui Wang, and Mingai Li. (2016) "A novel feature extraction method for scene recognition
based on centred convolutional restricted Boltzmann machines." Neurocomputing 214: 708-717.
3. Masood, Sarfaraz, Tarun Luthra, Himanshu Sundriyal, and Mumtaz Ahmed. (2017) "Identification of diabetic retinopathy
in eye images using transfer learning." In 2017 International Conference on Computing, Communication and Automation
(ICCCA), pp. 1183-1187. IEEE.
4. Kibria, Sakib B., and Mohammad S. Hasan. (2017) "An analysis of Feature extraction and Classification Algorithms for
Dangerous Object Detection." In 2017 2nd International Conference on Electrical & Electronic Engineering (ICEEE), pp. 1-
4. IEEE.
5. D. Jia, D. Wei, L.J. Li, L. Kai, and F.F. Li, ImageNet: A Large-scale Hierarchical Image Database, IEEE Conference on
Computer Vision and Pattern Recognition, 2009.
6. Y. Jiang, Research on scene image content representation and classification, National University of Defence Technology
Doctoral Thesis, 2010.
THANK YOU