DEEP LEARNING
(To Identify Ships In A Satellite Image)
Summer Internship Report submitted in partial fulfilment
of the requirements for the undergraduate degree of
Bachelor of Technology
In
By
DECLARATION
I submit this industrial training work entitled “To Identify Ships In A Satellite Image”
to GITAM (Deemed To Be University), Hyderabad in partial fulfilment of the requirements for
the award of the degree of “Bachelor of Technology” in “Computer Science and
Engineering”. I declare that it was carried out independently by me under the guidance of Mr.
D. Srinivasa Rao, Asst. Professor, GITAM (Deemed To Be University), Hyderabad, India.
The results embodied in this report have not been submitted to any other University or
Institute for the award of any degree or diploma.
Hyderabad-502329, India
Dated: 21-07-2020
CERTIFICATE
This is to certify that the Industrial Training Report entitled “To Identify
Ships In A Satellite Image” is being submitted by Kasarla Sahith Reddy (221710308025) in
partial fulfilment of the requirement for the award of Bachelor of Technology in Computer
Science And Engineering at GITAM (Deemed to Be University), Hyderabad during the
academic year 2020-2021.
It is a faithful record of the work carried out by him at the Department of Computer Science and Engineering, GITAM University, Hyderabad Campus, under my guidance and supervision.
ACKNOWLEDGEMENT
Apart from my own effort, the success of this internship largely depends on the encouragement and guidance of many others. I take this opportunity to express my gratitude to the people who have helped me in the successful completion of this internship.
I would like to thank respected Dr. N. Siva Prasad, Pro Vice Chancellor, GITAM
Hyderabad and Dr. N. Seetharamaiah, Principal, GITAM Hyderabad.
I would like to thank the respected S. Phani Kumar, Head of the Department of Computer Science and Engineering, for giving me such a wonderful opportunity to expand my knowledge of my own branch and for giving me guidelines to present an internship report. It helped me a lot to realize what we study for.
I would like to thank the respected faculty member Mr. D. Srinivasa Rao, who helped me make this internship a successful accomplishment.
I would also like to thank my friends who helped me to make my work more organized
and well-stacked till the end.
221710308025
ABSTRACT
Deep learning has achieved great success in many fields, such as computer vision and
natural language processing. Compared to traditional machine learning methods, deep learning
has a strong learning ability and can make better use of datasets for feature extraction. Image
recognition is one of the most important fields of image processing and computer vision which
can be achieved by deep learning.
Identifying ships in satellite images is the main motive of this project. Ship identification on satellite imagery can be used for fisheries management, monitoring of smuggling activities, ship traffic services, and naval warfare. A real dataset was collected from a public website. I approached the problem by first examining the features of the dataset, for a deeper understanding of it. The primary objective of this case study was to classify whether a satellite image contains a ship or not. I chose the Convolutional Neural Network (CNN) method, which has the advantage of being able to extract features automatically and produce reliable predictions for ship identification. Scaling is applied to the data to normalize it and reduce computational complexity. The conclusion is drawn on the basis of achieving as high an accuracy as possible and correctly classifying unknown satellite images of ships that are not present in the training dataset.
Table of Contents
CHAPTER 3: PYTHON 11
3.1. Introduction To Python 11
3.1.1. History Of Python 11
3.2. How to Setup Python 11
3.2.1. Installation(using python IDLE) 11
3.2.2. Installation(using Anaconda) 12
3.3. Features Of Python 13
3.4. Python Variable Types 14
3.4.1. Python Numbers 14
CONCLUSION 49
REFERENCES 50
List of Figures
FIG 5.2.2 Checking for total number of ship and non-ship images 25
FIG 7.1.2 A 5x5 input feature map and 3x3 convolution of depth 1 34
FIG 7.1.3 Example of 3x3 convolution performed over 5x5 input feature map 35
1.1 INTRODUCTION:
Machine Learning (ML) is the scientific study of algorithms and statistical models that computer systems use in order to perform a specific task effectively without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of Artificial Intelligence (AI).
Consider some of the instances where machine learning is applied: the self-driving
Google car, cyber fraud detection, online recommendation engines—like friend suggestions on
Facebook, Netflix showcasing the movies and shows you might like, and “more items to
consider” and “get yourself a little something” on Amazon—are all examples of applied
machine learning. All these examples echo the vital role machine learning has begun to take in
today’s data-rich world.
Machines can aid in filtering useful pieces of information that help in major
advancements, and we are already seeing how this technology is being implemented in a wide
variety of industries.
With the constant evolution of the field, there has been a subsequent rise in the uses,
demands, and importance of machine learning. Big data has become quite a buzzword in the
last few years; that’s in part due to increased sophistication of machine learning, which helps
analyse those big chunks of big data. Machine learning has also changed the way data extraction and interpretation are done, by involving automatic sets of generic methods that have replaced traditional statistical techniques. The process flow depicted here represents how machine learning works.
Traditionally, data analysis was always being characterized by trial and error, an
approach that becomes impossible when data sets are large and heterogeneous. Machine
learning comes as the solution to all this chaos by proposing clever alternatives to analyzing
huge volumes of data.
By developing fast and efficient algorithms and data-driven models for real-time
processing of data, machine learning can produce accurate results and analysis.
The types of machine learning algorithms differ in their approach, the type of data they
input and output, and the type of task or problem that they are intended to solve.
When an algorithm learns from example data and associated target responses (which can consist of numeric values or string labels, such as classes or tags) in order to later predict the correct response when posed with new examples, it comes under the category of supervised learning.
When an algorithm learns from plain examples without any associated response, it is left to the algorithm to determine the data patterns on its own; this comes under the category of unsupervised learning. This type of algorithm tends to restructure the data into something else, such as new features that may represent a class or a new series of uncorrelated values. Such algorithms are quite useful in providing humans with insights into the meaning of data and new useful inputs to supervised machine learning algorithms.
Deep learning algorithms run data through several “layers” of neural network
algorithms, each of which passes a simplified representation of the data to the next layer. Most
machine learning algorithms work well on datasets that have up to a few hundred features, or
columns.
Basically, deep learning is itself a subset of machine learning, but in this case the machine learns in a way similar to how humans learn. The structure of a deep learning model is highly similar to a human brain, with a large number of nodes acting like the neurons in a human brain, thus resulting in an artificial neural network. When applying traditional machine learning algorithms, we have to manually select input features from a complex dataset and then train on them, which becomes a very tedious job for an ML scientist; in neural networks we do not have to manually select useful input features, as the various layers of the network handle the complexity of the dataset and of the algorithm.
In a recent project on human activity recognition, when a traditional machine learning algorithm such as K-NN was applied, the human and the activity had to be detected separately, and impactful input parameters had to be selected manually, which became a very tedious task as the dataset was far too complex; the complexity dramatically reduced on applying an artificial neural network. Such is the power of deep learning. It is true that deep learning algorithms take a lot of time for training, sometimes even weeks, but their execution on new data is so fast that it is not even comparable with traditional ML algorithms.
Deep learning has enabled industrial experts to overcome challenges that were impossible a decade ago, such as speech recognition, image recognition and natural language processing. The majority of industries currently depend on it, be it journalism, entertainment, online retail, automobiles, banking and finance, healthcare, manufacturing or the digital sector. Video recommendations, mail services, self-driving cars, intelligent chat bots and voice assistants are just some trending achievements of deep learning.
i. Translations:
Although automatic machine translation isn’t new, deep learning is helping enhance
automatic translation of text by using stacked networks of neural networks and allowing
translations from images.
There is not just one AI model at work as an autonomous vehicle drives down the street. Some deep-learning models specialize in street signs while others are trained to recognize pedestrians. As a car navigates down the road, it can be informed by up to millions of individual AI models that allow the car to act.
v. Computer vision:
Deep learning has delivered super-human accuracy for image classification, object
detection, image restoration and image segmentation—even handwritten digits can be
recognized. Deep learning using enormous neural networks is teaching machines to automate
the tasks performed by human visual systems.
The machine learns the punctuation, grammar and style of a piece of text and can use the model it has developed to automatically create entirely new text with the proper spelling, grammar and style of the example text. Everything from Shakespeare to Wikipedia entries has been generated in this way.
Deep-learning applications for robots are plentiful and powerful, ranging from an impressive deep-learning system that can teach a robot just by observing the actions of a human completing a task, to a housekeeping robot that is provided with input from several other AIs in order to take action.
Just like how a human brain processes input from past experiences, current input from senses
and any additional data that is provided, deep-learning models will help robots execute tasks
based on the input of many different AI opinions.
Deep learning, data mining and machine learning share a foundation in data science, and there is certainly overlap between them. Data mining can use machine learning algorithms to improve the accuracy and depth of analysis and, vice versa, machine learning can use mined data as its foundation, refining the dataset to achieve better results.
You could also argue that data mining and machine learning are similar in that they
both seek to address the question of how we can learn from data. However, the way in which
they achieve this end, and their applications, form the basis of some significant differences.
Machine learning comprises the ability of the machine to learn from a training dataset and predict the outcome automatically. It is a subset of artificial intelligence.
Deep Learning is a subset of machine learning. It works in the same way on the machine
just like how the human brain processes information. Like a brain can identify the patterns by
comparing it with previously memorized patterns, deep learning also uses this concept.
Deep learning can automatically find the relevant attributes in raw data, whereas in machine learning these features are selected manually and need further processing. Deep learning also employs artificial neural networks with many hidden layers, big data, and substantial computing resources.
Data Mining is a process of discovering hidden patterns and rules from the existing
data. It uses relatively simple rules such as association, correlation rules for the decision-
making process, etc. Deep Learning is used for complex problem processing such as voice
recognition etc. It uses Artificial Neural Networks with many hidden layers for processing. At
times data mining also uses deep learning algorithms for processing the data.
CHAPTER 3 : PYTHON
Python is available on a wide variety of platforms including Linux and Mac OS X. Let's
understand how to set up our Python environment.
The most up-to-date source code, binaries, documentation, news, etc., are available on the official website of Python.
Installing Python is generally easy, and nowadays many Linux and Mac OS distributions include a recent Python.
Easy-to-learn: Python has few keywords, a simple structure, and a clearly defined syntax. This allows the student to pick up the language quickly.
Easy-to-read: Python code is more clearly defined and visible to the eyes.
Easy-to-maintain: Python's source code is fairly easy to maintain.
A broad standard library: Python's bulk of the library is very portable and cross-
platform compatible on UNIX, Windows, and Macintosh.
Portable: Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
Extendable: You can add low-level modules to the Python interpreter. These modules
enable programmers to add to or customize their tools to be more efficient.
Databases: Python provides interfaces to all major commercial databases.
GUI Programming: Python supports GUI applications that can be created and ported to
many system calls, libraries and windows systems, such as Windows MFC, Macintosh,
and the X Window system of Unix.
Variables are nothing but reserved memory locations to store values. This means that when
you create a variable you reserve some space in memory. Based on the data type of a variable,
the interpreter allocates memory and decides what can be stored in the reserved memory.
Therefore, by assigning different data types to variables, you can store integers, decimals or
characters in these variables.
Numbers
Strings
Lists
Tuples
Dictionary
Number data types store numeric values. Number objects are created when you assign a value to them.
Python supports four different numerical types: int (signed integers), long (long integers, which can also be represented in octal and hexadecimal; in Python 3 these are unified into int), float (floating-point real values) and complex (complex numbers).
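As a short illustrative sketch (Python 3, where arbitrarily long integers are handled by int itself):

    a = 10          # int (signed integer)
    b = 0o32        # integers may also be written in octal (26) or hexadecimal (0x1A)
    c = 3.14        # float (floating-point real value)
    d = 2 + 3j      # complex number
    print(type(a), type(b), type(c), type(d))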
A list contains items separated by commas and enclosed within square brackets ([]).
To some extent, lists are similar to arrays in C. One difference between them is that the items belonging to a list can be of different data types.
The values stored in a list can be accessed using the slice operator ([ ] and [:]) with
indexes starting at 0 in the beginning of the list and working their way to end -1.
The plus (+) sign is the list concatenation operator, and the asterisk (*) is the repetition
operator.
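A small sketch of the list operations described above (the list contents are only an example):

    fruits = ["apple", "banana", 42, 3.14]   # items of different data types in one list
    print(fruits[0])            # indexing starts at 0 -> 'apple'
    print(fruits[1:3])          # slice operator -> ['banana', 42]
    print(fruits[-1])           # -1 refers to the last element -> 3.14
    print(fruits + ["mango"])   # + concatenates lists
    print(["hi"] * 3)           # * repeats a list -> ['hi', 'hi', 'hi']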
Tuples are immutable: they have no append or extend method, you cannot remove elements from a tuple, and they have no remove or pop method.
They work like associative arrays or hashes found in Perl and consist of key-value pairs.
A dictionary key can be almost any Python type, but is usually a number or a string. Values, on the other hand, can be any arbitrary Python object.
Dictionaries are enclosed by curly braces ({ }) and values can be assigned and accessed
using square braces ([]).
You can use numbers to "index" into a list, meaning you can use numbers to find out
what's in lists. You should know this about lists by now, but make sure you understand
that you can only use numbers to get items out of a list.
What a dict does is let you use anything, not just numbers. Yes, a dict associates one
thing to another, no matter what it is.
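A brief sketch of dictionary usage (the keys and values here are only illustrative):

    chip = {"label": 1, "scene_id": "xyz", 80: "chip width"}   # keys may be strings or numbers
    print(chip["label"])              # values are accessed with square braces
    chip["source"] = "PlanetScope"    # a new key-value pair is assigned the same way
    print(chip[80])                   # the key 80 is just a key, not a position
    print(chip.keys())                # all keys currently in the dictionary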
You can define functions to provide the required functionality. Here are simple rules
to define a function in Python. Function blocks begin with the keyword def followed by the
function name and parentheses (i.e. ()).
Any input parameters or arguments should be placed within these parentheses. You can
also define parameters inside these parentheses
The code block within every function starts with a colon (:) and is indented. The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None.
Defining a function only gives it a name, specifies the parameters that are to be
included in the function and structures the blocks of code. Once the basic structure of a function
is finalized, you can execute it by calling it from another function or directly from the Python
prompt.
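For instance, a minimal function following these rules (the names are hypothetical):

    def add(a, b=1):
        # return exits the function and passes the result back to the caller
        return a + b

    print(add(2, 3))   # called directly from the prompt -> 5
    print(add(2))      # the default parameter b=1 is used -> 3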
3.6.1 Class:
A user-defined prototype for an object that defines a set of attributes that characterize any object
of the class. The attributes are data members (class variables and instance variables) and
methods, accessed via dot notation.
● Class variable: A variable that is shared by all instances of a class. Class variables are
defined within a class but outside any of the class's methods. Class variables are not used
as frequently as instance variables are.
● Data member: A class variable or instance variable that holds data associated with a
class and its objects.
● Instance variable: A variable that is defined inside a method and belongs only to the
current instance of a class.
● Defining a Class:
● We define a class in a very similar way to how we define a function.
● Just like a function, we use parentheses and a colon after the class name (i.e. ():).
● The init method, also called a constructor, is a special method that runs when an instance is created, so we can perform any tasks needed to set up the instance (a short sketch follows below).
● The init method has a special name that starts and ends with two underscores: __init__().
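A minimal sketch of a class with a class variable, an instance variable and an __init__ constructor (the class and attribute names are illustrative, not taken from the project):

    class Ship:
        count = 0                     # class variable, shared by all instances

        def __init__(self, name):     # constructor, runs when an instance is created
            self.name = name          # instance variable, belongs only to this object
            Ship.count += 1

    s = Ship("cargo")
    print(s.name, Ship.count)         # attributes are accessed via dot notation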
Numpy: In Python we have lists that serve the purpose of arrays, but they are slow to process. Numpy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in Numpy is called ndarray, and it provides a lot of supporting functions that make working with ndarray very easy. Arrays are very frequently used in data science, where speed and resources are very important.
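A short sketch of creating and using an ndarray:

    import numpy as np

    a = np.array([1, 2, 3, 4], dtype=np.uint8)   # ndarray built from a Python list
    print(a.shape, a.dtype)                      # (4,) uint8
    print(a * 2)                                 # vectorised arithmetic, no explicit loop
    b = a.reshape(2, 2)                          # reshape without copying the data
    print(b.mean())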
Matplotlib: Matplotlib is one of the most popular Python packages used for data visualization. It is a cross-platform library for making 2D plots from data in arrays. Matplotlib is written in Python and makes use of Numpy, the numerical mathematics extension of Python. It provides an object-oriented API that helps in embedding plots in applications using Python GUI toolkits such as PyQt, WxPython or Tkinter. It can be used in Python and IPython shells, Jupyter notebooks and web application servers. Matplotlib also has a procedural interface named Pylab, which is designed to resemble MATLAB, a proprietary programming language developed by MathWorks. Matplotlib along with Numpy can be considered as the open-source equivalent of MATLAB.
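A minimal plotting sketch using the object-oriented API:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 2 * np.pi, 100)
    fig, ax = plt.subplots()                 # a Figure and an Axes object
    ax.plot(x, np.sin(x), label="sin(x)")    # 2D line plot from array data
    ax.set_xlabel("x")
    ax.set_ylabel("sin(x)")
    ax.legend()
    plt.show()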
TensorFlow: TensorFlow is an end-to-end open source platform for machine learning. It has
a comprehensive, flexible ecosystem of tools, libraries and community resources that lets
researchers push the state-of-the-art in ML and developers easily build and deploy ML powered
applications.
OS: The OS module in Python provides functions for interacting with the operating system. It comes under Python's standard utility modules and provides a portable way of using operating-system-dependent functionality.
JSON: Python has a built-in package called json, which can be used to work with JSON data.
Convolutional Neural Network (CNN): The CNN algorithm is efficient at recognition and
highly adaptable. It’s also easy to train because there are fewer training parameters, and is
scalable when coupled with backpropagation.
The aim of this dataset is to help with the difficult task of detecting ships in satellite images. The dataset is also distributed as a JSON-formatted text file, shipsnet.json. The loaded object contains data, labels, scene_ids, and locations lists.
labels: Valued 1 or 0, representing the “ship” class and “no-ship” class, respectively.
scene_id: The unique identifier of the PlanetScope visual scene the image chip was extracted from. The scene_id can be used with the Planet API to discover and download the entire scene.
Longitude_latitude: The longitude and latitude coordinates of the image centre point, with
values separated by a single underscore.
The pixel value data for each 80x80 RGB image is stored as a list of 19200 integers within
the data list. The first 6400 entries contain the red channel values, the next 6400 the green, and
the final 6400 the blue. The image is stored in row-major order, so that the first 80 entries of
the array are the red channel values of the first row of the image.
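Given this channel-planar, row-major layout, a single image can be rebuilt from its 19200 values roughly as follows (a sketch; the all-zero list stands in for one real entry of the data list):

    import numpy as np

    pixels = [0] * 19200                                        # stand-in for one entry of the data list
    img = np.array(pixels, dtype=np.uint8).reshape(3, 80, 80)   # (channel, row, col): red, green, blue planes
    img = img.transpose(1, 2, 0)                                # reorder to (80, 80, 3) for display
    print(img.shape)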
The “ship” class includes 1000 images. Images in this class are near-centred on the body of a
single ship. Ships of different sizes, orientations, and atmospheric collection conditions are
included.
The "no-ship" class includes 3000 images. A third of these are a random sampling of different
land cover features - water, vegetation, bare earth, buildings, etc. - that do not include any
portion of an ship. The next third are "partial ships" that contain only a portion of an ship, but
not enough to meet the full definition of the "ship" class. The last third are images that have
previously been mislabelled by machine learning models, typically caused by bright pixels or
strong linear features.
The objective of the problem is to help address the difficult task of identifying the presence of large ships in satellite images. Automating this process can be applied to many issues, including monitoring port activity levels and supply chain analysis. The goal is to identify ships in satellite images, i.e., to classify whether a ship is present or not.
Python has a built-in open() function to open a file. This function returns a file object, also called a handle, as it is used to read or modify the file accordingly.
We can specify the mode while opening a file. In mode, we specify whether we want to read,
write or append to the file. We can also specify if we want to open the file in text mode or
binary mode.
The default is reading in text mode. In this mode, we get strings when reading from the file.
Python has a built-in package called json, which can be used to work with JSON data. The contents of the json file are stored in the dataset variable.
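A sketch of loading the dataset, assuming shipsnet.json is in the working directory:

    import json

    with open("shipsnet.json", "r") as f:   # text mode and reading are the defaults
        dataset = json.load(f)              # contents of the json file go into the dataset variable

    print(dataset.keys())                   # data, labels, locations, scene_ids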
The data is loaded into a Pandas DataFrame, on which all the operations can be performed; it lets us access each and every row and column, and each and every value can be accessed through the DataFrame. Any missing or NaN values have to be cleaned.
Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values. The describe() method analyses both numeric and object series, as well as DataFrame column sets of mixed data types. The output will vary depending on what is provided.
For numeric data, the result's index will include count, mean, std, min and max, as well as the lower, 50th and upper percentiles. By default the lower percentile is the 25th and the upper percentile is the 75th. The 50th percentile is the same as the median.
For object data (e.g. strings or timestamps), the result’s index will include count,
unique, top and freq. The top is the most common value. The freq is the most common value’s
frequency. Timestamps also include the first and last items.
If multiple object values have the highest count, then the count and top results will be
arbitrarily chosen from among those with the highest count.
For mixed data types provided via a DataFrame, the default is to return only an analysis
of numeric columns. If the dataframe consists only of object and categorical data without any
numeric columns, the default is to return an analysis of both the object and categorical columns.
If include='all' is provided as an option, the result will include a union of attributes of each
type.
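For example, a sketch that builds the DataFrame from shipsnet.json and summarizes it (df is an assumed name):

    import json
    import pandas as pd

    with open("shipsnet.json") as f:
        df = pd.DataFrame(json.load(f))

    print(df.describe())                 # count, mean, std, min, 25%, 50%, 75%, max of the numeric labels column
    print(df["labels"].value_counts())   # number of ship (1) and no-ship (0) images
    print(df.isnull().sum())             # missing / NaN values per column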
Fig 5.2.2: Checking for total number of ship and non ship images
Observations:-
To know the image names from the main folder, the following steps are done accordingly.
2. Here a base directory is created so that joining the other paths becomes easy.
4. Using one of the image names, a sample image is displayed with the matplotlib package.
There are a number of schemes that have been developed to indicate the presence of
missing data in a table or DataFrame. Generally, they revolve around one of two strategies:
using a mask that globally indicates missing values, or choosing a sentinel value that indicates
a missing entry.
Feature selection is the process where you automatically or manually select those features which contribute most to the prediction variable or output in which you are interested. Having irrelevant features in your data can decrease the accuracy of the models and make your model learn based on irrelevant features. For the classification of images, the input should be a NumPy array or an image.
Here the data column can be considered as the input and the labels column can be considered as the output. The remaining columns are irrelevant.
In this dataset the locations and scene_ids columns are irrelevant, so these columns are to be dropped.
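A sketch of dropping those columns, assuming the DataFrame df built from shipsnet.json above (column names follow the dataset description):

    df = df.drop(columns=["locations", "scene_ids"])   # keep only the data and labels columns
    print(df.columns)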
In a CNN an image is defined in the form of an array, and the network classifies images based on that representation. Since the project data does not directly involve image files, the data should be in array format; as it is not yet in array format, it is converted to arrays before building the model. Here, the data column converted to NumPy array format is named items, and the labels column converted to NumPy array format is named output_labels.
When doing image classification, it is important to make sure you are using the correct size of images; otherwise you may get unexpected results or errors. Here, the shape of the items data is not in the correct format, so it is reshaped into the correct format before proceeding further.
Scaling is done only for the items array, where the image data is represented in NumPy array format. The items array is divided by 255 because every digital image is formed of pixels with values in the range 0-255, where 0 is black and 255 is white. A colour image contains three channels (red, green and blue), and all the pixels are still in the range 0-255. Since 255 is the maximum pixel value, rescaling by 1/255 transforms every pixel value from the range [0, 255] to [0, 1], which helps treat all images in the same manner.
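Putting the conversion, reshaping and scaling together (a sketch; df is the DataFrame prepared above, and items / output_labels follow the naming used in this report):

    import numpy as np

    items = np.array(df["data"].tolist(), dtype=np.float32)     # shape (N, 19200)
    output_labels = np.array(df["labels"], dtype=np.uint8)      # shape (N,)

    items = items.reshape(-1, 3, 80, 80).transpose(0, 2, 3, 1)  # -> (N, 80, 80, 3)
    items = items / 255.0                                       # rescale every pixel to [0, 1]
    print(items.shape, items.min(), items.max())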
A breakthrough in building models for image classification came with the discovery
that a convolutional neural network (CNN) could be used to progressively extract higher- and
higher-level representations of the image content. Instead of pre-processing the data to derive
features like textures and shapes, a CNN takes just the image's raw pixel data as input and
"learns" how to extract these features, and ultimately infer what object they constitute. CNN
receives an input feature map: a three-dimensional matrix where the size of the first two
dimensions corresponds to the length and width of the images in pixels. The size of the third
dimension is 3 (corresponding to the 3 channels of a colour image: red, green, and blue). The
CNN comprises a stack of modules, each of which performs three operations.
1. Convolution:
A convolution extracts tiles of the input feature map, and applies filters to them to
compute new features, producing an output feature map, or convolved feature (which may have
a different size and depth than the input feature map). Convolutions are defined by two
parameters:
Size of the tiles that are extracted (typically 3x3 or 5x5 pixels).
The depth of the output feature map, which corresponds to the number of filters that are
applied.
During a convolution, the filters (matrices the same size as the tile size) effectively slide
over the input feature map's grid horizontally and vertically, one pixel at a time, extracting each
corresponding tile.
In Fig 7.1.1, a 3x3 convolution of depth 1 is performed over a 5x5 input feature map, also of depth 1. There are nine possible 3x3 locations to extract tiles from the 5x5 feature map, so this convolution produces a 3x3 output feature map.
In Figure 7.1.1, the output feature map (3x3) is smaller than the input feature map (5x5). If you instead want the output feature map to have the same dimensions as the input feature map, you can add padding (blank rows/columns with all-zero values) to each side of the input feature map, producing a 7x7 matrix with 5x5 possible locations to extract a 3x3 tile.
For each filter-tile pair, the CNN performs element-wise multiplication of the filter
matrix and the tile matrix, and then sums all the elements of the resulting matrix to get a single
value. Each of these resulting values for every filter-tile pair is then output in the convolved
feature matrix (see Figures 7.1.2 and 7.1.3).
Fig 7.1.2: A 5x5 input feature map and 3x3 convolution of depth 1
In Figure:
Left: The 3x3 convolution is performed on the 5x5 input feature map.
For each filter-tile pair, the CNN performs element-wise multiplication of the filter
matrix and the tile matrix, and then sums all the elements of the resulting matrix to get a single
value. Each of these resulting values for every filter-tile pair is then output in the convolved
feature matrix.
During training, the CNN "learns" the optimal values for the filter matrices that enable
it to extract meaningful features (textures, edges, shapes) from the input feature map. As the
number of filters (output feature map depth) applied to the input increases, so does the number
of features the CNN can extract. However, the trade-off is that filters compose the majority of
resources expended by the CNN, so training time also increases as more filters are added.
Additionally, each filter added to the network provides less incremental value than the previous
one, so engineers aim to construct networks that use the minimum number of filters needed to
extract the features necessary for accurate image classification.
2. ReLU
Following each convolution operation, the CNN applies a Rectified Linear Unit (ReLU) transformation to the convolved feature, in order to introduce nonlinearity into the model. The ReLU function F(x) = max(0, x) returns x for all values of x > 0, and returns 0 for all values of x ≤ 0.
3. Pooling
After ReLU comes a pooling step, in which the CNN downsamples the convolved feature (to save on processing time), reducing the number of dimensions of the feature map while still preserving the most critical feature information. A common algorithm used for this process is called max pooling.
Max pooling operates in a similar fashion to convolution. We slide over the feature map and
extract tiles of a specified size. For each tile, the maximum value is output to a new feature
map, and all other values are discarded. Max pooling operations take two parameters:
Stride: the distance, in pixels, separating each extracted tile. Unlike with convolution, where
filters slide over the feature map pixel by pixel, in max pooling, the stride determines the
locations where each tile is extracted. For a 2x2 filter, a stride of 2 specifies that the max
pooling operation will extract all non-overlapping 2x2 tiles from the feature map (see Figure
7.1.4).
In Figure 7.1.4:
Left: Max pooling performed over a 4x4 feature map with a 2x2 filter and stride of 2.
Right: the output of the max pooling operation. Note the resulting feature map is now 2x2,
preserving only the maximum values from each tile.
At the end of a convolutional neural network are one or more fully connected layers
(when two layers are "fully connected," every node in the first layer is connected to every node
in the second layer). Their job is to perform classification based on the features extracted by
the convolutions. Typically, the final fully connected layer contains a softmax activation
function, which outputs a probability value from 0 to 1 for each of the classification labels the
model is trying to predict.
Preventing Overfitting
As with any machine learning model, a key concern when training a convolutional neural
network is overfitting: a model so tuned to the specifics of the training data that it is unable to generalize to new examples. Two techniques commonly used to prevent this are data augmentation and dropout regularization.
Data augmentation: artificially boosting the diversity and number of training examples by
performing random transformations to existing images to create a set of new variants. Data
augmentation is especially useful when the original training data set is relatively small.
Dropout regularization: Randomly removing units from the neural network during a training
gradient step.
As the model performs binary classification, the output Dense layer has a single unit and uses sigmoid as the activation function for the last dense layer. The sigmoid activation function gives an output for binary classification in the range 0 to 1; a value less than 0.5 is classified as 0 and a value greater than 0.5 is classified as 1.
Same padding: it applies padding to the input image so that the input image gets fully covered by the filter at the specified stride. It is called "same" because, for stride 1, the output size is the same as the input size.
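One possible Keras architecture consistent with this description (a sketch only, not the exact network of this report; the layer sizes are illustrative):

    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Conv2D(32, (3, 3), padding="same", activation="relu",
                      input_shape=(80, 80, 3)),                    # convolution + ReLU
        layers.MaxPooling2D((2, 2)),                               # max pooling with stride 2
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),                                       # dropout regularization
        layers.Dense(128, activation="relu"),                      # fully connected layer
        layers.Dense(1, activation="sigmoid"),                     # single unit for binary output
    ])
    model.summary()                                                # lists the trainable parameters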
Here, a total of 2,700,009 parameters are to be trained in the model. After building the model, compilation should be done.
Here we need to define how to calculate the loss or error. Since this is a binary classification problem, we can use binary_crossentropy. With the optimizer parameter, we pass how to adjust the weights in the network so that the loss gets reduced. There are many options that can be used, and here I used the Adam optimizer. Finally, the metrics parameter is used to estimate how good our model is; here we use accuracy.
Adam optimizer:
Adam optimization is a stochastic gradient descent method that is based on adaptive estimation
of first-order and second-order moments.
Binary crossentropy:
Binary crossentropy is a loss function that is used in binary classification tasks. These are tasks
that answer a question with only two choices (yes or no, A or B, 0 or 1, left or right).
Accuracy metric:
This metric creates two local variables, total and count that are used to compute the frequency
with which y_pred matches y_true. This frequency is ultimately returned as binary accuracy:
an idempotent operation that simply divides total by count.
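Compiling the model as described above might look like this (a sketch, continuing from the model defined earlier):

    model.compile(loss="binary_crossentropy",   # loss for two-class problems
                  optimizer="adam",             # adaptive estimates of first- and second-order moments
                  metrics=["accuracy"])         # fraction of correct predictions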
Splitting the data: after the pre-processing is done, the data is split into training and validation sets.
In deep learning, in order to assess the performance of the classifier, you train the classifier using the training set and then test its performance on the unseen validation set. An important point to note is that during training the classifier only uses the training set. The validation set must not be used during training the classifier; it is only used when testing the classifier.
Training set - a subset used to train the model (the model learns patterns between input and output).
Validation set - a subset used to test the trained model (to check whether the model has learnt correctly).
The amount or percentage of the split can be specified as required. First we need to identify the input and output variables and separate the input set from the output set.
An epoch is a term used in machine learning that indicates the number of passes over the entire training dataset the machine learning algorithm has completed. Datasets are usually grouped into batches (especially when the amount of data is very large). Some people use the term iteration loosely and refer to putting one batch through the model as an iteration. Choosing a suitable number of epochs helps the model attain good accuracy. Here, the data has been split into 80% training data and 20% validation data by using the validation_split parameter.
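Training with an 80/20 split via validation_split might look like the following sketch (the number of epochs and the batch size are illustrative; items and output_labels are the arrays prepared earlier):

    history = model.fit(items, output_labels,
                        epochs=10,              # passes over the entire training data
                        batch_size=32,          # examples per gradient step
                        validation_split=0.2)   # last 20% held out for validation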
Model validation is the process of evaluating a trained model on a held-out dataset. This provides a measure of the generalization ability of the trained model. Here I provide a step-by-step approach to complete the first iteration of model validation in minutes. The model is validated after the completion of training and testing, checking the accuracy scores as the metric to validate the model.
By observing the above graph, the training accuracy and the validation accuracy do not show much variation from each other.
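The accuracy curves referred to above can be drawn from the History object returned by fit (a sketch):

    import matplotlib.pyplot as plt

    plt.plot(history.history["accuracy"], label="training accuracy")
    plt.plot(history.history["val_accuracy"], label="validation accuracy")
    plt.xlabel("epoch")
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()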
We have a method called predict; using this method we predict the output for a given input and check whether the model is predicting correctly or not. If the model is predicting correctly then the model is good; otherwise some changes should be made. Now, I am trying to predict an unknown image which is not present in the training or validation set.
The image below is an unknown image with a different size. The model should now predict whether the image contains a ship or not. In order to predict on the new image, we need to resize and scale it so that the image is classified without any errors.
Firstly, we import image from TensorFlow, then we load the image by passing the path of the image we are predicting on. After loading, we convert the image to array format and then check the shape and type of the image; this is used while resizing the image we are predicting on, to convert it into the required shape. Next, we apply scaling on the image and, to get the dimensions right, expand_dims enters the process; finally we check the shape and observe that we got the desired result.
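A sketch of these prediction steps, assuming the trained model from earlier and a hypothetical file name ship_test.jpg:

    import numpy as np
    from tensorflow.keras.preprocessing import image

    img = image.load_img("ship_test.jpg", target_size=(80, 80))  # resize to the training image size
    arr = image.img_to_array(img) / 255.0                        # scale pixel values to [0, 1]
    arr = np.expand_dims(arr, axis=0)                            # add a batch dimension -> (1, 80, 80, 3)
    prob = model.predict(arr)[0][0]
    print("ship" if prob > 0.5 else "no-ship", prob)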
CONCLUSION
It is concluded that, after thorough model building, the CNN gives 94.75% accuracy. The model achieves good accuracy on unknown data, detecting ship images and non-ship images correctly, which leads to a clear understanding of how the model parameters should be defined to get the best accuracy without underfitting or overfitting. Thus, it arrives at a solution for the problem statement of identifying ships through satellite images.
REFERENCES
● https://en.wikipedia.org/wiki/Deep_learning
● https://en.wikipedia.org/wiki/Machine_learning
● https://www.kaggle.com/rhammell/ships-in-satellite-imagery
● https://www.tensorflow.org/tutorials/images/cnn
● https://towardsdatascience.com/building-a-convolutional-neural-network-cnn-in-keras