
Research Article

Image Classification Using Deep Learning and Machine Learning

Nadim Patel, Pavan Holge, Abhishekh Pawar, and Shubham Gaikwad


Traditional artificial neural networks and machine learning methods not only struggled to meet the processing needs of massive images in feature extraction and model training but also had low efficiency and low classification accuracy when applied to image classification. Therefore, this paper proposed a deep learning model for image classification, which aimed to provide a foundation and support for the classification and recognition of large image datasets. Firstly, based on an analysis of the basic theory of neural networks, this paper expounded the different types of convolutional neural networks and the basic process of their application in image classification. Secondly, based on the existing convolutional neural network model, noise reduction and parameter adjustment were carried out in the feature extraction process, and an image classification deep learning model was proposed based on the improved convolutional neural network structure. Finally, the structure of the deep learning model was optimized to improve the classification efficiency and accuracy of the model. In order to verify the effectiveness of the proposed deep learning model in image classification, the relationship between the classification accuracy of several common network models and the number of iterations was compared through experiments. The results showed that the proposed model was better than the other models in classification accuracy. At the same time, the classification accuracy of the deep learning model before and after optimization was compared and analyzed using the training set and the test set. The results showed that the accuracy of image classification was greatly improved after the proposed model had been optimized to a certain extent.
1. Introduction

Not only is the manual selection of image features time-consuming and laborious, but the feature quality also depends to a certain extent on professional knowledge and practical experience, which limits the application of manual feature selection methods [1]. With the advent of the era of big data and artificial intelligence, the traditional manual way of obtaining image information has become unable to meet the needs of applications in related fields. In recent years, methods of obtaining images by computer have made some progress, but for massive image information processing, traditional computer vision is still inefficient. For example, traditional machine vision methods have low accuracy in detecting the low-level and high-level features of images [2]. Classification and processing of massive image information is a prerequisite for in-depth mining of image information. It is also of great significance for the expansion of computer vision applications in the field of image processing. At present, computer-based image classification has gradually been applied to agriculture, industry, aerospace, and other fields. Research on image classification mainly includes the efficient and accurate classification of large-scale image collections and the accurate classification of semantically similar image features.

In order to recognize and classify an image, we need to use an image acquisition tool to obtain the original image information, then clean the image data and extract and filter the features, and finally use a machine learning algorithm to realize the image classification. With the deepening application of machine vision in many fields, people can use computer vision technology to quickly carry out processing operations on image information, such as noise reduction and feature extraction [3]. Research on image classification based on machine learning has made some progress. From the adoption of manual recognition to the continuous application of computer vision technology in image classification, scholars have made some achievements in the research of image classification and recognition.

Machine learning is an important branch of artificial intelligence. Although machine learning has experienced half a century of development, there are still some unsolved problems, for example, complex image understanding and recognition, natural language translation, and recommendation systems [4]. Deep learning is an important branch developed on the basis of machine learning. It makes full use of the hierarchical characteristics of artificial neural networks and biological neural systems to process information and obtains high-level features by learning low-level features and combining them, so as to realize image classification or regression. Different from traditional machine learning, deep learning uses a multilayer neural network to automatically learn the image and extract its deep-seated features. Different deep learning models can be formed according to different ways of learning and combining features. However, the accuracy of image classification is not high and the operating efficiency of existing deep learning models is low. Therefore, based on the existing basic theory of convolutional neural networks, this paper establishes a deep learning model for image classification by improving the structure of the convolutional neural network, which provides a basis for image classification under complex environmental conditions.
2. Related Works

Image classification provides an important basis for image depth processing and for the application of computer vision technology in related fields. Traditional image classification mainly goes through different stages, such as image preprocessing, feature extraction, classifier construction, and learning and training [5]. Traditional image classification methods mainly use extracted basic image features to realize classification, which can provide a basis for the computer to further obtain the semantic information of images. Traditional image classification generally uses image color, texture, and other information to calculate image features and uses support vector machines and logistic regression to realize the classification [6]. The results of image classification not only depend to a great extent on the extracted features but are also affected by knowledge and experience in the relevant fields.

Not only are manually acquired features difficult to apply to image classification, but a lot of time is also spent analyzing the feature data. At the same time, traditional machine learning cannot be applied to the processing of large datasets, and it is difficult to optimize feature design, feature selection, and model training, which makes the classification effect of the model poor. Therefore, image classification methods based on traditional machine learning are limited in many application fields [7]. Research shows that, because texture, shape, and color features can be used for image classification and recognition, low-level basic features can serve as the basis of image classification. Traditional image classification methods generally use single feature extraction or feature combination and take the extracted features as the input of a support vector machine. In recent years, some progress has been made in image classification using artificial neural network classifiers. In order to improve the accuracy of image classification, we can focus on the standardized design of low-level features such as texture, shape, and color.

Deep learning realizes the training of large-scale datasets through a multilevel network model and adopts layer-by-layer feature extraction to obtain the high-level features of the image. Not only is the deep learning network model used to extract the basic features of the image, but it can also obtain the deep features of the image through multiple hidden layers. Compared with traditional machine learning methods, the features obtained by deep learning are not only more accurate but also more conducive to image classification. In the process of image recognition and classification, the way features are learned and combined is mainly determined by the deep learning model [8]. At present, the commonly used deep learning models are the sparse model, the restricted Boltzmann machine model, and the convolutional neural network model. Although these models differ somewhat in feature extraction, they are similar in image classification and recognition: they all go through the steps of image information input, data preprocessing, feature extraction, model training, and classification output.

In image classification, some scholars have carried out a lot of research work on image feature representation and classifier selection. For example, deep learning models based on feature representation can be effectively applied to the recognition and classification of various images. Some scholars use deep convolutional neural networks (DCNNs) to extract image features in depth and apply them to the large-scale ImageNet dataset [9]. Experiments show that such models can effectively classify large image datasets. In addition, the deep learning model can effectively learn and describe image features. For example, the deep learning model can better describe hierarchical features through unsupervised learning, and the features extracted by the model not only have strong expressive ability but also improve the efficiency of image classification. A large number of research results show that, with the deepening of research on image classification methods in related fields, deep learning models have gradually replaced traditional manual feature extraction and machine learning methods and will receive wide attention from scholars in the field of image recognition and classification [10].

3. Fundamentals of Image Classification Algorithm

3.1. Basic Theory of Neural Network. The traditional neural network, referred to as the artificial neural network (ANN), was a hot spot in the early field of artificial intelligence [5]. An artificial neural network mainly uses the neurons of the network model to abstract the characteristics of external things so that a computer can use them to complete information processing.

An artificial neural network generally establishes the corresponding network structure according to how its neurons are constructed. A neural network is an operational model composed of several different nodes, or neurons, connected with each other. Each node in the model is a processing function, and the connections between different nodes use weights to represent the memory ability of the artificial neural network. The output of the neural network depends on the connection form, the weight values, and the excitation functions of the different nodes. At the same time, the neural network model is mainly constructed according to some algorithm or function to express a specific logical operation.

A basic neural network model usually includes an information input layer, a hidden layer, and a calculation result output layer. Each layer can contain several neurons [11]. A neuron represents a transformation or operation, which is completed by the activation function of the neuron. Two adjacent layers of neurons are connected to each other, as shown in Figure 1.
Figure 1: Connections between neurons in different layers.

As can be seen from Figure 1, the neural network model includes 11 neurons: 3 in the input layer, 5 in the hidden layer, and 3 in the output layer. The structure belongs to a two-layer neural network model, where W1 and W2 are the weight matrices of the hidden layer and the output layer, respectively.
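To make the structure in Figure 1 concrete, the following minimal NumPy sketch (not part of the original paper) performs a forward pass through a 3-5-3 network with weight matrices W1 and W2; the sigmoid hidden activation and the random weight values are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes taken from Figure 1: 3 input, 5 hidden, and 3 output neurons.
n_in, n_hidden, n_out = 3, 5, 3

# W1 and W2 are the weight matrices of the hidden layer and the output layer.
W1 = rng.normal(size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))
b2 = np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(W1 @ x + b1)   # hidden-layer activations
    return W2 @ h + b2         # output-layer values (before any classifier)

print(forward(np.array([0.2, -0.5, 1.0])))
```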
Deep learning is a part of machine learning. It is widely used in natural language recognition and in image detection and classification. Moreover, deep learning comes from the theory of artificial neural networks. By imitating the human brain's hierarchical processing of information, different levels of neural networks are established. Deep learning effectively extracts multilevel feature information by simulating the human brain, so as to obtain the key feature information of images, text, and other data. Deep learning mainly describes the characteristics of a specific object through hierarchical processing of a large amount of edge feature information. It is a process from low-level feature extraction to high-level feature combination. As an important method of machine learning, deep learning is an effective way to process big data and obtain abstract features by using a neural network model.

Deep learning is based on a multilayer deep neural network model. Following the connection pattern of human brain neurons, the sample features are processed in the different layers of the model, and the deep features of the sample data are obtained in turn. Similar to the deep neural network, the artificial neural network has a hierarchical structure. The model is composed of multilayer perceptrons, including an input layer, a hidden layer, and an output layer. Different from the deep neural network, the artificial neural network model only contains two to three layers of forward neural network, and the number of neurons in each layer is small, so its processing ability on large datasets is limited. Because the deep neural network model contains many layers and each layer contains a large number of neurons, it not only can realize the abstract expression of data but also has a strong learning ability. Compared with traditional machine learning methods, the deep learning model does not need to rely on manual design and feature extraction; this not only compensates for the limits of manual work and prior knowledge but also avoids the preference problem in feature extraction. At the same time, the deep learning model can obtain representative high-level features through the organic combination of low-level features [12]. As shown in Figure 2, the working processes of deep learning and traditional machine learning are compared.

Figure 2: Comparison between the deep learning model and the traditional machine learning algorithm. (a) Deep learning algorithm: image data, basic feature extraction, multilayer feature extraction, weight training, and result output. (b) Traditional machine learning algorithm: image data, artificial feature extraction, weight training, and result output.

In the neural network model, the activation function is used to perform a nonlinear operation on the input data of the neurons in order to extract effective feature information from the original input data. The activation function is a nonlinear function. As the number of layers of the neural network model increases, the most effective feature information can be obtained after many iterations of data training.

In order to realize the classification of features, the Softmax function is often used as the activation function in the neural network model and is applied in the output layer of the model [13]. The Softmax function is generally used as a classifier. The calculation formula is as follows:

S_k = \frac{\exp(R_k)}{\sum_{j=1}^{n} \exp(R_j)}. (1)

In the above formula, n is the number of neurons in the current layer and R_k is the nonlinear transformation value of the k-th neuron in this layer. Softmax is a classifier that can output different feature categories, and the value of each neuron contained in the Softmax output can be considered as the probability of the corresponding category. The calculation process of the Softmax function is shown in Figure 3.

Figure 3: Schematic diagram of the Softmax function calculation (inputs R1, R2, ..., Rk, ..., Rn are mapped to outputs S1, S2, ..., Sk, ..., Sn).
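As a minimal illustration of formula (1), the following NumPy sketch (added here for clarity; the array values are invented) computes the Softmax output for the transformation values R_k of one layer:

```python
import numpy as np

def softmax(r):
    """Softmax of formula (1): S_k = exp(R_k) / sum_j exp(R_j)."""
    r = np.asarray(r, dtype=float)
    # Subtracting the maximum does not change the result but avoids overflow.
    e = np.exp(r - r.max())
    return e / e.sum()

# Example: nonlinear transformation values R_k of the output layer.
R = [2.0, 1.0, 0.1]
S = softmax(R)
print(S, S.sum())  # probabilities for each category; they sum to 1
```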

3.2. Basic Theory of Convolution Neural Network. The convolutional neural network (CNN) is a typical network structure in deep learning [14]. Different from traditional machine learning, the convolutional neural network is better suited to image and time series data processing, especially image classification and language recognition. The basic structure of the convolutional neural network is shown in Figure 4.

Figure 4: Structure diagram of a convolutional neural network (image input, input layer, convolution layer, pooling layer, fully connected layer, and output layer).

A convolutional neural network generally includes three different types of processing layers: the convolution layer, the pooling layer, and the fully connected layer. Among them, the feature extraction task is completed by the convolution layer, and the pooling layer is used for feature mapping. The fully connected layer is similar to the general neural network structure: the nodes in this layer are not connected to each other but are fully connected to the nodes of the previous layer. In addition, like other neural networks, convolutional neural networks also have a data input layer and a result output layer.

The calculation task of the convolutional neural network is mainly completed by the convolution layer, and the convolution kernel in the convolution layer is the core of the convolutional neural network model. The convolution layer uses the convolution kernel to convolve the input image and extract its characteristic information. The images processed by the convolution operation gradually become smaller, and the pixels at the edge of the image have little effect on the output results.

As shown in Figure 5, assuming that the original input image is a 3 × 3 matrix, the original image is convolved with a convolution kernel of size 2 × 2, and the corresponding feature map is output.

Figure 5: Schematic diagram of the convolution operation: a 3 × 3 image with rows (a1, a2, a3), (b1, b2, b3), (c1, c2, c3), convolved with the 2 × 2 kernel ((1, 3), (5, 7)), yields the 2 × 2 feature map ((a1+3a2+5b1+7b2, a2+3a3+5b2+7b3), (b1+3b2+5c1+7c2, b2+3b3+5c2+7c3)).

Generally, there is a strong correlation between adjacent pixels in an image. The convolution kernel mainly extracts features from a local area of the image and sends the extracted local features to higher layers for integrated processing. Because the low-level features of an image are independent of their position, the same convolution kernel can be used to extract the relevant features, and the shared-weight property of the convolution kernel also reduces the number of parameters of the neural network, which improves the training efficiency of the network model.
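The operation in Figure 5 can be sketched in a few lines of NumPy. The snippet below (an illustration, not the paper's code) slides the 2 × 2 kernel from Figure 5 over a placeholder 3 × 3 image, using the cross-correlation convention commonly adopted in CNN frameworks:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image ('valid' mode, stride 1) and
    return the resulting feature map, as in Figure 5."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])          # placeholder 3 x 3 input values
kernel = np.array([[1., 3.],
                   [5., 7.]])             # the 2 x 2 kernel shown in Figure 5
print(conv2d_valid(image, kernel))        # 2 x 2 feature map
```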
For complex images, in order to reduce the amount of parameter training in the model, the pooling layer in the convolutional neural network can be used to reduce the size of the feature map. During pooling, the depth of the feature map remains unchanged. The pooling operation generally includes max pooling and average pooling [15], as shown in Figure 6.

Figure 6: Schematic diagram of pooling layer operation types: a 4 × 4 feature map with rows (3, 2, 6, 5), (2, 5, 3, 2), (1, 4, 4, 6), (2, 5, 2, 8) is reduced to ((3, 4), (3, 5)) by average pooling and to ((5, 6), (5, 8)) by max pooling.
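A minimal sketch of the two pooling modes in Figure 6 is given below (illustration only); the 4 × 4 input values are those shown in Figure 6, and non-overlapping 2 × 2 windows reproduce its average-pooling and max-pooling outputs:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window, as in Figure 6."""
    h, w = x.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h, size):
        for j in range(0, w, size):
            window = x[i:i + size, j:j + size]
            out[i // size, j // size] = window.max() if mode == "max" else window.mean()
    return out

x = np.array([[3., 2., 6., 5.],
              [2., 5., 3., 2.],
              [1., 4., 4., 6.],
              [2., 5., 2., 8.]])           # the 4 x 4 map shown in Figure 6
print(pool2d(x, mode="average"))           # [[3. 4.] [3. 5.]]
print(pool2d(x, mode="max"))               # [[5. 6.] [5. 8.]]
```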

3.3. Convolution Neural Network Model. Existing image detection and recognition models, such as ResNet, Mask R-CNN, and Faster R-CNN, are usually based on common network models [16]. LeNet is the most basic convolutional neural network model [9]. By transforming the LeNet model, the LeNet-5 model was established to classify ordinary images. Because the LeNet-5 model is not deep enough, it cannot extract enough image features during model training and therefore cannot be applied to the classification of complex images. For this reason, the AlexNet model was put forward based on the LeNet structure; it applied the convolutional neural network to the processing of complex images and provided a theoretical basis for applying deep learning models in the field of computer vision [17]. The network structure of AlexNet is shown in Figure 7.

Figure 7: Schematic diagram of the AlexNet network model structure: a 227 × 227 × 3 image input, Conv1 (11 × 11) with 3 × 3 max pooling, Conv2 (5 × 5) with 3 × 3 max pooling, Conv3, Conv4, and Conv5 (3 × 3) with 3 × 3 max pooling, fully connected layers of size 9216, 4096, and 4096, and a Softmax function.

AlexNet is a network structure with 8 layers. The model includes 5 convolution layers and 3 fully connected layers. The model uses the ReLU function as the activation function to avoid the gradient dispersion phenomenon caused by the large number of layers of the network model. In order to reduce network training time, AlexNet uses multiple GPUs for training. In order to suppress neurons with a weak response, AlexNet uses an LRN (local response normalization) layer to establish a competition mechanism among neurons, so that values with a strong response are amplified, which increases the generalization ability of the model. In addition, in order to prevent overfitting during forward and back propagation, the dropout mechanism is adopted in the fully connected layers of the AlexNet model, so that the outputs of some hidden-layer neurons are set to 0, avoiding complex interactions between neurons.

At present, VggNet is a widely used deep convolutional neural network model. Compared with other models, VggNet not only has better generalization ability but can also be effectively used for the recognition of different types of images [11]. For example, convolutional neural networks such as FCN, U-Net, and SegNet are based on the VggNet model. In recent years, the Vgg-16 and Vgg-19 networks have been the commonly used VggNet models [18]. The network structure of Vgg-16 is shown in Table 1.

Table 1: The network structure of Vgg-16.

Layer type              Convolution kernel size    Feature map size    Number of kernels
Input layer             —                          448 × 448           —
Convolution layer       3 × 3                      448 × 448           128
Pool layer              3 × 3                      448 × 448           —
Convolution layer       3 × 3                      224 × 224           256
Pool layer              2 × 2                      112 × 112           —
Convolution layer       3 × 3                      112 × 112           768
Pool layer              2 × 2                      56 × 56             —
Full connection layer   4096                       —                   —
Full connection layer   4096                       —                   —
Full connection layer   1000                       —                   —
Softmax classifier      —                          —                   —

The Vgg-16 network structure has 16 layers in total, excluding the pooling layers and the Softmax layer. The convolution kernel size is 3 × 3, the pooling window size is 2 × 2, and the pooling layers adopt max pooling with a stride of 2. The Vgg-16 network uses convolution blocks instead of single convolution layers, in which each convolution block contains 2-3 convolution layers, which helps reduce the number of network model parameters. At the same time, the Vgg-16 network adopts the ReLU activation function to enhance the training ability of the model. Although the Vgg model has more layers than the AlexNet model, its convolution kernels are smaller than those of the AlexNet model. Therefore, the number of training iterations of the Vgg model is smaller than that of the AlexNet model.
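Since the layer sizes listed in Table 1 (as reproduced here) do not match the canonical Vgg-16 configuration, the following PyTorch sketch only illustrates the general VggNet idea described above: blocks of 3 × 3 convolutions with ReLU followed by 2 × 2 max pooling with stride 2. The channel counts, number of blocks, and input size are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

def vgg_block(in_ch, out_ch, n_convs=2):
    """A VggNet-style block: n_convs 3x3 convolutions with ReLU, then 2x2 max pooling."""
    layers = []
    for _ in range(n_convs):
        layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
        in_ch = out_ch
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# Illustrative stack of blocks; channel counts are assumptions, not the paper's values.
features = nn.Sequential(
    vgg_block(3, 64),
    vgg_block(64, 128),
    vgg_block(128, 256),
)
x = torch.randn(1, 3, 224, 224)            # one RGB image
print(features(x).shape)                   # torch.Size([1, 256, 28, 28])
```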
4. Image Classification Model Based on Improved CNN

4.1. Improved Image Classification Model Framework. Because image classification algorithms are usually used in systems with high real-time requirements, they need to take real-time performance into account. With a complex neural network model, image classification consumes a lot of time. Therefore, this paper simplifies the VggNet model and takes it as the basis of the image classification model.

Considering the distribution characteristics of the datasets used for model training, a typical dataset can be selected to initialize the weights of the model. When the model has been pretrained and reaches a certain accuracy, the number of nodes in the Softmax layer is reduced by a factor of ten, and then the dataset is used for weight training. Considering that the data processed by the model may be affected by various kinds of noise, a denoising autoencoder is added to the model to eliminate noise interference, and the existing dataset is extended through data augmentation to enhance the generalization ability of the model.

Considering that the image classification algorithm needs to meet certain real-time requirements, the corresponding image classification model is established and optimized based on the VggNet model. In particular, an algorithm combining the convolutional neural network and the denoising autoencoder can be used. Because there may be an overfitting problem in image classification, it can be mitigated by data augmentation. Compared with other algorithms, this classification algorithm has a certain generalization performance when the amount of data is small. In addition, the algorithm also adds a denoising autoencoder, which can effectively reduce the impact of data noise on the performance of the model, so as to ensure that the model has good generalization ability. Since the improved algorithm is based on the VggNet network, the training time of the model may increase [19]. The improved image classification algorithm is shown in Figure 8.

Figure 8: Working diagram of the improved image classification algorithm: weight initialization, hidden-layer weight training, weight update and forward propagation; data preprocessing, feature extraction, Softmax classification, back propagation, and fine tuning; noise reduction, depth feature extraction, and training parameter adjustment repeated until the constraints are met, followed by output of the classification results.

In order to solve the problem of automatic noise reduction for complex image structures, the normalized encoder network is used for classification in this paper. Based on the existing convolutional neural network model, the denoising autoencoder and the sparse autoencoder are organically combined, and the original input image information is normalized by the sparse autoencoder. Then, the improved convolutional neural network model is used to extract the image feature information, and the Softmax classifier is used to classify the features [14].

When the improved convolutional neural network model is used to classify images, it is necessary to preprocess the images, for example, by noise reduction and grayscale conversion, to select a certain number of training and test samples from the dataset, and then to take the training set as the input of the model after unsupervised learning processing. Secondly, the hidden layer of the denoising autoencoder is used to encode and decode the input, and the processing results are passed to the sparse autoencoder of the next layer for normalization. The data is trained layer by layer through the hidden layers of the sparse autoencoder, and finally the training results of the sparse autoencoder are passed to the Softmax classifier. In order to improve the classification accuracy, gradient descent can be used to strengthen the training of the classifier parameters and thereby improve the performance of the image classification deep learning model. Finally, the network model is verified by using the image test set, and the effectiveness of the image classification method is evaluated according to the classification results output by the model.

The improved convolutional neural network model can overcome the problem that the traditional neural network is limited to only some features in image classification. Through the normalization of the sparse autoencoder, the overfitting phenomenon of the model in data processing can be avoided, and more abstract and representative features can be obtained by using the hidden layers of the sparse autoencoder to train the data layer by layer. The improved model adopts the Softmax classifier, which can make the classification result closer to the real value.
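The paper does not report the dimensions or noise level of its denoising autoencoder, so the following PyTorch sketch is only a schematic of the idea described above: the input is corrupted with noise, encoded and decoded, and trained by gradient descent to reconstruct the clean input, with the hidden code serving as the feature representation passed on to later stages. All layer sizes and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Schematic denoising autoencoder: corrupt the input, encode, decode,
    and learn to reconstruct the clean input (layer sizes are assumptions)."""
    def __init__(self, n_in=784, n_hidden=256, noise_std=0.1):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(n_hidden, n_in), nn.Sigmoid())

    def forward(self, x):
        noisy = x + self.noise_std * torch.randn_like(x)   # corrupt the input
        code = self.encoder(noisy)                          # hidden representation
        return self.decoder(code), code                     # reconstruction + features

model = DenoisingAutoencoder()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # gradient descent, as in the text
loss_fn = nn.MSELoss()

x = torch.rand(32, 784)                                     # a batch of flattened images
for _ in range(5):                                          # a few reconstruction steps
    recon, _ = model(x)
    loss = loss_fn(recon, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(float(loss))
```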
The improved deep learning network model is mainly divided into two stages: training and testing. The training stage is mainly used to build an effective image classification model, and the testing stage is mainly used to evaluate and analyze the model according to the experimental classification results. Figure 9 shows the workflow of the improved deep learning network model.

Figure 9: Working flow diagram of the improved deep learning network model: the training set is preprocessed and used to train the network model and obtain its weights; forward propagation then produces the output classification results, which are used for testing and analysis.

4.2. Image Classification Model Optimization. It is known from existing research that the convolutional neural network model can be optimized in terms of data augmentation and the adjustment of training methods, and that the optimization of a convolutional neural network model depends on the type of model used. For example, for a deep learning model with more layers, the training parameters can be optimized.
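The paper mentions extending the dataset through data augmentation but does not list the operations used. The torchvision sketch below shows one common way such augmentation is set up; the specific transforms and parameter values are assumptions, not the paper's pipeline.

```python
from torchvision import transforms

# An assumed augmentation pipeline for the training images; the paper does not
# specify which enhancement operations it applies.
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),                       # random crop and rescale
    transforms.RandomHorizontalFlip(),                       # mirror with probability 0.5
    transforms.ColorJitter(brightness=0.2, contrast=0.2),    # mild lighting changes
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# The test images are only resized and normalized, without augmentation.
test_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```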


According to the structure of the convolutional neural network, the convolution layer is used to extract features, and the size of the convolution kernel determines the extraction quality of the image features. From the perspective of image composition, adjacent pixels generally form the edge lines of the image. Several edge lines form the image texture, and the image texture is combined into several local patterns. The local pattern is the basic element of the image. Through the convolution layers of the network model, different types of features can be extracted and the local patterns of the image can be formed. When the convolution kernel is smaller, the convolution layer extracts more features, but the model may overfit. On the contrary, the larger the convolution kernel is, the fewer features are extracted by the convolution layer, and the worse the image classification effect may be. Therefore, a reasonable choice of convolution kernel size can improve the accuracy of image classification.

Because the convolutional neural network model mainly extracts image features layer by layer through different convolution layers, the number of convolution layers affects the feature extraction quality of the model to a certain extent. Similar to the number of convolution kernels contained in a convolution layer, the more convolution layers there are, the finer the features obtained by the model classifier, which may lead to overfitting, while the fewer convolution layers there are, the coarser the features obtained by the model classifier, which may lead to a decline in image classification accuracy. Therefore, optimizing the number of convolution layers can improve the classification accuracy of the model.

In order to improve the accuracy of image classification and recognition, the deep learning model proposed in this paper needs to be optimized. Firstly, a smaller convolution kernel is selected in the first convolution layer in order to extract more image feature details. Secondly, max pooling is adopted in the model to avoid the overfitting problem. As shown in Figure 10, the optimized image classification model consists of three convolution layers, in which the convolution kernel of each successive convolution layer decreases in size. At the same time, after each convolution layer, the features are processed by the ReLU activation function, and the generated features are used as the input of the max pooling layer. The model adopts three fully connected layers, takes the output of the last fully connected layer as the input of the Softmax classifier, and then generates the classification result of the image.

Figure 10: Structure diagram of the optimized image classification model: image input, Conv1 (9 × 9) with 3 × 3 max pooling, Conv2 (5 × 5) with 3 × 3 max pooling, Conv3 (3 × 3) with 3 × 3 max pooling, fully connected layers FC1, FC2, and FC3, and the Softmax function.
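A PyTorch sketch of the structure in Figure 10 is given below. The decreasing kernel sizes (9 × 9, 5 × 5, 3 × 3), the 3 × 3 max pooling after each convolution, the three fully connected layers, and the final Softmax follow the figure as reproduced here; the channel counts, fully connected widths, input size, and the 17 output classes (one per flower category in the later experiments) are assumptions.

```python
import torch
import torch.nn as nn

# Sketch of the optimized model in Figure 10 (layer widths are assumptions).
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=9), nn.ReLU(), nn.MaxPool2d(3),   # Conv1 + max pool
    nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(3),  # Conv2 + max pool
    nn.Conv2d(64, 128, kernel_size=3), nn.ReLU(), nn.MaxPool2d(3), # Conv3 + max pool
    nn.Flatten(),
    nn.LazyLinear(512), nn.ReLU(),     # FC1 (LazyLinear infers the flattened size)
    nn.Linear(512, 256), nn.ReLU(),    # FC2
    nn.Linear(256, 17),                # FC3: one output per assumed flower category
    nn.Softmax(dim=1),                 # Softmax classifier
)

x = torch.randn(1, 3, 224, 224)
print(model(x).shape)                  # torch.Size([1, 17])
```

When training such a network with a cross-entropy loss, the final Softmax layer is usually omitted and applied inside the loss function instead; it is kept here only to mirror the figure.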


The convolution layer parameters of the model include the size and number of convolution kernels. The first convolution layer is close to the image input layer and is mainly used to extract the basic features of the image, so the parameters of the first convolution layer have a great influence on the features. In order to facilitate further processing of the features in the subsequent convolution layers, a smaller convolution kernel needs to be used to extract attribute information such as the shadows, boundaries, and lighting of the image.

The convolution layer maps the obtained features through the activation function. Therefore, the optimized convolutional neural network model adopts the ReLU activation function [20], which can be expressed mathematically as

y(x) = \max(0, x). (2)

When the traditional convolutional neural network model uses the ReLU activation function to train features, it may lose useful feature information in the process of image classification. In order to prevent the loss of useful features during image classification, the existing ReLU activation function can be improved [20]. The optimized activation function can be expressed as

y(x_i) = \begin{cases} c_i x_i, & x_i < 0, \\ x_i, & x_i \ge 0. \end{cases} (3)

According to the improved activation function, when the input feature is less than zero, the negative value information in the feature map is retained, which also strengthens the learning of effective features.

The optimized convolutional neural model uses the Softmax function to classify the images, and the Softmax function uses a supervised learning algorithm to regress the features [21]. In the classification process, the category y of the image target can take M different values. If the image training set is (x_1, y_1), ..., (x_N, y_N), where x_i represents an image training sample, y_i is the image classification category, and y_i ∈ {1, 2, ..., M}, the cost function of the Softmax regression algorithm can be expressed as

r(\alpha) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{M} 1\{y_i = k\} \log \frac{\exp(\alpha_k^T x_i)}{\sum_{j=1}^{M} \exp(\alpha_j^T x_i)}. (4)

Accumulating the M categories in the cost function, the probability that training sample x_i belongs to category k can be obtained, which is expressed as

\lambda(y_i = k \mid x_i; \alpha) = \frac{\exp(\alpha_k^T x_i)}{\sum_{j=1}^{M} \exp(\alpha_j^T x_i)}. (5)
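Formulas (2)-(5) can be sketched directly in NumPy. In the snippet below (illustrative only), the coefficient c of formula (3) is an assumed constant because the paper does not report the value it uses, and labels are indexed from 0 rather than 1:

```python
import numpy as np

def relu(x):
    """Formula (2): y(x) = max(0, x)."""
    return np.maximum(0.0, x)

def improved_relu(x, c=0.01):
    """Formula (3): pass positive inputs through, scale negative inputs by c
    so that negative feature information is not discarded (c is an assumed value)."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 0, c * x, x)

def softmax_regression_cost(alpha, X, y):
    """Formula (4): negative log-likelihood of Softmax regression.
    alpha: (M, d) weight matrix, X: (N, d) samples, y: (N,) labels in {0, ..., M-1}."""
    scores = X @ alpha.T                          # values alpha_k^T x_i, shape (N, M)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(scores)
    probs = e / e.sum(axis=1, keepdims=True)      # formula (5): class probabilities
    return -np.mean(np.log(probs[np.arange(len(y)), y]))

print(improved_relu([-2.0, -0.5, 0.0, 1.5]))
rng = np.random.default_rng(0)
X, y = rng.normal(size=(6, 4)), np.array([0, 1, 2, 0, 1, 2])
print(softmax_regression_cost(rng.normal(size=(3, 4)), X, y))
```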
5. Experiment and Analysis

5.1. Selection of Datasets and Experimental Methods. In order to verify the effectiveness of the image classification deep learning model proposed in this paper, the Flower dataset provided by Oxford University was used in the experiment [22]. These images show variations in the proportion, shape, and lighting of different types of flowers, and the images of some flower categories vary greatly. The dataset contains 17 flower categories; each category contains 80 pictures, for a total of 1360 pictures. In the experiment, the dataset is randomly divided into three parts, used as the training set, validation set, and test set of the model. Some images from the Flower dataset are shown in Figure 11.

The experiment is based on Matlab, Python, and a deep learning framework. Through the preprocessing, feature extraction, training, and classification of the Flower dataset, the effectiveness and accuracy of the model in image classification are verified.

Figure 11: Partial images of the Flower dataset [22]; the categories shown include Fritillary, Dandelion, Lily of the Valley, Daisy, Daffodil, Cowslip, Tulip, Tiger Lily, Crocus, and Bluebell.

In the experiment, the classification accuracy is used to evaluate the robustness of the deep learning network model. Depending on the evaluation angle, classification accuracy generally includes the overall accuracy and the category classification accuracy [23]. The overall accuracy AC is the ratio of the number of correctly classified samples to the total number of test samples, while the category classification accuracy CCA_i is the ratio of the number of correctly classified samples of a category to the number of test samples of that category. The calculation formulas are as follows:

AC = \frac{tr}{tn}, \qquad CCA_i = \frac{tr_i}{tn_i}. (6)

In the above formulas, tr represents the number of correctly classified samples, tn denotes the number of test samples, tr_i indicates the number of correctly classified samples of type i, and tn_i expresses the number of test samples of type i.
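Formula (6) corresponds to the following small NumPy sketch (the label arrays are invented for illustration):

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """Formula (6): AC = tr / tn, the fraction of correctly classified test samples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(y_true == y_pred)

def category_accuracy(y_true, y_pred, category):
    """Formula (6): CCA_i = tr_i / tn_i for one category i."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    mask = y_true == category
    return np.mean(y_pred[mask] == category)

# Toy labels for three flower categories (values are illustrative only).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]
print(overall_accuracy(y_true, y_pred))                            # 0.75
print([category_accuracy(y_true, y_pred, c) for c in range(3)])    # per-category accuracy
```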
5.2. Results and Analysis. In order to facilitate comparative analysis, three common network models, LeNet, AlexNet, and VggNet, are selected for comparison in the experiment. According to the experimental results, the relationship between the classification accuracy and the number of iterations when using the different network models to classify flower images is shown in Figure 12. After 50 iterations, the accuracy of the LeNet model is 28%, that of the AlexNet model is 17%, and that of the VggNet model is 48%. When the number of iterations is 100, the accuracy of the LeNet model is 45%, that of the AlexNet model is 39%, and that of the VggNet model is 72%. In addition, when the VggNet model converges, its highest accuracy reaches 75%, while the highest accuracy of the LeNet model is 58% and the highest accuracy of the AlexNet model is 41%.

Figure 12: The relationship between the classification accuracy of the models and the number of iterations (50 to 500) for LeNet, AlexNet, and VggNet.

In order to test the effect of image classification after the model is optimized to a certain extent, the accuracy of flower image classification before and after model optimization is compared in the experiment, as shown in Figure 13. The comparison results show that, for the training dataset, the optimized model converges faster in the early stage of training and more slowly in the middle stage, while the two models are basically the same in the later stage of training. For the test dataset, the optimized model is superior to the nonoptimized model in terms of both convergence speed and image classification accuracy. Therefore, the model optimization method proposed in this paper can effectively improve the accuracy of image classification.
Figure 13: Comparison of accuracy before and after model optimization (training and test curves for the optimized and nonoptimized models over 70 iterations).

In addition, in order to examine the relationship between the loss value of the optimized model and the number of iterations, the training set and test set are used to compare the models before and after optimization, as shown in Figure 14. The loss value of the nonoptimized model shows an upward trend as the number of iterations increases, indicating that the nonoptimized model suffers from overfitting, while the loss value of the optimized model shows a downward trend as the number of iterations increases. It can be seen that the cost of parameter training can be reduced through model optimization.

Figure 14: Iterative comparison of the model loss value before and after optimization (training and test curves for the optimized and nonoptimized models over 70 iterations).

From the above experimental comparison of the relationship between the accuracy of common network models in image classification and the number of iterations, it is known that the model proposed in this paper is superior to the other models in classification accuracy. By comparing the classification accuracy of the deep learning model on the training set and the test set before and after optimization, it is known that the accuracy of image classification can be significantly improved after a certain degree of optimization.

6. Conclusion

Aiming at the problems of large time overhead and low classification accuracy in traditional image classification methods, a deep learning model for image classification based on machine learning was proposed in this paper. By analyzing the basic theory of neural networks, this paper expounded the types of convolutional neural networks and their application in image classification. Using the existing convolutional neural network model, through noise reduction and parameter adjustment in the feature extraction process, an image classification deep learning model based on an improved convolutional neural network was proposed. In order to improve the classification efficiency and accuracy of the model, this paper optimized the proposed deep learning model. Finally, the accuracy of several common network models in image classification was compared through experiments. The results showed that the proposed model was better than the other models in classification accuracy. At the same time, the classification accuracy before and after the optimization of the deep learning model was compared and analyzed. The results showed that the optimized model achieved a great improvement in the accuracy of image classification. How to classify dynamic targets in complex environments is a problem that needs further research in the future.

Data Availability

The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This work was supported by the Shijiazhuang Posts and Telecommunications Technical College.

References

[1] S. H. Kim and H. L. Choi, “Convolutional neural network-based multi-target detection and recognition method for unmanned airborne surveillance systems,” International Journal of Aeronautical and Space Sciences, vol. 20, no. 4, pp. 1038–1046, 2019.
[2] P. W. Song, H. Y. Si, H. Zhou, R. Yuan, E. Q. Chen, and Z. D. Zhang, “Feature extraction and target recognition of moving image sequences,” IEEE Access, vol. 8, pp. 147148–147161, 2020.
[3] W. Y. Zhang, X. H. Fu, and W. Li, “The intelligent vehicle target recognition algorithm based on target infrared features combined with lidar,” Computer Communications, vol. 155, pp. 158–165, 2020.
[4] M. Li, H. P. Bi, Z. C. Liu et al., “Research on target recognition system for camouflage target based on dual modulation,” Spectroscopy and Spectral Analysis, vol. 37, no. 4, pp. 1174–1178, 2017.
[5] S. J. Wang, F. Jiang, B. Zhang, R. Ma, and Q. Hao, “Development of UAV-based target tracking and recognition systems,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 8, pp. 3409–3422, 2020.
[6] O. Kechagias-Stamatis and N. Aouf, “Evaluating 3D local descriptors for future LIDAR missiles with automatic target recognition capabilities,” The Imaging Science Journal, vol. 65, no. 7, pp. 428–437, 2017.
[7] M. Ding, Z. J. Sun, L. Wei, Y. F. Cao, and Y. H. Yao, “Infrared target detection and recognition method in airborne photoelectric system,” Journal of Aerospace Information Systems, vol. 16, no. 3, pp. 94–106, 2019.
[8] W. L. Xue and T. Jiang, “An adaptive algorithm for target recognition using Gaussian mixture models,” Measurement, vol. 124, pp. 233–240, 2018.
[9] F. Liu, T. S. Shen, S. J. Guo, and J. Zhang, “Multi-spectral ship target recognition based on feature level fusion,” Spectroscopy and Spectral Analysis, vol. 37, no. 6, pp. 1934–1940, 2017.
[10] S. Razakarivony and F. Jurie, “Vehicle detection in aerial imagery: a small target detection benchmark,” Journal of Visual Communication and Image Representation, vol. 34, pp. 187–203, 2016.
[11] O. K. Stamatis and N. Aouf, “A new passive 3-D automatic target recognition architecture for aerial platforms,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 1, pp. 406–415, 2019.
[12] L. Y. Ma, X. W. Liu, Y. Zhang, and S. L. Jia, “Visual target detection for energy consumption optimization of unmanned surface vehicle,” Energy Reports, vol. 8, pp. 363–369, 2022.
[13] Z. Geng, H. Deng, and B. Himed, “Ground moving target detection using beam-Doppler image feature recognition,” IEEE Transactions on Aerospace and Electronic Systems, vol. 54, no. 5, pp. 2329–2341, 2018.
[14] Z. M. Guo, Y. Jiang, and S. H. Bi, “Detection probability for moving ground target of normal distribution using an imaging satellite,” Chinese Journal of Electronics, vol. 27, no. 6, pp. 1309–1315, 2018.
[15] Y. K. Bai, “Target detection method of underwater moving image based on optical flow characteristics,” Journal of Coastal Research, vol. 93, no. sp1, p. 668, 2019.
[16] W. Z. Wu, J. W. Zou, J. Chen, S. Y. Xu, and Z. P. Chen, “False-target recognition against interrupted-sampling repeater jamming based on integration decomposition,” IEEE Transactions on Aerospace and Electronic Systems, vol. 57, no. 5, pp. 2979–2991, 2021.
[17] L. L. Yu, Q. X. Yang, and L. M. Dong, “Aircraft target detection using multimodal satellite-based data,” Signal Processing, vol. 155, pp. 358–367, 2019.
[18] I. Mahmud and Y. Z. Cho, “Detection avoidance and priority-aware target tracking for UAV group reconnaissance operations,” Journal of Intelligent and Robotic Systems, vol. 92, no. 2, pp. 381–392, 2018.
[19] T. Yulin, S. H. Jin, G. Bian, and Y. H. Zhang, “Shipwreck target recognition in side-scan sonar images by improved YOLOv3 model based on transfer learning,” IEEE Access, vol. 8, pp. 173450–173460, 2020.
[20] Z. M. Guo, Y. Jiang, and S. H. Bi, “Detection probability for moving ground target of normal distribution using infrared satellite,” Optik, vol. 181, pp. 63–70, 2019.
[21] S. Matteoli, M. Diani, and G. Corsini, “Automatic target recognition within anomalous regions of interest in hyperspectral images,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 4, pp. 1056–1069, 2018.
[22] M. E. Nilsback and A. Zisserman, “Automated flower classification over a large number of classes,” in Proceedings of the Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pp. 722–729, Bhubaneswar, India, December 2008.
[23] C. Y. Zhang, B. L. Guo, N. N. Liao et al., “A public dataset for space common target detection,” KSII Transactions on Internet and Information Systems, vol. 16, no. 2, pp. 365–380, 2022.