Cen Gil 2017

This study presents a GPU-based Convolutional Neural Network (CNN) approach for image classification, specifically using a LeNet model to classify images of cats and dogs. The method utilizes the Caffe library and processes data efficiently through parallel computing enabled by GPU technology, achieving acceptable accuracy and speed. The research highlights the importance of deep learning in various fields, including medicine and security, and demonstrates the effectiveness of CNNs in image classification tasks.

A GPU-Based Convolutional Neural Network Approach for Image Classification

Emine CENGİL1, Ahmet ÇINAR2, Zafer GÜLER3
1,2,3 Department of Computer Engineering, Firat University, Elazig, Turkey
{ecengil1, acinar2, zguler3}@firat.edu.tr

Abstract—Deep learning obtains successful results in solving many machine learning problems. In this study, image classification is performed using a Convolutional Neural Network (CNN), the most widely used deep learning architecture. Image classification is used in many basic fields such as medicine, education and security. Correct classification can be of vital importance, especially in medicine. Therefore, improved methods are needed in this area. Although several algorithms for image classification have been developed over the years, they have fallen out of use since the advent of Convolutional Neural Networks. Convolutional Neural Networks provide better results than existing methods in the literature thanks to advantages such as extracting hidden features, allowing parallel processing through their parallel structure, and real-time operation. We therefore use Convolutional Neural Networks in the proposed method. In this study, image classification is performed using a LeNet network model. The Caffe library, which is often used for deep learning, is utilized. Our method is trained and tested with images of cats and dogs taken from the Kaggle dataset: 10,000 labeled images are used for training and 5,000 unlabeled images for testing. Because Convolutional Neural Networks allow parallel processing, GPU technology is used, and classification is achieved with an acceptable accuracy rate and speed.

Index Terms—Deep Learning, Image Classification, Convolutional Neural Network, GPU.

978-1-5386-1880-6/17/$31.00 ©2017 IEEE

I. INTRODUCTION

Image and signal processing have been studied with interest for many years in academic and scientific circles. Using machine learning techniques, raw data such as images, sound and signals are taken and processed in tasks such as image processing, natural language processing and signal processing.

Many machine learning techniques have been developed from the past to the present. Wavelet transformation, artificial neural networks, fuzzy logic and deep learning, which we have often heard of in recent years, are some of these techniques. The accuracy and speed of these techniques, which serve as success criteria, depend on the problem to which they are applied.

Machine learning, which began with probability and statistics, has gone through several stages and made significant progress. Today, the most advanced technique is deep learning, which has proved its success in many studies. In the 1950s, academic fields dealing with artificial intelligence introduced two methods for computer vision: Artificial Neural Networks and Decision Trees. Artificial neural networks are a method inspired by the working mechanism of the human brain; a layered network structure was created by simulating the way nerve cells work. This structure has been used for many years in many artificial intelligence problems and has achieved successful results [1].

The basis of deep learning is artificial neural networks. Artificial neural networks were developed to provide a better model of the brain. In the 1980s, hardware constraints did not allow intensive matrix manipulation, so deep learning could not be put into practice. In the late '80s, Hinton and LeCun's back-propagation algorithm [2] attracted the attention of the Canadian Institute for Advanced Research, even though it did not resonate in the wider scientific community, and research groups at universities affiliated with the institute were perhaps the reference groups for deep learning on this issue [3].

Deep learning has been used in many machine learning problems and has achieved successful results. Recent developments in deep learning enable efficient processing of large numbers of images. Deep learning has many specialized architectures such as Boltzmann machines, Restricted Boltzmann machines, Autoencoders and Convolutional Neural Networks. Convolutional networks, in particular, achieve successful results in image classification. As technology progresses, almost every field inevitably benefits from it. Image classification is beneficial for many key areas such as medicine, security and education.

There are many studies on image classification using Convolutional Neural Networks. In one of them, Gao et al. [4] pointed out the importance of early detection of Alzheimer's disease in preventing the disease. Computed tomography has not been included in the clinical decision-making process for the diagnosis of Alzheimer's disease, even though it visualizes the human brain. They therefore used a deep learning architecture to classify computed tomography images. Each image is assigned to one of three classes: AD, Lesion, and Normal. The CNN achieved high classification accuracy. Gao et al. [5] developed a method using a CNN architecture to classify echocardiography video images, pointing to the problem of diagnosing heart diseases.
The network was built with hand-designed features in a data-driven learning framework that includes spatial and temporal information from the video images and leads to a two-dimensional convolutional neural network. Because the data to be classified is stationary, the acceleration measured along the time axis with an optical flow technique is used to represent temporal motion information. The method combines two networks. This architecture provides an acceptable accuracy of 92.1%.

Spampinato et al. [6] pointed out the importance of skeletal bone age assessment in identifying some health problems. Bone age is usually assessed by examining the left hand using various medical methods. However, these methods have not been generalized to different races, age ranges and sexes. In the study, a deep learning approach was introduced to automatically assess skeletal bone age. The study provides a comprehensive basis for similar studies.

Classification of breast tumors [7], classification of ECG signals [8], and face versus non-face binary classification [9] are other studies in the literature that use convolutional networks.

Krizhevsky et al. [10] performed ImageNet classification with deep convolutional networks. A deep convolutional neural network was trained to classify 1.2 million high-resolution images into 1000 different classes. To accelerate learning, non-saturating neurons and a very efficient GPU implementation of the convolution operation were used. The recently developed dropout method proved very effective for reducing overfitting in the fully connected layers. A larger dataset, more powerful learning models and better techniques were used to show that the approach is better than others. As a result, classification was performed with a very low error rate (0.15), demonstrating the success of convolutional neural networks in classification.

After Krizhevsky et al.'s deep learning technique at ImageNet 2012 showed the success of deep learning in image classification, many participants used deep learning techniques at ImageNet 2013 the following year and achieved similar success. In fact, all of the top 20 participants in 2013 used deep learning techniques.

In this paper, we perform binary classification using a convolutional neural network model. The rest of the paper is organized as follows. Section II describes the structure of deep learning architectures. Section III presents the proposed method. Section IV summarizes the aim of the study and outlines future work.

II. DEEP LEARNING

Deep learning algorithms are based on artificial neural networks. When the number of hidden layers in an artificial neural network exceeds five, complexity increases and the results can no longer be improved. Deep learning architectures are multi-layered and contain many parameters. The most used architectures are Autoencoders, Restricted Boltzmann Machines and Convolutional Neural Networks [11]. In this section, the most commonly used deep learning architectures are briefly described.

A. Boltzmann Machines

In 1986, Geoffrey Hinton and Terrence Sejnowski developed a new learning method for use in a Hopfield network. The resulting network is called the Boltzmann machine (BM). The Boltzmann machine has the same basic operating elements as the Hopfield network [12]. Boltzmann machines, which can be regarded as the basis of Restricted Boltzmann machines, are a particular kind of Markov random field. The BM is used to uncover and model the interactions between the visible data layer and the hidden layer, and within the layers themselves. Boltzmann machines are an important architecture for deep neural networks, since the values learned in the hidden layers can provide the input for the next BM [11].

B. Restricted Boltzmann Machines

Restricted Boltzmann Machines (RBM) are a special kind of Markov random field. An RBM consists of two layers, visible and hidden. Boltzmann machines also consist of two layers, but the units within a layer are connected to each other, which increases complexity. The simplified structure formed by removing the intra-layer connections is the Restricted Boltzmann Machine. Restricted Boltzmann Machines are bipartite graphs [12]. There are no connections between the visible units in the RBM structure. Visible units and hidden units are connected to each other bidirectionally, and each connection has a weight. An RBM can be trained in a supervised or unsupervised way, depending on the problem.

C. Autoencoders

Autoencoders are neural network architectures trained on unlabeled data, with the input itself as the target output. Hidden features are extracted from the input data by using more or fewer units in the hidden layer than in the input layer [11]. Autoencoders update their parameters with the usual artificial neural network methods. The difference is that, when the error of the output layer is computed, the input layer data is used instead of the label used in supervised learning [11]. The structure of an autoencoder is shown in Figure 1.

Fig. 1. Structure of Autoencoder

III. PROPOSED METHOD

There are many libraries used for deep learning, such as MatConvNet [13], Theano [14] and Caffe [15]. In this work, the method
relies on the Caffe library. The Caffe library is compatible with many software environments. The proposed method is implemented in the Python language on the Ubuntu operating system because of their strengths in the image processing field. The Kaggle [16] dataset we chose consists of 25,000 labeled and 12,500 unlabeled images. In the study, 10,000 and 5,000 images were taken from this data for training and testing, respectively. The flow chart of the method is shown in Figure 3.

Fig. 3. Flow chart of the proposed method

A. Convolutional Neural Networks

Convolutional neural networks are a specialized architecture of deep learning. The CNN is inspired by the human visual cortex and is widely used in image processing and pattern recognition. It is a feed-forward network with many trainable layers. It has a weight-sharing structure, which simplifies the network model and makes it more similar to biological neural networks [17].

The core of a CNN is its convolutional layers. Convolutional layers consist of two-dimensional planes of many neurons, known as feature maps. Each neuron of a feature map is connected to a small subset of neurons in the feature maps of the previous layer. The receptive fields of neighboring neurons overlap, and the weights of these receptive fields are shared by all neurons of the same feature map. The feature maps of a convolutional layer are partially or fully connected to the previous layer.

Each computational layer of the network consists of multiple feature maps. Each feature map is a plane, and the weights of the neurons in each plane are equal. Because the weights are equal, the neurons can operate at the same time, so parallel computation can be performed. The feasibility of parallel computing is a fundamental advantage of the CNN. A sub-sampling layer is placed between two convolutional layers to reduce the size of successive feature maps. Parameters such as the number of layers of the model and the size of the input image are chosen according to the task to be performed. An optimum number of layers must be found, because with too many layers the growing number of parameters adversely affects performance even if the accuracy rate increases. The network model used in our method is described below.

Figure 2 presents the architecture used in the developed method. Our convolutional network consists of eight layers. The first layer is a convolution layer, which we call the core of the network. Because we use a 5x5 filter in this layer, the 64x64 input image is reduced to 60x60 according to 64 - 5 + 1 = 60. We use the Rectified Linear Unit (ReLU) as the activation function after every convolutional layer. The other most commonly used activation function is the sigmoid function. We chose ReLU because it performs better in training [18].

Fig. 2. Architecture of the developed CNN

The second layer is a pooling layer, which modifies the input of the next layer. In this layer the image is reduced to 30x30 according to 60 / 2 = 30. The third and fourth layers, and the fifth and sixth layers, are convolution and pooling layers like the first two. The seventh layer, which follows these, is the fully connected layer. The last layer is the soft-max layer, which predicts the class: the result obtained by processing the image in the network determines which class the image belongs to.

B. CUDA (Compute Unified Device Architecture)

The first component that comes to mind for computation is the CPU. The CPU is well suited to sequential operations. When it comes to parallel programming, however, GPUs with CUDA are more prominent. In the past, GPUs worked as graphics accelerators. They have since developed into powerful, general-purpose, fully programmable parallel data processors for operations that require parallelism [19].

In terms of architecture, a GPU core is quite different from a CPU core. A CPU core has been designed
with very complex control logic to execute serial programs. A GPU core is designed with simpler control logic that focuses on executing data-parallel tasks [19]. The GPU has thousands of smaller and more efficient cores than the CPU. CUDA can be used in mathematical applications thanks to the hundreds of cores on the graphics card. The use of CUDA is crucial for processing large data in numerical applications [20]. CUDA has a wide range of applications including computational fluid dynamics, image and video processing, computational biochemistry simulations, weather and climate modeling, signal processing, data analysis and much more.

Fig. 4. CUDA memory topology

Figure 4 shows the CUDA memory topology. CUDA can process in parallel with its hundreds of cores, which provides fast data processing and therefore high performance. Hence, since its release, various industries and applications have achieved great success using CUDA C in different areas.

CUDA, which is generally used for processing and conversion, uses arrays of parallel cores that can share data through its memory model, spreading across them the work that the CPU would do in a single sequence. In parallel execution, each of the divided operations is performed more slowly, but the work as a whole completes in a shorter time than on the CPU.

The structure of convolutional neural networks allows parallel programming, as described in Section III-A. Because of this feature, applications using the CNN algorithm can be run by parallel computation on a cloud platform or by CUDA programming on computers with an NVIDIA GPU. In problems such as image and video processing, the amount of data to be processed is rather large, because more data increases accuracy. When the algorithms used to solve these problems run sequentially on a CPU, the training and testing phases can last for days.

Liu et al. [21] use the CNN algorithm for facial recognition. To speed up this process, they perform parallel computation on a cloud platform. Some methods that use the GPU platform are [22, 23, 24]. Although these studies differ in method and subject, they are similar in using the GPU to develop CNN-based methods.

C. Implementation Details

We performed the classification with our CNN-based algorithm using images of cats and dogs taken from the Kaggle dataset. The CNN works by extracting hidden features. The images are resized to 64x64 so that all have the same size. Mean-image subtraction, the most common preprocessing step in learning applications, is applied to the training images. We ran the proposed method on the Ubuntu 16.04 operating system using an NVIDIA GeForce 920M with 2 GB of memory. In the developed model, GPU technology was used during training, so that training that would otherwise last for days finished in just a few hours.

(a) Input images (b) Result images
Fig. 5. Sample images from the Kaggle dataset

The cat and dog images taken from the Kaggle dataset are resized, and histogram equalization is performed. Fig. 5 shows the initial state of the images (a) and their state after preprocessing (b).

In the classification, the soft-max function predicts which class the image belongs to from the result obtained. In our model, we started with a bias of 1. The initial values of the weights were drawn from a Gaussian distribution with a standard deviation of 0.01. Dropout was used to reduce overfitting in the fully connected layers. The weights were updated every 64 iterations.
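As a rough illustration of the quantities quoted in this section, the following Python sketch reproduces the layer-size arithmetic (64 - 5 + 1 = 60, then 60 / 2 = 30), draws initial weights from a zero-mean Gaussian with standard deviation 0.01, and applies a soft-max to two class scores. It is a minimal sketch and not the authors' Caffe configuration: the function names, the random seed, the example scores, the class order, and the use of floor division for pooling are all assumptions made for illustration.

```python
import math
import random


def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution: (size + 2*padding - kernel) // stride + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2):
    """Spatial size after non-overlapping pooling (floor division is an assumption)."""
    return size // window


def init_weights(count, std=0.01, seed=0):
    """Draw `count` weights from a zero-mean Gaussian with standard deviation `std`."""
    rng = random.Random(seed)  # fixed seed (hypothetical) so the sketch is repeatable
    return [rng.gauss(0.0, std) for _ in range(count)]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Layer sizes stated in the paper: 64x64 input, 5x5 filter, 2x2 pooling.
after_conv1 = conv_output_size(64, 5)        # 64 - 5 + 1 = 60
after_pool1 = pool_output_size(after_conv1)  # 60 / 2 = 30

# Gaussian initialization with standard deviation 0.01, as stated in the paper.
weights = init_weights(1000)

# Hypothetical scores from the fully connected layer; the class order is assumed.
classes = ("cat", "dog")
probs = softmax([2.0, 0.5])
predicted = classes[probs.index(max(probs))]
```

Pure Python is used here only to keep the sketch self-contained; the paper's method relies on the Caffe library for these steps.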
Fig. 6. Learning curve of the model

The learning curve of the model is shown in Figure 6. As seen in the figure, the accuracy rate remains constant after 5000 iterations. Following training, the method achieved 83% accuracy in the test phase using unlabeled data from the dataset. The accuracy obtained with our small amount of data is expected to increase as the amount of data increases.

IV. CONCLUSIONS AND FUTURE WORKS

This work presents a classification process using a deep learning architecture that is frequently used to solve image processing problems. Although existing classification methods are considered successful, finding the right class is critical when they are used in fields such as security and health. Therefore, methods are needed to improve the accuracy rate.

Deep learning has proved itself by solving many machine learning problems. The use of deep learning architectures is therefore important in newly developed methods. In our future work, we will improve the proposed model in terms of speed and accuracy. Given the recent need for increased security, we will perform facial recognition with the model we have developed.

REFERENCES

[1] G. Doğan, "Portfolio Appraisal in a Private Insurance Company by Using Artificial Neural Networks," Master thesis, Hacettepe University, Ankara, Turkey, 2010.
[2] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, 521(7553), 436-444, 2015.
[3] C. Hacıoğlu, "Emotion Recognition from Voice by Using Deep Learning," Master thesis, İstanbul Technical University, İstanbul, Turkey, 2014.
[4] X. W. Gao, R. Hui, and Z. Tian, "Classification of CT brain based on deep learning networks," Computer Methods and Programs in Biomedicine, 2017, Vol. 138, 49-56.
[5] X. W. Gao, W. Li, M. Loones, and L. Wang, "A fused deep learning architecture for viewpoint classification of echocardiography," Information Fusion, 2017, Vol. 36, 103-113.
[6] C. Spampinato, S. Palazzo, D. Giordano, M. Aldinucci, and R. Leonardi, "Deep learning for automated skeletal bone age assessment in X-ray images," Medical Image Analysis, 2017, Vol. 36, 41-51.
[7] Q. Zhang, Y. Xiao, W. Dai, J. Suo, C. Wang, J. Shi, and H. Zheng, "Deep learning based classification of breast tumors with shear-wave elastography," Ultrasonics, 2016, Vol. 72, 150-157.
[8] M. M. Al Rahhal, Y. Bazi, H. AlHichri, N. Alajlan, F. Melgani, and R. R. Yager, "Deep learning approach for active classification of electrocardiogram signals," Information Sciences, 2016, Vol. 345, 340-354.
[9] H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, "A convolutional neural network cascade for face detection," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5325-5334), 2015.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, 2012, 1097-1105.
[11] H. Özcan, "Deep Learning Practices in Very Low Resolution Face Images," Master thesis, Naval War College, Turkey, 2014.
[12] H. A. Açarçiçek, "Emotion Recognition From Voice By Using Deep Learning," License Thesis, İstanbul Technical University, 2015.
[13] A. Vedaldi and K. Lenc, "MatConvNet: Convolutional neural networks for MATLAB," In Proceedings of the 23rd ACM international conference on Multimedia (pp. 689-692). ACM, 2015.
[14] J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, ... and Y. Bengio, "Theano: A CPU and GPU math compiler in Python," In Proc. 9th Python in Science Conf (pp. 1-7), 2010.
[15] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, ... and T. Darrell, "Caffe: Convolutional architecture for fast feature embedding," In Proceedings of the 22nd ACM international conference on Multimedia (pp. 675-678). ACM, 2014.
[16] (2017) The Kaggle website. [Online]. Available: http://www.kaggle.com/
[17] T. Liu, S. Fang, Y. Zhao, P. Wang, and J. Zhang, "Implementation of Training Convolutional Neural Networks," arXiv preprint arXiv:1506.01195, 2015.
[18] K. Zhang, Y. Huang, H. Wu, and L. Wang, "Facial smile detection based on deep learning features," In Pattern Recognition (ACPR), 2015 3rd IAPR Asian Conference on (pp. 534-538). IEEE, November 2015.
[19] J. Cheng, M. Grossman, and T. McKercher, "Professional CUDA C Programming," John Wiley & Sons, Inc., Indianapolis, Indiana, 2014.
[20] Z. Guler and A. Cinar, "GPU-based image segmentation using level set method with scaling approach," In International Conference on Advanced Information Technologies and Applications (ICAITA), 2013.
[21] T. Liu, S. Fang, Y. Zhao, P. Wang, and J. Zhang, "Implementation of Training Convolutional Neural Networks," arXiv preprint arXiv:1506.01195, 2015.
[22] E. Cengil and A. Çınar, "Robust Face Detection with CNN," In Proceedings of the 4th ICAT international conference on Advanced Technology & Sciences (pp. 47-51). ICAT, 2016.
[23] L. Wang, C. Y. Lee, Z. Tu, and S. Lazebnik, "Training deeper convolutional networks with deep supervision," arXiv preprint arXiv:1505.02496, 2015.
[24] H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua, "A convolutional neural network cascade for face detection," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5325-5334), 2015.
