
Computers and Electronics in Agriculture 145 (2018) 319–325

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture


journal homepage: www.elsevier.com/locate/compag

Original papers

Detection of stored-grain insects using deep learning


Yufeng Shen^a, Huiling Zhou^a,⁎, Jiangtao Li^a, Fuji Jian^b, Digvir S. Jayas^b

^a Beijing University of Posts and Telecommunications, PO Box 137, Road Xitucheng 10, Haidian District, Beijing 100876, China
^b Department of Biosystems Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada

⁎ Corresponding author. E-mail address: huiling@bupt.edu.cn (H. Zhou).
https://doi.org/10.1016/j.compag.2017.11.039
Received 18 June 2017; Received in revised form 16 October 2017; Accepted 26 November 2017

A R T I C L E  I N F O

Keywords:
Stored-grain insect
Object detection
Insect classification
Convolutional neural network
Trap images
Faster R-CNN

A B S T R A C T

A detection and identification method for stored-grain insects was developed by applying a deep neural network. Adults of the following six species of common stored-grain insects, mixed with grain and dockage, were artificially added into the developed insect-trapping device: Cryptolestes Pusillus (S.), Sitophilus Oryzae (L.), Oryzaephilus Surinamensis (L.), Tribolium Confusum (Jaquelin Du Val), Rhizopertha Dominica (F.) and Lasioderma Serricorne (F.). A database of Red, Green and Blue (RGB) images of these live insects was established. We used Faster R-CNN to extract the areas which might contain insects in these images and to classify the insects in these areas. An improved inception network was developed to extract the feature maps. Excellent results for the detection and classification of these insects were achieved. The test results showed that the developed method could detect and identify insects under stored-grain conditions, and its mean Average Precision (mAP) reached 88%.

1. Introduction

Image recognition to detect and identify insects in a stored product is the critical component of a stored-grain insect monitoring system. The main challenges in the image recognition of these insects are to identify areas of the image containing insects in grain mixed with other materials (mostly fine materials and broken grain kernels) and to classify the small-body-size insects conglutinated with other insect species and/or the same species and/or other materials in the target area.

Object detection systems such as pedestrian detection and vehicle detection apply region proposal algorithms to infer the locations of objects (Girshick et al., 2013). The early developed region proposal algorithms include Selective Search, Sliding Window, Rigor, Superpixels and Gaussian (Hosang et al., 2015). The Region Proposal Network (RPN) (Ren et al., 2017) was proposed in 2016; it applied a convolutional neural network to obtain the areas of interest more quickly and accurately through the acceleration of a Graphics Processing Unit (GPU).

In the field of insect classification based on computer vision, most research focused on extracting insects' features, including texture, shape, and local characteristics (Qiu et al., 2003; Zhang et al., 2009, 2005; Wu et al., 2015; Jayas, 2017). The procedure of this feature extraction was complex under in-situ situations, and those extracted features might not accurately represent the image characteristics of insects. In practical applications, the variation of image background, impurities, illumination and insect gestures will also increase the difficulty of feature extraction.

Krizhevsky et al. (2012) used deep convolutional neural networks and took first place in the ImageNet Large Scale Visual Recognition Challenge in 2012. Ding and Taylor (2016) used the Sliding Window method to obtain regions of interest and applied a 5-layer convolutional neural network to determine whether the regions contained a moth. They obtained a higher recall rate using the convolutional neural network than with the LogReg algorithm. Liu et al. (2016) applied GrabCut for the segmentation of paddy field pests and classified the pests using an 8-layer convolutional neural network. The accuracy of the convolutional neural network was higher than that of the Histogram of Oriented Gradients (HOG) and Speeded Up Robust Features (SURF). However, the network used by Liu et al. was too shallow and could not extract more effective features when the targets were similar in appearance.

In this study, we used an Online Insect Trapping Device (OITD) (Wang et al., 2016) to capture images of live insects with or without fines, foreign materials, dockages and broken grains (referred to as FFDB) under laboratory conditions. The insects imaged were: Cryptolestes Pusillus (S.), Sitophilus Oryzae (L.), Oryzaephilus Surinamensis (L.), Tribolium Confusum (Jaquelin Du Val), Rhizopertha Dominica (F.) and Lasioderma Serricorne (F.). A dataset with 739 images of the insects with or without FFDB was established and artificially marked. To increase the accuracy of the convolutional neural network, we applied a 27-layer convolutional neural network to extract the features from the images of stored-product insects, and adopted Softmax as the classifier to identify the insects. The Faster R-CNN method was used to locate and classify the insects.




First, the inception network (Szegedy et al., 2015) extracted the feature maps of each image; then the RPN returned the coordinates of the areas which might contain insects in the feature maps. These coordinates were merged by Non Maximum Suppression (NMS) (Neubeck and Van Gool, 2006) and mapped to the feature maps. These target areas were classified by the improved inception network, and the coordinates of the target areas were also corrected at the same time (Girshick, 2015). Finally, NMS was used to merge the overlapping target boxes.

In this study, we developed a method based on Faster R-CNN which can be used to identify insects mixed with FFDB under different illumination conditions. This method improved the accuracy of insect detection. The rest of this article describes this image processing procedure.

2. Image acquisition and preprocessing

The resolution of the images taken by the OITD system was 1944 × 2592 pixels. These images were collected under conditions of multiple live insects mixed with artificially added FFDB in wheat, in order to increase the difficulty of detection and to simulate the real situation in grain warehouses. The FFDB included fine materials and broken grains (Fig. 1). Multiple live insects of a single species were added into the OITD system each time and multiple pictures were snapshotted. Some pictures were randomly chosen as the testing set, and the rest were used as the training set. This procedure was repeated and new insects were added for the next replicate. Table 1 shows the number of images and insects in the training and testing sets.

Table 1
Number of images and insects used for training and testing.

                     SO^a    LS^b    TC^c    RD^d    OS^e    CP^f
Training  Images       77      54      98     114      63     117
          Insects    1206    1088    1423    1908     830    2957
Testing   Images       33      28      42      21      23      69
          Insects     465     514     446     335     358     978
Total     Images      110      82     140     135      86     186
          Insects    1671    1602    1869    2243    1188    3935

^a SO = Sitophilus Oryzae. ^b LS = Lasioderma Serricorne. ^c TC = Tribolium Confusum. ^d RD = Rhizopertha Dominica. ^e OS = Oryzaephilus Surinamensis. ^f CP = Cryptolestes Pusillus.

To enrich the training set, extract image features accurately, and generalize the model to prevent overfitting, the image dataset was augmented by flipping and color jittering. To account for changes of illumination level and insect posture, the color jittering was conducted by randomly adjusting the saturation, contrast, brightness, and sharpness of the image. After augmentation, the size of the training set was increased 12-fold relative to the original training set. The original images had a high resolution; to improve the training and testing speed and to reduce the GPU consumption, the image resolution was lowered to 600 × 800 pixels. Each insect in the images was artificially marked by a blue bounding box (Fig. 1) as ground truth. The marked blue bounding boxes were used for training.

Fig. 1. Ground truth.
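To make the augmentation step concrete, the sketch below flips an image and randomly jitters its saturation, contrast, brightness and sharpness with Pillow. It is an illustration only, not the authors' code; the jitter range, the file name and the factor-of-12 loop are assumptions.

```python
import random
from PIL import Image, ImageEnhance, ImageOps

def augment(img, jitter=0.3):
    """Return a flipped, color-jittered copy of a trap image (illustrative only)."""
    out = ImageOps.mirror(img) if random.random() < 0.5 else img.copy()
    # Randomly adjust saturation, contrast, brightness and sharpness,
    # mimicking changes of illumination level and insect posture.
    for enhancer in (ImageEnhance.Color, ImageEnhance.Contrast,
                     ImageEnhance.Brightness, ImageEnhance.Sharpness):
        factor = 1.0 + random.uniform(-jitter, jitter)   # jitter range is an assumption
        out = enhancer(out).enhance(factor)
    return out

# Example: build 12 augmented variants of a training image, then lower the
# resolution to 600 x 800 pixels (Pillow sizes are given as width x height).
img = Image.open("trap_image.png")                        # hypothetical file name
variants = [augment(img).resize((800, 600)) for _ in range(12)]
```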
3. Object detection network

The detection steps for the target object were: acquire the region proposals of the image by the RPN, merge these proposals into candidate boxes by NMS, map these candidate boxes to the feature maps, classify the regions of the candidate boxes by the classification network, and use NMS to merge the overlapping candidate boxes. The detection process is shown schematically in Fig. 2 and details are provided in the following sections.


Fig. 2. Flow chart of object detection.

3.1. Feature extraction network

In order to obtain a high-quality model, the width (different sizes of kernels were used to extract the same feature maps) and the depth of the neural network model should be increased. The inception structure (Szegedy et al., 2015) was adopted to extend the convolutional layers of the model. The main purpose of the inception structure was to find a simple dense component to replace an optimal local sparse structure, and to repeat this structure spatially. This procedure would cluster together the units with high relative correlation to form the next layer, and this next layer would connect to its top layer. The adopted inception structure is shown in Fig. 3. To reduce the number of parameters and to improve the speed of operation, 1 × 1 convolution kernels were used before the 3 × 3 and 5 × 5 convolution kernels.

Fig. 3. Inception structure.
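As an illustration of this design (not the authors' network definition), a PyTorch sketch of an inception-style block with 1 × 1 reductions before the 3 × 3 and 5 × 5 kernels is given below; the channel counts are assumptions.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Inception-style block: parallel 1x1, 3x3, 5x5 and pooling branches.

    The 1x1 convolutions before the 3x3 and 5x5 kernels reduce the channel
    depth, and therefore the parameter count, as described in Section 3.1.
    Channel sizes below are illustrative assumptions.
    """
    def __init__(self, in_ch, c1x1=64, c3x3_red=48, c3x3=96,
                 c5x5_red=16, c5x5=32, pool_proj=32):
        super().__init__()
        self.branch1 = nn.Sequential(nn.Conv2d(in_ch, c1x1, 1), nn.ReLU(inplace=True))
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3x3_red, 1), nn.ReLU(inplace=True),      # 1x1 reduction
            nn.Conv2d(c3x3_red, c3x3, 3, padding=1), nn.ReLU(inplace=True))
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5x5_red, 1), nn.ReLU(inplace=True),      # 1x1 reduction
            nn.Conv2d(c5x5_red, c5x5, 5, padding=2), nn.ReLU(inplace=True))
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU(inplace=True))

    def forward(self, x):
        # Concatenate the four branches along the channel dimension.
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

# Example: a feature map from some stem convolutions (192 channels is assumed).
feats = torch.randn(1, 192, 75, 100)
out = InceptionBlock(192)(feats)   # -> (1, 64 + 96 + 32 + 32, 75, 100)
```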
3.2. Region proposal network

In the literature (Ren et al., 2017), VGG16 (Simonyan and Zisserman, 2015) was usually used as the base network, but the VGG16 network contained a large number of parameters because it adopted fully connected layers of 4096 × 4096. We used the inception network to replace the VGG16 network, and assigned the output of the seventh inception structure as the feature map. To further extract features, two inception structures were connected to the feature maps, and then a convolutional layer with 3 × 3 kernels was applied to reduce the thickness of the convolution features to 256 dimensions. This output was assigned as the features of the region proposal network. We coded the features as nine scales of bounding box coordinates and bounding box scores.

Training followed the multitask loss. The loss function was defined as (Ren et al., 2017):

L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^{*}) + \lambda \frac{1}{N_{reg}} \sum_i p_i^{*} L_{reg}(t_i, t_i^{*})   (1)

where N is the total number of anchors (Ren et al., 2017), i is the anchor's index in the mini-batch, p_i is the prediction probability of the i-th anchor, p_i^{*} is the label of the i-th anchor, t_i is the coordinate of the predicted bounding box, and t_i^{*} is the ground-truth coordinate. The classification loss (L_{cls}) is the logarithmic loss over two categories, foreground and background (Ren et al., 2017):

L_{cls}(p_i, p_i^{*}) = -(\log(p_i^{*} p_i) + \log((1 - p_i^{*})(1 - p_i)))   (2)

The regression loss L_{reg} (Ren et al., 2017) is:

L_{reg}(t_i, t_i^{*}) = \mathrm{smooth}_{L1}(t_i - t_i^{*})   (3)

where \mathrm{smooth}_{L1} is the robust loss function:

\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5 x^{2} & \text{if } |x| < 1 \\ |x| - 0.5 & \text{otherwise} \end{cases}   (4)

When the resolution of the input images was 600 × 800 pixels, the RPN output was 50 × 37 × 36 and 50 × 37 × 18. This output was reshaped to 16,650 × 4 and 16,650 × 2, respectively, so 16,650 regression boundaries and foreground/background scores were obtained. During training, the 2000 rectangular boxes with the highest scores were cast to the classification network as proposal regions. The proposal regions were then merged by NMS as the candidate boxes.
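A minimal PyTorch sketch of the multitask RPN loss in Eqs. (1)-(4) follows. It is illustrative only: Eq. (2) is written as the standard two-class log loss, and the value of λ and the anchor counts are assumptions based on the 50 × 37 × 9 layout described above.

```python
import torch

def smooth_l1(x):
    # Eq. (4): 0.5 x^2 for |x| < 1, |x| - 0.5 otherwise (elementwise).
    absx = x.abs()
    return torch.where(absx < 1, 0.5 * x ** 2, absx - 0.5)

def rpn_loss(p, p_star, t, t_star, lam=10.0):
    """Multitask RPN loss, Eq. (1). The lambda value is an assumption.

    p:      (N,) predicted foreground probability for each anchor
    p_star: (N,) anchor label, 1 = foreground, 0 = background
    t:      (N, 4) predicted box coordinates
    t_star: (N, 4) ground-truth box coordinates
    """
    eps = 1e-7
    # Eq. (2): log loss over foreground/background.
    l_cls = -(p_star * torch.log(p + eps) + (1 - p_star) * torch.log(1 - p + eps))
    # Eq. (3): smooth-L1 regression loss over the 4 box coordinates,
    # counted only for foreground anchors (p_star = 1).
    l_reg = smooth_l1(t - t_star).sum(dim=1)
    n_reg = max(int(p_star.sum().item()), 1)
    return l_cls.mean() + lam * (p_star * l_reg).sum() / n_reg

# Toy usage with the 16,650 anchors implied by a 50 x 37 map and 9 anchors per cell.
n = 50 * 37 * 9
loss = rpn_loss(torch.rand(n), torch.randint(0, 2, (n,)).float(),
                torch.randn(n, 4), torch.randn(n, 4))
```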

3.3. Classification network

To classify the region in a candidate box, the RoI (region of interest) Pooling Layer (Ren et al., 2017) was used to map the candidate box to the feature map which was the output of the seventh inception structure. The RoI Pooling Layer was a simplified form of the Spatial Pyramid Pooling layer (He et al., 2015) with only one scale. In this study, the RoI Pooling Layer sampled the feature map to a scale of 7 × 7. After that, two fully connected layers were used to unify the feature map into a 1024-dimension feature vector.

The classification network and the RPN shared the parameters of the convolutional layers. In the classification network, two fully connected layers were used to code the features into a 44-dimension vector and an 11-dimension vector, respectively. After adjusting the 500 candidate boxes input by the region proposal network, the coordinates and the corresponding category score (defined by the probability of the true category) were obtained. During training, the loss value of the classification network consisted of the logarithmic loss and the regression loss. The logarithmic loss (L_{cls}) was calculated from the corresponding probability (P_u) of the true category u (Girshick, 2015):

L_{cls} = -\log P_u   (5)

The regression loss (L_{reg}) (Girshick, 2015) was calculated as:

L_{reg} = \sum_{i=1}^{4} \mathrm{smooth}_{L1}(t_i^{u} - v_i)   (6)

where v_i is the corresponding prediction parameter of the true class and t_i^{u} are the real translation and scaling parameters. The total loss L (Girshick, 2015) is shown in Eq. (7):

L = \begin{cases} L_{cls} + \lambda L_{reg} & \text{when } u \text{ is foreground} \\ L_{cls} & \text{when } u \text{ is background} \end{cases}   (7)

In the insect classification, to balance the precision and the recall of the classification network, we defined a rule: when the P_u of a candidate box was higher than 0.5, the box was considered to contain an insect belonging to category u. In order to prevent overfitting and to increase the sparsity of the network, a dropout layer (Srivastava et al., 2014) was added before the fully connected layer.
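For illustration, the sketch below shows a classification head consistent with this description: 7 × 7 RoI features are passed through two fully connected layers to a 1024-dimension vector and then mapped to 11-dimension category scores and a 44-dimension box refinement. The input channel count and the exact dropout placement are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Head applied to each RoI-pooled candidate box (sketch, not the authors' code)."""
    def __init__(self, in_ch=256, pooled=7, num_cls=11):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(in_ch * pooled * pooled, 1024), nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),                          # dropout layer, as in Section 3.3
            nn.Linear(1024, 1024), nn.ReLU(inplace=True))
        self.cls_score = nn.Linear(1024, num_cls)       # 11-d category scores
        self.bbox_pred = nn.Linear(1024, num_cls * 4)   # 44-d box regression

    def forward(self, roi_feats):
        x = self.fc(roi_feats.flatten(start_dim=1))
        scores = torch.softmax(self.cls_score(x), dim=1)
        return scores, self.bbox_pred(x)

# 500 candidate boxes, each pooled to an assumed 256 x 7 x 7 feature block.
scores, deltas = ClassificationHead()(torch.randn(500, 256, 7, 7))
# The P_u > 0.5 rule from Section 3.3, assuming index 0 is the background class.
keep = scores[:, 1:].max(dim=1).values > 0.5
```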


3.4. Non maximum suppression

After obtaining the coordinates of the 500 candidate boxes and their corresponding category scores, each candidate box was saved as a vector (r, c, h, w, n). Here r, c, h and w were the coordinates of the candidate box, and n was the score of the candidate box. The procedure of this non maximum suppression was: calculate the areas of all windows and the overlapping areas of the candidate boxes; find the candidate box with the highest score and the candidate box with the second-highest score; and calculate the intersection over union (IoU) of the two candidate boxes. If the IoU is higher than the threshold, the target box with the second-highest score is suppressed. During our testing, we found that a threshold value of 0.7 gave the highest mAP.
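The following NumPy sketch illustrates this greedy suppression procedure; it is a generic IoU-based NMS written for clarity rather than the authors' implementation, with the 0.7 threshold taken from the sentence above.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.7):
    """Greedy non maximum suppression.

    boxes:  (N, 4) array of (r, c, h, w) = top-left corner plus height and width
    scores: (N,) candidate-box scores
    Returns the indices of the boxes that are kept.
    """
    r, c, h, w = boxes.T
    x1, y1, x2, y2 = c, r, c + w, r + h          # convert to corner coordinates
    areas = h * w
    order = scores.argsort()[::-1]               # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the chosen box with the remaining boxes.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Suppress boxes whose IoU with the chosen box exceeds the threshold.
        order = order[1:][iou <= iou_thresh]
    return keep
```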

3.5. Model optimization

Faster R-CNN with deep networks had two issues. The first issue was that the optimization ability of the model was significantly reduced because the flow of information in the deep inception network was blocked (He et al., 2016). In the literature, Szegedy et al. (2015) added two additional Softmax layers in the inception network to calculate a new loss value, and the gradient of the network was calculated using this new loss value. He et al. (2016) introduced the method of the shortcut connection to solve the problem of flow blocking. The shortcut connections were leveraged to reduce the influence of gradient disappearance and to improve the degradation of information flow. The second issue was that the body size of stored-grain insects is relatively small. The feature map output by the seventh inception was only 1/16 the size of the original image; the receptive field of each neuron was too large, so it was not sensitive to small targets. To solve these two issues, an improved inception network was developed. We combined the output of the second inception with the output of the seventh inception through the fully connected layers, and we also added a RoI Pooling layer after the second inception layer (Fig. 4). The proposal regions were mapped to the output of the second inception. This developed method could directly back-propagate the gradient from the deep layers to the shallow layers of the inception network. By adding the shortcut connection, the network could extract the feature maps of the shallow layers, which were 1/8 the size of the original image. This method was more sensitive to small targets than that used by He et al. (2016).
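A rough sketch of this idea is given below, assuming hypothetical channel counts and using torchvision's roi_pool: each proposal is pooled from both the shallow (1/8-scale) and deep (1/16-scale) feature maps and the two pooled features are fused through a fully connected layer. It illustrates the shortcut described above, not the authors' exact network.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_pool

class FusedRoIHead(nn.Module):
    """Pools each proposal from a shallow (1/8) and a deep (1/16) feature map
    and fuses them, mirroring the shortcut in Section 3.5.
    Channel counts (256 and 512) are assumptions."""
    def __init__(self, shallow_ch=256, deep_ch=512, pooled=7):
        super().__init__()
        n_in = (shallow_ch + deep_ch) * pooled * pooled
        self.fuse = nn.Sequential(nn.Linear(n_in, 1024), nn.ReLU(inplace=True))

    def forward(self, feat_shallow, feat_deep, rois):
        # rois: (K, 5) of (batch_index, x1, y1, x2, y2) in image coordinates.
        p_shallow = roi_pool(feat_shallow, rois, output_size=7, spatial_scale=1 / 8)
        p_deep = roi_pool(feat_deep, rois, output_size=7, spatial_scale=1 / 16)
        fused = torch.cat([p_shallow.flatten(1), p_deep.flatten(1)], dim=1)
        return self.fuse(fused)   # 1024-d fused feature per proposal

# Example with a 600 x 800 input image.
feat8 = torch.randn(1, 256, 75, 100)     # 1/8-scale feature map
feat16 = torch.randn(1, 512, 38, 50)     # 1/16-scale feature map
rois = torch.tensor([[0, 100.0, 120.0, 180.0, 200.0]])
vec = FusedRoIHead()(feat8, feat16, rois)
```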
If the classification and positioning functions were achieved by the fully connected layer alone, a large amount of memory was required. The Singular Value Decomposition (SVD) was therefore applied to the fully connected layer to reduce the number of parameters, so that a smaller amount of memory was required. The SVD was used to decompose the parameter matrix W, and W was approximated using the first t singular values:

W = U \Sigma V^{T} \approx U(:, 1{:}t) \cdot \Sigma(1{:}t, 1{:}t) \cdot V(:, 1{:}t)^{T}   (8)

The forward propagation was divided into two steps (Girshick, 2015). This was equivalent to splitting the fully connected layer into two layers connected by a lower-dimension fully connected layer in the middle. In this study, the two fully connected layers were divided by SVD to reduce the number of parameters. The improved object detection network is shown in Fig. 4.

Fig. 4. The improved detection network.
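The sketch below shows truncated SVD applied to the weight matrix of a fully connected layer and the resulting two-step forward pass of Eq. (8); the layer size and the value of t are illustrative assumptions.

```python
import numpy as np

# A hypothetical fully connected layer: 1024 inputs -> 1024 outputs.
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024))
b = rng.standard_normal(1024)

t = 256                                    # number of singular values kept (assumption)
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W1 = Vt[:t, :]                             # first layer:  (t, 1024)
W2 = U[:, :t] * S[:t]                      # second layer: (1024, t), singular values folded in

x = rng.standard_normal(1024)
y_full = W @ x + b                         # original forward pass
y_svd = W2 @ (W1 @ x) + b                  # two-step forward pass, Eq. (8)

# Parameters drop from 1024*1024 to 2*1024*t. For this random matrix the truncation
# error is large; trained FC weights are much closer to low rank, which is what
# makes the compression work in practice.
err = np.linalg.norm(y_full - y_svd) / np.linalg.norm(y_full)
print(f"relative error {err:.3f}, params {1024 * 1024} -> {2 * 1024 * t}")
```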
4. Experiment and analysis

4.1. Experiment

The training set was augmented by flipping and color jittering; the images in Fig. 5 are examples of this data augmentation. The images of the augmented training set were normalized to 600 × 800 pixels. Two images as a batch were sent into the neural network for training. The neural network adopted end-to-end training (training the RPN and the classification network at the same time). A dropout layer was added after the fully connected layer, and only 50% of the fully connected layer's neurons were activated in each training iteration. The network weights were initialized by the pre-trained model trained on ImageNet, and Stochastic Gradient Descent (SGD) with momentum was adopted to update the parameters (Wilson and Martinez, 2003). For each region proposal generated by the RPN, nine candidate boxes were generated with three sizes (128, 256 and 512 pixels) and three aspect ratios (0.5, 1 and 2). During training, the 2000 candidate boxes obtained by non-maximum suppression were divided into foreground and background as a training set to train the classification network. Four loss values were analyzed: the regression and logarithmic losses of the RPN, and the regression and logarithmic losses of the classification network. The training iteration was 140,000 times. The sum of the four loss values is shown in Fig. 6.
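As a small illustration of the anchor settings quoted above, the NumPy sketch below generates the nine (width, height) anchor shapes from three sizes and three aspect ratios; the square-root parameterization is the usual Faster R-CNN convention and is an assumption here.

```python
import numpy as np

def make_anchors(sizes=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return the nine (width, height) anchor shapes used per feature-map location.

    For a size s and aspect ratio a (height / width), the anchor keeps an area of
    roughly s*s with h / w = a.
    """
    anchors = []
    for s in sizes:
        for a in ratios:
            w = s / np.sqrt(a)
            h = s * np.sqrt(a)
            anchors.append((w, h))
    return np.array(anchors)          # shape (9, 2)

print(make_anchors().round(1))
# 9 anchors per cell; on a 50 x 37 feature map this gives 50 * 37 * 9 = 16,650 candidates.
```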


Fig. 5. Data augmentation. (a) Original image; (b) images generated by data augmentation.

Fig. 6. Loss graph for 14,000 iterations.

Fig. 7. Detection results. (a) 500 candidate boxes extracted by RPN; (b) Boxes through classification network merged by NMS; (c) Boxes whose category scores ≥ 0.5.

Table 2
The mAP, run time and model size of different models.

Model  SO^a  LS^b  TC^c  RD^d  OS^e  CP^f  mAP  Run time (s)  Model size

VGG16 90.19 71.55 93.17 89.66 71.54 79.82 82.66 0.226 547M
Inception 89.76 70.91 91.20 84.49 74.97 76.98 81.39 0.168 160M
Improved inception 95.48 77.82 95.72 92.44 79.54 86.95 87.99 0.182 261M
R-FCN + ResNet101 88.54 74.81 95.26 93.63 80.47 88.56 86.88 0.246 200M
Improved inception + SVD 95.31 72.92 96.86 93.79 79.86 86.17 87.49 0.183 62M

The bold values mean the best performance in these models.

During testing, the candidate boxes generated by the RPN were ordered by their scores, and the five hundred candidate boxes with the highest scores were selected. Classification and region position adjustment were conducted for these candidate boxes by the fully connected layer of the classification network. In Fig. 7, (a) shows the 500 candidate boxes extracted by the RPN and merged by NMS, (b) shows the boxes output by the classification network and merged by NMS, and (c) shows the boxes whose category scores were over 0.5. We used the mAP as the performance evaluation indicator of the models (Wojek et al., 2012); the mAP was the mean value of each category's average precision (AP). The computation formulas of recall (R), precision (P) and average precision (AP) are shown below:

R = \frac{\text{number of correct detections}}{\text{total number of objects}}   (9)

P = \frac{\text{number of correct detections}}{\text{total number of detections}}   (10)

AP = \int_{0}^{1} P \, dR   (11)
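For reference, a short NumPy sketch of how R, P and AP in Eqs. (9)-(11) can be computed from score-ranked detections is given below; the matching of detections to ground truth (normally an IoU test) is simplified to a boolean flag, so this is an illustration rather than the evaluation code used in the paper.

```python
import numpy as np

def average_precision(scores, is_correct, n_ground_truth):
    """AP as the area under the precision-recall curve, Eqs. (9)-(11).

    scores:         (N,) detection scores
    is_correct:     (N,) True where the detection matches a ground-truth insect
    n_ground_truth: total number of ground-truth insects of this species
    """
    order = np.argsort(scores)[::-1]               # rank detections by score
    correct = np.asarray(is_correct, dtype=float)[order]
    tp = np.cumsum(correct)                        # true positives at each rank
    recall = tp / n_ground_truth                   # Eq. (9)
    precision = tp / np.arange(1, len(tp) + 1)     # Eq. (10)
    # Eq. (11): average the precision at the ranks of the correct detections.
    return float(np.sum(precision * correct) / n_ground_truth)

ap = average_precision(np.array([0.9, 0.8, 0.6, 0.4]),
                       [True, False, True, True], n_ground_truth=4)
# mAP is the mean of the per-species AP values.
```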


Table 3
The mAP of the improved inception model trained on different datasets.

Dataset  SO^a  LS^b  TC^c  RD^d  OS^e  CP^f  mAP  Run time (s)

600 × 800 92.52 71.09 91.95 92.00 74.97 85.35 84.65 0.182
600 × 800 + augmentation 95.48 77.82 95.72 92.44 79.54 86.95 87.99 0.182
1000 × 1300 + augmentation 96.04 73.89 95.63 93.67 79.21 89.65 88.02 0.227

The bold values mean the best performance in these models.

Fig. 8. Ground truth and the performance of different models. (a–e) Ground truth; (f–j) VGG16; (k–o) Improved inception model with SVD operation.

Fig. 9. Examples of error detection. (a) Insects with close adhesion; (b) Missing detection of occluded insects; (c) Insects with distortion; (d) Unstable focus of insects; (e) One species classified as two species.


4.2. Analysis

4.2.1. Model evaluation


The inception network, which adopted the inception structure, had better width, better diversity of extracted features, a smaller number of parameters, and a faster running speed than VGG16. In order to evaluate the insect detection capability of the Faster R-CNN with the improved inception network, Faster R-CNN with VGG16, Faster R-CNN with inception, and R-FCN (Dai et al., 2016) with ResNet101 (He et al., 2016) were also trained. In addition, the SVD operation was conducted on the improved inception network to reduce the model parameters, and the impact of the SVD operation on the model detection was observed. The comparison of the test results of the different models trained on the augmented dataset with 600 × 800 pixels is shown in Table 2.


These results indicated that the improved inception network had the best performance on the testing set and that the inception network had the fastest speed. The improved inception model with the SVD operation was 199M smaller than the original model size, while the mAP value only had a small drop.

In order to find the influence of the data augmentation and the image resolution on the model performance, images with 600 × 800 pixels, images with 600 × 800 pixels and augmentation, and images with 1000 × 1300 pixels and augmentation were used to train the improved inception model. The mAP of the improved inception model trained on the different datasets is shown in Table 3. The performance of this model was significantly improved by the augmentation, while the high resolution of the training images only slightly improved the performance and gave a lower running speed.

4.2.2. Discussion

The improved inception model with the SVD operation, trained on images of 600 × 800 pixels resolution with augmentation, was selected as the final model. The comparison of the detection results between VGG16 and the developed inception network is shown in Fig. 8. The developed detection method could effectively detect insects with different gestures and slight adhesion. However, this method could make errors when there were severe adhesion, occlusion, distortion, or unstable focus of the insects (Fig. 9). The reason one species was classified as two species was that both category scores of Sitophilus Oryzae and Tribolium Confusum were higher than 0.5.

The inception network published in the literature (Szegedy et al., 2015) had a deeper structure than VGG16, so the gradient diffusion problem was more serious and the detection performance was poorer than VGG16. The improved inception network improved the detection performance compared with the inception network and VGG16. Although the R-FCN with ResNet101 could achieve a good mAP, one image took 0.246 s on this model, which was slower than the other models. The SVD operation had a significant effect on the compression of the fully connected layer, which reduced the size of the model to about 60M.

The data augmentation could significantly improve the performance of the model. The high resolution of the images made no contribution to a higher mAP, and images with high resolution added unnecessary details. In addition, increasing the resolution would increase the memory consumption and reduce the detection efficiency.

However, the dataset adopted in this paper contained just six types of stored-grain insects, and the pictures taken in the laboratory differ from the pictures taken by the OITD in grain warehouses. Fig. 10 shows an image uploaded by the OITD in a grain warehouse; the image contained lots of booklice. By May 2017, we had arranged the OITD in 8 different warehouses of China. In future research, we will enrich our image dataset with images from these grain warehouses, improve our algorithm, and apply the automatic detection system inside stored grain bins.

Fig. 10. An example image uploaded by OITD in a grain warehouse.

5. Conclusions

This paper applied an object detection algorithm, based on Faster R-CNN, to detect stored-grain insects under field conditions with impurities. The method could detect insects with slight adhesion. An improved inception network was also developed to enhance the accuracy of small insect detection through deep convolutional neural networks. The improved inception network achieved a higher mAP (87.99) than the inception network (Szegedy et al., 2015) (81.39) and VGG16 (82.66). Besides, we enriched the datasets and got a 3.34 improvement of mAP, and compressed the model from 261M to 62M by the SVD operation with a 0.5 reduction in mAP. We also obtained images from grain warehouses to improve the identification accuracy in the future.

Acknowledgements

This work is partially supported by the Program of Introducing Talents of Discipline to Universities of China (B08004), and the 2015 China Special Fund for Grain-scientific Research in Public Interest (201513002).

References

Dai, J., Li, Y., He, K., et al., 2016. R-FCN: object detection via region-based fully convolutional networks. Comput. Sci.
Ding, W., Taylor, G., 2016. Automatic moth detection from trap images for pest management. Comput. Electr. Agric. 123 (C), 17–28.
Girshick, R., 2015. Fast R-CNN. Comput. Sci.
Girshick, R., Donahue, J., Darrell, T., et al., 2013. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Computer Vision and Pattern Recognition. IEEE, pp. 580–587.
He, K., Zhang, X., Ren, S., et al., 2015. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37 (9), 1904–1916.
He, K., Zhang, X., Ren, S., et al., 2016. Deep residual learning for image recognition. In: Conference on Computer Vision and Pattern Recognition. IEEE, pp. 770–778.
Hosang, J., Benenson, R., Dollár, P., et al., 2015. What makes for effective detection proposals? IEEE Trans. Pattern Anal. Mach. Intell. 38 (4), 814–830.
Jayas, D.S., 2017. The role of sensors and bio-imaging in monitoring food quality. Resour. Mag. 24 (2), 12–13.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. ImageNet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems. Curran Associates Inc., pp. 1097–1105.
Liu, Z., Gao, J., Yang, G., et al., 2016. Localization and classification of paddy field pests using a saliency map and deep convolutional neural network. Sci. Rep. 6, 20410.
Neubeck, A., Van Gool, L., 2006. Efficient non-maximum suppression. In: International Conference on Pattern Recognition. IEEE, pp. 850–855.
Qiu, D., Zhang, H., Chen, T., et al., 2003. Software design of an intelligent detection system for stored-grain pests based on machine vision. Trans. Chinese Soc. Agric. Mach. 34 (2), 83–85.
Ren, S., He, K., Girshick, R., et al., 2017. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39 (6), 1137–1149.
Simonyan, K., Zisserman, A., 2015. Very deep convolutional networks for large-scale image recognition. Comput. Sci.
Srivastava, N., Hinton, G., Krizhevsky, A., et al., 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15 (1), 1929–1958.
Szegedy, C., Liu, W., Jia, Y., et al., 2015. Going deeper with convolutions. In: Conference on Computer Vision and Pattern Recognition. IEEE, pp. 1–9.
Wang, D., Zhou, H., et al., 2016. Research on image acquisition and recognition for stored grain pests. In: International Conference on Artificial Intelligence & Industrial Engineering.
Wilson, D.R., Martinez, T.R., 2003. The general inefficiency of batch training for gradient descent learning. Neural Netw. 16 (10), 1429–1451.
Wojek, C., Dollar, P., Schiele, B., et al., 2012. Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34 (4), 743.
Wu, Y., Wang, K., Tao, F., 2015. Classification of stored-grain insects based on the Extend Shearlet Transform, Krawtchouk Moment and SVM. J. Chinese Cereals Oils Assoc. 30 (11), 103–109.
Zhang, H., Fan, Y., Tian, G., 2005. Research of the stored-grain insects classification based on image processing techniques. J. Zhengzhou Inst. Technol. 26 (1), 19–22.
Zhang, H., Mao, H., Qiu, D., 2009. Feature extraction of image classification on stored-grain insects. Trans. Chinese Soc. Agric. Eng. 25 (2), 126–130.
the accuracy of small insect detection through the deep convolutional

