ABSTRACT Northern maize leaf blight is one of the major diseases endangering the health of maize. The
complex background of the field and varying light intensity make the detection of diseases more difficult.
A multi-scale feature-fusion instance detection method, based on a convolutional neural network, is proposed
to detect maize leaf blight. The proposed technique incorporates three major steps: a data-set preprocessing
part, a fine-tuning network, and a detection module. In the first step, an improved retinex is used to preprocess
the data sets, which solves the problem of poor detection caused by high-intensity light. In the second step,
an improved RPN is utilized to adjust the anchor boxes of diseased leaves. The improved RPN network
identifies and deletes negative anchors, which reduces the search space of the classifier and provides better
initial information for the detection network. In this paper, a transmission module is designed to connect
the fine-tuning network with the detection module. On the one hand, the transmission module fuses
low-level and high-level features to improve the detection accuracy for small-target diseases. On the other
hand, it passes the feature maps associated with the fine-tuning network to the detection module, thus
realizing feature sharing between the detection module and the fine-tuning network. In the third step,
the detection module takes the optimized anchors as input and focuses on detecting the diseased leaves.
By sharing the features of the transmission module, the time-consuming layer-by-layer detection over
candidate regions is eliminated, so the efficiency of the whole model reaches that of a one-stage model.
To further optimize the detection performance, we replace the loss function with the generalized intersection
over union (GIoU). After 60,000 iterations, the highest mean average precision (mAP) reaches 91.83%.
The experimental results indicate that the improved model outperforms several existing methods in terms
of precision and frames per second (FPS).
INDEX TERMS Northern maize leaf blight, disease detection, transmission module, retinex, single shot
multibox detector (SSD).
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 33679
J. Sun et al.: Northern Maize Leaf Blight Detection Under Complex Field Environment Based on Deep Learning
B. MULTI-LAYER INPUT RPN NETWORK
With the addition of the region proposal network (RPN), the division of anchor boxes becomes more detailed. In practical application, however, the efficiency and precision of the modified SSD with the RPN network are insufficient, because a single feature map generates more than 45,000 anchor boxes. A large number of these anchor boxes are located in the background and need to be filtered in the next step. Therefore, it is necessary to adjust the structure of the RPN to detect diseased areas effectively.

The two-stage method handles the problem of class imbalance well. As shown in Fig. 4, the three-layer convolution of the original RPN [26] network is replaced by one kernel (size = 3×3, channels = 1024), two kernels (size = 1×1, channels = 1024 and 256), and a four-layer convolution with kernel (size = 1×1, channels = 512). The convolution in the RPN network slides over the feature map, and a series of region proposals is emitted to provide better initial information for the detection network.

In this paper, a 320×320 feature map is taken as an example. To handle diseased positions at different scales, anchors are extracted on four feature layers of the input feature map, with strides of 5, 10, 20, and 40 pixels. Each feature layer is combined with four scales (20, 40, 80, 160) and three aspect ratios (1:1, 1:2, 2:1), so 12 anchors of different sizes are generated per location. We follow the design of anchor scales over different layers, which ensures that anchors of different sizes have the same density on the image [27]. In this study, the anchors with the largest IoU values and the samples with IoU > 0.5 are selected as positive samples. Meanwhile, all anchors with background confidence > 0.99 are removed; that is, most background anchors are discarded. As a result, the complexity of the model is reduced, the class imbalance is alleviated, and the testing time is shortened [28].

C. TRANSMISSION MODULE
In many studies, fusing features [29], [30] of different scales is an important measure for improving detection performance. Low-level features have higher resolution and contain more location and detail information; however, having passed through fewer convolutions, they carry weaker semantics and more noise. High-level features have stronger semantic information, but their resolution is low and their perception of detail is poor. Combining the two efficiently is the key to improving the accuracy of the detection model.

A transmission module (TM) is designed in this paper to improve both the detection of small targets and the detection efficiency. The feature map associated with the anchors is fused by the transmission module. As shown in Fig. 5, two 3×3 convolutions are first applied to the feature map, and one 4×4 deconvolution is used to expand the high-level feature map; the two results are then summed element-wise to achieve feature fusion. To ensure the discriminability of the detection features, one 3×3 convolution is applied to the summed feature map. The module thus refines the features and sums the corresponding elements with the deep features. The network passes the summation result, as the feature of the current layer, to the detection module, which remedies the insufficiency of the low-level features used in the traditional SSD. Thus, the detection accuracy on small targets is improved. The fine-tuning network sends only the anchors judged to be target disease to the detection module through the transmission module, thereby realizing feature sharing between the detection module and the fine-tuning network.

D. GENERALIZED IOU
Smooth-L1 is used to optimize the bounding boxes of the original SSD. A loss measured by distance, however, does not fully reflect the actual quality of the detection box. As shown in Fig. 6, when the three norm values are equal, the actual detection results still differ greatly (there is a big difference in IoU). This phenomenon indicates that the distance norm cannot accurately reflect the real detection quality, while the accuracy of the bounding box regression directly affects the target detection result. An IoU-based loss, by contrast, not only accurately reflects the overlap between the bounding box and the ground truth, but also has scale invariance. Therefore, the accuracy of target detection can be effectively improved.
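To make the contrast with smooth-L1 concrete, the GIoU of two axis-aligned boxes can be computed as below. This is a minimal sketch following the definition in [18]; the function names are illustrative and not taken from the authors' code.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    GIoU = IoU - (area of C not covered by A or B) / area of C,
    where C is the smallest box enclosing both. Ranges over (-1, 1].
    """
    # intersection area
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # union area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # smallest enclosing box C
    cx1, cy1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    cx2, cy2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    c_area = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (c_area - union) / c_area

def giou_loss(pred, target):
    """Box-regression loss used in place of smooth-L1: 1 - GIoU."""
    return 1.0 - giou(pred, target)

print(giou((0, 0, 2, 2), (0, 0, 2, 2)))   # 1.0 for identical boxes
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))   # negative for disjoint boxes
```

Unlike a distance norm, this loss still distinguishes between near-miss and far-miss predictions when the boxes do not overlap, since the enclosing-box penalty keeps growing with separation.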
FIGURE 7. Detection model of NLB. For input images of 512×512 (320×320), the sizes of Conv1, Conv2, Conv3, and Conv4 are
64×64, 32×32, 16×16, and 8×8 (40×40, 20×20, 10×10, and 5×5), and the numbers of channels are 512, 512, 1024, and 512
(512, 512, 1024, and 512). P4 is the highest-level input (no deconvolution), obtained from the feature map after three convolution
kernels (size 3×3, stride 1, 256 channels) and pooling. P3 is obtained from the feature map after convolution, pooling, and
element-wise summation with the deconvolution of P4. P2 and P1 follow the same process.
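The per-level fusion in the transmission module (expand the higher-level map, then sum element-wise with the current map) can be sketched as follows. This is a simplified data-flow illustration, not the authors' implementation: the learned 3×3 convolutions are stood in for by the identity, the 4×4 deconvolution by nearest-neighbour upsampling, and the channel counts of the two levels are assumed equal (in the model, the convolutions take care of this).

```python
import numpy as np

def fuse_level(cur_feat, higher_feat):
    """Sketch of one transmission-module fusion step.

    cur_feat:    (H, W, C) feature map of the current layer
    higher_feat: (H/2, W/2, C) feature map of the next-higher level
    """
    refined = cur_feat                         # stand-in for the two 3x3 convs
    # stand-in for the 4x4 deconvolution: 2x nearest-neighbour upsampling
    upsampled = higher_feat.repeat(2, axis=0).repeat(2, axis=1)
    fused = refined + upsampled                # element-wise summation
    return fused  # a final 3x3 conv would follow to restore discriminability

# e.g. a 10x10 current map fused with a 5x5 higher-level map,
# as in the 320x320 branch of Fig. 7 (channels unified for simplicity)
print(fuse_level(np.zeros((10, 10, 256)), np.zeros((5, 5, 256))).shape)  # (10, 10, 256)
```

The point of the sketch is the shape bookkeeping: the fused output keeps the resolution of the lower level while inheriting the semantics of the upper level.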
B. SETTING OF TEST PARAMETERS
TABLE 1. Model parameter setting and test accuracy.

V. EXPERIMENTAL RESULTS
A. COMPARISON OF DISEASED POSITION
In this section, we not only present the results obtained by training on images of different sizes, but also compare them with the results of the traditional SSD trained on the two different data sets. The mean average precisions (mAP) and frames per second (FPS) of the models are listed in Table 1. These improvements prove effective for the performance of the new model. In the following parts, the impact of these improvements on the overall network framework is analyzed.

B. THE EFFECT OF IMPROVED RETINEX MODEL ON MAP
Comparing the mAP on Data set A with that on Data set B in each model, it can be concluded that retinex greatly alleviates the poor detection accuracy caused by high-intensity light. The mAP of SSD is improved from 71.8% to 75.42%. The accuracy on Data set B is 5.31% higher than that on Data set A in model 5, and the mAP improves by 2.26% in model 6. In general, the accuracy on Data set B is higher than that on Data set A. Fig. 8 shows part of the detection results (model 6) on Data set A and Data set B. For the specific problems of the data set in this study, the improved retinex model effectively solves the problem that diseased positions are not obvious.
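For orientation, a standard multiscale retinex [25] on a single-channel image subtracts log-illumination estimates at several Gaussian scales. The sketch below illustrates only this baseline; the paper's improved retinex adds further steps that are not reproduced here, the sigma values are conventional choices rather than the paper's, and `scipy` is assumed available for the Gaussian blur.

```python
import numpy as np
from scipy.ndimage import gaussian_filter  # assumed available for the blur

def multiscale_retinex(img, sigmas=(15, 80, 250)):
    """Baseline MSR: R = mean_k [ log I - log(G_sigma_k * I) ].

    img: single-channel uint8 image; returns a uint8 image with
    illumination variation compressed.
    """
    img = img.astype(np.float64) + 1.0            # avoid log(0)
    msr = np.zeros_like(img)
    for sigma in sigmas:
        illumination = gaussian_filter(img, sigma)  # surround (illumination) estimate
        msr += np.log(img) - np.log(illumination)
    msr /= len(sigmas)
    # linear stretch back to the displayable 0-255 range
    msr = (msr - msr.min()) / (msr.max() - msr.min() + 1e-8)
    return (msr * 255.0).astype(np.uint8)
```

For RGB field images the transform would be applied channel-wise; the small scales recover local detail in over-exposed regions while the large scales preserve overall tonal balance.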
FIGURE 8. Detection results on Data set A and Data set B under model 6. Images 1-3 and images 4-6 show the detection effects on Data set A and Data
set B, respectively.
C. THE EFFECT OF TRANSMISSION MODULE COMBINED WITH RPN NETWORK ON MAP
Fig. 9 makes clear that the proposed architecture is more effective than the SSD model for detecting maize leaf blight against a complex background. The multi-layer input RPN network improves the initial information by adjusting the positions of region proposals for the classification and precise adjustment of the detection network. Compared with the original SSD model, the mAP of model 3 (320×320) is improved to 85.65%, although its FPS drops from 48 to 45.2. Compared with model 2 (512×512) on Data set B, model 4 (512×512) achieves a 13.29% higher mAP. The transmission module performs feature-layer fusion, combining the high-level semantic features with the features of the previous layer through deconvolution, which enriches the semantic information of the bottom feature layers. Therefore, the detection of small targets by models 3 and 4 is improved. A partial visualization of models 3 and 4 on Data set B is shown in Fig. 9, from which it can be clearly seen that the models are more effective than the original SSD model. Images 1-3 show the detection effect of SSD: although some small targets are detected, there are still missed detections. Images 4-6 show the detection effect of model 4: more small diseased positions are detected and no missed detections occur.

D. THE EFFECT OF GIOU ON MAP
Fig. 10 shows clearly that the mAP is improved by optimizing the original loss function. Comparing model 3 (320×320) with model 5 on Data set B, the mAP increases from 85.65% to 88.79%, and the mAP also improves (by 1.76%) on Data set A. The best performance of our method is 91.83% (512×512) with model 6, whose mAP is 1.23% higher than that of model 4 (512×512). As can be seen from Fig. 10, the detection accuracy for diseased positions is improved. The main explanation is that GIoU is adopted to redefine the loss: GIoU reflects the real detection situation more accurately than the traditional smooth-L1. Images 1-3 and images 4-6 show the detection effects of model 4 and model 6, respectively. Adding GIoU to the original model improves the detection accuracy for diseased positions.

E. COMPARISON WITH OTHER MODELS
Based on the preprocessed Data set B, Table 2 compares our model with other detection methods. Our method with ResNet-101 produces 91.83% mAP, which is better than the other detection models based on ResNet-101. If the input picture (i.e., 512×512) were further enlarged, a better detection effect might be obtained. Generally speaking, the one-stage detection methods (e.g., RetinaNet, DSSD) still produce a relatively good FPS, but their detection accuracy is worse than that of the two-stage methods (e.g., RelationNet, SNIP). This is because an anchor generated by a one-stage detection method is only a logical structure, which merely needs to be classified and regressed, whereas an anchor generated by a two-stage method is mapped to an area of the feature map, which is then fed to the fully connected layers for classification and regression. Although our proposed method is slightly inferior to the one-stage detection methods in FPS, its FPS is greatly improved by the feature sharing of the transmission module. For the disease data set we use, improving the efficiency of the overall model while maintaining detection accuracy will greatly benefit the whole production process of intelligent agriculture.
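The "logical structure" of one-stage anchors can be made concrete with the configuration used by the multi-layer input RPN (strides 5, 10, 20, 40; scales 20, 40, 80, 160; ratios 1:1, 1:2, 2:1): anchors are pure arithmetic over the grid, with no learned parameters. The w = s·√r convention below is a common choice assumed for illustration, since the paper does not give the exact formula.

```python
import itertools

def enumerate_anchors(img_size=320, strides=(5, 10, 20, 40),
                      scales=(20, 40, 80, 160), ratios=(1.0, 0.5, 2.0)):
    """List anchors as (cx, cy, w, h); 4 scales x 3 ratios = 12 per cell."""
    anchors = []
    for stride in strides:
        cells = img_size // stride                  # feature-map side length
        for i, j in itertools.product(range(cells), range(cells)):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride  # cell centre
            for s, r in itertools.product(scales, ratios):
                w, h = s * r ** 0.5, s / r ** 0.5   # area ~ s^2, aspect w/h = r
                anchors.append((cx, cy, w, h))
    return anchors

anchors = enumerate_anchors()
print(len(anchors))  # 65280 boxes before any background filtering
```

The count makes the filtering step's motivation obvious: tens of thousands of boxes exist before classification, the vast majority of them background, which is exactly what the negative-anchor removal in the fine-tuning network prunes away.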
VI. CONCLUSION
In this paper, a convolutional neural network was applied to the detection of maize leaf blight. A promising detection performance in complex fields was achieved, which can be attributed to the improvements we made based on SSD. In the proposed method, a series of steps was amalgamated, including data preprocessing, feature fusion, feature sharing, and disease detection. The main purpose of the data preprocessing was to reduce the influence of high-intensity light on image identification and improve detection accuracy. To improve the detection accuracy further, feature fusion was utilized to produce the best possible results. Our proposed method also took the improvement of detection efficiency into account: the transmission module not only realized the feature fusion, but also transferred the relevant anchor information from the fine-tuning network to the detection module, realizing feature sharing between the modules and improving detection efficiency. Compared with the original SSD model, the mAP of the new models was higher (from 71.80% to 91.83%). The FPS of the new model also improved (from 24 to 28.4) and reached the standard of real-time detection.

The new model of this study is useful for the detection of maize leaf blight against complex backgrounds. The disease detection model is efficient and accurate, and could replace on-site identification by human experts. It can reduce labor and overcome the subjectivity of manually selected features. The model can be moved into an embedded system, which lays a theoretical foundation for the development of precise pesticide application and precise detection robots for maize leaf blight.

REFERENCES
[1] P.-Y. Tong, "Report on corn production in China," Agric. Tech. Eqpt., vol. 9, pp. 12-18, Sep. 2011.
[2] C. DeChant, T. Wiesner-Hanks, S. Chen, E. L. Stewart, J. Yosinski, M. A. Gore, R. J. Nelson, and H. Lipson, "Automated identification of northern leaf blight-infected maize plants from field imagery using deep learning," Phytopathology, vol. 107, no. 11, pp. 1426-1432, Nov. 2017.
[3] M.-Y. Zhu, H.-B. Yang, and Z.-W. Li, "Early detection and identification of rice sheath blight disease based on hyperspectral image and chlorophyll content," Spectrosc. Spectr. Anal., vol. 39, pp. 1898-1904, Jun. 2019.
[4] C.-L. Chung, K.-J. Huang, S.-Y. Chen, M.-H. Lai, Y.-C. Chen, and Y.-F. Kuo, "Detecting Bakanae disease in rice seedlings by machine vision," Comput. Electron. Agricult., vol. 121, pp. 404-411, Feb. 2016.
[5] D. I. Patrício and R. Rieder, "Computer vision and artificial intelligence in precision agriculture for grain crops: A systematic review," Comput. Electron. Agricult., vol. 153, pp. 69-81, Oct. 2018.
[6] Z. Lin, S. Mu, A. Shi, C. Pang, and X. Sun, "A novel method of maize leaf disease image identification based on a multichannel convolutional neural network," Trans. ASABE, vol. 61, no. 5, pp. 1461-1474, Oct. 2018.
[7] G. Zhou, W. Zhang, A. Chen, M. He, and X. Ma, "Rapid detection of rice disease based on FCM-KM and faster R-CNN fusion," IEEE Access, vol. 7, pp. 143190-143206, 2019.
[8] Z. Libo, H. Tian, G. Chunyun, and M. Elhoseny, "Real-time detection of cole diseases and insect pests in wireless sensor networks," J. Intell. Fuzzy Syst., vol. 37, no. 3, pp. 3513-3524, Oct. 2019.
[9] P. Jiang, Y. Chen, B. Liu, D. He, and C. Liang, "Real-time detection of apple leaf diseases using deep learning approach based on improved convolutional neural networks," IEEE Access, vol. 7, pp. 59069-59080, 2019.
[10] X. Bai, Z. Cao, L. Zhao, J. Zhang, C. Lv, C. Li, and J. Xie, "Rice heading stage automatic observation by multi-classifier cascade based rice spike detection method," Agricult. Forest Meteorol., vol. 259, pp. 260-270, Sep. 2018.
[11] K. P. Ferentinos, "Deep learning models for plant disease detection and diagnosis," Comput. Electron. Agricult., vol. 145, pp. 311-318, Feb. 2018.
[12] J. Ma, K. Du, L. Zhang, F. Zheng, J. Chu, and Z. Sun, "A segmentation method for greenhouse vegetable foliar disease spots images using color information and region growing," Comput. Electron. Agricult., vol. 142, pp. 110-117, Nov. 2017.
[13] M. A. Khan, T. Akram, M. Sharif, M. Awais, K. Javed, H. Ali, and T. Saba, "CCDF: Automatic system for segmentation and recognition of fruit crops diseases based on correlation coefficient and deep CNN features," Comput. Electron. Agricult., vol. 155, pp. 220-236, Dec. 2018.
[14] Z. Lin, S. Mu, F. Huang, K. A. Mateen, M. Wang, W. Gao, and J. Jia, "A unified matrix-based convolutional neural network for fine-grained image classification of wheat leaf diseases," IEEE Access, vol. 7, pp. 11570-11590, 2019.
[15] S. Zhang, W. Huang, and C. Zhang, "Three-channel convolutional neural networks for vegetable leaf disease recognition," Cognit. Syst. Res., vol. 53, pp. 31-41, Jan. 2019.
[16] J. Ma, K. Du, F. Zheng, L. Zhang, Z. Gong, and Z. Sun, "A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network," Comput. Electron. Agricult., vol. 154, pp. 18-24, Nov. 2018.
[17] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Comput. Intell. Neurosci., vol. 2016, pp. 1-11, Jun. 2016.
[18] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, and S. Savarese, "Generalized intersection over union: A metric and a loss for bounding box regression," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 658-666.
[19] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Proc. Eur. Conf. Comput. Vis. (ECCV), Oct. 2016, pp. 21-37.
[20] Y. Shen, Y. Yin, C. Zhao, B. Li, J. Wang, G. Li, and Z. Zhang, "Image recognition method based on an improved convolutional neural network to detect impurities in wheat," IEEE Access, vol. 7, pp. 162206-162218, 2019.
[21] M. P. S. da Silva, M. S. M. Freitas, P. C. Santos, A. J. C. de Carvalho, and T. S. Jorge, "Capsicum annuum var. Annuum under macronutrients and boron deficiencies: Leaf content and visual symptoms," J. Plant Nutrition, vol. 42, no. 5, pp. 417-427, Mar. 2019.
[22] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 318-327, Feb. 2020.
[23] J. Wei, Q. Zhijie, X. Bo, and Z. Dean, "A nighttime image enhancement method based on Retinex and guided filter for object recognition of apple harvesting robot," Int. J. Adv. Robot. Syst., vol. 15, no. 1, pp. 1-12, Jan. 2018.
[24] K. G. Lore, A. Akintayo, and S. Sarkar, "LLNet: A deep autoencoder approach to natural low-light image enhancement," Pattern Recognit., vol. 61, pp. 650-662, Jan. 2017.
[25] D. J. Jobson and G. A. Woodell, "Multiscale retinex for color rendition and dynamic range compression," Proc. SPIE, vol. 2847, pp. 183-191, Nov. 1996.
[26] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137-1149, Jun. 2017.
[27] X. Liu, D. Zhao, W. Jia, W. Ji, C. Ruan, and Y. Sun, "Cucumber fruits detection in greenhouses based on instance segmentation," IEEE Access, vol. 7, pp. 139635-139642, 2019.
[28] J. Sun, W.-J. Tan, and H.-P. Mao, "Identification of plant leaf diseases based on improved convolutional neural network," Trans. Chin. Soc. Agric., vol. 33, pp. 151-162, Oct. 2017.
[29] N. Liu and J.-M. Kan, "Improved deep belief networks and multi-feature fusion for leaf identification," Neurocomputing, vol. 216, pp. 460-467, Dec. 2016.
[30] S. Bertrand, R. B. Ameur, G. Cerutti, D. Coquin, L. Valet, and L. Tougne, "Bark and leaf fusion systems to improve automatic tree species recognition," Ecol. Informat., vol. 46, pp. 57-73, Jul. 2018.
[31] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Sep. 1999, pp. 1150-1157.
[32] Y. Yu, K. Zhang, L. Yang, and D. Zhang, "Fruit detection for strawberry harvesting robot in non-structural environment based on mask-RCNN," Comput. Electron. Agricult., vol. 163, Aug. 2019, Art. no. 104846.
[33] M. Haggag, S. Abdelhay, A. Mecheter, S. Gowid, F. Musharavati, and S. Ghani, "An intelligent hybrid experimental-based deep learning algorithm for tomato-sorting controllers," IEEE Access, vol. 7, pp. 106890-106898, 2019.
[34] X. Ye and Q. Zhu, "Class-incremental learning based on feature extraction of CNN with optimized softmax and one-class classifiers," IEEE Access, vol. 7, pp. 42024-42031, 2019.
[35] W. A. Gardner, "Learning characteristics of stochastic-gradient-descent algorithms: A general study, analysis, and critique," Signal Process., vol. 6, no. 2, pp. 113-133, Apr. 1984.

JUN SUN received the Ph.D. degree in mechanical electronics from Jiangsu University. He is currently a Professor and a doctoral supervisor with the School of Electrical Information Engineering, Jiangsu University. He has published over 80 articles in his related research fields. His research focuses on applications of computer electronics in agriculture, including deep learning, hyperspectral technology, and nondestructive testing technology.

YU YANG is currently pursuing the master's degree with the School of Electrical and Information Engineering, Jiangsu University, China. His research interests include applications of deep learning in agriculture, mainly the application of convolutional neural networks to the precise detection of crop diseases.

XIAOFEI HE is currently pursuing the master's degree in electronics and communication engineering with Jiangsu University, China. His research interests include applications of computer vision in agriculture, using deep learning methods to study weed detection against complex backgrounds.

XIAOHONG WU was born in Hefei, China, in 1971. He is currently a Professor with Jiangsu University. He is mainly engaged in machine learning, pattern recognition, and spectral information processing. He serves as a member of the Sixth Council of the China Electronic Education Society.