Cao (2019) Review of Pavement Defect Detection Methods


Received December 4, 2019, accepted January 12, 2020, date of publication January 15, 2020, date of current version January 24, 2020.


Digital Object Identifier 10.1109/ACCESS.2020.2966881

Review of Pavement Defect Detection Methods


WENMING CAO 1,2,3 , QIFAN LIU 1,2 , AND ZHIQUAN HE 1,2
1 Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen 518060, China
2 Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen 518060, China
3 Video Processing and Communication Laboratory, Department of Electrical and Computer Engineering, University of Missouri, Columbia, MO 65211, USA

Corresponding author: Zhiquan He (zhiquan@szu.edu.cn)


This work was supported in part by the National Natural Science Foundation of China under Grant 61971290, Grant 61771322, and Grant
61871186.

ABSTRACT Road pavement crack detection has been a hot research topic for quite a long time due to its practical importance for road maintenance and traffic safety. Many methods have been proposed to solve this problem. This paper reviews the three major types of methods used in road crack detection: image processing, machine learning and 3D imaging based methods. Image processing algorithms mainly include threshold segmentation, edge detection and region growing methods, which are used to process images and identify crack features. Crack detection based on traditional machine learning methods, such as neural networks and support vector machines, still relies on hand-crafted features extracted with image processing techniques. Deep learning methods have fundamentally changed the way cracks are detected and greatly improved the detection performance. In this work, we review and compare the deep neural networks proposed for crack detection in three categories: classification based, object detection based and segmentation based. We also cover the performance evaluation metrics and the performance of these methods on commonly used benchmark datasets. With the maturity of 3D technology, crack detection using 3D data is a new line of research and application. We compare the three types of 3D data representations and study the corresponding performance of deep neural networks for 3D object detection. Traditional and deep learning based crack detection methods using 3D data are also reviewed in detail.

INDEX TERMS Crack detection, image processing, deep learning, 3D imaging.

The associate editor coordinating the review of this manuscript and approving it for publication was Tao Zhou.

I. INTRODUCTION
With the rapid development of road traffic, pavement maintenance has received more and more attention, as road surface cracks not only affect transportation efficiency but also pose a potential threat to vehicle safety. Many studies have been conducted on detecting cracks in pavement surfaces. In early pavement crack detection systems, people analyzed road images collected by line scan or area scan cameras to examine the road conditions. Such systems include the GERPHO [1] system used in France, the DHDV [2] detection system for American expressways, and the PAVUE [3] system of IMS in Sweden. The development of hardware technologies, such as the appearance of CCD [4] digital photography, has greatly advanced pavement crack detection.

Defect detection distinguishes the parts of an image with defect features from the defect-free parts, which both relates to and differs from image segmentation. By the Wikipedia definition [5], image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics. The idea of image segmentation can be used to separate the defect from the rest of the image. The defects appearing on the surface may have various shapes and types. Fig. 1 shows a few examples. Therefore, defect detection usually contains two subtasks, i.e., locating the defect pixels and classifying the type of the defects.

FIGURE 1. Sample surface defect types: CRACK-Crack, POTHO-Pothole, INPAT-Inlaid patch, APPAT-Applied patch, OPJOI-Open joint.

Researchers have conducted in-depth research on road crack detection and proposed many methods to solve the problem, from image processing to machine learning methods, including the deep learning methods that are widely used nowadays. Image processing methods mainly

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 14531
W. Cao et al.: Review of Pavement Defect Detection Methods

include three categories [6]: threshold segmentation, edge detection and region growing methods. The threshold segmentation method divides the image pixels into several categories by setting a proper pixel intensity threshold, so as to separate the target crack from the background. The edge detection method detects the edges of the road crack through edge detection operators such as the Sobel operator [7], Prewitt operator [8], and Canny operator [9]. The region growing method describes the specific information inside the crack by assembling pixels with similar characteristics into a region.

The emergence of machine learning has raised road crack detection to a new level. Image processing techniques can only analyze some superficial defect features, while machine learning can learn some deep features. Machine learning takes advantage of the similarity between data through the design of algorithms, so that the computer can master the learning rules and make predictions on unknown data by itself. In particular, deep learning methods have greatly advanced the accuracy of pavement crack detection.

Unlike other types of surface defects, pavement cracks are usually deep and large, such as block cracks and alligator cracks [10]. It is practically meaningful to detect and measure the depth of the cracks. Crack depth helps predict the future trend of the crack, which is helpful for repairing the pavement in time and reducing potential safety risks [11]. In recent years, 3D imaging technology has made great progress, and crack detection in 3D images has become a new research direction. Owing to the extra depth dimension, the 3D structure of road cracks can be constructed from the 3D images. Besides this, 3D images can reduce the effect of shadow and other noise [12].

In recent years, several reviews have become available in the literature. Sylvie et al. summarized the application of image processing technologies in road detection and proposed a new protocol for evaluating and comparing automatic road crack detection methods [13]. In the work of [14], Gopalakrishnan compared some deep learning frameworks, networks and hyper-parameters used in pavement crack detection and classified the previous papers, which provides a good reference for developing pavement crack detection models. Tom et al. listed different kinds of pavement defects, discussed different defect detection methods and assessed different defect data acquisition devices [15]. In [16], Mathavan et al. discussed the detection of road surface defects from the perspective of 3D image defect detection, summarized the application of 3D imaging technologies in road surface monitoring, analyzed the imaging principles of different devices and compared the advantages and disadvantages of different pavement detection technologies. These reviews emphasize different aspects of road surface detection. In this review, we provide a comprehensive review of pavement crack detection methods, with an in-depth analysis of deep learning and 3D image based methods in particular.

The rest of the paper is organized as follows. Section II briefly reviews crack detection methods mainly based on image processing techniques. Crack detection based on machine learning methods, including unsupervised learning, traditional supervised learning and deep learning, is reviewed in Section III. Section IV covers 3D imaging technologies and the corresponding methods for pavement defect detection. The existing problems and the prospects of crack detection are discussed in Section V. Section VI concludes this work.

II. CRACK DETECTION BASED ON IMAGE PROCESSING
Pavement is exposed to the natural environment for a long time and is often affected by rain, shadow, stains and other factors. Therefore, the images captured by imaging sensors usually contain a lot of noise, texture and interference. Cracks in images appear as thin, irregular, dark curves surrounded by strong textured noise. Researchers have proposed various image processing methods to reduce the influence of the noise on detection. These methods mainly fall into three categories: threshold segmentation, edge detection and region growing.

A. THRESHOLD SEGMENTATION METHODS
Threshold segmentation [17] is a classical method in image segmentation. For each pixel in the image, we judge whether its characteristic attributes meet a threshold requirement to determine whether the pixel belongs to the target area or the background. In this way, we can convert a gray image into a binary image. Let f(x, y) be the original image and T be the threshold value; image segmentation can then be written as

$$ g(x, y) = \begin{cases} 1, & f(x, y) \ge T \\ 0, & f(x, y) < T \end{cases} $$

Obtaining a reasonable threshold value is the key to this method. Dynamic threshold methods and local threshold methods have achieved good results in pavement defect detection. Oliveira and Correia [18] recognized potential cracks by identifying dark pixels in images with a dynamic threshold. In their work, thresholded images are divided into non-overlapping blocks for entropy computation, and a secondary dynamic threshold of the generated Entropy Block Matrix is used as the basis for identifying image blocks containing crack pixels. Peng et al. proposed a twice-threshold segmentation [19]. First, an improved Otsu threshold segmentation algorithm was used to remove the road markers in the runway image. Then, an improved adaptive iterative threshold segmentation algorithm was used to segment the image with the markers removed. Finally, the outline of the crack was obtained through morphological denoising. In [20], a new multi-scale local optimal threshold segmentation algorithm was proposed to segment pavement cracks through the crack density distribution. Compared with the global threshold method and the optimal threshold method, this method achieved a better segmentation effect.
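As a minimal sketch of the thresholding rule g(x, y) from Section II-A, the following pure-Python example binarizes a small gray image and picks T with the standard Otsu criterion (maximum between-class variance). It is not the improved Otsu variant of [19]; the function names and the toy image are illustrative assumptions. Because crack pixels are darker than the background, the crack mask is taken as the complement of g.

```python
def threshold_image(image, t):
    """Apply g(x, y) = 1 if f(x, y) >= t else 0 to a 2D list of gray levels."""
    return [[1 if pixel >= t else 0 for pixel in row] for row in image]


def otsu_threshold(image, levels=256):
    """Return T chosen by Otsu's method (maximum between-class variance)."""
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]              # number of pixels with value <= t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg     # number of pixels with value > t
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    # The two classes are {<= best_t} and {> best_t}; the rule "f >= T" needs T = best_t + 1.
    return best_t + 1


# Toy 3x3 gray image (assumed values): dark crack pixels on a bright background.
image = [
    [200, 210, 20],
    [205, 30, 200],
    [25, 210, 205],
]
T = otsu_threshold(image)
binary = threshold_image(image, T)                    # 1 = bright background
crack_mask = [[1 - v for v in row] for row in binary]  # 1 = dark crack pixel
print(T, crack_mask)
```

A single global threshold like this is sensitive to shadows and uneven lighting, which is one reason the dynamic and local threshold methods cited in Section II-A were developed.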


B. EDGE DETECTION METHODS
Edge detection methods can also be used in crack detection. Common edge detection operators include the Sobel, Roberts, Prewitt and Canny operators. Different operators produce different results on the same type of edge; Fig. 2 shows an example. Simply using a single operator can hardly achieve the expected effect, and many scholars have improved the edge detection operators. Zhao et al. proposed an improved Canny edge detection method for road edge detection [21]. The Mallat wavelet transform was used to enhance blurred edges, and a better adaptive-threshold Canny algorithm was obtained by using a genetic algorithm [22]. Ayenu-Prah and Attoh-Okine [23] studied a road crack detection method that combines bi-dimensional empirical mode decomposition (BEMD) and Sobel edge detection. BEMD is an extension of EMD [24], which removes noise from the signal without the need for complex convolution processes.

FIGURE 2. Detection effect of different edge operators.

C. REGION GROWING METHODS
Edge detection algorithms can obtain the edge distribution of crack defects and outline the crack contour, but they cannot concretely describe the pixels inside the cracks. Recognition based on region growing provides another idea for pavement crack detection. The basic idea of region growing is to gather similar pixels to form a region. The selection of seeds is very important and greatly affects the accuracy of image segmentation. In the work of [25], the road surface image was first preprocessed, with the lane marked and the uneven background also processed. Then, the crack seeds were selected by grid cell analysis and connected by a Euclidean minimum spanning tree structure. In this way, cracks can be detected quickly and effectively. Li et al. proposed an automatic crack detection method based on FoSA-F* seed growing for better detection of blurred and discontinuous cracks [26]. It exploited a seed-growing strategy to eliminate the requirement that start and end points be set in advance. The global search space is reduced to the local space of interest to improve search efficiency.

III. CRACK DETECTION BASED ON MACHINE LEARNING
Machine learning has become a hot research topic and is widely used in various areas. It makes predictions by learning the rules embedded in the data. Supervised learning and unsupervised learning are both commonly used for crack detection and analysis.

A. UNSUPERVISED LEARNING METHODS
The biggest difference between unsupervised learning and supervised learning is the absence of data labels in training. Training samples for unsupervised learning have no labels and no definite output; the computer needs to learn the similarity between samples by itself and classify the samples. The advantage of unsupervised learning is that there is no need for labeling, which reduces the influence of human subjective factors on the results.

Akagic et al. proposed a new unsupervised road crack detection method based on the gray histogram and the Otsu method, and obtained better results under low signal-to-noise ratio conditions [27]. In [28], Amhaz et al. introduced an improved unsupervised learning algorithm based on minimal path selection, which reduced the loop and peak artifacts in crack detection by estimating the crack width. In [29], Li et al. used a method based on the minimum intensity path of the window to extract candidate cracks at each scale in the image, compared the correspondences between cracks at different scales, and established a crack evaluation model based on a multivariate statistical hypothesis.

B. SUPERVISED LEARNING METHODS
Supervised learning needs labels for the training data. Common supervised learning algorithms include logistic regression [30], naive Bayes [31], support vector machines [32], artificial neural networks [33] and random forests [34]. Xu et al. used the self-learning characteristic of neural networks to transform crack recognition into a crack probability judgment for each sub-block image [35]. They first divide the binary crack image into sub-images and extract parameters representing crack features from each sub-image, then select representative images to train a back-propagation neural network. In [36], CrackForest, a road crack detection framework based on random structured forests, was proposed to effectively handle cracks with uneven edges and cracks with complex topological structures. The authors extracted crack features from multiple levels and directions to train the random forest model. In [37], an automatic pavement crack detection scheme is proposed. First, the crack image is preprocessed to smooth its texture and enhance any existing cracks. Then the image is divided into several non-overlapping blocks, each block produces a feature vector, and a support vector machine is used to detect the cracks. These methods rely heavily on high-quality features extracted from the images, which requires careful algorithm design.

1) DEEP LEARNING METHODS
In recent years, deep learning technologies have achieved tremendous success in various computer vision tasks such as image classification, object detection and image segmentation [38]–[42]. Many deep learning based methods, especially deep convolutional neural networks, have been proposed for


road crack detection. According to the way the crack detection problem is handled, these methods can be roughly divided into three categories: pure image classification methods, object detection based methods and pixel-level segmentation methods.

a: CRACK DETECTION BASED ON CLASSIFICATION
Basically, this category of methods divides the input image into overlapping blocks and then classifies each block image. If a block contains a certain number of defect pixels or more, it is labeled as a defective block.

Crack Detection Based on Binary Classification: This kind of method divides the input images into overlapping blocks and then uses a deep convolutional network to decide whether each block contains a crack. For example, Lei et al. divided road images of 3264 × 2248 into small patches of size 99 × 99 × 3 and used their convolutional neural network to classify these patches [43]. The output is the probability that the patch contains a crack. In the work of [44], Li et al. modified GoogLeNet [45] to classify image blocks and realized crack detection on real pavement using a smartphone. In [46], Cha et al. used MatConvNet [47] to classify 256 × 256 input pavement images. Similarly, in [43], the authors generated image patches of 99 × 99 from the original pavement images, where a patch is defective if its center pixel is within 5 pixels of the crack center. The CNN model was compared with SVM and boosting methods. Leo et al. studied the relationship between network depth and network accuracy using a self-designed CNN model [48]. Unlike the work mentioned above, Chen et al. processed pavement videos in [49]. In this work, a CNN model was designed to classify image patches of size 120 × 120 sampled from video frames, and a naive Bayes data fusion scheme was then adopted to aggregate the information obtained from each video frame to enhance the overall performance and robustness of the system.

Crack Detection Based on Multi-Class Classification: Crack detection based on binary classification is not suitable when the defect type must also be decided. In [50], Fan et al. used one CNN model to learn the structure of pavement cracks as a multi-label classification problem. Small crack image patches of 27 × 27 were used as the input, and the output layer had s × s nodes representing the intensity states of the square block centered at the crack pixel. For example, if s = 5, the model predicts the 25 pixel states of the 5 × 5 block image. During training, the input 27 × 27 was resized to 5 × 5 as the ground truth. In [51], Li et al. proposed a deep CNN for pavement crack classification based on 3D pavement images, classifying pavement patches cut from 3D images into five categories including the normal category. They trained four supervised CNN classification models with different receptive field sizes and found that the receptive field size has only a slight effect on the classification accuracy. The method proposed by Wang and Hu [52] is quite different from the above methods. In this work, the input pavement images are segmented into non-overlapping grids of size 32 × 32 or 64 × 64; a simple CNN is then used to classify each grid image to decide whether it contains a crack. After this, the crack skeleton can be represented by the grid cells containing cracks. PCA (principal component analysis) is used to process the coordinate vectors of the crack grid cells to decide whether the crack type is longitudinal, transverse or alligator.

b: CRACK DETECTION BASED ON PIXEL SEGMENTATION
Pixel segmentation assigns a label or a score to each pixel in the image. In [50], Fan et al. proposed a network structure with 4 convolutional layers, 2 max-pooling layers and 3 fully connected layers to directly segment the original images. The output can have different resolutions, from 1 × 1 to 5 × 5. In [53], Jenkins et al. proposed a semantic segmentation algorithm for road cracks based on U-Net, which is basically an encoder-decoder structure [54]. The network can be divided into an encoder layer and a decoder layer. The encoder layer mainly realizes feature mapping of images, and the decoder layer is mainly used to promote feature vectors during segmentation and generate the probability distribution of each pixel. Similarly, Zou et al. [55] proposed DeepCrack, which uses an encoder-decoder architecture to segment pavement image pixels into crack and background. In [56], the proposed network structure used 4 convolution layers and max poolings as the encoder to extract features and 4 subsequent modules as the decoder. The work of [57] employed residual connections inside each encoder and decoder block and an attention gating block before the decoder to retain only spatially relevant features of the feature map in the shortcut connection. Fully convolutional networks are also often used for segmentation, such as in [58], [59].

c: CRACK DETECTION BASED ON OBJECT DETECTION
Object detection is an important task in computer vision. Its goal is to locate the object with a bounding box in the image and decide the object type. Many deep CNN models have been proposed to improve accuracy and efficiency, such as Faster R-CNN [60], SSD [61] and YOLO [62]. Object detection methods are also popular in road crack detection.

Faster R-CNN is widely used in object detection and has three major steps: 1) extracting image features using a CNN structure such as VGG, 2) proposing candidate regions for objects (RPN), and 3) classifying object types and regressing bounding box coordinates. The CNN structure in step 1 is shared by steps 2 and 3. In [63], Suh and Cha used Faster R-CNN to detect damage in civil infrastructure. Cha et al. modified Faster R-CNN by using a ZF-net to speed up the feature extraction in step 1 [64]. ZF-net [65] is slightly modified from AlexNet [66], which is relatively simple and fast. In [67], Li et al. used Faster R-CNN to detect six kinds of road defects. The model can automatically identify and locate defects under different lighting conditions with high accuracy and stability.

SSD [61] combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes and completely eliminates proposal generation


and encapsulates the region classification and coordinate regression in a single network. This makes SSD much faster than Faster R-CNN. MobileNet [68] is a well-known lightweight deep neural network for mobile applications. To test crack detection on devices with limited resources, Hiroya et al. compared SSD using MobileNet and SSD using Inception v2 [69] for object detection on smartphones and found that SSD using Inception v2 is two times slower than SSD-MobileNet [70]. This conclusion is not surprising, as MobileNet is designed for acceleration.

Unlike the above methods, the Crack-pot method in [71] combined traditional image processing techniques and deep learning methods to detect potholes and cracks in the road. In this method, edge detection, dilation and contour detection were applied to generate candidate bounding boxes for suspected potholes and cracks. These regions were then fed into a classification model modified from SqueezeNet [72] by replacing the last pooling layer with a learned dictionary [73].

Methods based on object detection, such as SSD and Faster R-CNN, propose multiple candidate regions and perform location regression using image features extracted by a CNN, which is a systematic approach to object detection. For defects with compact shapes, these methods may work well. However, for defects like long curves or scratches on the surface, the methods may fail to detect them due to the overly large bounding box proposed by the Region Proposal Network (RPN).

2) METRICS TO EVALUATE MODEL PERFORMANCE
a: PRECISION, RECALL AND F1
The three most commonly used measures for evaluating crack detection performance are precision, recall, and F1. Precision is the ratio of correctly detected results to all detected results, and recall is the ratio of correctly detected results to all results that should be detected. F1 is the harmonic mean of precision and recall:

$$ \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2\,TP}{2\,TP + FP + FN} $$

The detection accuracy is defined as

$$ \mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN} $$

Table 1 shows the definition of FN (False Negative), FP (False Positive), TN (True Negative) and TP (True Positive).

TABLE 1. Definition of FN, FP, TN and TP.

b: ROC, AUC, and IOU
The ROC (Receiver Operating Characteristic) [74] curve and AUC (Area Under Curve) [75] can also be used to measure the detection performance. The ROC curve describes the relationship between the TP rate and the FP rate. Fig. 3 shows two ROC curves. If the ROC curve is closer to the upper-left corner, the FP rate is low and the TP rate is high, which means the model works better. Therefore, the area under the ROC curve, namely AUC, is used to compare two ROC curves.

FIGURE 3. Two ROC curves and AUC.

In object detection using models such as SSD, IOU (Intersection over Union) is often used to decide whether the object is correctly detected. The IOU is the overlap rate between the bounding box given by the model and the ground truth bounding box. If the IOU is larger than a predefined threshold, which is usually 0.5, the object detection is considered successful.

$$ \mathrm{IOU} = \frac{\text{Detection Result} \cap \text{Ground Truth}}{\text{Detection Result} \cup \text{Ground Truth}} $$

c: AIU, ODS and OIS
In [76], the authors proposed three new evaluation metrics: AIU, ODS and OIS. AIU is the average intersection over union between the predicted area and the ground truth area. ODS is the best F1 score on the dataset for a fixed threshold, and OIS is the aggregate F1 score on the dataset when the best threshold is chosen for each image. ODS and OIS are defined as follows:

$$ \mathrm{ODS} = \max\left\{ 2\,\frac{P_t \times R_t}{P_t + R_t} : t = 0.01, 0.02, \ldots, 0.99 \right\} $$

$$ \mathrm{OIS} = \frac{1}{N_{img}} \sum_{i}^{N_{img}} \max\left\{ 2\,\frac{P_t^i \times R_t^i}{P_t^i + R_t^i} : t = 0.01, 0.02, \ldots, 0.99 \right\} $$

where t represents the threshold value, i is the index of the image, N_img is the total number of images, and P_t and R_t are the precision and recall at threshold t on the dataset. P_t^i and R_t^i represent the precision and recall on image i, respectively.

3) PUBLIC DATASETS FOR ROAD CRACK DETECTION
Road crack detection has been a research topic for years, and many public datasets are available to support this research.
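Before turning to the individual datasets, the precision, recall, F1, IOU, ODS and OIS definitions from the metrics subsection can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the evaluation code of [76]: each prediction is assumed to be a flat list of per-pixel crack probabilities, the ground truth a flat list of 0/1 labels, boxes are (x_min, y_min, x_max, y_max) tuples, and all function names are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from TP, FP and FN counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1


def box_iou(box_a, box_b):
    """IOU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def counts_at(probs, truth, t):
    """Count (TP, FP, FN) when pixels with probability >= t are predicted as crack."""
    tp = fp = fn = 0
    for p, y in zip(probs, truth):
        predicted = p >= t
        if predicted and y:
            tp += 1
        elif predicted:
            fp += 1
        elif y:
            fn += 1
    return tp, fp, fn


THRESHOLDS = [k / 100 for k in range(1, 100)]  # t = 0.01, 0.02, ..., 0.99


def ods(prob_maps, truths):
    """Best dataset-level F1 over a single threshold shared by all images."""
    best = 0.0
    for t in THRESHOLDS:
        tp = fp = fn = 0
        for probs, truth in zip(prob_maps, truths):
            a, b, c = counts_at(probs, truth, t)
            tp, fp, fn = tp + a, fp + b, fn + c
        best = max(best, precision_recall_f1(tp, fp, fn)[2])
    return best


def ois(prob_maps, truths):
    """Mean over images of the best per-image F1 (threshold chosen per image)."""
    per_image = [
        max(precision_recall_f1(*counts_at(probs, truth, t))[2] for t in THRESHOLDS)
        for probs, truth in zip(prob_maps, truths)
    ]
    return sum(per_image) / len(per_image)


# Toy data (assumed): two 4-pixel "images" with their ground truth labels.
prob_maps = [[0.9, 0.8, 0.2, 0.1], [0.95, 0.9, 0.85, 0.1]]
truths = [[1, 1, 0, 0], [1, 1, 0, 0]]
print(precision_recall_f1(tp=8, fp=2, fn=4))
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(ods(prob_maps, truths), ois(prob_maps, truths))
```

In this toy example no single threshold separates both images perfectly, so ODS stays below OIS, which is free to pick a different threshold for each image.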


a: CRACKFOREST DATASET (CFD)
The CrackForest dataset consists of 118 images of cracks on urban road surfaces in Beijing, taken with an iPhone 5. Each image is resized to 480 × 320 pixels and has been labeled. It is available at https://github.com/cuilimeng/CrackForest-dataset.

b: AIGLERN DATASET
The AigleRN dataset contains 38 pre-processed gray-scale images of French pavement. Half of them are 991 × 462 and half are 311 × 462. The dataset is available at http://telerobot.cs.tamu.edu/bridge/Datasets.html.

c: CRACK500
CRACK500 contains 500 pictures of pavement cracks of size 2000 × 1500 taken with a smartphone. Each crack image has a binary mask image for annotation. The dataset is divided into three parts: 250 images for training, 50 for validation, and 200 for testing. It is available at https://github.com/fyangneil/pavement-crack-detection.

d: GAPs DATASET
The German Asphalt Pavement Distress (GAPs) dataset includes 1969 gray-scale pavement images, partitioned into 1418 training images, 51 validation images, and 500 test images. The image resolution is 1920 × 1080 pixels. It is available at http://www.tu-ilmenau.de/neurob/data-sets-code/gaps/.

e: RESULTS ON BENCHMARK DATASETS
The following tables compare results on different benchmark datasets. In Table 2 and Table 3, the tolerance margin is the number of pixels by which a predicted pixel may be away from the ground truth pixel when counting true positives. For example, if the tolerance margin is 2, a ground truth pixel is hit if there is a predicted pixel within its 2-pixel neighborhood. AIU, ODS and OIS are used to compare the performance of different methods on the CRACK500 dataset in Table 4.

TABLE 2. Test results on CFD dataset.

TABLE 3. Test results on AigleRN dataset.

TABLE 4. Results comparison on CRACK500 dataset.

TABLE 5. Test results on Gaps dataset.

Reference [80] presented the GAPs dataset to test pavement defect type classification. On this dataset, the authors compared four methods, shown in Table 5, where the RCD net [43] is just a simple and small CNN with four blocks of alternating convolutional and max-pooling layers, the ASINVOS net [80] is modified from the RCD net by adding more blocks, and the ASINVOS-mod [80] is a further version of the ASINVOS net that replaces large convolutional filters with multiple smaller filters.

4) DATA AUGMENTATION
Training a deep neural network model requires a large amount of data. However, it is costly to acquire and label this amount of data. Data augmentation is an effective technique to relieve the problem. Common data augmentation methods include image rotation, flipping, mirroring, adding noise and changing the illumination. These techniques are usually combined to get more data. Table 6 shows the data augmentation techniques used in road crack detection.

IV. CRACK DETECTION BASED ON 3D DATA
Most existing crack detection methods are based on 2D images. With the development of stereo cameras and range-based sensors, stereovision is becoming a promising approach in crack detection, as it can provide accurate and robust depth information.

A. REPRESENTATION OF 3D DATA
Basically, there are three kinds of 3D data representations, namely multi-view, point cloud and voxel data.

Earlier representations of 3D images were made through multi-view. Multi-view represents a collection of 2D images of a rendered polygon grid captured from different viewpoints to convey 3D geometry in a simple manner, as shown in Fig. 4(a). This method is easy to understand, but it is difficult for it to express the spatial structure of 3D data. Moreover, since multi-view projections can only represent 2D contours of 3D objects, some detailed geometrical information is inevitably lost during the projection process [81].

A point cloud is a set of points in 3D space, where each point is specified by its 3D coordinates (x, y, z) and other information such as the RGB color value. This large number of points is used to interpolate the geometric shape


of the object surface; the denser the point cloud, the more accurate the model that can be created. This process is called 3D reconstruction, as shown in Fig. 4(b). 3D scanners and LiDAR devices can be used to generate point cloud data [82].

TABLE 6. Data augmentation.

FIGURE 4. Three expressions of 3D data, (a) Multi-view, (b) point cloud and (c) voxels.

Point cloud data can be converted to structured 3D regular grids [83], namely voxels. A voxel is the smallest unit of digital data in 3D space segmentation; each unit can be viewed as a grid cell with fixed coordinates. Similar to a 2D image, a voxel grid also has a resolution: the finer the 3D space is divided, the smaller each cell is and the higher the resolution. Fig. 4(c) shows 3D occupancy grids at different resolutions. For easy reference, we compare these three kinds of representation in Table 7.

B. COMPARISON OF DIFFERENT 3D REPRESENTATIONS
Different 3D data representations affect the effectiveness of the methods. We compared different methods in terms of object classification performance on the benchmark ModelNet40 [86]. ModelNet40 contains 40 categories of CAD 3D models and is a standard dataset for evaluating semantic segmentation and classification of 3D deep learning models [87]. For 3D object classification, we studied the 60 methods submitted to the web site; Fig. 5 shows the distribution of these methods over different data types. We can see that 21.33% of these methods were based on multi-view, 17.27% were based on point cloud data, 18.29% were based on voxel, and 7.11% were based on other methods. The highest classification accuracy (97.37%) was achieved by RotationNet [88], which jointly estimates the object categories and viewpoints for each single-view image and aggregates object class predictions from partial multi-view image sets.

FIGURE 5. Distribution of 3D object classification methods on data representations.

As just mentioned, different data representations may affect the classification performance. We analyzed three different 3D data representation methods in terms of classification performance. The average accuracy based on multi-view is 92.31%, based on point cloud data is 90.43%, and based on


voxel is 86.73%, as shown in Fig. 6. It can be seen that, in the classification task, the methods based on multiple views and point clouds are more accurate than those based on voxels.

FIGURE 6. Average accuracy of different classification methods.

TABLE 7. 3D data representation.

C. DEEP NETWORKS FOR 3D OBJECT CLASSIFICATION
In the work of [84], the authors presented a CNN architecture that combines information from multiple views of a 3D shape into a single and compact shape descriptor offering even better recognition performance. In this method, images from each view are passed through a separate CNN to extract view-based features. Then, an additional CNN is used to combine these features for final classification.
Following the first volumetric CNN, 3D ShapeNets [86], Maturana et al. proposed VoxNet in [85] to process volumetric data with a grid resolution of 32 × 32 × 32, where the model consists of 4D convolution filters to hold 3D spatial features. Singh et al. also proposed CNN models to classify 3D objects based on volumetric data [89]. LightNet [81] is a faster version of VoxNet that addresses the heavy computation problem for real-time 3D object recognition.
A point cloud is an unordered set of points scanned from the 3D object. The critical problem to solve is to make the model invariant to the permutation of the data points. PointNet [90] is the first CNN model to work directly on the raw point cloud. The method operates on each point separately and accumulates features from all the points by a symmetric function, which is a max pooling layer. PointNet++ introduces a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, the method is able to learn local features with increasing contextual scales [91]. To further address the problem, DGCNN was proposed in [92]. Instead of working on individual points like PointNet, this method constructs a neighborhood graph to capture the local geometric information and proposes the EdgeConv operation to apply convolution-like operations on the edges.
These methods were all tested on the Modelnet40 dataset. We compared them in terms of the number of model parameters, input type, forward time, accuracy and the deep learning framework in Table 8. We can see that the multi-view model is much larger than the other two methods in terms of model parameters. In terms of classification accuracy, methods based on multi-view and point cloud data score slightly higher than those based on voxels. This is caused by the resolution of the voxel grid: the higher the resolution, the larger the amount of calculation and the more complex the model. Generally, only 32 × 32 × 32 or 64 × 64 × 64 resolutions are selected for training.
For multi-view, the performance of the model improves as the number of images from different perspectives increases. The same is true of point cloud data. The more points used to describe an object, the more comprehensive the 3D information of the object will be, and the classification accuracy will be improved. Similarly, the higher the resolution of voxel data, the better the performance of the model.

D. FEATURE EXTRACTION USING 3D DATA
Feature extraction is a very important step in crack detection. 3D data can provide richer features than 2D images.
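
As a small illustration of how features can be derived from 3D data, the following Python sketch shows two ideas discussed in Section C: converting a point cloud into an occupancy grid at a fixed resolution, and pooling per-point features with a symmetric function (max) so that the result does not depend on the order of the points. It is a minimal sketch under stated assumptions, not the implementation of VoxNet or PointNet: the function names `voxelize` and `global_max_feature`, the single linear layer and the example weights are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]
Cell = Tuple[int, int, int]


def voxelize(points: Sequence[Point], resolution: int = 32) -> Set[Cell]:
    """Return the occupied cells of a resolution^3 occupancy grid."""
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # One scale for all axes keeps the aspect ratio of the object.
    # Fall back to 1.0 if all points coincide (hypothetical edge case).
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0
    occupied: Set[Cell] = set()
    for p in points:
        cell = tuple(
            min(int((p[a] - mins[a]) / extent * resolution), resolution - 1)
            for a in range(3)
        )
        occupied.add(cell)
    return occupied


def global_max_feature(
    points: Sequence[Point], weights: Sequence[Sequence[float]]
) -> List[float]:
    """Apply a shared linear map + ReLU to each point, then max-pool over points."""
    n_features = len(weights[0])
    per_point = [
        [max(0.0, sum(p[a] * weights[a][f] for a in range(3)))
         for f in range(n_features)]
        for p in points
    ]
    # Max over the point axis is symmetric: reordering points changes nothing.
    return [max(row[f] for row in per_point) for f in range(n_features)]


# Toy data (assumed for illustration only).
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5), (1.0, 1.0, 1.0)]
weights = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]  # 3 inputs -> 2 features

grid = voxelize(cloud, resolution=4)
feature = global_max_feature(cloud, weights)
feature_reordered = global_max_feature(list(reversed(cloud)), weights)
print(len(grid), feature, feature == feature_reordered)
```

Raising `resolution` makes each cell smaller and the grid grows cubically, which reflects the trade-off between voxel resolution and computational cost noted above.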


TABLE 8. Comparison of different methods on modelnet40 dataset.

Several methods explicitly extract features from 3D data to feed to traditional machine learning models. For example, in the work of [93], the authors combined the extracted features from 2D and 3D to train classifiers, and in [94], spatiotemporal features were extracted from videos using 3D ConvNets. These features followed by a linear classifier achieved state-of-the-art results at the publication time.

1) SPATIOTEMPORAL FEATURES
In [94] Tran et al. proposed a simple and efficient method to learn spatiotemporal features for videos by using a 3D convolutional neural network. They found that 3 × 3 × 3 convolutional kernels in all layers are among the best performing architectures for 3D ConvNets. In [95] Owoyemi and Hashimoto proposed an end-to-end spatiotemporal gesture learning method for 3D point cloud data, mapping the point cloud data into a dense occupancy grid and learning the spatiotemporal characteristics of the data. In this work, a 3D ROI jittering method is used in training to augment the 3D data.

2) GEOMETRIC FEATURES
In [96] Furuya and Ohbuchi proposed a deep local feature aggregation network (DLAN) for 3D model retrieval. It combines the extraction of rotation invariant 3D local features with their aggregation in a single deep architecture. DLAN describes the local 3D region of a 3D model by using a set of 3D geometric features that are not affected by local rotation. Zeng et al. proposed a data-driven model, 3DMatch [97], which learns a local volumetric patch descriptor to establish correspondences between local 3D data and can match local geometric features well in real depth images. Deng et al. proposed PPFNet [98], a deep-learned 3D local feature descriptor that incorporates global information and can be matched to corresponding parts in unordered point cloud data. PPFNet uses a new n-tuple loss and architecture to naturally inject global information into local descriptors and enhance the representation of local features.

E. 3D PAVEMENT DEFECT DETECTION
As 3D data acquisition becomes easier, the application of 3D technology to pavement defect detection is more and more common. 3D data can represent the spatial information (length, width and depth) of road defects well, and supports multi-directional analysis of the area, volume and other aspects of defects.
Xu et al. [99] used 3D mobile LiDAR to collect road point cloud data and studied the automatic extraction of road curbs. In order to improve the robustness and accuracy of the model, they designed a new energy function to extract the constrained candidate points and refined the candidate points with the least cost path model. They sampled the point cloud data at rates of 100%, 50%, 10% and 1%, respectively. Even when the point cloud is reduced to 1%, their method can still extract the road curbs.

1) TRADITIONAL METHODS FOR 3D CRACK DETECTION
Zhang et al. utilized the Microsoft Kinect to reconstruct pavement surfaces and capture geometric features of pavement cracking, including crack width, length, and depth, to identify the distress severities of three major types of pavement cracks, namely, alligator cracking, transverse cracking and longitudinal cracking [100]. In the work of [101], Li et al. employed laser-imaging techniques to model the pavement surface with dense 3D points and used an algorithm based on frequency analysis (Fourier transformation) to separate potential cracks from the control profile and material texture of the pavement, assuming that the road pavement in the absence of pavement distresses commonly holds a relatively uniform control profile. Tsai and Li proposed a dynamic-optimization-based crack segmentation method to


test 1 to 5 mm wide cracks collected by 3D laser at different depths and lighting conditions [102]. To detect similar cracks in masonry, the work [103] presented mathematics to determine the minimum crack width detectable with a terrestrial laser scanner, in which the main features used include orthogonal offset, interval scan angle, crack orientation, and crack depth. In [93], the whole image is divided into sub-images of 128 × 128 pixels and filtered by a set of Gabor filters. The maximum value of the magnitude of every filtered image is the feature used to train weak classifiers. To detect cracks in pavement images, binary segmentation is a straightforward way. Unlike most 2D thresholding techniques, which are based on the assumption that the distress pixels are darker than their surroundings, [104] proposed a probabilistic relaxation labeling technique to enhance the accuracy of the distress detection, which takes account of the non-uniform illumination and complicated contents of the pavement surface areas. The work of [105] proposed a method which uses Dempster-Shafer (D-S) theory to combine the 2D gray-scale image and 3D laser scanning data as a mass function, and the corresponding detection results are fused at the decision-making level.

2) DEEP NETWORK FOR 3D CRACK DETECTION
Applying deep neural networks to 3D crack detection is currently a new and active research direction. In 2017, Zhang et al. proposed the CrackNet network to implement pixel-level detection of pavement cracks and defects [106]. The model consists of five layers with two fully connected layers, two convolution layers and one output layer. The feature extractor utilizes line filters oriented at various directions and with varied lengths as well as widths to enhance the contrast between cracks and the background. The model was trained with 1,800 3D pavement images collected from DHDV [2].
Later on, in the work of [107], the authors proposed an improved architecture of CrackNet called CrackNet II for enhanced learning capability and faster performance. CrackNet II has a deeper architecture with more hidden layers but fewer parameters. Such an architecture yields five times faster performance compared with the original CrackNet. Similar to the original CrackNet, CrackNet II still uses invariant image width and height through all layers to place explicit requirements on pixel-perfect accuracy. In addition, they deepened the network, and the combination of repeated convolution and 1 × 1 convolution is used to learn the local features with different local receptive fields. Recently, Zhang’s team put forward CrackNet V [108], which includes a pre-processing layer, eight convolutional layers and an output layer. They used a 3 × 3 filter for the first six convolutions and stacked multiple 3 × 3 convolutions together for depth extraction, which reduces the number of parameters and improves the efficiency of feature extraction. In addition, they designed a new activation function to improve the detection accuracy of shallow cracks.
In order to improve the recall rate, they put forward CrackNet-R [109], which is based on a recurrent neural network. As a recursive unit, the gated recurrent multi-layer perceptron (GRMLP) is designed to update the internal memory of CrackNet-R recursively. GRMLP aims to abstract the features of the input and hidden state more deeply by multi-layer nonlinear transformation at the gate units. The resultant model is about four times faster than CrackNet and introduces tangible improvements in detection accuracy. The performance comparison of the networks is shown in Table 9.

TABLE 9. Network performance comparison.

3) FACTORS AFFECTING 3D PAVEMENT DEFECT DETECTION
There are many factors that can influence the detection of pavement defects. Tsai and Li [102] proposed a dynamic-optimization-based crack segmentation method to test 1 to 5 mm wide cracks collected by 3D laser at different depths and lighting conditions. Experiments show that cracks with a width equal to or greater than 2 mm can be effectively separated from the pavement background, while cracks with a width of 1 mm can only be partially separated. In addition, it was found that the light intensity had little effect on the test results.
Li et al. [101] used laser imaging technology to model the road surface with dense 3D points and proposed a 3D point cloud crack detection method based on sparse point grouping, which can reduce the influence of light variation and shadow on crack detection. They tested the effect of the data acquisition vehicle on the performance of the proposed method at different speeds (10 km/h to 80 km/h). The experimental results show


that at different speeds, the crack test effect is roughly the same, but the slower the speed, the more detailed the crack contour description.
Debra et al. [103] found through experiment that crack depth detection depends on three factors: scanning distance, scanning angle and crack width. The scanning distance is the distance between the crack and the laser scanner, and the scanning angle is the offset angle between the crack and the laser scanner. Cracks with a width of 1 to 7 mm were scanned at distances of 5 m and 7.5 m and angles of 0°, 15° and 30°. The results show that the crack depth cannot be detected when the crack width is less than 1 mm, because the smaller the crack width is, the more difficult it is to obtain the depth information of the crack. As the crack width increases, the detection of the crack depth becomes more accurate. With the increase of scanning angle, the error of crack depth detection also increases. The closer the scanning distance is, the higher the detection accuracy will be.
Khurram et al. [110] used Kinect to predict and analyze the depth and volume of potholes; the mean percentage errors are 2.58% and 5.47%, respectively. In addition, the test performance for potholes with water, dust and oil is also discussed. Experimental results show that the error of the test results increases with the increase of water, dust and oil content, and the error is also related to the types of these media.

V. EXISTING PROBLEMS AND RESEARCH PROSPECTS
After years of development, many achievements have been made in pavement defect detection, which has made great contributions to the maintenance of pavement and the safety of vehicles. However, there are still some problems in practical application:
1) Due to complex and dynamic environmental factors, there may be some errors in the detection of road cracks under poor light on rainy days or when there is water on the road.
2) Different algorithms are needed for different road surface conditions, and the portability of the algorithms is poor.
3) The process of defect detection is always offline, so the real-time performance is not good in reality.
Therefore, we need to further enhance the detection accuracy and real-time performance of the algorithms to ensure optimal detection results in real applications. The generalization and robustness of the methods are also very important, as factors such as road and weather conditions greatly affect the detection. As for 3D crack detection, the depth information of cracks is added to give the cracks spatial structure. Although the overall information of cracks is more complete, it undoubtedly increases the complexity of the algorithm and greatly increases the computational cost. The algorithm can be improved and the computing cost can be reduced by referring to some progress in deep convolutional neural networks for 2D images, such as network architecture and model compression techniques. On the other hand, there are few public 3D crack datasets; researchers collect pavement crack data for training and testing by themselves, and it is impossible to conduct performance analysis on the same dataset. Collecting 3D crack benchmark datasets will greatly benefit future study of 3D crack detection.

VI. CONCLUSION
The automatic detection of pavement cracks has been studied extensively due to its practical significance. Approaches range from traditional image processing methods to machine learning methods and the deep learning algorithms that have become popular in recent years. In this work, we review these methods, and we focus on the detailed comparison and analysis of deep learning methods and 3D image based methods. Particularly, deep learning methods are grouped and reviewed in three categories: image classification, object detection and pixel-level segmentation. For 3D crack detection methods, we compare the different data representations and study the corresponding performance of the deep neural networks for 3D object classification. Traditional and deep learning based crack detection methods using 3D data are also reviewed.

REFERENCES
[1] G. Caroff, P. Joubert, F. Prudhomme, and G. Soussain, “Classification of pavement distresses by image processing (MACADAM SYSTEM),” in Proc. ASCE, 1989, pp. 46–51.
[2] K. C. Wang, Z. Hou, and W. Gong, “Automation techniques for digital highway data vehicle (DHDV),” in Proc. 7th Int. Conf. Manag. Pavement Assets. Citeseer, 2008. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download
[3] L. Sjogren and P. Offrell, “Automatic crack measurement in Sweden,” in Proc. 4th Int. Symp. Pavement Surface Characteristics Roads Airfields World Road Assoc. (PIARC) 2000.
[4] L. Jin-Hui, L. Wei, and J. Shou-Shan, “A study on road surface defects detecting technology with CCD camera,” J. Xi’an Inst. Technol., vol. 2, 2002.
[5] K. K. Singh and A. Singh, “A study of image segmentation algorithms for different types of images,” Int. J. Comput. Sci. Issues, vol. 7, no. 5, p. 414, 2010.
[6] S. Kamdi and R. Krishna, “Image segmentation and region growing algorithm,” Int. J. Comput. Technol. Electron. Eng., vol. 2, no. 1, 2012.
[7] N. Kanopoulos, N. Vasanthavada, and R. Baker, “Design of an image edge detection filter using the Sobel operator,” IEEE J. Solid-State Circuits, vol. SSC-23, no. 2, pp. 358–367, Apr. 1988.
[8] W. Dong and Z. Shisheng, “Color image recognition method based on the Prewitt operator,” in Proc. Int. Conf. Comput. Sci. Softw. Eng., vol. 6, 2008, pp. 170–173.
[9] L. Er-Sen, Z. Shu-Long, Z. Bao-Shan, Z. Yong, X. Chao-Gui, and S. Li-Hua, “An adaptive edge-detection method based on the canny operator,” in Proc. Int. Conf. Environ. Sci. Inf. Appl. Technol., vol. 1, Jul. 2009, pp. 465–469.
[10] B. J. Lee and H. D. Lee, “Position-invariant neural network for digital pavement crack analysis,” Comput.-Aided Civil Infrastruct. Eng., vol. 19, no. 2, pp. 105–118, Mar. 2004.
[11] J.-Y. Jung, H.-J. Yoon, and H.-W. Cho, “A study on crack depth measurement in steel structures using image-based intensity differences,” Adv. Civil Eng., vol. 2018, pp. 1–10, 2018.
[12] F. Blais, M. Rioux, and J.-A. Beraldin, “Practical considerations for a design of a high precision 3-D laser scanner system,” in Proc. Optomech. Electro-Opt. Design Ind. Syst., Nov. 1988, pp. 225–246.
[13] S. Chambon and J.-M. Moliard, “Automatic road pavement assessment with image processing: Review and comparison,” Int. J. Geophys., vol. 2011, pp. 1–20, 2011.
[14] K. Gopalakrishnan, “Deep learning in data-driven pavement image analysis and automated distress detection: A review,” Data, vol. 3, no. 3, p. 28, Jul. 2018.
[15] T. B. Coenen and A. Golroo, “A review on automated pavement distress detection methods,” Cogent Eng., vol. 4, no. 1, p. 1374822, 2017.


[16] S. Mathavan, K. Kamal, and M. Rahman, “A review of three-dimensional imaging technologies for pavement distress detection and measurements,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 5, pp. 2353–2362, Oct. 2015.
[17] S. Zhu, X. Xia, Q. Zhang, and K. Belloulata, “An image segmentation algorithm in image processing based on threshold segmentation,” in Proc. 3rd Int. IEEE Conf. Signal-Image Technol. Internet-Based Syst., Dec. 2007, pp. 673–678.
[18] H. Oliveira and P. L. Correia, “Automatic road crack segmentation using entropy and image dynamic thresholding,” in Proc. IEEE 17th Eur. Signal Process. Conf., 2009, pp. 622–626.
[19] L. Peng, W. Chao, L. Shuangmiao, and F. Baocai, “Research on crack detection method of airport runway based on twice-threshold segmentation,” in Proc. 5th Int. Conf. Instrum. Meas., Comput., Commun. Control (IMCCC), Sep. 2015, pp. 1716–1720.
[20] S. Wang and W. Tang, “Pavement crack segmentation algorithm based on local optimal threshold of cracks density distribution,” in Proc. Int. Conf. Intell. Comput. Springer, 2011, pp. 298–302.
[21] H. Zhao, G. Qin, and X. Wang, “Improvement of canny algorithm based on pavement edge detection,” in Proc. 3rd Int. Congr. Image Signal Process., Oct. 2010, pp. 964–967.
[22] C.-C. Zhou, G.-F. Yin, and X.-B. Hu, “Multi-objective optimization of material selection for sustainable products: Artificial neural networks and genetic algorithm approach,” Mater. Des., vol. 30, no. 4, pp. 1209–1215, Apr. 2009.
[23] A. Ayenu-Prah and N. Attoh-Okine, “Evaluating pavement cracks with bidimensional empirical mode decomposition,” EURASIP J. Adv. Signal Process., vol. 2008, no. 1, Art. no. 861701, 2008.
[24] Z. Wu and N. E. Huang, “A study of the characteristics of white noise using the empirical mode decomposition method,” Proc. Roy. Soc. London A, Math., Phys. Eng. Sci., vol. 460, no. 2046, pp. 1597–1611, Jun. 2004.
[25] Y. Zhou, F. Wang, N. Meghanathan, and Y. Huang, “Seed-based approach for automated crack detection from pavement images,” Transp. Res. Rec., vol. 2589, no. 1, pp. 162–171, Jan. 2016.
[26] Q. Li, Q. Zou, D. Zhang, and Q. Mao, “FoSA: F* Seed-growing approach for crack-line detection from pavement images,” Image Vis. Comput., vol. 29, no. 12, pp. 861–872, Nov. 2011.
[27] A. Akagic, E. Buza, S. Omanovic, and A. Karabegovic, “Pavement crack detection using Otsu thresholding for image segmentation,” in Proc. 41st Int. Conv. Inf. Commun. Technol., Electron. Microelectron. (MIPRO), May 2018, pp. 1092–1097.
[28] R. Amhaz, S. Chambon, J. Idier, and V. Baltazart, “Automatic crack detection on two-dimensional pavement images: An algorithm based on minimal path selection,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 10, pp. 2718–2729, Oct. 2016.
[29] H. Li, D. Song, Y. Liu, and B. Li, “Automatic pavement crack detection by multi-scale image fusion,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 6, pp. 2025–2036, Jun. 2019.
[30] R. E. Wright. Logistic regression. American Psychological Association. Accessed: 1995. [Online]. Available: https://psycnet.apa.org/record/1995-97110-007
[31] K. M. Leung. Naive Bayesian Classifier. Accessed: 2007. [Online]. Available: http://cis.poly.edu/~mleung/FRE7851/f07/naiveBayesianClassifier.pdf
[32] C. J. C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining Knowl. Discovery, vol. 2, no. 2, pp. 121–167, 1998.
[33] A. Jain, J. Mao, and K. Mohiuddin, “Artificial neural networks: A tutorial,” Computer, vol. 29, no. 3, pp. 31–44, Mar. 1996.
[34] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, 2001.
[35] G. Xu, J. Ma, F. Liu, and X. Niu, “Automatic recognition of pavement surface crack based on BP neural network,” in Proc. Int. Conf. Comput. Elect. Eng., Dec. 2008, pp. 19–22.
[36] Y. Shi, L. Cui, Z. Qi, F. Meng, and Z. Chen, “Automatic road crack detection using random structured forests,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 12, pp. 3434–3445, Dec. 2016.
[37] A. Marques and P. L. Correia, “Automatic road pavement crack detection using SVM,” Ph.D. dissertation, Elect. Comput. Eng., Instituto Superior Técnico, Lisbon, Portugal, 2012.
[38] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015.
[39] W. Cao, Q. Lin, Z. He, and Z. He, “Hybrid representation learning for cross-modal retrieval,” Neurocomputing, vol. 345, pp. 45–57, Jun. 2019.
[40] W. Cao, J. Yuan, Z. He, Z. Zhang, and Z. He, “Fast deep neural networks with knowledge guided training and predicted regions of interests for real-time video object detection,” IEEE Access, vol. 6, pp. 8990–8999, 2018.
[41] D. Meng, L. Zhang, G. Cao, W. Cao, G. Zhang, and B. Hu, “Liver fibrosis classification based on transfer learning and FCNet for ultrasound images,” IEEE Access, vol. 5, pp. 5804–5810, 2017.
[42] D. Meng, G. Cao, Y. Duan, M. Zhu, L. Tu, D. Xu, and J. Xu, “Tongue images classification based on constrained high dispersal network,” Evidence-Based Complementary Alternative Med., vol. 2017, no. 4, pp. 1–12, 2017.
[43] L. Zhang, F. Yang, Y. Daniel Zhang, and Y. J. Zhu, “Road crack detection using deep convolutional neural network,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Sep. 2016, pp. 3708–3712.
[44] S. Li and X. Zhao, “Convolutional neural networks-based crack detection for real concrete surface,” Proc. SPIE, vol. 10598, Mar. 2018, Art. no. 105983V.
[45] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9.
[46] Y.-J. Cha, W. Choi, and O. Büyüköztürk, “Deep learning-based crack damage detection using convolutional neural networks,” Comput.-Aided Civil Infrastruct. Eng., vol. 32, no. 5, pp. 361–378, May 2017.
[47] A. Vedaldi and K. Lenc, “MatConvNet: Convolutional neural networks for MATLAB,” in Proc. 23rd ACM Int. Conf. Multimedia (MM), 2015.
[48] L. Pauly, H. Peel, S. Luo, D. Hogg, and R. Fuentes, “Deeper networks for pavement crack detection,” in Proc. 34th Int. Symp. Autom. Robot. Construct. (ISARC), Jul. 2017, pp. 479–485.
[49] F.-C. Chen and M. R. Jahanshahi, “NB-CNN: Deep learning-based crack detection using convolutional neural network and Naïve Bayes data fusion,” IEEE Trans. Ind. Electron., vol. 65, no. 5, pp. 4392–4400, May 2018.
[50] Z. Fan, Y. Wu, J. Lu, and W. Li, “Automatic pavement crack detection based on structured prediction with the convolutional neural network,” 2018, arXiv:1802.02208. [Online]. Available: https://arxiv.org/abs/1802.02208
[51] B. Li, K. C. Wang, A. Zhang, E. Yang, and G. Wang, “Automatic classification of pavement crack using deep convolutional neural network,” Int. J. Pavement Eng., pp. 1–7, Jun. 2018.
[52] X. Wang and Z. Hu, “Grid-based pavement crack analysis using deep learning,” in Proc. 4th Int. Conf. Transp. Inf. Saf. (ICTIS), Aug. 2017, pp. 917–924.
[53] M. D. Jenkins, T. A. Carr, M. I. Iglesias, T. Buggy, and G. Morison, “A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks,” in Proc. 26th Eur. Signal Process. Conf. (EUSIPCO), Sep. 2018, pp. 2120–2124.
[54] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2015, pp. 234–241.
[55] Q. Zou, Z. Zhang, Q. Li, X. Qi, Q. Wang, and S. Wang, “DeepCrack: Learning hierarchical convolutional features for crack detection,” IEEE Trans. Image Process., vol. 28, no. 3, pp. 1498–1512, Mar. 2019.
[56] W. Liu, Y. Huang, Y. Li, and Q. Chen, “FPCNet: Fast pavement crack detection network based on encoder-decoder architecture,” 2019, arXiv:1907.02248. [Online]. Available: https://arxiv.org/abs/1907.02248
[57] J. Konig, M. D. Jenkins, P. Barrie, M. Mannion, and G. Morison, “A convolutional neural network for pavement surface crack segmentation using residual connections and attention gating,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Sep. 2019, pp. 1460–1464.
[58] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 3431–3440.
[59] U. Escalona, F. Arce, E. Zamora, and J. H. S. Azuela, “Fully convolutional networks for automatic pavement crack segmentation,” Comput. Sist., vol. 23, no. 2, pp. 451–460, 2019.
[60] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Proc. Adv. Neural Inf. Process. Syst., 2015, pp. 91–99.
[61] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in Proc. Eur. Conf. Comput. Vis. Springer, 2016, pp. 21–37.


[62] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779–788.
[63] G. Suh and Y.-J. Cha, “Deep faster R-CNN-based automated detection and localization of multiple types of damage,” Proc. SPIE, vol. 10598, Mar. 2018, Art. no. 105980T.
[64] Y.-J. Cha, W. Choi, G. Suh, S. Mahmoudkhani, and O. Büyüköztürk, “Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types,” Comput.-Aided Civil Infrastruct. Eng., vol. 33, no. 9, pp. 731–747, Sep. 2018.
[65] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in Proc. Eur. Conf. Comput. Vis., 2013, pp. 818–833.
[66] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
[67] J. Li, X. Zhao, and H. Li, “Method for detecting road pavement damage based on deep learning,” Proc. SPIE, vol. 10972, Apr. 2019, Art. no. 109722D.
[68] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” 2017, arXiv:1704.04861. [Online]. Available: https://arxiv.org/abs/1704.04861
[69] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 2818–2826.
[70] H. Maeda, Y. Sekimoto, T. Seto, T. Kashiyama, and H. Omata, “Road damage detection using deep neural networks with images captured through a smartphone,” 2018, arXiv:1801.09454. [Online]. Available: https://arxiv.org/abs/1801.09454
[71] S. Anand, S. Gupta, V. Darbari, and S. Kohli, “Crack-pot: Autonomous road crack and pothole detection,” in Proc. Digit. Image Comput., Techn. Appl., 2018, pp. 1–6.
[85] D. Maturana and S. Scherer, “VoxNet: A 3D convolutional neural network for real-time object recognition,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Sep. 2015, pp. 922–928.
[86] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, “3D ShapeNets: A deep representation for volumetric shapes,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1912–1920.
[87] ModelNet. Accessed: Oct. 2019. [Online]. Available: https://modelnet.cs.princeton.edu/
[88] A. Kanezaki, Y. Matsushita, and Y. Nishida, “RotationNet: Joint object categorization and pose estimation using multiviews from unsupervised viewpoints,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 5010–5019.
[89] R. D. Singh, A. Mittal, and R. K. Bhatia. 3D Convolutional Neural Network for Object Recognition. Accessed: 2017. [Online]. Available: https://pdfs.semanticscholar.org/218b/b5f163046166a5d13f7832d10f0de2ab8286.pdf
[90] R. Q. Charles, H. Su, M. Kaichun, and L. J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 652–660.
[91] C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “PointNet++: Deep hierarchical feature learning on point sets in a metric space,” in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 5099–5108.
[92] Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, “Dynamic graph CNN for learning on point clouds,” ACM Trans. Graph., vol. 38, no. 5, pp. 1–12, Oct. 2019.
[93] R. Medina, J. Llamas, E. Zalama, and J. Gomez-Garcia-Bermejo, “Enhanced automatic detection of road surface cracks by combining 2D/3D image processing techniques,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Oct. 2014, pp. 778–782.
[94] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning spatiotemporal features with 3D convolutional networks,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 4489–4497.
[72] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and
[95] J. Owoyemi and K. Hashimoto, ‘‘Spatiotemporal learning of dynamic
K. Keutzer, ‘‘SqueezeNet: AlexNet-level accuracy with 50x fewer param-
gestures from 3D point cloud data,’’ in Proc. IEEE Int. Conf. Robot.
eters and < 0.5 MB model size,’’ 2016, arXiv:1602.07360. [Online].
Autom. (ICRA), May 2018.
Available: https://arxiv.org/abs/1602.07360
[96] T. Furuya and R. Ohbuchi, ‘‘Deep aggregation of local 3D geometric
[73] J. Mairal, J. Ponce, G. Sapiro, A. Zisserman, and F. R. Bach, ‘‘Supervised
features for 3D model retrieval,’’ in Proc. Brit. Mach. Vis. Conf., 2016,
dictionary learning,’’ in Proc. Adv. Neural Inf. Process. Syst., 2009,
pp. 1–121.
pp. 1033–1040.
[97] A. Zeng, S. Song, M. Niebner, M. Fisher, J. Xiao, and T. Funkhouser,
[74] D. K. Mcclish, ‘‘Analyzing a portion of the ROC curve,’’ Med. Decis.
‘‘3DMatch: Learning local geometric descriptors from RGB-D recon-
Making, vol. 9, no. 3, pp. 190–195, Aug. 1989.
structions,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
[75] A. P. Bradley, ‘‘The use of the area under the ROC curve in the evalua- Jul. 2017.
tion of machine learning algorithms,’’ Pattern Recognit., vol. 30, no. 7, [98] H. Deng, T. Birdal, and S. Ilic, ‘‘PPFNet: Global context aware local fea-
pp. 1145–1159, Jul. 1997. tures for robust 3D point matching,’’ in Proc. IEEE/CVF Conf. Comput.
[76] F. Yang, L. Zhang, S. Yu, D. Prokhorov, X. Mei, and H. Ling, Vis. Pattern Recognit., Jun. 2018, pp. 195–205.
‘‘Feature pyramid and hierarchical boosting network for pavement [99] S. Xu, R. Wang, and H. Zheng, ‘‘Road curb extraction from mobile
crack detection,’’ 2019, arXiv:1901.06340. [Online]. Available: https:// LiDAR point clouds,’’ IEEE Trans. Geosci. Remote Sens., vol. 55, no. 2,
arxiv.org/abs/1901.06340 pp. 996–1009, Feb. 2017.
[77] J. Cheng, W. Xiong, W. Chen, Y. Gu, and Y. Li, ‘‘Pixel-level crack detec- [100] Y. Zhang, C. Chen, Q. Wu, Q. Lu, S. Zhang, G. Zhang, and Y. Yang,
tion using U-net,’’ in Proc. IEEE Region 10 Conf. TENCON, Oct. 2018, ‘‘A Kinect-based approach for 3D pavement surface reconstruction and
pp. 462–466. cracking recognition,’’ IEEE Trans. Intell. Transp. Syst., vol. 19, no. 12,
[78] H. Oliveira and P. L. Correia, ‘‘Automatic road crack detection and pp. 3935–3946, Dec. 2018.
characterization,’’ IEEE Trans. Intell. Transp. Syst., vol. 14, no. 1, [101] Q. Li, D. Zhang, Q. Zou, and H. Lin, ‘‘3D laser imaging and sparse
pp. 155–168, Mar. 2013. points grouping for pavement crack detection,’’ in Proc. 25th Eur. Signal
[79] Y. Liu, M.-M. Cheng, X. Hu, J.-W. Bian, L. Zhang, X. Bai, and J. Tang, Process. Conf. (EUSIPCO), Aug. 2017.
‘‘Richer convolutional features for edge detection,’’ IEEE Trans. Pattern [102] Y.-C.-J. Tsai and F. Li, ‘‘Critical assessment of detecting asphalt pave-
Anal. Mach. Intell., vol. 41, no. 8, pp. 1939–1946, Aug. 2019. ment cracks under different lighting and low intensity contrast conditions
[80] M. Eisenbach, R. Stricker, D. Seichter, K. Amende, K. Debes, using emerging 3D laser technology,’’ J. Transp. Eng., vol. 138, no. 5,
M. Sesselmann, D. Ebersbach, U. Stoeckert, and H.-M. Gross, ‘‘How to pp. 649–656, May 2012.
get pavement distress detection ready for deep learning? A systematic [103] D. F. Laefer, L. Truong-Hong, H. Carr, and M. Singh, ‘‘Crack detection
approach,’’ in Proc. Int. Joint Conf. Neural Netw. (IJCNN), May 2017, limits in unit based masonry with terrestrial laser scanning,’’ NDT E Int.,
pp. 2039–2047. vol. 62, pp. 66–76, Mar. 2014.
[81] S. Zhi, Y. Liu, X. Li, and Y. Guo, ‘‘Toward real-time 3D object recog- [104] E. Salari and G. Bao, ‘‘Automated pavement distress inspection
nition: A lightweight volumetric CNN framework using multitask learn- based on 2D and 3D information,’’ in Proc. IEEE Int. Conf. ELEC-
ing,’’ Comput. Graph., vol. 71, pp. 199–207, Apr. 2018. TRO/INFORMATION Technol., May 2011, pp. 1–4.
[82] F. Chazal, L. J. Guibas, S. Y. Oudot, and P. Skraba, ‘‘Analysis of scalar [105] J. Huang, W. Liu, and X. Sun, ‘‘A Pavement crack detection method
fields over point cloud data,’’ in Proc. 20th Annu. ACM-SIAM Symp. combining 2d with 3d information based on dempster-Shafer theory,’’
Discrete Algorithms, Jan. 2009, pp. 1021–1030. Comput.-Aided Civil Infrastruct. Eng., vol. 29, no. 4, pp. 299–313,
[83] M. J. Lee, ‘‘Method and apparatus for transforming point cloud data to Apr. 2014.
volumetric data,’’ U.S. Patent 7 317 456, Jan. 8, 2008. [106] A. Zhang, K. C. P. Wang, B. Li, E. Yang, X. Dai, Y. Peng, Y. Fei, Y. Liu,
[84] H. Su, S. Maji, E. Kalogerakis, and E. Learned-Miller, ‘‘Multi-view J. Q. Li, and C. Chen, ‘‘Automated pixel-level pavement crack detection
convolutional neural networks for 3D shape recognition,’’ in Proc. IEEE on 3D asphalt surfaces using a deep-learning network,’’ Comput.-Aided
Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 945–953. Civil Infrastruct. Eng., vol. 32, no. 10, pp. 805–819, Oct. 2017.

VOLUME 8, 2020 14543



[107] A. Zhang, K. C. P. Wang, Y. Fei, Y. Liu, S. Tao, C. Chen, J. Q. Li, and B. Li, ‘‘Deep learning-based fully automated pavement crack detection on 3D asphalt surfaces with an improved CrackNet,’’ J. Comput. Civ. Eng., vol. 32, no. 5, Sep. 2018, Art. no. 04018041.
[108] Y. Fei, K. C. P. Wang, A. Zhang, C. Chen, J. Q. Li, Y. Liu, G. Yang, and B. Li, ‘‘Pixel-level cracking detection on 3D asphalt pavement images through deep-learning-based CrackNet-V,’’ IEEE Trans. Intell. Transp. Syst., vol. 21, no. 1, pp. 273–284, Jan. 2020.
[109] A. Zhang, K. C. P. Wang, Y. Fei, Y. Liu, C. Chen, G. Yang, J. Q. Li, E. Yang, and S. Qiu, ‘‘Automated pixel-level pavement crack detection on 3D asphalt surfaces with a recurrent neural network,’’ Comput.-Aided Civil Infrastruct. Eng., vol. 34, no. 3, pp. 213–229, Mar. 2019.
[110] K. Kamal, S. Mathavan, T. Zafar, I. Moazzam, A. Ali, S. U. Ahmad, and M. Rahman, ‘‘Performance assessment of Kinect as a sensor for pothole imaging and metrology,’’ Int. J. Pavement Eng., vol. 19, no. 7, pp. 565–576, Jul. 2018.

WENMING CAO received the M.S. degree from the System Science Institute, Chinese Academy of Sciences, Beijing, China, in 1991, and the Ph.D. degree from the School of Automation, Southeast University, Nanjing, China, in 2003. From 2005 to 2007, he was a Postdoctoral Researcher with the Institute of Semiconductors, Chinese Academy of Sciences. He is currently a Professor with Shenzhen University, Shenzhen, China. His research interests include pattern recognition, image processing, and visual tracking.

QIFAN LIU is currently pursuing the M.Eng. degree in communication and information engineering with Shenzhen University, Shenzhen, China. His research interests include image processing and machine learning.

ZHIQUAN HE received the M.S. degree from the Institute of Electronics, Chinese Academy of Sciences, in 2001, and the Ph.D. degree from the Department of Computer Science, University of Missouri-Columbia, in 2014. He is currently an Assistant Professor with the College of Information Engineering, Shenzhen University, China. His research interests include image processing, computer vision, and machine learning.

