
Real Time Pedestrian and Object Detection and Tracking-based Deep Learning. Application to Drone Visual Tracking

Redouane Khemmar, Matthias Gouveia, Benoit Decoux, Jean-Yves Ertaud

To cite this version:

Redouane Khemmar, Matthias Gouveia, Benoit Decoux, Jean-Yves Ertaud. Real Time Pedestrian and Object Detection and Tracking-based Deep Learning. Application to Drone Visual Tracking. WSCG'2019 - 27th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, May 2019, Plzen, Czech Republic. ⟨10.24132/CSRN.2019.2902.2.5⟩. ⟨hal-02343365⟩

HAL Id: hal-02343365


https://hal.science/hal-02343365v1
Submitted on 2 Nov 2019

Real Time Pedestrian and Object Detection and
Tracking-based Deep Learning: Application to Drone
Visual Tracking

R. Khemmar, M. Gouveia, B. Decoux, JY. Ertaud

Institute for Embedded Systems Research, ESIGELEC, UNIRouen, Normandy University, Saint Etienne du Rouvray, 76800, France

redouane.khemmar@esigelec.fr, gouveia.matthias@hotmail.com, benoit.decoux@esigelec.fr, jean-yves.ertaud@esigelec.fr

ABSTRACT
This work aims to show new approaches in embedded vision dedicated to object detection and tracking for drone visual control. Object/pedestrian detection has been carried out through two methods: 1. A classical image processing approach, through improved Histogram of Oriented Gradients (HOG) and Deformable Part Model (DPM) based detection and pattern recognition methods. In this step, we present our improved HOG/DPM approach allowing the detection of a target object in real time. The developed approach allows us not only to detect the object (pedestrian) but also to estimate the distance between the target and the drone. 2. An object/pedestrian detection approach based on Deep Learning. The target position estimation has been carried out within image analysis. After this, the system sends instructions to the drone engines in order to correct its position and to track the target. For this visual servoing, we have applied our improved HOG approach and implemented two kinds of PID controllers. The platform has been validated under different scenarios by comparing measured data to ground truth data given by the drone GPS. Several tests carried out at the ESIGELEC car park and in Rouen city center validate the developed platform.

Keywords
Object detection, object recognition, visual tracking, tracking, pedestrian detection, deep learning, visual servoing, HOG, DPM.

¹ This work is carried out as part of the INTERREG VA FMA ADAPT project "Assistive Devices for empowering disAbled People through robotic Technologies", http://adapt-project.com/index.php. The Interreg FCE Programme is a European Territorial Cooperation programme that aims to fund high quality cooperation projects in the Channel border region between France and England. The Programme is funded by the European Regional Development Fund (ERDF).

1. INTRODUCTION
The works presented in this paper are a part of the ADAPT¹ project (Assistive Devices for empowering disAbled People through robotic Technologies), which
focuses on a smart and connected wheelchair to compensate for user disabilities through driving assistance technologies. One of the objectives of the project is to develop an Advanced Driver-Assistance System (ADAS) platform for object detection, recognition, and tracking for wheelchair applications (object detection, obstacle avoidance, etc.). The work presented in this paper is related to object/pedestrian detection. In general, ADAS is used to improve safety and comfort in vehicles. ADAS is based on the combination of sensors (RADAR, LIDAR, cameras, etc.) and algorithms that ensure the safety of vehicle, driver, passengers and pedestrians under different conditions such as traffic, weather, etc. [26]. In this project, ADAS aims to detect pedestrians. Our contribution aims to develop a perception system based on object detection with different approaches such as HOG, DPM, and Deep Learning. This paper is organized as follows: Section 1 introduces the motivation of the paper. Section 2 presents the state of the art about object detection/tracking and visual control. Section 3 presents a comparison between the adopted object detection algorithms and some results obtained. Section 4 illustrates our improved HOG/DPM approach applied to object/pedestrian detection and tracking. In the same section, we present an innovative solution to estimate the distance separating objects from the vehicle or the drone. The visual control system based on a multi-approach controller is presented in Section 5. Finally, in Section 6, we conclude this paper.

2. STATE OF THE ART AND RELATED WORK

State of the Art
Object detection is a key problem in computer vision with several applications like automotive, manufacturing industry, mobile robotics, assisted living, etc. Pedestrian detection is a particular case of object detection that can improve road security and is considered an important ADAS component in the autonomous vehicle. In [13], a very in-depth state of the art for pedestrian detection is presented. Three main contributions are developed by the authors: 1. A study of the statistics of the size, position, and occlusion patterns of pedestrians in urban scenes in a large pedestrian detection image dataset; 2. A refined per-frame evaluation methodology that allows informative comparisons, including measuring performance in relation to scale and occlusion; and 3. An evaluation of 16 pre-trained state-of-the-art detectors [13]. As a main conclusion of the study, detection is disappointing at low resolutions and for partially occluded pedestrians. In [1], a real-time pedestrian detection with the DPM algorithm applied to an automotive application is presented. The system is based on a multiresolution pedestrian model and shows superior detection performance compared to classical DPM approaches in the detection of small pixel-sized pedestrians. The system was evaluated with the Caltech Pedestrian benchmark [2], which is the largest public pedestrian database. The practicality of the system is demonstrated by a series of use-case experiments that use the Caltech video database. The discriminatively trained, multiresolution DPM is presented as an algorithm between generative and discriminative models [3][7]. The algorithm has different steps: building a pyramid of images at different scales, then using several filters and part filters to get responses. The algorithm combines these different responses in a star-like model, then uses a cost function and trains classifiers with a Support Vector Machine (SVM). This algorithm is still widely used, particularly in combination with DPM [8]. As another method of object detection, the Integral Channel Features (ICF) [1] can find a combination of multiple registered image channels, which are computed by linear and nonlinear transformations [9]. Integrating features like HOG and training with AdaBoost in a cascade can lead to pedestrian detection with good accuracy [9]. The sliding window methods (also called pyramid methods) are used in object detection at a high cost in detection time. In recent work, proposing high-recall important regions is widely used [10]. In another direction, approaches based on Deep Learning, in particular Convolutional Neural Networks (CNN), have become very successful for feature extraction in the image classification task [33][34]. The Rich Feature Hierarchies for Convolutional Neural Networks (RCNN) model [12], which combines CNN and selective search, can be taken as an example. This algorithm has made huge progress on object detection tasks like PASCAL VOC. It will be presented in the next section.

Related Work
In the literature, for similar tasks, several approaches have been used for object detection and pattern recognition, such as HOG/DPM, KLT/RMR (Kanade-Lucas-Tomasi/Robust Multiresolution Estimation of Parametric Motion Models) and Deep Learning [1]. For example, in Google Robot's Project [17], a deep learning model is applied to an articulated robot arm for picking up objects. In the Kitti Vision Benchmark Suite Project (KIT and Toyota Technological Institute) [18], an object detection and orientation estimation benchmark is carried out. The system allows not only localization of objects in 2D, but also estimation of their orientation in 3D [18]. In [19], an example of end-to-end object detection from Microsoft is described. For the task of detecting objects in images, recent methods based on convolutional neural networks (CNN, Deep Learning) like SSD [18][20] allow detection of multiple objects
in images with high accuracy and in real time. Furthermore, the SSD model is monolithic and relatively simple compared to other models, making it easier to use for various applications. For the task of grasping objects, people from the AI research group of Google have recently used Deep Learning to learn hand-eye coordination for robotic grasping [21]. The experimental evaluation of the method demonstrates that it achieves effective real-time control, can successfully grasp new objects, and corrects mistakes by continuous servoing (control). In many applications of detection of objects like pedestrians, cyclists and cars, it is important to estimate their 3D orientation. In outdoor environments, solutions based on Deep Learning have recently been shown to outperform other monocular state-of-the-art approaches for detecting cars and estimating their 3D orientation [22][23]. In indoor environments, it has been shown that using synthetic 3D models of the objects to be detected in the learning process of a CNN can simplify it [23]. In [24] and [25], we can find an evaluation of state-of-the-art object (pedestrian) detection approaches based on HOG/DPM. In [31], B. Louvat et al. have presented a double (cascade) controller for a drone-embedded camera. The system is based on two aspects: target position estimation based on KLT/RMR approaches, and a control law for target tracking. The developed platform was validated on real scenarios like house tracking. In [32], B. Hérissé has developed in his PhD thesis an autonomous navigation system for a drone in an unknown environment based on optical flow algorithms. Optical flow provides information on the velocity of the vehicle and the proximity of obstacles. Two contributions were presented: automatic landing on a static or mobile platform, and terrain following with obstacle avoidance. All algorithms have been tested on a quadrotor UAV built at the CEA LIST laboratory.

3. OBJECT DETECTION APPROACH
In order to identify the most suitable approach for our object and/or pedestrian detection application for the autonomous vehicle (a drone in this study), we need to establish the feasibility of several scientific concepts in pattern recognition approaches such as KLT/RMR, SIFT/SURF and HOG/DPM, but especially recent approaches from the artificial intelligence field, such as Deep Learning. In this paper, we focus on object detection and tracking through improved HOG/DPM approaches.

Classical Approaches
We started our experimentations by implementing point-of-interest approaches like Scale Invariant Feature Transform (SIFT) [9] and Speeded-Up Robust Features (SURF) [35], based on descriptors that are very powerful for finding matches between images and detecting objects. These methods allow the extraction of visual features which are invariant to scale, rotation and illumination. Despite their robustness to changing perspectives and lighting conditions, we found that the approach is not robust for object/pedestrian detection. SURF is better in that it is up to twice as fast as SIFT. However, SIFT is better when a scale change or an increase in lighting is applied to the image. For pedestrian detection, SIFT and SURF therefore remain insufficient for our application. We have also experimented with the KLT approach [15] for extracting points of interest and tracking them between an image taken at t-1 and another image taken at t. The approach has high accuracy and is fast but, on the other hand, it is not robust to perturbations, for example when the target is displaced too much, or in the case of highly textured images. This is why we decided to experiment with the RMR approach [16], which has very low precision but is considered to be very robust. We obtained results similar to those of the KLT approach. A hybrid KLT/RMR approach would be a better solution, combining the benefits of both approaches.

Object Detection-based HOG/DPM
HOG and DPM algorithms first calculate image features. They then apply classifiers trained on databases of positive images (with pedestrians) and negative images (without pedestrians). They have the advantage of being accurate and give relatively good results; however, their computation time is high. We have therefore experimented with three approaches: HAAR, HOG and DPM.

3.1.1 Pseudo-HAAR Features
HAAR classifiers use features called pseudo-HAAR [36][27]. Instead of using pixel intensity values, the pseudo-HAAR features use the contrast variation between rectangular and adjacent pixel clusters. These variations are used to determine light and dark areas. To construct a pseudo-HAAR feature, two or three adjacent clusters with relative contrast variation are required. It is possible to change the size of the features by increasing or decreasing the pixel clusters, which makes it possible to use these features on objects of different sizes.

3.1.2 HOG Algorithm
HOG is a descriptor containing key features of an image. These features are represented by the distribution of image gradient directions. HOG is an algorithm frequently used in the field of pattern recognition. It consists of five steps:
1. Sizing of the calculation window; by default the size of the image to be processed is 64x128 pixels
2. Calculation of image gradients using simple masks
3. Division of the 64x128 image into 8x8 cells. For each cell, the HOG algorithm calculates the histogram.
4. Normalization of histograms by 16x16 blocks (i.e. 4 cells)
5. Calculation of the size of the final descriptor.
The obtained descriptor is then given to an SVM classifier. The classifier needs many positive and negative images. The HOG/SVM combination gives good results with limited computing resources.
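As an illustration of steps 1-5, the size of the final HOG descriptor follows directly from the window, cell, block and histogram parameters above (a minimal Python sketch; the one-cell block stride is the usual choice and is an assumption here, not stated in the text):

```python
def hog_descriptor_size(win=(64, 128), cell=8, block=2, bins=9):
    """Length of the final HOG descriptor (step 5).

    win   -- detection window in pixels (width, height), step 1
    cell  -- cell side in pixels (8x8 cells), step 3
    block -- block side in cells (16x16-pixel blocks = 2x2 cells), step 4
    bins  -- orientation bins per cell histogram (9 for unsigned gradients)
    Blocks are assumed to slide one cell at a time.
    """
    cells_x = win[0] // cell
    cells_y = win[1] // cell
    blocks = (cells_x - block + 1) * (cells_y - block + 1)
    return blocks * block * block * bins

print(hog_descriptor_size())          # 3780 for the default 64x128 window
print(hog_descriptor_size(bins=18))   # 7560 with signed gradients (18 bins)
```

The 3780-value descriptor for a 64x128 window is the classic HOG configuration; the 18-bin variant corresponds to the signed gradients discussed in Section 4.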
3.1.3 DPM Algorithm
The hardest part of object detection is that there are many variations. These variations arise from illumination, change in viewpoint, non-rigid deformation, occlusion, and intra-class variability [4]. The DPM method is aimed at capturing those variations. It assumes that an object is constructed from its parts. Thus, the detector will first find a match with a coarser (at half the maximum resolution) root filter, and then use its part models to fine-tune the result. DPM uses HOG features on pyramid levels before filtering, and a linear SVM as a classifier, with training to find the different part locations of the object in the image. Recently, new algorithms have been developed in order to make DPM faster and more efficient [4][8].

Pedestrian Detection-based HAAR/HOG/DPM
We have carried out all the experiments with 6 different databases dedicated to pedestrian detection: ETH, INRIA, TUD-Brussels, Caltech, Daimler and our own ESIGELEC dataset. Fig. 1 and Fig. 2 show respectively the results obtained on the ETH and ESIGELEC datasets.

Figure 1. Comparison between HOG and DPM applied under ETH dataset (640x480-image resolution). Left column: HOG algorithm, right column: DPM algorithm.

Figure 2. Comparison between HOG and DPM applied under ESIGELEC dataset (640x480-image resolution). Top row: HOG algorithm, bottom row: DPM algorithm.

Object Detection-based Deep Learning
Many detection systems repurpose classifiers or localizers to perform detection. They apply the classification to an image at multiple locations and scales. High-scoring regions of the image are considered as positive detections [4][6]. A CNN classifier, like RCNN, can be used for this application. This approach gives good results but requires many iterations to process a single image [4][6]. Many detection algorithms using selective search [4][5] with region proposals have been proposed to avoid exhaustive sliding windows. With deep learning, the detection problem can be tackled in new ways, with algorithms like YOLO and SSD. We have implemented those two algorithms for object detection in real time [14], and obtained very good results (qualitative evaluation). Fig. 3 shows an example of our deep learning model applied to pedestrian detection within the ESIGELEC dataset at the city center of Rouen.

Figure 3. Object/Pedestrian detection-based deep learning (Yolo and SSD). The results have been obtained under a GPU Nvidia Quadro K4000M machine.
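The "classification at multiple locations and scales" strategy described above can be sketched as follows (a toy Python pseudo-detector: `classify` stands in for any window classifier such as HOG/SVM, and the window sizes, stride, scales and 0.5 threshold are all illustrative assumptions):

```python
def sliding_window_detect(img_w, img_h, classify,
                          win=(64, 128), stride=16, scales=(1.0, 0.75, 0.5)):
    """Run a window classifier at multiple positions and scales.

    classify(x, y, w, h) -> score; windows scoring over 0.5 count as detections.
    Rescaling the window (rather than the image pyramid) keeps the sketch short;
    the cost is one classifier call per window, which is why these methods are slow.
    """
    detections = []
    for s in scales:
        w, h = int(win[0] / s), int(win[1] / s)  # larger window = closer object
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                score = classify(x, y, w, h)
                if score > 0.5:
                    detections.append((score, (x, y, w, h)))
    return sorted(detections, reverse=True)

# Toy classifier: the "pedestrian" occupies exactly the box (320, 160, 64, 128)
target = (320, 160, 64, 128)
hits = sliding_window_detect(
    640, 480, lambda x, y, w, h: 1.0 if (x, y, w, h) == target else 0.0)
print(hits)  # [(1.0, (320, 160, 64, 128))]
```

Counting the classifier calls in this loop makes the "high cost of detection time" concrete, and motivates both the region proposals and the RoI restriction of Section 4.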
4. OBJECT DETECTION AND TRACKING-BASED FASTER DPM ALGORITHM

Improved HOG Algorithm
We have identified four methods to improve the HOG/DPM algorithm, which correspond to the five steps of the HOG algorithm. In order to be able to make comparisons with the original version, a classifier was trained on the INRIA and TUD-Brussels image datasets. The number of available images in these datasets is relatively small compared to some other datasets, which has an impact on the pedestrian detection quality. The objective is to improve the quality of HOG/DPM pedestrian detection and to significantly reduce the computation time for detection in road scenes. The tests were carried out on the ESIGELEC dataset. 1. Gamma Correction: in order to improve the pedestrian image quality, we performed a gamma pre-processing on the images to be trained and tested. Gamma correction can brighten the dark areas in images if the gamma coefficient is greater than 1 or, on the contrary, darken them if the coefficient is less than 1. We found fewer artifacts in the processed images. 2. Image Resizing: the HOG algorithm performs calculations on images of size 64x128 pixels. We performed the calculation with a window size of 128x256. By doubling the size of the calculation window, we improved the accuracy but also doubled the computation time (for example from 58 ms for the classic HOG to 115 ms for the improved HOG). 3. Negative Gradients: when calculating gradients, the improved HOG uses negative gradients (-180° to 0°) in addition to the positive gradients (0° to 180°) of classical HOG. This allows the calculation of histograms with 18 values (9 values in classic HOG). The calculation time does not vary; however, taking the negative (signed) gradients into account brought only a small improvement, considered not significant. Moreover, we found the presence of noise, so this does not improve on the classic HOG. 4. Normalization Vector: as a last step, HOG performs a normalization over 2x2 cells, i.e. 32x32 pixels, with a stride of 16x16. The object detection is degraded and the computation time doubles, which cannot be considered an improvement over classic HOG.

Faster-DPM Algorithm
The DPM algorithm is applied to the entire image, and this is done at each iteration of the pedestrian video sequence. In order to reduce the computing time, the idea is to apply the detection algorithm only in a Region of Interest (RoI) in which the target (here the pedestrian) is located. This drastically reduces the computing time and also better isolates the object. Firstly, the DPM is applied over the whole image once, to locate the object. Secondly, after obtaining a RoI surrounding the object to be detected as a bounding box (yellow box in figure 4), we build a New RoI (NRoI) by adding a tolerance zone (a new rectangle, like the green one in figure 4, which is larger than the first one). Starting from the second iteration, the DPM algorithm is applied only to this new sub-image, represented by the NRoI. If the target to be detected is lost, the DPM algorithm is re-applied over the entire image. In Fig. 4, we can see that Faster-DPM improves target detection by reducing the processing time by a factor of 4 to 8 compared to classic DPM.

Figure 4. Comparison between classic DPM and Faster-DPM. Left top and bottom images: object detection based on DPM; right top and bottom images: object detection based on Faster-DPM with adaptive RoI (green box).

Pedestrian Distance Estimation
To have better information on the detected pedestrians, it is necessary to estimate the distance separating the pedestrians from the vehicle. As our system is based on a monocular camera, this distance has to be estimated. The law called "Inverse Squares", used in astronomy to calculate the distance of stars, inspired us: a physical quantity (energy, force, etc.) is inversely proportional to the square of the distance of the star. By analogy, the physical quantity here is the area of our RoI (the bounding box surrounding the target). We have used a parametric approach: we took measurements at intervals of 50 cm, and the corresponding object RoI surfaces were recorded. Fig. 5 shows the result obtained.
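Inverting the inverse-square relation above gives a distance estimate directly from the detected RoI area (a minimal Python sketch; A is the calibration constant fitted from these measurements and reported with equation (2) below):

```python
import math

A = 262673.0  # calibration constant from equation (2): S = A * d^-2

def estimate_distance(roi_area):
    """Estimate target distance (m) from the RoI area S (pixel^2): d = sqrt(A / S)."""
    return math.sqrt(A / roi_area)

# Spot-check against two of the calibration measurements from Table 1
print(round(estimate_distance(33180), 2))  # ~2.81 (measured at 3.0 m)
print(round(estimate_distance(11102), 2))  # ~4.86 (measured at 5.0 m)
```

As expected, a smaller RoI means a farther target; the estimate degrades outside the 3-7 m ideal detection zone quoted in the text.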
Figure 5. Detected RoI surface (blue curve) vs distance from the object (red curve): the y axis represents the surface (pixel²) and the x axis represents the distance (m).

The blue curve looks like an inverse power curve. This is confirmed by the trend curve, whose equation (1) is:

y = 262673 * x^(-1.905)   (1)

where x is the abscissa and y the ordinate. The equation of the blue curve is (2):

S = A * d^(-2)   (2)

where S is the surface in pixel², d is the distance to be estimated, and A = 262673.

The ideal detection zone is between 3 and 7 meters, with an error of 5%. Tab. 1 illustrates the calibration process carried out for the measurement environment.

Distance (m)   RoI Surface (pixel²)   K = S * d²
2.5            44044                  275275
3              33180                  298620
3.5            25392                  311052
4              19200                  307200
4.5            14360                  290790
5              11102                  277550
5.5            10920                  330330
6              8427                   303372
6.5            8216                   347126
7              8348                   311052
8              4800                   307200

Table 1. Measurement environment calibration.

5. VISUAL CONTROL-BASED MULTI-APPROACH CONTROLLER

Drone Double Controller
The speed servo control of the drone allows it to track the target continuously and in real time. The image processing, as a closed loop, gives the coordinates of the target in the (x,y,z) frame. Using this information, the system sends instructions to the drone engines corresponding to the real-time position of the target, in order to correct the position of the drone. We have developed two kinds of controllers: a Proportional-Integral-Derivative (PID) controller and an Adaptive Super Twisting (AST) controller [28]. The controller system is based on three corrections: 1. drone altitude correction, 2. drone rotation speed, and 3. drone navigation. Under the Robot Operating System (ROS), the instructions are considered as speeds sent to the drone engines.

Firstly, we applied a Proportional (P) controller:
- Altitude: Kp = 0.004
- Forward translation: Kp = 0.08

Secondly, we applied a classic PID controller (3):

V_rotation = Kp * error + Ki * ∫(0 to t) error dt + Kd * d(error)/dt   (3)

with, for rotation: Kp = 0.0041, Ki = 0.0003, Kd = 0.0003.

Lastly, in (4), we applied an AST (nonlinear PID) controller:

V_rotation = Kp * (error / (|error| + A)) * sqrt(|error|) + Ki * ∫(0 to t) (error / (|error| + A)) dt   (4)

with A = 0.001, Kp = 0.013, and Ki = 0.02.

The P controller is amply sufficient to servo the altitude of the drone and its translation. However, it is the rotational servoing that predominates: we need to keep the target in the center of the image, and the target moves strongly from right to left and vice versa. Overall, the visual servoing that we have adopted uses the data extracted during the HOG/DPM or deep learning image processing (visual controller) and sends commands to the drone actuators. This is a double controller: speed control (internal loop) to control the motors, and position servoing (external loop) to position the drone according to the position of the target to track. The correction can be done with classical correctors (P, PI, PID) or more advanced commands such as the AST controller. Fig. 6 illustrates the architecture of the cascade control used.

Figure 6. Double controller with external loop (in black) and internal loop (in red).
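Assuming a scalar pixel error sampled at a fixed period dt, control laws (3) and (4) can be sketched in discrete time as follows (a minimal sketch with the rotation gains quoted above; the error value and dt are illustrative):

```python
class PID:
    """Classic PID, equation (3): u = Kp*e + Ki*integral(e) + Kd*de/dt."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = None

    def update(self, err, dt):
        self.integral += err * dt
        deriv = 0.0 if self.prev_err is None else (err - self.prev_err) / dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv


class AST:
    """AST law, equation (4):
    u = Kp * e/(|e|+A) * sqrt(|e|) + Ki * integral(e/(|e|+A))."""
    def __init__(self, kp, ki, a):
        self.kp, self.ki, self.a = kp, ki, a
        self.integral = 0.0

    def update(self, err, dt):
        s = err / (abs(err) + self.a)      # sign-like term, smoothed by A
        self.integral += s * dt
        return self.kp * s * abs(err) ** 0.5 + self.ki * self.integral


# Rotation gains from the text; error of 40 px is a hypothetical sample
pid = PID(kp=0.0041, ki=0.0003, kd=0.0003)
ast = AST(kp=0.013, ki=0.02, a=0.001)
err = 40.0
print(pid.update(err, dt=0.1), ast.update(err, dt=0.1))
```

The square root in the AST law makes the command grow sublinearly with the error, while the smoothed sign term keeps it bounded near zero error, which is what makes the law nonlinear compared to (3).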
Test & Validation
Indoor and outdoor environment tests were carried out throughout the project. For the final test phases, scenarios were established. In order to validate the final platform, several scenarios were defined. The platform gives very good results. Better results are obtained when the target (person) is alone in the environment. Performance is reduced when the target is in a group, despite the use of an adaptive region of interest. In addition, there is always the risk of losing the target: object detection algorithms do not have a detection rate of 100%, and sometimes the person to track is no longer detected. If the target is no longer detected, the drone is ordered to switch to stationary mode as a fallback situation. We are correcting this problem through two different methods: 1. An estimation of the future position of the target based on an improved Kalman filter, which gives better results; we are now able to predict the position of the target to be followed and thus to minimize possible confusion of this target with another target of the same nature (as, for example, in the case of several people walking together). A second approach is also under development, which concerns deep learning not only for object/people detection and tracking, but also for object distance and orientation estimation. To compare the performance of the two implemented control laws (PID controller and AST controller), the GPS coordinates of each trajectory were recorded and compared. In order to illustrate the results obtained, the target (here a person) followed a reference trajectory in the form of a square. Fig. 7 shows the performance obtained on the square-shaped trajectory. We can see that the PID controller is more accurate than the AST controller.

Figure 7. PID and AST comparison in object/person detection and tracking based on our improved HOG/DPM. Top image: target square trajectory (blue line) carried out by the mobile target (person). Bottom image: comparison of the different trajectories carried out by the person (ground truth data), the PID controller and the AST controller. The x and y axes represent geographic coordinates.

6. EXPERIMENTAL RESULTS
The developments were carried out on an Ubuntu Linux platform with ROS, which is a set of computer tools for the robotics environment that includes a collection of tools, libraries and conventions aiming to simplify development and thus allow more complex and robust robot behavior [29]. The ROS architecture developed comprises 4 different nodes: Node 1: image acquisition from the drone's embedded camera; Node 2: image processing and calculation of the position of the target to follow; Node 3: visual servoing based on the PID or AST controller; and Node 4: sending commands to the drone (manual and/or autonomous steering). The drone used is a Parrot Bebop 2 with 22 minutes of autonomy, a weight of 420 g, a range of 300 m, a camera resolution of 14 Mpx, and a maximum speed of 13 m/s. We have carried out several tests with six datasets dedicated to pedestrian detection. In this section, we present the tests carried out on the ESIGELEC dataset, including test scenarios under real traffic conditions in a Rouen shopping center car park and Rouen city center. The results obtained are shown in Fig. 8.
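The four-node pipeline described above can be sketched, independently of ROS itself, as a chain of plain functions (a minimal sketch: the 640x480 frame size matches the dataset figures, while the bounding box, proportional gain and clamping limit are illustrative assumptions):

```python
IMAGE_WIDTH = 640  # frame width, as in the 640x480 dataset figures

def image_processing(bbox):
    """Node 2: horizontal center of the detected target from its bounding box."""
    x, y, w, h = bbox
    return x + w / 2.0

def visual_servoing(target_x, kp=0.0041):
    """Node 3: P-control on the horizontal pixel error (rotation gain from Section 5)."""
    error = target_x - IMAGE_WIDTH / 2.0
    return kp * error

def send_command(rotation_speed, limit=1.0):
    """Node 4: clamp the command before forwarding it to the drone."""
    return max(-limit, min(limit, rotation_speed))

# One loop iteration: Node 1 (image acquisition) is replaced here by a
# hypothetical detection 120 px to the right of the image center.
bbox = (400, 120, 80, 200)
cmd = send_command(visual_servoing(image_processing(bbox)))
print(round(cmd, 4))  # 0.492
```

In the real system each of these stages is a separate ROS node exchanging messages, so the detection and the servoing can run at different rates.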
Figure 8. Pedestrian detection with the improved HOG/DPM approach under the ESIGELEC indoor/outdoor dataset. Left top image: pedestrian detection with calculated distance under DPM; right top image: pedestrian detection based on the SSD deep learning model; left bottom image: pedestrian detection based on DPM; right bottom image: pedestrian detection based on Faster-DPM.

7. CONCLUSION
In this paper, we have presented a new approach for object detection and tracking applied to drone visual control. A comparison study between different approaches dedicated to pattern recognition and object/pedestrian detection has been carried out. We have presented our contribution to improving the quality of both the HOG and DPM algorithms. We have also implemented the deep learning based Yolo and SSD algorithms for pedestrian detection. An innovative method for distance calculation of pedestrians/objects is presented in this paper. The approach was validated within experiments, with an accuracy within 5% of error. The system detects not only the object/target to follow but also its distance from the drone. The system is generic, so that it is "applicable" to any type of platform and/or environment (pedestrian detection for autonomous vehicles and smart mobility, object detection for smart wheelchairs, object detection for autonomous trains, etc.). This work aims to show the feasibility of scientific and technological concepts for object detection and tracking based on classic approaches like HOG/DPM, but also on deep learning approaches based on Yolo or SSD. We have validated the whole development under several scenarios, using both the Parrot drone platform and a real vehicle in real traffic conditions (city center of Rouen, France).

8. ACKNOWLEDGMENTS
This research is supported by the ADAPT project. This work is carried out as part of the INTERREG VA FMA ADAPT project "Assistive Devices for empowering disAbled People through robotic Technologies", http://adapt-project.com/index.php. The Interreg FCE Programme is a European Territorial Cooperation programme that aims to fund high quality cooperation projects in the Channel border region between France and England. The Programme is funded by the European Regional Development Fund (ERDF). Many thanks to the engineers of the Autonomous Navigation Laboratory of IRSEEM for their support during platform tests.

9. REFERENCES
[1]. H. Cho, P. E. Rybski, A. B-Hillel, W. Zheng. Real-time Pedestrian Detection with Deformable Part Models, 2012.
[2]. Caltech Pedestrian Detection Benchmark Homepage, http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/, last accessed 2018/01/14.
[3]. Felzenszwalb, P., McAllester, D., & Ramanan, D. A discriminatively trained, multiscale, deformable part model. In Computer Vision and Pattern Recognition, CVPR 2008, pp. 1-8. IEEE, June 2008.
[4]. Jong-Chyi Su. State of the Art Object Detection Algorithms. University of California, San Diego, 9500 Gilman Dr., La Jolla, CA, 2014.
[5]. Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders. Segmentation as Selective Search for Object Recognition, ICCV, 2011.
[6]. Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks, NIPS, 2012.
[7]. Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR, 2014.
[8]. P. Dollar, C. Wojek, B. Schiele. Pedestrian Detection: An Evaluation of the State of the Art. IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 34, Issue 4, April 2012.
[9]. D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. Computer Science Department, University of British Columbia, Vancouver, B.C., Canada, January 5, 2004.
[10]. Kanade-Lucas-Tomasi. KLT Feature Tracker. Computer Vision Lab, Jae Kyu Suhr, Computer Vision (EEE6503), Fall 2009, Yonsei University.
[11]. J. M. Odobez, P. Bouthemy. Robust Multiresolution Estimation of Parametric Motion Models. IRISA/INRIA Rennes, Campus de Beaulieu, February 13, 1995.
[12]. Object Detection Homepage, http://cseweb.ucsd.edu/~jcsu/reports/ObjectDetection.pdf, last accessed 2018/01/14.
[13]. Yan, J., Lei, Z., Wen, L., & Li, S. Z. The fastest deformable part model for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2497-2504, 2014.
[14]. Yolo Homepage, https://pjreddie.com/darknet/yolo/, last accessed 2018/01/19.
[15]. Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders. Segmentation as Selective Search for Object Recognition, ICCV, 2011.
[16]. Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks, NIPS, 2012.
[25]. … 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Piscataway, NJ, pp. 304-311. ISBN 978-1-4244-3991-1, 2009.
[26]. Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li. Robust Multi-Resolution Pedestrian Detection in Traffic Scenes. Computer Vision Foundation, CVPR 2013.
[27]. http://www.bmw.com/com/en/insights/techn
convolutional neural networks. NIPS, 2012. ology/efficientdynamics/phase_2/, last accessed
[17]. Dave Gershgon, Google’s Robot Are 2018/01/14.
Learning How to Pick Things Up. Popular [28]. P. Viola, M. Jones. Rapid Object Detection
Science, March 8, 2016. using a Boosted Cascade of Simple Features.
[18]. Andreas Geiger, Philip Lenz, Christoph Computer Vision and Pattern Recongition
Stiller, Raquel Urtasun. Object Detection Conferencies. 2001.
Evaluation. The Kitti Vision Benchmark Suite. [29]. Mohamed, G., Ali, S. A., & Langlois, N.
Karlsruhe Institute of Technology. IEEE CVPR, (2017). Adaptive Super Twisting control design
2012. for manufactured diesel engine air path. The
[19]. Eddie Forson. Understanding SSD International Journal of Advanced Manufacturing
MultiBox-Real Time Object Detection in Deep Technology, 1-12. 2017.
Learning. Towards Data Science. November, [30]. ROS homepage:
18th. 2017. http://www.generationrobots.com/blog/fr/2016/0
[20]. Liu W., Anguelov D., Erhan D., Szegedy C., 3/ros-robot-operating-system-3/. Last accessed
Reed S., Fu C.-Y., Berg A. C., (2016, déc.). 2017/12/12.
Single-Shot Multibox Detector. [31]. B. Louvat, “Analyse d’image pour
https://arxiv.org/abs/1512.02325. l’asservissement d’une camera embarquée dans
[21]. Levine S., Pastor P., Krizhevsky A., Ibarz J., un drone”. Gipsa-lab, February, 5th, 2008,
Quillen D. (2017, June). Learning Hand-Eye Grenoble. 2008.
Coordination for Robotic Grasping with [32]. B. Hérissé, “Asservissement et navigation
DeepLearning and Large-Scale Data Collection. autonome d’un drone en environnement incertain
The International Journal of Robotics Research. par flot optique. PhD thesis, Université Sophia
[22]. Chabot F., Chaouch M., Rabarisoa J., Antipolis, UNSA et I2S CNRS, November 19th,
Teulière C., Château T.,. (2017, July). Deep 2010.
MANTA: A Coarse-to-fine Many-Task Network [33]. R. Girshick, “Fast R-CNN,” ICCV, 2015
for joint 2D and 3D vehicle analysis from [34]. S. Ren et al., “Faster R-CNN: Towards Real-
monocular image. IEEE Conference on Computer Time Object Detection with Region Proposal
Vision and Pattern Recognition. Networks,” arXiv:1506.01497. 2015.
[23]. Mousavian A., Anguelov D., Flynn J., [35]. H. Bay, T. Tuytelaars, and L. V. Gool.
Kosecka J. (2017, July). 3D Bounding Box “SURF: Speeded Up Robust Features”, Computer
Estimation Using Deep Learning and Geometry. Vision – ECCV 2006, pp. 404-417. 2006.
IEEE Conference on Computer Vision and [36]. D. Gerónimo, A. López, D. Ponsa, A.D.
Pattern Recognition. Sappa. “Haar wavelets and edge orientation
[24]. Georgakis G., Mousavian A., Berg A. C., histograms for on–board pedestrian detection”.
Kosecka J. (2017, July). Synthesizing Training In: Pattern Recognition and Image Analysis, pp.
Data for Object Detection in Indoor Scenes. IEEE 418-425. 2007.
Conference on Computer Vision and Pattern
Recognition
[25]. P. Dollar, C. Wojek, B. Schiele, and P.
Perona. Pedestrian Detection: A Benchmark. In:

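As a closing illustrative note: the conclusion reports a distance-estimation method validated to within 5% error. The sketch below is NOT the authors' method (which is presented in the body of the paper); it is a generic pinhole-camera approximation showing how distance can be recovered from a detection bounding box. The focal length (700 px) and pedestrian height (1.70 m) are assumed example values, not parameters from the paper.

```python
# Generic monocular distance estimation from a detection bounding box,
# using the pinhole camera model: Z = f * H / h, where
#   f = focal length in pixels (camera intrinsic),
#   H = assumed real-world object height in metres,
#   h = bounding-box height in pixels.

def estimate_distance(bbox_height_px: float,
                      focal_length_px: float = 700.0,  # assumed intrinsic
                      real_height_m: float = 1.70) -> float:  # assumed height
    """Return the estimated distance (metres) to the detected object."""
    if bbox_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return focal_length_px * real_height_m / bbox_height_px

# Example: a pedestrian whose bounding box is 238 px tall, seen by a camera
# with a 700 px focal length, is estimated at 5.0 m.
print(round(estimate_distance(238.0), 2))
```

The accuracy of such a scheme depends on a calibrated focal length and on how well the assumed real-world height matches the actual target, which is why per-class height priors (pedestrian, vehicle, etc.) are commonly used.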