Lanes Curves Detection Report
Graduation Project

Supervised by:
Dr. -Eng. Atheer
Dr. QAZDAR Aimad
Dr. EL HASSAN Abdelwahed

Realized by:
ITOULI Oussama
KAMEL Aymane

Jury members:
Dr. -Eng. Atheer
Dr. QAZDAR Aimad
Dr. EL HASSAN Abdelwahed
Dr. ZAHIR Jihad
Dr. QAFFOU Issam
Dr. EL BACHARI Essaid
Dr. EL QASSIMI Sara

Session Year: 2021-2022
Acknowledgements
We would like to thank our dear supervisors, Dr. -Eng. Atheer, Dr. Abdelwahed, and Dr. EL Bachari, and our supervisor and examiner, Dr. Qazdar, for guiding us in the right direction throughout this project and bravely reading through all of our endless drafts. Thanks as well to Dr. Zahir for her inspiration; even in small talk, she would make suggestions that boosted our confidence.
We would also like to thank the individuals behind this program for helping us and other groups recognize that higher education should have no borders.
Contents

1 Introduction
    1.1 Context and Problem
    1.2 Challenges and Objectives
        1.2.1 General challenges
        1.2.2 Objectives
    1.3 Conduct methodology (CRISP Methodology)

4 Datasets
    4.1 Dataset source
    4.2 Dataset preparation
    4.3 Conclusion

6 Implementation and Evaluation
    6.1 Implementation
    6.2 Evaluation
        6.2.1 Training
        6.2.2 Testing
    6.3 Conclusion

7 Deployment
    7.1 Results
    7.2 Discussion
    7.3 Real-time detection
    7.4 Conclusion

8 General Conclusion
    8.1 Synthesis
    8.2 Perspective
Chapter 1
Introduction
1.2.1 General challenges
As newcomers to the AI research field, we faced several challenges that we had to overcome in order to complete this project. The first was the lack of essential information needed to understand the project, which is a recent, complex undertaking that requires a significant amount of knowledge. It also requires extensive work to finish within the two-month deadline, which is insufficient for such a subject. We did, however, have a good time and gained real experience using various Python tools and conducting an investigation.
1.2.2 Objectives
The goal of this project is to provide an application or system that can assist the driver while driving, or even allow the vehicle to be completely self-controlled. To do so, we will leverage a variety of AI technologies and techniques: create a detection system that can recognize road lanes, use this system in a simulation, run a real-time detection test, and build a self-driving RC car that responds based on the system output.
Chapter 2
Figure 2.1: The five levels of Autonomous Driving.
pushes its fingers into sockets, grabs a knife, or attempts to catch a spark because it doesn’t
realize it’s harmful.
However, data alone cannot assist the automobile in performing its function. So, if data
represents "what to learn," algorithms represent "how to learn." Such as computer vision
algorithms, deep learning, and machine learning, to name a few.
Figure 2.2: Venn diagram of machine learning concepts and classes (inspired by Goodfellow et al. 2016 (7), p. 9).
There are some variations in how Machine Learning algorithms are described; however, they may generally be split into categories based on their purpose. As a result, there are four major types of machine learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Let’s take a quick look at each of these categories.
2.2.3.3 Semi-supervised learning
In the previous two types, either no labels exist for any of the observations in the dataset or labels exist for all of them. Semi-supervised learning sits somewhere in the middle. Labeling is expensive in many practical settings because it requires experienced human professionals. As a result, semi-supervised algorithms are the best option for model development when labels are absent from the majority of observations but present in a few. These methods take advantage of the fact that, while the group memberships of the unlabeled data are unknown, this data still contains critical information about the group parameters.
2.2.3.5 Conclusion
Machine-learning-based computer vision algorithms serve as the "eye and brain" of self-driving
cars. The primary goals of computer vision are to provide a seamless self-driving experience
and to guarantee the safety of the passengers. Now, the challenge is to enable vehicles and other
roadside equipment to communicate, even though we have a car that can "see and think".
The network of physical devices embedded with sensors and software, connecting and exchanging data with other devices and systems through the internet, is referred to as the Internet of Things (IoT) (20). These gadgets range from basic household items to high-tech industrial equipment.
Although the concept of IoT has been around for a while, it has only just become a reality
thanks to a variety of recent technological advancements. More manufacturers may now use
IoT technology thanks to reasonably priced and trustworthy sensors. A variety of internet network protocols have made it simple to link sensors to the cloud and other "things" for effective data transfer, and the expansion of cloud platform options gives consumers and organizations access to the infrastructure they need to scale up without having to handle it all themselves. Furthermore, with advancements in machine learning and analytics, as well as access to enormous amounts of data stored on the cloud, businesses may be able
to gain insights more quickly and easily. The development of these interconnected technologies
pushes the limits of IoT, and IoT data also fuels these innovations. Finally, Natural language
processing (NLP) is now available on Internet of Things (IoT) devices, including digital personal
assistants like Alexa, Cortana, and Siri. This has made IoT devices more attractive, practical,
and inexpensive for usage at home.
Self-Driving Cars are the most vivid example of the IoT. We can now enable communication
and information exchange between the automobile and its surroundings owing to IoT and
5G technologies. As a result, the vehicle will be able to interact with its surroundings more
effectively.
Chapter 3
3.1 Introduction
This venture aims to develop a self-driving car, or at the very least to aid drivers by reducing their risk of injury, moving toward driverless vehicles, and hopefully leading to fewer rule violations and street incidents. Furthermore, it would enable blind and elderly people to own and operate an automobile without the need for assistance.
We will use advanced approaches, such as machine learning and deep learning models, to make this endeavor a success.
To begin with, the lanes must be detected in order for the car to travel down its path, which is why machine vision, deep learning, and machine learning models are used. It is critical to have the best and most accurate detection of lane curves and changes. To accomplish this objective, the suggested system is fed images, which are then analyzed using AI algorithms to produce lane recognition; alternatively, feature extraction methods can be used for lane detection.
Numerous lane-detection techniques with advanced performance have been presented recently, according to the literature. However, the majority of these techniques are only effective in detecting the road lanes in the current frame of the driving scene, which results in poor performance when dealing with difficult driving conditions such as deep shadows, severe road mark degradation, significant vehicle occlusion, and so forth. The lane may be predicted in the wrong direction, only partially detected, or even completely missed.
Road lanes are often continuous line formations, either solid or dashed, on the pavement. Due to the continuous nature of driving sequences and the substantial overlap between two adjacent frames, the relationship between the lane positions in adjacent frames is strong. Even though the lane may experience degradation or damage brought on by shadows, stains, and occlusion, the lane in the current frame can be predicted with more accuracy by using several prior frames. This is an abstraction of the method suggested by Q. Zou et al. (21) that we used.
3.2 Technologies
When it comes to handling several computer vision issues, including object identification, image classification, and semantic segmentation, deep learning has proven to be a cutting-edge, human-competitive, and often superior technology.
Deep neural networks often fall into one of two categories. One is the deep convolutional
neural network (DCNN), which is skilled in feature abstraction for pictures and videos and
frequently processes the input signal via many levels of convolution. The other is the deep
recurrent neural network (DRNN), which can predict information for time-series signals by breaking the input signal into successive blocks and establishing full-connection layers between them.
The fundamental goal of ADAS is to increase vehicle security while also protecting other
road users, the driver, pedestrians, and bicycles. In recent years, the system’s demands have
expanded. In order to make real-time choices, warn the driver, or even act in his place, ADAS
must be able to distinguish objects, road signs, the road itself, and any other moving vehicle.
Deep learning techniques such as CNN, RNN and LSTM can be used to do this.
Following the method provided by Q. Zou et al., a hybrid deep neural network is presented for lane recognition that employs multiple continuous driving scene photos. It is a deep neural network that combines the DCNN and the DRNN.
The DCNN takes many frames as input and predicts the current frame’s lane using semantic segmentation (26). A fully convolutional DCNN architecture is used to perform the segmentation. It is divided into two networks, the encoder network and the decoder network, which ensures that the final output and input images are the same size. The features extracted by the DCNN encoder network are then processed by a DRNN.
To manage the time series of encoded features, a long short-term memory (LSTM (24)) network is used. The DRNN output, which is expected to have merged the information from the continuous input frames, is passed into the DCNN decoder network to help predict the lanes.
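To make this data flow concrete, the following is a minimal PyTorch sketch of the encoder-ConvLSTM-decoder pipeline; the sub-module internals are omitted, and the names and tensor layout are our assumptions rather than the exact architecture of Q. Zou et al.:

```python
import torch
import torch.nn as nn

class HybridLaneNet(nn.Module):
    """Sketch of the hybrid data flow: a DCNN encoder applied per frame,
    a ConvLSTM over the resulting feature sequence, and a DCNN decoder
    that produces the segmentation map for the current frame."""

    def __init__(self, encoder: nn.Module, conv_lstm: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = encoder
        self.conv_lstm = conv_lstm
        self.decoder = decoder

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        t = frames.shape[1]
        feats = torch.stack([self.encoder(frames[:, i]) for i in range(t)], dim=1)
        fused = self.conv_lstm(feats)  # merges information across the input frames
        return self.decoder(fused)     # lane prediction for the current frame
```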
To assess performance, two datasets are gathered. Hundreds of samples were collected for
each of the 12 circumstances in one dataset. The other dataset, which includes thousands
of samples, was gathered on country roads. These datasets may be used to evaluate various
lane-detection techniques quantitatively.
lane detection based on image segmentation in the HSV color space, (2) optimal path finding using the edge-detection-based hyperbolic fitting line detection algorithm, and (3) road lane detection based on edge detection, the Scharr mask, and the Hough Transform algorithm. The suggested solutions were developed and tested on embedded devices such as the NVIDIA Jetson Nano and the Raspberry Pi 4B. The accuracy of feature-based lane detection dropped dramatically when the lane was damaged or visibility was limited; these approaches are only relevant to real-world road situations where the lane margins are clean and road conditions are basic.
Model-based strategies have been applied to address the problem of road lane recognition, advancing Self-Driving Systems and ADAS. Y. Zhao et al. (19) proposed a deep reinforcement learning-based model for surface lane detection that consists of two steps: a bounding box detector and a landmark point localizer. For better representation of curving lanes, a bounding-box-level convolutional neural network is used to find the road lane, followed by a reinforcement-based Deep Q-Learning Localizer (DQLL) to properly localize the lanes as a set of landmark points. This model outperforms competitors on the NWPU Lanes and TuSimple (15) Lanes datasets.
To detect automobile activity on the road, Heo et al. (14) developed a mix of lightweight deep learning models on an embedded GPU platform (eGPU). Their method analyzes discrete photos to produce a continuous track of the vehicle’s journey. The assessment findings reveal that the suggested method can accurately extract a vehicle’s horizontal and vertical motions.
Kortli et al. (29) proposed a novel deep learning lane detection method, a deep embedded hybrid CNN-LSTM network. First, pre-processing such as resizing, shuffling, and normalization is carried out. Second, the proposed three-network system is applied to a single RGB channel to predict lane markings: a CNN encoder-decoder network that predicts lane markings on the road, a CNN encoder-decoder network that includes a dropout layer for regularization and uncertainty estimation, and a CNN encoder-LSTM decoder architecture that uses the LSTM network to improve the detection rate by suppressing the influence of false-alarm patches on detection results. Third, feature extraction is used to locate road lane markings. Finally, post-processing such as Canny edge detection, perspective transforms, and polynomial curve fitting is performed. This model was deployed on a high-performance NVIDIA Jetson Xavier NX platform.
Jianwei et al. (31) provided LDTFE (Lane Detection with Two-stage Feature Extraction), a robust model-based method used to detect lanes, whereby each lane has two boundaries. To enhance robustness, the lane boundary is considered a collection of small line segments, and a two-stage feature extraction method is applied. The first stage extracts small but important line segments using a modified Hough Transform based on the concept of continuity. The second stage clusters those small line segments using a density-based clustering algorithm (DBSCAN). To lessen the probability of false positives, the final lanes are identified using the vanishing point after locating candidate clusters via the color contrast between the road and the lane boundaries. Experiments show that this strategy produces outstanding results on two datasets of road photos.
Model-based (deep learning) lane detection methods are appropriate for situations where the lane is damaged or visibility is poor. However, when the road traffic information is extremely complicated or there are interfering obstacles, detection accuracy decreases drastically and false detections are frequent ((17), (27), (30), (23)).
Deep learning-based techniques can considerably increase the accuracy and robustness of lane detection. At the same time, these approaches demand more technology and have more sophisticated architectures, which leads to some limits. To meet Embedded Intelligence (EI) restrictions, it is vital to continue to enhance lane detection systems and to put them on embedded platforms for faster execution.
Most of the previously mentioned methods confine their approaches to detecting road lanes in one current frame of the driving scene, resulting in poor performance in difficult driving circumstances. We will be using a method proposed by Q. Zou et al. (21) to address this specific issue.
Chapter 4
Datasets
The training set contains images of continuous driving scenes divided into image sequences, each composed of twenty frames. There are 19096 sequences available for training, and a sequence’s 13th and 20th frames are labeled. The training dataset is divided into two parts: one built from the TuSimple lane detection dataset, which contains scenes from American highways, and one built from photographs of rural China.
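For illustration, a sketch of how multi-frame inputs could be drawn from such a sequence is shown below; the directory layout and file naming are our assumptions, not the dataset’s documented structure:

```python
from pathlib import Path

def sequence_samples(seq_dir, n_frames=5, stride=1):
    """Pick n_frames consecutive frames ending at a labeled frame,
    given that frames 13 and 20 (1-based) of each 20-frame sequence
    carry ground-truth labels."""
    frames = sorted(Path(seq_dir).glob("*.jpg"))
    samples = []
    for labeled in (12, 19):  # 0-based indices of the labeled frames
        start = labeled - (n_frames - 1) * stride
        samples.append([frames[start + i * stride] for i in range(n_frames)])
    return samples
```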
Figure 4.1: An example of an input image and the labeled ground-truth lanes. (a) The input image. (b) The ground truth.
For the test, the dataset providers also sampled 5 continuous images, determined the lanes in the last frame, and compared the result to that frame’s actual ground truth. They created two entirely independent test sets, Test set #1 and Test set #2. Test set #1 is based on the TuSimple test set and is used for typical testing. Test set #2 contains hard samples collected from various situations, used particularly for evaluating robustness.
4.3 Conclusion
After the aforementioned procedures, we had a valid dataset. It will be employed in training to help create a robust model able to predict lanes.
Chapter 5
5.1 Introduction
During the third week of our internship, we attempted to detect lanes within a video frame using OpenCV: the Canny algorithm to detect edges, the Hough Transform to detect straight lines, Area of Interest selection based on image masking, and clustering to group line segments based on similarity measurements.
Figure 5.1: Lane detection using Area of Interest masking and Canny edge detection: (left) the original frame, (center) the Canny image, (right) the masked Area of Interest (AoI).
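A minimal sketch of this first pipeline follows; the OpenCV calls are standard, but the AoI vertices and the Canny/Hough thresholds are hand-tuned, camera-specific values given here only as an assumption:

```python
import cv2
import numpy as np

def detect_lanes(frame):
    """Simple lane detection: Canny edges, AoI mask, then Hough lines."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)

    # Mask a triangular Area of Interest covering the road ahead;
    # the vertices must be re-tuned whenever the camera view changes.
    h, w = edges.shape
    mask = np.zeros_like(edges)
    roi = np.array([[(0, h), (w // 2, int(h * 0.6)), (w, h)]], dtype=np.int32)
    cv2.fillPoly(mask, roi, 255)
    masked = cv2.bitwise_and(edges, mask)

    # The probabilistic Hough Transform returns straight line segments.
    lines = cv2.HoughLinesP(masked, rho=2, theta=np.pi / 180, threshold=50,
                            minLineLength=40, maxLineGap=100)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
    return frame
```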
When the road lines are obvious and the region of interest matches the road lanes, we achieved some results. However, this revealed the inconsistency of our simple method: because the lanes and the car are not always aligned, the Area of Interest must be changed for different frames.
Because of the disadvantages indicated above, we decided to use a deep-learning-based strategy that employs computer vision and AI approaches to improve the efficiency and quality of the outcomes. That is why we employed the strategy described below.
cells in the network to determine whether the information is important or not. A double-layer LSTM is used: one layer for sequential feature extraction and one for integration. The typical full-connection LSTM requires a lot of time and computation. As a result, the suggested network uses a convolutional LSTM (ConvLSTM). The ConvLSTM replaces matrix multiplication with a convolution operation in each gate of the LSTM; it is commonly utilized in end-to-end training and feature extraction from time-series data.
In this network, the input and output sizes of the proposed ConvLSTM are equal to the feature map size produced by the encoder, which is 8 × 16 for the UNet-ConvLSTM. The convolutional kernel has a size of 3 × 3. The ConvLSTM has two hidden layers, each with a dimension of 512.
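A minimal sketch of one such ConvLSTM cell is given below, using the sizes stated above (3 × 3 kernels, 512 hidden channels, an 8 × 16 feature map); fusing the four gates into a single convolution is a common implementation choice of ours, not necessarily the authors’ exact code:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """ConvLSTM cell: every LSTM gate uses a convolution instead of a
    matrix multiplication, so spatial structure is preserved."""

    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        pad = kernel // 2  # 'same' padding keeps the spatial size
        # One convolution computes all four gates (i, f, o, g) at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel, padding=pad)

    def forward(self, x, state):
        h, c = state  # hidden state and cell state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

# Hypothetical usage on the 8 x 16 encoder features mentioned above:
cell = ConvLSTMCell(in_ch=512, hid_ch=512)
h = c = torch.zeros(1, 512, 8, 16)
for feat in torch.randn(5, 1, 512, 8, 16):  # five sequential frames
    h, c = cell(feat, (h, c))
```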
in a straightforward manner in U-Net between the respective sub-blocks of the encoder and decoder.
As a convolution procedure, we use the general Convolution-BatchNorm-ReLU unit for the encoder and decoder CNNs. Every convolution uses ’same’ padding.
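As a sketch, such a unit can be written as follows (the channel counts are whatever the surrounding encoder/decoder passes in):

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, kernel=3):
    """The Convolution-BatchNorm-ReLU unit used throughout the encoder
    and decoder, with 'same' padding as stated above."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel, padding=kernel // 2),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```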
5.5 Conclusion
Throughout this chapter, we were introduced to a variety of concepts. Having explained and understood the technologies employed in this strategy, we will now attempt to implement them on a machine, train and test the model, and measure the results according to the metrics.
Chapter 6
Implementation and Evaluation
6.1 Implementation
Experiments are carried out in this chapter to validate the accuracy and robustness of the suggested method. The suggested networks’ performances are evaluated in various scenarios and compared to various lane-detection algorithms. The impact of parameters is also investigated.
The training was initially conducted on a powerful machine with an i7 CPU and an NVIDIA RTX 2060 GPU. However, this configuration was insufficient to perform such complex calculations. Due to the Google Colaboratory environment’s reliability compared to our hardware, we switched to the pro version that Google offers, with a faster Tesla P100 GPU. Training ran for 9 epochs with a batch size of 10. It took 1200 minutes to complete, with a 97.95 percent accuracy rate and a 22 percent loss. Due to time constraints, we could not complete more than 9 epochs, but the initial results showed promising prospects.
6.2 Evaluation
The most straightforward evaluation criterion is accuracy, which gauges overall classification performance using the pixels that have been correctly classified (Eq. 6.1):

\[
\mathrm{accuracy} = \frac{TruePositive + TrueNegative}{TotalNumberOfPixels} \tag{6.1}
\]
For a fairer and more realistic comparison, precision (Eq. 6.2) and recall (Eq. 6.3) are used as two additional metrics, defined as

\[
\mathrm{Precision} = \frac{TruePositive}{TruePositive + FalsePositive} \tag{6.2}
\]

\[
\mathrm{Recall} = \frac{TruePositive}{TruePositive + FalseNegative} \tag{6.3}
\]
We set lane as the positive class and background as the negative class for the lane detection task. According to Eqs. 6.2 and 6.3, the number of lane pixels correctly predicted as lanes is the true positive count, the number of background pixels incorrectly predicted as lanes is the false positive count, and the number of lane pixels incorrectly predicted as background is the false negative count.
We deduce from a visual inspection that the prediction of thinner lanes, the reduction of fuzzy conglutination zones, and a decrease in misclassification are the primary factors contributing to the improvement in precision. For ADAS, incorrect lane width, the presence of fuzzy regions, and misclassification are particularly risky. Our approach produces thinner lanes than the results of other methods, which reduces the likelihood that background pixels close to the ground truth would be classified as lanes and results in a low false positive rate. Because background pixels are no longer classified as the lane class, the fuzzy area surrounding vanishing points and vehicle-occluded zones also yields few false positives.
Moreover, the aforementioned causes also contribute to the decline in recall. Although thinner lanes represent lane positions more accurately, they can occasionally maintain a pixel-level separation from the ground truth, and it is more difficult for two thinner lines to overlap. These deflected pixels result in higher false negatives. Small pixel-level deviations, however, are impalpable to human sight and have no negative effect on ADAS. Conglutination reduction has a similar impact: conglutination emerges if the model classifies every pixel in a region as belonging to the lane, which results in a high recall. In this case, the background class suffers substantial misclassification and the precision is quite low, despite the high recall.
In other words, despite the slightly reduced recall, the model better matches the task. Thinner lines, which deviate slightly from the ground truth, are the cause of the decreased recall. Because precision and recall each represent only a portion of lane detection performance, we also provide the F1-measure (Eq. 6.4) as a comprehensive metric for the evaluation:

\[
F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{6.4}
\]
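For reference, a small sketch computing Eqs. 6.1 to 6.4 from binary lane masks (representing predictions and ground truth as boolean NumPy arrays is our assumption):

```python
import numpy as np

def lane_metrics(pred, gt):
    """Pixel-wise metrics for binary lane masks (lane = positive class)."""
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    accuracy = (tp + tn) / pred.size                    # Eq. 6.1
    precision = tp / (tp + fp)                          # Eq. 6.2
    recall = tp / (tp + fn)                             # Eq. 6.3
    f1 = 2 * precision * recall / (precision + recall)  # Eq. 6.4
    return accuracy, precision, recall, f1
```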
Running time: The suggested model adds an LSTM block in addition to taking a series of photos as input, which could result in a longer run time.
Robustness: Despite the strong performance on the previous test datasets, the proposed lane detection model must also be tested for robustness, since even the slightest false positives increase the chances of a traffic accident. A good lane detection model needs to handle common driving situations such as urban areas and highways, as well as difficult driving scenarios such as country roads, poor lighting, and vehicle blockages. Robustness testing therefore uses a completely new dataset that contains many real-world driving scenes. The 728 images of Test set #2, presented in the Datasets chapter, contain lanes for rural, urban, and highway scenes. This dataset was recorded by a data recorder in different weather conditions, inside and outside the windshield, and at different heights. It is a thorough and difficult test dataset that contains some lanes that are hard even for the human eye to see.
6.2.1 Training
The training lasted approximately 1200 minutes and was conducted in the Google Colaboratory environment. The average loss in the ninth epoch was approximately 22.75 percent, which was not entirely satisfactory. Accuracy was approximately 97.95 percent, which is impressive given only nine epochs. The LSTM helps the model perform better because of its ability to learn long-term dependencies by remembering information across extended delays, forgetting superfluous information, and carefully exposing information at each time step.
Figure 6.1: Loss and accuracy over the number of epochs.
Even though the loss is rather high, the precision is just about adequate. Training the model for a larger number of epochs would reduce the loss while increasing accuracy and precision.
6.2.2 Testing
We evaluated the model on the 0531 testing set, and the output photos are nearly identical to the ground truth. The average loss was calculated to be 0.22, the accuracy is around 97.95 percent, the precision is approximately 0.89, the recall is approximately 0.98, and the F1-score is approximately 0.93. These results are quite promising compared to what we expected.
The results are clearly robust; the only problem is that the output lanes are slightly thicker than the ground truth. We should also mention that these results are predictions: the inputs are 5 frames instead of one.
6.3 Conclusion
For robust lane detection in driving scenes, a novel hybrid neural network combining CNN
and RNN was proposed. The proposed network architecture is based on an encoder-decoder
framework that takes as input multiple continuous frames and predicts the lane of the current
frame using semantic segmentation. In this framework, a CNN encoder first abstracts features from each frame of input. A ConvLSTM then processes the sequentially encoded features of all input frames, and its outputs are fed into the CNN decoder for information reconstruction and lane prediction. For performance evaluation, two datasets containing continuous driving images were created.
When compared to baseline architectures that use a single image as input, the proposed
architecture produced significantly better results, proving the effectiveness of using multiple
continuous frames as input. In the meantime, the experimental results showed that ConvLSTM
outperformed FcLSTM in sequential feature learning and target-information prediction in the
context of lane detection.
When compared to other models, the proposed models performed better, with higher pre-
cision, recall, and accuracy values. Furthermore, the proposed models were tested on a dataset
with extremely difficult driving scenes to ensure their robustness. The results demonstrated
that the proposed models can detect lanes in a variety of situations while avoiding false recog-
nitions. In the parameter analysis, longer input sequences were found to improve performance, further supporting the strategy that multiple frames are more helpful than a single image for lane detection.
Looking at the outcomes, we find that the model is satisfying, which encourages us to apply it in practical settings.
Chapter 7
Deployment
7.1 Results
Before applying our results to real-time detection, we should first test the detection on a single photo and a video recording.
We used the same videos and photos as for the Hough transform experiments. The proposed method was applied to the "Marrakech-frame" photo and the videos "Marrakech-3," "Marrakech-2," and "Agadir." The results are as follows:
For the "Marrakech-frame," the results on a single frame are good but not perfect, since we cannot fully predict lanes from a single frame. This is the obtained output:
Despite some erroneous detections and missing lanes, the detection on "Marrakech-2" shows exceptional results; the approach is quite robust for only 9 epochs. Since the method demands a better hardware setup, video detection is pretty slow. The results are shown in the screenshots below.
Figure 7.2: Lane detection on the Marrakech-2 video; the frames are from the same video and are arranged in time order from left to right, top to bottom.
Now let’s look at detection on a better road with clear road lanes. The following images are from the Agadir videos. The detection appears to be far more robust than in the previous videos.
Figure 7.3: Lane detection on the Agadir video; the frames are from the same video and are arranged in time order from left to right, top to bottom.
On Google Colaboratory, the detection for a video is slightly slow; on the local machine, it is extremely slow. This illustrates the hardware requirements of the proposed network.
7.2 Discussion
To further validate the proposed method’s excellent performance, we compare it to additional methods reported in the TuSimple lane-detection competition. It should be noted that our training set is based on the TuSimple dataset.
The following table shows a comparison made by the original team of researchers:
Figure 7.4: TuSimple lane marking challenge leader board on test set as of March 14, 2018
Note that the UNet-ConvLSTM in the table was trained for 100 epochs. The proposed method outperforms most of the methods stated in the table above.
The performance of the proposed methods is influenced primarily by two parameters. The first is the number of frames used as the networks’ input, and the second is the sampling stride. Together, these two parameters determine the total range between the first and last frames. Given more frames as input, the models can generate prediction maps with more additional information, which may aid the final results.
The challenge is to deal with the speed of detection when processing 3 to 5 frames. Using a single frame requires a less demanding hardware configuration, but it provides no temporal prediction. The model must therefore be adapted to increase efficiency and decrease time consumption.
Figure 7.5: The Graphical User Interface (GUI). (left) The box is checked to activate Streamlit; (right) Streamlit active.
7.3 Real-time detection
Using a phone camera linked to a PC via USB through the Iriun Webcam application, we were able to capture frames in real time and then use them to detect lanes.
We first pass the live frames from the camera to the real-time detection Python file using OpenCV. The image format is converted to a NumPy array and then to a tensor.
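A minimal sketch of this capture-and-convert step is shown below; the camera index and the network input size are assumptions (Iriun typically exposes the phone as an ordinary webcam device):

```python
import cv2
import torch

cap = cv2.VideoCapture(0)  # assumed device index for the Iriun virtual webcam
ret, frame = cap.read()    # `frame` is already a NumPy array in BGR order
if ret:
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (256, 128))  # assumed network input resolution
    tensor = torch.from_numpy(resized).permute(2, 0, 1).float() / 255.0
    tensor = tensor.unsqueeze(0)           # add the batch dimension
cap.release()
```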
Figure 7.6: The original camera frame, captured from the phone using the Iriun Webcam.
We launch the real-time detection from the GUI. The following are the image outputs from both Marrakech and Agadir.
Figure 7.8: Two screenshots of live detection on the Agadir video.
7.4 Conclusion
When compared to other models, the proposed models performed better, with higher precision, recall, and accuracy values. The plan is to improve the lane detection system in the future by incorporating lane fitting into the proposed framework. As a result, the detected lanes will be smoother and more reliable, even when strong interferences exist in a dim environment.
This will improve real-time detection as well, but the challenge is to develop a model capable
of detecting objects in real time that can be embedded on low-cost hardware.
Chapter 8
General Conclusion
8.1 Synthesis
After discovering and comprehending the subject’s factors and challenges, we sought to solve and improve upon them. We first found suitable data for both the hardware and the software, then grouped together the activities related to constructing the precise set of data to be analysed, which had to be prepared to be compatible for use. In the next step, we used an algorithm to generate knowledge, then verified the obtained model to test its robustness and accuracy. Last but not least, we put the obtained knowledge to use (testing/real-time detection).
By learning about the levels of vehicle automation, we became able to distinguish between DAS, ADAS, and ADS. In addition, we glimpsed the tip of the iceberg of the technologies deployed on the subject, such as Big Data, which represents the "what to learn?"; computer vision, which represents the "how to see?"; machine learning/deep learning, which represents the "how to learn?"; and the Internet of Things, which represents the "how to communicate?".
We were also introduced to different categories of machine learning/deep learning algo-
rithms, which include supervised, unsupervised, semi-supervised, and reinforcement learning.
Moreover, we were introduced to two supervised deep neural networks: the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN). We also learned about self-driving cars, their use cases, and the algorithms implemented within this topic.
After reading some state-of-the-art methods such as LDTFE and the deep hybrid CNN-LSTM network, we learned about the two categories of lane detection techniques: feature-based and model-based.
We then selected the most convenient dataset (TvtDataset) and the matching model. We started to understand the model and tried to run it on our machines. Due to the lack of suitable hardware, we encountered many difficulties during this stage; however, we surely enjoyed the process. The proposed method is a hybrid deep neural network that combines the DCNN and the DRNN to predict the road lane from multiple frames. The method showed some outstanding results: despite being slow, it outperforms most of the methods on the proposed dataset. Finally, we deployed the model on a local machine and created a GUI to ease the usage of the model.
Finally, we are quite satisfied with those results, and we are hoping to go even further: maybe create a model from scratch, invest in making models more suitable for a machine, etc.
8.2 Perspective
The goal for future work is to deploy our models on hardware. The idea is to create a small dataset with three or four classes (for example, (1) stop sign, (2) crosswalk road mark, (3) red traffic light, (4) green traffic light), train a tiny YOLO model on that dataset, and then deploy the trained model on a Raspberry Pi 3 (Model B, for example), with the Arduino and the Raspberry Pi connected via LAN.
When the RC car is moving (on a racetrack made of small traffic signs, lights, and curves), the Pi camera captures road frames, the Raspberry Pi processes these frames, our tiny YOLO model detects vertical/horizontal road signs, and the lane curves are identified using pattern recognition (for example, using LDTFE or another method that requires fewer calculations). The Raspberry Pi then sends commands to the Arduino to execute, causing the RC car to move forward, left, or right, or even stop. (However, if we want it to go backwards, another camera will be required to capture frames of the road behind the RC car.)
We wanted to use the CARLA simulator for the simulation part, but due to a lack of computational power, we will use a simulator with lower hardware requirements, such as Udacity’s. The idea is to install a system capable of detecting and recognising speed traffic signs in the Udacity simulator, with the architecture as follows:
Making a model more "embedded" is a difficult goal to achieve, and creating a simulation was difficult due to time constraints. Nonetheless, these are intriguing endeavors that we hope to complete as soon as humanly possible. We will keep working to achieve those objectives and enhance our knowledge in this area. As a result, we will strive to stay current with developments in the field and to conduct cutting-edge research using sophisticated technology to push the boundaries of knowledge, while becoming more creative and passionate.
Bibliography
[2] C. Janiesch, P. Zschech, K. Heinrich. Machine learning and deep learning. Springer, 2021.
[4] IBM Corporation. Introduction to CRISP-DM. IBM SPSS Modeler CRISP-DM Documentation, 2018.
[6] F. Zheng, S. Luo, K. Song, C. W. Yan, M. C. Wang. Improved lane line detection algorithm based on Hough transform. Springer, 2018.
[9] Oracle Cloud Infrastructure. What are the three Vs of big data? 2022.
[11] Hyunjoo Jin. Like Tesla, Toyota develops self-driving tech with low-cost cameras. Reuters, 2022.
[12] J. Suder, K. Podbucki, T. Marciniak, A. Dąbrowski. Low complexity lane detection methods for light photometry system. MDPI, 2021.
[13] D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. CoRR, 2014.
[15] Zhiyuan Zhao, Qi Wang, Xuelong Li. Deep reinforcement learning based lane detection and localization. ScienceDirect, 2020.
[17] N. Khairdoost, S. Beauchemin, M. Bauer. Road lane detection and classification in urban and suburban areas based on CNNs. SCITEPRESS Digital Library, 2021.
[18] O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015.
[20] Oracle. What is IoT? Oracle, 2022.
[21] Q. Zou, H. Jiang, Q. Dai, Y. Yue, L. Chen, Q. Wang. Robust lane detection from continuous driving scenes using deep neural networks. IEEE Transactions on Vehicular Technology, 2019.
[22] Q. Huang, J. Liu. Practical limitations of lane detection algorithm based on Hough transform in challenging scenarios. SAGE Journals, 2021.
[23] Qiao Huang, Jinlong Liu. Practical limitations of lane detection algorithm based on Hough transform in challenging scenarios. SAGE Journals, 2021.
[24] Ralf C. Staudemeyer, Eric Rothstein Morris. Understanding LSTM: A tutorial into long short-term memory recurrent neural networks. arXiv, 2019.
[25] Sebastian Ruder. An overview of gradient descent optimization algorithms. arXiv, 2017.
[27] S. Ghanem, P. Kanungo, G. Panda, S. C. Satapathy, R. Sharma. Lane detection under artificial colored light in tunnels and on highways: An IoT-based framework for smart city infrastructure. Springer, 2021.
[28] K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, 2014.
[29] Taewook Heo, Woojin Nam, Jeongyeup Paek, JeongGil Ko. Autonomous reckless driving detection using deep learning on embedded GPUs. IEEE, 2020.
[30] Tiago Almeida, Bernardo Lourenço, Vitor Santos. Road detection based on simultaneous deep learning approaches. ScienceDirect, 2020.
[31] Yassin Kortli, Souhir Gabsi, Lew F. C. Lew Yan Voon, Maher Jridi, Mehrez Merzougui, Mohamed Atri. Deep embedded hybrid CNN–LSTM network for lane detection on NVIDIA Jetson Xavier NX. ScienceDirect, 2022.