Lane and Curve Detection Report

This document is a graduation project report on lane and curve detection using digital image processing and artificial intelligence. It discusses autonomous vehicles and the technologies that enable them, such as computer vision, machine learning and deep learning. It then presents the proposed method of using an LSTM neural network and an encoder-decoder architecture for robust lane detection from continuous driving scenes. The method is implemented and evaluated on datasets with promising results, demonstrating potential for real-time detection deployment in autonomous vehicles. In conclusion, the project achieved its objectives of lane and curve detection through emerging technologies.


Duale Hochschule Baden-Württemberg (DHBW)
Baden-Württemberg Cooperative State University

Cadi Ayyad University
Faculty of Science Semlalia
Computer Science Department
Marrakech

Graduation Project

Lane and Curve Detection Using Digital Image Processing and Artificial Intelligence (AI)

Supervised by:
Dr.-Eng. Atheer
Dr. QAZDAR Aimad
Dr. EL HASSAN Abdelwahed

Realized by:
ITOULI Oussama
KAMEL Aymane

Jury members:
Dr.-Eng. Atheer
Dr. QAZDAR Aimad
Dr. EL HASSAN Abdelwahed
Dr. ZAHIR Jihad
Dr. QAFFOU Issam
Dr. EL BACHARI Essaid
Dr. EL QASSIMI Sara

Session year: 2021-2022
Acknowledgements

"First and foremost, we thank ALLAH for providing us with the


patience and strength to achieve this level. Countless people helped
us with this project."

Our sincere thanks go to our parents and family for their unwavering support throughout this journey.
We would like to thank Madam Loubna OUAOUZAR for assisting
us in writing this report and providing any grammatical corrections
that were required.

We’d also like to thank our dear supervisors, Dr. -Eng. Atheer, Dr.
Abdelwahed, Dr. EL Bachari, and our supervisor and examiner, Dr.
Qazdar, for guiding us in the right direction throughout this project
and bravely reading through all of our endless drafts. And thanks to
Dr. Zahir for her inspiration. Even in small talk, she would make
suggestions to boost our confidence.

We would also like to thank all of the researchers and students in Cadi Ayyad University's Computer Science Department for creating pleasant and exciting working conditions over these months.

We would also like to thank the individuals behind this program for helping us and other groups recognize that higher education should have no borders.
Contents

1 Introduction
  1.1 Context and Problem
  1.2 Challenges and Objectives
    1.2.1 General challenges
    1.2.2 Objectives
  1.3 Conduct methodology (CRISP-DM)

2 Autonomous Vehicles Using Emerging Technologies
  2.1 Autonomous vehicles
  2.2 Implemented and deployed technologies
    2.2.1 Big Data
    2.2.2 Computer Vision
    2.2.3 Machine learning and Deep learning
      2.2.3.1 Supervised learning
      2.2.3.2 Unsupervised learning
      2.2.3.3 Semi-supervised learning
      2.2.3.4 Reinforcement learning
      2.2.3.5 Conclusion
    2.2.4 Internet of Things (IoT)

3 Lane and Curve Detection Using Digital Image Processing and Artificial Intelligence
  3.1 Introduction
  3.2 Technologies
  3.3 Related works and state-of-the-art methods

4 Datasets
  4.1 Dataset source
  4.2 Dataset preparation
  4.3 Conclusion

5 Proposed Method (Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks)
  5.1 Introduction
  5.2 System Overview
  5.3 Network Design
    5.3.1 LSTM Network
    5.3.2 Encoder-decoder network
  5.4 Training strategy
  5.5 Conclusion

6 Implementation and Evaluation
  6.1 Implementation
  6.2 Evaluation
    6.2.1 Training
    6.2.2 Testing
  6.3 Conclusion

7 Deployment
  7.1 Results
  7.2 Discussion
  7.3 Real-time detection
  7.4 Conclusion

8 General Conclusion
  8.1 Synthesis
  8.2 Perspective
Chapter 1

Introduction

1.1 Context and Problem


In order to complete our graduation project and earn a computer science bachelor's degree, a two-month internship was carried out in collaboration between Cadi Ayyad University (UCA) and Baden-Württemberg Cooperative State University (DHBW), overseen by academic personnel from both institutions.
Founded in 1994 and specializing in engineering, computer science and natural sciences, the Baden-Württemberg Cooperative State University (Duale Hochschule Baden-Württemberg, DHBW) is a state-run university and Germany's first higher education institution to combine on-the-job training and academic studies, achieving a close integration of theory and practice, both components of cooperative education. DHBW is one of the largest higher education institutions in the German federal state of Baden-Württemberg.
The Faculty of Sciences Semlalia (FSSM) in Marrakesh is one of Cadi Ayyad University's key institutes. It was established in 1978 and has since seen tremendous dynamism in terms of training, scientific research, and partnership. It is considered one of Morocco's major faculties, in addition to being the pioneer in establishing a computer science department. As a result, it has earned itself a place in the national higher education scene. It educates the overwhelming majority of postgraduate students in computer science and provides academic degrees (MA and PhD) as well as higher vocational training certifications.
In this study, we discuss the concept of a self-driving automobile, specifically lane and curve detection as a critical component of an autonomous vehicle, and its relationship to A.I. algorithms, machine vision, machine learning, and deep learning.

1.2 Challenges and Objectives


Artificial intelligence (AI) has infiltrated our daily lives in ways we may not even be aware of. It has grown so widespread that many people are unaware of its significance or of our reliance on it. From dawn to night, as we go about our daily lives, A.I. drives a large portion of what we do. As we wake up, many of us reach for our phones or laptops to begin our day; this has become routine and a crucial part of our decision-making, organization, and information-seeking processes.
When we turn on our devices, we are immediately plugged into A.I. functions such as Face ID and image recognition, and digital voice assistants such as Apple's Siri and Amazon's Alexa. Beyond that, some people can no longer drive their cars without full assistance, and in the near future we may never "drive" at all; instead, cars will do this task for us, which is now more feasible than ever thanks to the massive amount of data available.

1.2.1 General challenges
As newcomers to the AI research field, we faced several challenges that we had to overcome in order to complete this project. The first was the lack of essential background information needed to understand the project, since this is a recent, complex project that requires a significant amount of knowledge. It also requires extensive work to finish within the two-month deadline, which is short for such a subject. We did, however, have a good time and gained real experience using various Python tools and conducting an investigation.

1.2.2 Objectives
The goal of this project is to provide an application or system that can assist the driver while driving, or allow the vehicle to be completely self-controlled. To do so, we will leverage a variety of A.I. technologies and techniques: create a detection system that can recognize road lanes, use this system in a simulation, run a real-time detection test, and build a self-driving RC car that responds based on the system output.

1.3 Conduct methodology (CRISP-DM)


To address the stated issue, we first suggest the "CRISP-DM" (4) approach of designing and running analytical processes in order to identify the various phases and functions. This approach is among the most successful for data science initiatives and is often used for data mining tasks. The life cycle model is divided into six stages, with arrows showing the most essential and common interdependencies. The stages are not in any strict order; in reality, most projects alternate between stages as needed.

Figure 1.1: CRISP methodology

This strategy follows the six well-defined phases outlined below to attain a goal. The first phase, "Business understanding," entails comprehending the business factors and challenges that data science seeks to solve or enhance. The second phase, "Data understanding," involves accurately determining the data to be studied. The third phase, "Data preparation," groups together the activities related to the construction of the precise set of data to be analyzed, which must be prepared to be compatible for use in the fourth phase, "Modelling," where an algorithm is used to generate knowledge. The fifth phase, "Evaluation," verifies the model or knowledge obtained to test the robustness and accuracy of the models. Finally, the knowledge found is put into practice in the sixth phase, "Deployment."

Chapter 2

Autonomous Vehicles Using Emerging Technologies

2.1 Autonomous vehicles


An autonomous car, also known as a driverless car, is one that can run and execute critical activities without human involvement, depending on deep learning and other subsets of artificial intelligence to perceive its surroundings. Nvidia, a technology company, uses artificial intelligence to give cars "the ability to detect, analyze, and learn, allowing them to navigate a virtually infinite spectrum of hypothetical driving situations." Nissan's SAM (3), Toyota (11), Audi and Tesla (16) already use A.I.-powered technologies, which are expected to revolutionize the driving experience and enable cars to run automatically. The most frequently mentioned advantages of autonomous cars include improved traffic safety, greater fuel efficiency, reduced congestion, and enabling blind and elderly persons to own and operate an automobile without assistance. There will, however, be inescapable hurdles along the route to accomplishing such a goal.
There are six levels of automation, and as the levels increase, so does the driverless car's autonomy in terms of operation control.
At level 0, the car has no control over its operation and is totally operated by a person. At level 1, the ADAS (advanced driver assistance system) in the automobile may assist the driver with steering, acceleration, and braking. At level 2, the ADAS can supervise steering, acceleration, and braking in certain conditions, although the human driver must maintain complete focus on the driving environment while performing the other activities.
At level 3, in some circumstances, the ADS (automated driving system) can perform all aspects of the driving task; however, the human driver must be able to regain control when the ADS requests it, and performs the driving tasks in the remaining circumstances.
At level 4, the ADS in the automobile is capable of managing every aspect of driving on its own when no human involvement is necessary.
Level 5 refers to complete automation, where the vehicle's ADS can handle all tasks in all circumstances without the need for human driver assistance. Complete automation will be made possible by IoT and 5G technology, which will let automobiles communicate not only with one another but also with traffic lights, signs, and even the road itself.

Figure 2.1: The five levels of Autonomous Driving.

2.2 Implemented and deployed technologies


2.2.1 Big Data
According to Oracle (10), big data is defined as data with increased variety that arrives in higher volume and with greater velocity. This is also known as the three Vs of big data (9).
Simply put, big data means larger, more complex data sets, particularly from new data sources. Traditional data processing tools simply cannot handle these massive data volumes. However, these huge amounts of data can be leveraged to answer business challenges that were previously unsolvable.
Big data has several use cases, including product development. Big data is used by companies like Netflix and Procter & Gamble to forecast consumer demand for new goods. P&G plans, produces, and introduces new goods in the US using data and analytics from focus groups, social media, test markets, and early retail rollouts.
Another use case of big data is helping us innovate by examining the interdependencies between people, institutions, and other organizations and then coming up with new applications for those findings: using data insights to make better choices on financial and planning factors, and examining trends and client preferences in order to supply innovative products and services.
Currently, machine learning is a trendy topic, and one of the reasons is data, particularly big data. Instead of programming machines, we can now train them; this is made feasible by the availability of massive data to train machine learning models. This brings us to the concept of autonomous vehicles.
Big data has lately grown in popularity in self-driving cars; it is what makes the vehicle's sensors useful. An autonomous vehicle will be worthless on the road if it does not have access to a consistent and reliable stream of self-driving car data, as it will not know what to do with the data it gets. A connected automobile without data is like a newborn that pushes its fingers into sockets, grabs a knife, or attempts to catch a spark because it doesn't realize it's harmful.
However, data alone cannot help the automobile perform its function. If data represents "what to learn," algorithms represent "how to learn": computer vision algorithms, deep learning, and machine learning, to name a few.

2.2.2 Computer Vision


To put it simply, machine vision (5) technology allows industrial equipment to "see" what it is doing and make quick decisions based on what it sees. Machine vision is most commonly used for visual inspection and flaw detection, locating and measuring parts, and identifying, sorting, and tracking items.
Intelligent manufacturing, logistics, and operations are built on the foundation of industrial
machine vision. Every stage of the production process may benefit from information, analysis,
and efficiency provided by machine vision cameras, integrated IoT sensors, and industrial PCs.
Machine vision is a foundational technology in industrial automation. For decades, it has
aided in the improvement of product quality, the acceleration of production, and the optimiza-
tion of manufacturing and logistics. This tried-and-true technology is now combining with A.I.
to drive the shift to Industry 4.0 (8).

2.2.3 Machine learning and Deep learning


Machine learning (2) is frequently used by intelligent systems that provide artificial intelligence capabilities nowadays. Machine learning is the ability of systems to learn from training data relevant to a given problem in order to automate the construction of analytical models and complete related tasks.
"Deep learning" (2) is a machine learning concept based on artificial neural networks. For many applications, deep learning models outperform shallow machine learning models and conventional data analysis techniques.
To provide a fundamental comprehension, we must distinguish between the pertinent terms and concepts. To accomplish this, we propose a basic A.I. foundation: the hierarchical link between what is considered deep learning (deep neural networks) and what is considered shallow machine learning (classic machine learning algorithms and artificial neural networks). The following Venn diagram summarizes this.

Figure 2.2: Venn diagram of machine learning concepts and classes (inspired by Goodfellow et al. 2016 (7), p. 9)

There are some variations in how machine learning algorithms are described; however, they may generally be split into categories based on their purpose. There are four major types of machine learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Let's take a quick look at each of these categories.

2.2.3.1 Supervised learning


The machine is taught by example in supervised learning. This type of machine learning requires a training dataset that includes input instances as well as labeled answers or target values, and the algorithm must figure out how to map the inputs to the desired outputs. Here, an expert acts as the "teacher," feeding the machine a training data set containing the inputs/predictors and displaying the correct response (output).
Classification, regression and forecasting are subcategories of supervised learning. The most prevalent supervised algorithms include decision trees, linear regression, and support vector machines.

2.2.3.2 Unsupervised learning


Unsupervised learning occurs when the learning system is expected to detect patterns in the absence of any pre-existing labels or specifications. There is no teacher in this situation; instead, the computer learns patterns from the data. These algorithms are especially effective when the human expert is unsure what to search for in the data.
Unsupervised learning methods are utilized for pattern detection and descriptive modeling. There are no output labels on which the algorithm can attempt to model relationships; the machine learning algorithm is left to evaluate massive data sets on its own. The program attempts to organize the data in a way that describes its structure, which could mean grouping the data into clusters or arranging it in a more organized manner.
Clustering and dimensionality reduction are subcategories of unsupervised learning. The most common algorithms include K-means clustering and association rules.
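As a matching illustration of unsupervised learning, the sketch below clusters unlabeled points with K-means; the two-blob synthetic data is our own toy example:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unsupervised learning: no labels are given; the algorithm groups the
# data into clusters that describe its structure.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)  # one center per discovered group
```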

Figure 2.3: Supervised and unsupervised learning

2.2.3.3 Semi-supervised learning
In the previous two types, either labels exist for all of the observations in the dataset or no labels exist at all. Semi-supervised learning sits somewhere in the middle. Labeling is expensive in many practical settings because it requires experienced human professionals. As a result, semi-supervised algorithms are the best option for model development when labels are absent in the majority of observations but present in a few. These methods take advantage of the fact that, while the group memberships of the unlabeled data are unknown, this data still contains critical information about the group parameters.
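A minimal sketch of this idea, using scikit-learn's LabelPropagation on data where we hide most of the labels (the dataset choice is ours, for illustration):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

# Semi-supervised learning: most labels are hidden (marked -1,
# scikit-learn's convention for "unlabeled"), and the model propagates
# the few known labels through the structure of the data.
X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.9] = -1  # keep only ~10% of the labels
model = LabelPropagation().fit(X, y_partial)
print("accuracy on the true labels:", model.score(X, y))
```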

2.2.3.4 Reinforcement learning


In a reinforcement learning system, instead of providing input and output pairs, we describe the current state of the system, specify a goal, provide a list of allowable actions and the environmental constraints on their outcomes, and let the ML model experience the process of achieving the goal by itself, using the trial-and-error principle to maximize a reward.

Figure 2.4: Reinforcement learning abstraction

Reinforcement learning is a subset of machine learning and, as such, a subset of artificial intelligence. It enables machines and software agents to automatically find the optimal behavior within a given situation in order to maximize their performance. For the agent to learn its behavior, simple reward feedback is required; this is known as the reinforcement signal. Common reinforcement learning techniques include Q-learning, temporal-difference learning, and approaches based on artificial neural networks.
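As an illustration of the Q-learning technique named above, the toy sketch below learns a table of state-action values for a made-up five-state chain; the environment and hyperparameters are our own example, not part of the project:

```python
import numpy as np

n_states, n_actions = 5, 2           # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))  # table of state-action values
alpha, gamma, eps = 0.1, 0.9, 0.2    # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:         # the goal is the rightmost state
        # epsilon-greedy: explore randomly sometimes, otherwise exploit Q
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0  # reward only at the goal
        # trial-and-error update: move Q(s, a) toward reward + best future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
print(Q)
```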

2.2.3.5 Conclusion
Machine-learning-based computer vision algorithms serve as the "eye and brain" of self-driving
cars. The primary goals of computer vision are to provide a seamless self-driving experience
and to guarantee the safety of the passengers. Now, the challenge is to enable vehicles and other
roadside equipment to communicate, even though we have a car that can "see and think".

2.2.4 Internet of Things (IoT)


The network of physical objects, or "things," that are embedded with sensors, software, and other technologies for the purpose of communicating and exchanging data with other devices and systems through the internet is referred to as the Internet of Things (IoT) (20). These gadgets range from basic household items to high-tech industrial equipment.
Although the concept of IoT has been around for a while, it has only recently become a reality thanks to a variety of technological advancements. More manufacturers can now use IoT technology thanks to reasonably priced and reliable sensors. A variety of internet network protocols have made it simple to link sensors to the cloud and to other "things" for effective data transfer. Additionally, the expansion of cloud platform options gives consumers and organizations access to the infrastructure they need to scale up without having to manage it all themselves. Furthermore, with advancements in machine learning and analytics, as well as access to enormous amounts of data stored in the cloud, businesses can gain insights more quickly and easily. The development of these interconnected technologies pushes the limits of IoT, and IoT data in turn fuels these innovations. Finally, natural language processing (NLP) is now available on IoT devices, including digital personal assistants like Alexa, Cortana, and Siri, which has made IoT devices more attractive, practical, and affordable for home use.
Self-driving cars are the most vivid example of the IoT. We can now enable communication and information exchange between the automobile and its surroundings thanks to IoT and 5G technologies. As a result, the vehicle can interact with its surroundings more effectively.

Chapter 3

Lane and Curve Detection Using Digital Image Processing and Artificial Intelligence

3.1 Introduction
This venture aims to develop a self-driving car, or at the very least to aid drivers in reducing their risk of injuries, moving toward driverless cars, and hopefully having fewer rule violations and street incidents. Furthermore, it would enable blind and elderly people to own and operate an automobile without the need for assistance.
We will use advanced approaches such as machine learning and deep learning artificial intelligence models to make this endeavor a success.
To begin with, the lanes must be detected in order for the car to travel down its path, which is why machine vision, deep learning and machine learning models are used. It is critical to have the most accurate possible detection of lane curves and changes. To accomplish this objective, the suggested system is fed images, which are then analyzed using AI algorithms to produce lane recognition, or processed with feature extraction methods for lane detection.
Numerous lane-detection techniques with advanced performance have been presented recently, according to the literature. However, the majority of these techniques only detect the road lanes in the current frame of the driving scene, which results in poor performance when dealing with difficult driving conditions like deep shadows, severe road mark degradation, and significant vehicle occlusion. In such cases, it is possible to forecast the lane in the wrong direction, to detect it only partially, or even to miss it completely.
Road lanes are often continuous line structures, either solid or dashed, on the pavement. Due to the continuous nature of driving sequences and the substantial overlap between two adjacent frames, the relationship between the lane positions in adjacent frames is strong. Even though the lane may experience degradation or damage brought on by shadows, stains, and occlusion, the lane in the current frame can be predicted with more accuracy by using several prior frames. This is the idea behind the method suggested by Q. Zou et al. (21) that we used.

3.2 Technologies
When it comes to handling computer vision problems such as object identification, image classification, and semantic segmentation, deep learning has proven to be a cutting-edge, human-competitive, and often superior technology.

Deep neural networks generally fall into one of two categories. One is the deep convolutional neural network (DCNN), which is skilled in feature abstraction for pictures and videos and typically processes the input signal through many layers of convolution. The other is the deep recurrent neural network (DRNN), which can anticipate information in time-series signals by breaking the input signal down into successive blocks and establishing full-connection layers between them.
The fundamental goal of ADAS is to increase vehicle security while also protecting other road users: drivers, pedestrians, and cyclists. In recent years, the demands on such systems have expanded. In order to make real-time choices, warn the driver, or even act in his place, ADAS must be able to distinguish objects, road signs, the road itself, and any other moving vehicle. Deep learning techniques such as CNN, RNN and LSTM can be used to do this.
Following the method provided by Q. Zou et al., a hybrid deep neural network is presented for lane recognition that employs multiple continuous driving scene images. It is a deep neural network combining a DCNN and a DRNN.
The DCNN takes multiple frames as input and predicts the current frame's lane using semantic segmentation (26). A fully convolutional DCNN architecture performs the segmentation. It is divided into two networks, the encoder network and the decoder network, which ensures that the final output and the input images are the same size. The features extracted by the DCNN encoder network are then processed by a DRNN.
To manage the time-series of encoded features, a long short-term memory (LSTM (24)) network is used. The DRNN output, which is expected to have merged the information from the continuous input frames, is passed into the DCNN decoder network to help predict the lanes.
To assess performance, two datasets are gathered. Hundreds of samples were collected for
each of the 12 circumstances in one dataset. The other dataset, which includes thousands
of samples, was gathered on country roads. These datasets may be used to evaluate various
lane-detection techniques quantitatively.

3.3 Related works and state-of-the-art methods


A road lane detection system is an important component in the development of intelligent automobiles, and it has a direct impact on driving behavior. With such a system, it is feasible to establish an effective driving direction and offer an exact location of the automobile within the road lane. As a result, a thorough investigation into this subject is required.
Lane detection approaches are classified into two types: feature-based detection methods
and model-based detection methods.
The first category separates the lane from the road scene based on color and edge features. To simplify the Hough transform-based boundary line identification technique, Zheng et al. (6) used a system that directly recognizes the boundary line in the Hough space. The picture is subjected to the Hough transform, and the points in the Hough space corresponding to the parallelism, length and angle, and intercept characteristics of the line are chosen. When compared to the previous method, testing showed that recognition on fast lanes and structured roadways is much better. Huang et al. (22) evaluated probable perception difficulties in tough settings, such as changeable lighting conditions, variable weather and color conditions, and diverse road kinds, using the linear Hough transform (HT)-based straight lane recognition method. Indeed, the HT-based method was shown to have an adequate detection rate in basic scenarios, such as driving on a highway or in conditions with clear contrast between lane borders and their surroundings. On the other hand, it failed to recognize roadway limits under a range of illumination circumstances. Suder et al. (12) proposed three systems for extracting and detecting road lanes in different environments: (1) horizontal road lane detection based on image segmentation in the HSV color space, (2) optimal path finding using the edge detection-based hyperbolic fitting line detection algorithm, and (3) road lane detection based on edge detection, the Scharr mask (?), and the Hough transform algorithm. The suggested solutions were developed and tested on embedded devices such as the NVIDIA Jetson Nano and the Raspberry Pi 4B. The accuracy of feature-based lane detection drops dramatically when the lane is damaged or visibility is limited; these approaches are only relevant to real-world road situations where the lane margins are clean and road conditions are basic.
Model-based strategies have been applied to address the problem of road lane recognition, advancing self-driving systems and ADAS. Y. Zhao et al. (19) proposed a deep reinforcement learning-based model for surface lane detection that consists of two steps: a bounding box detector and a landmark point localizer. For better representation of curving lanes, a bounding box-level convolutional neural network is utilized to find the road lane, followed by a reinforcement-based Deep Q-Learning Localizer (DQLL) to properly localize the lanes as a set of landmark points. This model outperforms competitors on the NWPU Lanes and TuSimple (15) Lanes datasets. To detect automobile activity on the road, Heo et al. (14) developed a mix of lightweight deep learning models on an embedded GPU platform (eGPU). Their method analyzes discrete photos to provide a continuous track of the vehicle's journey, and the assessment findings reveal that it can accurately extract a vehicle's horizontal and vertical motions.
Kortli et al. (29) proposed a novel deep learning lane detection method (a deep embedded hybrid CNN-LSTM network). First, pre-processing such as resizing, shuffling, and normalization is carried out. Second, the proposed three-network system is applied to a single RGB channel to predict lane markings: a CNN encoder-decoder network that predicts lane markings on the road, a CNN encoder-decoder network that includes a dropout layer for regularization and uncertainty estimation, and a CNN encoder-LSTM decoder network architecture that uses the LSTM network to improve the detection rate by suppressing the influence of false-alarm patches on detection results. Third, feature extraction is used to locate the road lane markings. Finally, some post-processing is performed, such as Canny edge detection, perspective transforms, and polynomial curve fitting. This model was deployed on a high-performance NVIDIA Jetson Xavier NX platform.
Jianwei et al. (31) provided LDTFE (Lane Detection with Two-stage Feature Extraction), a robust model-based method used to detect lanes, whereby each lane has two boundaries. To enhance robustness, the lane boundary is considered a collection of small line segments, and a two-stage feature extraction method is applied. The first stage extracts small but important line segments using a modified Hough transform based on the concept of continuity. The second stage clusters those small line segments using a density-based clustering algorithm (DBSCAN). To lessen the probability of false positives, the final lanes are identified using the vanishing point after locating candidate clusters via the color contrast between the road and the lane boundaries. Experiments show that this strategy produces outstanding results on two datasets of road photos.
Model-based (deep learning) lane detection methods are appropriate for situations where the lane is damaged or visibility is poor. However, when the road traffic information is extremely complicated or there are interfering obstacles, detection quality drops drastically and false detections are frequent [(17), (27), (30), (23)].
Deep learning-based techniques can considerably increase the accuracy and robustness of lane detection. At the same time, these approaches demand more computing power and have a more sophisticated architecture, which leads to some limitations. To meet Embedded Intelligence (EI) constraints, it is vital to keep improving lane detection systems and to deploy them on embedded platforms for faster execution.
Most of the previously mentioned methods confine their approaches to detecting road lanes in the single current frame of the driving scene, resulting in poor performance in difficult driving circumstances. We use a method proposed by Q. Zou et al. to address this specific issue.

Chapter 4

Datasets

4.1 Dataset source


We had some difficulty finding a suitable dataset during our search, but in the end we discovered a Chinese dataset called tvtdataset. It includes 19383 image sequences for lane detection, with 39460 labeled frames. These images are split into two sections: the training dataset has 9548 labeled images that have been augmented four times, while the test dataset has 1268 labeled images. The images in this dataset are 128×256 pixels in size.

4.2 Dataset preparation


The training set has been expanded: the data volume is quadrupled by flipping the images and rotating them by three degrees. This augmented data is kept separate from the original training set as flipped and rotated datasets.
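A sketch of such an augmentation in Python with OpenCV is given below; beyond the flip and the three-degree rotation, the details are our assumptions rather than the dataset providers' exact procedure:

```python
import cv2

def augment(image):
    """Quadruple the data volume: original, flipped, rotated, and both."""
    h, w = image.shape[:2]
    flipped = cv2.flip(image, 1)                       # horizontal flip
    M = cv2.getRotationMatrix2D((w / 2, h / 2), 3, 1)  # rotate by 3 degrees
    rotated = cv2.warpAffine(image, M, (w, h))
    rotated_flipped = cv2.warpAffine(flipped, M, (w, h))
    return [image, flipped, rotated, rotated_flipped]
```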

The training set contains images of continuous driving scenes divided into image sequences, each composed of twenty frames. There are 19096 sequences available for training; the 13th and 20th frames of each sequence are labeled. The training dataset is divided into two parts: scenes built from the TuSimple lane detection dataset, which contains American highways, and photographs of rural China.

Figure 4.1: An example of an input image and the labeled groundtruth lanes. (a) The input
image. (b) The ground truth.

For the test, the dataset providers also sampled 5 continuous images at a time to determine the lanes in the current frame and compared the predictions to that frame's ground truth. They created Test set #1 and Test set #2, two entirely independent test sets. Test set #1 is based on the TuSimple test set and is used for typical testing. Test set #2 includes hard samples collected from various situations, particularly for evaluating robustness.

4.3 Conclusion
After the aforementioned procedures, we had a valid dataset. It will be employed in training to help create a robust model able to predict lanes.

Chapter 5

Proposed Method (Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks)

5.1 Introduction
During the third week of our internship, we attempted to detect lanes within video frames using OpenCV: the Canny algorithm to detect edges, the Hough transform to detect straight lines, area-of-interest selection based on image masking, and clustering to group line segments based on similarity measures.
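A minimal sketch of that pipeline with OpenCV is shown below; the triangular area-of-interest polygon and the Hough parameters are illustrative choices, not the exact values we used:

```python
import cv2
import numpy as np

def detect_lines(frame):
    """Canny edges + area-of-interest mask + probabilistic Hough transform."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
    h, w = edges.shape
    mask = np.zeros_like(edges)
    aoi = np.array([[(0, h), (w // 2, h // 2), (w, h)]], dtype=np.int32)
    cv2.fillPoly(mask, aoi, 255)            # keep only the road region
    masked = cv2.bitwise_and(edges, mask)
    # returns line segments as (x1, y1, x2, y2), to be grouped by clustering
    return cv2.HoughLinesP(masked, 1, np.pi / 180, threshold=50,
                           minLineLength=40, maxLineGap=100)
```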

Figure 5.1: Lane detection using area-of-interest masking and Canny edge detection: (left) the original frame, (center) the Canny image, (right) the masked area of interest [AoI]

When the road lines are obvious and the region of interest matches the road lanes, we achieve reasonable results. However, this exposed the inconsistency of our simple method: because the position of the lanes relative to the car is not always the same, the area of interest must be adjusted for different frames.

Figure 5.2: Hough transform results on 3 different frames: (left) the original frame, (center) a Marrakech video frame, (right) a failed Marrakech video frame

Because of the disadvantages indicated above, we decided to use a deep-learning-based strategy that employs computer vision and A.I. approaches to improve the efficiency and quality of the results. That is why we employed the strategy described below.

5.2 System Overview


Lanes are solid or dashed line structures on the pavement that can be recognized in a single image using geometric modeling or semantic segmentation. However, due to their unsatisfactory performance under demanding conditions such as strong shadows, severe mark deterioration, and extreme vehicle occlusion, these models are not promising for practical ADAS systems: the information in a single image is insufficient for effective lane detection.
The suggested method combines a CNN and an RNN for lane recognition using a number of continuous frames of the driving scene. In a continuous driving scene, the images collected by automotive cameras are successive, and the lanes in one frame and in the preceding frame largely overlap, which allows lane detection to be framed as a time-series prediction problem. The RNN is well suited to lane identification and prediction thanks to its abilities in continuous signal processing, sequential feature extraction, and integration. Meanwhile, the CNN excels at processing large images: an input image can be abstracted into smaller feature maps via successive convolution and pooling operations. The feature maps obtained on continuous frames have the time-series property and can be handled naturally by a recurrent network.

Figure 5.3: Architecture of the proposed network.

Q. Zou et al. built the network in an encoder-decoder framework to incorporate the CNN and RNN as an end-to-end trainable network. The figure above depicts the proposed network's architecture. Both the encoder and decoder CNNs are fully convolutional networks. The encoder CNN takes a number of continuous frames as input and produces a time-series of feature maps. The feature maps are then fed into the LSTM network to predict lane information, and the LSTM output is sent to the CNN decoder to generate a probability map for lane prediction. The probability map has the same size as the input image.

5.3 Network Design


5.3.1 LSTM Network
An LSTM network is used in this work; it outperforms the typical RNN model thanks to its capacity to forget insignificant information while remembering crucial elements, employing cells in the network to determine whether information is important or not. A double-layer LSTM is used: one layer for sequential feature extraction and one for integration. The typical full-connection LSTM requires a lot of time and computation, so a convolutional LSTM (ConvLSTM) was used in the suggested network. The ConvLSTM replaces the matrix multiplication with a convolution operation in each gate of the LSTM and is commonly utilized in end-to-end training and feature extraction from time-series data.
In this network, the input and output sizes of the ConvLSTM are equal to the feature map size produced by the encoder, which is 8×16 for the UNet-ConvLSTM. The convolutional kernel has a size of 3×3. The ConvLSTM has two hidden layers, each with a dimension of 512.
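The following PyTorch sketch shows a minimal ConvLSTM cell of this kind; it is our own illustration of the idea (a single convolution computing all four gates), not the authors' exact implementation:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """ConvLSTM cell: the four LSTM gates are computed by one convolution
    over the concatenated input and hidden state, replacing the matrix
    multiplications of a full-connection LSTM."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2  # 'same' padding keeps the spatial size
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, kernel_size, padding=padding)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)  # forget old content, add new content
        h = o * torch.tanh(c)          # hidden state exposed at this step
        return h, c

# Example: 5 encoder feature maps of size 8x16 with 512 channels, matching
# the dimensions reported above for the UNet-ConvLSTM.
cell = ConvLSTMCell(512, 512)
h = c = torch.zeros(1, 512, 8, 16)
for t in range(5):
    h, c = cell(torch.randn(1, 512, 8, 16), (h, c))
```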

5.3.2 Encoder-decoder network


The encoder-decoder framework models lane detection as a semantic segmentation task. In the encoder part, convolution and pooling are used for image abstraction and feature extraction. The decoder part uses deconvolution and up-sampling to recover and highlight the target information.
Inspired by the success of U-Net (18) in semantic segmentation, the network was built by embedding the ConvLSTM block into a U-Net encoder-decoder network. The resulting network is named UNet-ConvLSTM. The encoder and decoder blocks are made up entirely of convolutional networks. Thus, determining the number of convolutional layers, as well as the size and number of convolutional kernels, is crucial to the architectural design.
Figure 5.4: Encoder network in UNet-ConvLSTM. Skip connections exist between convolutional layers in the encoder and the matching layers in the decoder.

A block of the encoder network in the U-Net has two convolution layers with twice the number of convolution kernels as the previous block, and a pooling layer is utilized for down-sampling the feature map. After this process, the side length of the feature map is cut in half, while the number of channels representing high-level semantic characteristics is increased. The last block of the proposed UNet-ConvLSTM does not double the number of kernels for the convolution layers, as seen in the picture above. There are two reasons. First, the information in the original image can be well represented even with fewer channels: the lanes are typically represented by primitives such as color and edge, which can be extracted and used. Second, the encoder network's feature maps will be passed into the ConvLSTM for sequential feature learning; as the side length of the feature map is halved while the number of channels stays identical, the parameters of the corresponding full-connection layer are reduced to a quarter, making it easier for the ConvLSTM to process.
The size and number of feature maps in the decoder CNN should be the same as those in the encoder CNN, but arranged in the opposite order for improved feature recovery. As a result, the up-sampling and convolution processes in each decoder sub-block mirror the corresponding operations in the encoder sub-block. Feature-map appending is accomplished in a straightforward manner in U-Net between the respective sub-blocks of the encoder and decoder.
As the convolution procedure, the general Convolution-BatchNorm-ReLU pattern is used for the encoder and decoder CNNs. Every convolution uses 'same' padding.
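A sketch of one such encoder block in PyTorch, following the Convolution-BatchNorm-ReLU pattern with 'same' padding described above (the exact channel counts of the full network are not reproduced here):

```python
import torch.nn as nn

def encoder_block(in_ch, out_ch):
    """Two Convolution-BatchNorm-ReLU layers with 'same' padding, then
    max-pooling that halves the side length of the feature map."""
    conv = lambda i, o: nn.Sequential(
        nn.Conv2d(i, o, kernel_size=3, padding=1),  # 'same' padding
        nn.BatchNorm2d(o),
        nn.ReLU(inplace=True),
    )
    return nn.Sequential(conv(in_ch, out_ch), conv(out_ch, out_ch),
                         nn.MaxPool2d(2))
```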

5.4 Training strategy


After constructing the end-to-end trainable neural network, the network can be trained to generate predictions matching the ground truth via a back-propagation process that updates the weight parameters of the convolutional kernels and the ConvLSTM matrices. The training takes into account the four factors listed below:
a) The weights of a UNet pre-trained on ImageNet (28) are used for initialization.
b) For identifying the lanes, N continuous frames of the driving scene are used as input. As a result, the coefficient on each weight update for the ConvLSTM should be divided by N during back propagation. In this experiment, N = 5 frames were used; an experimental examination was conducted to determine how N affects performance.
c) To handle the discriminative segmentation task, a loss function based on weighted cross-entropy is built.
d) At various phases of training, different optimizers were utilized to efficiently train the proposed network. Initially, the adaptive moment estimation (Adam) (13) optimizer was employed. Once the network has been trained to a reasonable level of precision, a stochastic gradient descent (SGD) optimizer (25) is applied, which takes smaller, more careful strides toward the global optimum. When switching optimizers, the learning rates should be matched; otherwise, learning will be disrupted by a completely different stride, resulting in turbulent or stagnant convergence.
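The sketch below illustrates points c) and d) in PyTorch: a weighted cross-entropy loss and a switch from Adam to SGD partway through training. The epoch split and learning rates are placeholder values, not the ones used in the experiments:

```python
import torch
import torch.nn as nn

def train(model, loader, class_weights, adam_epochs=6, sgd_epochs=3,
          adam_lr=1e-3, sgd_lr=1e-4):
    # c) weighted cross-entropy handles the lane/background imbalance
    loss_fn = nn.CrossEntropyLoss(weight=class_weights)
    opt = torch.optim.Adam(model.parameters(), lr=adam_lr)
    for epoch in range(adam_epochs + sgd_epochs):
        if epoch == adam_epochs:
            # d) switch to SGD for fine convergence; the learning rate must
            # be matched so the step size does not change abruptly
            opt = torch.optim.SGD(model.parameters(), lr=sgd_lr, momentum=0.9)
        for frames, target in loader:   # frames: N continuous input images
            opt.zero_grad()
            loss = loss_fn(model(frames), target)
            loss.backward()             # back propagation updates the kernels
            opt.step()
```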

5.5 Conclusion
Throughout this chapter, we were introduced to a variety of concepts. After explaining and comprehending the technologies employed in this strategy, we will attempt to implement them on a machine, train and test the model, and measure the results according to the metrics.

Chapter 6

Implementation and Evaluation

6.1 Implementation
Experiments are carried out in this chapter to validate the accuracy and robustness of the suggested method. The suggested networks' performance is evaluated in various scenarios and compared to various lane-detection algorithms. The impact of parameters is also investigated.
The training was initially conducted on a capable machine with an Intel i7 CPU and an NVIDIA RTX 2060 GPU. However, this configuration is insufficient for such complex calculations. Given the reliability of the Google Colaboratory environment compared to our hardware, we switched to the Pro version that Google offers, where a faster Tesla P100 GPU was used. Training ran for 9 epochs with a batch size of 10. It took 1200 minutes to complete, with a 97.95 percent accuracy rate and a loss of about 0.22. Due to time constraints, we could not complete more than 9 epochs, but the initial results showed promising prospects.

6.2 Evaluation
The accuracy evaluation criterion, which gauges overall classification performance using the pixels that have been correctly classified, is the most straightforward (Eq. 6.1):

\[ \text{accuracy} = \frac{\text{TruePositive} + \text{TrueNegative}}{\text{TotalNumberOfPixels}} \tag{6.1} \]

For a fairer and more realistic comparison, precision (Eq. 6.2) and recall (Eq. 6.3) are used as two additional metrics, defined as

\[ \text{Precision} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalsePositive}} \tag{6.2} \]

\[ \text{Recall} = \frac{\text{TruePositive}}{\text{TruePositive} + \text{FalseNegative}} \tag{6.3} \]
We set lane as the positive class and background as the negative class for the lane detection task. In Eqs. 6.2 and 6.3, the number of lane pixels correctly predicted as lanes is the true positive, the number of background pixels incorrectly predicted as lanes is the false positive, and the number of lane pixels incorrectly predicted as background is the false negative.
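These definitions translate directly into code; the following NumPy sketch computes pixel-wise accuracy, precision, recall and the F1-measure (defined below in Eq. 6.4) from hypothetical binary masks (1 = lane, 0 = background):

```python
import numpy as np

def lane_metrics(pred, truth):
    """Pixel-wise metrics with the lane as the positive class."""
    tp = np.sum((pred == 1) & (truth == 1))
    tn = np.sum((pred == 0) & (truth == 0))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    accuracy = (tp + tn) / pred.size
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```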
We deduce from a visual inspection that the prediction of thinner lanes, the reduction of fuzzy conglutination zones, and a decrease in misclassification are the primary factors contributing to the improvement in precision. For ADAS systems, incorrect lane width, the presence of fuzzy regions, and misclassification are particularly risky. Our approach produces thinner lanes than the results of other methods, which reduces the likelihood that background pixels close to the ground truth are classified as lanes and results in a low false positive rate. Because background pixels are no longer classified as the lane class, the fuzzy area surrounding vanishing points and vehicle-occluded zones also yields few false positives.
Moreover, the aforementioned causes also contribute to the decline in recall. Although thinner lanes represent lane positions more accurately, they can occasionally remain separated from the ground truth at the pixel level; it is more difficult for two thinner lines to overlap, so these deflected pixels yield higher false negatives. Small pixel-level deviations, however, are imperceptible to human sight and have no negative effect on ADAS systems. Conglutination reduction has a similar impact: conglutination emerges if the model classifies every pixel in a region as belonging to the lane, which results in a high recall, but the background class then suffers from substantial misclassification, and the precision will be quite low despite the high recall.
In other words, despite the slightly reduced recall, the model better matches the task; thinner lines, which deviate slightly from the ground truth, are the cause of the decreased recall. We use the F1-measure as a comprehensive metric for the evaluation, because precision or recall alone only represent a portion of lane-detection performance. F1 (Eq. 6.4) is defined as

\[ F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6.4} \]
Running time: The suggested model adds an LSTM block and takes a series of images as input, which could result in a longer running time.
Robustness: Despite the strong performance on the previous test datasets, the proposed lane detection model must also be tested for robustness, since even slight false positives increase the chances of a traffic accident. A good lane detection model needs to handle common driving situations such as urban areas and highways, as well as difficult driving scenarios such as country roads, poor lighting, and vehicle occlusion. Robustness testing therefore uses a completely new dataset that contains many real-world driving scenes: the 728 images of Test set #2 described in the Datasets chapter, which contain lanes from rural, urban, and highway scenes. This dataset was recorded by a data recorder in different weather conditions, inside and outside the windshield, and at different heights. It is a thorough and difficult test dataset containing some lanes that are hard even for the human eye to see.

6.2.1 Training
The training lasted approximately 1200 minutes and was conducted in the Google Colaboratory environment. The average loss in the ninth epoch was approximately 22.75 percent, which was not entirely satisfactory. Accuracy was approximately 97.95 percent, which is impressive given only nine epochs. The LSTM helps the model perform better thanks to its ability to learn long-term dependencies: it remembers information across extended delays, forgets superfluous information, and carefully exposes information at each time step.

Figure 6.1: loss and accuracy over the number of epochs

Even though the loss is rather high, the precision is adequate. Training the model for a larger number of epochs would reduce the loss while increasing accuracy and precision.

6.2.2 Testing
We evaluated the model on the 0531 testing set, and the output images are nearly identical to the ground truth. The average loss was calculated to be 0.22, the accuracy is around 97.95 percent, the precision is approximately 0.89, the recall is approximately 0.98, and the F1-score is approximately 0.93. Compared to what we expected, these results are quite promising.

Figure 6.2: The input image and the ground truth: (left) input, (right) ground truth

Figure 6.3: Output values: (left) output data, (right) the prediction

The results are clearly robust; the only problem is that the output lanes are slightly thicker than the ground truth. We should also mention that these results are predictions: the inputs are 5 frames instead of one.

6.3 Conclusion
For robust lane detection in driving scenes, a novel hybrid neural network combining CNN and RNN was proposed. The proposed network architecture is based on an encoder-decoder framework that takes multiple continuous frames as input and predicts the lane of the current frame using semantic segmentation. In this framework, a CNN encoder first abstracts features from each frame of the input. A ConvLSTM then processes the sequentially encoded features of all input frames, and its outputs are fed into the CNN decoder for information reconstruction and lane prediction. For performance evaluation, two datasets containing continuous driving images were created.
When compared to baseline architectures that use a single image as input, the proposed
architecture produced significantly better results, proving the effectiveness of using multiple
continuous frames as input. Meanwhile, the experimental results showed that the ConvLSTM outperformed the FcLSTM in sequential feature learning and target-information prediction in the context of lane detection.
When compared to other models, the proposed models performed better, with higher pre-
cision, recall, and accuracy values. Furthermore, the proposed models were tested on a dataset
with extremely difficult driving scenes to ensure their robustness. The results demonstrated
that the proposed models can detect lanes in a variety of situations while avoiding false recog-
nitions. Longer sequences of inputs were found to improve performance in parameter analysis,
further supporting the strategy that multiple frames are more helpful than a single image for
lane detection.
Looking at the outcomes, we find the model satisfactory, which encourages us to apply it in practical settings.

Chapter 7

Deployment

7.1 Results
Before applying our results to real-time detection, we first test the detection on a single photo and on a video recording.
We used the same videos and photos as for the Hough transform experiments. The proposed method was applied to the "Marrakech" frame and to the videos "Marrakech-3," "Marrakech-2," and "Agadir." The results are as follows.
For the "Marrakech" frame, the results on a single image are good but not perfect, since lanes cannot be fully predicted from a single frame. This is the obtained output:

Figure 7.1: lane detection on the Marrakech-frame

Despite some erroneous detections and missing lanes, the detection on "Marrakech-2" shows strong results; the approach is quite robust for only 9 epochs. Video detection, however, is rather slow, as it demands a better hardware setup. The results are shown in the screenshots below.

Figure 7.2: Lane detection on the Marrakech-2 video. The frames are from the same video and are arranged in time order, from left to right and top to bottom.

Now let us look at detection on a better road with clear lane markings. The following images are from the Agadir videos; the detection appears considerably more robust than in the previous videos.


Figure 7.3: Lane detection on the Agadir video; the frames come from the same video and
are ordered in time from left to right, top to bottom

On Google Colaboratory, detection on a video is somewhat slow; on the local machine it is
extremely slow. This illustrates the hardware requirements of the proposed network.

7.2 Discussion
To further validate the proposed methods' performance, we compare them with additional
methods reported in the TuSimple lane-detection competition. It should be noted that
our training set is based on the TuSimple dataset.
The following table shows the comparison carried out by the authors of the original method [21]:

Figure 7.4: TuSimple lane-marking challenge leaderboard on the test set as of March 14, 2018

Note that the UNet-ConvLSTM in this table was trained for 100 epochs. The proposed
method outperforms most of the methods listed above.
The performance of the proposed methods is influenced primarily by two parameters. The
first is the number of frames used as the networks’ input, and the second is the sampling stride.
These two parameters together determine the total range between the first and last frames.
Given more frames as input, the proposed networks can generate prediction maps with
additional information, which may aid the final results.
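For example, under this definition, five input frames sampled with a stride of two span (5 - 1) x 2 = 8 frames of the original video between the first and last inputs.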
The challenge is the detection speed when using 3 to 5 frames as input. With a single frame,
the temporal prediction is lost, but the hardware requirements are lower. The model must
therefore be reworked to increase efficiency and reduce time consumption.

7.3 Real-time detection


In this section, we demonstrate our real-time detection method. Using OpenCV, we capture
frames from a camera, convert them from BGR (OpenCV's native format) to RGB, resize them,
and feed them into the network. We also used Streamlit [1] to build a GUI (graphical user
interface) that makes the real-time model more user-friendly. Streamlit is an open-source
Python library for creating and sharing custom web apps for machine learning and data science,
letting you build and deploy powerful data apps in just a few minutes.

Figure 7.5: The graphical user interface (GUI); (left) the checkbox that activates detection,
(right) detection running
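For illustration, the following is a minimal sketch of what such a Streamlit front end can look like. The widget labels and the predict_lanes() call are placeholders we introduce here, not the project's exact code:

import cv2
import streamlit as st

st.title("Real-time lane detection")
run = st.checkbox("Run detection")       # the activation box shown in Figure 7.5
frame_slot = st.empty()                  # placeholder updated on every frame

cap = cv2.VideoCapture(0)                # 0 = default webcam (here: Iriun)
while run:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # OpenCV delivers BGR
    # overlay = predict_lanes(frame)     # hypothetical call into our model
    frame_slot.image(frame, channels="RGB")
cap.release()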

Using a phone camera attached to the PC via USB, with the Iriun Webcam application linking
the PC to the phone's camera, we were able to capture frames in real time and use them to
detect lanes.

We first stream the live frames from the camera into the real-time detection Python script
using OpenCV. Each frame is then converted to a NumPy array and finally to a tensor.
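A minimal sketch of that preprocessing step is shown below, under the assumption (stated earlier) that the network consumes five consecutive frames; the 256x128 target size and the [0, 1] scaling are illustrative choices, not the project's exact parameters:

from collections import deque

import cv2
import numpy as np
import torch

buffer = deque(maxlen=5)                 # rolling window of the last 5 frames

def frame_to_tensor(frame_bgr):
    """Convert one OpenCV frame into a normalized CHW float tensor."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    rgb = cv2.resize(rgb, (256, 128))    # OpenCV expects (width, height)
    arr = rgb.astype(np.float32) / 255.0 # NumPy array scaled to [0, 1]
    return torch.from_numpy(arr).permute(2, 0, 1)    # HWC -> CHW

def push(frame_bgr):
    """Add a frame; return a (1, 5, C, H, W) batch once the window is full."""
    buffer.append(frame_to_tensor(frame_bgr))
    if len(buffer) == 5:
        return torch.stack(list(buffer)).unsqueeze(0)
    return None                          # not enough frames collected yet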

Figure 7.6: The original camera frame, captured from the phone through Iriun Webcam

We launch the real-time detection from the GUI. The following are the image outputs
from both the Marrakech and Agadir videos.

Figure 7.7: Two screenshots of live detection on the Marrakech video

Figure 7.8: Two screenshots of live detection on the Agadir video

7.4 Conclusion
When compared to other models, the proposed models performed better, with higher precision,
recall, and accuracy values. The plan is to improve the lane detection system in the future by
incorporating lane fitting into the proposed framework.
As a result, the detected lanes will be smoother and more reliable, even when strong
interference exists in a dim environment.
This will also improve real-time detection, but the challenge remains to develop a model
capable of detecting in real time while being embeddable on low-cost hardware.

Chapter 8

General Conclusion

8.1 Synthesis
After identifying and understanding the subject's factors and challenges, we set out to address
them. We first found data suited to both our hardware and software constraints, then grouped
together the activities related to constructing the precise dataset to be analysed, which had
to be prepared so as to be compatible for use. In the next step, we applied an algorithm to
generate knowledge, then verified the resulting model to test its robustness and accuracy (in
our case, the generated model). Last but not least, we put the knowledge obtained to work
(testing and real-time detection).
Learning about the levels of vehicle automation, we became able to distinguish between
DAS, ADAS, and ADS. In addition, we glimpsed the tip of the iceberg of the technologies
deployed in this field, such as: Big Data, which represents the "what to learn?"; computer
vision, which represents the "how to see?"; machine learning/deep learning, which represents
the "how to learn?"; and the Internet of Things, which represents the "how to communicate?".
We were also introduced to the different categories of machine learning/deep learning
algorithms: supervised, unsupervised, semi-supervised, and reinforcement learning.
Moreover, we studied two supervised deep neural networks: the Convolutional
Neural Network (CNN) and the Recurrent Neural Network (RNN). We also learned about
self-driving cars, their use cases, and the algorithms implemented within this topic.
After reading some state-of-the-art methods such as LDTFE and the deep hybrid CNN-
LSTM network, we learned about the two categories of lane detection techniques: feature-based
and model-based.
We then selected the most convenient dataset (TvtDataset) and the matching model, started
to understand the model, and tried to run it on our machines. Due to the lack of suitable
hardware, we encountered many difficulties during this stage, but we certainly enjoyed the
process. The proposed method is a hybrid deep neural network that combines a DCNN and a
DRNN to predict the road lane from multiple frames. The method showed outstanding results:
despite being slow, it outperforms most of the methods on the proposed dataset. Finally, we
deployed the model on a local machine and created a GUI to ease its usage.
Overall, we are quite satisfied with these results and hope to go even further, perhaps by
creating a model from scratch or investing in making models better suited to modest machines.

8.2 Perspective
The goal for future work is to deploy our models on hardware. The idea is to create a small
dataset with three or four classes (for example: (1) stop sign, (2) crosswalk road mark, (3) red
traffic light, (4) green traffic light), train a tiny YOLO model on that dataset, and then deploy
the trained model on a Raspberry Pi 3 (Model B, for example), with the Arduino and the
Raspberry Pi connected via LAN.
When the RC-car is moving (on a racetrack made of small traffic signs, lights, and curves),
the Pi camera captures road frames, the Raspberry Pi processes these frames, our tiny YOLO
model detects vertical/horizontal road signs, and the lane curves are identified using pattern
recognition (for example, the LDTFE or another method that requires fewer calculations).
The Raspberry Pi then sends commands to the Arduino to execute, causing the RC-car to
move forward, left, or right, or even stop. (If we also want it to go backwards, another camera
would be required to capture frames of the road behind the RC-car.)
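Since the two boards are to be connected via LAN, one hypothetical way the Pi-to-Arduino command channel could look is a tiny TCP client; the address, port, and command vocabulary below are illustrative assumptions, not an implemented protocol:

import socket

ARDUINO_ADDR = ("192.168.1.50", 5000)    # placeholder LAN address and port
COMMANDS = {"FORWARD", "LEFT", "RIGHT", "STOP"}

def send_command(cmd: str) -> None:
    """Send one newline-terminated driving command to the Arduino."""
    if cmd not in COMMANDS:
        raise ValueError(f"unknown command: {cmd}")
    with socket.create_connection(ARDUINO_ADDR, timeout=1.0) as conn:
        conn.sendall((cmd + "\n").encode("ascii"))

# Example: after the tiny YOLO model detects a stop sign,
# send_command("STOP")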
We wanted to use the CARLA simulator for the simulation part, but due to a lack of
computational power we will instead use a simulator with lower hardware requirements, such
as Udacity's. The idea is to build a system capable of detecting and recognising speed traffic
signs in the Udacity simulator; the architecture is as follows:

Figure 8.1: Simulation abstraction

Making a model more "embedded" is a difficult goal, and creating a simulation was difficult
under our time constraints. Nonetheless, these are intriguing endeavors that we hope to complete
as soon as humanly possible. We will keep working to achieve these objectives and to enhance
our knowledge in this area. We will also strive to stay current with developments in the field,
to conduct cutting-edge research using sophisticated technology to push the boundaries of
knowledge, and to become more creative and passionate.

Bibliography

[1] Streamlit. A faster way to build and share data apps, 2019. https://streamlit.io/.

[2] C. Janiesch, P. Zschech, and K. Heinrich. Machine learning and deep learning. SpringerLink, 2021.

[3] Nissan Motor Co. Seamless Autonomous Mobility (SAM).

[4] IBM Corporation. Introduction to CRISP-DM. IBM SPSS Modeler CRISP-DM Documentation, 2018.

[5] Intel Corporation. What is machine vision?

[6] F. Zheng, S. Luo, K. Song, C. W. Yan, and M. C. Wang. Improved lane line detection algorithm based on Hough transform. SpringerLink, 2018.

[7] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.

[8] IBM. What is Industry 4.0? https://www.ibm.com/topics/industry-4-0.

[9] Oracle Cloud Infrastructure. What are the three Vs of big data? 2022.

[10] Oracle Cloud Infrastructure. What is big data? 2022.

[11] Hyunjoo Jin. Like Tesla, Toyota develops self-driving tech with low-cost cameras. Reuters, 2022.

[12] J. Suder, K. Podbucki, T. Marciniak, and A. Dąbrowski. Low complexity lane detection methods for light photometry system. MDPI, 2021.

[13] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, 2014.

[14] kivinju. tusimple-benchmark ground truth 3. GitHub, 2018.

[15] Z. Zhao, Q. Wang, and X. Li. Deep reinforcement learning based lane detection and localization. ScienceDirect, 2020.

[16] Tesla Motors. Tesla Autopilot. 2022.

[17] N. Khairdoost, S. Beauchemin, and M. Bauer. Road lane detection and classification in urban and suburban areas based on CNNs. SCITEPRESS Digital Library, 2021.

[18] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015.

[19] OpenCV. Documentation / Image processing in OpenCV. 2022.

[20] Oracle. What is IoT? Oracle, 2022.

[21] Q. Zou, H. Jiang, Q. Dai, Y. Yue, L. Chen, and Q. Wang. Robust lane detection from continuous driving scenes using deep neural networks. IEEE Transactions on Vehicular Technology, 2019.

[22] Q. Huang and J. Liu. Practical limitations of lane detection algorithm based on Hough transform in challenging scenarios. SAGE Journals, 2021.

[23] Qiao Huang and Jinlong Liu. Practical limitations of lane detection algorithm based on Hough transform in challenging scenarios. SAGE Journals, 2021.

[24] R. C. Staudemeyer and E. Rothstein Morris. Understanding LSTM – a tutorial into long short-term memory recurrent neural networks. arXiv, 2019.

[25] S. Ruder. An overview of gradient descent optimization algorithms. arXiv, 2017.

[26] S. A. Taghanaki, K. Abhishek, J. P. Cohen, J. Cohen-Adad, and G. Hamarneh. Deep semantic segmentation of natural and medical images: a review. Artificial Intelligence Review 54, 137–178 (2021). SpringerLink, 2020.

[27] S. Ghanem, P. Kanungo, G. Panda, S. C. Satapathy, and R. Sharma. Lane detection under artificial colored light in tunnels and on highways: an IoT-based framework for smart city infrastructure. SpringerLink, 2021.

[28] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, 2014.

[29] T. Heo, W. Nam, J. Paek, and J. Ko. Autonomous reckless driving detection using deep learning on embedded GPUs. IEEE, 2020.

[30] T. Almeida, B. Lourenço, and V. Santos. Road detection based on simultaneous deep learning approaches. ScienceDirect, 2020.

[31] Y. Kortli, S. Gabsi, L. F. C. Lew Yan Voon, M. Jridi, M. Merzougui, and M. Atri. Deep embedded hybrid CNN–LSTM network for lane detection on NVIDIA Jetson Xavier NX. ScienceDirect, 2022.
