Face Mask Detection Project Report.

The document presents a research proposal for a face mask detection project, focusing on enhancing recognition accuracy for masked faces using a two-stage Convolutional Neural Network (CNN) architecture. It discusses the significance of face recognition technology in various applications, particularly in the context of the COVID-19 pandemic, and highlights the role of machine learning in improving detection performance. The report outlines the evolution of machine learning, its applications, and the proposed methodology for integrating face mask detection with surveillance systems.

Uploaded by

Kamal Acharya

See discussions, stats, and author profiles for this publication at:
https://www.researchgate.net/publication/393288478

FACE MASK DETECTION PROJECT REPORT

Research Proposal · July 2025


DOI: 10.13140/RG.2.2.29238.82246

1 author:

Kamal Acharya
Tribhuvan University

All content following this page was uploaded by Kamal Acharya on 02 July 2025.



AN
INTERNSHIP REPORT
ON
FACE MASK DETECTION PROJECT REPORT
BY
KAMAL ACHARYA
(Tribhuvan University)

Date: 2025/07/02

ABSTRACT

Face recognition has become a popular and significant technology in recent years. In
real-world settings such as video surveillance, subjects are often uncooperative with the
system, and masked faces are a common scenario. Current face recognition performance
degrades on masked faces, yet the difficulties created by masks are usually disregarded.
Face recognition is a promising area of applied computer vision; the technique is used to
recognize a face or identify a person automatically from given images. In daily
activities such as passport checking, smart doors, access control, voter verification,
criminal investigation, and many other purposes, face recognition is widely used to
authenticate a person correctly and automatically. Face recognition has gained much
attention as a unique, reliable biometric recognition technology, which makes it more
popular than other authentication techniques such as passwords, PINs, and fingerprints.
The primary concern of this work is facial masks, and especially enhancing the
recognition accuracy for different masked faces. A feasible approach is proposed that
begins by detecting the facial regions. The occluded face detection problem is approached
using a cascaded Convolutional Neural Network (CNN). Its performance has also been
evaluated on heavily masked faces, with promising outcomes. Finally, a comparative study
is included for better understanding.

Chapter 1

Introduction

1.1 Face-Mask Recognition:


Rapid advancements in the fields of Science and Technology have led us to a stage where
we are capable of achieving feats that seemed improbable a few decades ago. Technologies in
fields like Machine Learning and Artificial Intelligence have made our lives easier and provide
solutions to several complex problems in various areas.

Face mask detection refers to detecting whether a person is wearing a mask or not. In fact, the
problem is a reverse engineering of face detection, where the face is detected using different
machine learning algorithms for the purposes of security, authentication, and surveillance. Face
detection is a key area in the field of Computer Vision and Pattern Recognition. A significant
body of research has contributed sophisticated algorithms for face detection in the past. Early
research on face detection, notably in 2001, used handcrafted features and traditional machine
learning algorithms to train effective classifiers for detection and recognition. The problems
encountered with this approach include high complexity in feature design and low detection
accuracy. In recent years, face detection methods based on deep convolutional neural networks
(CNN) have been widely developed to improve detection performance.

Modern Computer Vision algorithms are approaching human-level performance in visual
perception tasks. From image classification to video analytics, Computer Vision has proven to be
a revolutionary aspect of modern technology. In a world battling the Novel Coronavirus
Disease (COVID-19) pandemic, technology has been a lifesaver. With the aid of technology,
‘work from home’ has substituted our normal work routines and has become a part of our daily
lives. However, for some sectors, it is impossible to adapt to this new norm.

In this paper, we propose a two-stage CNN architecture, where the first stage detects human
faces, while the second stage uses a lightweight image classifier to classify the faces detected in
the first stage as either ‘Mask’ or ‘No Mask’ faces and draws bounding boxes around them along
with the detected class name.

This algorithm was extended to videos as well. The detected faces are tracked between
frames using an object tracking algorithm, which makes the detection robust to noise. The
system can then be integrated with an image or video capturing device, such as a CCTV camera, to
track safety violations, promote the use of face masks, and ensure a safe working environment.
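
The two-stage flow described above can be sketched as follows. This is only a structural illustration, not the trained system: the names `detect_masks`, `toy_face_detector`, and `toy_mask_classifier` are hypothetical, and the two toy functions stand in for the CNN face detector and the lightweight mask classifier so that the example runs without any model files.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x, y, width, height) in pixels
Image = Sequence[Sequence[int]]      # stand-in for a grayscale frame


@dataclass
class Detection:
    box: Box
    label: str        # "Mask" or "No Mask"
    p_mask: float     # classifier output for the "Mask" class, in [0, 1]


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the face region out of the frame."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


def detect_masks(image, face_detector, mask_classifier, threshold=0.5):
    """Stage 1 finds face boxes; stage 2 classifies each cropped face."""
    detections = []
    for box in face_detector(image):                  # stage 1: face detection
        p_mask = mask_classifier(crop(image, box))    # stage 2: mask classification
        label = "Mask" if p_mask >= threshold else "No Mask"
        detections.append(Detection(box, label, p_mask))
    return detections


# Hypothetical stand-ins for the trained networks (assumptions, not real models).
def toy_face_detector(image):
    """Pretend every frame contains two 2x2 faces side by side."""
    return [(0, 0, 2, 2), (2, 0, 2, 2)]


def toy_mask_classifier(face):
    """Pretend bright regions are masks: return mean brightness scaled to [0, 1]."""
    pixels = [p for row in face for p in row]
    return sum(pixels) / (255 * len(pixels))


frame = [[255, 255, 0, 0],
         [255, 255, 0, 0]]
results = detect_masks(frame, toy_face_detector, toy_mask_classifier)
labels = [d.label for d in results]
```

Passing the detector and classifier as arguments keeps the two stages independent, so either one could be replaced by a real model without changing the surrounding loop.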

1.2 Role of Machine Learning:


Machine learning is a method of data analysis that automates analytical model building. It is
a branch of artificial intelligence based on the idea that systems can learn from data, identify
patterns and make decisions with minimal human intervention.

1.2.1 Evolution of machine learning


Because of new computing technologies, machine learning today is not like machine learning of
the past. It was born from pattern recognition and the theory that computers can learn without
being programmed to perform specific tasks; researchers interested in artificial intelligence wanted
to see if computers could learn from data. The iterative aspect of machine learning is important
because as models are exposed to new data, they are able to independently adapt. They learn from
previous computations to produce reliable, repeatable decisions and results.

While many machine learning algorithms have been around for a long time, the ability to
automatically apply complex mathematical calculations to big data – over and over, faster and
faster – is a recent development. Here are a few widely publicized examples of machine learning
applications you may be familiar with:

• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine learning
applications for everyday life.

• Knowing what customers are saying about you on Twitter? Machine learning combined
with linguistic rule creation.

• Fraud detection? One of the more obvious, important uses in our world today.

1.2.2 Importance of Machine Learning
Resurging interest in machine learning is due to the same factors that have made data mining and
Bayesian analysis more popular than ever: growing volumes and varieties of available data,
computational processing that is cheaper and more powerful, and affordable data storage.

All of these things mean it's possible to quickly and automatically produce models that can analyze
bigger, more complex data and deliver faster, more accurate results even on a very large scale.
And by building precise models, an organization has a better chance of identifying profitable
opportunities or avoiding unknown risks.

Features of Machine Learning:

• Machine learning models learn from data with little or no human intervention.
• Machine learning is the science of making computers learn and act like humans by feeding
them data and information, without being explicitly programmed.
• Machine learning differs from traditional programming: here the data and the desired output
are given to the computer, and in return it produces the program that solves the problem.
• It is, in effect, automating the automation.
• Writing software is the bottleneck.
• The goal is to get computers to program themselves.
• Within the field of data analytics, machine learning is a method used to devise complex
models and algorithms that lend themselves to prediction; in commercial use, this is known
as predictive analytics. These analytical models allow researchers, data scientists,
engineers, and analysts to "produce reliable, repeatable decisions and results" and uncover
"hidden insights" through learning from historical relationships and trends in the data.

Categories of Machine Learning:

Machine learning tasks are typically classified into several broad categories:

Supervised learning: The computer is presented with example inputs and their desired outputs, given
by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As special
cases, the input signal can be only partially available, or restricted to special feedback.

Semi-supervised learning: The computer is given only an incomplete training signal: a training set
with some (often many) of the target outputs missing.

Active learning: The computer can only obtain training labels for a limited set of instances (based
on a budget), and also has to optimize its choice of objects to acquire labels for. When used
interactively, these can be presented to the user for labelling.

Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find
structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in
data) or a means towards an end (feature learning).

Reinforcement learning: Data (in the form of rewards and punishments) are given only as feedback to
the program's actions in a dynamic environment, such as driving a vehicle or playing a game against
an opponent.
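
To make the difference between the first and fourth categories concrete, the following is a minimal sketch on one-dimensional toy data. The function names and the data are illustrative assumptions; the supervised part is a nearest-centroid rule and the unsupervised part is a two-cluster k-means, chosen here only because both fit in a few lines.

```python
from statistics import mean


def fit_centroids(samples, labels):
    """Supervised: the 'teacher' labels are used to compute one centroid per class."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def two_means(samples, iterations=10):
    """Unsupervised: split unlabeled samples into two clusters (no labels used).

    Assumes the samples contain at least two distinct values.
    """
    lo, hi = min(samples), max(samples)
    for _ in range(iterations):
        left = [x for x in samples if abs(x - lo) <= abs(x - hi)]
        right = [x for x in samples if abs(x - lo) > abs(x - hi)]
        lo, hi = mean(left), mean(right)
    return lo, hi


samples = [1, 2, 3, 11, 12, 13]
labels = ["A", "A", "A", "B", "B", "B"]

centroids = fit_centroids(samples, labels)   # needs the labels
clusters = two_means(samples)                # finds the same structure without them
```

On this toy data both approaches recover the same two groups; the supervised version can also name them, because the names came with the training examples.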

1.2.3 Machine Learning Applications

Artificial Intelligence is everywhere. Chances are that you are using it in one way or another
without even knowing it. One of the popular applications of AI is Machine Learning, in which
computers, software, and devices perform via cognition (very similar to the human brain). Here, we
share a few applications of machine learning.

I. Image Recognition:

One of the most common uses of machine learning is image recognition. There are many situations
where the object to be classified is a digital image. For digital images, the measurements
describe the output of each pixel in the image.

• In the case of a black and white image, the intensity of each pixel serves as one
measurement. So if a black and white image has N × N pixels, the total number of pixels,
and hence of measurements, is N².

• In a colored image, each pixel provides 3 measurements: the intensities of the 3 main
colour components, i.e. RGB. So for an N × N colored image there are 3N² measurements.

• For face detection – The categories might be face versus no face present. There might be a
separate category for each person in a database of several individuals.

• For character recognition – We can segment a piece of writing into smaller images, each
containing a single character. The categories might consist of the 26 letters of the English
alphabet, the 10 digits, and some special characters.
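
The measurement counts above can be checked with a short sketch. The helper names `measurement_count` and `flatten` are illustrative, and images are represented as plain nested lists rather than with an image library.

```python
def measurement_count(n, channels=1):
    """Number of measurements in an n x n image with the given channel count."""
    return channels * n * n


def flatten(image):
    """Turn an image (rows of pixels) into one flat list of measurements."""
    features = []
    for row in image:
        for pixel in row:
            if isinstance(pixel, (tuple, list)):
                features.extend(pixel)    # colour pixel: (R, G, B) gives 3 values
            else:
                features.append(pixel)    # grayscale pixel: one intensity
    return features


gray = [[0, 128],
        [255, 64]]                        # 2 x 2 grayscale image
rgb = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (10, 20, 30)]]       # 2 x 2 colour image

gray_features = flatten(gray)
rgb_features = flatten(rgb)
```

For N = 2 this yields 4 measurements for the grayscale image and 12 for the colour image, matching N² and 3N².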

II. Speech Recognition:

Speech recognition (SR) is the translation of spoken words into text. It is also known as
"automatic speech recognition" (ASR), "computer speech recognition", or "speech to text" (STT). In
speech recognition, a software application recognizes spoken words. The measurements in this
application might be a set of numbers that represent the speech signal. We can segment the signal
into portions that contain distinct words or phonemes.

In each segment, we can represent the speech signal by the intensities or energy in different
time-frequency bands. Although the details of signal representation are outside the scope of this
report, we can represent the signal by a set of real values. Speech recognition applications
include voice user interfaces such as voice dialing, call routing, and domotic appliance control.
It can also be used for simple data entry, preparation of structured documents, speech-to-text
processing, and aircraft systems.

III. Medical Diagnosis:

ML provides methods, techniques, and tools that can help solve diagnostic and prognostic
problems in a variety of medical domains. It is being used for the analysis of the importance of
clinical parameters and of their combinations for prognosis, e.g. prediction of disease progression,
for the extraction of medical knowledge for outcomes research, for therapy planning and support,
and for overall patient management. ML is also being used for data analysis, such as detection of
regularities in the data by appropriately dealing with imperfect data, interpretation
of continuous data used in the Intensive Care Unit, and for intelligent alarming resulting in
effective and efficient monitoring.

It is argued that the successful implementation of ML methods can help the integration of
computer-based systems in the healthcare environment providing opportunities to facilitate and
enhance the work of medical experts and ultimately to improve the efficiency and quality of
medical care. In medical diagnosis, the main interest is in establishing the existence of a disease
followed by its accurate identification. There is a separate category for each disease under
consideration and one category for cases where no disease is present. Here, machine learning
improves the accuracy of medical diagnosis by analyzing data of patients.

IV. Statistical Arbitrage:

In finance, statistical arbitrage refers to automated trading strategies that are typically
short-term and involve a large number of securities. In such strategies, the user tries to
implement a trading algorithm for a set of securities on the basis of quantities such as historical
correlations and general economic variables. These measurements can be cast as a classification or
estimation problem. The basic assumption is that prices will move towards a historical average.

In the case of classification, the categories might be sell, buy, or do nothing for each security.
In the case of estimation, one might try to predict the expected return of each security over a
future time horizon. In this case, one typically needs to use the estimates of the expected return
to make a trading decision (buy, sell, etc.).

V. Learning Associations:

Learning association is the process of developing insights into various associations between
products. A good example is how seemingly unrelated products may reveal an association to one
another when analyzed in relation to the buying behaviors of customers.

One application of machine learning is studying the association between the products people
buy, which is also known as basket analysis. If a buyer buys ‘X’, is he or she likely to buy ‘Y’
because of a relationship that can be identified between them? A familiar example is the
relationship between fish and chips. When new products are launched in the market, they form new
relationships with existing products. Knowing these relationships could help in suggesting the
associated product to the customer, increasing the likelihood of the customer buying it. It can
also help in bundling products for a better package.
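
As a rough illustration of basket analysis, the sketch below computes two standard association measures, support and confidence, for the rule "fish implies chips". The baskets are invented data, and the function names are illustrative rather than taken from any library.

```python
def support(transactions, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the baskets.

    Assumes the antecedent occurs in at least one basket.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Invented purchase history: each basket is the set of products in one purchase.
baskets = [
    {"fish", "chips"},
    {"fish", "chips", "peas"},
    {"fish"},
    {"chips", "cola"},
    {"bread"},
]

fish_support = support(baskets, {"fish"})
fish_and_chips = support(baskets, {"fish", "chips"})
rule_confidence = confidence(baskets, {"fish"}, {"chips"})
```

Here two of the three baskets containing fish also contain chips, so a shop could reasonably suggest chips to a customer who has just picked fish.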

VI. Classification:

Classification is the process of placing each individual from the population under study into one
of several classes, on the basis of independent variables. Classification helps analysts use
measurements of an object to identify the category to which that object belongs. To establish an
efficient rule, analysts use data. The data consist of many examples of objects with their correct
classification.

For example, before a bank decides to disburse a loan, it assesses customers on their ability to
repay the loan. It can do so by considering factors such as the customer's earnings, age, savings,
and financial history. This information is taken from past loan data. The lender then uses it to
establish a relationship between customer attributes and the related risks.

VII. Prediction:

Consider the example of a bank computing the probability of any loan applicant defaulting on the
loan repayment. To compute the probability of default, the system will first need to classify the
available data into certain groups, described by a set of rules prescribed by the analysts. Once
the classification is done, we can compute the probability as needed. Such probability
computations can be applied across all sectors for varied purposes.
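
The two steps described here, grouping by analyst rules and then computing a probability per group, can be sketched as follows. The thresholds in `risk_group` and the loan history are invented for illustration; a real lender would set its own rules.

```python
from collections import defaultdict


def risk_group(applicant):
    """Analyst-prescribed grouping rule (hypothetical thresholds)."""
    if applicant["income"] >= 50000 and applicant["savings"] >= 10000:
        return "low"
    if applicant["income"] >= 25000:
        return "medium"
    return "high"


def default_rates(history):
    """Empirical probability of default within each risk group."""
    counts = defaultdict(lambda: [0, 0])        # group -> [defaults, total]
    for record in history:
        group = risk_group(record)
        counts[group][0] += record["defaulted"]  # True counts as 1
        counts[group][1] += 1
    return {group: d / n for group, (d, n) in counts.items()}


# Invented past loans.
history = [
    {"income": 60000, "savings": 20000, "defaulted": False},
    {"income": 70000, "savings": 15000, "defaulted": False},
    {"income": 30000, "savings": 2000, "defaulted": False},
    {"income": 40000, "savings": 5000, "defaulted": True},
    {"income": 10000, "savings": 0, "defaulted": True},
    {"income": 15000, "savings": 500, "defaulted": True},
    {"income": 20000, "savings": 1000, "defaulted": False},
]

rates = default_rates(history)
new_applicant = {"income": 35000, "savings": 3000}
estimated_risk = rates[risk_group(new_applicant)]
```

A new applicant is assigned to a group by the same rules and inherits that group's observed default rate as the estimate.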

VIII. Extraction:

Information Extraction (IE) is another application of machine learning. It is the process of
extracting structured information from unstructured data such as web pages, articles, blogs,
business reports, and e-mails. The process of extraction takes a set of documents as input and
produces structured data. This output is in a summarized form, such as an Excel sheet or a table
in a relational database, which maintains the output produced by the information extraction.

Nowadays, extraction is becoming key in the big data industry. A huge volume of data is being
generated, most of which is unstructured. The first key challenge is handling this unstructured
data: it must be converted to a structured form based on some pattern so that it can be stored in
an RDBMS. In addition, data collection mechanisms themselves are changing.

IX. Regression:

We can apply machine learning to regression as well. Assume that x = x1, x2, x3, …, xn are the
input variables and y is the outcome variable. In this case, we can use machine learning
to produce the output (y) on the basis of the input variables (x).
A model expresses the relationship between the various parameters as below:

y = g(x), where g is a function that depends on specific characteristics of the model.

In regression, we can use the principle of machine learning to optimize the parameters, so as to
cut the approximation error and calculate the closest possible outcome. We can also use machine
learning for function optimization: we can choose to alter the inputs to get a better model. This
gives a new and improved model to work with. This is known as response surface design.
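
As a minimal sketch of optimizing the parameters of g, assume the simplest case of a single input variable and a straight-line model g(x) = a + b·x, fitted by ordinary least squares. The function name `fit_line` and the data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the model y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def g(x, intercept, slope):
    """The fitted model: predicts y for a new input x."""
    return intercept + slope * x


xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]                 # generated from y = 1 + 2x, with no noise
a, b = fit_line(xs, ys)
prediction = g(5, a, b)
```

The parameters a and b are the ones that minimize the sum of squared approximation errors on the training data; with several input variables the same idea extends to multiple linear regression.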

1.3 System Requirements

1.3.1 Hardware Requirements


Table 1.3.1: Hardware Requirements

COMPONENT         REQUIREMENT
SYSTEM            Intel Core i3, i5, or i7
RAM               8 GB and above
HARD DISK         10 GB and above
INPUT DEVICES     Keyboard and Mouse
OUTPUT DEVICES    Monitor or PC

Table 1.3.1 shows the minimal hardware requirements for the project.

1.3.2 Software Requirements

Table 1.3.2: Software Requirements

COMPONENT               REQUIREMENT
OPERATING SYSTEM        Windows 7, Windows 10 or above, or Ubuntu Linux
IDE/EDITOR              Jupyter Notebook, Visual Studio Code, Google Colab
FRONT END               OpenCV
BACK END                Python and files
PROGRAMMING LANGUAGE    Python

Table 1.3.2 shows the minimal software requirements for the project.

1.3.2.1 Python:
Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic
binding, make it very attractive for Rapid Application Development, as well as for use as a
scripting or glue language to connect existing components together. Python's simple, easy-to-learn
syntax emphasizes readability and therefore reduces the cost of program maintenance. Python
supports modules and packages, which encourages program modularity and code reuse. The
Python interpreter and the extensive standard library are available in source or binary form without
charge for all major platforms, and can be freely distributed.

Often, programmers fall in love with Python because of the increased productivity it provides.
Since there is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python
programs is easy: a bug or bad input will never cause a segmentation fault. Instead, when the
interpreter discovers an error, it raises an exception. When the program doesn't catch the exception,
the interpreter prints a stack trace. A source level debugger allows inspection of local and global
variables, evaluation of arbitrary expressions, setting breakpoints, stepping through the code a line
at a time, and so on. The debugger is written in Python itself, testifying to Python's introspective
power. On the other hand, often the quickest way to debug a program is to
add a few print statements to the source: the fast edit-test-debug cycle makes this simple approach
very effective.
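
The behaviour described above, where bad input raises an exception that can either be caught or reported as a stack trace, can be seen in a small sketch. The function names are illustrative; only the standard library is used.

```python
import traceback


def parse_age(text):
    """Convert user input to an integer; int() raises ValueError on bad input."""
    return int(text)


def safe_parse_age(text, default=None):
    """Catch the exception and fall back to a default instead of crashing."""
    try:
        return parse_age(text)
    except ValueError:
        return default


def describe_failure(text):
    """Return the stack trace the interpreter would print if nothing caught the error."""
    try:
        parse_age(text)
    except ValueError:
        return traceback.format_exc()
    return ""


good = safe_parse_age("42")
bad = safe_parse_age("abc")
trace = describe_failure("abc")
```

The trace names the exception type and the line inside `parse_age` where it was raised, which is usually enough to locate the faulty input.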

1.3.2.2 Importance of Python:

Python was designed to be easy to understand and fun to use (its name comes from Monty Python,
which is why many beginner tutorials reference it). Fun is a great motivator, and since you'll be
able to build prototypes and tools quickly with Python, many find coding in Python a satisfying
experience. Thus, Python has gained popularity for being a beginner-friendly language, and it has
replaced Java as the most popular introductory language at top U.S. universities.

Easy to Understand

Being a very high level language, Python reads like English, which takes a lot of syntax-learning
stress off coding beginners. Python handles a lot of complexity for you, so it is very beginner-
friendly in that it allows beginners to focus on learning programming concepts and not have to
worry about too many details.

Very Flexible

As a dynamically typed language, Python is really flexible. This means there are no hard rules on
how to build features, and you'll have more flexibility solving problems using different methods
(though the Python philosophy encourages using the obvious way to solve things). Furthermore,
Python is also more forgiving of errors, so you'll still be able to compile and run your program
until you hit the problematic part.

Scalability: Not Easy to Maintain

Because Python is a dynamically typed language, the same thing can easily mean something
different depending on the context. As a Python app grows larger and more complex, this may
get difficult to maintain as errors will become difficult to track down and fix, so it will take
experience and insight to know how to design your code or write unit tests to ease
maintainability.

Slow

As a dynamically typed language, Python is slow because it is too flexible and the machine
would need to do a lot of referencing to make sure what the definition of something is, and this
slows Python performance down.

At any rate, there are alternatives such as PyPy that are faster implementations of Python. While
they might still not be as fast as Java, for example, they certainly improve the speed greatly.

Community
As you step into the programming world, you'll soon understand how vital support is, as the
developer community is all about giving and receiving help. The larger a community, the more
likely you'd get help and the more people will be building useful tools to ease the process of
development.

Career Opportunities
On AngelList, Python is the 2nd most demanded skill and also the skill with the highest average
salary offered.

With the rise of big data, Python developers are in demand as data scientists, especially since
Python can be easily integrated into web applications to carry out tasks that require machine
learning.

Future
According to the TIOBE index, Python is the 4th most popular programming language out of 100.
With the rise of Ruby on Rails and, more recently, Node.js, Python's usage as the main prototyping
language for backend web development has diminished somewhat, especially since it has a
fragmented MVC ecosystem. However, with big data becoming more and more important, Python
has become a skill that is more in demand than ever, especially since it can be integrated into web
applications.

As an open source project, Python is actively worked on with a moderate update cycle, pushing
out new versions every year or so to make sure it remains relevant. A programming language's
ability to stay relevant also depends on whether the language is getting new blood. In terms of
search volume from people interested in learning Python, it has skyrocketed to first place when
compared to other languages.

Benefits of Python:
• Presence of Third-Party Modules
• Extensive Support Libraries
• Open Source and Community Development
• Learning Ease and Support Available
• User-friendly Data Structures
• Productivity and Speed
• Highly Extensible and Easily Readable Language

1.3.2.3 Python Modules

Python allows us to store our code in files (also called modules). This is very useful for more
serious programming, where we do not want to retype a long function definition from the very
beginning just to change one mistake. In doing this, we are essentially defining our own modules,
just like the modules defined already in the Python library. To support this, Python has a way to
put definitions in a file and use them in a script or in an interactive instance of the interpreter.
Such a file is called a module; definitions from a module can be imported into other modules or
into the main module.

• NumPy: NumPy is a module for Python. The name is an acronym for "Numeric Python"
or "Numerical Python". NumPy enriches the Python programming language with powerful data
structures, implementing multi-dimensional arrays and matrices.
• OpenCV: OpenCV-Python is a library of Python bindings designed to solve computer
vision problems. OpenCV-Python makes use of NumPy, which is a highly optimized
library for numerical operations with a MATLAB-style syntax. All the OpenCV array
structures are converted to and from NumPy arrays.
• Keras: Keras is a minimalist Python library for deep learning that can run on top of Theano
or TensorFlow. It was developed to make implementing deep learning models as fast and
easy as possible for research and development.
• TensorFlow: TensorFlow is an open source artificial intelligence library that uses data flow
graphs to build models. It allows developers to create large-scale neural networks with many
layers. TensorFlow is mainly used for classification, perception, understanding,
discovering, prediction, and creation.

Jupyter notebook:
The Jupyter Notebook is an open source web application that you can use to create and share
documents that contain live code, equations, visualizations, and text. Jupyter Notebook is
maintained by the people at Project Jupyter. Jupyter Notebooks are powerful, versatile, shareable,
and provide the ability to perform data visualization in the same environment. Jupyter Notebooks
allow data scientists to create and share their documents, from code to full-blown reports.

Advantages of Jupyter Notebook:


1. All in one place: Jupyter Notebook is an open-source, web-based interactive
environment that combines code, text, images, videos, mathematical equations, plots, maps,
graphical user interfaces, and widgets in a single document.
2. Easy to convert: Jupyter Notebook allows users to convert notebooks into other formats
such as HTML and PDF. Online tools such as nbviewer also allow you to render a
publicly available notebook directly in the browser.
3. Easy to share: Jupyter Notebooks are saved as structured text files (JSON format), which
makes them easily shareable.
4. Language independent: Jupyter Notebook is platform-independent because it is represented
in JSON (JavaScript Object Notation) format, which is a language-independent, text-based
file format. Another reason is that the notebook can be processed by any programming language,
and can be converted to file formats such as Markdown, HTML, PDF, and others.
5. Interactive code: Jupyter Notebook uses the ipywidgets package, which provides many common
user interface controls for exploring code and data interactively.
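
Because a notebook is plain JSON, it can be inspected with nothing more than a JSON parser. The sketch below builds a tiny notebook in memory (its contents are invented) and extracts the source of its code cells; in practice the text would be read from an `.ipynb` file instead. The helper name `code_cells` is illustrative.

```python
import json

# A minimal notebook in the version-4 notebook format, built in memory.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Mask detector notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["threshold = 0.5\n", "print(threshold)"]},
    ],
})


def code_cells(text):
    """Return the source of every code cell in a notebook given as JSON text."""
    notebook = json.loads(text)
    sources = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            source = cell["source"]
            # "source" may be stored as one string or as a list of lines.
            sources.append(source if isinstance(source, str) else "".join(source))
    return sources


extracted = code_cells(notebook_text)
```

The same few lines would work from any language with a JSON library, which is what makes the format easy to share and convert.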

Visual Studio Code:
Visual Studio Code (VS Code) features a lightning-fast source code editor, perfect for day-to-day
use. With support for hundreds of languages, VS Code helps you be instantly productive with syntax
highlighting, bracket-matching, auto-indentation, box-selection, snippets, and more.

Benefits of Visual Studio Code:
1. Cross-platform support: Windows, Linux, Mac
2. Lightweight
3. Robust architecture
4. IntelliSense
5. Freeware: it is free of cost, probably the best feature of all for programmers, and even
more so for organizations.
6. Many users use it, or have used it, for desktop applications only, but it also
provides great tool support for web technologies such as HTML, CSS, and JSON.

Google Colab:
Colaboratory, or "Colab" for short, is a product from Google Research. Colab allows anybody to
write and execute arbitrary Python code through the browser, and is especially well suited to
machine learning, data analysis, and education.

Benefits of Google Colab:

• Sharing: You can share your Google Colab notebooks very easily. With Google Colab,
anyone with a Google account can copy the notebook to their own Google Drive
account. There is no need to install modules to run the code, since common modules come
preinstalled within Google Colab.
• Versioning: You can save your notebook to GitHub with one click of a button.
There is no need to type "git add", "git commit", "git push", or "git pull" commands in your
command-line client (if you already use versioning).
• Code snippets: Google Colab has a great collection of snippets you can plug into your
code. For example, if you want to write data to a Google Sheet automatically, there's a snippet
for it in the Google Library.
• Forms for non-technical users: Not only programmers have to analyze data, and Python can
be useful for almost everyone in an office job. The problem is that non-technical people are
often afraid of making even the tiniest change to the code. Google Colab has a
solution for that: insert the comment #@param {type:"string"} and you turn any variable
field into an easy-to-use form input field. Your non-technical user only needs to change form
fields, and Google Colab will automatically update the code. You can find more info
on https://colab.research.google.com/notebooks/forms.ipynb
• Performance: Use the computing power of the Google servers instead of your own machine.
Running Python scripts often requires a lot of computing power and can take time. By
running scripts in the cloud, you don't need to worry: your local machine's performance won't
drop while executing your Python scripts.
• Price: Best of all, it's free.
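
As an illustration of the form feature mentioned in the list above, a Colab cell might look like the sketch below. The variable names and values are hypothetical; the `#@title` and `#@param` annotations are ordinary comments, so the cell also runs unchanged outside Colab, where no form is shown.

```python
#@title Mask detector settings
confidence_threshold = 0.5  #@param {type:"number"}
label_for_masked = "Mask"  #@param {type:"string"}
use_tracking = True  #@param {type:"boolean"}


def label(p_mask):
    """Map a classifier score to a label using the form values above."""
    return label_for_masked if p_mask >= confidence_threshold else "No Mask"


masked = label(0.8)
unmasked = label(0.2)
```

In Colab, each annotated line is rendered as an input field next to the cell, and editing the field rewrites the value on that line.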

Chapter 2

Literature Survey
Initially, researchers focused on the edges and gray values of the face image. This approach was
based on a pattern recognition model with prior information about the face model. AdaBoost was a
good training classifier. Face detection technology got a breakthrough with the famous Viola-Jones
detector, which greatly improved real-time face detection. The Viola-Jones detector optimized
Haar features, but failed to tackle real-world problems and was influenced by factors
such as face brightness and face orientation. It could only detect frontal, well-lit faces, and it
failed to work well in dark conditions and with non-frontal images. These issues led
independent researchers to develop new face detection models based on deep learning,
to obtain better results under different facial conditions. We have developed our face detection
model using a Convolutional Neural Network (CNN), so that it can detect the face in any
geometric condition, frontal or non-frontal. Convolutional Neural Networks have
long been used for image classification tasks.

[1] Single-stage Detectors:


The single-stage detectors treat detection as a simple regression problem: they take the input
image and directly learn the class probabilities and bounding box coordinates, without a separate
region proposal step. OverFeat and DeepMultiBox were early examples. YOLO (You Only Look Once)
popularized the single-stage approach by demonstrating real-time predictions and achieving
remarkable detection speed, but it suffered from low localization accuracy compared with
two-stage detectors, especially for small objects. The YOLO network divides an image into a grid
of size G×G, and each grid cell generates N predictions for bounding boxes. Each grid cell is
limited to predicting only one class, which restricts the network from finding smaller objects.
YOLO was later improved to YOLOv2, which added batch normalization, a high-resolution classifier,
and anchor boxes. YOLOv3 builds upon YOLOv2 with an improved backbone classifier, multi-scale
prediction, and a new network for feature extraction. Although YOLOv3 executes faster than the
Single-Shot Detector (SSD), it does not perform well in terms of classification accuracy.
Moreover, YOLOv3 requires a large amount of computational power for inference, making it
unsuitable for embedded or mobile devices. SSD networks perform better than YOLO due to small
convolutional filters, multiple feature maps, and prediction at multiple scales. The key
difference between the two architectures is that YOLO utilizes two fully connected layers,
whereas the SSD network uses convolutional layers of varying sizes. RetinaNet, proposed by Lin et
al., is also a single-stage object detector. It uses a feature pyramid network and focal loss to
detect dense objects in the image across multiple layers, and achieves remarkable accuracy as
well as speed comparable to two-stage detectors.
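The grid-based prediction scheme described above for YOLO can be sketched in a few lines. This is a simplified illustration rather than the actual YOLO implementation: the function names are hypothetical, object centers are assumed to be normalized to the range [0, 1), and G = 7 with N = 2 is used only as an example setting.

```python
def grid_cell(cx, cy, grid_size):
    """Return (row, col) of the grid cell that contains a normalized center (cx, cy)."""
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    return row, col


def total_boxes(grid_size, boxes_per_cell):
    """Number of bounding boxes predicted for one image: G * G * N."""
    return grid_size * grid_size * boxes_per_cell


# An object centered in the middle of the image falls into the central cell.
print(grid_cell(0.5, 0.5, 7))
# With G = 7 and N = 2 the network emits G * G * N box predictions per image.
print(total_boxes(7, 2))
```

Because every object is assigned to the single cell that contains its center, two small objects whose centers fall into the same cell compete for that cell's predictions, which is one way to see why small, densely packed objects are difficult for this design.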

[2] Two-stage Detectors:


In contrast to single-stage detectors, two-stage detectors follow a long line of reasoning in
computer vision for the prediction and classification of region proposals. They first predict
proposals in an image and then apply a classifier to these regions to classify potential
detections. Various two-stage region proposal models have been proposed by researchers in the
past. The region-based convolutional neural network, abbreviated as R-CNN, was described in 2014
by Ross Girshick et al. It may have been one of the first large-scale applications of CNNs to the
problem of object localization and recognition. The model was successfully demonstrated on
benchmark datasets such as VOC-2012 and ILSVRC-2013 and produced state-of-the-art results. R-CNN
applies a selective search algorithm to extract a set of object proposals at an initial stage and
applies an SVM (Support Vector Machine) classifier to predict objects and their classes at a
later stage. SPPNet (spatial pyramid pooling network) modifies R-CNN with an SPP layer that
collects features from the various region proposals and feeds them into a fully connected layer
for classification. The capability of SPPNet to compute feature maps of the entire image in a
single shot improved object detection speed by nearly 20 times compared with R-CNN. Fast R-CNN is
an extension of R-CNN and SPPNet. It introduces a new layer, the Region of Interest (RoI) pooling
layer, between the shared convolutional layers to fine-tune the model. Moreover, it allows a
detector and a regressor to be trained simultaneously without altering the network
configuration. Although Fast R-CNN effectively integrates the benefits of R-CNN and SPPNet, it
still lacks detection speed compared to single-stage detectors. Faster R-CNN is an amalgam of
Fast R-CNN and a Region Proposal Network (RPN). It enables nearly cost-free region proposals by
gradually integrating the individual blocks of the object detection system (e.g. proposal
detection, feature extraction, and bounding box regression) in a single step. Although this
integration breaks through the speed bottleneck of Fast R-CNN, there is still computational
redundancy at the subsequent detection stage. The Region-based Fully Convolutional Network
(R-FCN) is fully convolutional, which allows complete backpropagation for training and
inference. Feature Pyramid Networks (FPN) can detect non-uniform objects but are less used by
researchers due to high computation cost and memory usage. Furthermore, Mask R-CNN strengthens
Faster R-CNN by including the prediction of segmentation masks on each RoI. Although two-stage
detectors yield high object detection accuracy, they are limited by low inference speed for
real-time video surveillance.

Chapter 3
Problem Statement

Introduction: Face recognition is one of the well-studied real-life problems. Excellent progress
has been made in face recognition technology over the last years. Face alterations and the
presence of different masks make the task much more challenging. The primary concern of this work
is facial masks, and especially enhancing the recognition accuracy for different masked faces.
Face mask detection has turned out to be a significant problem in the domain of image
processing and computer vision.

3.1 Existing System


There are several recent works on facial mask recognition using different frameworks. Most
of the existing methods are based on Fully Convolutional Networks. In the existing system, a
Fully Convolutional Network is used to detect faces, and facial feature extraction is then
performed using a semantic segmentation model.

3.1.1 Disadvantages of Existing System

• A Fully Convolutional Network is a significantly slower operation than, say, max pooling, in
both the forward and the backward pass. If the network is deep, each training step takes
much longer.
• The network is too slow and complicated if all that is needed is a good pre-trained model.

3.2 Proposed System


In this model, our aim is to detect whether or not a person is wearing a mask. Our proposed
system analyzes people's faces using a Convolutional Neural Network (CNN) architecture in two
stages. In the first stage, the system detects the faces in the input image. The detected faces
are cropped, normalized to a size of 150×150, and used as input to the CNN. The second stage
uses a lightweight image classifier to classify the faces detected in the first stage as either
'Mask' or 'No Mask' faces, and draws bounding boxes around them along with the detected class
name.
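The hand-off between the two stages, cropping a detected face and bringing it to the 150×150 input size, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name is hypothetical, a nearest-neighbour resize written in NumPy stands in for a library resize call, and scaling the pixel values to [0, 1] is an assumption that the report does not state.

```python
import numpy as np


def crop_and_normalize(image, box, size=150):
    """Crop a face box (x, y, w, h) from an H x W x 3 image and resize it to size x size.

    Uses nearest-neighbour sampling; pixel values are scaled to [0, 1] (an assumption).
    """
    x, y, w, h = box
    face = image[y:y + h, x:x + w]
    # Map each output row/column back to a source row/column of the crop.
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    resized = face[rows][:, cols]
    return resized.astype(np.float32) / 255.0


# A synthetic white frame with a hypothetical face box.
frame = np.full((240, 320, 3), 255, dtype=np.uint8)
face_input = crop_and_normalize(frame, box=(100, 60, 80, 90))
print(face_input.shape)
```

The resulting array has the fixed shape that the second-stage classifier expects, regardless of how large the detected face was in the original frame.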

3.2.1 Advantages of Proposed System

• The proposed system uses only a Convolutional Neural Network (CNN) model to detect
human faces.
• Feature extraction performs better than in the existing system, and computation is very
fast. The power of a CNN is that it detects distinct features from images by itself, without
human intervention.

Chapter 4

Dataset Description

In our research, we worked with the covid-face-mask-detection-dataset, which we obtained
from the Kaggle website.

Kaggle allows users to find and publish data sets, explore and build models in a web-based data-
science environment, work with other data scientists and machine learning engineers, and enter
competitions to solve data science challenges. Kaggle has over 50,000 public datasets and 400,000
public notebooks.

Dataset                                     No. of faces with masks    No. of faces without masks
Kaggle-covid-face-mask-detection-dataset    503                        503

Fig 1: Input data to the model

Chapter 5

Analysis and Design

Object-oriented analysis and design (OOAD) is a software engineering approach that models a
system as a group of interacting objects. Each object represents an entity of interest in the system
being modeled and is characterized by its class, its state (data elements), and its behaviors. Various
models can be created to show the static structure, dynamic behaviors, and runtime deployment of
these collaborating objects. There are a number of different notations for representing these
models, such as the Unified Modelling Language (UML).
Object-oriented analysis (OOA) applies object modeling techniques to analyze the functional
requirements for a system. Object-oriented design (OOD) elaborates the analysis models to
produce implementation specifications. OOA focuses on what the system does, OOD on how the
system does it.

Design: The most creative and challenging phase of the life cycle is system design. The term
design describes a final system and the process by which it is developed. It refers to the
technical specifications that will be applied in the implementation of the candidate system.
Design may be defined as "the process of applying various techniques and principles for the
purpose of defining a device, a process or a system in sufficient detail to permit its physical
realization". The resulting design is documented and evaluated by management as a step towards
implementation.

The importance of software design can be stated in a single word: "Quality". Design provides
us with a representation of software that can be assessed for quality. Design is the only way
we can accurately translate customers' requirements into a complete software product or system.
Without design, we risk building an unstable system that might fail if small changes are made. It
may also be difficult to test, or its quality may not be testable at all. Design is therefore an
essential facet in the development of software products.

5.1 Architecture :

Fig 2: CNN Architecture

In this model, our aim is to detect whether or not a person is wearing a mask. Face mask
detection has turned out to be a significant problem in the domain of image processing and
computer vision. Face detection has various use cases ranging from face recognition to capturing
facial motions, where the latter requires the face to be located with very high precision.
Despite the rapid advancement of machine learning algorithms, face mask detection does not yet
seem to be well addressed. This technology is more relevant today because it is used to detect
faces in both static images and videos. The process is represented graphically as a flow
diagram.

5.2 Flow Diagram:

Step 1: Import all the required libraries and packages.

Step 2: Train the model on images of faces with masks and without masks. After training is
complete, validate and test the model for accuracy, and finally save the model file for
prediction.
Step 3: Run the video and detect faces in each frame.
Step 4: If a human face is detected, the image is re-scaled; if it is not a human face, the
condition breaks.
Step 5: After re-scaling, the image is converted to greyscale, checked against the face
detection values, and a rectangle is displayed around the face on the screen.
Step 6: After face detection is complete, the model predicts whether the face has a mask or not.
Step 7: If the person wears a mask, the display shows a green rectangle with the label "Mask"
on the screen. If the person does not wear a mask, the display shows a red rectangle with the
label "No-mask" on the screen.

Fig 3: Working flow chart
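The display rule of the last step can be sketched as a small helper. This is only an illustration: the function name and the 0.5 threshold are assumptions, the classifier score is assumed to be the probability of the "No-mask" class, and colours are given in the BGR order used by OpenCV.

```python
GREEN = (0, 255, 0)  # BGR colour for a face with a mask
RED = (0, 0, 255)    # BGR colour for a face without a mask


def label_and_color(no_mask_score, threshold=0.5):
    """Map a classifier score to the label and rectangle colour shown on screen.

    Assumes the score is the probability of the "No-mask" class.
    """
    if no_mask_score < threshold:
        return "Mask", GREEN
    return "No-mask", RED


print(label_and_color(0.1))
print(label_and_color(0.9))
```

Keeping this decision in one place makes it easy to keep the rectangle colour and the text label consistent with each other.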

5.3 UML DIAGRAMS:

UML, short for Unified Modeling Language, is a standardized modeling language consisting
of an integrated set of diagrams, developed to help system and software developers specify,
visualize, construct, and document the artifacts of software systems, as well
as for business modeling and other non-software systems. The UML represents a collection of
best engineering practices that have proven successful in the modeling of large and complex
systems. The UML is a very important part of developing object-oriented software and the
software development process. The UML uses mostly graphical notations to express the design
of software projects. Using the UML helps project teams communicate, explore potential designs,
and validate the architectural design of the software.

The goal of UML is to provide a standard notation that can be used by all object-oriented
methods and to select and integrate the best elements of precursor notations. UML has been
designed for a broad range of applications. Hence, it provides constructs for a broad range of
systems and activities.

5.3.1 Use case Diagram:

A UML use case diagram is the primary form of system/software requirements for a new
software program under development. Use cases specify the expected behavior (what), and not the
exact method of making it happen (how). Once specified, use cases can be denoted by both textual
and visual representations (i.e., a use case diagram).

A key concept of use case modeling is that it helps us design a system from the end user's
perspective. It is an effective technique for communicating system behavior in the user's terms by
specifying all externally visible system behavior.

Fig 4: use case diagram

5.3.2 Sequence Diagram:

UML sequence diagrams are interaction diagrams that detail how operations are carried out.
They capture the interaction between objects in the context of a collaboration. Sequence diagrams
are time focused: they show the order of the interaction visually by using the vertical axis of
the diagram to represent time, indicating what messages are sent and when.

Fig 5: sequence diagram

Sequence diagrams capture:

• the interaction that takes place in a collaboration that realizes either a use case or an
operation (instance diagrams or generic diagrams)
• high-level interactions between the user of the system and the system, between the
system and other systems, or between subsystems (sometimes known as system
sequence diagrams)

5.3.3 Class Diagram:

In software engineering, a class diagram in the UML is a type of static structure diagram that
describes the structure of a system by showing the system's classes, their attributes, operations (or
methods), and the relationships among objects.

A UML class diagram is made up of:

• A set of classes, and
• A set of relationships between classes

The user performs an operation called giving input to the system. The system trains and tests the
model and predicts the result based on the inputs given to it and displays the result to the user.

Fig 6: class diagram

5.4 Modules of CNN:

CONVOLUTIONAL NEURAL NETWORKS: (CNN)

A Convolutional Neural Network, also known as a CNN, is a class of deep learning model that is
mostly used for the analysis of visual imagery. A CNN is a type of deep feedforward artificial
neural network (ANN). The network is trained on a supplied labeled dataset and then predicts the
labels to be assigned to new inputs. Its architecture also helps it cope with the curse of
dimensionality. Some of the areas where CNNs are widely used are image recognition, image
classification, image captioning, and object detection. CNNs gained immense popularity after
AlexNet, by Alex Krizhevsky et al., won the ImageNet challenge in 2012. In just three years,
engineers advanced from the 8-layer AlexNet to the 152-layer ResNet. CNNs also come in handy for
tasks such as recommendation systems and natural language processing (NLP), where contextual
importance is considered. The key task of the neural network is to process all the layers and
hence detect all the underlying features automatically. A CNN is a convolution tool that
separates the different features of the picture for analysis and prediction.

• A convolutional neural network is a type of artificial neural network that has so far been
most popularly used for analyzing images.
• A CNN has hidden layers called convolutional layers, and these layers are precisely what
make it a CNN.
• Convolutional layers are able to detect patterns precisely.
• In a CNN model, an input image passes through the convolutional layers and is
transformed into the output.

A CNN typically has three types of layers:

1. Convolutional layer
2. Pooling layer
3. Fully connected layer.

1. Convolution Layer: The convolution layer is the core building block of the CNN. It carries
the main portion of the network's computational load. This layer performs a dot product
between two matrices, where one matrix is the set of learnable parameters, otherwise known as
a kernel, and the other matrix is the restricted portion of the receptive field. The kernel is
spatially smaller than an image but extends through its full depth.

Fig:7 Convolutional Layer

During the forward pass, the kernel slides across the height and width of the image, producing
the image representation of that receptive region. This produces a two-dimensional
representation of the image known as an activation map, which gives the response of the kernel
at each spatial position of the image. The sliding size of the kernel is called the stride. If we
have an input of size W × W × D and Dout kernels with a spatial size of F, stride
S, and amount of padding P, then the size of the output volume can be determined by the
following formula:

$$W_{out} = \frac{W - F + 2P}{S} + 1$$

This yields an output volume of size Wout × Wout × Dout.
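The standard relation for the output size of a convolution is Wout = (W − F + 2P) / S + 1. The sketch below evaluates it for a 150×150 input with a 3×3 kernel, as used in this report. The function name is hypothetical, and a stride of 1 with no padding is an assumption, since the report does not state these values.

```python
def conv_output_size(w, f, s=1, p=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    span = w - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("kernel, stride and padding do not tile the input evenly")
    return span // s + 1


# 150x150 input, 3x3 kernel, stride 1, no padding (stride and padding are assumed).
print(conv_output_size(150, 3, s=1, p=0))
# Padding of 1 keeps the spatial size unchanged for a 3x3 kernel.
print(conv_output_size(150, 3, s=1, p=1))
```

The depth of the output volume does not come from this formula: it equals the number of kernels, Dout.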

2. Pooling layer: The pooling layer replaces the output of the network at certain locations by
deriving a summary statistic of the nearby outputs. This helps in reducing the spatial size of
the representation, which decreases the required amount of computation and weights. The
pooling operation is processed on every slice of the representation individually.
There are several pooling functions, such as the average of the rectangular
neighborhood, the L2 norm of the rectangular neighborhood, and a weighted average based on
the distance from the central pixel. However, the most popular process is max pooling,
which reports the maximum output from the neighborhood.

Fig:8 Pooling
If we have an activation map of size W × W × D, a pooling kernel of spatial size F, and stride S,
then the size of the output volume can be determined by the following formula:

$$W_{out} = \frac{W - F}{S} + 1$$

This yields an output volume of size Wout × Wout × D.
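Max pooling on a single slice of the activation map can be sketched in plain Python as follows. The function name and the example values are illustrative. The output size follows the standard relation Wout = (W − F) / S + 1.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max pooling over one 2D slice given as a list of rows."""
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    return [
        [
            # Maximum over the size x size window whose top-left corner is (r*stride, c*stride).
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


activation = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 5, 6],
]
pooled = max_pool_2d(activation)
print(pooled)  # [[6, 4], [7, 9]]
```

A 4×4 slice pooled with a 2×2 window and stride 2 becomes 2×2, which matches (4 − 2) / 2 + 1 = 2.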

3. Fully Connected Layer: Neurons in this layer have full connectivity with all neurons in the
preceding and succeeding layers, as in a regular fully connected neural network (FCNN). This is
why it can be computed as usual by a matrix multiplication followed by a bias offset. The FC
layer helps to map the representation between the input and the output.
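The "matrix multiplication followed by a bias offset" can be illustrated with a small sketch in plain Python. The function name and the numbers are hypothetical, and no activation function is applied here.

```python
def dense_forward(inputs, weights, bias):
    """Fully connected layer: one weighted sum plus bias per output neuron.

    weights[k] holds the weights connecting every input to output neuron k.
    """
    return [
        sum(x * w for x, w in zip(inputs, neuron_weights)) + b
        for neuron_weights, b in zip(weights, bias)
    ]


features = [1.0, 2.0, 3.0]       # flattened features from the previous layer
weights = [
    [0.5, -1.0, 0.0],            # weights of output neuron 0
    [1.0, 1.0, 1.0],             # weights of output neuron 1
]
bias = [0.5, -1.0]
outputs = dense_forward(features, weights, bias)
print(outputs)  # [-1.0, 5.0]
```

Every output neuron sees every input value, which is what distinguishes this layer from a convolutional layer, where each output depends only on a small receptive field.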

Designing a Convolutional Neural Network:

Our convolutional neural network has architecture as follows:

[INPUT]

→ [CONV 1] → [BATCH NORM] → [ReLU] → [POOL 1]

→ [CONV 2] → [BATCH NORM] → [ReLU] → [POOL 2]

→ [FC LAYER] → [RESULT]

5.5 CNN Algorithm Description:

• Import the libraries and read the CSV file.
• Get the training features X and labels y from the pixels of the CSV, and convert them into
NumPy arrays. We also add an additional dimension to our feature vector using the
np.expand_dims() function; this makes the input suitable for the CNN that we design later.
Both features and labels are stored as .npy files to be used later.
• Import the required libraries for the CNN.
• The images have a resolution of 150×150 pixels, so the width and height are both 150. We
process our inputs with a batch size of 64.
• Load the features and labels into x and y respectively.
• Divide the data into training and testing sets, and save the test features and labels to be
used later.
• Sequential() - A sequential model is just a linear stack of layers: layers are placed on
top of each other as we progress from the input layer to the output layer.
• model.add(Conv2D()) - This is a 2D convolutional layer, which performs the
convolution operation described in Section 5.4. To quote the Keras
documentation: "This layer creates a convolution kernel that is convolved with the
layer input to produce a tensor of outputs." Here we use a 3x3 kernel size and the
Rectified Linear Unit (ReLU) as our activation function.
• model.add(MaxPooling2D()) - This layer performs the pooling operation on the
data as explained in Section 5.4. We take a pooling window of 2x2
with 2x2 strides in this model. More details on max pooling can be found in the Keras
documentation.
• model.add(Dropout()) - Dropout is a technique where randomly
selected neurons are ignored during training. They are "dropped out" randomly.
This reduces overfitting.
• model.add(Flatten()) - This flattens the input from ND to 1D and does not affect
the batch size.
• model.add(Dense()) - According to the Keras documentation, Dense implements the
operation output = activation(dot(input, kernel) + bias), where activation is the element-wise
activation function passed as the activation argument, kernel is a weights matrix
created by the layer, and bias is a bias vector created by the layer. In simple words, it is
the final layer, which uses the features learned by the preceding layers and maps them to
the label.

Fig 9: CNN Model Dataset Summary
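The layers described above can be combined into a Keras model along the following lines. This is a hedged sketch, not the exact model behind the figure: the filter counts (32 and 64), the dropout rate, the single sigmoid output, and the Adam optimizer are assumptions, since the report does not list them. The layer order follows the architecture given in Section 5.4.

```python
from tensorflow.keras import layers, models


def build_mask_classifier(input_shape=(150, 150, 3)):
    """Two CONV -> BATCH NORM -> ReLU -> POOL blocks followed by a dense output layer."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3)),            # 32 filters is an assumed value
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Conv2D(64, (3, 3)),            # 64 filters is an assumed value
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Dropout(0.5),                  # assumed dropout rate
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),  # assumed binary output: Mask vs. No Mask
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


model = build_mask_classifier()
model.summary()
```

With 3x3 kernels, no padding, and 2x2 pooling, the spatial size shrinks from 150 to 148, 74, 72, and finally 36 before the feature maps are flattened.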

Chapter 6

Implementation

Implementation is the stage of the project when the theoretical design is turned into a
working system. It can therefore be considered the most critical stage in achieving a successful
new system and in giving the user confidence that the new system will work and be effective.

The implementation stage involves careful planning, investigation of the existing system and its
constraints on implementation, design of methods to achieve changeover, and evaluation of
changeover methods.

6.1 Importing all the required packages

6.2 Loading a dataset

6.3 Defining directories

6.4 Importing the Matplotlib library

6.5 Classification of images into classes

6.6 Model Summary

6.7 Validating data

6.8 Plotting chart for Training and Validation Loss

6.9 Plotting chart for Training and Validation Accuracy

6.10 Evaluation of Model and print test loss and test accuracy

6.11 Facial Mask Prediction

Mask.py: (face-mask detection)

# Import libraries
import os
import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Load the trained model (model.h5)
model = load_model('model.h5')
img_width, img_height = 150, 150
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Provide a video as input to the model
cap = cv2.VideoCapture('video.mp4')
os.makedirs('input', exist_ok=True)  # cv2.imwrite does not create missing folders
img_count_full = 0
font = cv2.FONT_HERSHEY_SIMPLEX
fontScale = 1
thickness = 2

while True:
    img_count_full += 1
    response, color_img = cap.read()
    if not response:  # end of the video or read error
        break

    # Downscale the frame to 50% to speed up detection
    scale = 50
    width = int(color_img.shape[1] * scale / 100)
    height = int(color_img.shape[0] * scale / 100)
    dim = (width, height)
    color_img = cv2.resize(color_img, dim, interpolation=cv2.INTER_AREA)
    gray_img = cv2.cvtColor(color_img, cv2.COLOR_BGR2GRAY)

    # Stage 1: detect faces with the Haar cascade
    faces = face_cascade.detectMultiScale(gray_img, 1.1, 6)
    img_count = 0
    for (x, y, w, h) in faces:
        org = (x - 10, y - 10)
        img_count += 1
        color_face = color_img[y:y + h, x:x + w]

        # Save the crop, then reload it at the input size of the model
        face_path = 'input/%d%dface.jpg' % (img_count_full, img_count)
        cv2.imwrite(face_path, color_face)
        img = load_img(face_path, target_size=(img_width, img_height))
        img = img_to_array(img)
        img = np.expand_dims(img, axis=0)

        # Stage 2: classify the face. Assumes a single sigmoid output where
        # values below 0.5 correspond to class 0 ("Mask").
        prediction = model.predict(img)
        if prediction[0][0] < 0.5:
            class_label = "Mask"
            color = (0, 255, 0)  # green in BGR
        else:
            class_label = "No Mask"
            color = (0, 0, 255)  # red in BGR
        cv2.rectangle(color_img, (x, y), (x + w, y + h), color, 3)
        cv2.putText(color_img, class_label, org, font, fontScale, color, thickness, cv2.LINE_AA)

    cv2.imshow('Face mask detection', color_img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Chapter 7

Testing & Experimental result analysis

7.1 TYPES OF TESTS:

UNIT TESTING:

Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly, and that program inputs produce valid outputs. All decision branches and
internal code flow should be validated. It is the testing of individual software units of the
application, and is done after the completion of an individual unit and before integration. This
is structural testing that relies on knowledge of the unit's construction and is invasive. Unit
tests perform basic tests at component level and test a specific business process, application,
and/or system configuration.

INTEGRATION TESTING:

Integration tests are designed to test integrated software components to determine whether they
actually run as one program. Testing is event driven and is more concerned with the basic outcome
of screens or fields. Integration tests demonstrate that, although the components were
individually satisfactory, as shown by successful unit testing, the combination of components is
also correct and consistent. Integration testing is specifically aimed at exposing the problems
that arise from the combination of components.

VALIDATION TESTING:

An engineering validation test (EVT) is performed on first engineering prototypes to ensure that
the basic unit performs to design goals and specifications. Identifying design problems and
solving them as early in the design cycle as possible is the key to keeping projects on
time and within budget. Too often, product design and performance problems are not detected
until late in the product development cycle, when the product is ready to be shipped. The old
adage holds true: it costs a penny to make a change in engineering, a dime in production and a
dollar after a product is in the field.

Verification is a Quality control process that is used to evaluate whether or not a product, service,
or system complies with regulations, specifications, or conditions imposed at the start of a
development phase. Verification can be in development, scale-up, or production. This is often an
internal process. Validation is a Quality assurance process of establishing evidence that provides
a high degree of assurance that a product, service, or system accomplishes its intended
requirements. This often involves acceptance of fitness for purpose with end users and other
product stakeholders.

The testing process overview is as follows:

Fig: The Testing Process

SYSTEM TESTING:

System testing of software or hardware is testing conducted on a complete, integrated system to
evaluate the system's compliance with its specified requirements. System testing falls within the
scope of black box testing, and as such, should require no knowledge of the inner design of the
code or logic.

As a rule, system testing takes, as its input, all of the "integrated" software components that have
successfully passed integration testing and also the software system itself integrated with any
applicable hardware system.

System testing is a more limited type of testing; it seeks to detect defects both within the "inter-
assemblages" and also within the system as a whole. System testing is performed on the entire
system in the context of a Functional Requirement Specification (FRS) or System Requirement
Specification (SRS).

TESTCASE: Test Case for Prediction Result

Serial Number of Test Case    TC 01
Module Under Test             Predict facial masks
Description                   The user trains the model on the data and uses the model on
                              recorded or live video to predict whether there is a mask or not.
Input                         Live video stream data and trained model
Output                        Application detects facial masks and draws bounding boxes around
                              them along with the detected class name.
Remarks                       Test successful.

7.2 Experimental Evaluation:

To avoid the problem of overfitting, two major steps were taken. First, we performed data
augmentation. Second, the model accuracy was critically observed over 60 epochs for both the
training and testing phases. The observations are reported in the figures below:

Figure representing Training and Validation loss

Figure representing Training and Validation accuracy

Observation: Model accuracy keeps increasing over the epochs and stabilizes after epoch=3, as
depicted graphically in Fig. 2 above. To summarize the experimental results, the proposed model
achieves high accuracy in face and mask detection with less inference time and less memory
consumption compared to recent techniques. Significant effort was put into resolving the data
imbalance problem in the existing MAFA dataset, resulting in a new unbiased dataset that is
highly suitable for COVID-related mask detection tasks. The newly created dataset, the optimal
face detection approach, localization of the person's identity, and avoidance of overfitting
resulted in an overall system that can be easily installed in an embedded device at public
places to curtail the spread of Coronavirus.

7.3 EXPECTED OUTPUT:

Fig:10 Result after detecting face with and without mask

Chapter 8
Conclusion

A novel architecture for an economical face mask detection technology is proposed and
implemented in this paper. We presented a Convolutional Neural Network model for facial
mask recognition. The proposed model includes 2 convolutional layers and 2 max pooling layers.
The system recognizes faces in input images and classifies them as 'Mask' or 'No Mask'. Our
facial mask recognition system can be integrated with an image or video capturing device, such
as a CCTV camera, to track safety violations, promote the use of face masks, and ensure a safe
working environment. In our future work, we will focus on applying the Convolutional
Neural Network model to 3D face images in order to detect masks more accurately.

Future Work:
The future enhancements that can be projected for the project are:
• A more interactive user interface.
• Deployment as a mobile application.
• More detailed results applicable to real-time applications, such as in public
places.

47 | P a g e
REFERENCES

1. Roomi, Mansoor, Beham, M.Parisa, “A Review Of Face Recognition Methods,” in


International Journal of Pattern Recognition and Artificial Intelligence, 2013, 27(04),
p.1356005.
2. S. Syed navaz, t. Dhevi sri, Pratap mazumder, “ Face recognition using principal
component analysis and neural network,” in International Journal of Computer
Networking, Wireless and Mobile Communications (IJCNWMC), vol. 3, pp. 245-256,
Mar. 2013.
3. Turk, Matthew A., and Alex P. Pentland. "Face recognition using eigenfaces," in IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, 1991, pp.
586-591.
4. H. Li, Z. Lin, X. Shen, J. Brandit, and G. Hua, “A convolutional neural network cascade
for face detection,” in IEEE CVPR, 2015, pp.5325-5334.
5. Wei Bu, Jiangjian Xiao, Chuanhong Zhou, Minmin Yang, Chengbin Peng, “A Cascade
Framework for Masked Face Detection,” in IEEE International Conference on CIS &
RAM, Ningbo, China, 2017, pp.458-462.
6. Shiming Ge, Jia Li, Qiting Ye, Zhao Luo, “Detecting Masked Faces in the Wild with LLE-
CNNs,” in IEEE Conference on Computer Vision and Pattern Recognition, 2017,
pp. 2682-2690.
7. M. Opitz, G. Waltner, G. Poier, et al., “Grid Loss: Detecting Occluded Faces,” in ECCV,
2016, pp. 386-402.
8. X. Zhu and D. Ramanan, “Face Detection, pose estimation and landmark localization in
the wild,” in IEEE CVPR, 2012, pp.2879-2886.
9. Florian Schroff, Dmitry Kalenichenko, James Philbin, “FaceNet: A Unified Embedding
for Face Recognition and Clustering,” in The IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2015, pp. 815-823.
10. Ankan Bansal, Rajeev Ranjan, Carlos D. Castillo, Rama Chellappa, “Deep Features for
Recognizing Disguised Faces in the Wild,” in IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops, 2018.

