Face Mask Detection Project Report.
1 author:
Kamal Acharya
Tribhuvan University
All content following this page was uploaded by Kamal Acharya on 02 July 2025.
Date: 2025/07/02
ABSTRACT
Face recognition has become a popular and significant technology in recent years. In
real-world settings such as video surveillance, subjects are often uncooperative, and masked
faces are a common occurrence. Masks degrade the performance of current face recognition
systems, yet the difficulties they create are usually disregarded. Face recognition is a
promising area of applied computer vision: it is used to recognize a face or identify a person
automatically from given images. In daily life, face recognition is widely used for passport
checking, smart doors, access control, voter verification, criminal investigation, and many
other purposes that require a person to be authenticated correctly and automatically. It has
gained much attention as a unique, reliable biometric technology, which has made it more
popular than other authentication techniques such as passwords, PINs, and fingerprints.
The primary concern of this work is facial masks, and especially improving the recognition
accuracy for different masked faces. We propose an approach that first detects the facial
regions. The occluded face detection problem is addressed using a cascaded Convolutional
Neural Network (CNN). Its performance has also been evaluated on heavily masked faces, with
encouraging results. Finally, a comparative study is included for better understanding.
Chapter 1
Introduction
Face mask detection refers to detecting whether a person is wearing a mask or not. In fact, the
problem can be seen as the reverse of face detection, where the face is detected using different
machine learning algorithms for the purposes of security, authentication and surveillance. Face
detection is a key area in the field of Computer Vision and Pattern Recognition. A significant
body of research has contributed sophisticated algorithms for face detection in the past. The
early research on face detection, done in 2001, used handcrafted features and
traditional machine learning algorithms to train effective classifiers for detection
and recognition. The problems encountered with this approach include high complexity in feature
design and low detection accuracy. In recent years, face detection methods based on deep
convolutional neural networks (CNN) have been widely developed to improve detection
performance.
In this paper, we propose a two-stage CNN architecture, where the first stage detects human
faces, while the second stage uses a lightweight image classifier to classify the faces detected in
the first stage as either ‘Mask’ or ‘No Mask’ faces and draws bounding boxes around them along
with the detected class name.
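The two-stage design described above can be sketched as follows. This is a minimal illustration
under stated assumptions, not the project's implementation: `detect_faces` and `classify_face`
are hypothetical stand-ins for the stage-one face detector and the stage-two mask classifier,
and images are plain nested lists so that the sketch runs without a deep learning library.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]        # (x, y, width, height) in pixels
Image = Sequence[Sequence[int]]        # rows of grayscale pixel values


def crop(image: Image, box: Box) -> List[List[int]]:
    """Return the sub-image covered by `box`."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


def two_stage_mask_detection(
    image: Image,
    detect_faces: Callable[[Image], List[Box]],
    classify_face: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds face boxes; stage 2 labels each crop 'Mask' or 'No Mask'."""
    results = []
    for box in detect_faces(image):                # stage 1: face detection
        label = classify_face(crop(image, box))    # stage 2: mask classification
        results.append((box, label))
    return results


# Hypothetical stand-ins, used only to exercise the pipeline.
def fake_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 2, 2)]


def fake_classifier(face: Image) -> str:
    mean = sum(sum(row) for row in face) / (len(face) * len(face[0]))
    return "Mask" if mean > 127 else "No Mask"


demo_image = [
    [200, 200, 0, 0],
    [200, 200, 0, 0],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
detections = two_stage_mask_detection(demo_image, fake_detector, fake_classifier)
print(detections)
```

Passing the two stages in as callables keeps them independent, so either model could be
replaced without touching the other.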
This algorithm was extended to videos as well. The detected faces are then tracked between
frames using an object tracking algorithm, which makes the detection robust to noise. This
system can then be integrated with an image or video capturing device, such as a CCTV camera, to
track safety violations, promote the use of face masks, and ensure a safe working environment.
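The report does not name the tracking algorithm. As one possible illustration, the sketch below
assumes a simple greedy tracker that matches boxes between frames by intersection over union
(IoU); the function names and the 0.3 threshold are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) corner coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(
    tracks: Dict[int, Box], detections: List[Box], threshold: float = 0.3
) -> Dict[int, Box]:
    """Keep a track ID when a new box overlaps an old one; otherwise start a new ID."""
    updated: Dict[int, Box] = {}
    unmatched = dict(tracks)
    next_id = max(tracks, default=-1) + 1
    for det in detections:
        best_id, best_score = None, threshold
        for track_id, box in unmatched.items():
            score = iou(box, det)
            if score >= best_score:
                best_id, best_score = track_id, score
        if best_id is None:
            best_id, next_id = next_id, next_id + 1   # unseen face: new ID
        else:
            del unmatched[best_id]                    # each track is used once
        updated[best_id] = det
    return updated


previous = {0: (10, 10, 50, 50)}
current = match_tracks(previous, [(12, 11, 52, 51), (200, 200, 240, 240)])
print(current)
```

A face that moves slightly between frames keeps its ID, so a single noisy classification can be
smoothed over the track rather than shown immediately.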
While many machine learning algorithms have been around for a long time, the ability to
automatically apply complex mathematical calculations to big data – over and over, faster and
faster – is a recent development. Here are a few widely publicized examples of machine learning
applications you may be familiar with:
• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine learning
applications for everyday life.
• Knowing what customers are saying about you on Twitter? Machine learning combined
with linguistic rule creation.
• Fraud detection? One of the more obvious, important uses in our world today.
1.2.2 Importance of Machine Learning
Resurging interest in machine learning is due to the same factors that have made data mining and
Bayesian analysis more popular than ever: growing volumes and varieties of available
data, computational processing that is cheaper and more powerful, and affordable data storage.
All of these things mean it's possible to quickly and automatically produce models that can analyze
bigger, more complex data and deliver faster, more accurate results, even on a very large scale.
And by building precise models, an organization has a better chance of identifying profitable
opportunities or avoiding unknown risks.
Machine learning tasks
Machine learning tasks are typically classified into several broad
categories:
Supervised learning: The computer is presented with example inputs and their desired outputs, given
by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As special
cases, the input signal can be only partially available, or restricted to special feedback.
Semi-supervised learning: The computer is given only an incomplete training signal: a training set
with some (often many) of the target outputs missing.
Active learning: The computer can only obtain training labels for a limited set of instances (based
on a budget), and also has to optimize its choice of objects to acquire labels for. When used
interactively, these can be presented to the user for labelling.
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find
structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in
data) or a means towards an end (feature learning).
Reinforcement learning: Data (in the form of rewards and punishments) are given only as feedback to
the program's actions in a dynamic environment, such as driving a vehicle or playing a game against
an opponent.
Artificial Intelligence is everywhere. Chances are that you are using it in one way or another
without even knowing it. One of the popular applications of AI is Machine Learning, in which
computers, software, and devices perform tasks via cognition (very similar to the human brain).
Here we share a few applications of machine learning.
I. Image Recognition:
One of the most common uses of machine learning is image recognition. There are many situations
where an object can be classified from a digital image. For digital images, the measurements
describe the output of each pixel in the image.
• In the case of a black and white image, the intensity of each pixel serves as one
measurement. So if a black and white image has N*N pixels, the total number of pixels
and hence measurements is N².
• In a coloured image, each pixel provides 3 measurements: the intensities
of the 3 main colour components, i.e. RGB. So for an N*N coloured image there are 3N²
measurements.
• For face detection – The categories might be face versus no face present. There might be a
separate category for each person in a database of several individuals.
• For character recognition – We can segment a piece of writing into smaller images, each
containing a single character. The categories might consist of the 26 letters of the English
alphabet, the 10 digits, and some special characters.
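The pixel counts above can be checked with a short NumPy sketch. The image size N = 28 is an
arbitrary value chosen for illustration, and the all-zero arrays stand in for real images.

```python
import numpy as np

n = 28  # illustrative image side length, not a value from the report

gray = np.zeros((n, n), dtype=np.uint8)        # one intensity per pixel
colour = np.zeros((n, n, 3), dtype=np.uint8)   # three intensities (R, G, B) per pixel

# Flattening turns each image into the vector of measurements a classifier sees.
gray_features = gray.reshape(-1)
colour_features = colour.reshape(-1)

print(gray_features.shape, colour_features.shape)
```

The flattened lengths are N² and 3N² respectively, matching the counts given in the list.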
Speech recognition (SR) is the translation of spoken words into text. It is also known as “automatic
speech recognition” (ASR), “computer speech recognition”, or “speech to text” (STT). In speech
recognition, a software application recognizes spoken words. The measurements in this application
might be a set of numbers that represent the speech signal. We can segment the signal into portions
that contain distinct words or phonemes.
In each segment, we can represent the speech signal by the intensities or energy in different
time-frequency bands. Although the details of signal representation are outside the scope of this
report, we can represent the signal by a set of real values. Speech recognition applications
include voice user interfaces such as voice dialing, call routing, and domotic (home automation)
appliance control. It can also be used for simple data entry, preparation of structured documents,
speech-to-text processing, and in aircraft.
ML provides methods, techniques, and tools that can help solve diagnostic and prognostic
problems in a variety of medical domains. It is being used for the analysis of the importance of
clinical parameters and of their combinations for prognosis, e.g. prediction of disease progression,
for the extraction of medical knowledge for outcomes research, for therapy planning and support,
and for overall patient management. ML is also being used for data analysis, such as detection of
regularities in the data by appropriately dealing with imperfect data, interpretation
of continuous data used in the Intensive Care Unit, and intelligent alarming, resulting in
effective and efficient monitoring.
It is argued that the successful implementation of ML methods can help the integration of
computer-based systems in the healthcare environment providing opportunities to facilitate and
enhance the work of medical experts and ultimately to improve the efficiency and quality of
medical care. In medical diagnosis, the main interest is in establishing the existence of a disease
followed by its accurate identification. There is a separate category for each disease under
consideration and one category for cases where no disease is present. Here, machine learning
improves the accuracy of medical diagnosis by analyzing data of patients.
In finance, statistical arbitrage refers to automated trading strategies that are typically
short-term and involve a large number of securities. In such strategies, the user tries to implement
a trading algorithm for a set of securities on the basis of quantities such as historical
correlations and general economic variables. These measurements can be cast as a classification or
estimation problem. The basic assumption is that prices will move towards a historical average.
In the case of classification, the categories might be sell, buy or do nothing for each security. In
the case of estimation, one might try to predict the expected return of each security over a future
time horizon. In this case, one typically needs to use the estimates of the expected return to make
a trading decision (buy, sell, etc.).
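The classification view can be illustrated with a small sketch. It assumes a simple z-score rule
built on the mean-reversion assumption stated above; the function name, the entry threshold and
the price series are made up for this example and are not taken from the report.

```python
from statistics import mean, stdev
from typing import Sequence


def reversion_signal(prices: Sequence[float], entry_z: float = 1.0) -> str:
    """Classify the latest price as 'sell', 'buy' or 'do nothing'.

    The latest price is compared with the historical average of the
    earlier prices, measured in standard deviations (a z-score).
    """
    history, latest = prices[:-1], prices[-1]
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return "do nothing"          # no variation, so no signal
    z = (latest - mu) / sigma
    if z > entry_z:
        return "sell"                # price far above average, expected to fall back
    if z < -entry_z:
        return "buy"                 # price far below average, expected to recover
    return "do nothing"


high = reversion_signal([100, 101, 99, 100, 100, 106])
low = reversion_signal([100, 101, 99, 100, 100, 94])
flat = reversion_signal([100, 101, 99, 100, 100, 100.2])
print(high, low, flat)
```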
V. Learning Associations:
Learning associations is the process of developing insights into various associations between
products. A good example is how seemingly unrelated products may reveal an association with one
another when analyzed in relation to the buying behaviors of customers.
One application of machine learning is studying the associations between the products people
buy, which is also known as basket analysis. If a buyer buys ‘X’, is he or she more likely to buy
‘Y’ because of a relationship that can be identified between them? This leads to relationships
such as the one that exists between fish and chips. When new products launch in the market,
knowing these relationships helps in discovering new ones. Knowing these relationships could also
help in suggesting the associated product to the customer, for a higher likelihood of the customer
buying it. It can also help in bundling products for a better package.
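Basket analysis is usually quantified with support and confidence. The sketch below computes
both for the rule "fish implies chips"; the baskets are invented for illustration and the helper
names are assumptions, not part of the report.

```python
from typing import List, Set


def support(transactions: List[Set[str]], items: Set[str]) -> float:
    """Fraction of baskets that contain every item in `items`."""
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)


def confidence(
    transactions: List[Set[str]], antecedent: Set[str], consequent: Set[str]
) -> float:
    """Estimated probability of buying `consequent` given `antecedent` was bought."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


baskets = [
    {"fish", "chips"},
    {"fish", "chips", "peas"},
    {"fish"},
    {"chips", "cola"},
    {"cola"},
]
fish_support = support(baskets, {"fish"})
rule_confidence = confidence(baskets, {"fish"}, {"chips"})
print(fish_support, rule_confidence)
```

A high confidence for a rule suggests that the second product is worth recommending, or
bundling, whenever the first one is in the basket.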
IV. Classification:
Classification is the process of placing each individual from the population under study into one
of many classes. The classes are identified from independent variables. Classification helps
analysts to use measurements of an object to identify the category to which that object belongs. To
establish an efficient rule, analysts use data. The data consist of many examples of objects with
their correct classification.
For example, before a bank decides to disburse a loan, it assesses customers on their ability to
repay the loan. It can do this by considering factors such as the customer’s earnings, age, savings
and financial history. This information is taken from past loan data. The bank then uses it to
establish the relationship between customer attributes and the related risks.
VI. Prediction:
Consider the example of a bank computing the probability of any loan applicant defaulting on the
loan repayment. To compute the probability of default, the system will first need to classify the
available data into certain groups, described by a set of rules prescribed by the analysts. Once
the classification is done, we can compute the probability as needed. These probability
computations can be carried out across all sectors for varied purposes.
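The classify-then-estimate procedure can be sketched as below. The grouping rule
(`income_band`), the income cut-off and the loan records are all hypothetical and exist only to
make the example runnable.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def default_rate_by_group(
    records: List[dict], classify: Callable[[dict], str]
) -> Dict[str, float]:
    """Group past loans with `classify`, then estimate P(default) per group."""
    totals: Dict[str, int] = defaultdict(int)
    defaults: Dict[str, int] = defaultdict(int)
    for record in records:
        group = classify(record)                 # step 1: classification
        totals[group] += 1
        defaults[group] += int(record["defaulted"])
    # step 2: probability as the observed default frequency in each group
    return {group: defaults[group] / totals[group] for group in totals}


def income_band(record: dict) -> str:
    """Hypothetical analyst rule: split applicants at an income of 30,000."""
    return "low income" if record["income"] < 30000 else "high income"


past_loans = [  # made-up records for illustration only
    {"income": 20000, "defaulted": True},
    {"income": 25000, "defaulted": False},
    {"income": 28000, "defaulted": True},
    {"income": 29000, "defaulted": False},
    {"income": 60000, "defaulted": False},
    {"income": 80000, "defaulted": False},
    {"income": 45000, "defaulted": True},
    {"income": 90000, "defaulted": False},
]
rates = default_rate_by_group(past_loans, income_band)
print(rates)
```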
VII. Extraction:
Nowadays, information extraction is becoming key in the big data industry. A huge volume of
data is being generated, most of which is unstructured. The first key challenge is the
handling of unstructured data: it has to be converted to a structured form based on
some pattern so that it can be stored in an RDBMS. Apart from this, present-day data
collection mechanisms are also changing.
VIII. Regression:
We can apply machine learning to regression as well. Assume that x = x1, x2, x3, …, xn are the
input variables and y is the outcome variable. In this case, we can use machine learning
to produce the output (y) on the basis of the input variables (x).
A model expresses the relationship between these quantities as y = g(x), where g is a function
whose form depends on the chosen model.
In regression, we can use the principle of machine learning to optimize the parameters so as to
reduce the approximation error and calculate the closest possible outcome. We can also use machine
learning for function optimization: we can choose to alter the inputs to get a better model. This
gives a new and improved model to work with. This is known as response surface design.
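As a minimal sketch of optimizing parameters to reduce approximation error, the example below
fits a straight line y = slope * x + intercept by ordinary least squares. The choice of a
linear model and the data points are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def squared_error(xs, ys, slope, intercept) -> float:
    """Approximation error of the fitted line on the data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated from y = 2x + 1
slope, intercept = fit_line(xs, ys)
error = squared_error(xs, ys, slope, intercept)
print(slope, intercept, error)
```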
1.3.2 Software Requirements
Table 2 shows the basic (minimal) software requirements for the project.
1.3.2.1 Python:
Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic
binding, make it very attractive for Rapid Application Development, as well as for use as a
scripting or glue language to connect existing components together. Python's simple, easy to learn
syntax emphasizes readability and therefore reduces the cost of program maintenance. Python
supports modules and packages, which encourages program modularity and code reuse. The
Python interpreter and the extensive standard library are available in source or binary form without
charge for all major platforms, and can be freely distributed.
Often, programmers fall in love with Python because of the increased productivity it provides.
Since there is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python
programs is easy: a bug or bad input will never cause a segmentation fault. Instead, when the
interpreter discovers an error, it raises an exception. When the program doesn't catch the exception,
the interpreter prints a stack trace. A source level debugger allows inspection of local and global
variables, evaluation of arbitrary expressions, setting breakpoints, stepping through the code a line
at a time, and so on. The debugger is written in Python itself, testifying to Python's introspective
power. On the other hand, often the quickest way to debug a program is to
add a few print statements to the source: the fast edit-test-debug cycle makes this simple approach
very effective.
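The behaviour described above, an exception with a stack trace instead of a crash, can be seen
in a few lines. The function `mean_pixel` is a made-up example, not part of the project code.

```python
import traceback


def mean_pixel(values):
    """Average of a list of pixel values; fails on an empty list."""
    return sum(values) / len(values)


try:
    mean_pixel([])                      # bad input: division by zero
except ZeroDivisionError as exc:
    error_name = type(exc).__name__     # the interpreter raised an exception
    trace_text = traceback.format_exc() # the same text an uncaught error would print
    print(trace_text)
```

The printed trace names the file, the line and the function where the error occurred, which is
usually enough to locate the bug without a debugger.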
Python was designed to be easy to understand and fun to use (its name came from Monty Python
so a lot of its beginner tutorials reference it). Fun is a great motivator, and since you'll be able to
build prototypes and tools quickly with Python, many find coding in Python a satisfying
experience. Thus, Python has gained popularity for being a beginner-friendly language, and it has
replaced Java as the most popular introductory language at top U.S. universities.
Easy to Understand
Being a very high level language, Python reads like English, which takes a lot of syntax-learning
stress off coding beginners. Python handles a lot of complexity for you, so it is very beginner-
friendly in that it allows beginners to focus on learning programming concepts and not have to
worry about too many details.
Very Flexible
As a dynamically typed language, Python is really flexible. This means there are no hard rules on
how to build features, and you'll have more flexibility solving problems using different methods
(though the Python philosophy encourages using the obvious way to solve things). Furthermore,
Python is also more forgiving of errors, so you'll still be able to compile and run your program
until you hit the problematic part.
Scalability
Not Easy to Maintain
Because Python is a dynamically typed language, the same thing can easily mean something
different depending on the context. As a Python app grows larger and more complex, this may
get difficult to maintain as errors will become difficult to track down and fix, so it will take
experience and insight to know how to design your code or write unit tests to ease
maintainability.
Slow
As a dynamically typed language, Python is slow because it is too flexible and the machine
would need to do a lot of referencing to make sure what the definition of something is, and this
slows Python performance down.
At any rate, there are alternatives such as PyPy that are faster implementations of Python. While
they might still not be as fast as Java, for example, they certainly improve the speed greatly.
Community
As you step into the programming world, you'll soon understand how vital support is, as the
developer community is all about giving and receiving help. The larger a community, the more
likely you'd get help and the more people will be building useful tools to ease the process of
development.
Career Opportunities
On AngelList, Python is the 2nd most demanded skill and also the skill with the highest average
salary offered.
With the rise of big data, Python developers are in demand as data scientists, especially since
Python can be easily integrated into web applications to carry out tasks that require machine
learning.
Future
According to the TIOBE index, Python is the 4th most popular programming language out of 100.
With the rise of Ruby on Rails and more recently Node.js, Python's usage as the main prototyping
language for backend web development has diminished somewhat, especially since it has a
fragmented MVC ecosystem. However, with big data becoming more and more important, Python
has become a skill that is more in demand than ever, especially since it can be integrated into web
applications.
As an open source project, Python is actively worked on with a moderate update cycle, pushing
out new versions every year or so to make sure it remains relevant. A programming language's
ability to stay relevant also depends on whether the language is getting new blood. In terms of
search volume from people interested in learning it, Python has skyrocketed to 1st place
compared to other languages.
Benefits of Python:
Presence of Third-Party Modules
Extensive Support Libraries
Open Source and Community Development
Learning Ease and Support Available
User-friendly Data Structures
Productivity and Speed
Highly Extensible and Easily Readable Language.
Python allows us to store our code in files (also called modules). This is very useful for more
serious programming, where we do not want to retype a long function definition from the very
beginning just to change one mistake. In doing this, we are essentially defining our own modules,
just like the modules defined already in the Python library. To support this, Python has a way to
put definitions in a file and use them in a script or in an interactive instance of the interpreter.
Such a file is called a module; definitions from a module can be imported into other modules or
into the main module.
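A minimal sketch of defining and importing a module follows. The module name `boxes` and its
function are hypothetical. In normal use the file would simply be saved as `boxes.py` next to
the script and loaded with `import boxes`; the temporary file and `importlib` calls are used
here only so that the example is self-contained.

```python
import importlib.util
import tempfile
from pathlib import Path

# Contents of the hypothetical module file boxes.py
module_source = '''
def box_area(width, height):
    """Area of a bounding box in pixels."""
    return width * height
'''

with tempfile.TemporaryDirectory() as folder:
    module_path = Path(folder) / "boxes.py"
    module_path.write_text(module_source)

    # Load the file as a module (the explicit form of "import boxes").
    spec = importlib.util.spec_from_file_location("boxes", module_path)
    boxes = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(boxes)

area = boxes.box_area(4, 5)   # definitions from the module are now usable
print(area)
```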
NumPy - NumPy is a module for Python. The name is short for "Numeric Python"
or "Numerical Python". NumPy enriches the Python programming language
with powerful data structures, implementing multi-dimensional arrays and matrices.
OpenCV - OpenCV-Python is a library of Python bindings designed to solve computer
vision problems. OpenCV-Python makes use of NumPy, which is a highly optimized
library for numerical operations with a MATLAB-style syntax. All the OpenCV array
structures are converted to and from NumPy arrays.
Keras - Keras is a minimalist Python library for deep learning that can run on top of Theano
or TensorFlow. It was developed to make implementing deep learning models as fast and
easy as possible for research and development.
TensorFlow - TensorFlow is an open source artificial intelligence library that uses data flow
graphs to build models. It allows developers to create large-scale neural networks with many
layers. TensorFlow is mainly used for classification, perception, understanding,
discovering, prediction and creation.
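The multi-dimensional arrays and matrices that NumPy provides can be illustrated with a short
sketch. The tiny 2x2 "image" and the identity weight matrix are made-up values; real images
loaded through OpenCV arrive as the same kind of array, only larger.

```python
import numpy as np

# A 2x2 grayscale "image" stored as a two-dimensional array of bytes.
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Element-wise arithmetic: scale pixel values into the range [0, 1].
scaled = image.astype(np.float32) / 255.0

# Stacking images adds a batch dimension, the layout neural networks expect.
batch = np.stack([scaled, scaled])

# Matrix multiplication with the @ operator (identity matrix leaves values unchanged).
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
product = scaled @ weights

print(batch.shape, scaled.max())
```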
Jupyter notebook:
The Jupyter Notebook is an open source web application that you can use to create and share
documents that contain live code, equations, visualizations, and text. Jupyter Notebook is
maintained by the people at Project Jupyter. Jupyter Notebooks are powerful, versatile, shareable
and provide the ability to perform data visualization in the same environment. Jupyter Notebooks
allow data scientists to create and share their documents, from code to full-blown reports.
Visual Studio Code:
Visual Studio Code features a lightning fast source code editor, perfect for day-to-day use.
With support for hundreds of languages, VS Code helps you be instantly productive with syntax
highlighting, bracket-matching, auto-indentation, box-selection, snippets, and more.
Benefits of Visual Studio Code:
1. Cross-platform support: Windows, Linux, Mac
2. Lightweight
3. Robust architecture
4. IntelliSense
5. Freeware: free of cost, probably the best feature of all for programmers, and
even more so for organizations.
6. Many users will use it, or might have used it, for desktop applications only, but it also
provides great tool support for web technologies such as HTML, CSS and JSON.
Google Colab:
Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to
write and execute arbitrary python code through the browser, and is especially well suited to
machine learning, data analysis and education.
Sharing: You can share your Google Colab notebooks very easily. Thanks to Google Colab,
everyone with a Google account can just copy the notebook to their own Google Drive
account. There is no need to install modules to run the code; commonly used modules come
preinstalled within Google Colab.
Versioning: You can save your notebook to GitHub with one click of a button.
There is no need to type “git add”, “git commit”, “git push” or “git pull” in your command-line
client (if you were using versioning already)!
Code snippets: Google Colab has a great collection of snippets you can just plug into your
code. E.g. if you want to write data to a Google Sheet automatically, there’s a snippet for it in
the Google Library.
Forms for non-technical users: Programmers are not the only ones who have to analyze data, and
Python can be useful for almost everyone in an office job. The problem is that non-technical people
are often afraid of making even the tiniest change to the code. Google Colab has a
solution for that: just insert the comment #@param {type:"string"} after a variable assignment and
the variable becomes an easy-to-use form input field. Your non-technical user only needs to change
the form fields, and Google Colab will automatically update the code. You can find more information
at https://colab.research.google.com/notebooks/forms.ipynb
Performance: Use the computing power of the Google servers instead of your own machine.
Running Python scripts often requires a lot of computing power and can take time. By
running scripts in the cloud, you don’t need to worry: your local machine's performance won’t
drop while executing your Python scripts.
Price: Best of all it’s free.
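The form feature mentioned above can be sketched as follows. The variable names and values are
made up for this example. Inside Colab, each `#@param` comment renders an input field next to
the cell; in any other Python environment the comments are ignored and the code runs unchanged.

```python
# Each annotated assignment becomes a form field when the cell is shown in Colab.
greeting_name = "World"  #@param {type:"string"}
repeat_count = 2  #@param {type:"integer"}

# Ordinary Python code that uses the values chosen in the form.
message = " ".join([f"Hello, {greeting_name}!"] * repeat_count)
print(message)
```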
Chapter 2
Literature Survey
Initially, researchers focused on the edges and gray values of the face image. This approach was
based on a pattern recognition model with prior information about the face model. AdaBoost was a
good training classifier. Face detection technology achieved a breakthrough with the famous
Viola-Jones detector, which greatly improved real-time face detection. The Viola-Jones detector
optimized Haar features, but failed to tackle real-world problems and was influenced by factors
such as face brightness and face orientation. Viola-Jones could only detect frontal, well-lit
faces. It failed to work well in dark conditions and with non-frontal images. These issues led
independent researchers to develop new face detection models based on deep learning,
to achieve better results across different facial conditions. We have developed our face detection
model using a Convolutional Neural Network (CNN) so that it can detect faces in any
geometric condition, frontal or non-frontal. Convolutional Neural Networks are
widely used for image classification tasks.
well in terms of classification accuracy. Moreover, YOLOv3 requires a large amount of
computational power for inference, making it unsuitable for embedded or mobile devices. Next,
SSD networks perform better than YOLO due to small convolutional filters, multiple
feature maps, and prediction at multiple scales. The key difference between the two architectures
is that YOLO utilizes two fully connected layers, whereas the SSD network uses convolutional
layers of varying sizes. In addition, RetinaNet, proposed by Lin et al., is also a single-stage
object detector; it uses a feature pyramid and focal loss to detect dense objects in the image
across multiple layers, and achieves remarkable accuracy as well as speed comparable to two-stage
detectors.
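The focal loss used by RetinaNet can be written for a single binary prediction as
FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The sketch below is a plain-Python
illustration of that formula; gamma = 2 and alpha = 0.25 are commonly used values, chosen here
as assumptions rather than settings taken from this report.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one binary prediction.

    p is the predicted probability of the positive class (0 < p < 1),
    y is the true label (1 for positive, 0 for negative).
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)    # confident and correct: heavily down-weighted
hard = focal_loss(0.1, 1)    # confident and wrong: keeps most of its loss
print(easy, hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to weighted
cross-entropy, which is why well-classified background regions dominate less as gamma grows.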
enables nearly cost-free region proposals by gradually integrating individual blocks (e.g. proposal
detection, feature extraction, and bounding box regression) of the object detection system in a
single step. Although this integration breaks through the speed
bottleneck of Fast R-CNN, there is computational redundancy at the subsequent detection stage.
The Region-based Fully Convolutional Network (R-FCN) is the only model that allows complete
backpropagation for training and inference. Feature Pyramid Networks (FPN) can detect
non-uniform objects but are least used by researchers due to high computation cost and memory
usage. Furthermore, Mask R-CNN strengthens Faster R-CNN by including the prediction of
segmentation masks on each RoI. Although two-stage detectors yield high object detection accuracy,
they are limited by low inference speed in real-time video surveillance.
Chapter 3
Problem Statement
Introduction: Face recognition is one of the well-studied real-life problems. Excellent progress
has been made in face recognition technology over the last few years. Face alterations and
the presence of different masks make it very challenging. The primary concern of this work
is facial masks, especially enhancing the recognition accuracy for different masked faces.
Face mask detection has emerged as an interesting problem in the domain of image
processing and computer vision.
A fully convolutional network is a significantly slower operation than, say, max pooling, both
forward and backward. If the network is fairly deep, each training step is going to take
much longer.
The network is also rather slow and complicated if you just want a good pre-trained model.
3.2.1 Advantages of Proposed System
The proposed system uses only a Convolutional Neural Network (CNN) model to detect
human faces.
With this model, feature extraction performs better than in the existing system, and
computation is very fast. The power of a CNN is that it detects distinct features from images by
itself, without any human intervention.
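How a convolutional layer picks out a feature can be shown with a small pure-Python sketch. In
a trained CNN the kernel values are learned from data; here a hand-written vertical-edge kernel
stands in for a learned one, and the 5x5 image is made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid 2-D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Keep positive responses, zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Downsample by taking the maximum of each size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


image = [[0, 0, 9, 9, 9] for _ in range(5)]     # dark left half, bright right half
edge_kernel = [[-1, 1], [-1, 1]]                # responds to dark-to-bright edges

features = relu(conv2d(image, edge_kernel))
pooled = max_pool(features)
print(features[0], pooled)
```

The feature map is non-zero only where the brightness changes, which is the sense in which a
convolution extracts a feature without hand-written rules about the image content.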
Chapter 4
Dataset Description
Kaggle allows users to find and publish data sets, explore and build models in a web-based data-
science environment, work with other data scientists and machine learning engineers, and enter
competitions to solve data science challenges. Kaggle has over 50,000 public datasets and 400,000
public notebooks.
Chapter 5
Object-oriented analysis and design (OOAD) is a software engineering approach that models a
system as a group of interacting objects. Each object represents an entity of interest in the system
being modeled and is characterized by its class, its state (data elements), and its behaviors. Various
models can be created to show the static structure, dynamic behaviors, and runtime deployment of
these collaborating objects. There are a number of different notations for representing these models,
such as the Unified Modelling Language (UML).
Object-oriented analysis (OOA) applies object modeling techniques to analyze the functional
requirements for a system. Object-oriented design (OOD) elaborates the analysis models to
produce implementation specifications. OOA focuses on what this system does, OOD on how this
system does it.
Design: The most creative and challenging phase of the life cycle is system design. The term
design describes a final system and the process by which it is developed. It refers to the technical
specifications that will be applied in the implementation of the candidate system. Design may
be defined as “the process of applying various techniques and principles for the purpose of
defining a device, a process or a system in sufficient detail to permit its physical realization”.
Designs are documented and evaluated by management as a step towards implementation.
The importance of software design can be stated in a single word: “Quality”. Design provides us
with a representation of software that can be assessed for quality. Design is the only way
we can accurately translate customers’ requirements into a complete software product or system.
Without design, we risk building an unstable system that might fail if small changes are made. It
may also be difficult to test, or could be one whose quality can't be tested. So design is an
essential facet in the development of software products.
5.1 Architecture:
In this model, our aim is to detect whether a person is wearing a mask or not.
Face mask detection has emerged as an interesting problem in the domain of image
processing and computer vision. Face detection has various use cases ranging from face
recognition to capturing facial motions, where the latter calls for the face to be revealed with very
high precision. Due to the rapid advancement of machine learning algorithms, face
mask detection technology now seems to be well addressed. This technology is more relevant today
because it is used to detect faces in both static images and videos. The process is represented
graphically in the figure as a flow diagram.
Step3: Run the video and check whether a face is detected.
Step4: If it is a human face, the image is re-scaled; if it is not a human face, the condition
breaks.
Step5: After re-scaling, the image is converted to greyscale and checked against the face
recognition values, and a rectangular box is displayed on the screen.
Step6: After face recognition is complete, the model predicts whether the face has a mask or not.
Step7: If the person wears a mask, the display shows a green rectangular box with the label Mask
on the screen. If the person does not wear a mask, the display shows a red rectangular box with
the label No-mask on the screen.
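Steps 4 to 7 can be illustrated with a minimal Python sketch. The helper names rescale_dims and box_style are hypothetical and are not part of the project code; the 50% scale and the colour tuples are assumptions chosen to match OpenCV's BGR convention.

```python
# Hypothetical helpers sketching Steps 4-7; names and defaults are assumptions.

GREEN = (0, 255, 0)  # OpenCV stores colours in BGR order
RED = (0, 0, 255)


def rescale_dims(width, height, scale_percent=50):
    """Return the (width, height) of a frame re-scaled to scale_percent."""
    return (int(width * scale_percent / 100), int(height * scale_percent / 100))


def box_style(wearing_mask):
    """Return the label and rectangle colour described in Step 7."""
    if wearing_mask:
        return ("Mask", GREEN)
    return ("No-mask", RED)
```

For example, rescale_dims(640, 480) halves a 640 x 480 frame, and box_style(False) selects the red No-mask box.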
UML, short for Unified Modeling Language, is a standardized modeling language consisting
of an integrated set of diagrams, developed to help system and software developers specify,
visualize, construct, and document the artifacts of software systems, as well
as for business modeling and other non-software systems. The UML represents a collection of
best engineering practices that have proven successful in the modeling of large and complex
systems. The UML is a very important part of developing object-oriented software and the
software development process. The UML uses mostly graphical notations to express the design
of software projects. Using the UML helps project teams communicate, explore potential designs,
and validate the architectural design of the software.
The goal of UML is to provide a standard notation that can be used by all object-oriented
methods and to select and integrate the best elements of precursor notations. UML has been
designed for a broad range of applications. Hence, it provides constructs for a broad range of
systems and activities.
A UML use case diagram is the primary form of system/software requirements for a new
software program under development. Use cases specify the expected behavior (what), and not the
exact method of making it happen (how). Use cases once specified can be denoted by both textual
and visual representation (i.e., use case diagram).
A key concept of use case modeling is that it helps us design a system from the end user's
perspective. It is an effective technique for communicating system behavior in the user's terms by
specifying all externally visible system behavior.
Fig 4: use case diagram
5.3.2 Sequence Diagram:
UML sequence diagrams are interaction diagrams that detail how operations are carried out.
They capture the interaction between objects in the context of a collaboration. Sequence diagrams
are time-focused: they show the order of the interaction visually, using the vertical axis of
the diagram to represent time, that is, which messages are sent and when.
Sequence diagrams capture:
the interaction that takes place in a collaboration that either realizes a use case or an
operation (instance diagrams or generic diagrams)
high-level interactions between the user of the system and the system, between the
system and other systems, or between subsystems (sometimes known as system
sequence diagrams)
In software engineering, a class diagram in the UML is a type of static structure diagram that
describes the structure of a system by showing the system's classes, their attributes, operations (or
methods), and the relationships among objects.
The user performs an operation called giving input to the system. The system trains and tests the
model and predicts the result based on the inputs given to it and displays the result to the user.
5.4 Modules of CNN:
A Convolutional Neural Network (CNN) is a class of deep learning model that is mostly used for
the analysis of visual imagery. A CNN is a deep feedforward artificial neural network (ANN). The
network is trained on the labelled dataset supplied to it and then predicts the labels to be
assigned to new inputs. It is well suited to high-dimensional data such as images, where it uses
its strengths against the curse of dimensionality. CNNs are widely used for image recognition,
image classification, image captioning, object detection, and similar tasks. CNNs gained immense
popularity when AlexNet, named after Alex Krizhevsky, won the ImageNet challenge in 2012. In
just three years, engineers advanced the design from the 8-layer AlexNet to the 152-layer ResNet.
CNNs also come in handy for tasks such as recommendation systems, contextual importance, and
natural language processing (NLP). The key task of the network is to process the input through
all of its layers and thereby detect the underlying features automatically. A CNN is a
convolution tool that separates out the different features of an image for analysis and
prediction. A convolutional neural network is a type of artificial neural network that has so far
been most popularly used for analyzing images.
A CNN has hidden layers called convolutional layers, and these layers are precisely what
make it a CNN. Convolutional layers are able to detect patterns in their input. An input passed
to the CNN model is transformed by the convolutional layers and passed on towards the output.
A CNN is built from three types of layers:
1. Convolutional layer
2. Pooling layer
3. Fully connected layer.
1. Convolution Layer: The convolution layer is the core building block of the CNN. It carries
the main portion of the network’s computational load. This layer performs a dot product
between two matrices, where one matrix is the set of learnable parameters, otherwise known as
a kernel, and the other matrix is the restricted portion of the receptive field. The kernel is
spatially smaller than the image but extends through its full depth.
During the forward pass, the kernel slides across the height and width of the image,
producing the image representation of that receptive region. This produces a two-dimensional
representation of the image, known as an activation map, that gives the response of the kernel
at each spatial position of the image. The sliding size of the kernel is called the stride. If we
have an input of size W × W × D and Dout kernels with a spatial size of F, stride
S and amount of padding P, then the size of the output volume can be determined by the following
formula:
$$W_{out} = \frac{W - F + 2P}{S} + 1$$
This yields an output volume of size Wout × Wout × Dout.
2. Pooling layer: The pooling layer replaces the output of the network at certain locations with
a summary statistic of the nearby outputs. This helps in reducing the spatial size of
the representation, which decreases the required amount of computation and weights. The
pooling operation is processed on every slice of the representation individually.
There are several pooling functions, such as the average of the rectangular
neighborhood, the L2 norm of the rectangular neighborhood, and a weighted average based on
the distance from the central pixel. However, the most popular process is max pooling,
which reports the maximum output from the neighborhood.
Fig:8 Pooling
If we have an activation map of size W × W × D, a pooling kernel of spatial size F, and stride S,
then the size of the output volume can be determined by the following formula:
$$W_{out} = \frac{W - F}{S} + 1$$
This yields an output volume of size Wout × Wout × D.
3. Fully Connected Layer: Neurons in this layer have full connectivity with all neurons in the
preceding and succeeding layers, as in a regular fully connected neural network (FCNN). This is
why it can be computed as usual by a matrix multiplication followed by a bias offset. The FC
layer helps to map the representation between the input and the output.
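The output-size arithmetic for the convolution and pooling layers, and max pooling itself, can be checked with a short NumPy sketch. The function names are hypothetical, the formulas are the standard ones for a square input ((W - F + 2P)/S + 1 for convolution and (W - F)/S + 1 for pooling), and the 4 x 4 example values are made up for illustration.

```python
import numpy as np


def conv_output_size(w, f, s=1, p=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (w - f + 2 * p) // s + 1


def pool_output_size(w, f, s):
    """Spatial output size of a pooling layer: (W - F) / S + 1."""
    return (w - f) // s + 1


def max_pool2d(activation_map, f=2, s=2):
    """Max pooling over one 2-D slice of the representation."""
    rows = pool_output_size(activation_map.shape[0], f, s)
    cols = pool_output_size(activation_map.shape[1], f, s)
    pooled = np.empty((rows, cols), dtype=activation_map.dtype)
    for i in range(rows):
        for j in range(cols):
            # Take the F x F neighbourhood and report its maximum.
            window = activation_map[i * s:i * s + f, j * s:j * s + f]
            pooled[i, j] = window.max()
    return pooled


# Made-up 4 x 4 activation map, pooled with a 2 x 2 window and stride 2.
example = np.array([[1, 3, 2, 4],
                    [5, 6, 1, 2],
                    [7, 2, 9, 0],
                    [1, 8, 3, 4]])
pooled = max_pool2d(example)
```

As an illustration, a 150 x 150 input with an assumed 3 x 3 kernel, stride 1 and no padding gives conv_output_size(150, 3) = 148, and 2 x 2 pooling with stride 2 then gives pool_output_size(148, 2, 2) = 74.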
5.5 CNN Algorithm Description:
model.add(Flatten()) - This flattens the input from ND to 1D and does not affect
the batch size.
model.add(Dense()) - According to the Keras documentation, Dense implements the
operation: output = activation(dot(input, kernel) + bias), where activation is the element-wise
activation function passed as the activation argument, kernel is a weights matrix
created by the layer, and bias is a bias vector created by the layer. In simple words, it is the
final step, which takes the features learned by the preceding layers and maps them to the label.
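What Flatten and Dense compute can be sketched in plain NumPy. This illustrates the arithmetic only and is not the Keras implementation; the function names, the sigmoid activation, and the tiny shapes are assumptions made for the example.

```python
import numpy as np


def flatten(batch):
    """Collapse every axis except the batch axis, as Flatten() does."""
    return batch.reshape(batch.shape[0], -1)


def sigmoid(z):
    """Element-wise logistic activation (assumed for this sketch)."""
    return 1.0 / (1.0 + np.exp(-z))


def dense(inputs, kernel, bias, activation=sigmoid):
    """output = activation(dot(input, kernel) + bias)."""
    return activation(np.dot(inputs, kernel) + bias)


# A batch of 2 feature maps of shape 3 x 3 x 4 (made-up sizes).
features = np.ones((2, 3, 3, 4))
flat = flatten(features)            # shape (2, 36): batch size is unchanged
kernel = np.zeros((36, 1))          # stand-in for the layer's weights matrix
bias = np.zeros(1)                  # stand-in for the layer's bias vector
output = dense(flat, kernel, bias)  # shape (2, 1): one score per image
```

With all-zero weights every score is sigmoid(0) = 0.5; a trained layer would instead hold weights learned from the dataset.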
Chapter 6
Implementation
Implementation is the stage of the project when the theoretical design is turned into a
working system. Thus, it can be considered the most critical stage in achieving a successful
new system and in giving the user confidence that the new system will work and be effective.
The implementation stage involves careful planning, investigation of the existing system and its
constraints on implementation, designing of methods to achieve changeover, and evaluation of
changeover methods.
6.3 Defining directories
6.6 Model Summary
6.8 Plotting chart for Training and Validation Loss
6.10 Evaluation of Model and print test loss and test accuracy
Mask.py: (face-mask detection)
# importing libraries
import os
import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# loading model.h5
model = load_model('model.h5')
img_width, img_height = 150, 150
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# providing video as input to the model
cap = cv2.VideoCapture('video.mp4')
os.makedirs('input', exist_ok=True)  # cv2.imwrite does not create missing folders
img_count_full = 0
font = cv2.FONT_HERSHEY_SIMPLEX
org = (1, 1)
class_label = ''
fontScale = 1
color = (255, 0, 0)
thickness = 2

while True:
    img_count_full += 1
    response, color_img = cap.read()
    if not response:  # end of the video or a failed read
        break
    # re-scale the frame to 50% of its original size
    scale = 50
    width = int(color_img.shape[1] * scale / 100)
    height = int(color_img.shape[0] * scale / 100)
    dim = (width, height)
    color_img = cv2.resize(color_img, dim, interpolation=cv2.INTER_AREA)
    # the Haar cascade detector works on greyscale images
    gray_img = cv2.cvtColor(color_img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray_img, 1.1, 6)
    img_count = 0
    for (x, y, w, h) in faces:
        org = (x - 10, y - 10)
        img_count += 1
        # crop the face, save it, and reload it at the model's input size
        color_face = color_img[y:y + h, x:x + w]
        face_path = 'input/%d%dface.jpg' % (img_count_full, img_count)
        cv2.imwrite(face_path, color_face)
        img = load_img(face_path, target_size=(img_width, img_height))
        img = img_to_array(img)
        img = np.expand_dims(img, axis=0)
        prediction = model.predict(img)
        # assumes a single sigmoid output in which class 0 means "Mask"
        if prediction[0][0] < 0.5:
            class_label = "Mask"
            color = (0, 255, 0)  # green (OpenCV uses BGR order)
        else:
            class_label = "No Mask"
            color = (0, 0, 255)  # red
        # draw the box and label in the class colour, as described in Step 7
        cv2.rectangle(color_img, (x, y), (x + w, y + h), color, 3)
        cv2.putText(color_img, class_label, org, font, fontScale, color, thickness, cv2.LINE_AA)
    cv2.imshow('Face mask detection', color_img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
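The test on the prediction in the script above depends on the shape of the model's output, which this report does not state. As a hedged sketch, the hypothetical helper below decodes the array returned by model.predict for either a single sigmoid unit or a two-unit softmax, assuming in both cases that class index 0 means "Mask", as in Mask.py. The function name and the 0.5 threshold are assumptions, not part of the project code.

```python
import numpy as np


def decode_prediction(prediction, threshold=0.5):
    """Map a model.predict() result for one image to a class label.

    Assumes class index 0 is "Mask" and class index 1 is "No Mask".
    """
    scores = np.asarray(prediction, dtype=float).reshape(-1)
    if scores.size == 1:
        # Single sigmoid unit: the score is the probability of class 1.
        class_index = int(scores[0] >= threshold)
    else:
        # Softmax output: pick the class with the highest score.
        class_index = int(np.argmax(scores))
    return "Mask" if class_index == 0 else "No Mask"
```

Thresholding the score, rather than comparing it with exactly 0, keeps the decision stable when the sigmoid output is close to, but not exactly, 0 or 1.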
Chapter 7
UNIT TESTING:
Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly, and that program inputs produce valid outputs. All decision branches and
internal code flow should be validated. It is the testing of individual software units of the
application. It is done after the completion of an individual unit before integration. This is
structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at component level and test a specific business process, application, and/or
system configuration.
INTEGRATION TESTING:
Integration tests are designed to test integrated software components to determine if they actually
run as one program. Testing is event driven and is more concerned with the basic outcome of
screens or fields. Integration tests demonstrate that although the components were individually
satisfactory, as shown by successful unit testing, the combination of components is correct and
consistent. Integration testing is specifically aimed at exposing the problems that arise from the
combination of components.
VALIDATION TESTING:
An engineering validation test (EVT) is performed on first engineering prototypes, to ensure that
the basic unit performs to design goals and specifications. Identifying design problems and
solving them as early in the design cycle as possible is the key to keeping projects on
time and within budget. Too often, product design and performance problems are not detected
until late in the product development cycle, when the product is ready to be shipped. The old
adage holds true: it costs a penny to make a change in engineering, a dime in production and a
dollar after a product is in the field.
Verification is a Quality control process that is used to evaluate whether or not a product, service,
or system complies with regulations, specifications, or conditions imposed at the start of a
development phase. Verification can be in development, scale-up, or production. This is often an
internal process. Validation is a Quality assurance process of establishing evidence that provides
a high degree of assurance that a product, service, or system accomplishes its intended
requirements. This often involves acceptance of fitness for purpose with end users and other
product stakeholders.
SYSTEM TESTING:
As a rule, system testing takes, as its input, all of the "integrated" software components that have
successfully passed integration testing and also the software system itself integrated with any
applicable hardware system.
System testing is a more limited type of testing; it seeks to detect defects both within the
"inter-assemblages" and also within the system as a whole. System testing is performed on the entire
system in the context of a Functional Requirement Specification (FRS) or System Requirement
Specification (SRS).
7.2 Experimental Evaluation:
To avoid the problem of overfitting, two major steps were taken. First, we performed data
augmentation. Second, the model accuracy was critically observed over 60 epochs for both the
training and testing phases. The observations are reported in the accompanying figures.
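The first of these steps, data augmentation, can be sketched in NumPy as follows. The report does not say which transformations were applied, so the horizontal flip and the brightness range used here are assumptions, and the function name augment and the placeholder image are hypothetical.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped and brightness-shifted copy of an image.

    image: H x W x C array with values in [0, 255].
    """
    augmented = image.astype(np.float32)
    if rng.random() < 0.5:
        augmented = augmented[:, ::-1, :]          # horizontal flip
    augmented = augmented * rng.uniform(0.8, 1.2)  # brightness jitter (assumed range)
    return np.clip(augmented, 0, 255).astype(np.uint8)


rng = np.random.default_rng(seed=0)
face = np.full((150, 150, 3), 128, dtype=np.uint8)  # placeholder for a face crop
batch = np.stack([augment(face, rng) for _ in range(4)])
```

Each call produces a slightly different variant of the same face, so the model sees more varied examples than the raw dataset contains, which is what helps against overfitting.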
and mask detection with less inference time and less memory consumption as compared to
recent techniques. Significant effort has been put into resolving the data imbalance problem in
the existing MAFA dataset, resulting in a new unbiased dataset that is highly suitable for
COVID-related mask detection tasks. The newly created dataset, the optimal face detection
approach, the localization of the person's identity, and the avoidance of overfitting resulted in
an overall system that can be easily installed in an embedded device at public places to curtail
the spread of Coronavirus.
Chapter 8
Conclusion
Future Work:
The future enhancements that can be projected for the project are:
A more interactive user interface.
Deployment as a mobile application.
More detailed results applicable to real-time applications, such as in public
places.
REFERENCES