
Volume 9, Issue 8, August – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24AUG707

Glasses Detection from Human Face Images


Kajal Lochab1, Lakshin Pathak2
1,2Computer Science and Engineering, Nirma University, Ahmedabad, India

Abstract:- This study uses the MobileNet architecture to provide a novel approach for identifying glasses in photos of people's faces. The goal of this effort is to correctly identify glasses in face photographs for a variety of uses, including virtual try-on apps, driver monitoring systems, and facial recognition systems. Our study offers a powerful transfer learning-based glasses identification model utilizing the MobileNet architecture. We analyze the issue formulation in detail, taking into account the evaluation metrics, optimization objectives, and mathematical framework. The data, intelligence, and application layers in our suggested architecture are tailored for effective glasses detection. By means of comprehensive testing and analysis, we exhibit the efficacy of our methodology in precisely identifying spectacles in photographs of human faces. The outcomes and discussions demonstrate how well our approach performs in various circumstances and assessment parameters. This study offers insightful information for creating efficient glasses detection algorithms that may be used in a variety of real-world contexts.

Keywords:- Glasses Detection, Human Face Images, Image Processing, Transfer Learning, Facial Feature Extraction.

I. INTRODUCTION

Facial recognition technology is now widely used in many different industries for things like verification systems, security measures, and human-computer interaction. The ability to precisely recognize and assess facial traits to differentiate individuals is at the heart of these technologies. Given that spectacles can significantly alter one's facial appearance, identifying the presence of eyeglasses is a substantial problem in this field [1]. The variety in the form and style of eyeglasses, and their capacity to partially obscure facial features, makes them more difficult to identify in photos.

The development of deep learning methodologies has led to significant advances in multiple computer vision domains, such as object detection and categorization. In particular, transfer learning has become a potent method for using pre-existing models on novel, at times data-poor problem areas. The objective is to implement transfer learning [2] to develop an accurate and credible model that can identify eyewear in photos of faces.

Using the MobileNet architecture, we offer a new method in our research for the detection of eyeglasses. By correctly recognizing eyewear, this technique aims to improve the accuracy of facial recognition systems [3], hence increasing their efficiency and improving user experience.

➢ Motivation
Accurately identifying eyeglasses in facial photos is essential for a number of applications, including virtual try-on platforms, increased security via facial recognition systems, and the efficacy of driver vigilance programs. Because eyewear modifies facial features and creates reflections and distortions, it presents special difficulties that could affect the accuracy of facial recognition technology. Similarly, precise eyewear detection is necessary for virtual fitting services in order to provide a smooth and lifelike online fitting experience. Determining if a driver is wearing eyeglasses can yield important information about their concentration and alertness levels in driver surveillance applications.

However, given the variety of eyewear shapes, sizes, and ways in which they cover different portions of the face, the process of eyewear identification poses significant challenges. Conventional image processing methods frequently find it difficult to properly account for these factors. Therefore, the creation of sophisticated algorithms that can reliably identify eyewear in a variety of real-life circumstances is urgently needed.

➢ Research Contribution
In this work, we present a novel approach to eyeglass recognition by utilizing the MobileNet architecture. MobileNet, which is well-known for its effectiveness, provides a solid foundation for our image categorization problem. By using transfer learning [2], we are able to effectively modify a pre-trained MobileNet model to perform well in the complex task of eyewear detection, obtaining notable accuracy even with a small amount of training data.

Our research delves deeper into the theoretical underpinnings of this problem, outlining the underlying mathematics, optimization goals, and assessment standards. We take great care in crafting the assessment criteria and loss function to make sure our model can accurately identify minute details about eyewear in facial photos.

In addition, we provide a custom architecture that consists of layers for data pretreatment, analysis, and prediction that are created especially for this task. While the analysis layers use MobileNet's complex feature hierarchies to detect eyeglasses, the preparation layers prepare the photos for feature extraction. The final determination of whether or not eyewear is present in the examined photos is made by the prediction layers.


➢ Organization
In this paper, we present a novel approach to recognize eyeglasses with the MobileNet framework. This work presents a robust glasses detection model within the MobileNet system by applying transfer learning concepts [2]. We provide a thorough analysis of the problem, including its mathematical foundations, optimization objectives, and evaluation metrics. In addition, we create a tailored framework with layers for data processing, cognition, and practical application in order to improve the precision of eyewear recognition.

II. BACKGROUND

One of the most important applications of computer vision is object identification in photos, which has several applications ranging from object recognition to scene comprehension to image classification. Historically, the discipline relied heavily on standard machine learning approaches and manually constructed features, which unfortunately have difficulty adjusting to a large range of complex object classes. But the emergence of deep learning has revolutionized computer vision, yielding exceptional performance on a wide range of tasks, including object recognition. Convolutional Neural Networks (CNNs) [4] are one of these innovations that have gained significant traction as the mainstay of technology for several popular object detection frameworks. These networks are particularly good at learning multi-level visual data representations on their own, which makes them very useful for challenging tasks like object detection.

Furthermore, transfer learning [4] has become a key technique in deep learning, allowing models that have been pre-trained on large datasets to be applied to new problems with limited data. This strategy makes use of prior knowledge to accelerate learning and improve task performance. One streamlined CNN design that stands out for its ease of implementation on mobile and embedded platforms is MobileNet [5]. By using depthwise separable convolutions, it maintains robust accuracy with a large reduction in the number of required parameters and computing demands. Due to its small size, low processing overhead, and remarkable accuracy, MobileNet is a widely used option for a wide range of applications. MobileNet [5] serves as a potent foundation for feature extraction in object detection scenarios, providing comprehensive visual data representations that are helpful for identifying particular things, such as eyeglasses on human faces. This ability to customize pre-trained MobileNet models for particular detection tasks highlights its versatility and effectiveness.
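To make the parameter savings concrete, the short sketch below contrasts a standard convolution with a depthwise separable convolution; it assumes a Keras implementation (the paper does not name its framework), and the channel and filter counts are arbitrary illustrative values.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(160, 160, 32))

# Standard convolution: 3*3*32*64 weights + 64 biases = 18,496 parameters.
standard = tf.keras.layers.Conv2D(64, 3, padding="same")(inputs)

# Depthwise separable convolution: 3*3*32 (depthwise) + 32*64 (pointwise) + 64 (bias) = 2,400 parameters.
separable = tf.keras.layers.SeparableConv2D(64, 3, padding="same")(inputs)

print(tf.keras.Model(inputs, standard).count_params())   # 18496
print(tf.keras.Model(inputs, separable).count_params())  # 2400
```

The roughly eight-fold reduction in parameters for the same output shape is the source of MobileNet's small size and low computational cost.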
III. RELATED WORK

Table 1 State of The Art in Glasses Detection


Year | Ref. | Model Used | Pros | Cons | Accuracy (%)
2000 | [4] | CNN | The deformable contour can be achieved through dynamic programming. | The crossing of eyebrows on the edges of glasses creates significant ambiguities. | 99.52
2004 | [6] | Wavelet-Based LUT Weak Classifier | High level of correctness and a fast running speed. | Lower accuracy compared to other models. | 95.5
2013 | [7] | SVM, GB, GNB | Achieved a suitable balance between precision and speed. | Collecting the training dataset is cumbersome. | 96

IV. PROBLEM FORMULATION

➢ Mathematical Framework

➢ Input Representation:
The RGB photos displaying people's faces make up the raw data for the task of identifying glasses on human faces inside images. Every image is organized as a three-channel matrix spanning its width and height, with the channels holding pixel intensities for the three fundamental hues of red, green, and blue.

These photographs go through a number of preparation procedures before being analyzed using the MobileNet architecture. To increase the robustness of the model, this entails resizing the photographs to a consistent size, scaling the pixel intensity values to lie between 0 and 1, and applying image modification techniques such as rotating and mirroring the images.

➢ Transfer Learning:
When data is scarce, utilizing transfer learning is a useful tactic for adapting current models to new tasks. In particular, this method involves adapting a previously trained MobileNet model to operate with a fresh dataset consisting of pictures of human faces that are labeled as either wearing glasses or not, in order to recognize eyewear on faces.

Using the original weights from the MobileNet model—obtained from its training on a large-scale dataset such as ImageNet—we commence this refinement process. Then, in order to minimize the cross-entropy loss, optimization techniques based on gradient descent are applied to modify the model's parameters. This loss measures the difference between the actual classifications in the training dataset and the likelihoods projected by the model.
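As a minimal sketch of the preprocessing described above, assuming a TensorFlow/Keras implementation (the paper does not publish its code), the resizing, [0, 1] scaling, and flip/rotation augmentation could be expressed as follows; the 160x160 target size matches the setup reported later in Table 2, and the function name is an illustrative assumption.

```python
import tensorflow as tf

IMG_SIZE = (160, 160)  # consistent input size, as listed in Table 2

# Augmentation block: random mirroring and small rotations, as described above.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

def preprocess(images, labels, training=False):
    """Resize to a consistent size and scale pixel intensities to [0, 1]."""
    images = tf.image.resize(images, IMG_SIZE)
    images = tf.cast(images, tf.float32) / 255.0
    if training:
        images = augment(images, training=True)  # apply flips/rotations only during training
    return images, labels
```

This function can then be mapped over a `tf.data` pipeline of labeled face images; a fuller training sketch appears after Table 2 in the results section.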


➢ Output Generation:
The output generation procedure involves using the capabilities of the MobileNet architecture to make predictions about the presence of eyeglasses on faces in image data. The final output is a simple binary classification that determines whether or not the examined picture shows a person wearing spectacles.

After processing the image using the best-fit MobileNet framework, the model uses a softmax activation function to provide a probability distribution across the two categories (glasses present or absent). The predicted categorization is then determined by identifying the category with the highest likelihood.

➢ Optimization Objective
The aim of the transfer learning process using MobileNet to identify glasses is to reduce a loss function that quantifies the discrepancy between the actual labels of the training data and the probabilities predicted by the model. Usually, the cross-entropy loss is used as the loss function for this, and it is defined as follows:

L(θ) = -(1/N) Σ_{i=1}^{N} [ y_i log M_θ(x_i) + (1 − y_i) log(1 − M_θ(x_i)) ]    (3)

where θ represents the parameters of the MobileNet model, N is the number of training samples, x_i is the i-th input image, y_i is the ground truth label (0 for no glasses, 1 for glasses), and M_θ(x_i) is the output of the MobileNet model for the i-th input image with parameters θ. By assessing the discrepancy between the ground truth labels and the expected probabilities, the cross-entropy loss penalizes the model for making inaccurate predictions. This binary cross-entropy loss measures the difference between actual binary outcomes and their predicted probabilities.

➢ Evaluation Metrics
A few critical metrics are essential for gauging the effectiveness of machine learning algorithms. Two crucial measures are used in the evaluation of the model in question: accuracy and binary cross-entropy loss. Accuracy represents the fraction of all predictions that are correct, and is calculated using the following formula:

Accuracy = (number of correct predictions) / (total number of predictions)
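As a concrete illustration of these two metrics, the short sketch below computes binary cross-entropy and accuracy from predicted probabilities; it is a NumPy sketch written for this article rather than the paper's own evaluation code.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy, matching Eq. (3): y_true in {0, 1}, y_prob in (0, 1)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that are correct after thresholding the probabilities."""
    y_pred = (y_prob >= threshold).astype(int)
    return float(np.mean(y_pred == y_true))

# Toy example: 3 of 4 predictions are correct, so accuracy is 0.75.
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.92, 0.10, 0.40, 0.20])
print(binary_cross_entropy(y_true, y_prob), accuracy(y_true, y_prob))
```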
V. PROPOSED ARCHITECTURE

➢ Data Layer
Our proposed framework's base layer is essential for gathering and preparing input photos for the purpose of recognizing eyeglasses. Our data came from a collection of high-quality photos of human faces obtained from Kaggle, a reputable hub for machine learning datasets and contests. This dataset includes a wide range of participants, including those who do not wear glasses, as well as those from different demographic backgrounds and with a variety of eyeglasses. We carried out a number of data cleaning and preprocessing procedures to preserve the uniformity and quality of the dataset. In order to strengthen and diversify the dataset, these steps included removing distortions, standardizing the image formats, and augmenting the dataset. We were able to access a large and detailed collection of human face photos by using the Kaggle dataset, which helped us develop a useful technique for detecting eyewear.

➢ Intelligence Layer
Our intelligence layer serves as the foundation for the glasses recognition system and is mainly designed to identify eyeglasses in pictures of faces. Advanced deep learning techniques, with a focus on convolutional neural networks (CNNs), enable this process. Through the analysis of both minor and significant patterns, these networks are skilled at differentiating particular aspects within the images, guaranteeing a high recognition accuracy rate for glasses. We use transfer learning to improve the training process, particularly when working with a small amount of labeled data. This method makes the training process more efficient by enabling us to use a pre-established set of weights to get our CNN model off to a faster start.
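The following is a minimal sketch of how such a transfer-learning model could be assembled, assuming TensorFlow/Keras (the paper does not publish its code). The frozen MobileNetV2 backbone, 160x160 input, dropout of 0.2, Adam optimizer with a learning rate of 0.0001, and binary cross-entropy loss follow the setup reported in Table 2; the single sigmoid output is an equivalent formulation of the two-way softmax described above, and the head layout is an illustrative choice.

```python
import tensorflow as tf

def build_glasses_detector(input_shape=(160, 160, 3)):
    # Pre-trained MobileNetV2 backbone (ImageNet weights) used as a frozen feature extractor.
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights="imagenet")
    base.trainable = False

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)                   # keep batch-norm statistics fixed
    x = tf.keras.layers.GlobalAveragePooling2D()(x)    # pool feature maps into one vector
    x = tf.keras.layers.Dropout(0.2)(x)                # dropout rate listed in Table 2
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # probability of glasses

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model, base

model, base = build_glasses_detector()
```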
➢ Application Layer
Our developed glasses detection model is deployed in multiple real-world scenarios at the application layer. This step entails integrating the model into various applications, such as security monitoring systems, facial recognition platforms, and even cutting-edge smart eyewear options. The model's ability to recognize glasses in real time from still photos and live video feeds greatly improves the usability and efficacy of these applications. In addition, we are looking into ways to improve and fine-tune the model even further to make sure it satisfies the particular functional and performance requirements of each application.
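A hedged sketch of how the trained model could be invoked at this layer is shown below; the saved-model path, decision threshold, and helper name are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np
import tensorflow as tf

# Load the trained detector (the file path is hypothetical).
model = tf.keras.models.load_model("glasses_detector.keras")

def has_glasses(image_path, threshold=0.5):
    """Return True if the face image at image_path is predicted to contain glasses."""
    img = tf.keras.utils.load_img(image_path, target_size=(160, 160))
    x = tf.keras.utils.img_to_array(img) / 255.0      # same [0, 1] scaling as training
    prob = float(model.predict(np.expand_dims(x, 0), verbose=0)[0, 0])
    return prob >= threshold

print(has_glasses("example_face.jpg"))
```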
VI. RESULTS AND DISCUSSIONS

➢ Experiment Setup and Roles
In this section, we describe in detail the experimental setup and the roles played by each component in our glasses-detection investigation. Creating the model, the training schedule, the evaluation criteria, and the dataset preparation were only a few of the crucial parts of this system. With roles assigned according to our responsibilities and areas of expertise, our group collaborated to ensure that each step went smoothly.


Fig 1 Block Diagram

Gathering and prepping the data was the first step. Team members were tasked with locating human face photos in the Kaggle library and applying the appropriate cleaning and enhancement methods. Then the intelligence layer was developed: deep learning specialists created and implemented the convolutional neural network (CNN) model that was designed to identify eyewear.

The team worked on optimizing hyperparameters, monitoring the training progression, and modifying the model settings during the training phase. Validation and test datasets were used to assess the model's effectiveness, and team analysis identified areas for improvement. At every level, productive teamwork and collaboration were essential to achieving the project's goals.

The team roles encompassed a variety of talents, from computer vision and software engineering to data science and machine learning, demonstrating diversity and synergy. Every member offered unique skills and perspectives that improved innovative problem-solving. By working together and using strategic management, we were able to conduct extensive trials, evaluate the results, and obtain important information about identifying eyewear on human faces.

➢ Evaluation Metrics
In this section, we examine the metrics used to assess the eyewear identification algorithm's efficacy by contrasting its performance before and after tuning.

➢ Before Fine-Tuning:
The model was trained using the initial settings for 10 epochs prior to fine-tuning. The following assessment metrics were obtained:

• Test Accuracy (First 10 Epochs): 0.917
• Train Accuracy (First 10 Epochs): 0.895

Although the model's performance could be further enhanced, its weights were initialized and adjusted on the training data to reach a respectable level of accuracy.

➢ Fine-Tuning Process:
To enhance the model's functionality, a procedure known as fine-tuning was implemented. The top layers of the pre-trained MobileNet V2 model were unfrozen, allowing their weights to change as the model was trained. Through the application of the pre-trained network's expertise, the model was able to adapt its features to better fit the particular goal of identifying glasses.

There were significant improvements in accuracy after the fine-tuning procedure:

• Train Accuracy (After Fine-Tuning): 0.999
• Validation Accuracy (After Fine-Tuning): 0.997

These outcomes demonstrate how fine-tuning raises the accuracy and overall performance of the model.
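A minimal sketch of this fine-tuning step, continuing from the Keras model-building sketch given earlier, is shown below; the number of unfrozen layers and the reduced learning rate are illustrative choices, since the paper does not report them, and `train_ds`/`val_ds` denote the preprocessed training and validation splits.

```python
import tensorflow as tf

# Continuing from `model` and `base` defined in the earlier sketch.
base.trainable = True
for layer in base.layers[:-30]:          # keep all but the top ~30 layers frozen (illustrative)
    layer.trainable = False

# Re-compile with a lower learning rate so the unfrozen weights are adjusted gently.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Continue training on the same splits used for the initial run.
history_finetune = model.fit(train_ds, validation_data=val_ds, epochs=5)
```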

VII. CONCLUSION AND FUTURE SCOPE

➢ Conclusion
The glasses detection model developed in this project effectively demonstrates the use of transfer learning and fine-tuning techniques in computer vision tasks. By utilizing the pre-trained MobileNet V2 model [5] and adapting it to the specific task of categorizing images as either "glasses" or "no glasses", we achieved remarkable accuracy.


Fig 2 Training and Validation Accuracy

Fig 3 Training and Validation Loss

Fig 4 After Fine Tuning


Table 2 Experimental Setup


Parameter | Value
Optimizer | Adam
Learning Rate | 0.0001
Number of Epochs | 15
Image Dimension | 160x160
Batch Size | 32
Train-Validation-Test Split Ratio | 70-15-15
Loss Function | Binary Crossentropy
Performance Metric | Accuracy
Dropout | Applied (0.2)
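Read as a sketch, the configuration in Table 2 could be wired together as follows, assuming Keras and reusing the `preprocess` function and `model` from the earlier sketches; the dataset directory, folder layout, and shuffle seed are hypothetical.

```python
import tensorflow as tf

BATCH_SIZE = 32
EPOCHS = 15

# Load the labeled Kaggle face images (directory layout is hypothetical).
full_ds = tf.keras.utils.image_dataset_from_directory(
    "data/faces", image_size=(160, 160), batch_size=BATCH_SIZE, shuffle=True, seed=42)

# 70-15-15 train/validation/test split, measured in batches.
n = tf.data.experimental.cardinality(full_ds).numpy()
train_ds = full_ds.take(int(0.70 * n)).map(lambda x, y: preprocess(x, y, training=True))
val_ds = full_ds.skip(int(0.70 * n)).take(int(0.15 * n)).map(preprocess)
test_ds = full_ds.skip(int(0.85 * n)).map(preprocess)

# Initial training run with the frozen backbone, then evaluation on the held-out test split.
history = model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS)
test_loss, test_acc = model.evaluate(test_ds)
```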

The initial training process showed promising results, with the model accurately distinguishing between images containing people wearing glasses and those without. However, the model's performance was further enhanced through the fine-tuning process, allowing it to refine its features and achieve near-perfect accuracy on both the training and validation datasets.

This project highlights the potential of deep learning models in solving real-world problems, such as facial recognition and attribute detection.

➢ Future Scope
While the existing model does exceptionally well in identifying glasses, there are a number of avenues for further investigation and improvement.

➢ Enhanced Dataset:
A greater variety of photos with different backdrops, lighting, and poses could improve the model's resilience and generalization capabilities.

➢ Fine-Tuning Strategies:
A variety of fine-tuning techniques, such as modifying the learning rate schedules or unfreezing different layers of the pre-trained model, can be investigated in order to maximize the model's performance.

➢ Multi-Class Classification:
The model's potential applications in many domains can be increased by extending its classification capabilities to cover multiple sets of photos, such as various styles of eyeglasses or facial traits.

➢ Real-Time Implementation:
Real-time applications such as smart glasses or surveillance systems would require the model to be optimized in terms of inference speed and resource efficiency.

By concentrating on these areas for development, further iterations of the glasses detection model can improve practicality and accuracy in real-world scenarios.

Fig 5 MobileNet Architecture


REFERENCES

[1]. S. Bekhet and H. Alahmer, "A robust deep learning approach for glasses detection in non-standard facial images," IET Biometrics, vol. 10, no. 1, pp. 74–86, 2021.
[2]. K. Weiss, T. M. Khoshgoftaar, and D. Wang, "A survey of transfer learning," Journal of Big Data, vol. 3, pp. 1–40, 2016.
[3]. A. Fernández, R. García, R. Usamentiaga, and R. Casado, "Glasses detection on real images based on robust alignment," Machine Vision and Applications, vol. 26, pp. 519–531, 2015.
[4]. Z. Jing and R. Mariani, "Glasses detection and extraction by deformable contour," in Proceedings 15th International Conference on Pattern Recognition (ICPR-2000), vol. 2, pp. 933–936, IEEE, 2000.
[5]. K.-Y. Kim and K.-B. Song, "Eyeball tracking and object detection in smart glasses," in 2020 International Conference on Information and Communication Technology Convergence (ICTC), pp. 1799–1801, IEEE, 2020.
[6]. B. Wu, H. Ai, and R. Liu, "Glasses detection by boosting simple wavelet features," in Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 1, pp. 292–295, IEEE, 2004.
[7]. H. Le, T. Dang, and F. Liu, "Eye blink detection for smart glasses," in 2013 IEEE International Symposium on Multimedia, pp. 305–308, IEEE, 2013.

