Face Recognition
Architectures
Abstract— The progression of information discernment via facial identification and the emergence of innovative frameworks has exhibited
remarkable strides in recent years. This phenomenon has been particularly pronounced within the realm of verifying individual credentials, a
practice prominently harnessed by law enforcement agencies to advance the field of forensic science. A multitude of scholarly endeavors have
been dedicated to the application of deep learning techniques within machine learning models. These endeavors aim to facilitate the extraction
of distinctive features and subsequent classification, thereby elevating the precision of unique individual recognition. In the context of this
scholarly inquiry, the focal point resides in the exploration of deep learning methodologies tailored for the realm of facial recognition and its
subsequent matching processes. This exploration centers on the augmentation of accuracy through the meticulous process of training models
with expansive datasets. This paper presents a comprehensive survey of the diverse strategies used in facial
recognition and examines the challenges that underlie facial recognition in image analysis.
C. ResNet.
The architectural foundation of deep Convolutional Neural Network (CNN) designs rests on the notion
that, as network depth increases, together with the use of an array of nonlinear mappings and the
construction of more intricate feature hierarchies, the [...]
Furthermore, it impairs the propagation of pertinent information through the feature map during the
feed-forward process, a drawback that cannot be ignored. In addition, the ResNet architecture entails
an exceptionally high computational cost, which must be taken into careful consideration.

Figure 6. Architecture ResNet [20]

D. Region-Based Convolutional Neural Network (R-CNN).
Region-based Convolutional Neural Networks, or R-CNN, emerged as a significant advance in computer
vision. In 2014, Ross Girshick and his collaborators presented R-CNN as a robust solution to the
challenge of effective object localization in object recognition tasks. The fundamental problem
addressed by R-CNN stems from the inherent inefficiency of Convolutional Neural Networks (CNNs) in
swiftly and accurately pinpointing objects of interest. This inefficiency arises from the nature of
CNNs, which extract features directly from the input data. Consequently, the conventional approach to
identifying a specific object within an image takes considerable computation time. One of the primary
limitations of a traditional convolutional network followed by a fully connected layer lies in the
variability of the output layer's size. Rather than having a fixed size, the output of such a network
can assume variable dimensions, because an image may contain an unpredictable number of instances of
various objects. This unpredictability further complicates object localization and recognition within
the image data.

Figure 7. Architecture R CNN [21]

Using a CNN to classify the presence of objects within various regions of interest in an image is a
direct and pragmatic approach to this challenge. The Region-based Convolutional Neural Network (R-CNN)
method comprises three sequential steps. The first step identifies a set of salient point detections
within the image by generating region proposals that are independent of object categories, thereby
creating a preliminary selection of regions of interest. In the second step, a deep convolutional
neural network (specifically, AlexNet) extracts feature vectors from the identified regions of
interest. These feature vectors encapsulate the discriminative information necessary for object
classification. The final step employs a Support Vector Machine (SVM) classifier, which uses the
feature vectors to assign object labels to the regions of interest. However, the performance of this
approach may be hindered in real-time applications. The primary constraint arises from the need to
partition each image into a large number of regions, often exceeding 2000. This computational overhead
may lead to suboptimal results in scenarios requiring real-time responsiveness.

E. GoogLeNet
In the paper "Going Deeper with Convolutions," released in 2014 [23], a team of researchers affiliated
with Google introduced what has since become widely known as GoogLeNet, also referred to as
Inception-V1. This architecture won the 2014 ILSVRC image classification competition. Compared with
prior CNN architectures, GoogLeNet demonstrated a notably lower error rate, marking a pivotal
achievement in deep learning. The overarching objective of the GoogLeNet architecture was high accuracy
in image classification while remaining judicious with computational resources. The network is 22
layers deep, or 27 layers when pooling layers are counted. Within this framework, the researchers
integrated 1x1 convolutional layers in conjunction with average pooling. An inherent challenge in the
development of GoogLeNet was overfitting. Given the depth of the network, there was a palpable risk of
an excessively specialized model that performed well on the training data but struggled to generalize.
In response, the GoogLeNet architecture diverged from the conventional wisdom of simply deepening the
network and instead broadened it: filters of varying sizes operate in parallel on the same level.
Yet, the intricacy of GoogLeNet's architecture came with its own
set of complications. A salient issue was the heterogeneous topology, which necessitated intricate
module-to-module modifications and posed a considerable challenge for design and implementation.
Additionally, the architecture suffered from a bottleneck within its representation flow. This
bottleneck significantly compressed the feature space in subsequent layers, occasionally leading to the
loss of pivotal data and adversely affecting the model's overall performance and robustness.

TABLE II. COMPARATIVE STUDY OF VARIANTS OF CNN.

Architecture  Origin  Advantages                                   Applications
LeNet         1998    1. Pioneer in CNNs.                          1. Handwritten digit recognition
                      2. Efficient for small image recognition        (MNIST dataset).
                         tasks.                                    2. Early character recognition.
                      3. Utilizes convolution and pooling layers.
AlexNet       2012    1. Introduced deep CNNs.                     1. Image classification (ImageNet challenge).
                      2. Utilizes ReLU activation and dropout.     2. Object detection.
                      3. GPU acceleration for training.            3. Image segmentation.
ResNet        2015    1. Deep architectures without the            1. Image classification (ImageNet challenge).
                         vanishing-gradient problem.               2. Object detection (e.g., Faster R-CNN).
                      2. Improved training of very deep networks.  3. Semantic segmentation.
R-CNN         2013    1. Combines region proposals with CNNs.      1. Object detection and localization.
                      2. Achieved state-of-the-art results in      2. Image segmentation.
                         object detection tasks.
GoogLeNet     2014    1. Inception modules for efficient and deep  1. Image classification (ImageNet challenge).
                         networks.                                 2. Object detection (e.g., YOLO).
                      2. Reduces the number of parameters.

In this exposition, we have covered the basic principles underpinning Convolutional Neural Networks
(CNNs). CNNs are a dependable and effective deep learning methodology, particularly suited to image
processing. They excel in image-related tasks such as facial recognition, image categorization, and
object detection. One of the salient virtues of CNNs is their capacity for feature extraction without
human intervention. Nonetheless, certain intrinsic limitations of CNNs must be acknowledged. Firstly,
CNNs do not encode information about an object's spatial location or orientation. Consequently, when an
object undergoes slight changes in position or orientation, it may fail to activate the neural pathways
responsible for its recognition. Additionally, training can become protracted, especially when a CNN
has numerous layers and the computational capabilities of the GPU are limited. Another notable drawback
of CNNs is their appetite for large volumes of training data, which makes them relatively slow in terms
of processing speed. Furthermore, the pooling layer, an integral component of CNN architecture, tends to
overlook the relationship between localized features and the holistic context, resulting in appreciable
information loss. For instance, when discerning facial features from a video feed, a considerable degree
of data dependency is required. Furthermore, CNNs are not ideally suited to time series problems. Their
extensive parameterization, comprising millions of tunable parameters, makes them prone to
underperformance on inadequately sized datasets. Abundant data, conversely, makes CNNs more robust and
tends to yield better performance. To mitigate these limitations, a judicious strategy is to combine
the CNN with other neural network paradigms such as Recurrent Neural Networks (RNNs), Long Short-Term
Memory (LSTM) networks, or alternative approaches. This fusion can improve computational efficiency and
substantially increase the efficacy of the CNN, particularly on complex, multifaceted tasks.

V. PRACTICAL SCENARIOS FOR FACE RECOGNITION.
Face recognition technology has a wide range of practical scenarios across various industries and
applications. Here are some practical scenarios for face recognition with explanations:
Access Control and Security: Facility Access: In office buildings or secure facilities, employees can
gain access by simply having their faces recognized, enhancing security and convenience. Airport
Security: Facial recognition can expedite the passenger screening process at airports, identifying
individuals on watch lists or verifying their identity.
Mobile Device Authentication: Smartphones: Users can unlock their smartphones or authorize mobile
payments by facial recognition, adding an extra layer of security to their devices.
Payment Authorization: Retail Payments: Customers can make payments at stores or online by simply
looking at a camera, reducing the need for physical cards or passwords.
Healthcare: Patient Identification: Hospitals can accurately identify patients to prevent medical
errors and ensure that the right patient receives the right treatment.
Law Enforcement and Public Safety: Criminal Identification: Police departments can quickly identify
suspects in crowds or match suspects to existing databases, aiding in crime prevention and solving
cases.
Attendance Tracking: Schools and Universities: Educational institutions can track student and faculty
attendance automatically, streamlining administrative tasks.
Customer Service: Retail and Hospitality: Businesses can use facial recognition to personalize customer
experiences, recognize loyal customers, and improve service.
Human Resources: Time and Attendance: Companies can automate employee attendance tracking, reducing
errors and ensuring fair compensation.
Public Events and Venues: Ticketless Entry: Attendees at concerts, sporting events, and amusement parks
can gain entry by having their faces scanned, reducing ticket fraud.
Smart Homes or Home Automation: Homeowners can use facial recognition to control smart home devices,
customize settings, and enhance security.
Retail Analytics or Customer Insights: Retailers can gather data on customer demographics, behavior,
and shopping preferences, enabling targeted marketing strategies.
Customized Advertising or Digital Signage: Advertisers can display personalized ads based on the age
and gender of individuals passing by digital billboards.
Aging and Healthcare Monitoring: Aging Population: Face recognition can help monitor the health and
well-being of the elderly by detecting changes in facial expressions or vital signs.
Authentication in Banking: ATM Access: Banks can enhance ATM security by adding facial recognition as a
biometric authentication method.
Visitor Management: Corporate Offices: Companies can streamline visitor check-ins and enhance security
by using facial recognition for visitor management.
Forensics: Criminal Investigations: Law enforcement agencies can use facial recognition to identify
potential suspects from surveillance footage or composite sketches.
Contactless Check-in at Hotels: Hospitality Industry: Guests can check into hotels without physical
contact, improving the check-in process and safety during a pandemic.
Customized Healthcare Treatment: Medical Diagnosis: Facial recognition can assist in diagnosing certain
medical conditions by analyzing facial features and expressions.
Search and Rescue Operations or Emergency Response: In disaster scenarios, facial recognition can help
locate missing persons by matching faces with databases of survivors.

VI. CHALLENGES AND COMPLICATIONS IN THE SPHERE OF FACE RECOGNITION
Face recognition technology has made significant advancements in recent years, but it still faces
several challenges. Here are some of the key challenges in face recognition:
Privacy Concerns:
• Data Privacy: The collection and storage of facial data raise privacy concerns, especially when used
without individuals' consent or knowledge.
• Surveillance: Widespread use of facial recognition in public spaces can lead to mass surveillance
concerns and potential abuse by governments and corporations.
Accuracy and Robustness:
• Variability: Faces can vary significantly due to lighting conditions, angles, facial expressions, and
occlusions, making it challenging to achieve consistently high accuracy.
• Adversarial Attacks: Face recognition systems can be vulnerable to attacks that involve modifying or
adding noise to input images to deceive the system.
Security Risks:
• Spoofing: Attackers can use photos, videos, or 3D masks to trick face recognition systems,
compromising security.
• Privacy Invasion: Criminals or unauthorized individuals can use stolen biometric data to impersonate
others or gain access to sensitive information.
Regulatory and Legal Challenges:
• Lack of Standards: The absence of comprehensive regulations and standards can lead to inconsistent
deployment and ethical concerns.
• Legislation: Governments are still working to create appropriate legal frameworks to address the
ethical and privacy implications of face recognition.
Scalability and Performance:
• Real-time Processing: Achieving real-time performance on a large scale, such as in crowded public
spaces, remains a technical challenge.
• Hardware Constraints: Some applications may require specialized hardware to perform face recognition
efficiently.
Aging and Long-term Changes:
• Aging: Over time, people's faces change due to aging, which can reduce the accuracy of recognition
systems.
• Lifestyle Changes: Significant lifestyle changes, such as weight loss or gain, can also affect facial
recognition accuracy.
Environmental Factors:
• Environmental conditions such as poor lighting, weather, or low-resolution images can affect the
performance of face recognition algorithms.
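The accuracy and security challenges listed above all act on the matching step, which can be
illustrated with a toy sketch. It assumes that a face recognition model has already mapped each face
image to a fixed-length feature vector; the random vectors, the helper names cosine_similarity,
is_same_person and add_noise, and the 0.8 threshold are hypothetical choices made for illustration and
are not taken from the surveyed papers. The sketch compares two vectors by cosine similarity and shows
how added noise, standing in here for lighting changes or a deliberate perturbation, tends to lower the
score until the match is rejected.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_person(a, b, threshold=0.8):
    """Declare a match when the similarity reaches the threshold.

    The 0.8 default is an assumption for this sketch; a real system
    would tune the threshold on validation data.
    """
    return cosine_similarity(a, b) >= threshold


def add_noise(vector, scale, rng):
    """Perturb every component with Gaussian noise of the given scale."""
    return [x + rng.gauss(0.0, scale) for x in vector]


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Stand-in for the feature vector stored at enrolment (hypothetical data).
enrolled = [rng.gauss(0.0, 1.0) for _ in range(128)]

for scale in (0.0, 0.5, 2.0):
    probe = add_noise(enrolled, scale, rng)
    score = cosine_similarity(enrolled, probe)
    matched = is_same_person(enrolled, probe)
    print(f"noise scale {scale}: similarity {score:.3f}, match={matched}")
```

The threshold trades false accepts against false rejects: raising it makes spoofed or perturbed probes
less likely to pass, at the cost of rejecting more genuine probes captured under poor conditions.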
VII. CONCLUSION.
This review summarizes the diverse deep learning methodologies that have been applied to facial
recognition systems. A thorough examination of the existing literature shows that deep learning
techniques have propelled significant advances in facial recognition. Many publications have not only
offered insightful perspectives but have also implemented methodologies addressing various facets of
face recognition, such as multiple facial expressions, temporal invariance, variations in facial
weight, and fluctuations in illumination conditions. Even so, the number of academic articles on deep
learning for facial recognition remains relatively modest. Taken together, the evaluations surveyed
indicate that modified Convolutional Neural Network (CNN) variants, tailored specifically for facial
recognition, show significant promise. This observation points to substantial scope for continued
research employing deep learning techniques to further enhance facial recognition systems. The review
also finds that, among the deep learning approaches currently in use, the transfer-learning strategy
has seen relatively sparse adoption in facial recognition systems. Future research should therefore
focus on refining and extending facial recognition through the judicious application of deep learning
methodologies. This emerging area calls for further exploration and experimentation, which promise to
improve the efficacy and reliability of facial recognition systems.

REFERENCES
[1] Peng Lu, Baoye Song, Lin Xu. "Human face recognition based on convolutional neural network and
augmented dataset." Systems Science & Control Engineering, 2020.
[2] Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou. "ArcFace: Additive Angular Margin Loss for
Deep Face Recognition." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[3] Jun-Cheng Chen, Rajeev Ranjan, Swami Sankaranarayanan, Amit Kumar, Ching-Hui Chen, Vishal M. Patel,
Carlos D. Castillo, Rama Chellappa. "Unconstrained Still/Video-Based Face Verification with Deep
Convolutional Neural Networks." Springer, 2017.
[4] Carolina Toledo Ferraz, Jose Hiroki. "A Comprehensive Analysis of Local Binary Convolution Neural
Network for Fast Face Recognition in Surveillance Video." ACM, 2018.
[5] Nate Crosswhite, Jeffrey Byrne, Chris Stauffer, Omkar Parkhi, Qiong Cao, Andrew Zisserman.
"Template Adaptation for Face Verification and Identification." 12th International Conference on
Automatic Face & Gesture Recognition, IEEE, 2017.
[6] Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, Wei Liu.
"CosFace: Large Margin Cosine Loss for Deep Face Recognition." Conference on Computer Vision and
Pattern Recognition, IEEE, 2018.
[7] Ran He, Xiang Wu, Zhenan Sun, Tieniu Tan. "Wasserstein CNN: Learning Invariant Features for NIR-VIS
Face Recognition." IEEE, 2017.
[8] Yibo Ju, Lingxiao Song, Bing Yu, Ran He, Zhenan Sun. "Adversarial Embedding and Variational
Aggregation for Video Face Recognition." IEEE, 2018.
[9] S, D. A. "CCT Analysis and Effectiveness in e-Business Environment." International Journal of New
Practices in Management and Engineering, 10(01), 16–18, 2021.
https://doi.org/10.17762/ijnpme.v10i01.97
[10] X. Wang, Y. Lu, Z. Wang, J. Feng. "Deep discriminative feature learning for face verification."
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. "Deep Residual Learning for Image Recognition."
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[12] Florian Schroff, Dmitry Kalenichenko, James Philbin. "FaceNet: A unified embedding for face
recognition and clustering." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[13] Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf. "DeepFace: Closing the Gap to
Human-Level Performance in Face Verification." IEEE Conference on Computer Vision and Pattern
Recognition, 2014.
[14] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face
Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[15] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face
Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[16] Peng Lu, Baoye Song, Lin Xu. "Human face recognition based on convolutional neural network and
augmented dataset." Systems Science & Control Engineering, 2020.
[17] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. "Gradient-based learning applied to document
recognition." Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[18] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face
Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[19] Asifullah Khan et al. "A survey of the recent architectures of deep convolutional neural
networks." Artificial Intelligence Review, 2020.
[20] https://www.google.com/search?sca_esv=561848188&q=alexnet+architecture&tbm=isch&source=lnms&sa=X&ved=2ahUKEwje9aWa3IiBAxVyTmwGHfcfDQQQ0pQJegQIDBAB&biw=1366&bih=619&dpr=1#imgrc=xqC2QyZ_mjTNqM.
[21] Zubin C. Bhaidasna, Priya R. Swaminarayan. "A Survey on Convolution Neural Network for Face
Recognition." Journal of Data Acquisition and Processing, Vol. 38 (2), 2023.
[22] https://www.researchgate.net/figure/Block-diagram-of-Faster-R-CNN_fig1_339463390.
[23] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru
Erhan, Vincent Vanhoucke, Andrew Rabinovich. "Going Deeper with Convolutions." Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (CVPR).