Cyb306 Feature Extraction

TECHNOLOGY
P.M.B. 1526
IMO STATE

THE DEPARTMENT OF CYBER SECURITY

CYB 306: BIOMETRIC SECURITY

REPORT ON: FEATURE EXTRACTION

SUBMITTED TO: MRS FESTUS

PRESENTED BY: 20211290062
Introduction
In biometric security systems, feature extraction is a critical process that transforms raw data, such as
images or text, into a format that can be processed and analyzed by machine learning algorithms. The
objective of feature extraction is to derive a set of significant characteristics from raw data that can be
used for efficient pattern recognition and decision-making. In the context of biometric security, feature
extraction involves identifying and quantifying the unique attributes of biometric data—such as
fingerprints, facial features, or voice patterns—to create a template that can be used for
authentication or identification purposes.
For textual data in biometric systems, particularly in behavioral biometrics (e.g., typing patterns or
speech analysis), feature extraction can be achieved using natural language processing (NLP) techniques.
Three primary methods of feature extraction in the context of textual data include the Bag of Words
(BoW), Word to Vector (Word2Vec), and large pre-trained NLP models. Each of these methods plays a
pivotal role in transforming raw text data into a structured and meaningful format for biometric
security.
1. Bag of Words (BoW)
Process
Bag of Words represents text by counting how often words appear in a document, without considering
grammar or word order. Each unique word becomes a feature, and its value is the number of times it
appears.
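As a minimal illustration of this process (a sketch assuming Python with scikit-learn installed; the two sample sentences are invented for demonstration), the BoW representation can be computed as follows:

    from sklearn.feature_extraction.text import CountVectorizer

    # Hypothetical text samples, e.g., short utterances logged by a behavioral biometric system
    documents = [
        "access granted to the user",
        "the user requested access again",
    ]

    # Each unique word becomes a feature; each value is a raw occurrence count
    vectorizer = CountVectorizer()
    matrix = vectorizer.fit_transform(documents)

    print(vectorizer.get_feature_names_out())  # the learned vocabulary
    print(matrix.toarray())                    # one count vector per document

Note that the resulting vectors record only counts: word order and grammar are discarded, which is the limitation discussed below.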
Advantages
Simplicity: The model is easy to implement and interpret, since each feature is just a word count.
Efficiency: For smaller tasks and vocabularies, BoW is fast to compute and requires few resources.
Disadvantages
Ignores context: Since word order and semantics are not captured, the model may lose
important information regarding the meaning of the text.
Sparsity: The vector representations tend to be sparse, especially for large vocabularies, which
can negatively impact computational efficiency.
Vulnerable to dimensionality issues: With a growing number of unique words, the vector space
becomes extremely large, making it hard to manage and leading to overfitting.
2. Word2Vec
Process
Word2Vec is a word embedding technique that represents words as dense vectors in a continuous vector space. These vectors are learned so that similar words (like "king" and "queen") lie close to each other in the vector space. Word2Vec trains a shallow neural network on the contexts in which words appear, using one of two architectures: Continuous Bag of Words (CBOW), which predicts a word from its surrounding context, and Skip-gram, which predicts the surrounding context from a word.
Example: "King" and "Queen" might have similar vector representations because they share similar
contexts (e.g., both royalty). These vectors capture semantic relationships.
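A minimal sketch of training such embeddings, assuming Python with the gensim library installed (the toy corpus below is invented and far too small to learn meaningful embeddings; it only illustrates the API):

    from gensim.models import Word2Vec

    # Toy corpus: each document is a list of tokens (already tokenized text)
    sentences = [
        ["the", "king", "rules", "the", "kingdom"],
        ["the", "queen", "rules", "the", "kingdom"],
        ["typing", "rhythm", "identifies", "the", "user"],
    ]

    # sg=1 selects the Skip-gram architecture; sg=0 would select CBOW
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

    vector = model.wv["king"]                    # a 50-dimensional embedding
    print(model.wv.similarity("king", "queen"))  # cosine similarity of the two embeddings

In practice the corpus would contain millions of tokens, which is exactly the data requirement listed under the disadvantages below.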
Advantages
Captures semantics: Word2Vec captures the semantic relationships between words, making it
highly effective for tasks that involve understanding the meaning and context of the text.
Disadvantages
Requires large datasets: Word2Vec requires large amounts of text data to effectively capture the
relationships between words.
Computationally intensive: Training the model can be time-consuming and requires significant
computational resources, especially for larger datasets.
Context window limitations: The fixed window size in CBOW and Skip-gram can limit the model's
ability to capture long-range dependencies in the text.
3. Large Pre-trained NLP Models
Large pre-trained NLP models, such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), represent the cutting edge in feature extraction techniques. These models are trained on massive datasets and can capture complex semantic and syntactic patterns in text. Unlike BoW or Word2Vec, a model such as BERT considers the full context of each word within a sentence, attending to the words both before and after it (bidirectional context).
Process
Pre-training: The model is trained on a large corpus of text data to learn general language
representations.
Fine-tuning: The pre-trained model is then fine-tuned on a specific task or dataset, such as
biometric security data.
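As a minimal sketch of the feature-extraction step in this pipeline (assuming Python with Hugging Face's transformers library and PyTorch installed; the model name and input text are illustrative):

    import torch
    from transformers import AutoTokenizer, AutoModel

    # Load a publicly available pre-trained BERT model and its tokenizer
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    # An illustrative text sample, e.g., a transcribed utterance from a behavioral biometric system
    inputs = tokenizer("the user logged in at dawn", return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)

    # The [CLS] token's final hidden state is commonly used as a sentence-level feature vector
    features = outputs.last_hidden_state[:, 0, :]  # shape: (1, 768)
    print(features.shape)

Fine-tuning would then train a small task-specific head on top of such features using labeled biometric data.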
Advantages
Contextual understanding: Large pre-trained models capture both short-term and long-term
dependencies in the text, leading to a deeper understanding of word meaning and context.
Transfer learning: These models can be fine-tuned on specific tasks without the need for large
labeled datasets, making them highly adaptable.
High performance: Pre-trained models achieve state-of-the-art results on a wide range of NLP
tasks, including biometric security.
Disadvantages
Computational cost: Training and even fine-tuning these large models demand far more memory and processing power than BoW or Word2Vec.
Overfitting risk: Without careful fine-tuning, the model may overfit to specific tasks, reducing its generalizability.
Conclusion
Feature extraction plays a critical role in biometric security, especially when dealing with textual or
behavioral biometric data. The Bag of Words model offers simplicity and efficiency for smaller tasks but
is limited in its ability to capture context. Word2Vec improves upon this by embedding semantic
relationships between words, making it more suitable for nuanced biometric applications. Large pre-
trained NLP models, such as BERT and GPT, offer state-of-the-art performance by capturing complex
contextual relationships, albeit at a higher computational cost.
The choice of feature extraction method should depend on the specific needs of the biometric system,
the type of data being processed, and the available computational resources. As biometric security
continues to evolve, leveraging advanced feature extraction techniques will be key to developing more
accurate and robust systems.
References
1. Jurafsky, D., & Martin, J. H. (2020). Speech and Language Processing (3rd ed.). Pearson. This textbook provides a comprehensive overview of natural language processing techniques, including the Bag of Words and Word2Vec models, and their applications in various domains like biometric security.
2. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv. This paper introduces Word2Vec and its two algorithms, CBOW and Skip-gram, explaining how these methods capture semantic relationships between words.
3. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. This paper details the development of the BERT model, which revolutionized feature extraction in natural language processing by considering the full context of words in a sentence.
4. Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS). This paper introduces the Transformer model, the foundation of large pre-trained NLP models like BERT and GPT, which are used for advanced feature extraction in biometric security applications.
5. Zhu, J., Zhang, Z., Zhou, Q., & You, X. (2021). A Survey on Biometric Security: From Basic to Advanced Concepts. IEEE Access, 9, 102529-102555. This survey provides a detailed exploration of various biometric security methods, including behavioral biometrics, and discusses the role of feature extraction in enhancing the accuracy and security of these systems.
6. Sethi, A., & Jain, R. (2018). Keystroke Dynamics and Behavioral Biometrics: Methods and Applications. In Proceedings of the 9th International Conference on Biometrics. IEEE. This paper explores the application of feature extraction techniques, such as Bag of Words and Word2Vec, in behavioral biometrics, particularly in keystroke dynamics for biometric security.
7. Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A Neural Probabilistic Language Model. Journal of Machine Learning Research, 3, 1137-1155. This foundational paper discusses neural network-based language models, which laid the groundwork for later methods such as Word2Vec and large pre-trained models like BERT.