
A Project Report

on
HAND GESTURE CONTROLLED WHITE BOARD
Submitted for partial fulfillment of award of the degree of

BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE & ENGINEERING

2023-2024
Under the Guidance of:                     Submitted By:
Mr. Shomil Bansal                          Dhruv Singhal (2002220100060)
Asst. Professor (CSE)                      Dhananjay Vashishta (2002220100056)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


I.T.S ENGINEERING COLLEGE, 46, KNOWLEDGE PARK-III,
GREATER NOIDA

Affiliated to Dr. A.P.J. Abdul Kalam Technical University, Lucknow


May 2024
Final Project Report

1. Course : Bachelor of Technology


2. Semester : VIIIth
3. Branch : Computer Science & Engineering
4. Project Title : Hand Gesture Controlled White Board
5. Details of Students:
S. No.   Roll No.        Name                   Role                                   Signature
1        2002220100060   Dhruv Singhal          Data Collection, Training the Model
2        2002220100056   Dhananjay Vashishta    Testing and Evaluation

6. SUPERVISOR: Mr. Shomil Bansal


(Assistant Professor)

Remarks from Project Guide:

………………………………………………………………………………
………………………………………………………………………………
………………………………………………………………………………
………………………………………………………………………………
………………………………………………………………………………
DECLARATION BY STUDENTS

This is to certify that the Project Report on “Hand Gesture Controlled White Board” by
Dhruv Singhal and Dhananjay Vashishta has been submitted for the partial fulfilment of the
requirements of B.Tech. in Computer Science and Engineering (CSE). We certify that the
report is our own original work and is free from any plagiarism.

Dhruv Singhal (2002220100060)


Dhananjay Vashishta (2002220100056)
CERTIFICATE

This is to certify that the Project Report on “Hand Gesture Controlled White Board” by Dhruv
Singhal and Dhananjay Vashishta has been submitted for the partial fulfilment of the
requirements of B.Tech. in Computer Science and Engineering (CSE). The work was carried
out under my supervision and is free from plagiarism.

Mr. Shomil Bansal


Asst. Professor-CSE
(Supervisor)

Dr. Ashish Kumar


Head of Department-CSE
ACKNOWLEDGEMENT

It gives us a great sense of pleasure to present the report of the project “Hand
Gesture Controlled White Board” undertaken during the B.Tech. final year. First and foremost,
we wish to thank our guide, Prof. Shomil Bansal (Department of Computer Science and
Engineering, I.T.S. Engineering College, Greater Noida), for his kind blessings. He
allowed us the freedom to explore, while at the same time providing us with invaluable
insight without which this project would not have been possible.
We would also like to take this opportunity to acknowledge the contribution of all faculty
members of the Department for their kind assistance and cooperation during the
development of our project.

Dhruv Singhal Dhananjay Vashishta


(2002220100060) (2002220100056)
ABSTRACT

The artificial intelligence subfield of human-computer interaction entered a new phase with the advent of
hand gesture recognition technology. The goal of this project is to develop a virtual whiteboard that can be
operated through hand gestures made with the fingertips. No hardware input device is needed because the
gestures are made directly by the user's hand. The main constraint of this system is that it demands a well-lit
environment. The user's hand motions are captured by a camera and recognized by the system through
image processing, which then carries out the corresponding tasks on the whiteboard. A white canvas is
provided on which the user can draw.

Keywords: Gesture, Recognition, Human Computer Interaction, Computer Vision, Artificial Intelligence.
TABLE OF CONTENTS

1. INTRODUCTION 1-3
1.1 PROJECT INTRODUCTION 1
1.2 PROBLEM STATEMENT 2
1.3 OBJECTIVE 3
2. LITERATURE SURVEY BACKGROUND 4-5
3. SOFTWARE DESIGN 6 - 15
4. REQUIREMENTS AND METHODOLOGY 16 - 18
4.1 REQUIREMENTS 16
4.1.1 HARDWARE REQUIREMENTS 16
4.1.2 SOFTWARE REQUIREMENTS 16
4.2 METHODOLOGY 16
4.2.1 SETUP 16
4.2.2 DATASET COLLECTION 16
4.2.3 FEATURE EXTRACTION 17
4.2.4 TRAINING THE NEURAL NETWORK 17
4.2.5 TESTING AND EVALUATION 18
5. CODING/CODE TEMPLATES 19 - 29
6. TESTING 30 - 32
7. RESULT AND DISCUSSION 33 - 34
7.1 OBSERVATION 34
8. CONCLUSION AND FUTURE WORK 35 - 36
8.1 CONCLUSION 35
8.2 FUTURE WORK 35 - 36
9. REFERENCES 37 - 38
LIST OF FIGURES

CHAPTER NO. FIGURE NO. TITLE PAGE NO.


3 Figure 3.1 Hand Landmark Model Bundle 6
3 Figure 3.2 Convolutional Neural Network 8
3 Figure 3.3 Convolutional Layer 9
3 Figure 3.4 Activation Functions 11
3 Figure 3.5 System Architecture Diagram 12
3 Figure 3.6 Proposed System 15
6 Figure 6.1 Hand Detection 30
6 Figure 6.2 Drawing Mode 30
6 Figure 6.3 Erase Mode 31
6 Figure 6.4 Dim Light 31
6 Figure 6.5 Skin Colored Background 31
6 Figure 6.6 Distance more than 1 meter 32
6 Figure 6.7 Bright Light Source in Background 32
CHAPTER 1

INTRODUCTION
In the following sections, a brief introduction and the problem statement for the work have been included.

1.1 Project Introduction


The user and the computer can interact or communicate through a variety of input devices, such as a
keyboard, mouse, and other pointing devices. However, using the hand to make gestures will be a more
instinctive, easy and natural way to communicate.

The reason for choosing this topic:


There are many whiteboard applications, each with its own advantages and disadvantages. One important
drawback of existing whiteboard software is the need for precise pointing devices: it relies on traditional
writing devices such as a mouse, light pens, etc., which must be mastered to draw or write accurately. Here
we introduce the Hand Gesture Controlled White Board, a drawing application based on hand gestures that
uses your hand movements and gestures to draw on and interact with a virtual board.
The user does not need an external hardware pointing device to draw. For drawing, the different movements
of the fingertips are used.
These hand gestures are captured via a camera and processed to recognize the gestures and perform the
appropriate action.
Since graphical user interfaces have replaced text-based interfaces as the primary form of human-computer
interaction, direct hand input is a promising alternative.
Hand movement recognition can be viewed as a way for computers to begin comprehending body language,
creating a stronger understanding between humans and computers and, in turn, improving productivity.
Controlling computer systems via hand movements is more user-friendly and natural, since there is no
requirement to master other input devices.
Advantages of using our proposed system are:
● Communication – the user can interact with the system in many ways.
● Workspace – only a simple workspace is required, as the system needs just a camera for interaction.
● Easy to use – there is no requirement to master input devices, as only the hand is used to interact with the system.
● Simple installation – no external input devices need to be installed.

1
1.2 Problem Statement

The problem statement for the present work can be stated as follows:
The use of whiteboard software in the software industry offers numerous advantages but often requires
precise pointing devices for optimal functionality. This reliance on external hardware poses limitations and
inconveniences to users. To address this issue, this major project aims to develop a Hand gesture Controlled
White Board, a writing software that leverages hand gestures as the primary input method. By utilizing hand
movements and fingertip gestures, users will be able to write and draw without the need for traditional input
devices like a mouse or light pens.
The primary objective of this project is to design and implement a Hand gesture Controlled White Board that
provides a natural and intuitive writing experience. This system will overcome the limitations associated with
conventional writing software by harnessing computer vision technology and hand gesture recognition
algorithms.

The project will focus on the following key aspects:


Hand Gesture Recognition: Developing robust algorithms to accurately recognize and interpret hand gestures,
allowing users to perform writing and drawing actions seamlessly.
User Interface Design: Designing an intuitive and user-friendly interface that visualizes hand gestures and
facilitates smooth interaction with the virtual whiteboard.
Usability and Performance Evaluation: Conducting thorough usability tests and performance evaluations to
assess the effectiveness and efficiency of the Hand gesture Controlled White Board. This will involve
gathering feedback from users to improve and optimize system performance.

Integration and Further Improvements: Exploring opportunities to integrate the developed system with
existing whiteboard software, enhancing its functionalities and extending its applications. Additionally,
identifying areas for future enhancements and updates to advance the current capabilities of the system.
By addressing these aspects, the project aims to revolutionize the way users interact with Hand gesture
Controlled White Board by eliminating the need for external hardware and providing a more natural and
intuitive writing experience. The success of this project will contribute to the advancement of hand gesture
technology and its potential applications in various domains, including education, collaboration, and creative
design.


2
S.No.  Problem                 Solution

1      User experience         Better user experience by using gestures

2      Limited functionality   Multiple gestures can be supported

3      External device         No need for an extra device

4      Background noise        No background noise with gestures

5      Learning curve          Hand gestures are easy to learn and remember

6      Cost                    No need for an extra device

7      Precision               The hand is more precise than a mouse

Table 1.1 Problems and possible solutions

1.3 Objectives

The objectives of the proposed work are as follows:

1. To design an application that can control a virtual whiteboard using hand gestures, with features like
writing, drawing, using various colors and shapes, and erasing.

2. To develop a user-friendly interface which is easy to learn and use.

3. To apply different machine learning algorithms best suited for hand gesture recognition and compare their
performance.

3
CHAPTER 2

LITERATURE SURVEY

This literature survey aims to explore the existing research on hand gesture recognition techniques and their
applications in the context of a hand gesture-controlled whiteboard project. By reviewing a collection of
relevant studies, this survey will provide insights into the methods, algorithms, and technologies employed in
this area.
Panwar and Mehra [1] discuss the use of hand gestures as an intuitive means of interaction between humans
and computers. They present a framework for hand gesture recognition using depth and color information.
The authors propose a system that employs a depth sensor and color camera to capture hand gestures,
followed by segmentation, feature extraction, and classification using machine learning algorithms.
Sonkar et al. [2] present a virtual teaching board system that utilizes computer vision techniques for hand
gesture recognition. They employ OpenCV, an open-source computer vision library, to track hand
movements and recognize gestures. The authors discuss the implementation of various gestures for
controlling the virtual board, such as drawing, erasing, and zooming functionalities, to enhance the teaching
experience.
Abhishek et al. [3] propose a hand gesture recognition system using machine learning algorithms. They
evaluate different classifiers, including k-nearest neighbors, support vector machines, and random forests, to
classify hand gestures based on extracted features. The authors achieve promising results in terms of accuracy
and robustness, highlighting the potential of machine learning techniques in hand gesture recognition
applications.
Mazumdar et al. [5] propose a hybrid approach for hand gesture recognition that combines tracking of both
gloved and free hands. The authors present a system that employs background subtraction, hand region
segmentation, and feature extraction techniques to recognize various hand gestures. They evaluate their
approach using a dataset of predefined gestures and achieve promising results in terms of accuracy and
recognition speed.

4
Fernando and Wijayanayaka [6] focus on the application of hand gesture recognition in sign language
recognition systems. They propose a low-cost approach that utilizes depth information captured by a Kinect
sensor for hand tracking and gesture recognition. The authors describe the preprocessing steps, feature
extraction techniques, and classification algorithms employed in their system. They evaluate their approach
using a dataset of sign language gestures and achieve satisfactory recognition accuracy.

Ghotkar et al. [7] propose a hand gesture recognition system specifically designed for the Indian sign
language. They present a framework that involves hand detection, tracking, and feature extraction using
color-based and shape-based approaches. The authors evaluate their system using a dataset of Indian sign
language gestures and report satisfactory recognition accuracy for various gestures.
Chen et al. [8] propose a real-time vision-based hand gesture recognition system using Haar-like features.
They present a methodology that involves hand detection, feature extraction, and classification using a hidden
Markov model. The authors evaluate their system on a dataset of hand gestures and achieve promising results
in terms of recognition accuracy and real-time performance.
Zhu et al. [9] present a vision-based hand gesture recognition system that utilizes depth information captured
by a Kinect sensor. They propose a feature extraction method based on depth histogram analysis and employ
a support vector machine for classification. The authors evaluate their system using a dataset of predefined
hand gestures and demonstrate satisfactory recognition accuracy.
Gajjar et al. [10] propose a real-time painting toolbox controlled by hand gestures using a machine learning
approach. They utilize OpenCV for hand detection and tracking and machine learning algorithms, such as
random forests and support vector machines, for gesture recognition. The authors evaluate their system on a
dataset of predefined gestures and demonstrate its effectiveness in controlling the painting functionalities in
real-time.
Dhule and Nagrare [11] present a computer vision-based human-computer interaction system that utilizes
color detection techniques for hand gesture recognition. They employ color-based segmentation and feature
extraction methods to recognize hand gestures. The authors evaluate their system using a dataset of
predefined gestures and report satisfactory recognition accuracy.
Hartanto et al. [12] propose a real-time hand gesture tracking and recognition system. They utilize a web
camera for hand detection and tracking and employ a combination of color and
motion-based features for gesture recognition. The authors evaluate their system on a dataset of predefined
gestures and achieve satisfactory recognition accuracy in real-time scenarios.

5
CHAPTER 3
SOFTWARE DESIGN

3.1 Introduction
3.1.1 System Architecture
The hand gesture recognition system follows a client-server architecture. The client-side consists of the user
interface and the hand tracking module, while the server-side includes the feature extraction module, neural
network training module, and gesture recognition module.

3.1.2 Gesture recognition using Mediapipe


MediaPipe is an open-source framework developed by Google that provides a comprehensive solution for
building perception pipelines to process audio and visual media data, including hand tracking and gesture
recognition.
MediaPipe Hands is a high-fidelity hand and finger tracking solution. It employs machine learning (ML) to
infer 21 3D landmarks of a hand from just a single frame.
MediaPipe provides a pre-trained hand tracking pipeline that can detect hand landmarks from a live video
stream or recorded video.
The hand landmark model bundle detects the keypoint localization of 21 hand-knuckle coordinates within the
detected hand regions. The model was trained on approximately 30K real-world images, as well as several
rendered synthetic hand models imposed over various backgrounds. The 21 landmarks are defined as shown
below:

Fig. 3.1 Hand landmark model bundle

6
MediaPipe is a cross-platform framework for building machine learning-based pipelines to process and
analyze multimedia data. It was developed by Google and provides a set of pre-built, configurable, and
reusable building blocks for building computer vision, audio, and video processing applications.
The MediaPipe framework allows developers to design and deploy machine learning models for tasks such as
object detection, face detection, pose estimation, hand tracking, gesture recognition, and more. It also
provides a pipeline for real-time data processing that can be integrated into applications for mobile, desktop,
and web platforms.
The MediaPipe algorithm is a collection of algorithms used by the framework to perform various tasks.
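
As an illustration of how the hand tracking pipeline described above is used, the short sketch below infers the 21 hand landmarks from a single image with MediaPipe Hands. The file name 'hand.jpg' is only a placeholder; the API calls mirror those used in the code of Chapter 5.

import cv2
import mediapipe as mp

# Minimal sketch: infer the 21 3D hand landmarks from one image with MediaPipe Hands.
mp_hands = mp.solutions.hands
with mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                    min_detection_confidence=0.5) as hands:
    image = cv2.imread('hand.jpg')                        # placeholder input image
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for idx, lm in enumerate(results.multi_hand_landmarks[0].landmark):
            print(idx, lm.x, lm.y, lm.z)                  # normalized x, y and relative depth z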

3.1.3 PyTorch
Pytorch is an open-source deep learning framework available with a Python and C++ interface. Pytorch
resides inside the torch module. In PyTorch, the data that has to be processed is input in the form of a tensor.

Building Neural Network with PyTorch


We will see this in a stepwise implementation:

1. Dataset Preparation: As everything in PyTorch is represented in the form of tensors, the data should
first be converted into tensors.
2. Building the model: To build a neural network, we first define the number of input layers, hidden
layers, and output layers. We also need to define the initial weights. The values of the weight
matrices are chosen randomly using torch.randn(), which returns a tensor consisting of
random numbers drawn from a standard normal distribution.
3. Forward Propagation: The data is fed to the neural network and a matrix multiplication is performed
between the weights and the input. This can be done easily with torch's tensor operations.
4. Loss computation: The torch.nn module provides multiple loss functions. Loss functions are used to
measure the error between the predicted value and the target value.
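
The four steps above can be illustrated with a minimal, self-contained sketch. The sizes are arbitrary and only for illustration; 63 matches the 21 landmarks × 3 coordinates used later in this project.

import torch
import torch.nn as nn

# 1. Dataset preparation: represent the input as a tensor (4 samples, 63 features each).
x = torch.randn(4, 63)
# 2. Building the model: initial weights drawn from a standard normal distribution.
w = torch.randn(63, 3)
# 3. Forward propagation: matrix multiplication between the input and the weights.
logits = x @ w
# 4. Loss computation: measure the error between predictions and (illustrative) targets.
targets = torch.tensor([0, 1, 2, 0])
loss = nn.CrossEntropyLoss()(logits, targets)
print(loss.item())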

7
Neural networks: MediaPipe uses various neural network models, including convolutional neural networks
(CNNs) and recurrent neural networks.
3.1.4 CNN
Convolutional neural networks, or CNNs, are widely used for image classification, object recognition, and
detection. The CNN architecture includes several building blocks, such as convolution layers, pooling layers,
and fully connected layers. A typical architecture consists of repetitions of a stack of several convolution
layers and a pooling layer, followed by one or more fully connected layers. The step where input data are
transformed into output through these layers is called forward propagation.

There are two main parts to a CNN architecture:

● A convolution tool that separates and identifies the various features of the image for analysis, in a
process called feature extraction. The feature extraction network consists of many pairs of
convolutional and pooling layers.
● A fully connected layer that utilizes the output of the convolution process and predicts the class of
the image based on the features extracted in the previous stages.
The feature extraction part aims to reduce the number of features present in a dataset by creating new
features that summarize the existing features contained in the original set.
There are many CNN layers, as shown in the CNN architecture diagram.

Fig 3.2 Convolutional Neural Network

8
1. Convolutional Layer
This layer is the first layer that is used to extract the various features from the input images. In this layer, the
mathematical operation of convolution is performed between the input image and a filter of a particular size
MxM. By sliding the filter over the input image, the dot product is taken between the filter and the parts of
the input image with respect to the size of the filter (MxM).

The output is termed the feature map, which gives us information about the image such as its corners and
edges. Later, this feature map is fed to other layers to learn several other features of the input image.
The convolution layer passes the result to the next layer once the convolution operation has been applied to
the input. Convolutional layers are particularly beneficial because they keep the spatial relationship between
the pixels intact.

Fig 3.3 Convolutional Layer

2. Pooling Layer
In most cases, a Convolutional Layer is followed by a Pooling Layer. The primary aim of this layer is to
decrease the size of the convolved feature map to reduce the computational cost. It does this by decreasing
the connections between layers and operates independently on each feature map. Depending upon the
method used, there are several types of pooling operations.
In Max Pooling, the largest element is taken from the feature map. Average Pooling calculates the average of
the elements in a predefined size Image section. The total sum of the elements in the predefined section is
computed in Sum Pooling. The Pooling Layer usually serves as a bridge between the Convolutional Layer
and the FC Layer.
9
3. Fully Connected Layer
The Fully Connected (FC) layer consists of the weights and biases along with the neurons and is used to
connect the neurons between two different layers. These layers are usually placed before the output layer and
form the last few layers of a CNN Architecture.

In this stage, the output of the previous layers is flattened and fed to the FC layer. The flattened vector
then passes through a few more FC layers, where the usual mathematical operations take place, and the
classification process begins. Two fully connected layers are used because they perform better than a single
connected layer. These layers in a CNN also reduce the need for human supervision.

4. Dropout
Usually, when all the features are connected to the FC layer, it can cause overfitting in the training dataset.
Overfitting occurs when a model fits the training data so closely that its performance on new data is
negatively impacted.

To overcome this problem, a dropout layer is utilized, wherein a few neurons are dropped from the neural
network during the training process, resulting in a reduced model size. With a dropout rate of 0.3, 30%
of the nodes are dropped out randomly from the neural network. Dropout improves the performance of a
machine learning model because it prevents overfitting by making the network simpler.

5. Activation Functions
Finally, one of the most important parameters of the CNN model is the activation function. Activation
functions are used to learn and approximate any kind of continuous and complex relationship between the
variables of the network. In simple words, they decide which information should be passed forward through
the network and which should not.

They add non-linearity to the network. There are several commonly used activation functions, such as ReLU,
Softmax, tanh, and Sigmoid, each with a specific usage: for a binary classification CNN model, the sigmoid
function is preferred, while for multi-class classification, softmax is generally used. In simple terms, an
activation function in a CNN model determines whether a neuron should be activated or not, i.e., whether its
input is important for the prediction.

10
Fig 3.4 Activation Functions
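
The building blocks described above can be combined into a small network. The sketch below is only illustrative of a CNN (the gesture classifier actually used in this project is the fully connected network shown in Chapter 5); the layer sizes are arbitrary.

import torch
import torch.nn as nn

# Illustrative CNN: convolution -> activation -> pooling, repeated, then flatten,
# fully connected layers with dropout, and finally the class scores.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # convolutional layer (feature maps)
    nn.ReLU(),                                    # activation function
    nn.MaxPool2d(2),                              # max pooling layer
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),                                 # flatten before the FC layers
    nn.Linear(32 * 7 * 7, 128),                   # fully connected layer
    nn.ReLU(),
    nn.Dropout(p=0.3),                            # randomly drops 30% of the neurons
    nn.Linear(128, 10)                            # output scores for 10 classes
)

out = cnn(torch.randn(8, 1, 28, 28))              # a batch of 8 single-channel 28x28 images
print(out.shape)                                  # torch.Size([8, 10])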

OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning
software library. OpenCV was built to provide a common infrastructure for computer vision applications and
to accelerate the use of machine perception in commercial products. Being an Apache 2 licensed product,
OpenCV makes it easy for businesses to utilize and modify the code.
The library has more than 2500 optimized algorithms, which includes a comprehensive set of both classic and
state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and
recognize faces, identify objects, classify human actions in videos, track camera movements, track moving
objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to
produce a high resolution image of an entire scene, find similar images from an image database, remove red
eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to
overlay it with augmented reality, etc. The library is used extensively in companies, research groups and by
governmental bodies.
It has C++, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS.
OpenCV leans mostly towards real-time vision applications and takes advantage of MMX and SSE
instructions when available. Full-featured CUDA and OpenCL interfaces are being actively developed.
There are over 500 algorithms and about 10 times as many functions that compose or support those
algorithms. OpenCV is written natively in C++ and has a templated interface that works seamlessly with STL
containers [4].
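
As a minimal illustration of the capture-and-display pattern this project builds on, the sketch below reads frames from the default webcam with OpenCV; it mirrors the loop used in the main application in Chapter 5.

import cv2

# Read frames from the default webcam and show them until 'q' is pressed.
cap = cv2.VideoCapture(0)
while True:
    success, frame = cap.read()
    if not success:
        break
    cv2.imshow('preview', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()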

The system architecture diagram is presented in Figure 3.5.

11
Figure 3.5: System Architecture Diagram

12
3.1.5 User Interface Design
The user interface is designed to provide a seamless and intuitive experience for the users. It includes the
following screens:
- Real-time video display screen: Shows the live video feed from the webcam with overlaid hand tracking
and recognized gestures.
- Results screen: Displays the recognized gesture and provides feedback to the user.
- Settings screen: Displays the colour settings, pen thickness and line dash settings.

The user interface is designed with a clean and user-friendly layout, ensuring easy navigation and interaction.
It employs appropriate colours, fonts, and icons to enhance the visual appeal and usability.

3.1.6 Database Design


The hand gesture recognition system does not require a database for its operation. However, if there is a need
for data storage, a database can be integrated to store user profiles, training data, or recognized gesture
statistics. The database design includes defining the data requirements, creating entity-relationship diagrams,
and designing the database tables and fields.

3.1.7 System Modules


The hand gesture recognition system consists of the following modules:

Module 1: Hand Tracking and Detection


Description: This module tracks the user's hand in real-time using computer vision techniques. It detects and
localizes the hand within the video frame, providing accurate hand tracking for further processing.
Inputs:
- Live video feed from the webcam.
Processing Steps:
1. Acquire the video frame from the webcam.
2. Preprocess the frame to enhance image quality and reduce noise.
3. Apply hand detection algorithms, such as Mediapipe or OpenCV, to locate the hand within the frame.
4. Extract the hand region and pass it to the next module for feature extraction.
Outputs:
- Hand region coordinates and bounding box.

13
Module 2: Feature Extraction
Description: This module extracts relevant features from the tracked hand region, capturing its shape,
orientation, and movements. These features serve as input to the neural network for training and recognition.
Inputs:
- Hand region coordinates and bounding box from Module 1.
Processing Steps:
1. Normalize and resize the hand region to a fixed size.
2. Apply feature extraction algorithms, such as contour analysis or keypoint detection, to extract features like
finger positions, hand shape, and hand pose.
3. Convert the extracted features into a suitable format for neural network input.
Outputs:
- Extracted features for training or recognition.
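
For reference, a simplified version of the feature vector actually used in Chapter 5 is sketched below: the 21 tracked landmarks are flattened into a 63-value list (x, y, z per landmark) that is fed to the network. The function name here is only illustrative.

# Flatten MediaPipe hand landmarks into a 63-element feature vector.
def landmarks_to_features(hand_landmarks):
    features = []
    for lm in hand_landmarks.landmark:
        features.extend([lm.x, lm.y, lm.z])
    return features   # length 63 = 21 landmarks x 3 coordinates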

Module 3: Neural Network Training


Description: This module trains a deep neural network using the extracted hand gesture features. It learns to
classify different hand gestures based on the provided training dataset.
Inputs:
- Extracted features from Module 2.
- Labeled training dataset consisting of hand gesture samples.
Processing Steps:
1. Define the architecture of the deep neural network, including the number and types of layers.
2. Split the labeled dataset into training and validation sets.
3. Preprocess the training data, such as normalization or data augmentation.
4. Train the neural network using an appropriate optimization algorithm and loss function.
5. Validate the trained model using the validation set and adjust hyperparameters if necessary.
Outputs:
- Trained neural network model.
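
Step 2 above (splitting the labeled dataset into training and validation sets) is not shown in the code of Chapter 5; a hedged sketch using PyTorch's random_split is given below, assuming 'dataset' is a torch.utils.data.Dataset such as the Data class defined later.

import torch
from torch.utils.data import random_split

# Illustrative 80/20 split of a labeled dataset into training and validation sets.
train_size = int(0.8 * len(dataset))
val_size = len(dataset) - train_size
train_set, val_set = random_split(dataset, [train_size, val_size],
                                  generator=torch.Generator().manual_seed(42))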

Module 4: Gesture Recognition


Description: This module uses the trained neural network to recognize hand gestures in real-time. It takes the
extracted features from the tracked hand and applies the trained model for classification.
Inputs:
- Extracted features from Module 2.

14
3.2 Flowchart

Fig. 3.6 Proposed System

15
CHAPTER 4

REQUIREMENTS AND METHODOLOGY

4.1 Requirements
4.1.1 Hardware Requirements
The hardware requirements for the development and implementation of the hand gesture recognition system
are as follows:
● Webcam or camera for capturing video input.
● Computer with sufficient processing power and memory (at least 4 GB) to handle real-time video
processing and deep learning computations.

4.1.2 Software Requirements


The software requirements for the hand gesture recognition system include:
● PyTorch: A deep learning framework used for designing and training the neural network.
● Mediapipe: A library for hand tracking and pose estimation, utilized for extracting hand features.
● OpenCV: A computer vision library used for video processing and manipulation.

4.2 Methodology
The methodology followed for the development and implementation of the hand gesture recognition system
is outlined below:

4.2.1 Setup
1. Prepare the computer by ensuring it meets the required hardware specifications.
2. Install the necessary software dependencies, including PyTorch, Mediapipe, and OpenCV.
3. Connect a webcam or camera to the computer for capturing real-time video input.
4. Set up the environment with proper lighting conditions to facilitate accurate hand tracking and detection.

4.2.2 Dataset Collection


1. Capture videos of hand gestures performed by multiple users in various lighting conditions.
2. Label the videos with corresponding gesture names for supervised learning.
3. Preprocess the videos to extract frames containing the hand gesture, which will be used for training.

16
4.2.3 Feature Extraction
1. Utilize the Mediapipe library for hand tracking and pose estimation.
2. Extract relevant features from the tracked hand, such as hand position, orientation, and movements.
3. Preprocess the extracted features to prepare them for training the neural network.

4.2.4 Training the Neural Network


1. Define the architecture of the deep neural network using the PyTorch framework.
2. Train the neural network using appropriate optimization techniques and loss functions.
3. Utilize PyTorch layers and functions, such as linear layers, dropout layers, activation functions (e.g.,
ReLU), and loss functions, to enhance the system's performance.

I. Linear Layers
The most basic type of neural network layer is a linear or fully connected layer. This is a layer where every
input influences every output of the layer to a degree specified by the layer’s weights. If a model has m inputs
and n outputs, the weights will be an m x n matrix.
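
For example, a linear layer with m = 63 inputs and n = 50 outputs can be created as below; note that PyTorch stores the weight matrix in (out_features, in_features) order.

import torch.nn as nn

fc = nn.Linear(63, 50)      # fully connected layer: 63 inputs, 50 outputs
print(fc.weight.shape)      # torch.Size([50, 63])
print(fc.bias.shape)        # torch.Size([50])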

II. Dropout layers


A dropout layer is utilized wherein a few neurons are dropped from the neural network during the training
process, resulting in a reduced model size. With a dropout rate of 0.3, 30% of the nodes are dropped out
randomly from the neural network. Dropout improves the performance of a machine learning model because
it prevents overfitting by making the network simpler.

III. Activation function


Activation functions are used to learn and approximate any kind of continuous and complex relationship
between the variables of the network. In simple words, they decide which information should be passed
forward through the network and which should not, adding non-linearity to the network. ReLU is one of the
most commonly used activation functions.

IV. Loss Functions


Loss functions tell us how far a model’s prediction is from the correct answer.
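
A small illustration of the cross-entropy loss used to train the gesture classifier in Chapter 5 (the scores and labels here are made up):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
predictions = torch.randn(4, 3)        # scores for 4 samples over 3 gesture classes
targets = torch.tensor([0, 2, 1, 0])   # true classes: draw, none, erase, draw
print(criterion(predictions, targets).item())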

17
4.2.5 Testing and Evaluation
1. Capture real-time video input from the webcam or camera.
2. Use the trained neural network to predict the hand gestures from the input video.
3. Evaluate the accuracy and performance of the system based on the predictions.
4. Analyze and interpret the results, considering factors such as lighting, processing speed, and robustness.
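
Step 3 above can be made concrete with an illustrative evaluation loop; it assumes a DataLoader named 'val_loader' over labeled samples and the trained 'model' from Chapter 5.

import torch

# Compute classification accuracy over a labeled validation loader.
correct, total = 0, 0
model.eval()
with torch.no_grad():
    for inputs, labels in val_loader:
        predicted = torch.argmax(model(inputs), dim=1)
        correct += (predicted == labels).sum().item()
        total += labels.size(0)
print(f"Accuracy: {correct / total:.2%}")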

By following this methodology, the hand gesture recognition system was developed and evaluated,
demonstrating its effectiveness in recognizing and interpreting hand gestures in real-time scenarios. Further
enhancements and optimizations can be explored to improve the system's accuracy, robustness, and
applicability in various domains.

18
CHAPTER 5

CODING/CODE TEMPLATES

1. Data collection
import cv2
import mediapipe as mp
import pandas as pd

# Single run
cols = ['label', *map(str, range(63))]
dataset = pd.DataFrame(columns=cols)

mpHands = mp.solutions.hands
hands = mpHands.Hands(
static_image_mode=False,
max_num_hands=2,
min_detection_confidence=0.8,
min_tracking_confidence=0.8
)
mp_draw = mp.solutions.drawing_utils
_lm_list = [
mpHands.HandLandmark.WRIST,
mpHands.HandLandmark.THUMB_CMC,
mpHands.HandLandmark.THUMB_MCP,
mpHands.HandLandmark.THUMB_IP,
mpHands.HandLandmark.THUMB_TIP,
mpHands.HandLandmark.INDEX_FINGER_MCP,
mpHands.HandLandmark.INDEX_FINGER_DIP,
mpHands.HandLandmark.INDEX_FINGER_PIP,
mpHands.HandLandmark.INDEX_FINGER_TIP,
mpHands.HandLandmark.MIDDLE_FINGER_MCP,
mpHands.HandLandmark.MIDDLE_FINGER_DIP,
mpHands.HandLandmark.MIDDLE_FINGER_PIP,
mpHands.HandLandmark.MIDDLE_FINGER_TIP,
mpHands.HandLandmark.RING_FINGER_MCP,
mpHands.HandLandmark.RING_FINGER_DIP,
mpHands.HandLandmark.RING_FINGER_PIP,
mpHands.HandLandmark.RING_FINGER_TIP,
mpHands.HandLandmark.PINKY_MCP,
mpHands.HandLandmark.PINKY_DIP,
mpHands.HandLandmark.PINKY_PIP,
mpHands.HandLandmark.PINKY_TIP
]

def landmark_extract(hand_lms, mpHands):
    output_lms = []

    for lm in _lm_list:
        lms = hand_lms.landmark[lm]
        output_lms.append(lms.x)
        output_lms.append(lms.y)
        output_lms.append(lms.z)

    return output_lms

# Loop
vid = cv2.VideoCapture(0)
action = input('Enter action name : ')

try:
    while True:
        success, frame = vid.read()
        img_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = hands.process(img_rgb)

        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                mp_draw.draw_landmarks(frame, hand_landmarks, mpHands.HAND_CONNECTIONS)

                landmark_list = [action, *landmark_extract(hand_landmarks, mpHands)]
                landmark_df = pd.DataFrame([landmark_list], columns=cols)
                # DataFrame.append was removed in pandas 2.0; pd.concat is the supported equivalent
                dataset = pd.concat([dataset, landmark_df], ignore_index=True)

        cv2.imshow('output', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

except KeyboardInterrupt:
    pass

dataset.to_csv(f'{action}.csv')

20
2. model.py

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.layers = nn.Sequential(
            nn.Linear(63, 1000),
            nn.ReLU(),
            nn.BatchNorm1d(1000),
            nn.Dropout(p=0.5),

            nn.Linear(1000, 500),
            nn.ReLU(),
            nn.BatchNorm1d(500),
            nn.Dropout(p=0.5),

            nn.Linear(500, 200),
            nn.ReLU(),
            nn.BatchNorm1d(200),
            nn.Dropout(p=0.5),

            nn.Linear(200, 50),
            nn.ReLU(),

            nn.Linear(50, 3)
        )

    def forward(self, x):
        return self.layers(x)


def test():
    model = Model()
    noise = torch.randn((20, 63))
    out = model(noise)
    print(out.shape)

21
3. Train.py

import torch
import torch.nn as nn
import torch.optim as optim
from model import Model
import pandas as pd
import numpy as np
from torch.utils.data import Dataset,DataLoader

# Define the custom dataset class


class Data(Dataset):
    def __init__(self, path):
        self.dataset = np.array(pd.read_csv(path))
        self.map = {'draw': 0, 'erase': 1, 'none': 2}

    def __getitem__(self, index):
        yvals = torch.tensor(self.map[self.dataset[index][0]], dtype=torch.long)
        xvals = torch.from_numpy(self.dataset[index][1:].astype(np.float32))
        return xvals, yvals

    def __len__(self):
        return self.dataset.shape[0]

# Create an instance of the dataset


dataset = Data(path='dataset/dataset.csv')

# Create data loaders for training


batch_size = 32
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Create an instance of the model


model = Model()

# Define the loss function and optimizer


criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 10
device = torch.device('cpu')
model.to(device)

for epoch in range(num_epochs):
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    epoch_loss = running_loss / len(train_loader)
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {epoch_loss:.4f}")

    # Save the model after every epoch
    torch.save(model.state_dict(), f"models3/models{epoch+1}.pt")

# Test the model
model.eval()
with torch.no_grad():
    noise = torch.randn((20, 63)).to(device)
    outputs = model.forward(noise)
    _, predicted = torch.max(outputs.data, 1)
    print(predicted)

23
4. Main.py

import cv2, torch


import time, os
import mediapipe as mp
import numpy as np
from model import Model

current_path = os.getcwd()

# Camera number, can be varied if using multiple webcams


cam_number = 0
# Laterally inverting video stream
flip = True
# Minimum confidence score required for detecting and marking hand landmarks
min_conf = 0.75
max_hands = 2
# Path of trained model. Can be changed to point to a custom model
model_path = os.path.join(current_path, 'models3/models10.pt')
# Pen parameters
pen_color = (255, 0, 0)
eraser_size = 80
pen_size = 10
# The density of the line. Smaller values make the line more smooth.
intermediate_step_gap = 4
# Create Control window to change color and size of pen
cv2.namedWindow('control')
# This show the color
img = np.zeros((200, 600, 3), np.uint8)

def nothing(x):
    pass

# This creates a trackbar to adjust various values.


cv2.createTrackbar('Red', 'control', 0, 255, nothing)
cv2.createTrackbar('Blue', 'control', 0, 255, nothing)
cv2.createTrackbar('Green', 'control', 0, 255, nothing)
cv2.createTrackbar('pen_thickness', 'control', 5, 30, nothing)
cv2.createTrackbar('Imd_step_gap', 'control', 10, 29, nothing)
# Button size
button = [20, 60, 145, 460]

# Function to click button
def process_click(event, x, y, flags, params):
    # check if the click is within the dimensions of the button
    if event == cv2.EVENT_LBUTTONDOWN:
        if button[0] < y < button[1] and button[2] < x < button[3]:
            cv2.imwrite('Image' + str(fps) + '.png', frame)
            img[:80, :] = (0, 0, 255)

cv2.setMouseCallback('control', process_click)
cap = cv2.VideoCapture(cam_number)

mpHands = mp.solutions.hands
hands = mpHands.Hands(
static_image_mode=False,
max_num_hands=max_hands,
min_detection_confidence=min_conf,
min_tracking_confidence=min_conf
)
mp_draw = mp.solutions.drawing_utils

_lm_list = [
mpHands.HandLandmark.WRIST,
mpHands.HandLandmark.THUMB_CMC,
mpHands.HandLandmark.THUMB_MCP,
mpHands.HandLandmark.THUMB_IP,
mpHands.HandLandmark.THUMB_TIP,
mpHands.HandLandmark.INDEX_FINGER_MCP,
mpHands.HandLandmark.INDEX_FINGER_DIP,
mpHands.HandLandmark.INDEX_FINGER_PIP,
mpHands.HandLandmark.INDEX_FINGER_TIP,
mpHands.HandLandmark.MIDDLE_FINGER_MCP,
mpHands.HandLandmark.MIDDLE_FINGER_DIP,
mpHands.HandLandmark.MIDDLE_FINGER_PIP,
mpHands.HandLandmark.MIDDLE_FINGER_TIP,
mpHands.HandLandmark.RING_FINGER_MCP,
mpHands.HandLandmark.RING_FINGER_DIP,
mpHands.HandLandmark.RING_FINGER_PIP,
mpHands.HandLandmark.RING_FINGER_TIP,
mpHands.HandLandmark.PINKY_MCP,
mpHands.HandLandmark.PINKY_DIP,
mpHands.HandLandmark.PINKY_PIP,
mpHands.HandLandmark.PINKY_TIP
]

## Extract landmark positions as array
def landmark_extract(hand_lms, mpHands):
    output_lms = []

    for lm in _lm_list:
        lms = hand_lms.landmark[lm]
        output_lms.append(lms.x)
        output_lms.append(lms.y)
        output_lms.append(lms.z)

    return output_lms

## Checks if the position is out of bounds or not


def is_position_out_of_bounds(position, top_left, bottom_right):
    return (
        position[0] > top_left[0] and position[0] < bottom_right[0]
        and position[1] > top_left[1] and position[1] < bottom_right[1]
    )

## Loading torch model


model = Model()
model.load_state_dict(torch.load(model_path, map_location='cpu'))
model.eval()

action_map = {0: 'Draw', 1: 'Erase', 2: 'None'}

## cv2 text parameters


font = cv2.FONT_HERSHEY_SIMPLEX
fontScale = 1
fontColor = (255, 255, 255)
lineType = 4
## Stores previously drawn circles to give continuous lines and also store current colour and size of pen
circles = []

was_drawing_last_frame = False

ptime = 0
ctime = 0

## Video feed loop
while True:
    success, frame = cap.read()
    if flip:
        frame = cv2.flip(frame, 1)
    h, w, c = frame.shape
    img_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(img_rgb)
    cv2.rectangle(frame, (w, h), (w - 320, h - 90), (0, 0, 0), -1, 1)
    b = cv2.getTrackbarPos('Blue', 'control')
    g = cv2.getTrackbarPos('Green', 'control')
    r = cv2.getTrackbarPos('Red', 'control')
    t = cv2.getTrackbarPos('pen_thickness', 'control')
    # Added 1 to make range of imd_step_gap equal to [1, 30].
    imd_step_gap = (cv2.getTrackbarPos('Imd_step_gap', 'control') + 1) / 10

    intermediate_step_gap = imd_step_gap
    if not results.multi_hand_landmarks:
        was_drawing_last_frame = False
        cv2.putText(frame, 'No hand in frame', (w - 300, h - 50), font, fontScale, fontColor, lineType)
    else:
        for hand_landmarks in results.multi_hand_landmarks:
            mp_draw.draw_landmarks(frame, hand_landmarks, mpHands.HAND_CONNECTIONS)

            ## Mode check
            landmark_list = landmark_extract(hand_landmarks, mpHands)
            model_input = torch.tensor(landmark_list, dtype=torch.float).unsqueeze(0)
            action = action_map[torch.argmax(model.forward(model_input)).item()]
            cv2.putText(frame, f"Mode : {action}", (w - 300, h - 50), font, fontScale, fontColor, lineType)

            ## Draw mode
            if action == 'Draw':
                pen_color = (b, g, r)
                pen_size = t
                index_x = hand_landmarks.landmark[mpHands.HandLandmark.INDEX_FINGER_TIP].x
                index_y = hand_landmarks.landmark[mpHands.HandLandmark.INDEX_FINGER_TIP].y
                pos = (int(index_x * w), int(index_y * h))
                cv2.circle(frame, pos, 20, (255, 0, 0), 2)
                if was_drawing_last_frame:

                    prev_pos = circles[-1][0]
                    x_distance = pos[0] - prev_pos[0]
                    y_distance = pos[1] - prev_pos[1]
                    distance = (x_distance ** 2 + y_distance ** 2) ** 0.5
                    num_step_points = int(int(distance) // intermediate_step_gap) - 1
                    if num_step_points > 0:
                        x_normalized = x_distance / distance
                        y_normalized = y_distance / distance
                        for i in range(1, num_step_points + 1):
                            # Space intermediate points by intermediate_step_gap along the segment
                            step_pos_x = prev_pos[0] + int(x_normalized * i * intermediate_step_gap)
                            step_pos_y = prev_pos[1] + int(y_normalized * i * intermediate_step_gap)
                            step_pos = (step_pos_x, step_pos_y)
                            circles.append((step_pos, pen_color, pen_size))

                circles.append((pos, pen_color, pen_size))

                was_drawing_last_frame = True
            else:
                was_drawing_last_frame = False

            ## Erase mode
            if action == 'Erase':
                eraser_mid = [
                    int(hand_landmarks.landmark[mpHands.HandLandmark.MIDDLE_FINGER_MCP].x * w),
                    int(hand_landmarks.landmark[mpHands.HandLandmark.MIDDLE_FINGER_MCP].y * h)
                ]

                bottom_right = (eraser_mid[0] + eraser_size, eraser_mid[1] + eraser_size)
                top_left = (eraser_mid[0] - eraser_size, eraser_mid[1] - eraser_size)
                cv2.rectangle(frame, top_left, bottom_right, (0, 0, 255), 5)

                circles = [
                    (position, pen_color, pen_size)
                    for position, pen_color, pen_size in circles
                    if not is_position_out_of_bounds(position, top_left, bottom_right)
                ]

    ## Draws all stored circles
    for position, pen_color, pen_size in circles:
        frame = cv2.circle(frame, position, pen_size, pen_color, -1)

    ctime = time.time()
    fps = round(1 / (ctime - ptime), 2)

    ptime = ctime
    cv2.putText(frame, f'FPS : {fps}', (w - 300, h - 20), font, fontScale, fontColor, lineType)

    cv2.imshow('output', frame)
    # contol_image = img[:80, :]
    img[button[0]:button[1], button[2]:button[3]] = 180
    cv2.putText(img, 'Click_to_save_img', (148, 50), cv2.FONT_HERSHEY_TRIPLEX, 1.0, 0, 1)
    cv2.imshow('control', img)
    img[:80, :] = [255, 255, 255]
    img[80:, :] = [b, g, r]

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

29
CHAPTER 6

TESTING

The first and foremost measure of this application's performance is the implementation of the three modes,
namely write, erase, and none; the application works correctly on this measure.
Secondly, we tested the application under varying light settings and distances, and measured the reactiveness
of the system in terms of FPS. The various test cases are:

Fig. 6.1 Hand detection

Fig. 6.1 shows how the hand is detected using Mediapipe. This real-time testing was based on video
processing. The main objective of hand posture estimation is to locate hand key points that can help to detect
gestures afterwards.

Fig. 6.2 Drawing Mode

30
Fig. 6.3 Erase Mode

Fig. 6.2 and Fig. 6.3 show the different modes/gestures: Fig. 6.2 shows drawing mode and Fig. 6.3 shows
erase mode, in which anything inside the square box area will be erased.

Fig. 6.4 Dim light

Fig. 6.4 shows how the hand is detected in a dim light environment. The performance of the system is
average in a dim light environment.

Fig. 6.5 Skin colored background

Fig. 6.5 shows how the hand is detected in a skin-colored background. The performance of the system is
good in a skin-colored background.

31
Fig. 6.6 Distance more than 1 meter

Fig. 6.6 shows how the hand is detected at a distance of more than 1 meter. The performance of the system is
average at 1 meter range.

Fig. 6.7 Bright light source in background

Fig. 6.7 shows how the hand is not detected in the presence of a bright light source in the background. The
performance of the system is poor in the presence of a bright light source in the background.

32
CHAPTER 7

RESULT AND DISCUSSION

In this section, we present the results of the performance evaluation for the Hand gesture Controlled White
Board. The evaluation was conducted based on various observations and metrics, including brightness,
distance, and background.

Observation Metric - Performance


The performance of the system was assessed under different lighting conditions, ranging from bright/mid
light to dim light. It was found that the system performed well in bright/mid light conditions, while the
performance was rated as average in dim light. However, when there was a bright light source in the
background, the performance was poor. This suggests that the system is sensitive to the presence of bright
light sources in the surroundings.

The performance of the system was also evaluated based on the distance between the user's hand and the
system. It was observed that the system performed well in close range, providing accurate detection and
tracking of hand gestures. However, when the distance exceeded 1 meter, the performance was rated as
average, indicating a decrease in accuracy and reliability.

Additionally, the background color played a role in the system's performance. When the background had a
skin color tone, the system demonstrated good performance, accurately detecting and interpreting hand
gestures.

These performance evaluations are summarized in Table 7.1: Performance Evaluation Table.

7.1 Observations

33
S.No.  Metric                              Performance

1      Bright/mid light                    Good

2      Dim light                           Average

3      Bright light source in background   Poor

4      Close range                         Good

5      Range more than 1 m                 Average

6      Skin color background               Good


Table 7.1 Performance evaluation table

Furthermore, the performance of the system was assessed in terms of frames per second (FPS) in the writing
mode. Table 7.2: Writing Mode Performance categorizes the performance based on the FPS range.

S.No.  Frames per Second   Performance

1      Below 7-8           Poor

2      8-15                Good

3      Above 15            Poor

Table 7.2 Writing mode performance

The results indicate that the system's performance is affected by the frame rate, with performance degrading
outside the optimal range. Therefore, maintaining a frame rate of 8-15 frames per second is crucial for
optimal performance.

Overall, the evaluation results provide insights into the system's performance under different conditions and
metrics, highlighting its strengths and limitations. These findings can guide future improvements and
enhancements to the Hand gesture Controlled White Board.

34
CHAPTER 8

CONCLUSION AND FUTURE WORK

8.1 Conclusion
In conclusion, we have built a hand gesture identification system that uses a PC webcam to recognize different
hand actions. The program was trained with a total of 6000 images to effectively categorize hand motions,
which makes it suitable for basic cameras such as a laptop webcam. The implementation of the system was
based on captured images, and hand detection was performed with the image processing libraries OpenCV,
Mediapipe and PyTorch. The primary objective of this report is to describe a virtual blackboard that can be
operated through fingertip movements as hand gestures in online classes. This system presents a viable
solution for identifying hand gestures, and the findings could have a significant impact on various
applications, particularly those related to virtual teaching and learning environments. Further research could
be conducted to optimize the system's performance and enhance its accuracy to improve its effectiveness for
various use cases.

8.2 Future Work


While the hand gesture identification system developed in this project shows promising results, there are
several areas for future improvement and enhancement. Here are some suggestions for future work:

1. Refinement of Gesture Recognition: The current system demonstrates satisfactory performance in


recognizing basic hand gestures. However, further research can be conducted to refine the gesture
recognition algorithm and expand the system's capabilities to recognize a wider range of hand
movements. This could involve exploring advanced deep learning techniques, incorporating more
extensive training datasets, or considering the temporal aspect of gestures for improved accuracy.

2. Real-Time Performance: Although the system operates in real-time, there is room for optimizing its
performance to ensure smooth and seamless hand gesture detection and response. This can be
achieved by optimizing the code implementation, leveraging hardware acceleration techniques, or
exploring parallel processing methods to enhance the system's speed and responsiveness.

35
3. Robustness to Environmental Factors: The system's performance may be affected by various
environmental factors, such as varying lighting conditions, complex backgrounds, or occlusions.
Future work can focus on developing robust algorithms that can handle these challenges effectively,
ensuring consistent and reliable hand gesture detection and recognition across different environments.

4. Integration with Virtual Whiteboard Software: The current system serves as a standalone hand
gesture identification tool. To enhance its usability and practicality, future work can involve
integrating the system with existing virtual whiteboard software or developing a dedicated virtual
whiteboard application. This would enable users to utilize hand gestures for writing, drawing, and
interacting with the virtual environment, providing a more immersive and intuitive user experience.

5. User Interface and Interaction Design: The user interface plays a crucial role in the overall user
experience. Future work can focus on designing an intuitive and user-friendly interface for the hand
gesture system, incorporating visual feedback and guidance to assist users in understanding and
utilizing the available hand gestures effectively.

6. User Studies and Evaluation: Conducting user studies and evaluations can provide valuable insights
into the usability, effectiveness, and user satisfaction of the hand gesture identification system. Future
work can involve designing and conducting user experiments, collecting user feedback, and analyzing
user performance and preferences to further refine and improve the system based on real-world user
needs and requirements.

By addressing these areas of future work, the hand gesture identification system can be enhanced to offer
improved performance, expanded functionality, and a more seamless user experience. These advancements
would contribute to the development of more effective and intuitive human-computer interaction systems,
with potential applications in virtual teaching and learning environments, augmented reality, gaming, and
other domains.

36
REFERENCES

[1] Panwar, M., & Mehra, P. S. (2011, November). Hand gesture recognition for human computer
interaction.
In 2011 International Conference on Image Information Processing (pp. 1-7). IEEE.

[2] Sonkar, D., Aher, G., Anerao, N., Gavhale, S., & Kadlag, A. (2021). Virtual Teaching Board Using
Computer Vision. International Journal of Advanced Research in Science, Communication and Technology,
436-442. DOI: 10.48175/IJARCET-1421.

[3] Abhishek, B., Krishi, K., Meghana, M., Daaniyaal, M., & Anupama, H. S. (2020). Hand gesture
recognition using machine learning algorithms. Computer Science and Information Technologies, 1(3), 116-
120.
[4] Open Source Computer Vision (OpenCV) [Online]. Accessed on 21st April 2022:
https://github.com/opencv/opencv/wiki

[5] Mazumdar, D., Talukdar, A. K., & Sarma, K. K. (2013, September). Gloved and free hand tracking based
hand gesture recognition. In 2013 1st International Conference on Emerging Trends and Applications in
Computer Science (pp. 197-202). IEEE.

[6] Fernando, M., & Wijayanayaka, J. (2013, December). Low cost approach for real time sign language
recognition. In 2013 IEEE 8th International Conference on Industrial and Information Systems (pp. 637-
642). IEEE.

[7] Ghotkar, A. S., Khatal, R., Khupase, S., Asati, S., & Hadap, M. (2012, January). Hand gesture recognition
for indian sign language. In 2012 International Conference on Computer Communication and Informatics
(pp. 1-4). IEEE.

[8] Chen, Q., Georganas, N. D., & Petriu, E. M. (2007, May). Real-time vision-based hand gesture
recognition using haar-like features. In 2007 IEEE instrumentation & measurement technology conference
IMTC 2007 (pp. 1-6). IEEE.

[9] Zhu, Y., Yang, Z., & Yuan, B. (2013, April). Vision based hand gesture recognition. In 2013
International Conference on Service Sciences (ICSS) (pp. 260-265). IEEE.

[10] Gajjar, V., Mavani, V., & Gurnani, A. (2017, September). Hand gesture real time paint tool-box:
Machine learning approach. In 2017 IEEE International Conference on Power, Control, Signals and
Instrumentation Engineering (ICPCSI) (pp. 856-860). IEEE.

37
[11] Dhule, C., & Nagrare, T. (2014, April). Computer vision based human-computer interaction using color
detection techniques. In 2014 Fourth International Conference on Communication Systems and Network
Technologies (pp. 934-938). IEEE.

[12] Hartanto, R., Susanto, A., & Santosa, P. I. (2014, August). Real time hand gesture movements tracking
and recognizing system. In 2014 Electrical Power, Electronics, Communications, Control and Informatics
Seminar (EECCIS) (pp. 137-141). IEEE.

[13] Luo, R. C., & Wang, H. (2018, October). Automated tool coordinate calibration system of an industrial
robot. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 5592-
5597). IEEE.

[14] Neethu, P. S., Suguna, R., & Sathish, D. (2020). An efficient method for human hand gesture detection
and recognition using deep learning convolutional neural networks. Soft Computing, 24(20), 15239-15248.

[15] Yasen, M., & Jusoh, S. (2019). A systematic review on hand gesture recognition techniques, challenges
and applications. PeerJ Computer Science, 5, e218.

[16] Soroni, F., al Sajid, S., Bhuiyan, M. N. H., Iqbal, J., & Khan, M. M. (2021, October). Hand Gesture
Based Virtual Blackboard Using Webcam. In 2021 IEEE 12th Annual Information Technology, Electronics
and Mobile Communication Conference (IEMCON) (pp. 0134-0140). IEEE

38