
“Hand Gesture Controlled Video Application”

Project Report submitted by

AKASH SHETTY MEDHA PAI B

(4NM18IS005) (4NM18IS061)

NIDHI UPADHYA SAMRUDDHI V PAI K


(4NM18IS072) (4NM18IS090)
Under the guidance of

Ms. NIKITHA SAURABH Ms. PRATHEEKSHA HEGDE N

Assistant Professor Gd-II Assistant Professor Gd-I

In partial fulfillment of the requirements for the award of

the Degree of

Bachelor of Engineering in Information Science and Engineering

from

Visvesvaraya Technological University, Belagavi

Department of Information Science and Engineering


NMAM Institute of Technology, Nitte - 574110
(An Autonomous Institution affiliated to VTU, Belagavi)

June 2022

ISO 9001:2015 Certified, Accredited with ‘A’ Grade by NAAC

DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING

CERTIFICATE
Certified that the project work entitled

“Hand Gesture Controlled Video Application”

is a bonafide work carried out by


Akash Shetty (4NM18IS005)
Medha Pai B (4NM18IS061)
Nidhi Upadhya (4NM18IS072)
Samruddhi V Pai K (4NM18IS090)
in partial fulfillment of the requirements for the award of
Bachelor of Engineering Degree in Information Science and Engineering
prescribed by Visvesvaraya Technological University, Belagavi
during the year 2021-2022.
It is certified that all corrections/suggestions indicated for Internal Assessment have been
incorporated in the report deposited in the departmental library.
The project report has been approved as it satisfies the academic requirements in respect of
the project work prescribed for the Bachelor of Engineering Degree.
__________________ _________________ ___________________
Signature of the Guide Signature of the HOD Signature of the Principal

Semester End Viva Voce Examination


Name of the Examiners Signature with Date

1. __________________________ __________________________

2. __________________________ __________________________
ACKNOWLEDGEMENT

It is with great satisfaction and delight that we are submitting the Project Report on
“Hand Gesture Controlled Video Application”. We have completed it as a part of
the curriculum of Visvesvaraya Technological University, Belagavi for the award of
Bachelor of Engineering in Information Science and Engineering.
We sincerely thank Dr. Niranjan N Chiplunkar, Principal, NMAM Institute of
Technology, Nitte and Dr. I Ramesh Mithanthaya, Vice Principal & Dean
(Academics), NMAM Institute of Technology, Nitte, who have always been a great
source of inspiration.
We are profoundly indebted to our guides, Ms. Nikitha Saurabh (Assistant
Professor Gd II) and Ms. Pratheeksha Hegde N (Assistant Professor Gd I),
Department of Information Science and Engineering, for their innumerable acts of
timely advice and encouragement, for which we sincerely express our gratitude.
We also thank Mr. Vasudeva Pai, Project Coordinator & Assistant Professor Gd II,
Mr. Devidas, Project Coordinator & Assistant Professor Gd III, Department of
Information Science & Engineering for their constant encouragement and support
extended throughout.
We express our sincere gratitude to Dr. Karthik Pai B.H, Head and Associate
Professor, Department of Information Science and Engineering for his invaluable
support and guidance.
Finally, yet importantly, we express our heartfelt thanks to our family and friends for
their wishes and encouragement throughout the work.

AKASH SHETTY (4NM18IS005)


MEDHA PAI B (4NM18IS061)
NIDHI UPADHYA (4NM18IS072)
SAMRUDDHI V PAI K (4NM18IS090)

ABSTRACT

Gestures made with the hands and fingers, popularly called hand gestures, are a
common channel for human-robot interaction. They are a form of nonverbal
communication that can be used in several fields, for example communication
among deaf and mute people, robot control, human-computer interaction (HCI),
home automation, and medical applications. Research on hand and finger gestures
has adopted various strategies, including those based on instrumented sensor
technology and computer vision. Hand gesture recognition provides an intelligent,
natural, and convenient mode of HCI and has many applications in medical,
engineering, and even military research areas. As society's reliance on technology
grows day by day, the use of devices such as mobile phones and computers also
increases. In our daily routine we communicate by speaking, writing, and some
form of body movement, yet with machines we are still limited to typing or
speaking; we therefore need an advancement that lets us communicate with
machines through body movement as well. Communication that involves any kind
of body movement is called gesturing: gestures are non-vocal modes of
communication that use hand movements, body postures, and facial expressions.
To make machines smarter, we enable them to take commands by recognizing
different hand gestures; hand gestures are used as the input to our system.
Hand-gesture-based man-machine interfaces have been developed rapidly in
recent years, yet because of the effects of lighting and complex backgrounds,
many hand gesture recognition systems do not work well. An adaptive skin-color
model based on face detection is used to identify skin-colored regions such as the
hands. To classify dynamic hand gestures, we developed a simple and fast
motion-history-image-based method, and four groups of Haar-like directional
patterns were trained as classifiers for the up, down, left, and right hand gestures.

LIST OF FIGURES
Sl no.  Figure no.    Description                                          Page no.
1       Fig. 1.1.1    Instrumental gloves gestures                         5
2       Fig. 1.2.1    Computer vision gestures                             6
3       Fig. 5.1      System design flowchart                              13
4       Fig. 7.1      Flowchart of the system                              16
5       Fig. 7.2      Hand landmarks using mediapipe                       17
6       Fig. 7.3      Finger values in an array                            18
7       Fig. 7.4      Different hand gestures                              19
8       Fig. 7.5      Code snippet for hand detection                      19
9       Fig. 7.6      Code snippet for volume control                      20
10      Fig. 7.7      Code snippet to capture screen                       21
11      Fig. 7.8      Code snippet to seek the media forward and backward  22
12      Fig. 7.9      Code snippet for play/pause the media                22
13      Fig. 9.1      Camera window pop-up                                 26
14      Fig. 9.2      Hand detection with landmarks                        26
15      Fig. 9.3      Fingers count                                        27
16      Fig. 9.4      Detecting multiple hands                             28
17      Fig. 9.5(a)   Gesture to increase the system volume                28
18      Fig. 9.5(b)   Gesture to decrease the system volume                29
19      Fig. 9.6(a)   Gesture to move the video forward                    29
20      Fig. 9.6(b)   Gesture to move the video backward                   30
21      Fig. 9.7      Controlling the mouse with gesture                   30
22      Fig. 9.8      Screenshot through gesture                           31

LIST OF TABLES

Sl no. Table no. Description Page no.


1 Table 4.1.1 Hardware requirements 12
2 Table 4.2.1 Software requirements 12
3 Table 5.1 Libraries and Packages 14

TABLE OF CONTENTS
CONTENTS PAGE NO.

Title Page i

Certificate ii

Acknowledgements iii

Abstract iv

List of Figures v

List of Tables vi

CHAPTER 1 INTRODUCTION 1-8

1.1 Instrumental Gloves Approach 5

1.2 Computer Vision Approach 6

1.3 Advantage of the CV over instrumental approach 7

1.4 Applications of Hand Gesture Controlled Application 8

CHAPTER 2 LITERATURE SURVEY 9-10

CHAPTER 3 PROBLEM STATEMENT 11

CHAPTER 4 SYSTEM REQUIREMENTS 12

4.1 Hardware Requirements 12

4.2 Software Requirements 12

CHAPTER 5 SYSTEM DESIGN 13-14


CHAPTER 6 METHODOLOGY 15

6.1 Recognizing the hand 15

6.2 Processing the image 15

6.3 Defining the Gesture 15

6.4 Testing the end result 15

CHAPTER 7 IMPLEMENTATION 16-22

CHAPTER 8 TESTING 23-25

8.1 Types of Testing 23

8.2 Types of Software Testing 24

8.3 Module Testing 25

CHAPTER 9 RESULTS 26-31

CHAPTER 10 CONCLUSION AND FUTURE WORK 32

REFERENCES

1. INTRODUCTION

Communication is the process of sending and receiving messages through linguistic
or nonverbal means, including speech, writing, and graphic representation. In the
digital global era, connected computers, the Internet, telecommunication networks,
and digitization capabilities bring significant socio-economic value to all the
countries, governments, industries, organizations, and academic institutions that
contribute to digital change, progress, and prosperity.

Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that
helps computers understand, predict, and manage human languages. Its main goal
is to bridge the communication gap between computers and humans. There is no
single best way to accomplish this, given the many types of data and applications,
but in general NLP breaks the language down into shorter, elementary pieces; this
helps it understand the relations between the pieces and how they combine to form
a meaning.

Some basic tasks of NLP include:

❖ Tokenization: Taking a text and breaking it up into individual words.


❖ Lemmatization /Stemming: Taking an individual word and finding the root.
❖ Part-of-speech tagging: Tagging a word as a noun, verb, adjective, etc.
❖ Semantic labeling: Assigns words a label like agent, patient, goal, etc.

Hand gestures, defined by focal points on the palm and the positions of the fingers,
are a part of nonverbal communication. They come in two types, static and dynamic
[3]. As the name implies, a static gesture is a steady state of the hand, while a
dynamic gesture involves a progression of hand movements such as waving. There
is considerable variation within a gesture; a hand wave, for instance, differs from
one person to the next and changes with the general setting. The primary
distinction between posture and gesture is that posture concerns the shape of the
hand, whereas gesture concerns its movement.


Previously, hand gesture recognition was accomplished with wearable sensor
gloves. These gloves registered a response according to hand movements or finger
motion, and the collected data was then processed by a computer connected to the
wired gloves.

Although the strategies mentioned above have given good results, they have
various limitations that make them unsuitable for the elderly, who may experience
discomfort and confusion because of wire-connection issues. Hand gestures
nonetheless offer an inspiring field of research, since they can facilitate
communication and provide a natural means of interaction that can be used across
a variety of applications.

Algorithms based on computer vision techniques have been developed that identify
hands using various cameras. These algorithms recognize and segment hand
features such as skin color, appearance, motion, skeleton, depth, 3D models, and
deep-learning-based detection.

Vision-based hand gestures play an important role in human-computer interaction
(HCI) [6]. Keyboards and mice have dominated HCI over the last few decades, but
new HCI processes are required by the rapid development of software and
hardware; in the field of HCI especially, speech recognition and gesture recognition
technologies are attracting attention.

Gestures are symbols of physical behavior or emotional expression, and include
hand and body gestures. They can be classified into two categories: static gestures
and dynamic gestures. In the former case, a hand pose or posture indicates a sign;
in the latter, body and hand movements convey the message. Gestures can thus be
a form of communication between humans and computer systems.


Unlike conventional hardware methods, gesture recognition enables HCI without
physical devices: recognizing the movements of the body or body parts lets the
system decide the user's intended action. Over the last few years, many
researchers have sought to improve hand gesture recognition techniques. The main
difference between poses and gestures is that gestures focus on hand movements,
while poses focus on hand shapes. Hand gesture recognition plays a major role in
various applications such as sign-language recognition, augmented reality (AR) and
virtual reality (VR), sign-language interpretation for people with disabilities, and
robot control.

With the tremendous development of computing technology in this global world,
user interaction through pointing and positioning devices such as mice, keyboards,
and pens is no longer sufficient for ubiquitous computing. Because these devices
are limited, so is their instruction set. Body parts, and hands in particular, are a
better option for interaction: the hand can be used as an input device that provides
natural interaction. There are basically two ways to perform hand gesture
recognition: instrument-based and vision-based. In the instrumented-glove
approach, sensors are fixed to a glove; the user wears it, and the hand gesture is
generated from the movement of the hand or fingers. The vision-based approach
receives input from webcams, but it is more complicated than the glove method
because everything depends on the number and placement of the cameras, and
visibility is an important factor. It captures an image from the stream and detects
the skin to distinguish the hand from the background; this creates the complexity of
the vision-based approach, since the background can be the same color as the
skin. It is nevertheless very popular because of its low probability of error and high
efficiency.

Gesture detection systems have been considered for various research uses, from
detection of facial gestures to body movements, and many other applications have
evolved, creating the need for such recognition methods. Static gesture recognition
is a pattern recognition problem; as such, feature extraction, an inevitable part of
the pattern-recognition preprocessing stage, must be conducted before any other
standard recognition technique is applied. Human-robot interaction is another
application of our project, where the underlying motivation is that the
communication should resemble natural human communication as much as
possible. There are two main approaches to hand gestures: the hardware approach
using wearable glove-based sensors, and the software approach using
camera-based computer vision. Both are commonly used to detect gestures in HCI
systems; the first is based on wired gloves (wearable or direct contact), while the
second uses computer vision methods in which sensor gloves play no role.

Hand gestures are features of body language that can be expressed through the
center of the palm, the positions of the fingers, and the shape formed by the hand.
There are static hand gestures and dynamic hand gestures. As the name implies, a
static gesture refers to the hand in a stable state without movement, while a
dynamic gesture is a sequence of hand movements, as in a video.

There exist various hand movements within a gesture. For example, handshakes
differ from person to person and from place to place. The main variation between
posture and gesture is that posture focuses more on the shape of the hand,
whereas gesture focuses on its movements.


1.1 Hand Gestures Based on Instrumental Gloves Approach

Fig. 1.1.1 Working of gestures through gloves

As shown in Fig. 1.1.1, wearable gloves fitted with sensors are used to detect hand
movement and the positions of the fingers, and the sensors report the directions
indicated by the fingers and palm area. However, this approach requires the user
to be physically connected to the computer, which impedes the ease of interaction
between user and machine, and the cost of some of the devices is not affordable.
Nonetheless, the modern glove-based approach builds on touch technology, a
promising development regarded as industry-grade haptic technology.


1.2 Hand Gestures Based on the Camera Vision Approach

Fig. 1.2.1 Working of gestures through CV

Hand gestures are a part of nonverbal communication conveyed by the finger
positions, the center of the palm, and the shape formed by the hand. They can be
differentiated as dynamic and static. As the name implies, a static gesture is a
steady state of the hand, while a dynamic gesture contains a progression of hand
movements such as waving. There is variety within a gesture: a handshake, for
example, shifts from one person to another and changes with the overall setting.

Fig. 1.2.1 shows the computer vision method for gesture detection. When a user
performs a particular gesture in front of the camera with one or both hands, the
system captures a frame, separates the relevant features, and classifies the hand
gesture using one of various possible strategies, so that the recognized gesture
can control the target application.


1.3 Advantages of Hand Gestures over Conventional Human-Machine Interaction

The desktop computing paradigm limits user flexibility by forcing interaction through
a two-degree-of-freedom device (the mouse), even though users are accustomed
to interacting with the real world in far richer ways. Gestures remove this limit: they
allow users to address several targets without delay and even to characterize
parameters, and they are a more natural form of communication. Hand-gesture
applications give three fundamental benefits over ordinary human-machine
interaction systems:

I. Accessing information while maintaining sterility

Touchless interfaces are helpful in healthcare environments.

II. Overcoming physical handicaps

Control of home devices and appliances for users with physical limitations or for
elderly users with impaired mobility.

III. Exploring large data volumes

Exploration of big data and manipulation of high-definition images through intuitive
actions can make use of 3D interaction methods, rather than constrained traditional
2D techniques.


1.4 Applications of Hand Gesture Controlled Application

There are various use cases of our application, as follows:

I. To reduce system design complexity

By using gestures to control user functions, wired devices such as the keyboard
and mouse can be removed; this reduces the complexity of the system and can
also reduce the overall system cost.

II. To simplify the way media applications are used

A hand gesture controlled video application makes it easy for the user to operate
media with simple gestures, without wired hardware devices (keyboard or mouse).

III. To help specially challenged users use applications

Since users can control the media by gesture alone, specially abled people may
find this application very useful.

IV. To avoid touch-based control

Touchless systems became very useful during the pandemic, helping to prevent
person-to-person transmission of the virus.


2. LITERATURE SURVEY

There are many existing gesture-based communication methods, and various
techniques have been proposed to improve them. Munir Oudah, Ali Al-Naji, and
Javaan Chahl [4] review various methods of detecting the hand: sensors attached to
wired gloves, a camera with marked hardware gloves, computer vision, or just a
bare hand. Ram Pratap Sharma and Gyanendra K. Verma [7] worked on human-
computer interaction using image processing: after a frame is extracted, it is
converted from RGB, the palm and fingers are detected in the input image using
skin-color-based segmentation, non-skin pixels are set to black and the rest to
white, and various other image pre-processing techniques are applied.

Haitham Badi [2] developed a vision-based hand gesture recognition method in
which a captured image is scaled to the required level and adjusted for lighting
conditions; an Artificial Neural Network (ANN) is used, and a comparison is made
between hand contour-based feature extraction and complex moments-based
feature extraction. The system proposed by Muhammad Inayat Ullah Khan [3]
builds on the idea of skin color, taking into account variations in image plane and
pose, skin color, and structural components such as hair on the human hand,
which add further variability; background and lighting conditions were also
considered.

Zhi-hua Chen, Jung-Tae Kim, Jianning Liang, Jing Zhang, and Yu-Bo Yuan [8]
worked on the idea of finger segmentation: after the hand is detected, the fingers
and palm are segmented, the fingers are recognized, and thus hand gestures are
recognized. The palm point is characterized as the center of the palm and is found
by the distance transform, in which every pixel of the distance map records its
distance to the nearest boundary pixel. Once the palm point is detected, a circle is
drawn with a radius measured from the palm point; it is known as the inner circle
since it lies inside the palm. The radius of the circle gradually grows until it reaches
the edge of the palm, that is, until black pixels would be included in the circle. With
the help of the resulting palm mask, the fingers and palm can be separated easily:
the part of the hand covered by the palm mask is the palm, while the other parts of
the hand are the fingers. Next, the palm point and wrist points are obtained; an
arrow is drawn from the palm point to the midpoint of the wrist line at the bottom of
the hand, and the arrows are then aligned to the north.

Norah Meshari Alnaim [5] provides a comprehensive literature review of 2D/3D
image and video technologies and applications. Research in the areas of interest is
discussed, starting with an evaluation using a 3D camera; after finger-movement
measurement techniques, the use of classification algorithms for gesture
recognition is discussed, followed by image processing techniques and 2D/3D
video and image applications.

The paper by G. Shivakumar and J. Yashas [1] presents literature reviews on hand
gesture recognition (HGR). Data acquisition options such as cameras, wrist
sensors, and gloves are already mature, so the current focus is on extracting
features from the existing data and on the methods and code used to improve the
feature extraction process. These processes have been tested, and in recent
publications the shortcomings of one feature extraction technique have been
covered by combining it with other feature extraction techniques. Recent HGR
studies work on multimodal fusion, such as skeletal point information: RGB and
depth images form a set of input modalities, and problems related to one modality
can be overcome with another. The survey by Sushmita Mitra [6] provides an
overview of gesture recognition with particular focus on hand gestures and facial
expressions.


3. PROBLEM DEFINITION

Controlling system functions using only gestures, without touching wired
controllers, is very useful.

Wired controllers also became difficult to use during the pandemic. By using
gestures to control user functions, wired devices such as the keyboard, mouse,
and touchpad can be removed, thereby reducing the complexity of the system and
the overall system cost.

3.1 Objectives:

● Recognize hand and display images using CV2.


● Pre-processing the image.
● Defining the landmarks and connecting the coordinates.
● Defining various gestures and implementing them.
● Running the gestures for controlling the media.


4. SOFTWARE REQUIREMENTS SPECIFICATION

A requirement specification is a collection of all the requirements imposed on the
design and validation of the product, together with other relevant information
necessary for designing, validating, and maintaining it. There are two types of
requirement specification, hardware and software. The most frequent set of
requirements defined by any operating system or software application concerns
physical computer resources, and a hardware requirements list is normally
accompanied by hardware compatibility information. The product requirements
describe the features and functionality of the target system and also convey what
users expect from the product.

4.1 Hardware Requirements

Device              Description
Processor           Intel i3 and above
RAM                 4 GB and above
Operating System    Windows 8 and above, 64-bit architecture
Camera              0.8 MP and above

Table 4.1.1 Hardware requirements

4.2 Software Requirements

Python Environment  PyCharm IDE, Jupyter Notebook
Language            Python 3.8
Libraries           OpenCV, Pycaw, Pyautogui, Pynput, and Mediapipe

Table 4.2.1 Software requirements


5. SYSTEM DESIGN

System design is the process of defining the components, modules, interfaces, and
data of a system so that it satisfies the specified requirements. System
development is the process of creating or altering systems, along with the
processes, practices, models, and methodologies used to develop them.

First, we play a video in the VLC player, selecting it either manually or by using our
fingers as a virtual mouse. We then open the OpenCV frame, which senses
gestures continuously so that the corresponding functions can be performed;
whichever activity we have assigned to a gesture then takes place. Fig. 5.1 shows
the basic view of the system we are going to implement.

Fig. 5.1 System design flowchart

The framework depends on a single-camera setup with no requirement for
wearable gadgets. Several vision techniques are used to realize the proposed
system. First, the hand image is separated from its background using skin
segmentation, and motion detection is also used to identify moving objects. A
connected-component labeling algorithm is then applied, which speeds up
processing and yields a more precise identification of the object. In addition, a color
distribution model is used to perform hand tracking.
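
As an illustration of that pipeline, here is a minimal skin-segmentation plus
connected-component sketch in OpenCV. The HSV thresholds and the choice to
keep only the largest component are our assumptions, not values taken from the
report.

import cv2
import numpy as np

def hand_mask(frame_bgr):
    """Rough skin segmentation plus connected-component labeling (illustrative thresholds)."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Assumed skin-color range in HSV; a real system tunes or adapts this per user/lighting.
    mask = cv2.inRange(hsv, np.array([0, 40, 60]), np.array([25, 180, 255]))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # Connected-component labeling: keep only the largest blob, assumed to be the hand.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if n > 1:
        largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
        mask = np.where(labels == largest, 255, 0).astype(np.uint8)
    return mask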


Sl no.  Libraries and Packages  Description
1       OpenCV                  Used to accelerate machine perception and computer vision tasks.
2       Mediapipe               Used to process the captured image.
3       Pycaw                   Windows audio library used to control the system volume.
4       Pyautogui               Used to control mouse and other graphical user interface functionality.
5       Pynput                  Used to control input devices.
6       cv2.circle              Used to mark landmarks on the hand.
7       cv2.imshow              Used to display the captured image in a window.
8       cv2.waitKey(n)          Used to wait for n milliseconds.

Table 5.1 Libraries and Packages


6. METHODOLOGY

The Hand Gesture Controlled Video Application project utilizes several machine
learning techniques commonly used in computer vision. We have made use of
several Python libraries such as mediapipe, pycaw, and pyautogui along with the
OpenCV package. Based on the literature survey, these are the tasks to be
performed:

6.1 Recognize the hand using OpenCV library

The OpenCV library can be used for image processing and various computer
vision tasks such as face detection, landmark detection, and object tracking. The
primary step is to load the video stream in the program, segregate the video data
frame by frame, and perform object detection, as in the sketch below.
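
A minimal sketch of that loading/streaming step, assuming a default webcam at
index 0 (a video file path works the same way):

import cv2

cap = cv2.VideoCapture(0)            # 0 = default webcam; a file path also works
while cap.isOpened():
    ok, frame = cap.read()           # segregate the stream frame by frame
    if not ok:
        break
    cv2.imshow("Input", frame)       # display the current frame
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()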

6.2 Processing the image

After segregating the video data frame by frame, the frame rate is calculated.
Time-series data such as video and audio can be processed using the MediaPipe
framework. On detecting the hand, the image is processed and the landmarks are
constructed.

6.3 Defining the Gesture

The constructed landmarks consist of twenty-one points, drawn using the
cv2.circle function. We calculate the coordinate values for different gestures and
define the function to be performed.

6.4 Testing the end result

Once the gestures are defined, testing is done to check whether the system
performs all of the defined functions and whether it works across various platforms.
Errors are minimized and the efficiency of the algorithm is then improved so that
the application runs smoothly.


7. IMPLEMENTATION

For live video streaming, the streaming application pops up and a raw image is
taken as input with the help of a web camera. This is done using mediapipe along
with the OpenCV package. The obtained image is pre-processed with the help of
twenty-one landmarks, and each landmark position is marked with a circle using the
cv2.circle function to get the specific points on the palm. Fig. 7.1 shows the
flowchart of the course of action.

Fig. 7.1 Flowchart of the system


We can use the double-click option as a gesture and open the application by
moving the cursor with our fingers. We then select the video through the browse
option and play it; the OpenCV instance keeps waiting for input from the user and
performs the action mapped to each gesture.

We use a Python library called mediapipe for hand detection; to calculate the frame
rate, we take the current time and subtract the previous time from it. Mediapipe
provides ML solutions such as Iris Detection, Face Mesh Detection, Face
Detection, Pose Detection, Hands Detection, Holistic Detection, Hair Segmentation,
Object Detection, Instant Motion Tracking, and Box Tracking.
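
A small sketch of that frame-rate calculation (the names are ours):

import time

prev_time = 0.0

def update_fps():
    """FPS = 1 / (current time - previous time), as described above."""
    global prev_time
    cur_time = time.time()
    fps = 1.0 / (cur_time - prev_time) if prev_time else 0.0
    prev_time = cur_time
    return fps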

Fig. 7.2 displays the various landmarks of the hand that can be detected; we
calculate the coordinate value of each landmark. We locate the coordinates of the
index fingertip and the tip of the thumb and find the distance between them. Using
this distance together with the pycaw library, we can increase or decrease the
volume.

Fig. 7.2 Hand landmarks using Mediapipe


As seen in the sample output in Fig. 7.2, the landmark positions of the hand are
detected first; we then draw a circle on those landmarks using the cv2.circle
function.
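
Since the actual snippet exists only as a figure, here is a minimal reconstruction of
that detect-and-draw step using the MediaPipe Hands solution; the confidence
threshold and window name are our assumptions.

import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input, while OpenCV captures BGR frames.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        h, w, _ = frame.shape
        for hand_lms in results.multi_hand_landmarks:
            for lm in hand_lms.landmark:   # 21 normalized landmarks per hand
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(frame, (cx, cy), 5, (255, 0, 255), cv2.FILLED)
    cv2.imshow("Hand detection", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()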

Fig. 7.3 Depicts the finger values in an array

The hand consists of the landmarks shown in Fig. 7.3, from which a value is set for
each finger of each hand, numbered 0 to 4 respectively. The left and right hands
are stored as hand1[] and hand2[] with the finger states. For example, the index
finger of the right hand is denoted hand1[1] and that of the left hand hand2[1]; if it is
straight, then hand1[1] == 1.
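
A sketch of how such a finger-state array can be filled from the 21 landmarks. The
tip/joint comparison is the usual heuristic, the indices follow MediaPipe's layout, and
the helper name is ours.

TIP_IDS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, pinky tips in MediaPipe

def finger_states(lm, is_right=True):
    """lm: list of 21 (x, y) pixel coordinates. Returns [thumb..pinky], 1 = finger up."""
    fingers = []
    # Thumb: compare tip x with the joint next to it (side depends on the hand).
    if is_right:
        fingers.append(1 if lm[4][0] > lm[3][0] else 0)
    else:
        fingers.append(1 if lm[4][0] < lm[3][0] else 0)
    # Other fingers: tip above the PIP joint (smaller y) means the finger is straight.
    for tip in TIP_IDS[1:]:
        fingers.append(1 if lm[tip][1] < lm[tip - 2][1] else 0)
    return fingers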


Fig. 7.4 Different hand gestures

The Fig. 7.4 shows various Hand Gestures defined in our application.

Fig. 7.5 Code snippet for hand detection

Fig. 7.5 shows the code snippet for hand detection used to increase and decrease
the volume: we find the landmarks of the index tip and the thumb tip, compute the
distance between them, and map it to a volume level using the pycaw package.


Fig. 7.6 Code snippet for volume control

Fig. 7.6 shows the code snippet for volume control. To control the system volume,
we first identify the landmark at the tip of the index finger and the tip of the thumb,
find the distance between these coordinates, and map that distance onto the
volume scale; as the distance increases or decreases, the volume value changes
with it. To move the video forward or backward, play and pause it, and take
screenshots, we first find all the coordinates, then put the state of the tip of each
finger into an array called fingers. If the value of an element in the array is one, the
corresponding finger is up. Based on this concept, we can assign various gestures
to various functions. A sketch of the volume mapping appears below.
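
A minimal sketch of that mapping with pycaw; the 30-250 px pinch range is an
assumption about the camera distance, and the function name is ours.

import math
from ctypes import cast, POINTER

import numpy as np
from comtypes import CLSCTX_ALL
from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume

# Standard pycaw boilerplate: bind to the default speaker endpoint.
devices = AudioUtilities.GetSpeakers()
interface = devices.Activate(IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
volume = cast(interface, POINTER(IAudioEndpointVolume))
min_db, max_db, _ = volume.GetVolumeRange()

def set_volume_from_pinch(thumb, index):
    """thumb, index: (x, y) pixel coordinates of landmarks 4 and 8."""
    length = math.hypot(index[0] - thumb[0], index[1] - thumb[1])
    # Map an assumed 30-250 px pinch distance onto the endpoint's dB range.
    level = np.interp(length, (30, 250), (min_db, max_db))
    volume.SetMasterVolumeLevel(level, None)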


Fig. 7.7 Code snippet to capture screen

As shown in Fig. 7.7, we use additional packages such as pynput, pyautogui, and
pycaw to take keyboard input, capture screenshots, and control the volume. To
control the mouse, we draw a rectangle on the OpenCV output frame and limit the
movement of the fingers to this region; otherwise scaling problems arise when
mapping to the screen. Using pynput, we control mouse functionality such as
moving the cursor and single-clicking, and showing the palm triggers a screenshot.
A minimal sketch follows.
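
A sketch of that cursor mapping, click, and screenshot logic; the margin value and
function names are ours.

import numpy as np
import pyautogui
from pynput.mouse import Button, Controller

mouse = Controller()
screen_w, screen_h = pyautogui.size()

def move_cursor(x, y, frame_w, frame_h, margin=100):
    """Map the fingertip from the inner rectangle of the frame onto the screen."""
    sx = np.interp(x, (margin, frame_w - margin), (0, screen_w))
    sy = np.interp(y, (margin, frame_h - margin), (0, screen_h))
    mouse.position = (sx, sy)

def single_click():
    mouse.click(Button.left, 1)

def capture_screen(path="screenshot.png"):
    pyautogui.screenshot(path)   # saves a screenshot of the whole desktop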


Fig. 7.8 Code snippet to seek the media forward and backward

Fig. 7.8 shows the code snippet for seeking the media forward and backward: the
media is seeked forward by showing a gesture with the finger1 and finger2 values
of a specific hand set to one, and seeked backward by doing the same on the other
hand.
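
A sketch of the seek gestures under the finger-array convention above, assuming
the media player maps the arrow keys to a short seek (hotkeys vary per player):

import pyautogui

def handle_seek(hand1, hand2):
    """hand1/hand2: finger-state arrays for the two hands (1 = finger up)."""
    if hand1[1] == 1 and hand1[2] == 1:    # index + middle up on one hand
        pyautogui.press('right')           # forward seek (player hotkey assumed)
    elif hand2[1] == 1 and hand2[2] == 1:  # same gesture on the other hand
        pyautogui.press('left')            # backward seek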

Fig. 7.9 Code snippet for play/pause the media

Fig. 7.9 shows the code snippet to play and pause the media by projecting the
gesture with the finger4 value set to 1 on either hand. cv2.waitKey makes the
program wait briefly for a smooth transition.
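
A matching sketch for play/pause, assuming the space bar toggles playback in the
target player; the debounce delay is our choice.

import cv2
import pyautogui

def handle_play_pause(fingers):
    """fingers: finger-state array; index 4 is the pinky ('finger4')."""
    if fingers[4] == 1:
        pyautogui.press('space')   # space toggles play/pause in most players
        cv2.waitKey(300)           # brief wait so one gesture fires only once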


8. TESTING
Software testing is the act of examining the artifacts and the behavior of the
software under test by validation and verification. Software testing can also provide
an objective, independent view of the software that allows the business to
appreciate and understand the risks of the software implementation. There are
various types of testing, some of which are mentioned below:

8.1. Types of Testing:

8.1.1. Unit Testing

Unit testing focuses on the smallest unit of the software design: we test a single
unit or a group of interrelated units. It is usually done by the developer by supplying
test input and observing the corresponding outputs.

8.1.2. Module Testing

Module testing is a process in which each unit of a module is tested to ensure it
adheres to good coding standards. Unless a module passes the testing phase, it
cannot go forward to application testing. Module testing, also known as component
testing, helps in the early detection of errors.

8.1.3. System Testing

System testing is a type of software testing performed on a complete integrated
system to evaluate its compliance with the corresponding requirements. In system
testing, the components that passed integration testing are taken as input.

8.1.4. Integration Testing

The goal is to take unit-tested components and build a program structure as
dictated by the design. In integration testing, a group of components is combined to
produce output.


8.2. Types of Software Testing Techniques:

8.2.1. Black Box Testing

Black box testing is a method of testing without any knowledge of the application's
internal workings. The tester has no understanding of the system architecture and
no access to the source code. When performing a black box test, a tester typically
engages with the system's user interface by providing inputs and examining
results, without knowing how or where the inputs are processed. Black box testing
focuses on the behavior of the software: it entails testing from the outside, from the
point of view of the end user, and almost any level of software testing can benefit
from it.

8.2.2. White Box Testing

White box testing entails a thorough examination of the code's fundamental logic
and structure. To perform white box testing on an application, the tester must be
familiar with the internal workings of the code and must examine the source code
to determine which unit or chunk of code is behaving abnormally. Testing is based
on the coverage of code statements, branches, paths, or conditions. White box
testing is considered low-level testing and is also called glass box, transparent box,
clear box, open box, or code-based testing.

8.2.3. Grey Box Testing

Grey box testing is a technique for testing an application with only a limited
understanding of its internal workings. In software testing, the phrase "the more
you know, the better" carries a lot of weight: a tester with extensive domain
expertise always has an advantage over someone with little domain knowledge.


8.3. Module Testing:

8.3.1. Hand Detection

In the beginning, the hand is shown to the system camera; the expected output is
that the captured image is processed and landmarks are added using libraries
such as mediapipe and OpenCV. The output obtained works in accordance with
the expected output.

8.3.2. Volume Control Module

When the module is executed, a window pops up and a volume gesture is shown.
It is expected that when the tips of the thumb and index finger are shown with the
defined adjustment, the system volume changes in accordance with the defined
gesture. The output obtained works in accordance with the expected output.

8.3.3. Mouse Control And Screen Capture

When the module is executed, a window pops up and a defined gesture is
projected in front of the camera. It is expected that when the index finger hovers,
the mouse cursor aligns with its movement, and that when the palm is shown, a
screenshot of the desktop is taken. The output obtained works in accordance with
the expected output.

8.3.4. Play/Pause And Media Seeking

When this module is executed, the camera window pops up and the gesture
defined for it is shown. It is expected that when a video is played in parallel,
play/pause as well as seeking forward and backward are controlled by the gesture.
The output obtained works in accordance with the expected output.


9. RESULTS

Fig. 9.1 Camera window pop-up

Fig. 9.1 depicts the live streaming window application. This is done using
cv2.imshow: we display the image captured by the camera in an application
window using OpenCV.

Fig. 9.2 Hand Detection with landmarks


Fig. 9.2 depicts the capture of the hand using computer vision and the camera
through the OpenCV library. OpenCV (Open Source Computer Vision Library) is an
open-source computer vision and machine learning software library, designed to
provide a common infrastructure for computer vision applications and to accelerate
the use of machine perception in commercial products.

Fig. 9.3 Fingers count

Fig. 9.3 demonstrates the processing of images from the video input and the
tracing of landmarks using the Mediapipe library. MediaPipe is a framework for
building ML pipelines that handle time-series data such as video and audio. This
cross-platform framework works on desktop/server, Android, iOS, and embedded
devices such as the Raspberry Pi and Jetson Nano. Here FPS stands for frames
per second.


Fig. 9.4 Detecting multiple hands

Fig. 9.4 shows the detection of multiple hands on a single screen. Though a
gesture can be defined for both hands, only a single hand gesture will be executed
at a time.

Fig. 9.5(a) Gesture to increase the system volume


Fig. 9.5(b) Gesture to decrease the system volume

Fig. 9.5(a) demonstrates the gesture to increase the system volume and Fig. 9.5(b)
the gesture to decrease it, using the Pycaw library, through which we control the
system volume accordingly.

Fig. 9.6(a) Gesture to move the video forward

Fig. 9.6(a) shows the gesture used to seek the media forward by 5 seconds.


Fig. 9.6(b) Gesture to move the video backward

Fig. 9.6(b) shows the gesture used to seek the media backward by 5 seconds.

Fig. 9.7 Controlling the mouse with gesture

Fig. 9.7 depicts controlling the mouse and performing a double-click action using
pyautogui.


Fig. 9.8 Screenshot through gesture

Fig. 9.8 shows the gesture for capturing a screenshot.


10. CONCLUSION AND FUTURE WORK


Hand gesture systems are among the most natural interaction frameworks.
Controlling things by hand is easier, simpler, more adaptable, and less expensive,
and it avoids the issues brought about by hardware devices, since none are
required. Every strategy mentioned above, however, has its benefits and
drawbacks, and may perform well in certain situations while being mediocre in
others.

We have added gestures to detect the number of fingers, move the mouse cursor,
single-click, take screenshots, increase and decrease the volume, move the video
forward and backward by n seconds as required, and pause and play the video in
any media player. The results also showed that the gesture-controlled application
was quite robust for web-captured images. The application was very sensitive to
live noise on the video stream: slight movements of the palm could affect
recognition of the gesture. However, when the hand is kept steady long enough,
the program runs efficiently.

Based on our results, computer vision applications can recognize basic hand
gestures, for example in robot navigation, using heuristic rules. The primary goal of
this project was to simplify the use of the existing system through gestures rather
than touch-based devices.

Future work includes embedding our system into various media applications to
control media functionality through gestures, and representing numbers as
commands in real time. Enhancing recognition under various lighting conditions,
which was encountered as a challenge in this project, can also be worked on in the
future.


REFERENCES

[1] J. Yashas and G. Shivakumar, "Hand Gesture Recognition: A Survey," 2019
International Conference on Applied Machine Learning (ICAML), 2019, pp. 3-8.

[2] H. Badi, "Recent methods in vision-based hand gesture recognition,"
International Journal of Data Science and Analytics, vol. 1, 2016. doi:
10.1007/s41060-016-0008-z.

[3] M. I. U. Khan, "Hand Gesture Detection and Recognition System," 2011.

[4] M. Oudah, A. Al-Naji, and J. Chahl, "Hand Gesture Recognition Based on
Computer Vision: A Review of Techniques," J. Imaging, vol. 6, no. 8, p. 73, 2020.
https://doi.org/10.3390/jimaging6080073

[5] N. M. Alnaim, "Hand Gesture Recognition using Deep Learning Neural
Networks," 2020.

[6] S. Mitra and T. Acharya, "Gesture Recognition: A Survey," IEEE Transactions
on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 37,
no. 3, pp. 311-324, May 2007. doi: 10.1109/TSMCC.2007.893280.

[7] G. Verma, "Human Computer Interaction using Hand Gesture," Procedia
Computer Science, vol. 54, 2015. doi: 10.1016/j.procs.2015.06.085.

[8] Z.-h. Chen, J.-T. Kim, J. Liang, J. Zhang, and Y.-B. Yuan, "Real-Time Hand
Gesture Recognition Using Finger Segmentation," The Scientific World Journal,
vol. 2014, Article ID 267872, 9 pages, 2014.

[9] A. Haria, A. Subramanian, N. Asokkumar, S. Poddar, and J. S. Nayak, "Hand
Gesture Recognition for Human Computer Interaction," Procedia Computer
Science, 2017.

[10] S. Suriya and V. Vijayachamundeeswari, "A survey on hand gesture
recognition for simple mouse control," 2014 International Conference on
Information Communication and Embedded Systems (ICICES), Chennai, India,
Feb. 2014, pp. 1-5.

[11] Z. Hussain, A. K. Talukdar, and K. K. Sarma, "Hand gesture recognition
system with real-time palm tracking," 2014 Annual IEEE India Conference
(INDICON), Pune, India, Dec. 2014, pp. 1-6.
