
Hand Motion Implicit Mouse Using OpenCV, MediaPipe, PyAutoGUI

Aryan Saproo
C.Tech
SRM Institute of Science & Technology
Chennai, Tamil Nadu
as1083@srmist.edu.in

Manav Patidar
C.Tech
SRM Institute of Science & Technology
Chennai, Tamil Nadu
mp3122@srmist.edu.in

Dr. S. Nagadevi
C.Tech
SRM Institute of Science & Technology
Chennai, Tamil Nadu

Abstract— By combining computer vision, machine learning, and gesture recognition technologies, the Hand Motion Implicit Mouse project transforms user-computer interaction. By interpreting hand gestures recorded by a webcam in real time and converting them into exact cursor movements and commands, it enables intuitive navigation of digital environments. The system's use of Convolutional Neural Networks (CNNs) improves the accuracy and resilience of gesture recognition, which is especially advantageous for people with physical disabilities. Beyond conventional computing, this technology streamlines accessibility and user experience while finding uses in virtual reality, gaming, and interactive presentations. Dedicated to ongoing innovation, the project actively integrates usability testing and user feedback to ensure the system remains user-friendly and accessible to a wide variety of users.

1 Introduction
By combining computer vision, machine learning, and gesture recognition technologies, the "Hand Motion Implicit Mouse" project offers an innovative way to rethink how people interact with computers. The system offers a natural, user-friendly substitute for conventional mouse or touchpad input devices by using a webcam to record simple hand gestures, which is especially advantageous for users with physical disabilities or mobility issues. Through sophisticated algorithms, computer vision enables real-time interpretation of hand gestures, allowing users to perform actions such as dragging, clicking, scrolling, and moving the cursor. By analyzing hand movement data, machine learning also improves the system over time, increasing accuracy and responsiveness and recognizing different gestures for versatile user interaction.

Gesture recognition algorithms are essential for properly recognizing and categorizing hand gestures and for guaranteeing a seamless, simple user experience. To improve the interface and make computing more inclusive, the project places a strong emphasis on ongoing user feedback and usability testing. The hand motion implicit mouse also has many uses beyond personal computers, in sectors like education, where gesture interfaces allow for immersive and interactive learning experiences, and healthcare, where hands-free control enhances hygiene. By investigating these possible uses, the project hopes to have the greatest possible impact and to change how people interact with technology, encouraging creativity and accessibility.

2 RELATED WORK
The Leap Motion Controller and Microsoft Kinect are two examples of systems designed for gesture-based HCI that track hand and body movements using infrared sensors. However, the specialized hardware these systems require can be expensive and impractical for broad use. Our system, on the other hand, is more accessible and reasonably priced for a larger audience because it uses a standard webcam.

Research in computer vision and hand gesture recognition has grown, concentrating on deep learning algorithms for real-time hand tracking. Although they frequently require large datasets for training and significant computational resources, hand tracking techniques like convolutional neural networks (CNNs) and region-based approaches have shown great promise. MediaPipe, an open-source framework created by Google, has made it easier to implement real-time hand tracking and gesture recognition using 21-point hand landmarks. As demonstrated in [2] and [3], other noteworthy contributions to hand gesture interfaces have addressed a variety of topics, including dynamic gesture recognition using Time Delay Neural Networks (TDNNs) and Hidden Markov Models (HMMs). In contrast, our system relies on straightforward static gestures for reliable, real-time control.
NEED FOR THE PROPOSED WORK
The suggested system creates a dynamic, finger-only virtual mouse that can follow hand motions without the need for extra hardware or color markers. The system uses MediaPipe for hand landmark detection, OpenCV for image processing, and a standard webcam. Certain mouse actions, such as left and right clicks and scrolling, are mapped to different combinations of raised fingers.

A. System Architecture
1. The webcam continuously captures frames.
2. OpenCV is used to process the frames.
3. MediaPipe recognizes the hand landmarks and the status of each finger (up or down).
4. Mouse actions are determined by gesture recognition.
5. PyAutoGUI uses the recognized gestures to carry out the actual mouse clicks and movements.
A minimal end-to-end sketch of this pipeline follows the Algorithm Overview below.
B. Algorithm Overview
1. Launch the webcam stream and initialize the system.
2. Use MediaPipe to detect hands and fingertips.
3. Use finger positions to identify gestures.
4. Associate mouse events with gestures.
5. Use PyAutoGUI to carry out the appropriate mouse action.
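The following minimal sketch ties steps 1-5 together, assuming the opencv-python, mediapipe, and pyautogui packages are installed; the specific finger-to-action mapping (index raised moves the cursor, index and middle raised together clicks) is an illustrative choice, not necessarily the paper's exact mapping.

import cv2
import mediapipe as mp
import pyautogui

mp_hands = mp.solutions.hands
screen_w, screen_h = pyautogui.size()

cap = cv2.VideoCapture(0)                              # step 1: webcam stream
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.flip(frame, 1)                     # mirror for natural motion
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # step 2: OpenCV processing
        result = hands.process(rgb)                    # step 3: landmark detection
        if result.multi_hand_landmarks:
            lm = result.multi_hand_landmarks[0].landmark
            index_up = lm[8].y < lm[6].y               # fingertip above its PIP joint
            middle_up = lm[12].y < lm[10].y
            if index_up and not middle_up:             # step 4: identify gesture
                # step 5: map normalized coords to the screen and move the cursor
                pyautogui.moveTo(lm[8].x * screen_w, lm[8].y * screen_h)
            elif index_up and middle_up:
                pyautogui.click()                      # no debouncing in this sketch
        cv2.imshow("Hand Motion Implicit Mouse", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()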
ARCHITECTURE DIAGRAM
A webcam and microphone are the first input devices shown in the architecture diagram for the hand motion implicit mouse system (Figure 1). These devices record user voice commands and hand gestures in real time. The voice recognition module decodes spoken commands using natural language processing (NLP), while the gesture recognition module analyzes and interprets the recorded hand movements using computer vision techniques. Once a gesture is identified, the control logic maps it to the appropriate mouse operation, ensuring that commands are executed smoothly and that the computer interface is managed. A browser-based interface allows users to alter and personalize their gestures, making the experience more accessible. Finally, the system transforms voice and gesture inputs into particular output actions.

Figure 1: Architecture Diagram
3 METHODOLOGY

Capture Input
Real-time video of the user's hand movements is recorded by a camera, typically a webcam or depth sensor. The system receives this input continuously for processing.
A. Hand Detection
The system detects the presence of a hand and recognizes important landmarks like the palm center, joints, and fingertips using computer vision techniques, typically through pre-trained models like MediaPipe Hands. These landmarks serve as the foundation for gesture comprehension.
B. Gesture Recognition
The system categorizes gestures according to the locations and relative separations of the identified landmarks. For example, a two-finger swipe may indicate scrolling, pinching the thumb and index finger simulates a left click, and moving the open hand moves the cursor. For more intricate gestures, machine learning algorithms or rule-based logic can be used.
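As a concrete illustration of the rule-based option, the sketch below classifies a pinch by thresholding the normalized distance between the thumb tip and index fingertip (landmarks 4 and 8 in MediaPipe's 21-point model); the threshold value is an assumption that would need calibration per user and camera.

import math

THUMB_TIP, INDEX_TIP = 4, 8        # fingertip indices in the 21-point model

def classify_gesture(lm, pinch_threshold=0.05):
    # lm is the 21-landmark list from MediaPipe Hands (normalized coordinates);
    # the 0.05 pinch threshold is illustrative, not a calibrated constant.
    dist = math.hypot(lm[THUMB_TIP].x - lm[INDEX_TIP].x,
                      lm[THUMB_TIP].y - lm[INDEX_TIP].y)
    if dist < pinch_threshold:
        return "left_click"        # thumb-index pinch simulates a left click
    index_up = lm[8].y < lm[6].y   # a finger is "up" if its tip is above its PIP joint
    middle_up = lm[12].y < lm[10].y
    if index_up and middle_up:
        return "scroll"            # two raised fingers indicate scrolling
    return "move_cursor"           # neutral/open hand just moves the cursor

In the pipeline sketched earlier, classify_gesture would be called once per frame on result.multi_hand_landmarks[0].landmark.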
C. Cursor Mapping
The cursor position is controlled by normalizing the detected hand's coordinates and mapping them to the screen's dimensions. This stage guarantees that the hand's motion is faithfully represented as responsive, fluid cursor movement.
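A minimal sketch of this mapping step, assuming normalized (0 to 1) MediaPipe coordinates as input; the 10% frame margin, which keeps screen edges reachable without stretching the arm to the frame border, is an illustrative choice.

import numpy as np
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()
MARGIN = 0.1                       # ignore the outer 10% of the frame

def to_screen(norm_x, norm_y):
    # Map a normalized (0-1) hand coordinate to absolute screen pixels,
    # interpolating through the margin so screen edges stay reachable.
    x = np.interp(norm_x, (MARGIN, 1 - MARGIN), (0, SCREEN_W))
    y = np.interp(norm_y, (MARGIN, 1 - MARGIN), (0, SCREEN_H))
    return x, y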
D. Mouse Action Simulation
Mouse events like clicks, double clicks, drag-and-drop, and scrolling are simulated using software tools like pyautogui or autopy. The recognized gestures serve as the basis for these actions.
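The PyAutoGUI calls behind these events are one-liners; the gesture-to-event mapping below is an illustrative sketch, with gesture names assumed from the classifier sketch above rather than taken from the paper.

import pyautogui

def perform_action(gesture, x=None, y=None):
    # Dispatch a recognized gesture name to the matching PyAutoGUI event.
    if gesture == "move_cursor":
        pyautogui.moveTo(x, y)
    elif gesture == "left_click":
        pyautogui.click()
    elif gesture == "double_click":
        pyautogui.doubleClick()
    elif gesture == "scroll":
        pyautogui.scroll(-40)              # negative values scroll down
    elif gesture == "drag":
        pyautogui.dragTo(x, y, button="left")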
E. Smoothing & Calibration
The system uses smoothing methods like Kalman filters or exponential moving averages to avoid jittery or unpredictable cursor behavior. Calibration options are frequently included to account for user comfort, camera position, and hand size.
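A sketch of the exponential-moving-average option mentioned above; the smoothing factor alpha is an assumed default that would in practice be exposed as a calibration setting.

class EMASmoother:
    # Exponential moving average over cursor coordinates: lower alpha gives
    # a smoother but laggier cursor, so alpha belongs in the calibration UI.
    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.x = self.y = None

    def update(self, x, y):
        if self.x is None:                 # first sample seeds the state
            self.x, self.y = x, y
        else:
            self.x = self.alpha * x + (1 - self.alpha) * self.x
            self.y = self.alpha * y + (1 - self.alpha) * self.y
        return self.x, self.y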
F. User Feedback
To improve usability, the system might offer audio feedback or visual cues (like on-screen icons) to verify gesture recognition and action execution, assisting users in understanding what the system is interpreting.

With its natural, intuitive, and touchless interface, this methodology is well suited to hygienic settings like hospitals, gesture-controlled interfaces, and accessibility tools.

A major breakthrough in human-computer interaction has been made with the creation of a hand motion implicit mouse, which provides a touchless substitute for conventional input devices. This system is especially helpful in situations where hands-free interaction, accessibility, or hygiene are crucial, such as in smart homes, medical settings, or for people with physical disabilities.

The technique uses gesture recognition and computer vision to decipher hand movements, which are subsequently converted into common mouse operations like scrolling, clicking, and cursor movement. Its natural and intuitive interface, which allows users to interact with the system through everyday gestures without making physical contact, is one of its key advantages.

However, hand detection and gesture recognition quality have a significant impact on the system's accuracy and responsiveness. Performance may be affected by changes in background noise, hand shape, camera angle, and lighting. Furthermore, extended unsupported hand use can cause user fatigue, a phenomenon sometimes known as the "gorilla arm" effect in gesture-based interfaces.

To lessen these problems, the system needs strong smoothing algorithms, error-tolerance features, and adjustable sensitivity settings. Real-time feedback is also essential so that users can understand how their gestures are being interpreted. Although existing technologies like MediaPipe and PyAutoGUI provide a solid foundation, further improvements could include multi-hand interactions, 3D depth awareness, or machine learning for adaptive gesture learning.

In conclusion, the hand motion implicit mouse has potential advantages for accessibility and natural interaction, but its efficacy depends on both careful user-centric design and technical stability.
4 RESULT
Using OpenCV and PyAutoGUI, the real-time hand detection application captures video frames, converts them to grayscale, applies a pre-trained Haar cascade classifier to identify hands, and moves the mouse pointer to the center of the detected hand. The system recognizes hand gestures accurately, aligns the mouse pointer with the center of the hand, and draws a bounding box around the hands it detects. Its uses range from basic desktop navigation to complex virtual reality gesture control. By utilizing Haar-like features, the classifier can efficiently identify hands in a variety of scenarios, which enhances the fluidity of user interaction, and combining it with PyAutoGUI makes working with graphical user interfaces (GUIs) easier, as sketched below.
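A sketch of this detection loop follows; note that OpenCV does not bundle a hand cascade of its own, so the file hand_cascade.xml stands in for whichever pre-trained hand cascade the system loads.

import cv2
import pyautogui

hand_cascade = cv2.CascadeClassifier("hand_cascade.xml")   # placeholder model file
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)          # Haar features need grayscale
    hands = hand_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in hands:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)  # bounding box
        pyautogui.moveTo(x + w // 2, y + h // 2)            # pointer to hand center
        # (frame-to-screen coordinate mapping is omitted for brevity)
    cv2.imshow("Haar hand detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()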
Gesture detection and recognition accuracy using deep learning models are discussed below.

Based on predefined gestures, the implemented hand motion implicit mouse system was able to detect hand movements and carry out basic mouse operations such as drag, scroll, single-click, double-click, and cursor movement. Utilizing a standard webcam and the MediaPipe Hand Tracking framework, the system achieved real-time performance with minimal lag on mid-range hardware.

Future developments might include adding deep learning models to further improve gesture recognition and accuracy, broadening the system's uses in industries like education, gaming, and healthcare.

The models' descriptions, accuracy, and notes are summarized in the following table:

Table 1: Comparing accuracy and notes

Model                        Description                                    Accuracy   Notes
Haar Cascade Classifier      Detects hands using Haar-like features         85-90%     Performs well under a variety of conditions
Deep Learning Model (CNN)    Estimates hand pose with convolutional nets    >95%       Performs best on complex gestures
Gesture Recognition System   Identifies unique hand gestures                90%        Enhances interaction in VR and gaming
Real-Time Tracking           Tracks hand movement in real time              ~85%       Performance depends on hardware


The table contrasts hand detection and gesture recognition models, describing their features and accuracy. The Haar Cascade Classifier performs well under a variety of circumstances and uses Haar-like features to detect hands with 85-90% accuracy. The deep learning model (a CNN) performs best on complex gestures and uses convolutional networks to estimate hand pose with over 95% accuracy. With a 90% accuracy rate in identifying unique hand gestures, the gesture recognition system enhances user interaction, especially in virtual reality and gaming. Finally, real-time tracking offers movement-tracking accuracy of about 85%, though hardware affects how well it works. The table shows the benefits and drawbacks of each model for applications in healthcare, gaming, and education.
5 CONCLUSION
By enabling the natural use of hand gestures to operate digital devices, the Hand Motion Implicit Mouse project marks an exciting advancement in human-computer interaction. The project demonstrates real-time virtual mouse control by combining computer vision and machine learning with libraries like OpenCV, MediaPipe, and PyAutoGUI. To achieve accurate gesture detection under a variety of conditions, we addressed the complexity of gesture recognition using Convolutional Neural Networks (CNNs). Machine learning algorithms increased accuracy and flexibility, facilitating the seamless recognition of a wide range of gestures.

Additionally, this project improves accessibility by providing motor-disabled individuals with alternative input modes and by enhancing interaction in settings where traditional devices are inconvenient. Future research could expand gesture support, increase precision, and look into wearable and augmented-reality applications.

The project envisions a time when digital devices react to human gestures, making technology more approachable and empowering for all.

REFERENCES
[1] A. Królak and P. Strumiłło, "Vision-based hand gesture recognition for human–computer interaction," Journal of Telecommunications and Information Technology, no. 4, pp. 110–115, 2012.
[2] K. O'Hara, A. Sellen, and A. Harper, "Embodied interaction: Understanding users' engagement with gesture-based interfaces," Proc. CHI, pp. 113–122, 2013.
[3] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. Wiley-Interscience, 2000.
[4] R. A. Bolt, ""Put-that-there": Voice and gesture at the graphics interface," Proc. SIGGRAPH '80: 7th Annual Conference on Computer Graphics and Interactive Techniques, pp. 262–270, 1980.
[5] D. Kim, H. Lee, S. Kim, and J. Kim, "Cursor control system based on hand gestures using real-time hand tracking and click gesture recognition," IEEE International Conference on Consumer Electronics, 2016.
