
Orr 29012024

The document summarizes an object recognition app project submitted for a bachelor's degree. It describes the objectives of object recognition including identifying objects from images and videos. It also outlines some of the tools used for computer vision like features extraction and object detection frameworks.

Uploaded by

ashish1234999

Synopsis Of

OBJECT RECOGNITION APP


Submitted in Partial Fulfillment of the Requirement for the Award of the Degree of

BACHELOR OF TECHNOLOGY
(COMPUTER SCIENCE AND ENGINEERING)
To

VEER BAHADUR SINGH PURVANCHAL UNIVERSITY,


JAUNPUR
Under the Supervision of
Dr. Sunil Yadav
(Assistant Professor of CSE)

Submitted By:
Shivangi Srivastava (205459)
Ashish Singh (205418)
Yadav Nandani Ramavatar (205476)
Sumit Tiwari (205468)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


UNSIET VBS PU JAUNPUR
January, 2024

CANDIDATE’S DECLARATION

We, Shivangi Srivastava (205459), Yadav Nandani Ramavatar (205476), Sumit
Tiwari (205468) and Ashish Singh (205418), students of B.Tech. (Information
Technology) at Uma Nath Singh Institute of Engineering and Technology, V.B.S.
Purvanchal University, Jaunpur, declare that the work presented in this project,
titled “Object Recognition App” and submitted to the Department of Information
Technology for the award of the Bachelor of Technology degree in Information
Technology, is entirely our own except for the references quoted. To the best of
our knowledge, this work has not been submitted to any other university or
institution for the award of any degree.

Date: 29th January, 2024
Place: Jaunpur

Student Names:
Shivangi Srivastava (205459)
Yadav Nandani Ramavatar (205476)
Sumit Tiwari (205468)
Ashish Singh (205418)

CERTIFICATE

It is certified that this project, entitled “OBJECT RECOGNITION APP” and submitted
by Shivangi Srivastava, Yadav Nandani Ramavatar, Ashish Singh and Sumit
Tiwari in partial fulfillment of the requirements for the award of the Bachelor of
Technology degree in Information Technology from V.B.S. Purvanchal University,
Jaunpur, is a record of the students' own study carried out under my/our
supervision. This project has not been submitted to any other university or
institution for the award of any degree.

Project Guide: External Examiner


Dr. Sunil Yadav
(Assistant Professor of CSE)

ABSTRACT

Object recognition is one of the classic problems in computer vision. It is
important for computers to recognize common everyday objects as the human eye and
brain do, and this ability is a significant step toward more intelligent
machines. This synopsis summarizes current research on object recognition and
elaborates on object feature extraction, object feature matching and object
recognition methods.

Object recognition and tracking is a common task in video processing, with
applications including surveillance, security, industrial inspection, medicine, and more.
Recognition and tracking accuracy can drop significantly when the dynamic range
of the scene exceeds that of common camera sensors.

ACKNOWLEDGEMENT

The merciful guidance bestowed upon us by the Almighty helped us see this project
through to a successful end. We humbly pray with sincere hearts for His guidance
to continue forever.
We thank our project guide, who gave us guidance and direction during
this project. His versatile knowledge helped us in critical times during the
span of this project.
We also take this opportunity to express our gratitude to all those who have
been with us, directly or indirectly, during the completion of the project.
We want to thank our friends, who have always encouraged us during this project.
Last but not least, we thank all the faculty of the IT department, who provided
valuable suggestions during the project.

TABLE OF CONTENTS

S.No. Description

01 Candidate's declaration
02 Certificate from supervisor
03 Abstract
04 Acknowledgement
05 Table of contents
06 List of figures
07 Chapter 1: Introduction
1.1 Introduction
1.2 Objectives
1.3 Motivation
08 Chapter 2: Tools and Technology Used
2.1 Computer Vision
2.1.1 Features of Computer Vision
2.2 Frameworks and Software Tools
2.3 Software Description
2.4 Conclusion
09 Chapter 3: Result and Discussion
3.1 Result
3.2 Discussion
10 Chapter 4: Future Scope and Conclusion
4.1 Future Scope
4.2 Conclusion
11 References

LIST OF FIGURES

Fig no. Fig. Name

Fig. 2.1 Various fields of computer vision
Fig. 2.1.1 Object Detection
Fig. 3.1 Object Recognition

CHAPTER 1

INTRODUCTION

1.1 INTRODUCTION
Object recognition is a computer vision technique for identifying objects in
images or videos. Object recognition is a key output of deep learning and
machine learning algorithms. When humans look at a photograph or watch a
video, we can readily spot people, objects, scenes, and visual details. The goal
is to teach a computer to do what comes naturally to humans: to gain a level of
understanding of what an image contains.
Object recognition is a key technology behind driverless cars, enabling them to
recognize a stop sign or to distinguish a pedestrian from a lamppost. It is also
useful in a variety of applications such as disease identification in bioimaging,
industrial inspection, and robotic vision.
Image classification involves predicting the class of one object in an image.
Object localization refers to identifying the location of one or more objects in
an image and drawing a bounding box around their extent. Object detection
combines these two tasks: it localizes and classifies one or more objects in an
image.
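
The difference between the three tasks can be seen in the shape of their outputs. The following is a minimal illustrative sketch, not the project's own code: the `Detection` class, the `(x_min, y_min, x_max, y_max)` box layout, and all labels, scores and coordinates are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    label: str    # what the object is (classification)
    score: float  # model confidence in [0, 1]
    box: Box      # where the object is (localization)


def box_area(box: Box) -> int:
    """Area of a bounding box in pixels; zero if the box is degenerate."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


# Image classification: one label for the whole image.
classification_result = "street scene"

# Object localization: boxes only, without class labels.
localization_result: List[Box] = [(40, 60, 120, 220), (300, 80, 360, 260)]

# Object detection: a label and a box for every object found.
detection_result = [
    Detection("pedestrian", 0.91, (40, 60, 120, 220)),
    Detection("lamppost", 0.78, (300, 80, 360, 260)),
]

for det in detection_result:
    print(f"{det.label}: score={det.score:.2f}, area={box_area(det.box)} px")
```

Keeping the label and the box together in one record mirrors the statement above that detection combines classification with localization.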

1.2 OBJECTIVES
Object recognition is the area of artificial intelligence (AI) concerned with the
ability of robots and other AI implementations to recognize various things and
entities.
Object recognition allows robots and AI programs to pick out and identify
objects from inputs such as video and still camera images. Methods used for
object identification include 3D models, component identification, edge detection
and analysis of appearances from different angles.
Object recognition is only one of the major visual problems of interest;
other activities build on information obtained through vision. The most
important of these is navigation, that is, moving around in the world with the
help of visual sensors.
Developers and engineers are using these techniques to build
machines that will deliver groceries and medicines to our doorsteps.
Object detection forms a basis for other important AI vision techniques, such as
image classification, image retrieval, and object co-segmentation, which derive
meaningful information from real-life objects.

1.3 MOTIVATION

Nowadays, automatic object recognition is a topic of growing interest for the
machine vision community. In particular, automatic building detection from
monocular satellite and aerial images has been an important tool for many
applications, such as the creation and updating of maps and Geographical
Information Systems databases, land use analysis, change detection, and urban
monitoring. Due to rapidly growing urbanization, detecting buildings from images
is an active field of research. Recently, vision and photogrammetry tools have
been increasingly used in the processing of Geographical Information Systems,
cultural heritage modelling, risk management, and monitoring of urban regions.
More specifically, extracting objects such as roads and buildings has gained
significant attention over the last decade. Aerial data are very useful for the
coverage of large areas such as cities, and several aerial-based approaches have
been proposed for the extraction of buildings.
The choice of tool is influenced by several factors: the needs of the task, the
preferred programming language, community support, and the resources that are
available.

CHAPTER 2

TOOLS AND TECHNOLOGY USED

2.1 COMPUTER VISION


Within the broad field of artificial intelligence, computer vision has become
an increasingly promising area in recent years. It focuses on the creation of
algorithms and methods that give machines the ability to derive useful
information from visual data, simulating the capabilities of human vision.
Computer vision has attracted a lot of attention and found applications in a
variety of fields, including robotics, surveillance, automotive, healthcare and
entertainment. This is due to the growing availability of powerful computing
resources and the exponential growth of image and video data.
The primary objective of computer vision is to enable machines to sense and
understand the visual world by analyzing and interpreting digital images or
video clips. This covers a variety of activities, including object detection,
classification, tracking, and recognition in images. By extracting pertinent
information from visual data, computer vision systems can make intelligent
choices, carry out difficult tasks, and interact with their surroundings more
successfully.

Fig. 2.1 Various fields of Computer Vision

2.1.1 FEATURES OF COMPUTER VISION

Computer vision is a challenging and broad field that includes many aspects
and skills that let machines comprehend and evaluate visual data. The following
are some essential aspects of computer vision:

1. Image Classification
Algorithms for computer vision can categorize images into groups or categories
that have been predetermined. Models can recognize and give the proper
classifications to new, unseen photos by learning from a huge dataset of
labelled images. Applications like face detection, object recognition, and scene
understanding frequently take advantage of this functionality.
2. Object detection
Object detection is the process of locating and identifying objects inside a frame
of an image or video. Multiple instances of an object can be found using
computer vision techniques, which can also provide bounding boxes or
segmentation masks at the pixel level. For tasks like autonomous driving,
monitoring, and augmented reality, object detection is essential.

Fig. 2.1.1 Object Detection

Image Segmentation
This technique divides an image into useful sections or segments. Each pixel
receives a label designating which object or area of the background it belongs to.
For projects like clinical image analysis, self-navigating machines, and video
editing, this functionality is useful.
3. Optical Character Recognition (OCR)
OCR is a feature that lets computers read text from scanned or imaged
documents and extract it. OCR systems may transform printed or handwritten
content into modifiable and accessible digital text by utilizing computer vision
algorithms. OCR is frequently employed in text analysis, document digitization,
and text-based information extraction.

4. Image Restoration and Enhancement
Digital image quality can be restored or improved using computer vision
algorithms. This covers operations like deblurring, super-resolution, denoising,
and color correction. Techniques for image restoration and enhancement are
helpful in industries like forensics, digital media processing, and medical
imaging.
5. Histogram of Oriented Gradients (HOG)
HOG is a feature descriptor that captures the distribution of gradient
orientations in an image. It is frequently employed for object detection,
particularly pedestrian detection.
6. Scale-Invariant Feature Transform (SIFT)
SIFT is a feature descriptor that locates and characterizes local features in an
image that are invariant to scale and rotation and robust to moderate affine
distortion. It is frequently employed for object recognition and image matching.
7. Speeded-Up Robust Features (SURF)
SURF is a feature descriptor that improves on SIFT in terms of computational
efficiency. It is employed in processes like 3D reconstruction, image stitching,
and object recognition.
8. Tracking and Motion Analysis
Computer vision algorithms can identify objects or examine motion in a series
of still or moving pictures. This includes activities like tracking objects,
recognizing gestures, identifying activities, and analyzing behavior. For
applications like surveillance, sports analysis, and human-computer interaction,
tracking and motion analysis are essential.
9. 3D Vision
From two-dimensional images or videos, computer vision algorithms can infer
the three-dimensional structure of objects and scenes. This involves methods
like depth estimation, structure from motion, and stereo vision. Applications
like robotics, augmented reality, and 3D reconstruction are made achievable by
3D vision.
10. Face Recognition
Based on facial features, face recognition algorithms can recognize and
authenticate people. Access management, biometric identification, and security
systems all make extensive use of this feature.

11. Scene Understanding
Scene understanding involves analyzing and interpreting complex scenes by
identifying objects, their relationships, and the overall context. To
provide a comprehensive understanding of visual scenes, it combines various
computer vision tasks. Scene understanding is useful for augmented reality,
video surveillance, and autonomous navigation.
12. Texture features
Texture-based features record an image's structural and textural details.
Examples include gray-level co-occurrence matrices (GLCM), Gabor filters,
local binary patterns (LBP), and statistical texture descriptors.
13. Shape-based features
These attributes describe the geometrical characteristics of objects in an image.
Hu moments, Zernike moments, and Fourier descriptors are a few examples.
14. Features based on Color
Color information, such as color histograms, color moments, or other color
descriptors, can be used as features. These characteristics record how colors
are distributed or statistically behave in an image.
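
As an illustration of the color histograms mentioned above, the following is a minimal pure-Python sketch. It assumes an image is given as a flat list of `(R, G, B)` tuples with 8-bit channels; the function name `color_histogram`, the bin count and the sample pixels are hypothetical and are not taken from the project.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: List[Pixel], bins: int = 4) -> List[float]:
    """Normalized per-channel histogram: `bins` values for R, then G, then B."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    width = 256 // bins  # intensity range covered by one bin
    hist = [0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            index = min(value // width, bins - 1)  # clamp the top of the range
            hist[channel * bins + index] += 1
    total = len(pixels)
    return [count / total for count in hist]


# A tiny made-up "image": two reddish pixels, one green, one blue.
image = [(255, 0, 0), (250, 10, 5), (0, 255, 0), (0, 0, 255)]
features = color_histogram(image, bins=4)
print(features)
```

The resulting vector describes how intensities are distributed in each channel and ignores where the pixels are located, which is why such features are usually combined with shape or texture features.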
These characteristics of computer vision work together to make it possible for
machines to perceive and comprehend visual data, opening up a wide range
of applications in various fields. The abilities of computer vision systems are
expected to grow further as research and technological developments proceed,
creating new opportunities and challenges for the industry.
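
To make the idea behind the Histogram of Oriented Gradients from the list above more concrete, here is a simplified sketch in pure Python. It is an assumption-laden illustration: it computes a single magnitude-weighted orientation histogram over a whole grayscale patch, whereas full HOG implementations also divide the image into cells and normalize over blocks. The function name and the sample patch are hypothetical.

```python
import math
from typing import List


def orientation_histogram(image: List[List[float]], bins: int = 9) -> List[float]:
    """Histogram of unsigned gradient orientations (0..180 degrees),
    weighted by gradient magnitude and normalized to sum to 1."""
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal central difference
            gy = image[y + 1][x] - image[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region: no orientation to record
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle // bin_width), bins - 1)] += magnitude
    total = sum(hist)
    return [value / total for value in hist] if total else hist


# Made-up 5x5 patch with a vertical edge: dark on the left, bright on the right.
patch = [[0, 0, 1, 1, 1] for _ in range(5)]
hog = orientation_histogram(patch)
print(hog)
```

For this patch every non-zero gradient points horizontally, so all of the weight falls into the first orientation bin; a patch with a horizontal edge would instead fill the bin around 90 degrees.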

2.2 FRAMEWORKS AND SOFTWARE TOOLS

Frameworks and software tools provide a foundation for developing computer
vision applications and offer a variety of features and pre-trained models that
can be used to create reliable, robust, and accurate systems. Some of the
popular software tools and frameworks are:
1. OpenCV
OpenCV, the Open Source Computer Vision Library, is a well-known open-source
library that offers a wide range of computer vision functions and algorithms.
Because it supports several programming languages, including C++, Python, and
Java, it is usable by developers working on a variety of platforms. In this
project we use OpenCV.

2. TensorFlow
TensorFlow is an open-source machine learning framework created by Google. Its
TensorFlow Object Detection API is a high-level API that provides pre-trained
models and resources designed for computer vision tasks involving object
detection. Deep learning applications for computer vision frequently use
TensorFlow.

3. PyTorch
PyTorch is another popular open-source deep learning framework that is widely
used in the domain of computer vision. It offers a flexible computational graph
and comprehensive support for creating neural networks. PyTorch is well known
for its user-friendly interface and broad community support.
4. Caffe
The Berkeley Vision and Learning Centre (BVLC) created the Caffe deep
learning framework. It is frequently used for computer vision tasks like image
classification and object recognition and is particularly well known for its
effectiveness in training and deploying convolutional neural networks (CNNs).
5. MATLAB
MATLAB is a proprietary programming language and development environment that
is frequently used for computer vision applications in academic and research
contexts. It offers a complete collection of tools and functions for image
processing, feature extraction, and algorithm development.
6. Microsoft Cognitive Toolkit (CNTK)
The Microsoft Cognitive Toolkit, also referred to as CNTK, is a deep learning
platform created by Microsoft. It allows distributed training across multiple
GPUs and machines and provides efficient implementations of deep learning
algorithms. CNTK has been employed in computer vision applications such as
object identification and image recognition.
7. Scikit-image
Scikit-image is an open-source Python library for tasks involving image
processing and computer vision. It is a popular option for researchers and
developers working on computer vision projects since it offers a wide range of
features for image editing, filtering, segmentation, and feature extraction.
8. Darknet
Darknet is an open-source neural network framework written in C and CUDA. It is
frequently employed for tasks involving the detection and identification of
objects, particularly in real-time or embedded system applications. The YOLO
(You Only Look Once) algorithm, a quick and effective object detection
method, is one of Darknet's most well-known features.

2.3 SOFTWARE DESCRIPTION


1. IDE
An integrated development environment (IDE) is a software application that
provides comprehensive facilities to computer programmers for software
development. An IDE normally consists of at least a source code editor, build
automation tools, and a debugger. The boundary between an IDE and other
parts of the broader software development environment is not well-defined;
sometimes a version control system or various tools to simplify the construction
of a graphical user interface (GUI) are integrated. Many modern IDEs also have
a class browser, an object browser, and a class hierarchy diagram for use in
object-oriented software development.
2. PYTHON IDE
Python is one of the most widely used programming languages. Guido van
Rossum created it, and it was first released in 1991.

Python IDEs (Integrated Development Environments) are software tools that
provide developers with a comprehensive environment for writing, testing, and
debugging Python code. These IDEs offer a range of features that make coding
more efficient and productive. Their key features include a code editor,
debugging tools, version control integration, testing frameworks and project
management.

Python was the main language used to code the functions needed for this
project. Several Python libraries were used, including Keras and Matplotlib.

2.4 CONCLUSION
Computer vision is a quickly developing field with a wide range of applications
in many different industries. Its advanced algorithms and powerful software
tools make the analysis, comprehension, and interpretation of visual data
possible.
With its continued development, computer vision has a bright future in a variety
of industries, including healthcare, manufacturing, retail, transportation, and
security. Using computer vision technology can result in increased automation,
better decision-making, and higher productivity.
In general, computer vision is a fascinating and transformational field that is
redefining how we interact with visual data and opening new possibilities for
innovation and advancement in the analysis of visual data.

CHAPTER 3
RESULT AND DISCUSSION

3.1 RESULT

Result output for object recognition:

Fig. 3.1 Object Recognition


Our project can identify any number of objects that we add to it.

3.2 DISCUSSION
• Accuracy and Reliability: The accuracy and reliability of object recognition
and obstacle detection algorithms can vary depending on factors such as
lighting conditions, object complexity, and occlusions. In some cases, these
algorithms may struggle to accurately identify specific objects or detect certain
types of obstacles. False positives or false negatives can occur, potentially
leading to confusion or safety risks for users.
• Limited Range and Field of View: The range and field of view of the smart
glasses' sensors can be limited, which means that objects or obstacles outside of
this range may not be detected. Users may still face challenges in identifying
objects or obstacles that are located beyond the range of the sensors. This
limitation can restrict the overall effectiveness of the device, particularly in
large and open environments.
• Social Acceptance and Privacy Concerns: The use of the app can raise social
acceptance and privacy concerns. Some individuals may feel uncomfortable
being recorded or observed by wearable devices. The presence of cameras and
sensors can be perceived as invasive, potentially limiting the acceptance and
adoption of these devices in public spaces.
Continued research, innovation, and user feedback can help address these
limitations and improve the overall effectiveness and usability of smart glasses
for enhanced recognition and detection.
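
The false positives and false negatives discussed under accuracy and reliability are commonly summarized as precision and recall. The sketch below shows that arithmetic; the counts are hypothetical and are not measurements from this project.

```python
from typing import Tuple


def precision_recall(
    true_positives: int, false_positives: int, false_negatives: int
) -> Tuple[float, float]:
    """Return (precision, recall); a ratio is 0.0 when its denominator is zero."""
    predicted = true_positives + false_positives  # everything the detector reported
    actual = true_positives + false_negatives     # everything that was really there
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 8 correct detections, 2 spurious ones, 4 missed objects.
precision, recall = precision_recall(
    true_positives=8, false_positives=2, false_negatives=4
)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision falls as false positives increase, while recall falls as false negatives increase, so reporting both gives a fuller picture than a single accuracy figure.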

CHAPTER 4

FUTURE SCOPE AND CONCLUSION

4.1 FUTURE SCOPE
Object detection is a key ability for most computer and robot vision
systems. Great progress has been made in recent years: some existing techniques
are now part of many consumer electronics or have been integrated into
driver-assistance technologies, and further improvements can be expected thanks
to the remarkable evolution of artificial intelligence. Even so, we are still
far from achieving human-level performance, in particular in terms of
open-world learning. Object detection has also not yet been used much in many
areas where it could be of great help. For example, object detection systems
will be needed for nano-robots and for robots that explore areas not yet seen
by humans, such as the deep sea or other planets, and these systems will have to
learn new object classes as they are encountered. In such cases, a real-time
open-world learning ability will be critical.
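
One very simple way to picture open-world learning is a classifier that answers "unknown" for unfamiliar inputs and can be taught a new class on the fly. The sketch below is a toy nearest-centroid model written only for illustration: the class name `OpenWorldClassifier`, the distance threshold, the two-dimensional feature vectors and the labels are all assumptions, and practical open-world detectors are considerably more involved.

```python
import math
from typing import Dict, List


class OpenWorldClassifier:
    """Toy nearest-centroid classifier that flags unfamiliar inputs as 'unknown'."""

    def __init__(self, novelty_threshold: float) -> None:
        self.novelty_threshold = novelty_threshold
        self.centroids: Dict[str, List[float]] = {}
        self.counts: Dict[str, int] = {}

    def learn(self, label: str, features: List[float]) -> None:
        """Add one labelled example, updating the running mean of its class."""
        if label not in self.centroids:
            self.centroids[label] = list(features)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        self.centroids[label] = [
            c + (f - c) / n for c, f in zip(self.centroids[label], features)
        ]

    def predict(self, features: List[float]) -> str:
        """Return the nearest known class, or 'unknown' if none is close enough."""
        best_label, best_distance = "unknown", math.inf
        for label, centroid in self.centroids.items():
            distance = math.dist(features, centroid)  # Euclidean distance
            if distance < best_distance:
                best_label, best_distance = label, distance
        return best_label if best_distance <= self.novelty_threshold else "unknown"


model = OpenWorldClassifier(novelty_threshold=1.0)
model.learn("rock", [0.0, 0.0])
model.learn("rock", [0.2, 0.0])
first_guess = model.predict([5.0, 5.0])   # far from every known class
model.learn("coral", [5.0, 5.0])          # the new class is added as it is encountered
second_guess = model.predict([5.1, 4.9])
print(first_guess, second_guess)
```

The threshold controls the trade-off: a small value reports more inputs as unknown, while a large value forces unfamiliar objects into existing classes.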
The future work scope for smart object recognition and obstacle detection
features is promising and holds several possibilities for further development and
improvement. Here are some potential areas of future work:
• Enhanced Object Recognition: Researchers can focus on improving the
accuracy and efficiency of object recognition algorithms. This involves refining
the algorithms to better handle complex objects, low-light conditions, and
occlusions. Developing machine learning techniques that can adapt and learn
from user feedback to improve recognition capabilities is another area of
interest.
• Advanced Obstacle Detection: Future work can explore the development of
more robust obstacle detection algorithms that can handle dynamic
environments, moving objects, and crowded spaces more effectively. This
could involve integrating multiple sensors, such as depth cameras, LiDAR, or
radar, to enhance the precision and range of obstacle detection.
• Long-Term Usability and Adaptability: Continuous monitoring and
improvement of long-term usability and adaptability are necessary to assess the
performance and user satisfaction with smart glasses over time. Conducting
longitudinal studies and collecting user feedback can help identify areas for
refinement and ensure that the technology remains effective and relevant as
users' needs evolve.

4.2 CONCLUSION
Object recognition technology has undergone decades of development. From
traditional computer vision techniques to the currently popular deep learning
methods, recognition has become more accurate and the algorithms more robust.
However, better accuracy and efficiency come with higher network and hardware
requirements, which is a significant restriction on where the algorithms can be
applied. Current deep learning recognition algorithms also require a sufficiently
large training set to achieve good recognition results. This is difficult for
objects that lack large datasets of similar examples, such as rare collections
in museums. In addition, current object recognition almost always requires
images to be transmitted over the network and the recognition results to be
returned over the network, which is much less efficient in environments with
relatively poor network conditions, such as the wild and crowded indoor
environments.
In summary, the future development of object recognition technology
should not be limited to recognition accuracy, recognition efficiency and
algorithm robustness; specific issues should also be considered. Designing
recognition algorithms that suit objects with different characteristics and the
environments in which they are located, and taking into account the degree of
dependence on network and equipment performance, should also be a focus of
future research.

REFERENCES

[1] Shokoufandeh, A., Keselman, Y., Demirci, M. F., Macrini, D. and Dickinson,
S. J., 2012. Many-to-many feature matching in object recognition: a review
of three approaches. IET Computer Vision, 6(6), pp. 500–513.

[2] Lillywhite, K. and Archibald, J., 2013. A feature construction method for
general object recognition. Pattern Recognition, Vol. 46, pp. 3300–3314.
New York, NY, USA, Elsevier Science Inc.

[3] Ren, S., He, K., Girshick, R. and Sun, J., 2017. Faster R-CNN: Towards
Real-Time Object Detection with Region Proposal Networks. IEEE Transactions
on Pattern Analysis and Machine Intelligence, 39(6), pp. 1137–1149.

[4] Caffe, http://caffe.berkeleyvision.org/

[5] Microsoft Cognitive Toolkit CNTK,
https://www.microsoft.com/en-us/research/product/cognitive-toolkit/

[6] TensorFlow, https://www.tensorflow.org/
