Orr 29012024
BACHELOR OF TECHNOLOGY
(COMPUTER SCIENCE AND ENGINEERING)
To
Submitted By:
Shivangi Srivastava (205459)
Ashish Singh (205418)
Yadav Nandani Ramavatar (205476)
Sumit Tiwari (205468)
CANDIDATE’S DECLARATION
CERTIFICATE
ABSTRACT
Object recognition is one of the classic problems in computer vision. For computers to
behave intelligently, they need to recognize common everyday objects much as the human
eye and brain do, which makes recognition an important step towards machine
intelligence. This report summarizes current research on object recognition and
elaborates on object feature extraction, object feature matching, and object
recognition methods.
Object recognition and tracking is a common task in video processing, with
applications in surveillance, security, industrial inspection, medicine, and more.
Recognition and tracking accuracy can drop significantly when the dynamic range of
the scene exceeds that of common camera sensors.
ACKNOWLEDGEMENT
The merciful guidance bestowed upon us by the Almighty helped us see this project
through to a successful end. We humbly and sincerely pray that this guidance
continues forever.
We thank our project guide, who gave us guidance and direction throughout
this project. His versatile knowledge helped us through the critical times during
the span of this project.
We also take this opportunity to express our gratitude to all those who have
been with us, directly or indirectly, during the completion of the project.
We want to thank our friends, who have always encouraged us during this project.
Last but not least, we thank all the faculty of the IT department, who provided
valuable suggestions during the project.
TABLE OF CONTENTS
01 Candidate declaration
02 Certificate from supervisor
03 Abstract
04 Acknowledgement
05 Table of contents
06 List of figures
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
Object recognition is a computer vision technique for identifying objects in
images or videos. Object recognition is a key output of deep learning and
machine learning algorithms. When humans look at a photograph or watch a
video, we can readily spot people, objects, scenes, and visual details. The goal
is to teach a computer to do what comes naturally to humans: to gain a level of
understanding of what an image contains.
Object recognition is a key technology behind driverless cars, enabling them to
recognize a stop sign or to distinguish a pedestrian from a lamppost. It is also
useful in a variety of applications such as disease identification in bioimaging,
industrial inspection, and robotic vision.
Image classification involves predicting the class of one object in an image.
Object localization refers to identifying the location of one or more objects in
an image and drawing a bounding box around their extent. Object detection
combines these two tasks: it localizes and classifies one or more objects in an
image.
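The difference between the three tasks can be seen in the shape of their outputs. The following is a minimal sketch in Python, not the implementation used in this project; the `Detection` class, the `(x_min, y_min, x_max, y_max)` box layout, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    """One detected object: where it is (box) and what it is (label)."""
    box: Box
    label: str
    score: float  # model confidence in [0, 1]


def split_detections(detections: List[Detection]) -> Tuple[List[str], List[Box]]:
    """Separate a detection result into its two parts:
    the class labels (classification) and the boxes (localization)."""
    labels = [d.label for d in detections]
    boxes = [d.box for d in detections]
    return labels, boxes


# Hypothetical detector output for a street scene.
scene = [
    Detection((12, 30, 80, 200), "pedestrian", 0.91),
    Detection((150, 20, 190, 60), "stop sign", 0.84),
]

labels, boxes = split_detections(scene)
print(labels)  # class of each object
print(boxes)   # location of each object
```

A classifier alone would return only a label, and a localizer only the boxes; a detector returns both for every object it finds.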
1.2 OBJECTIVES
Object recognition is the area of artificial intelligence (AI) concerned with the
abilities of robots and other AI implementations to recognize various things and
entities.
Object recognition allows robots and AI programs to pick out and identify
objects from inputs like video and still camera images. Methods used for object
identification include 3D models, component identification, edge detection, and
analysis of appearances from different angles.
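Of the methods listed above, edge detection is the simplest to illustrate. The sketch below applies the Sobel operator, one common edge detector chosen here as an assumption since the report does not name a specific one, to a tiny grayscale image stored as nested Python lists.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities

# Sobel kernels for horizontal (x) and vertical (y) intensity gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img: Image) -> Image:
    """Return the gradient magnitude at each interior pixel.
    Border pixels are left at 0 because the 3x3 window does not fit there."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# 5x5 image: dark on the left, bright on the right, so the edge is vertical.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
edges = sobel_magnitude(image)
for row in edges:
    print(row)
```

Large values in `edges` mark pixels where brightness changes sharply, which is where object outlines tend to lie.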
Object recognition is only one of the major visual problems of interest;
other activities also build on information obtained through vision. The most
important one is navigation, that is, moving around in the world with the help of
visual sensors.
Developers and engineers are using these techniques to build
machines that will deliver groceries and medicines to our doorsteps.
Object detection forms a basis for other important AI vision techniques, such as
image classification, image retrieval, and object co-segmentation, that derive
meaningful information from real-life objects.
1.3 MOTIVATION
CHAPTER 2
2.1.1 FEATURES OF COMPUTER VISION
Computer vision is a broad and challenging field that covers many capabilities
that let machines comprehend and evaluate visual data. The following
are some essential aspects of computer vision:
1. Image Classification
Computer vision algorithms can sort images into predetermined categories.
By learning from a large dataset of labelled images, models can assign the
correct class to new, unseen images. Applications like face detection, object
recognition, and scene understanding frequently take advantage of this
functionality.
2. Object Detection
Object detection is the process of locating and identifying objects within an
image or video frame. Computer vision techniques can find multiple instances
of an object and report each one as a bounding box or as a pixel-level
segmentation mask. For tasks like autonomous driving,
monitoring, and augmented reality, object detection is essential.
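When a detector reports several overlapping boxes for the same object, a post-processing step is usually needed to keep one box per instance. The sketch below shows intersection over union (IoU) and a greedy non-maximum suppression. Both are standard techniques but are not named in this report, so they are shown here as an assumption about how such boxes could be handled; the box layout `(x_min, y_min, x_max, y_max)` and the 0.5 threshold are also illustrative.

```python
from typing import List, Sequence


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Sequence[float]],
                        scores: List[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes on one object plus one box on a separate object.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

Lowering `iou_threshold` removes more overlapping boxes, at the risk of merging objects that genuinely sit close together.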
4. Image Restoration and Enhancement
11. Scene Understanding
2. TensorFlow
3. PyTorch
PyTorch is another popular open-source deep learning framework that
is widely used in computer vision. It offers a dynamic
computational graph and comprehensive support for building neural
networks. PyTorch is well known for its user-friendly interface and broad
community support.
4. Caffe
The Berkeley Vision and Learning Center (BVLC) created the Caffe deep
learning framework. It is frequently used for computer vision tasks like image
classification and object recognition and is particularly well known for its
efficiency in training and deploying convolutional neural networks (CNNs).
5. MATLAB
MATLAB is a proprietary programming language and development environment
that is frequently used for computer vision in academic
and research contexts. It offers a complete collection of tools and functions
for image processing, feature extraction, and algorithm development.
6. Microsoft Cognitive Toolkit (CNTK)
The Microsoft Cognitive Toolkit, also referred to as CNTK, is a deep learning
framework created by Microsoft. It supports distributed training across multiple
GPUs and machines and provides efficient implementations of deep learning
algorithms. CNTK has been used in computer vision applications such as object
identification and image recognition.
7. Scikit-image
Scikit-image is an open-source Python library for tasks involving image
processing and computer vision. It is a popular option for researchers and
developers working on computer vision projects since it offers a wide range of
features for image editing, filtering, segmentation, and feature extraction.
8. Darknet
Darknet is an open-source neural network framework written in C and CUDA. It is
frequently employed for tasks involving the detection and identification of
objects, particularly in real-time or embedded system applications. The YOLO
(You Only Look Once) algorithm, a fast and effective object detection
method, is one of Darknet's most well-known features.
For this project, Python was the main language used to code the required
functions. Several Python libraries were used, including Keras, Matplotlib,
and others.
2.4 CONCLUSION
Computer vision is a rapidly developing field with a wide
range of applications across many industries. Its advanced algorithms and
powerful software tools make it possible to analyse, comprehend, and interpret
visual data.
With its continued development, computer vision has a bright future in a variety
of industries, including healthcare, manufacturing, retail, transportation, and
security. Using computer vision technology can result in increased automation,
better decision-making, and higher productivity.
In general, computer vision is a fascinating and transformational field that is
redefining how we interact with visual data and opening new possibilities for
innovation in the analysis of visual data and the production of reports.
CHAPTER 3
RESULT AND DISCUSSION
3.1 RESULT
3.2 DISCUSSION
• Accuracy and Reliability: The accuracy and reliability of object recognition
and obstacle detection algorithms can vary depending on various factors such as
lighting conditions, object complexity, and occlusions. In some cases, these
algorithms may struggle to accurately identify specific objects or detect certain
types of obstacles. False positives or false negatives can occur, potentially
leading to confusion or safety risks for users.
• Limited Range and Field of View: The range and field of view of the smart
glasses' sensors can be limited, which means that objects or obstacles outside of
this range may not be detected. Users may still face challenges in identifying
objects or obstacles that are located beyond the range of the sensors. This
limitation can restrict the overall effectiveness of the device, particularly in
large and open environments.
• Social Acceptance and Privacy Concerns: The use of the app can raise social
acceptance and privacy concerns. Some individuals may feel uncomfortable
being recorded or observed by wearable devices. The presence of cameras and
sensors can be perceived as invasive, potentially limiting the acceptance and
adoption of these devices in public spaces.
Continued research, innovation, and user feedback can help address these
limitations and improve the overall effectiveness and usability of smart glasses
for enhanced recognition and detection.
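The false positives and false negatives mentioned under accuracy and reliability are commonly summarized as precision and recall. The sketch below computes both from raw counts; the counts in the example are made up for illustration and are not results from this project.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision: fraction of reported detections that were correct.
    Recall: fraction of real objects that were detected.
    Both default to 0.0 when their denominator is zero."""
    reported = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical evaluation: 80 correct detections, 20 false alarms, 10 missed objects.
precision, recall = precision_recall(true_pos=80, false_pos=20, false_neg=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For an assistive device, low recall means missed obstacles (a safety risk), while low precision means frequent false alarms (a source of confusion), so both figures matter.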
data are very useful for the coverage of large areas such as cities, and several
aerial-based approaches have been proposed for the extraction of buildings.
CHAPTER 4
4.1 FUTURE SCOPE
Object detection is a key ability for most computer and robot vision
systems. Great progress has been made in recent years, and some existing
techniques are now part of many consumer electronics or have been integrated
into driver-assistance technologies. Further improvements can be expected thanks
to the rapid evolution of artificial intelligence. Even so, we are still far from
achieving human-level performance, particularly in terms of open-world learning.
Object detection has also not yet been used much in many areas where it could be
of great help. Detection systems will be needed for nano-robots and for robots
that explore areas humans have not seen, such as the deep parts of the sea
or other planets, and these systems will have to learn new object
classes as they are encountered. In such cases, a real-time open-world learning
ability will be critical.
The future work scope for smart object recognition and obstacle detection
features is promising and holds several possibilities for further development and
improvement. Here are some potential areas of future work:
• Enhanced Object Recognition: Researchers can focus on improving the
accuracy and efficiency of object recognition algorithms. This involves refining
the algorithms to better handle complex objects, low-light conditions, and
occlusions. Developing machine learning techniques that can adapt and learn
from user feedback to improve recognition capabilities is another area of
interest.
• Advanced Obstacle Detection: Future work can explore the development of
more robust obstacle detection algorithms that can handle dynamic
environments, moving objects, and crowded spaces more effectively. This
could involve integrating multiple sensors, such as depth cameras, LiDAR, or
radar, to enhance the precision and range of obstacle detection.
• Long-Term Usability and Adaptability: Continuous monitoring and
improvement of long-term usability and adaptability are necessary to assess the
performance and user satisfaction with smart glasses over time. Conducting
longitudinal studies and collecting user feedback can help identify areas for
refinement and ensure that the technology remains effective and relevant as
users' needs evolve.
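As one illustration of how integrating multiple sensors could improve precision, the sketch below fuses distance readings by inverse-variance weighting, so that more reliable sensors count for more. This is a simple textbook approach picked as an assumption; the report does not specify a fusion method, and the sensor names and noise values are hypothetical.

```python
from typing import List, Tuple


def fuse_distances(readings: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse (distance_m, variance) pairs by inverse-variance weighting.
    Returns the fused distance and the variance of the fused estimate.
    Assumes independent sensor errors and strictly positive variances."""
    weights = [1.0 / variance for _, variance in readings]
    total = sum(weights)
    fused = sum(w * d for w, (d, _) in zip(weights, readings)) / total
    return fused, 1.0 / total


# Hypothetical readings of the same obstacle:
# depth camera (noisier) and LiDAR (more precise).
readings = [(2.0, 0.04), (2.2, 0.01)]
fused_distance, fused_variance = fuse_distances(readings)
print(f"{fused_distance:.2f} m, variance {fused_variance:.4f}")
```

The fused variance is smaller than that of either sensor alone, which is the sense in which combining sensors can enhance precision.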
4.2 CONCLUSION
Object recognition technology has undergone decades of development. From
traditional computer vision techniques to the currently popular deep learning
methods, recognition has become more and more accurate and the
algorithms more robust. However, better accuracy and efficiency come with
higher network and hardware requirements, which significantly restricts where
the algorithms can be applied. Current recognition algorithms based on deep
learning also require a sufficiently large training set to achieve good
recognition results. This is a problem for objects that lack large datasets of
similar examples, such as some rare collections in museums. In addition, current
object recognition almost always requires images to be transmitted over a
network and the recognition results to be sent back the same way, which is much
less efficient in environments with relatively poor network conditions, such as
the wilderness and crowded indoor spaces.
In summary, the future development of object recognition technology
should not be limited to recognition accuracy, recognition efficiency, and
algorithm robustness; specific issues should also be considered. Designing
recognition algorithms suited to the characteristics of the objects and the
environment in which they are located, and accounting for
the degree of dependence on network and equipment performance, should also
be a focus of future research.