
UNIT I: AUGMENTED REALITY, VIRTUAL REALITY WITH AI

Introduction to Virtual Reality – Definition – Three I’s of Virtual Reality – Virtual Reality Vs
3D Computer Graphics – Benefits of Virtual Reality – Components of VR System –
Introduction to AR – System Structure of Augmented Reality – Key Technology in AR – 3D
Vision – Approaches to Augmented Reality – Alternative Interface Paradigms – Spatial AR
– Input Devices – 3D Position Trackers – Performance Parameters – Types Of Trackers –
Navigation and Manipulation Interfaces – Gesture Interfaces – Types of Gesture Input
Devices – Output Devices – Graphics Display – Human Visual System – Personal
Graphics Displays – Large Volume Displays – Sound Displays – Human Auditory System.

1.1 Introduction to Virtual Reality

 Virtual Reality (VR) is the use of computer technology to create a simulated environment.
 VR’s most recognizable component is the Head-Mounted Display (HMD).
 Human beings are visual creatures, and display technology is often the single
biggest difference between immersive VR systems and traditional user interfaces.
 Major players in VR include HTC Vive, Oculus Rift and PlayStation VR (PSVR).
 All three types of VR – non-immersive, semi-immersive and fully immersive – and mixtures of them are also referred to as extended reality (XR).
 Three types of virtual reality experiences provide different levels of computer-
generated simulation.

1.2 Types of Virtual Reality Experiences

1. Non-Immersive Virtual Reality


 This category is often overlooked as a form of VR simply because it is so common.
 Non-immersive VR technology features a computer-generated virtual environment in which the user remains aware of, and in control of, their physical environment.
 Video games are a prime example of non-immersive VR.
2. Semi-Immersive Virtual Reality
 This type of VR provides an experience partially based in a virtual environment.
 This type of VR makes sense for educational and training purposes with graphical
computing and large projector systems, such as flight simulators for pilot trainees.
3. Fully Immersive Virtual Reality
 This type of VR generates the most realistic simulation experience, from sight to
sound to sometimes even olfactory sensations.
 Car racing games are an example of immersive virtual reality that gives the user the
sensation of speed and driving skills.
 Although fully immersive VR was developed mainly for gaming and other entertainment purposes, its use in other sectors is increasing.

1.3 Virtual Reality Vs 3D Computer Graphics

 VR is a technology that allows users to fully immerse themselves in an artificially created environment.
 Users wear a VR headset, which displays a 3D image in front of their eyes.
 The headset also tracks the user's head movements, so that the image changes as
the user looks around.
 This creates the illusion that the user is actually inside the virtual environment.
 3D, on the other hand, refers to any image or object that has three-dimensional
properties, such as width, height, and depth.
 This can include movies, video games, and other forms of media that are displayed
on a screen.
 However, with 3D the user is not fully immersed in the environment; they are simply viewing a 3D representation of it on a 2D screen.
 In summary, VR is a technology that allows users to fully immerse themselves in a
virtual environment, while 3D is a term used to describe any image or object that
has three-dimensional properties.
 In VR, the user wears a VR headset that completely covers their eyes and displays
a virtual environment in front of them.
 The headset is equipped with sensors to track the user's head movements, allowing
them to look around and explore the virtual world.
 The user may also wear handheld controllers or use other input devices to interact
with objects and navigate through the virtual space.
 This immersive experience gives the user a sense of presence and the feeling of
being physically present in the virtual environment.

3D Graphics

 3D computer graphics represent the creation and rendering of three-dimensional objects and scenes using computer software.
 The images are typically displayed on a 2D screen, such as a computer monitor or
a television.
 These graphics can be generated for various purposes, including movies, video
games, architectural visualizations, and more.
 While they provide depth and a sense of realism, the user does not have an
immersive experience or interact directly with the virtual objects.
 To summarize, Virtual Reality (VR) involves wearing a headset and being fully
immersed in a virtual environment, while 3D computer graphics refer to the creation
of three-dimensional objects and scenes displayed on a 2D screen.
 VR provides an interactive and immersive experience, whereas 3D computer
graphics are primarily visual representations

1.4 Benefits of Virtual Reality

 VR helps in exploring places without actually being there


 VR refers to an imaginary environment, created with the help of technology, that retains an essence of reality.
 It therefore helps users explore various places without actually going there.
 This has made people's lives much easier and more entertaining.
 It also makes exploration accessible to people who do not have the money to visit everything in person.
 The education system has been improved:
 The old text-based learning has now been replaced by virtual study in which a
teacher teaches the student with the help of virtual reality equipment.
 This equipment allows the user to see an imaginary environment built around the topic and analyse it.
 It creates a realistic world: Virtual reality creates an imaginary world for the user
based on the topics to study or for entertainment.
 Although the Virtual Reality created is imaginary, it seems to the user as if it is an
actual real world.
 This therefore helps users to have a better experience, even though everything they see may be unreal.
 Help in providing training: A lot of people who are not skilled in different fields of
work can get training in the virtual environment.
 For example, engineering requires practical knowledge, so virtual reality technology can be used to apply that knowledge.
 Virtual Reality provides a platform to experience the situations a person has studied in their textbooks.
 Practical knowledge is more interesting and exciting than reading about it in a book.
 Cost-effective: Expenses occur mainly at the time of installing Virtual Reality technology; after that, maintenance and the cost per person are low, so it is cost-effective.

1.5 Basic Components of a VR System

1. HMDs [Head Mounted Displays]:

 It is a display that consists of two screens that display the virtual world in front of
the users.
 They have motion sensors that detect the orientation and position of your head and
adjust the picture accordingly.
 It has built-in headphones or external audio connectors to output sound.
 Moreover, they have a blackout blindfold to ensure the users are fully disconnected
from the outside world.

2. Computing device:

 It is a strong, powerful machine that processes and creates the 3D world.


 All other input devices pass their data to it; it tracks the user's movement and renders all the graphics.
 Computing devices should have a large amount of RAM, a good GPU, a powerful
CPU, and a sufficient storage device.

3. Sensor(s):

 Sensors are mostly incorporated into the headset of VR.


 They track users’ poses and their head position, detect movement and rotation, and
then pass all this data to the VR processor/computing device.
 Because of these sensors, the user can interact with the virtual environment.
 VR depends upon several sensors, including accelerometers, gyroscopes and magnetometers, which together enable six-degree-of-freedom (6DoF) tracking.

4. Input devices:

 Input devices are used by users in the VR system to interact with the virtual world in front of them.
 These devices might represent a tool or a weapon in the artificial world.
 They include mice, controllers, joysticks, gloves with sensors, and body tracking systems.

5. Audio systems:

 Audio systems have a particularly important job in VR: they help convince the user's brain that they are really in the artificial world.
 They are mostly integrated inside the HMD.
 VR provides spatial audio, so the users feel how real the virtual world is.

6. Software:

 Software is a crucial part of VR systems.


 The software is an application that runs on the VR hardware and creates the artificial world.
 There are several different types of software based on what users need.
 For example, games, simulations, medical ecosystems, etc.
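
To make the interplay between these components concrete, the following minimal Python sketch (all class and function names are hypothetical, not taken from any VR SDK) shows a few frames of a VR loop: the headset's sensors report head motion, the computing device updates the head pose, and one camera position is prepared per eye for the HMD.

```python
import math

class HeadTracker:
    """Stands in for the HMD's built-in sensors (gyroscope/accelerometer)."""
    def __init__(self):
        self.yaw = 0.0  # head rotation about the vertical axis, in radians

    def update(self, yaw_rate, dt):
        # Integrate the angular velocity reported by the gyroscope.
        self.yaw += yaw_rate * dt
        return self.yaw

def eye_positions(yaw, ipd=0.064):
    """Place the two virtual cameras half an inter-pupillary distance apart, rotated by head yaw."""
    half = ipd / 2.0
    left = (-half * math.cos(yaw), -half * math.sin(yaw))
    right = (half * math.cos(yaw), half * math.sin(yaw))
    return left, right

tracker = HeadTracker()
dt = 1.0 / 90.0                                # a typical 90 Hz HMD refresh interval
for frame in range(3):
    yaw = tracker.update(yaw_rate=0.5, dt=dt)  # user turning their head at 0.5 rad/s
    left, right = eye_positions(yaw)
    print(f"frame {frame}: yaw={yaw:.4f} rad, left eye at {left}, right eye at {right}")
```

In a real system the render step would draw the full 3D scene from each eye's camera; here the positions are simply printed.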

1.6 Introduction to Augmented Reality [AR]

Augmented reality (AR):

Definition: Augmented reality is an interactive experience that combines the real world with
computer-generated content. The content can span multiple sensory modalities,
including visual, auditory, etc.

System structure of AR

This architecture comprises the components listed below; the interactive relationships between them make up a working augmented reality model.

1. User: The most essential part of augmented reality is its user. The user can be a
student, a doctor or an employee, and is responsible for the creation of AR models.
2. Interaction: Interaction is the process between the device and the user: an action
performed by one entity results in a creation or an action performed by the other.
3. Device: This component is responsible for the creation, display and interaction of 3D
models. The device can be portable or static, for example a mobile phone, a computer or
an AR headset.
4. Virtual Content: The virtual content is the 3D model created or generated by the
system or AR application. Virtual content is the type of information that can be integrated
into the user's real-world environment; it can be 3D models, textures, text, images, etc.
5. Tracking: This component is the process that makes the creation of AR models
possible. A tracking algorithm helps the device determine where to place or integrate the
3D model in the real-world environment. Many types of tracking algorithm are available
for the development of AR applications.
6. Real-life entity: The last component of the AR architecture is the real-world entity. These
entities can be a tree, a book, fruit, a computer or anything else visible on screen. The AR
application does not change the position of real-life entities; it only integrates the digital
information with these entities.

History of AR

 Thomas Caudell [Boeing Computer Services] coined the term augmented reality in
1990 to describe how the head-mounted displays that electricians use when
assembling complicated wiring harnesses worked.
 In 1998, one of the first commercial applications of augmented reality technology
appeared: the yellow first-down marker in televised football games.
 Today, Google Glass, smartphone games and heads-up displays (HUDs) in car
windshields are the most well-known consumer AR products.
 But the technology is also used in many industries, including healthcare, public
safety, gas and oil, tourism and marketing.

How does AR work?

 The technology requires hardware components, such as a processor, sensors, a display and input devices.
 AR delivered through contact lenses is also being developed.
 Mobile devices already typically have this hardware available, with sensors
including cameras, accelerometers, Global Positioning System (GPS) and solid-
state compasses.
 GPS is used to pinpoint the user's location, and the solid-state compass is used to detect device orientation, for example.
o Sophisticated AR programs used by the military for training can also include
machine vision, object recognition and gesture recognition.
 AR can be computationally intensive, so if a device lacks processing power, data
processing can be offloaded to a different machine.
 AR apps are written in special 3D programs that enable developers to tie animation
or contextual digital information in the computer program to an AR marker in the
real world.
 When a computing device's AR app or browser plugin receives digital information
from a known marker, it begins to execute the marker's code and layer the correct
image or images.
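
As a rough sketch of that last step (purely illustrative; the marker IDs and asset names below are made up), the application can be thought of as a lookup from a recognized marker to the content that should be layered over it:

```python
# Hypothetical registry mapping known marker IDs to the digital content to overlay.
MARKER_CONTENT = {
    "marker_042": {"type": "3d_model", "asset": "engine_exploded_view.glb"},
    "marker_107": {"type": "video", "asset": "assembly_step_3.mp4"},
}

def on_marker_detected(marker_id, pose):
    """Called when the AR app or browser plugin recognizes a known marker in the camera feed."""
    content = MARKER_CONTENT.get(marker_id)
    if content is None:
        return None  # unknown marker: nothing to augment
    # "Execute the marker's code": anchor the associated asset at the marker's
    # estimated pose (position and orientation in camera coordinates).
    return {"asset": content["asset"], "anchor_pose": pose}

print(on_marker_detected("marker_042", pose=(0.0, 0.1, 0.5)))
```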

Top AR use cases

AR can be used in the following ways:

 Retail. Consumers can use a store's online app to see how products, such as
furniture, will look in their own homes before buying.
 Entertainment and gaming. AR can be used to overlay a virtual game in the real
world or enable users to animate their faces in different and creative ways on social
media.

 Navigation. AR can be used to overlay a route to the user's destination over a live
view of a road. AR used for navigation can also display information about local
businesses in the user's immediate surroundings.
 Tools and measurement. Mobile devices can use AR to measure different 3D
points in the user's environment.
 Architecture. AR can help architects visualize a building project.
 Military. Data can be displayed on a vehicle's windshield that indicates destination
directions, distances, weather and road conditions.
 Archaeology. AR has aided archaeological research by helping archaeologists
reconstruct sites. 3D models help museum visitors and future archaeologists
experience an excavation site as if they were there.

1.7 Key Technology in AR

Hardware
 Hardware components for augmented reality are: a processor, display, sensors and
input devices.
 Modern mobile computing devices like smartphones and tablet computers contain
these elements, which often include a camera and microelectromechanical systems (MEMS) sensors.
 The two main display techniques used in AR are 1) diffractive waveguides and 2) reflective waveguides.

Display
 Various technologies are used in AR rendering, including optical projection systems,
monitors, handheld devices, and display systems worn on the human body.

HMD
 An HMD is a display device worn on the head, such as a harness- or helmet-mounted display.
 HMDs place images of both the physical world and virtual objects over the user's field
of view.
 Modern HMDs often employ sensors for six degrees of freedom monitoring that allow
the system to align virtual information to the physical world and adjust accordingly with
the user's head movements.
 HMDs can provide AR users with mobile and collaborative experiences.
 Specific providers, such as uSens and Gestigon, include gesture controls for full virtual
immersion.

Eyeglasses
 AR displays can be rendered on devices resembling eyeglasses.
 Versions include eyewear that employs cameras to intercept the real-world view and
re-display its augmented view through the eyepieces.

HUD [Head-Up Display]

 It is a transparent display that presents data without requiring users to look away from
their usual viewpoints.
 A precursor technology to augmented reality, heads-up displays were first developed
for pilots in the 1950s, projecting simple flight data into their line of sight, thereby
enabling them to keep their "heads up" and not look down at the instruments.
 Near-eye augmented reality devices can be used as portable head-up displays as they
can show data, information, and images while the user views the real world.

 Practically, AR is expected to include registration and tracking between the
superimposed perceptions, sensations, information, data, and images and some
portion of the real world.

Contact lenses

 Contact lenses that display AR imaging are in development.


 These bionic contact lenses might contain the elements for display embedded into the
lens including integrated circuitry, LEDs and an antenna for wireless communication.
 The first contact lens display was patented in 1999 by Steve Mann and was intended to work in
combination with AR spectacles.

Virtual retinal display (VRD)


 It is a personal display device under development at the University of Washington's
Human Interface Technology Laboratory under Dr. Thomas A. Furness III.

EyeTap
 The EyeTap (also known as Generation-2 Glass) captures rays of light that would
otherwise pass through the centre of the lens of the wearer's eye, and substitutes
synthetic computer-controlled light for each ray of real light.

Handheld
 A Handheld display employs a small display that fits in a user's hand. All handheld AR
solutions to date opt for video see-through.

Projection mapping
 Projection mapping augments real-world objects and scenes, without the use of special
displays such as monitors, head-mounted displays or hand-held devices.
 Projection mapping makes use of digital projectors to display graphical information
onto physical objects.
 The key difference in projection mapping is that the display is separated from the users
of the system.

Tracking

 Modern mobile AR systems use one or more of the following motion tracking
technologies: digital cameras and/or other optical sensors, accelerometers, GPS,
gyroscopes, solid state compasses, radio-frequency identification (RFID).
 These technologies offer varying levels of accuracy and precision.
 These technologies are implemented in the ARKit API by Apple and ARCore API by
Google to allow tracking for their respective mobile device platforms.

Networking

 Mobile augmented reality applications are gaining popularity because of the wide
adoption of mobile and especially wearable devices.
 This requires computationally intensive algorithms with extreme latency requirements.
 To compensate for the lack of computing power, offloading data processing to a distant
machine is often desired.

Input devices

 Speech recognition systems that translate a user's spoken words into computer
instructions.
 Gesture recognition systems that interpret a user's body movements by visual
detection or from sensors embedded in a peripheral device such as a wand, stylus,
pointer or glove.
 Products which aim to serve as a controller of AR headsets include Wave by
Seebright Inc. and Nimble by Intugine Technologies.

Computer

 The computer analyzes the sensed visual and other data to synthesize and position
augmentations.
 Computers are responsible for the graphics that go with AR.
 Augmented Reality uses a computer-generated image which has a striking effect on
the way the real world is shown.

Projector

 The projector can throw a virtual object on a projection screen and the viewer can
interact with this virtual object.
 Projection surfaces can be many objects such as walls or glass panes.

Software and algorithms

 The software must derive real-world coordinates that are independent of the camera
and of the camera images.
 That process is called image registration, and uses different methods of computer
vision, mostly related to video tracking.
 Many computer vision methods of AR are inherited from visual odometry.
 An augogram is a computer generated image that is used to create AR.
 Augography is the science and software practice of making augograms for AR.

Augmented Reality Markup Language (ARML)

 It is a data standard developed within the Open Geospatial Consortium (OGC), which
consists of Extensible Markup Language (XML) grammar to describe the location and
appearance of virtual objects in the scene, as well as ECMAScript bindings to allow
dynamic access to properties of virtual objects.

Development

 The implementation of augmented reality in consumer products requires considering


the design of the applications and the related constraints of the technology platform.
 Since AR systems rely heavily on the immersion of the user and the interaction
between the user and the system, design can facilitate the adoption of virtuality.

Environmental/context design

 Context Design focuses on the end-user's physical surrounding, spatial space, and
accessibility that may play a role when using the AR system.

 Designers should be aware of the possible physical scenarios the end-user may be in.

Interaction design

 Interaction design in augmented reality technology centers on the user's engagement
with the end product to improve the overall user experience and enjoyment. The
purpose of interaction design is to avoid alienating or confusing the user by organizing
the information presented.

Visual design

 Visual design is the appearance of the developing application that engages the user.
 To improve the graphic interface elements and user interaction, developers may use
visual cues to inform the user what elements of UI are designed to interact with and
how to interact with them.
 Since navigating in an AR application may appear difficult and seem frustrating, visual
cue design can make interactions seem more natural.

1.8 3D Vision

 “3D vision,” or depth perception, depends on the ability to use both eyes together at the
highest level.
 3D vision relies on both eyes working together to accurately focus on the same point in
space.
 The brain is then able to interpret the image that each eye sees to create your
perception of depth.
 Deficiencies in depth perception can result in a lack of 3D vision or headaches and
eyestrain during 3D movies.
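
The same two-eye geometry underlies stereo cameras and stereoscopic displays. Under the standard pinhole stereo model (the numbers below are made-up example values), depth is focal length times baseline divided by the disparity between the left and right images:

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair (pinhole camera model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px

# Example: 700-pixel focal length, 6.4 cm baseline (roughly an adult inter-pupillary
# distance) and a 20-pixel disparity give a depth of about 2.24 m.
print(depth_from_disparity(focal_length_px=700, baseline_m=0.064, disparity_px=20))
```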

SIGNS OF 3D VISION SYNDROME

 3D Vision Syndrome describes a condition that many individuals experience when
watching 3D movies or watching a 3D TV and may be the result of underlying binocular
vision dysfunction (eye teaming difficulties).

 3D Vision Syndrome can be remembered by the 3 D’s:

 Discomfort: 3D technology requires the viewer to focus their eyes either in front of
the screen (by converging) or behind the screen (by diverging). If you have
difficulties with convergence, such as convergence insufficiency, it can result in
headaches, eyestrain, or fatigue.
 Dizziness: People often complain of dizziness or nausea after viewing 3D
material. One reason for these complaints is that some of the technology used to
create 3D can worsen Visual Motion Hypersensitivity (VMH).
 Depth: Some individuals describe 3D as “popping off the screen” or “coming right at
them”, while others only see a faintly raised image or a flat image that resembles a
traditional screen. This lack or absence of depth is one of the signs that the
binocular vision system is not functioning properly. People who are unable to use
both eyes together to achieve binocularity will not see the depth of 3D content.

1.9 Approaches to Augmented Reality

Augmented Reality (AR) is a technology that overlays digital information, such as
images, videos, or 3D objects, onto the real world. There are several approaches to
implementing AR, depending on the hardware and software used. Here are some of the
main approaches to augmented reality:

Marker-based AR:

Marker-based AR relies on physical markers or symbols, such as QR codes or
fiducial markers, to trigger the display of digital content when detected by a camera-
equipped device, like a smartphone or AR headset.
When the marker is recognized, the AR application can superimpose digital objects,
animations, or information onto the marker's location in the real world.

Markerless AR (Location-based AR):

Markerless AR, also known as location-based AR, uses the device's sensors (e.g.,
GPS, accelerometers, and compass) to determine the user's location and orientation.
This approach can be used to display location-specific information, such as points of
interest, directions, or geolocated content, on a mobile device or AR headset.
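
A minimal sketch of the location-based idea (illustrative only; the coordinates and field of view are made up): given the device's GPS fix and compass heading, compute the bearing to a point of interest and decide whether its label should appear in the camera view.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360

def poi_in_view(device_heading_deg, poi_bearing_deg, fov_deg=60):
    """True if the point of interest falls inside the camera's horizontal field of view."""
    offset = (poi_bearing_deg - device_heading_deg + 180) % 360 - 180
    return abs(offset) <= fov_deg / 2

# Made-up example: device location, a nearby point of interest, and a device facing east.
b = bearing_deg(51.5007, -0.1246, 51.5014, -0.1419)
print(f"bearing to POI: {b:.1f} deg, visible: {poi_in_view(device_heading_deg=90, poi_bearing_deg=b)}")
```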

Projection-based AR:

Projection-based AR projects digital content onto physical surfaces or objects in the
real world using projectors.
This approach is often used in interactive installations, art exhibitions, and marketing
campaigns to create dynamic and immersive experiences.

SLAM (Simultaneous Localization and Mapping) AR:

SLAM technology enables AR devices to understand and map their surroundings in
real-time by combining data from sensors like cameras, depth sensors, and inertial
measurement units (IMUs).
This approach allows AR devices to place digital objects accurately in the user's
environment and maintain their position as the user moves.

Head-mounted AR:

Head-mounted AR devices, such as AR glasses or headsets, provide a more
immersive AR experience by overlaying digital content directly into the user's field of view.
These devices often use cameras and sensors to understand the environment and track
the user's head movements, enabling a natural and interactive AR experience.

Smartphones and Tablets:

AR apps for smartphones and tablets use the device's camera and sensors to
provide AR experiences. They can recognize images, surfaces, or objects and overlay
digital content on the device's screen.
Popular platforms like Apple's ARKit and Google's ARCore provide tools and frameworks
for developing AR applications on mobile devices.

Web-based AR:

Web-based AR allows users to access AR experiences through web browsers
without the need for downloading and installing dedicated apps. Technologies like WebXR
enable developers to create AR applications that can run directly in a web browser,
expanding the reach of AR to a broader audience.

Wearable AR:

Wearable AR devices, including smart glasses, offer hands-free AR experiences,
often in professional settings like industrial maintenance, healthcare, and training.
These devices are designed to enhance productivity and provide real-time information and
guidance to the user.

Each approach to augmented reality has its own advantages and limitations, and
the choice of which to use depends on the specific use case, hardware, and user
requirements. AR technology continues to evolve, opening up new possibilities and
applications across various industries.

1.10 Alternative Interface Paradigms

Augmented Reality (AR) offers various interface paradigms that go beyond traditional
graphical user interfaces (GUIs) to provide more immersive and interactive experiences.

Here are some alternative interface paradigms for AR:

Gesture-Based Interfaces:

Gesture-based AR interfaces allow users to interact with digital objects in the real world
using hand gestures or body movements. Devices like Microsoft's HoloLens and Leap
Motion have popularized this approach.
Users can control, manipulate, or select virtual objects by performing specific hand or
body movements, making AR more intuitive and engaging.

Voice Commands:

Voice commands in AR enable users to interact with digital content and applications using
natural language. Virtual assistants like Siri, Google Assistant, and AR-specific voice
command systems can recognize and respond to spoken instructions.
This interface paradigm is particularly useful for hands-free and eyes-free interaction in
situations where touch or gesture-based input may be impractical.

Touch and Tap Interfaces:

AR devices with touch-sensitive surfaces or handheld controllers provide users with the
ability to touch, tap, and interact with virtual objects as if they were physical. For example,
AR glasses may have touch-sensitive frames or handheld devices for interaction.
Users can interact with objects, select options, and navigate through menus by physically
touching or tapping the AR interface.

Brain-Computer Interfaces (BCI):

Emerging BCIs can be used to control AR systems using brain signals, bypassing the
need for physical input devices. These interfaces are still in experimental stages but have
the potential to offer a high degree of control and personalization.
BCIs can interpret brainwave patterns to trigger actions, navigate menus, or select objects
in the AR environment.

Eye-Tracking Interfaces:

Eye-tracking technology can be integrated into AR headsets to determine where a user is
looking. This information can be used for various purposes, such as selecting objects,
adjusting focus, or providing additional information based on the user's gaze.
Eye-tracking interfaces can enhance user engagement and streamline interactions in AR
applications.

Haptic Feedback:

Haptic feedback provides tactile sensations to the user, enhancing the sense of touch in
AR experiences. It can be delivered through vibrations, force feedback, or other tactile
feedback mechanisms.
In AR, haptic feedback can simulate the feeling of interacting with virtual objects, providing
a more immersive experience.

Augmented Reality Markup Language (ARML):

ARML is a language designed for creating augmented reality experiences that include
interactive 3D models, animations, and information overlays. It allows developers to define
the structure and behavior of AR content. ARML can be used to create interactive AR
applications that respond to user actions, such as selecting, moving, or resizing virtual
objects.

Social and Multi-User Interfaces:

Social AR interfaces focus on enabling collaborative and multiplayer AR experiences.
Users can interact with each other and share digital content in the same physical space.
Multi-user AR interfaces often incorporate avatars and real-time communication, allowing
users to collaborate, play games, or engage in shared activities in a mixed-reality
environment.

These alternative interface paradigms for augmented reality aim to provide more natural,
intuitive, and immersive interactions, enhancing the potential of AR for a wide range of
applications, from gaming and education to industrial use cases and social collaboration.
The choice of interface paradigm depends on the specific use case, hardware capabilities,
and user preferences.

1.11 Spatial AR

Spatial AR is a type of Augmented Reality technology in which the combination of virtual
and real objects is produced by projecting virtual images onto real objects using projection
mapping. Hence, display monitors, head-mounted displays or hand-held devices are not
typically used in this type of AR.
Some Examples

Augmented Reality with Scene Recognition


Augmented Reality (AR) comes with different tracking types. The most advanced
type of experience makes use of scene recognition.

When designing this kind of experience, a detailed 3D scan of a real-life location
needs to be created first. Inside this model, AR content can be positioned with very
high precision. This approach allows for complex scene-recognition experiences
without the need for a target or very precise location tracking, for example overlaying
readings such as tyre pressure or oil level on a car.

Instant Tracking with SLAM

Instant tracking allows the user to place content inside of scenes without the need for an
image or a surface. Instant tracking does not rely on object recognition or images as it
tracks features of the physical environment. It makes use of SLAM (Simultaneous
Localization and Mapping), which is capable of understanding the physical reality of a
scene. This makes it more precise than GPS tracking and perfect for indoor usage.

Spatial AR can be used whenever precision is key to the experience; the scene-recognition
and instant-tracking examples above illustrate some of its possibilities in the home.

1.13 3D Position Trackers

 3D Tracking: Much like vehicle GPS navigation, coordinate data are used to show
where an object is in 3D space, and where it needs to go next.
 With vehicle GPS navigation, the car is moving and the destination is fixed.
 A map supplies the travel route.
 However, only a 2D (planar) view of the vehicle’s position is available, and only
movement along the longitude and latitude lines (X and Y axes, respectively) is
shown.
 Altitude info (Z-axis) is missing, as is rotational data – only two degrees-of-freedom
are reported.
 Such a limited view cannot support complex OEM surgical navigation applications.

 The Polaris optical measurement solution, and Aurora and 3D Guidance
electromagnetic (EM) tracking solutions capture the position (X-Y-Z coordinate
data) and orientation (roll, pitch, yaw) of an optical navigation marker or EM sensor
in relation to a fixed object or reference point in 3D space.
 Position and orientation measurements also refer to the “degrees of freedom”
(DOF) in which an object moves in 3D space.
 There are six degrees of freedom in total; NDI’s solutions capture all six degrees of
freedom in real-time.
 This type of technology, known as 3D measurement, spatial measurement, or 3D
motion tracking, can be used for real-time tool tracking and navigation purposes.

Motion Tracking in All Directions

 3D tracking technology provides that needed detailed view by digitizing, formatting,
and enabling visualization of the measurement data.
 Positional movement of the optical navigation marker or EM sensor can be tracked
on the X, Y and Z axes of a 3D coordinate system.
 Rotation (roll, pitch and yaw) on these axes is calculated as orientation data.
 Movement in all directions is known, from any angle/perspective.
 This movement is reported in relation to a fixed object or reference frame; i.e., a
‘home’ location.
 Multiple objects—and their respective locations to each other—can be dynamically
tracked at once.
 To apply the GPS analogy to surgical navigation applications, patient imaging
datasets represent the map.
 The target/treatment site is the destination.
 And an EM sensor embedded into an OEM surgical instrument such as a catheter,
or a medical instrument fitted with optical markers act as the vehicle.
 3D tracking technology can enable the position and orientation of the catheter or
instrument to be known in relation to its home/start location and destination as it’s
navigated through the body.
 The catheter or instrument’s path (route) is visualized, planned, navigated, and
presented in real-time to the clinician in the host OEM software interface.
 3D tracking technology bridges the gap between static patient images and dynamic
instrument movements; it brings the physical world into digital interfaces.
 The live stream of measurement data allows instruments to be shown at the right
place, at the right time.
 As with GPS navigation, the value of 3D tracking technology is tied to its accuracy.
 It’s the difference between arriving exactly at your destination or being off by miles
– or millimetres in surgical navigation applications.
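
The position-plus-orientation idea can be sketched as follows (illustrative values only, not tied to any particular tracking product): roll, pitch and yaw are turned into a rotation matrix, and the resulting 6DoF pose expresses a point measured in the sensor's frame in the fixed reference frame.

```python
import numpy as np

def rotation_from_rpy(roll, pitch, yaw):
    """Rotation matrix from roll (about x), pitch (about y), yaw (about z), in radians."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# 6DoF pose of a tracked sensor relative to the fixed reference ("home") frame.
position = np.array([0.10, 0.25, 0.40])            # metres along X, Y, Z
R = rotation_from_rpy(0.0, 0.0, np.radians(30.0))  # 30 degrees of yaw, no roll or pitch

# A point known in the sensor's own frame, e.g. an instrument tip 5 cm ahead of the sensor.
tip_in_sensor = np.array([0.05, 0.0, 0.0])

# The same point expressed in the reference frame.
tip_in_reference = R @ tip_in_sensor + position
print(tip_in_reference)
```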

1.14 Types of 3D Trackers

1. Single-Point Tracking

 This beginner-friendly motion tracking technique is perfect for getting your feet wet in
the world of motion tracking.
 A single-point tracker refers to tracking an object using a single point of reference
within a composition.
 How it works: In single-point tracking, the motion tracking software is given a single
point in a clip to focus on and tracks the movement of the camera around that single
point.
2. Two-Point Tracking

 Two-point tracking allows you to apply two different points of motion tracking to an
image and track more than one type of movement.
 How it works: In two-point tracking, you apply two separate tracking points to an image
and use each one to track a different type of motion.

3. Four-Point Tracking

 Once you've mastered single-point and two-point tracking, it's time to move up to
something a bit more challenging, but very useful: four-point tracking.
 How it works: Also known as “corner pin tracking” or “perspective tracking,” four-point
tracking allows you to track each corner of a four-point surface throughout a shot (such
as a smartphone screen).
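
Under the hood, corner-pin tracking amounts to computing, for each frame, the perspective transform that maps the corners of the replacement image onto the four tracked corners. The sketch below uses OpenCV purely for illustration (an assumption; After Effects applies a comparable corner-pin transform internally, and the pixel coordinates are made up).

```python
import numpy as np
import cv2

# Corners of the replacement image (e.g. a 1280x720 screenshot), clockwise from top-left.
src = np.float32([[0, 0], [1280, 0], [1280, 720], [0, 720]])

# The four tracked corners of the phone screen in the current video frame
# (made-up pixel coordinates; in practice the tracker supplies these every frame).
dst = np.float32([[412, 233], [869, 251], [845, 522], [399, 498]])

H = cv2.getPerspectiveTransform(src, dst)   # 3x3 corner-pin (perspective) transform
print(H)

# To composite, the replacement image would be warped into the frame with the same transform:
# warped = cv2.warpPerspective(replacement_img, H, (frame_width, frame_height))
```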

4. Planar tracking

 Planar tracking is one of the most effective forms of motion tracking, but it does require
you to be very comfortable with motion tracking tools to use. However, the results speak
for themselves: this form of tracking is well worth the effort to learn.
 How it works: Planar tracking utilizes the Mocha plug-in (included with After Effects) to
track a plane or a flat surface through a motion.

5. Spline tracking

 This complex motion tracking technique is one of the most accurate of all techniques,
but comes with a significant learning curve.
 How it works: Spline tracking allows you to trace around an object that you want to
track instead of focusing on a single or set of points. This creates a custom 2D object
that After Effects will try to track.

6. 3D Camera Tracking

 Last, but certainly not least, is 3D camera tracking, a feature of After Effects that has
gained mounting popularity among hobbyists and professional visual-effects artists alike.
 How it works: The 3D tracking tool automatically generates dozens of possible motion
tracking points, then allows the user to select which points they would like to track. This
takes a lot of the manual labor out of setting tracking points, but does require quite a bit
of time and processing power to use.

1.15 Navigation and Manipulation Interfaces

1.16 Gesture Interfaces

Gesture recognition is the process by which gestures formed by a user are made known to
the system. In completely immersive VR environments, the keyboard is generally not
included, and some other means of control over the environment is needed.

Fig. 1: Gesture interface. Fig. 2: Hand points.

A rough overview of the main parts that make up the framework is shown in Fig. 1. The
framework supports an easy swap of the Hand Tracker component, allowing many hand
tracking solutions to be used with the framework. The Gesture Recorder allows recording
and saving gestures that are then stored in a set of known gestures. These gestures are
then recognized by the Gesture Recognizer by comparing stored data with live data from
the hand tracker. A Gesture Interpreter is used to communicate with the desired
application using events that inform when gestures are performed.

The hand detection in the proposed framework is realized using a hand tracker built into
an HMD. The tracking part is crucial and serves as an entry point to the gesture
recognition framework as it is responsible for accurate pose matching. The hand tracker
primarily used in the development of the framework provides 23 points for each hand
(see Fig. 2). Other configurations are supported as well. The more points a tracking device
provides, the more fine-grained the gesture recognition is. The framework requires two
primary features for recognizing one-handed gestures:
 Joint Positions for static hand shape detection and matching;
 Finger Tip Positions for recording spatial information required to perform dynamic
gestures.

Hand poses and shapes can be stored for later recognition of static hand gestures,
e.g., while a user is performing the hand movement, it can be matched against a
predefined set of hand shapes. In order to increase the recognition performance for users
with hands below or above the average human hand size, the hand positions are adjusted
by a scaling factor. This normalization is necessary to increase the recognition accuracy
for gestures that were recorded by a different user than the user performing the gestures.
To achieve this, each joint position is divided by the hand scaling factor.
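
A minimal sketch of that matching step (the structure, threshold and toy data are assumptions, not the framework's actual code): live joint positions are divided by the hand-scale factor and compared against stored templates by average per-joint distance.

```python
import numpy as np

def normalize(joints, hand_scale):
    """Scale-normalize joint positions (N x 3) so different hand sizes become comparable."""
    joints = np.asarray(joints, dtype=float)
    return (joints - joints[0]) / hand_scale   # also expressed relative to the first (wrist) joint

def match_gesture(live_joints, live_scale, templates, threshold=0.15):
    """Return the stored hand shape closest to the live pose, if it is close enough."""
    live = normalize(live_joints, live_scale)
    best_name, best_err = None, float("inf")
    for name, (tmpl_joints, tmpl_scale) in templates.items():
        tmpl = normalize(tmpl_joints, tmpl_scale)
        err = np.mean(np.linalg.norm(live - tmpl, axis=1))   # mean per-joint distance
        if err < best_err:
            best_name, best_err = name, err
    return best_name if best_err < threshold else None

# Toy example with 3 joints instead of the 23 points mentioned above.
templates = {"pinch": (np.array([[0.0, 0.0, 0.0], [0.9, 0.10, 0.0], [0.9, 0.15, 0.0]]), 1.0)}
live = np.array([[0.0, 0.0, 0.0], [1.05, 0.12, 0.0], [1.05, 0.18, 0.0]])
print(match_gesture(live, live_scale=1.15, templates=templates))   # -> pinch
```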

The raw hand shape does not take hand rotation or orientation into account and
therefore has rotation invariance. Performing a hand shape that resembles a “thumbs up”
will therefore be indistinguishable from a “thumbs down”, but it should likely have a
different meaning. Furthermore, for some gestures, it is necessary to know whether the
hand is facing the user.

1.17 Types of Gesture Input and Output Devices

Gesture input and output devices are technology tools that enable users to interact with
computers, smartphones, and other digital devices through hand and body movements.
These devices can be categorized into various types based on their functions and
capabilities. Here are some common types of gesture input and output devices:

Gesture Input Devices:

Touchscreens: Touchscreens are one of the most common gesture input devices,
allowing users to interact with a device by tapping, swiping, pinching, and zooming using
their fingers or a stylus.

Motion Controllers: Motion controllers, such as the ones used with gaming consoles like
the PlayStation Move or the Xbox Kinect, capture the user's hand and body movements to
control on-screen actions in games and applications.

Gesture Recognition Cameras: Devices like the Kinect for Xbox and various webcams
equipped with gesture recognition software can track hand and body movements to
control applications and games.

Gesture Gloves: Specialized gloves with embedded sensors can detect hand and finger
movements, making them suitable for virtual reality and augmented reality applications.

Inertial Measurement Units (IMUs): IMUs are sensors that can be attached to various
body parts to track their movements, commonly used in motion capture and gesture
recognition systems.

Depth-Sensing Cameras: Cameras like the Intel RealSense and the Orbbec Astra use
infrared technology to create 3D depth maps, allowing for precise gesture recognition and
tracking.

Gesture Output Devices:

Haptic Feedback Devices: These devices provide tactile feedback to the user, such as
vibration or force feedback, in response to specific gestures. Examples include haptic
feedback in smartphones and game controllers.

Augmented Reality (AR) and Virtual Reality (VR) Headsets: AR and VR headsets, like
the Oculus Rift and HoloLens, offer immersive experiences where gestures can control
virtual objects and environments.

Projectors: Projectors can display interactive content on various surfaces, enabling users
to interact with projected images and interfaces through gestures and touch.

Interactive Whiteboards: These large touchscreen displays are used in educational and
business settings to allow users to control and interact with digital content using gestures
and digital pens.

Smart TVs: Some modern smart TVs come with gesture control features, allowing users
to change channels, adjust volume, and navigate menus with hand movements.

Ambient Displays: Ambient displays use lights, colors, and motion to provide information
or convey messages, and users can interact with them through gestures or proximity.

Wearable Devices: Smartwatches and other wearable devices can provide gesture-based
feedback through vibrations and displays on the user's wrist.

Interactive Tables: Tables equipped with touch-sensitive or gesture-sensing surfaces are
used in various applications, such as restaurants, retail, and collaborative workspaces.

These are just some of the many types of gesture input and output devices available, and
technology in this field continues to evolve, offering new and innovative ways to interact
with digital systems and environments.

1.18 Graphics Display

Head-mounted displays (HMDs)

HMDs are small displays or projection technology integrated into eyeglasses or
mounted on a helmet or hat. Heads-up displays are a type of HMD that does not block the
user’s vision, but superimposes the image on the user’s view of the real world. An
emerging form of heads-up display is a retinal display that “paints” a picture directly on the
sensitive part of the user’s retina. Although the image appears to be on a screen at the
user’s ideal viewing distance, there is no actual screen in front of the user, just special
optics (for example, modified eyeglasses) that reflect the image back into the eye. Other
heads-up displays that are not worn by the user but are projected on a surface (for
example, on a car or plane windshield) are not covered in this discussion. Some HMDs
incorporate motion sensors to determine direction and movement (for example, to provide
context-sensitive geographic information) or as the interface to an immersive virtual reality
application.

Hand Supported Displays

 The user can hold the device in one or both hands in order to periodically view a
synthetic scene
 Allows user to go in and out of the simulation environment as demanded by the
application
 It has push buttons that can be used to interact with the virtual scene

Floor Supported Display

The latest application of augmented reality is flooring. It works by using a smartphone
camera to project various virtual objects onto the floor. The app can simulate various
objects, such as water waves. Researchers at McGill University came up with the idea of
floor tiles that can mimic different objects. The app also allows users to share
measurements with contractors and supports private-label versions, making the process
of selecting the right floor plan easier.
The technology behind AR can help buyers and sellers find the right floor tile, and building
remodelers can use it to make informed decisions about the design of their new floors. It
also allows the customers to try on different tile designs without having to leave the house.
Using this technology, floor planners can estimate the appropriate number of floor tiles
without leaving the house, saving both time and money. The technology is also being used
on factory floors, where it can help improve efficiency and safety.

Large Volume Displays

Large-volume displays are used in VR environments that allow more than one user,
located in close proximity, to share the experience.

Large-volume displays in augmented reality (AR) refer to systems or setups that enable
the projection of AR content into a physical space on a larger scale, often beyond the
confines of typical handheld devices or headsets. These displays can be used for various
purposes, such as virtual design and prototyping, immersive gaming experiences,
architectural visualization, and more. Here are some methods and technologies commonly
used for creating large-volume AR displays:

Projection-Based AR: One approach to creating large-volume AR displays involves using
projectors to display augmented content onto physical objects or surfaces. These
projectors can be mounted on the ceiling or placed strategically around a room.
Projection-based AR can provide an immersive experience by overlaying digital
information onto a larger physical environment. Some systems use depth-sensing
cameras and tracking technology to ensure that the projected content aligns correctly with
the real-world objects.

Cave Automatic Virtual Environment (CAVE): CAVE systems are immersive 3D
environments where multiple walls are used as screens to display AR content. Users
typically wear 3D glasses, and the environment is often coupled with motion tracking
systems for interactive experiences. CAVEs are used in fields like scientific visualization,
virtual prototyping, and architecture.

1.19 Human Visual System

 The human visual system can be regarded as consisting of two parts. The eyes act as
image receptors which capture light and convert it into signals, which are then
transmitted to image-processing centres in the brain.
 These centres process the signals received from the eyes and build an internal
“picture” of the scene being viewed.

 Processing by the brain consists partly of simple image processing and partly of
higher functions which build and manipulate an internal model of the outside world.
 Although the division of function between the eyes and the brain is not clear-cut, it
is useful to consider each of the components separately.
 The basic structure of the eye can be seen in a cross-section of a right eye. The
cornea and aqueous humour act as a primary lens which performs crude focusing of
the incoming light signal.
 A muscle called the zonula controls both the shape and positioning (forward and
backwards) of the eye’s lens.
 This provides a fine control over how the light entering the eye is focused.
 The iris is a muscle which, when contracted, covers all but a small central portion of
the lens.
 This allows dynamic control of the amount of light entering the eye, so that the eye
can work well in a wide range of viewing conditions, from dim to very bright light.
 The portion of the lens not covered by the iris is called the pupil.
 The retina provides a photo-sensitive screen at the back of the eye, which incoming
light is focused onto.
 Light hitting the retina is converted into nerve signals.
 A small central region of the retina, called the fovea, is particularly sensitive
because it is tightly packed with photo-sensitive cells.
 It provides very good resolution and is used for close inspection of objects in the
visual field.
 The optic nerve transmits the signals generated by the retina to the vision
processing centres of the brain.

 The retina is composed of a thin layer of cells lining the interior back and sides of
the eye.
 Many of the cells making up the retina are specialised nerve cells which are quite
similar to the tissue of the brain.
 Other cells are light-sensitive and convert incoming light into nerve signals which
are transmitted by the other retinal cells to the optic nerve and from there to the
brain.
 There are two general classes of light-sensitive cells in the retina: rods and cones.
 Rod cells are very sensitive and provide visual capability at very low light levels.
 Cone cells perform best at normal light levels.
 They provide our daytime visual faculties, including the ability to see in colour.
 There are roughly 120 million rod cells and 6 million cone cells in the retina.
 There are many more rods than cones because rods are used at low light levels, and
so more of them are required to gather the available light.

1.20 Sound Displays

To ensure full immersion in VR systems, the spatial sounds need to match the spatial
characteristics of the visuals – so if you see a car moving away from you in the VR
environment, you will also expect to hear the car moving away from you.

Stereo sound

If we have two loudspeakers (stereo), we can move the perceived position of a sound
source anywhere along the horizontal plane between the two loudspeakers. We can ‘pan’
the sound to the left side by increasing the amplitude level of the left loudspeaker and
lowering the amplitude of the right loudspeaker.

If the sound is played at the same amplitude level through both loudspeakers it will be
heard as if coming from directly in between the two.

This technique of ‘amplitude panning’ to move sound sources between loudspeakers can
be scaled up and used across an array of multiple loudspeakers to reproduce three
dimensional surround sound.
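
A minimal sketch of amplitude panning between two loudspeakers, using the standard constant-power pan law (a common formulation, not specific to any particular VR system):

```python
import math

def constant_power_pan(pan):
    """Left/right gains for pan in [-1, 1]: -1 = hard left, 0 = centre, +1 = hard right."""
    theta = (pan + 1) * math.pi / 4          # map [-1, 1] onto [0, pi/2]
    return math.cos(theta), math.sin(theta)  # gains that keep total power roughly constant

for pan in (-1.0, -0.5, 0.0, 0.5, 1.0):
    left, right = constant_power_pan(pan)
    print(f"pan {pan:+.1f}: left gain {left:.3f}, right gain {right:.3f}")
```

At the centre position both gains are about 0.707, so the sound appears to come from midway between the two loudspeakers, as described above.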

But of course, not many people have access to a large number of loudspeakers arranged
in a sphere which can produce enveloping surround sound. Most VR/AR systems
designed for personal/home use rely on headphones to deliver surround sound to the
user.

So – how do we go about delivering a 360 degree immersive sound scene over a pair of
headphones?

Spatial sound

Take a moment to close your eyes and listen very carefully to the sounds all around you.
(If you’re somewhere really quiet then you might want to try this exercise somewhere with
more noise.) What can you hear? Where are those sounds coming from? What can you
hear in front of you? What can you hear behind you? What about above you, below you,
or to each side? Try rotating your head – do the sounds change?

Even with our eyes closed, our hearing system is very well evolved to allow us to locate
sounds in space around us – we call this sound localisation. But how does it work?

To localise sounds in space we rely on ‘binaural cues’. Our brain takes ‘cues’, information
about the level, timing and overall tone of the sound arriving at our left ear, and compares
it with the sound arriving at our right ear. Differences between the sounds in each ear help
us to work out where the sound is placed relative to our own position.

In How Your Ear Changes Sound we discovered how sounds are filtered by the outer ear,
mainly the pinna. Think about a sound placed to your right, slightly above your head. The
acoustic wave that reaches your ears from this sound has travelled directly to your right
ear, but has had to travel around your head to reach your left ear. The shape of your head
actually filters the sound, meaning that some frequencies are dampened and the overall
tone is altered – these are spectral cues.

There are two more types of binaural cue your brain can use:

 The interaural level difference (ILD) – in our example the sound has travelled further to
get to your left ear, so it is quieter because it has lost more energy on the way.
 The interaural time difference (ITD) – in our example the sound reaches your right ear a
fraction of a millisecond before it reaches your left ear.
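
The size of these timing differences can be estimated with the classic Woodworth spherical-head approximation, ITD ≈ (r/c)(θ + sin θ), where r is the head radius, c is the speed of sound and θ is the source azimuth. The sketch below assumes typical values for r and c:

```python
import math

def itd_seconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound_mps=343.0):
    """Woodworth approximation of the interaural time difference for a distant source."""
    theta = math.radians(azimuth_deg)   # 0 = straight ahead, 90 = directly to one side
    return (head_radius_m / speed_of_sound_mps) * (theta + math.sin(theta))

for azimuth in (0, 30, 60, 90):
    print(f"azimuth {azimuth:>2} deg: ITD ~ {itd_seconds(azimuth) * 1e6:.0f} microseconds")
```

For a source directly to one side this gives roughly 0.66 ms, consistent with the "fraction of a millisecond" mentioned above.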

1.21 Human Auditory System.

Sound: Intensity, Frequency, Outer and Middle Ear Mechanisms, Impedance Matching by
Area and Lever Ratios

The auditory system changes a wide range of weak mechanical signals into a complex
series of electrical signals in the central nervous system. Sound is a series of pressure
changes in the air. Sounds often vary in frequency and intensity over time. Humans can
detect sounds that cause movements only slightly greater than those of Brownian
movement. Obviously, if we heard that ceaseless (except at absolute zero) motion of air
molecules we would have no silence.

Figure 12.2 depicts these alternating compression and rarefaction (pressure) waves
impinging on the ear. The pinna and external auditory meatus collect these waves, change
them slightly, and direct them to the tympanic membrane. The resulting movements of the
eardrum are transmitted through the three middle-ear ossicles (malleus, incus and stapes)
to the fluid of the inner ear. The footplate of the stapes fits tightly into the oval window of
the bony cochlea. The inner ear is filled with fluid. Since fluid is incompressible, as the
stapes moves in and out there needs to be a compensatory movement in the opposite
direction. Notice that the round window membrane, located beneath the oval window,
moves in the opposite direction.

Because the tympanic membrane has a larger area than the stapes footplate there is a
hydraulic amplification of the sound pressure. Also because the arm of the malleus to
which the tympanic membrane is attached is longer than the arm of the incus to which the
stapes is attached, there is a slight amplification of the sound pressure by a lever action.
These two impedance-matching mechanisms effectively transmit airborne sound into the
fluid of the inner ear. If the middle-ear apparatus (ear drum and ossicles) were absent,
then sound reaching the oval and round windows would be largely reflected.
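
A rough worked example of those two impedance-matching mechanisms, using commonly quoted approximate values (the exact anatomical figures vary between sources):

```python
import math

# Commonly quoted approximate values; they differ slightly between textbooks.
eardrum_area_mm2 = 55.0    # effective area of the tympanic membrane
footplate_area_mm2 = 3.2   # area of the stapes footplate
lever_ratio = 1.3          # malleus arm length relative to incus arm length

area_gain = eardrum_area_mm2 / footplate_area_mm2   # ~17x pressure increase
total_pressure_gain = area_gain * lever_ratio       # ~22x overall
gain_db = 20 * math.log10(total_pressure_gain)      # roughly 27 dB

print(f"area ratio ~{area_gain:.1f}x, total pressure gain ~{total_pressure_gain:.1f}x (~{gain_db:.0f} dB)")
```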
