ARVR Unit 1
Introduction to Virtual Reality – Definition – Three I’s of Virtual Reality – Virtual Reality Vs
3D Computer Graphics – Benefits of Virtual Reality – Components of VR System –
Introduction to AR – System Structure of Augmented Reality – Key Technology in AR – 3D
Vision – Approaches to Augmented Reality – Alternative Interface Paradigms – Spatial AR
– Input Devices – 3D Position Trackers – Performance Parameters – Types Of Trackers –
Navigation and Manipulation Interfaces – Gesture Interfaces – Types of Gesture Input
Devices – Output Devices – Graphics Display – Human Visual System – Personal
Graphics Displays – Large Volume Displays – Sound Displays – Human Auditory System.
Components of a VR System
1. VR headset:
It is a display consisting of two screens that show the virtual world in front of the user.
It has motion sensors that detect the orientation and position of your head and adjust the picture accordingly.
It has built-in headphones or external audio connectors to output sound.
Moreover, it has a blackout cover so that the user is fully disconnected from the outside world.
2. Computing device:
3. Sensor(s):
Input devices: Input devices are used by users in the VR system to interact with the
virtual world in front of them.
These devices might be a tool or a weapon in their artificial world.
The input devices include mice, controllers, joysticks, gloves with sensors, and
body tracking systems.
4. Audio systems:
5. Software:
Introduction to AR
Definition: Augmented reality (AR) is an interactive experience that combines the real world and computer-generated content. The content can span multiple sensory modalities, including visual, auditory, etc.
System structure of AR
This architecture comprises the components listed below; the interactive relationships between them form the working model of an augmented reality system.
1. User: The most essential part of augmented reality is its user. The user can be a student, a doctor, an employee, and so on. The user is responsible for the creation and use of AR models.
2. Interaction: Interaction is the process between the device and the user: an action performed by one entity results in a response or a further action by the other.
3. Device: This component is responsible for the creation, display, and interaction of 3D models. The device can be portable or stationary, for example a mobile phone, a computer, or an AR headset.
4. Virtual Content: The virtual content is the 3D model created or generated by the system or AR application. Virtual content is the kind of information that can be integrated into the user's real-world environment; it can be 3D models, textures, text, images, etc.
5. Tracking: This component is the process that makes the creation of AR models possible. Tracking is an algorithm that helps the device determine where to place or integrate the 3D model in the real-world environment. Many types of tracking algorithms are available for use in the development of AR applications.
6. Real-life entity: The last component of the AR architecture is the real-world entity. These entities can be a tree, a book, fruit, a computer, or anything else visible on the screen. The AR application does not change the position of real-life entities; it only integrates digital information with them.
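To make the relationships between these components concrete, here is a minimal Python sketch of one pass through a hypothetical AR pipeline: a tracker estimates where a real-life entity sits, and the device anchors and renders virtual content at that pose. All class and function names are illustrative and do not come from any specific AR framework.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    """Position (x, y, z) and orientation (yaw, pitch, roll) in the real world."""
    position: tuple
    orientation: tuple

@dataclass
class VirtualContent:
    """A 3D model, text, or image to be integrated into the user's environment."""
    name: str
    pose: Pose = None  # filled in once tracking anchors it to a real entity

class Tracker:
    """Stand-in for a tracking algorithm (marker-based, SLAM, GPS, ...)."""
    def locate(self, real_entity: str) -> Pose:
        # A real tracker would use camera frames and/or sensor data here.
        return Pose(position=(0.0, 0.0, 1.5), orientation=(0.0, 0.0, 0.0))

class Device:
    """The phone, computer, or headset that renders and handles interaction."""
    def __init__(self, tracker: Tracker):
        self.tracker = tracker

    def augment(self, real_entity: str, content: VirtualContent) -> VirtualContent:
        content.pose = self.tracker.locate(real_entity)   # tracking step
        print(f"Rendering '{content.name}' on '{real_entity}' at {content.pose}")
        return content                                    # display + interaction

# One pass through the pipeline: user -> device -> tracking -> virtual content -> real entity.
device = Device(Tracker())
device.augment("book", VirtualContent(name="3D annotation"))
```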
History of AR
Thomas Caudell (Boeing Computer Services) coined the term augmented reality in 1990 to describe how the head-mounted displays that electricians use when assembling complicated wiring harnesses worked.
In 1998, one of the first commercial applications of augmented reality technology was the yellow first-down marker that began appearing in televised football games.
Today, Google Glass, smartphone games and heads-up displays (HUDs) in car
windshields are the most well-known consumer AR products.
But the technology is also used in many industries, including healthcare, public
safety, gas and oil, tourism and marketing.
Retail: Consumers can use a store's online app to see how products, such as
furniture, will look in their own homes before buying.
Entertainment and gaming. AR can be used to overlay a virtual game in the real
world or enable users to animate their faces in different and creative ways on social
media.
Navigation. AR can be used to overlay a route to the user's destination over a live
view of a road. AR used for navigation can also display information about local
businesses in the user's immediate surroundings.
Tools and measurement. Mobile devices can use AR to measure different 3D
points in the user's environment.
Architecture. AR can help architects visualize a building project.
Military. Data can be displayed on a vehicle's windshield that indicates destination
directions, distances, weather and road conditions.
Archaeology. AR has aided archaeological research by helping archaeologists
reconstruct sites. 3D models help museum visitors and future archaeologists
experience an excavation site as if they were there.
Hardware
Hardware components for augmented reality are: a processor, display, sensors and
input devices.
Modern mobile computing devices like smartphones and tablet computers contain these elements, which often include a camera and microelectromechanical systems (MEMS) sensors such as an accelerometer, GPS, and solid-state compass.
The two main optical display techniques used in AR are 1) diffractive waveguides and 2) reflective waveguides.
Display
Various technologies are used in AR rendering, including optical projection systems, monitors, handheld devices, and display systems worn on the human body.
HMD
An HMD is a display device worn on the head, such as a harness- or helmet-mounted display.
HMDs place images of both the physical world and virtual objects over the user's field
of view.
Modern HMDs often employ sensors for six degrees of freedom monitoring that allow
the system to align virtual information to the physical world and adjust accordingly with
the user's head movements.
HMDs can provide VR users with mobile and collaborative experiences.
Specific providers, such as uSens and Gestigon, include gesture controls for full virtual
immersion.
Eyeglasses
AR displays can be rendered on devices resembling eyeglasses.
Versions include eyewear that employs cameras to intercept the real-world view and re-display its augmented view through the eyepieces.
Head-up display (HUD)
A head-up display is a transparent display that presents data without requiring users to look away from their usual viewpoints.
A precursor technology to augmented reality, heads-up displays were first developed
for pilots in the 1950s, projecting simple flight data into their line of sight, thereby
enabling them to keep their "heads up" and not look down at the instruments.
Near-eye augmented reality devices can be used as portable head-up displays as they
can show data, information, and images while the user views the real world.
Practically, AR is expected to include registration and tracking between the
superimposed perceptions, sensations, information, data, and images and some
portion of the real world.
Contact lenses
EyeTap
The EyeTap (also known as Generation-2 Glass) captures rays of light that would
otherwise pass through the centre of the lens of the wearer's eye, and substitutes
synthetic computer-controlled light for each ray of real light.
Handheld
A handheld display employs a small display that fits in a user's hand. All handheld AR solutions to date opt for video see-through.
Projection mapping
Projection mapping augments real-world objects and scenes, without the use of special
displays such as monitors, head-mounted displays or hand-held devices.
Projection mapping makes use of digital projectors to display graphical information
onto physical objects.
The key difference in projection mapping is that the display is separated from the users
of the system.
Tracking
Modern mobile AR systems use one or more of the following motion tracking
technologies: digital cameras and/or other optical sensors, accelerometers, GPS,
gyroscopes, solid state compasses, radio-frequency identification (RFID).
These technologies offer varying levels of accuracy and precision.
These technologies are implemented in the ARKit API by Apple and ARCore API by
Google to allow tracking for their respective mobile device platforms.
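As a simplified illustration of how readings from these sensors can be combined, the sketch below fuses a gyroscope rate with an accelerometer tilt estimate using a complementary filter. This is a generic textbook technique, not the actual fusion used inside ARKit or ARCore, and the sample readings are invented.

```python
import math

def complementary_filter(angle_prev, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """Fuse a gyroscope rate (deg/s) with an accelerometer tilt estimate (deg).

    The gyro integrates smoothly but drifts; the accelerometer is noisy but
    drift-free. Blending the two gives a stable short- and long-term estimate.
    """
    gyro_angle = angle_prev + gyro_rate * dt                   # integrate angular velocity
    accel_angle = math.degrees(math.atan2(accel_x, accel_z))   # tilt from gravity direction
    return alpha * gyro_angle + (1 - alpha) * accel_angle

# Invented sample readings: the device slowly tilting while the gyro drifts slightly.
angle, dt = 0.0, 0.01
for gyro_rate, ax, az in [(5.0, 0.05, 0.99), (5.2, 0.09, 0.99), (4.9, 0.13, 0.98)]:
    angle = complementary_filter(angle, gyro_rate, ax, az, dt)
print(f"Estimated tilt: {angle:.2f} degrees")
```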
Networking
Mobile augmented reality applications are gaining popularity because of the wide
adoption of mobile and especially wearable devices.
This requires computationally intensive algorithms with extreme latency requirements.
To compensate for the lack of computing power, offloading data processing to a distant
machine is often desired.
Input devices
Speech recognition systems that translate a user's spoken words into computer
instructions
Gesture recognition systems that interpret a user's body movements by visual detection or from sensors embedded in a peripheral device such as a wand, stylus, pointer, or glove. Products that aim to serve as controllers for AR headsets include Wave by Seebright Inc. and Nimble by Intugine Technologies.
Computer
The computer analyzes the sensed visual and other data to synthesize and position
augmentations.
Computers are responsible for the graphics that go with AR.
Augmented Reality uses a computer-generated image which has a striking effect on
the way the real world is shown.
Projector
The projector can throw a virtual object on a projection screen and the viewer can
interact with this virtual object.
Projection surfaces can be many objects such as walls or glass panes.
Software
The software must derive real-world coordinates, independent of the camera and camera images.
That process is called image registration, and uses different methods of computer
vision, mostly related to video tracking.
Many computer vision methods of AR are inherited from visual odometry.
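One common way to implement such registration is feature matching followed by a homography estimate, as in the OpenCV sketch below. The image file names are placeholders, and this is just one possible approach rather than the pipeline of any particular AR product.

```python
import cv2
import numpy as np

# Placeholder file names: a known reference image and the current camera frame.
reference = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)

# Detect and describe features in both images.
orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_frame, des_frame = orb.detectAndCompute(frame, None)

# Match descriptors and keep the best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_frame), key=lambda m: m.distance)[:50]

# Estimate the homography that maps reference coordinates into the camera frame.
src = np.float32([kp_ref[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_frame[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# H can now be used to warp virtual content so it sticks to the tracked surface.
print(H)
```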
An augogram is a computer generated image that is used to create AR.
Augography is the science and software practice of making augograms for AR.
The Augmented Reality Markup Language (ARML) is a data standard developed within the Open Geospatial Consortium (OGC), which consists of Extensible Markup Language (XML) grammar to describe the location and appearance of virtual objects in the scene, as well as ECMAScript bindings to allow dynamic access to the properties of virtual objects.
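Because ARML content is XML, an application can read it with any standard XML parser. The Python snippet below parses a small ARML-like fragment; the element and attribute names are simplified placeholders, not the exact ARML 2.0 schema.

```python
import xml.etree.ElementTree as ET

# A simplified, ARML-like fragment (placeholder element names, not the real schema).
document = """
<arml>
  <feature id="statue">
    <anchor lat="48.2082" lon="16.3738"/>
    <model href="statue.obj" scale="1.0"/>
  </feature>
</arml>
"""

root = ET.fromstring(document)
for feature in root.findall("feature"):
    anchor = feature.find("anchor")
    model = feature.find("model")
    print(f"Place {model.get('href')} at "
          f"lat={anchor.get('lat')}, lon={anchor.get('lon')}, "
          f"scale={model.get('scale')}")
```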
Development
Environmental/context design
Context Design focuses on the end-user's physical surrounding, spatial space, and
accessibility that may play a role when using the AR system.
Designers should be aware of the possible physical scenarios the end-user may be in
such as:
Interaction design
Visual design
Visual design is the appearance of the developing application that engages the user.
To improve the graphic interface elements and user interaction, developers may use
visual cues to inform the user what elements of UI are designed to interact with and
how to interact with them.
Since navigating in an AR application may appear difficult and seem frustrating, visual
cue design can make interactions seem more natural.
1.8 3D Vision
Depth perception, or "3D vision," depends on the ability to use both eyes together at the highest level.
3D vision relies on both eyes working together to accurately focus on the same point in
space.
The brain is then able to interpret the image that each eye sees to create your perception of depth.
Deficiencies in depth perception can result in a lack of 3D vision or headaches and
eyestrain during 3D movies.
Discomfort: 3D technology requires the viewer to focus their eyes either in front of
the screen (by converging) or behind the screen (by diverging). If you have
difficulties with convergence, such as convergence insufficiency, it can result in
headaches, eyestrain, or fatigue.
Dizziness: Often times people will complain of dizziness or nausea after viewing 3D
material. One reason for these complaints is that some of the technology used to
create 3D can worsen Visual Motion Hypersensitivity (VMH).
Depth: Some individuals describe 3D as “popping off the screen” or “coming right at
them”, while others only see a faintly raised image or a flat image that resembles a
traditional screen. This lack or absence of depth is one of the signs that the
binocular vision system is not functioning properly. People who are unable to use
both eyes together to achieve binocularity will not see the depth of 3D content.
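The same binocular principle is used for machine depth estimation: the horizontal offset (disparity) between what the two eyes, or two cameras, see is inversely proportional to distance. A minimal worked example with made-up numbers:

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Standard pinhole stereo relation: depth = f * B / d."""
    return focal_length_px * baseline_m / disparity_px

# Made-up values: 700-pixel focal length, 6.5 cm eye/camera separation.
for d in (40.0, 20.0, 10.0):
    print(f"disparity {d:5.1f} px -> depth {depth_from_disparity(700, 0.065, d):.2f} m")
```

Halving the disparity doubles the estimated depth, which is why nearby objects produce a much stronger 3D effect than distant ones.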
1.9 Approaches to Augmented Reality
Marker-based AR:
Markerless AR, also known as location-based AR, uses the device's sensors (e.g.,
GPS, accelerometers, and compass) to determine the user's location and orientation.
This approach can be used to display location-specific information, such as points of
interest, directions, or geolocated content, on a mobile device or AR headset.
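Under the hood, a location-based AR app repeatedly compares the device's GPS fix and compass heading with the coordinates of nearby points of interest. The sketch below computes the distance and bearing to one point of interest using the standard haversine and forward-azimuth formulas; the coordinates are made up.

```python
import math

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (m) and initial bearing (deg) from point 1 to point 2."""
    R = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = math.radians(lat2 - lat1), math.radians(lon2 - lon1)

    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    dist = 2 * R * math.asin(math.sqrt(a))

    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return dist, bearing

# Made-up coordinates: the device and a nearby point of interest.
d, b = distance_and_bearing(51.5007, -0.1246, 51.5014, -0.1419)
print(f"POI is {d:.0f} m away at bearing {b:.0f} degrees")
# The overlay for this POI would be drawn when the compass heading is close to this bearing.
```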
Projection-based AR:
Head-mounted AR:
Mobile-based AR:
AR apps for smartphones and tablets use the device's camera and sensors to provide AR experiences. They can recognize images, surfaces, or objects and overlay digital content on the device's screen.
Popular platforms like Apple's ARKit and Google's ARCore provide tools and frameworks
for developing AR applications on mobile devices.
Web-based AR:
Wearable AR:
Each approach to augmented reality has its own advantages and limitations, and
the choice of which to use depends on the specific use case, hardware, and user
requirements. AR technology continues to evolve, opening up new possibilities and
applications across various industries.
1.10 Alternative Interface Paradigms
Augmented Reality (AR) offers various interface paradigms that go beyond traditional graphical user interfaces (GUIs) to provide more immersive and interactive experiences.
Gesture-Based Interfaces:
Gesture-based AR interfaces allow users to interact with digital objects in the real world
using hand gestures or body movements. Devices like Microsoft's HoloLens and Leap
Motion have popularized this approach.
Users can control, manipulate, or select virtual objects by performing specific hand or
body movements, making AR more intuitive and engaging.
Voice Commands:
Voice commands in AR enable users to interact with digital content and applications using
natural language. Virtual assistants like Siri, Google Assistant, and AR-specific voice
command systems can recognize and respond to spoken instructions.
This interface paradigm is particularly useful for hands-free and eyes-free interaction in
situations where touch or gesture-based input may be impractical.
Touch-Based Interfaces:
AR devices with touch-sensitive surfaces or handheld controllers provide users with the ability to touch, tap, and interact with virtual objects as if they were physical. For example, AR glasses may have touch-sensitive frames or handheld devices for interaction.
Users can interact with objects, select options, and navigate through menus by physically
touching or tapping the AR interface.
Brain-Computer Interfaces (BCI):
Emerging BCIs can be used to control AR systems using brain signals, bypassing the
need for physical input devices. These interfaces are still in experimental stages but have
the potential to offer a high degree of control and personalization.
BCIs can interpret brainwave patterns to trigger actions, navigate menus, or select objects
in the AR environment.
Eye-Tracking Interfaces:
Haptic Feedback:
Haptic feedback provides tactile sensations to the user, enhancing the sense of touch in
AR experiences. It can be delivered through vibrations, force feedback, or other tactile
feedback mechanisms.
In AR, haptic feedback can simulate the feeling of interacting with virtual objects, providing
a more immersive experience.
ARML is a language designed for creating augmented reality experiences that include
interactive 3D models, animations, and information overlays. It allows developers to define
the structure and behavior of AR content. ARML can be used to create interactive AR
applications that respond to user actions, such as selecting, moving, or resizing virtual
objects.
1.11 Spatial AR
1.13 3D Position Trackers
3D Tracking: Much like vehicle GPS navigation, coordinate data are used to show
where an object is in 3D space, and where it needs to go next.
With vehicle GPS navigation, the car is moving and the destination is fixed.
A map supplies the travel route.
However, only a 2D (planar) view of the vehicle’s position is available, and only
movement along the longitude and latitude lines (X and Y axes, respectively) is
shown.
Altitude info (Z-axis) is missing, as is rotational data – only two degrees-of-freedom
are reported.
A view that limited cannot support complex OEM surgical navigation applications.
The Polaris optical measurement solution, and Aurora and 3D Guidance
electromagnetic (EM) tracking solutions capture the position (X-Y-Z coordinate
data) and orientation (roll, pitch, yaw) of an optical navigation marker or EM sensor
in relation to a fixed object or reference point in 3D space.
Position and orientation measurements also refer to the “degrees of freedom”
(DOF) in which an object moves in 3D space.
There are six degrees of freedom in total; NDI’s solutions capture all six degrees of
freedom in real-time.
This type of technology, known as 3D measurement, spatial measurement, or 3D
motion tracking, can be used for real-time tool tracking and navigation purposes.
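A 6-DOF measurement is commonly represented in software as a rotation plus a translation. The sketch below builds a rotation matrix from roll, pitch, and yaw and uses it to express a sensor-relative point in the fixed reference frame; the numbers are arbitrary examples, not output from any particular tracking product.

```python
import numpy as np

def rotation_matrix(roll, pitch, yaw):
    """Rotation from roll/pitch/yaw (radians) about x, y, z, applied as Rz @ Ry @ Rx."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Arbitrary 6-DOF pose: position (x, y, z) in metres, orientation in radians.
position = np.array([0.10, 0.25, 1.50])
R = rotation_matrix(roll=0.0, pitch=0.1, yaw=np.pi / 2)

# A point 5 cm along the sensor's own x-axis, expressed in the fixed reference frame.
tool_tip_local = np.array([0.05, 0.0, 0.0])
tool_tip_world = R @ tool_tip_local + position
print(tool_tip_world)
```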
1. Single-Point Tracking
This beginner-friendly motion tracking technique is perfect for getting your feet wet in
the world of motion tracking.
A single-point tracker refers to tracking an object using a single point of reference
within a composition.
How it works: In single-point tracking, the motion tracking software is given a single
point in a clip to focus on and tracks the movement of the camera around that single
point.
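Outside of dedicated compositing software, the same idea can be reproduced with sparse optical flow: pick one point in a frame and follow it into the next frame. The OpenCV sketch below uses Lucas–Kanade tracking; the clip name and the starting point are placeholders.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("clip.mp4")           # placeholder clip
ok, prev = cap.read()
if not ok:
    raise SystemExit("could not read clip.mp4")
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# A single point of reference chosen by the user (placeholder coordinates).
point = np.array([[[320.0, 240.0]]], dtype=np.float32)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Track the point from the previous frame into the current one.
    new_point, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, point, None)
    if status[0][0] == 1:
        x, y = new_point.ravel()
        print(f"tracked point now at ({x:.1f}, {y:.1f})")
        point = new_point
    prev_gray = gray

cap.release()
```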
2. Two-Point Tracking
Two-point tracking allows you to apply two different points of motion tracking to an
image and track more than one type of movement.
How it works: In two-point tracking, you apply two separate tracking points to an image
and use each one to track a different type of motion.
3. Four-Point Tracking
So, once you’ve mastered single-point and two-point tracking, it’s time to move up to something a bit more challenging but very useful: four-point tracking.
How it works: Also known as “corner pin tracking” or “perspective tracking,” four-point
tracking allows you to track each corner of a four-point surface throughout a shot (such
as a smartphone screen).
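Corner pinning boils down to a perspective (projective) transform: once the four tracked corners are known for a frame, an image can be warped onto them. A minimal OpenCV sketch, with placeholder file names and corner coordinates:

```python
import cv2
import numpy as np

overlay = cv2.imread("overlay.png")          # image to pin (placeholder file)
frame = cv2.imread("frame.png")              # one frame of the shot (placeholder file)

h, w = overlay.shape[:2]
src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

# The four tracked corners of the surface (e.g. a phone screen) in this frame.
dst = np.float32([[420, 180], [760, 200], [750, 560], [410, 530]])

# Compute the perspective transform and warp the overlay onto the frame.
M = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(overlay, M, (frame.shape[1], frame.shape[0]))

# Composite: where the warped overlay has content, replace the frame pixels.
mask = warped.sum(axis=2) > 0
frame[mask] = warped[mask]
cv2.imwrite("pinned_frame.png", frame)
```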
4. Planar tracking
Planar tracking is one of the most effective forms of motion tracking, but it does require you to be very comfortable with motion tracking tools. However, the results speak for themselves as to why this form of tracking is well worth the effort to learn.
How it works: Planar tracking utilizes the Mocha plug-in (included with After Effects) to
track a plane or a flat surface through a motion.
5. Spline tracking
This complex motion tracking technique is one of the most accurate of all techniques,
but comes with a significant learning curve.
How it works: Spline tracking allows you to trace around an object that you want to track instead of focusing on a single point or set of points. This creates a custom 2D object that After Effects will try to track.
6. 3D Camera Tracking
Last, but certainly not least, is 3D camera tracking, a feature of After Effects that has gained mounting popularity among hobbyists and professional visual effects artists alike.
How it works: The 3D tracking tool automatically generates dozens of possible motion
tracking points, then allows the user to select which points they would like to track. This
takes a lot of the manual labor out of setting tracking points, but does require quite a bit
of time and processing power to use.
1.16 Gesture Interfaces
Gesture recognition is the process by which gestures formed by a user are made known to the system. In completely immersive VR environments, the keyboard is generally not included, and some other means of control over the environment is needed.
A rough overview of the main parts that make up the framework is shown in Fig. 1. The
framework supports an easy swap of the Hand Tracker component, allowing many hand
tracking solutions to be used with the framework. The Gesture Recorder allows recording
and saving gestures that are then stored in a set of known gestures. These gestures are
then recognized by the Gesture Recognizer by comparing stored data with live data from
the hand tracker. A Gesture Interpreter is used to communicate with the desired
application using events that inform when gestures are performed.
The hand detection in the proposed framework is realized using a hand tracker built into
an HMD. The tracking part is crucial and serves as an entry point to the gesture
recognition framework as it is responsible for accurate pose matching. The hand tracker
primarily used in the development of the framework provides 23 points for each hand (see Fig. 2); other configurations are supported as well. The more points a tracking device provides, the more fine-grained the gesture recognition is.
The framework requires two primary features for recognizing one-handed gestures:
Joint Positions for static hand shape detection and matching;
Finger Tip Positions for recording spatial information required to perform dynamic
gestures.
Hand poses and shapes can be stored for later recognition of static hand gestures,
e.g., while a user is performing the hand movement, it can be matched against a
predefined set of hand shapes. In order to increase the recognition performance for users
with hands below or above the average human hand size, the hand positions are adjusted
by a scaling factor. This normalization is necessary to increase the recognition accuracy
for gestures that were recorded by a different user than the user performing the gestures.
To achieve this, each joint position is divided by the hand scaling factor.
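A minimal version of this matching step can be written as a nearest-template comparison between scaled joint positions, as sketched below in Python. The gestures, scaling factor, and threshold are invented for illustration and do not reflect the framework's actual implementation.

```python
import numpy as np

def normalize(joints, hand_scale):
    """Scale joint positions so hands of different sizes become comparable."""
    return np.asarray(joints, dtype=float) / hand_scale

def match_gesture(live_joints, hand_scale, stored_gestures, threshold=0.15):
    """Return the name of the closest stored hand shape, or None if none is close."""
    live = normalize(live_joints, hand_scale)
    best_name, best_dist = None, float("inf")
    for name, template in stored_gestures.items():
        dist = np.mean(np.linalg.norm(live - template, axis=1))  # mean per-joint error
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None

# Invented example with 3 joints per gesture instead of the 23 points mentioned above.
stored = {
    "fist": np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.1, 0.1, 0.0]]),
    "open": np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.3, 0.3, 0.0]]),
}
live_hand = [[0.0, 0.0, 0.0], [0.12, 0.01, 0.0], [0.11, 0.12, 0.0]]
print(match_gesture(live_hand, hand_scale=1.0, stored_gestures=stored))  # -> "fist"
```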
The raw hand shape does not take hand rotation or orientation into account and
therefore has rotation invariance. Performing a hand shape that resembles a “thumbs up”
will therefore be indistinguishable from a “thumbs down”, but it should likely have a
different meaning. Furthermore, for some gestures, it is necessary to know whether the
hand is facing the user.
Gesture input and output devices are technology tools that enable users to interact with
computers, smartphones, and other digital devices through hand and body movements.
These devices can be categorized into various types based on their functions and
capabilities. Here are some common types of gesture input and output devices:
Touchscreens: Touchscreens are one of the most common gesture input devices,
allowing users to interact with a device by tapping, swiping, pinching, and zooming using
their fingers or a stylus.
Motion Controllers: Motion controllers, such as the ones used with gaming consoles like
the PlayStation Move or the Xbox Kinect, capture the user's hand and body movements to
control on-screen actions in games and applications.
Gesture Recognition Cameras: Devices like the Kinect for Xbox and various webcams
equipped with gesture recognition software can track hand and body movements to
control applications and games.
Gesture Gloves: Specialized gloves with embedded sensors can detect hand and finger
movements, making them suitable for virtual reality and augmented reality applications.
Inertial Measurement Units (IMUs): IMUs are sensors that can be attached to various
body parts to track their movements, commonly used in motion capture and gesture
recognition systems.
Depth-Sensing Cameras: Cameras like the Intel RealSense and the Orbbec Astra use
infrared technology to create 3D depth maps, allowing for precise gesture recognition and
tracking.
Haptic Feedback Devices: These devices provide tactile feedback to the user, such as
vibration or force feedback, in response to specific gestures. Examples include haptic
feedback in smartphones and game controllers.
Augmented Reality (AR) and Virtual Reality (VR) Headsets: AR and VR headsets, like
the Oculus Rift and HoloLens, offer immersive experiences where gestures can control
virtual objects and environments.
Projectors: Projectors can display interactive content on various surfaces, enabling users
to interact with projected images and interfaces through gestures and touch.
Interactive Whiteboards: These large touchscreen displays are used in educational and
business settings to allow users to control and interact with digital content using gestures
and digital pens.
Smart TVs: Some modern smart TVs come with gesture control features, allowing users
to change channels, adjust volume, and navigate menus with hand movements.
Ambient Displays: Ambient displays use lights, colors, and motion to provide information
or convey messages, and users can interact with them through gestures or proximity.
Wearable Devices: Smartwatches and other wearable devices can provide gesture-based
feedback through vibrations and displays on the user's wrist.
These are just some of the many types of gesture input and output devices available, and
technology in this field continues to evolve, offering new and innovative ways to interact
with digital systems and environments.
Hand-Supported Displays
The user can hold the device in one or both hands in order to periodically view a synthetic scene.
This allows the user to go in and out of the simulation environment as demanded by the application.
It has push buttons that can be used to interact with the virtual scene.
Large Volume Displays
Large-volume displays are used in VR environments that allow more than one user located in close proximity.
Large-volume displays in augmented reality (AR) refer to systems or setups that enable
the projection of AR content into a physical space on a larger scale, often beyond the
confines of typical handheld devices or headsets. These displays can be used for various
purposes, such as virtual design and prototyping, immersive gaming experiences,
architectural visualization, and more. Here are some methods and technologies commonly
used for creating large-volume AR displays:
The Visual System
The human visual system can be regarded as consisting of two parts. The eyes act as image receptors which capture light and convert it into signals which are then transmitted to image processing centres in the brain.
These centres process the signals received from the eyes and build an internal
“picture” of the scene being viewed.
Processing by the brain consists partly of simple image processing and partly of higher functions which build and manipulate an internal model of the outside world.
Although the division of function between the eyes and the brain is not clear-cut, it
is useful to consider each of the components separately.
The basic structure of the eye is shown in the figure (a cross-section of a right eye).
The cornea and aqueous humour act as a primary lens which performs crude focusing of the incoming light signal.
A muscle called the zonula controls both the shape and positioning (forward and
backwards) of the eye’s lens.
This provides a fine control over how the light entering the eye is focused.
The iris is a muscle which, when contracted, covers all but a small central portion of the lens.
This allows dynamic control of the amount of light entering the eye, so that the eye
can work well in a wide range of viewing conditions, from dim to very bright light.
The portion of the lens not covered by the iris is called the pupil.
The retina provides a photo-sensitive screen at the back of the eye, onto which incoming light is focused.
Light hitting the retina is converted into nerve signals.
A small central region of the retina, called the fovea, is particularly sensitive
because it is tightly packed with photo-sensitive cells.
It provides very good resolution and is used for close inspection of objects in the
visual field.
The optic nerve transmits the signals generated by the retina to the vision
processing centres of the brain.
The retina is composed of a thin layer of cells lining the interior back and sides of
the eye.
Many of the cells making up the retina are specialised nerve cells which are quite
similar to the tissue of the brain.
Other cells are light-sensitive and convert incoming light into nerve signals which
are transmitted by the other retinal cells to the optic nerve and from there to the
brain.
There are two general classes of light-sensitive cells in the retina: rods and cones.
Rod cells are very sensitive and provide visual capability at very low light levels.
Cone cells perform best at normal light levels.
They provide our daytime visual facilities, including the ability to see in colour (which we discuss in the next chapter).
There are roughly 120 million rod cells and 6 million cone cells in the retina.
There are many more rods than cones because they are used at low light levels, and so more of them are required to gather the light.
Sound Displays
To ensure full immersion in VR systems, the spatial sounds need to match the spatial characteristics of the visuals – so if you see a car moving away from you in the VR environment, you will also expect to hear the car moving away from you.
Stereo sound
If we have two loudspeakers (stereo), we can move the perceived position of a sound
source anywhere along the horizontal plane between the two loudspeakers. We can ‘pan’
the sound to the left side by increasing the amplitude level of the left loudspeaker and
lowering the amplitude of the right loudspeaker.
If the sound is played at the same amplitude level through both loudspeakers it will be
heard as if coming from directly in between the two.
This technique of ‘amplitude panning’ to move sound sources between loudspeakers can
be scaled up and used across an array of multiple loudspeakers to reproduce three
dimensional surround sound.
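A standard way to implement amplitude panning in code is the constant-power pan law sketched below: a sine/cosine gain pair keeps the perceived loudness roughly constant as the source moves between the two loudspeakers.

```python
import math

def constant_power_pan(sample, pan):
    """Pan a mono sample between two loudspeakers.

    pan = 0.0 -> fully left, 0.5 -> centre, 1.0 -> fully right.
    """
    angle = pan * math.pi / 2
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    return sample * left_gain, sample * right_gain

# A mono sample panned hard left, centre, and hard right.
for pan in (0.0, 0.5, 1.0):
    left, right = constant_power_pan(1.0, pan)
    print(f"pan={pan:.1f}: L={left:.3f}, R={right:.3f}")
```

At the centre position both gains are about 0.707, so the summed power stays the same as at either extreme.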
But of course, not many people have access to a large number of loudspeakers arranged
in a sphere which can produce enveloping surround sound. Most VR/AR systems
designed for personal/home use rely on headphones to deliver surround sound to the
user.
So – how do we go about delivering a 360 degree immersive sound scene over a pair of
headphones?
Spatial sound
Take a moment to close your eyes and listen very carefully to the sounds all around you.
(If you’re somewhere really quiet then you might want to try this exercise somewhere with
more noise.) What can you hear? Where are those sounds coming from? What can you
hear in front of you? What can you hear behind you? What about above you, below you,
or to each side? Try rotating your head – do the sounds change?
Even with our eyes closed, our hearing system is very well evolved to allow us to locate
sounds in space around us – we call this sound localisation. But how does it work?
To localise sounds in space we rely on ‘binaural cues’. Our brain takes ‘cues’, information
about the level, timing and overall tone of the sound arriving at our left ear, and compares
it with the sound arriving at our right ear. Differences between the sounds in each ear help
us to work out where the sound is placed relative to our own position.
In How Your Ear Changes Sound we discovered how sounds are filtered by the outer ear,
mainly the pinna. Think about a sound placed to your right, slightly above your head. The
acoustic wave that reaches your ears from this sound has travelled directly to your right
ear, but has had to travel around your head to reach your left ear. The shape of your head
actually filters the sound, meaning that some frequencies are dampened and the overall
tone is altered – these are spectral cues.
There are two more types of binaural cue your brain can use:
The interaural level difference (ILD) – in our example the sound has travelled further to get
to your left ear, so it’s quieter because it’s lost more energy on the way
The interaural time difference (ITD) – in our example the sound reaches your right ear a
fraction of a millisecond before it reaches your left ear.
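These cues can be put into rough numbers. The sketch below estimates the interaural time difference with the classic spherical-head (Woodworth) approximation; the head radius and source angles are typical textbook values, not measurements.

```python
import math

HEAD_RADIUS_M = 0.0875     # typical textbook value for an average adult head
SPEED_OF_SOUND = 343.0     # m/s in air at room temperature

def interaural_time_difference(azimuth_deg):
    """Woodworth spherical-head approximation: ITD = (a / c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    itd_us = interaural_time_difference(az) * 1e6
    print(f"source at {az:3d} degrees -> ITD of roughly {itd_us:.0f} microseconds")
```

For a source directly to one side this gives an ITD in the region of 650 microseconds, which is the order of magnitude usually quoted for human listeners.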
Sound: Intensity, Frequency, Outer and Middle Ear Mechanisms, Impedance Matching by
Area and Lever Ratios
The auditory system changes a wide range of weak mechanical signals into a complex
series of electrical signals in the central nervous system. Sound is a series of pressure
changes in the air. Sounds often vary in frequency and intensity over time. Humans can
detect sounds that cause movements only slightly greater than those of Brownian
movement. Obviously, if we heard that ceaseless (except at absolute zero) motion of air
molecules we would have no silence.
Figure 12.2 depicts these alternating compression and rarefaction (pressure) waves
impinging on the ear. The pinna and external auditory meatus collect these waves, change
them slightly, and direct them to the tympanic membrane. The resulting movements of the
eardrum are transmitted through the three middle-ear ossicles (malleus, incus and stapes)
to the fluid of the inner ear. The footplate of the stapes fits tightly into the oval window of
the bony cochlea. The inner ear is filled with fluid. Since fluid is incompressible, as the
stapes moves in and out there needs to be a compensatory movement in the opposite
direction. Notice that the round window membrane, located beneath the oval window,
moves in the opposite direction.
Because the tympanic membrane has a larger area than the stapes footplate there is a
hydraulic amplification of the sound pressure. Also because the arm of the malleus to
which the tympanic membrane is attached is longer than the arm of the incus to which the
stapes is attached, there is a slight amplification of the sound pressure by a lever action.
These two impedance-matching mechanisms effectively transmit airborne sound into the fluid of the inner ear. If the middle-ear apparatus (eardrum and ossicles) were absent, then sound reaching the oval and round windows would be largely reflected.
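The combined effect of the two mechanisms can be estimated with commonly quoted textbook figures (an effective eardrum-to-footplate area ratio of roughly 17:1 and an ossicular lever ratio of roughly 1.3:1); the numbers below are approximate.

```python
import math

AREA_RATIO = 17.0    # effective tympanic membrane area / stapes footplate area (approx.)
LEVER_RATIO = 1.3    # malleus arm length / incus arm length (approx.)

pressure_gain = AREA_RATIO * LEVER_RATIO        # same force concentrated on a smaller area
gain_db = 20 * math.log10(pressure_gain)        # express the pressure ratio in decibels

print(f"Pressure amplification of roughly {pressure_gain:.0f}x ({gain_db:.0f} dB)")
# About 22x, or roughly 27 dB, which largely offsets the air-to-fluid impedance mismatch.
```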