Acoustics Presentation
Acoustics Seminar
Music 21700
Paul Kozel
Edits and addendums, Jonathan Perl
All Fig. numbers are examples from An Introduction to the Creation of Electroacoustic Music, by Samuel Pellman
(out of print).
What is Sound?
Sound starts with the movement of an elastic medium (e.g., a string). The movement of this medium
disturbs the surrounding air molecules, which creates regular and/or irregular patterns of compression
and rarefaction of these air molecules (sound propagation). Air molecules move toward each other during
compression and away from each other during rarefaction. High air pressure is created during compression while
low air pressure is created during rarefaction. These compressions and rarefactions are called vibrations.
Vibrations travel to the ear where they are processed into perceptions of either noise or pitched sound or
combinations of the two.
Noise is created by irregular patterns of compression and rarefaction, while pitched sound (sound with a
definite frequency) is created by regular patterns of compression and rarefaction. These vibrations radiate in
various directional patterns at the speed of sound. The speed of sound is dependent on the medium and the
temperature of the medium. Sound will travel through dry air at 72 °F at about 1,130 feet per second.
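The temperature dependence can be sketched numerically. The linear approximation used here (about 331.3 m/s at 0 °C, plus roughly 0.606 m/s per degree Celsius) is a common rule of thumb for dry air and is an assumption, not a figure from these notes. The function names are illustrative only.

```python
def speed_of_sound_fps(temp_f):
    """Approximate speed of sound in dry air, in feet per second.

    Assumption: the common linear rule of thumb c = 331.3 + 0.606 * T(C) m/s.
    """
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    meters_per_second = 331.3 + 0.606 * temp_c
    return meters_per_second / 0.3048  # convert meters to feet


def wavelength_ft(frequency_hz, temp_f=72.0):
    """Wavelength in feet: speed of sound divided by frequency."""
    return speed_of_sound_fps(temp_f) / frequency_hz


print(speed_of_sound_fps(72.0))  # close to the 1,130 ft/s quoted above
print(wavelength_ft(110.0))      # a 110 Hz wave is roughly 10 feet long
```

Under this approximation, colder air gives a slightly lower speed and therefore a slightly shorter wavelength for the same frequency.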
Irregular vibration = Noise
The cochlea is the part of the inner ear containing several membranes, including the basilar membrane.
Sounds excite the fluid within the inner ear via the action of the stapes, which is in contact with the oval window.
On the surface of the basilar membrane lies the organ of Corti, which contains between 20,000 and 30,000
reedlike fibers/hairs along its length. Their properties change along the membrane: they are short and stiff at
the start of the basilar membrane, and become increasingly longer and less stiff further down the membrane.
Because of their variable stiffness and length, the hairs have different resonant frequencies. Nearer to the
middle ear, the shorter stiffer hairs will be excited by higher-frequency components of the sound. Further on
down the membrane, the longer looser hairs will be excited by lower-frequency components of the sound. The
basilar membrane fires electrical impulses based on these excitations. These impulses travel via the cochlear
nerve, which sends them on to the cerebral cortex of our brains. Our brains, in turn, have the amazingly
complex job of interpreting and making sense of the impulses arriving from the basilar membrane. Among the
most crucial interpretations is that of pitch: we can perceive specific identifiable pitches based on the location of
firing of nerve impulses from the basilar membrane.
Because most pitched sounds are not sine waves but instead contain many sine waves at different frequencies
simultaneously, the concurrent firing from multiple locations can result not only in a perception of pitch but
also in a perception of tone or timbre.
If two fundamental tones are very close in pitch, they will excite areas that are very close to each other on the
basilar membrane. In this case, our brains may be unable to parse the two distinct pitches. Whichever tone is
louder and is causing the greater amplitude of vibration will be perceived as a distinct pitch, and the adjacent pitch
will be masked, only contributing to an increase in perceived volume.
Sounds with greater sound pressure amplitude will cause the hairs of the membrane to vibrate with a greater
range of motion/displacement. The resulting nerve impulses are interpreted by the brain as being of sounds with
a greater perceived loudness.
Frequency of a String
The frequency a string produces is dependent on three factors: string length, the tension of the string,
and the mass of the string. Longer strings produce lower pitches. Increased tension increases frequency.
Increased mass decreases frequency. All strings of a guitar are of the same length, so differences in frequency
between the strings depend not on length but on tension and mass. The tension set by the tuning pegs and the
different string thicknesses combine to create six distinct frequencies. Note that some strings are made of a
solid core (top strings) while others are a solid core covered with an over-wound wire (bottom strings).
The over-winding increases the mass and therefore lowers the pitch of the string. The piano uses a
combination of differences in string length, tension, and mass to produce different frequencies.
The longer the string = The lower the frequency
The more tension on a string = The higher the frequency
The more mass of a string = The lower the pitch
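These three relationships can be checked with a small sketch. The formula below is the standard textbook expression for an ideal string (often called Mersenne's law), f = (1 / 2L) x sqrt(T / m), where m is the mass per unit length. It is not taken from these notes, and the example values are hypothetical.

```python
import math


def string_frequency(length_m, tension_n, mass_per_length_kg_m):
    """Fundamental frequency (Hz) of an ideal string.

    Assumption: ideal-string formula f = sqrt(T / m) / (2 * L).
    """
    return math.sqrt(tension_n / mass_per_length_kg_m) / (2.0 * length_m)


# Hypothetical string: 0.65 m long, 70 N of tension, 0.4 g per meter.
base = string_frequency(0.65, 70.0, 0.0004)
print("base frequency:", base)
print("double the length:", string_frequency(1.30, 70.0, 0.0004))   # lower
print("four times the tension:", string_frequency(0.65, 280.0, 0.0004))  # higher
print("four times the mass:", string_frequency(0.65, 70.0, 0.0016))  # lower
```

In this idealized model, doubling the length halves the frequency, while tension and mass act through a square root, so it takes four times the tension to raise the pitch by an octave.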
Constructive Interference
When two identical waves are in phase (starting at the same time) the amplitudes of these waves are
summed. This is called constructive interference or phase summation (Fig. 1.10).
Destructive Interference
If two identical waves are 180° out of phase they will cancel each other out completely, since they have
opposite yet equal energy. No sound will be perceived by the listener. This is called destructive interference,
or phase cancellation (Fig. 1.11). If these two waves are 180° out of phase but have different amplitudes, only
partial phase cancellation will occur. Partial phase cancellation reduces the amplitude of the combined wave based
on the difference in amplitude: the closer the two waves are in amplitude, the greater the decrease in amplitude.
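Both kinds of interference can be illustrated by summing two sine waves of the same frequency sample by sample. This is a minimal sketch. The function name and the sample count are illustrative choices, not part of the original notes.

```python
import math


def combined_peak(amp_a, amp_b, phase_b_degrees, samples=1000):
    """Peak amplitude of the sum of two equal-frequency sine waves over one cycle."""
    phase_b = math.radians(phase_b_degrees)
    peak = 0.0
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        total = amp_a * math.sin(t) + amp_b * math.sin(t + phase_b)
        peak = max(peak, abs(total))
    return peak


print(combined_peak(1.0, 1.0, 0))    # in phase: amplitudes sum
print(combined_peak(1.0, 1.0, 180))  # 180 degrees apart, equal amplitude: cancels
print(combined_peak(1.0, 0.5, 180))  # 180 degrees apart, unequal: partial cancellation
```

The third case leaves a residue equal to the difference between the two amplitudes, which is why waves that are closer in amplitude cancel more completely.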
Polarity
Polarity simply refers to energy relationships between two identical waves. It does not refer to time
relationships. In the example below there is no time difference between the two identical waves. They both start
at the exact same time. However, there is a polarity difference since these waves are of equal but opposite
electrical energy. If these were acoustical waves they would be of equal but opposite air pressure. In the
example below the waves are said to have opposite polarities, or to be polarity reversed. Two identical sine waves
that have opposite polarity or are phase shifted 180° create the same result: zero output. However, they create this
effect by different means. (180° phase-shifting of waveforms that are more complex than sine waves - or of actual
audio tracks - does not result in total cancellation unless the waveforms are symmetrical - e.g., triangle waves and
square waves: see below for the complete discussion of phase-shifting effects, and also the section on geometric
waveforms.)
Polarity Reversal
Phase Shift and Frequency
The amount of phase shift that results from a specific, fixed, time delay varies with frequency. In the
recording studio a digital delay processor has the ability to delay any frequencies that are sent to it by a specific
time value usually given in milliseconds. The digital delay processor then combines the original non-delayed
signal with the delayed copy to create various effects such as echo, phasing, flanging, and chorusing.
In the example below, if the four waveforms were delayed 5 milliseconds each signal would be phase
shifted by different amounts due to the fact that their periods are of varying length. So when these delayed
signals are combined with the non-delayed signals each frequency will have a different phase shift. In the
example below you can see that a 5 millisecond delay applied to a 400 Hertz waveform will create a 720° phase
shift (2 cycles, or 360° x 2), while a 200 Hertz wave will be phase shifted 360° (1 cycle).
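The arithmetic behind these figures is the delay expressed as a number of cycles, multiplied by 360 degrees. A minimal sketch follows. The function name is illustrative, and the 100 Hz and 800 Hz entries are added here as hypothetical examples, since only the 200 Hz and 400 Hz cases are stated in the text.

```python
def phase_shift_degrees(frequency_hz, delay_ms):
    """Phase shift produced by a fixed time delay at a given frequency."""
    cycles = frequency_hz * (delay_ms / 1000.0)  # how many periods fit in the delay
    return 360.0 * cycles


for freq in (100, 200, 400, 800):
    print(freq, "Hz delayed 5 ms ->", phase_shift_degrees(freq, 5), "degrees")
```

Because the same 5 ms covers more cycles of a higher frequency, the phase shift grows in direct proportion to frequency.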
[Figure: block diagram of a digital delay processor. Labeled elements: input signal, A/D converter, clock,
D/A converter, delayed/processed signal, dry/unprocessed signal, mix/output.]
between two identical waves are less than 30 to 40 milliseconds, listeners are actually detecting changes in the
comb-filter patterns!
40 - 60 ms.
60 - 100 ms.
Flutter.
rapid reiterations.
differences can affect the frequency response, spatial placement, and depth perception of the sounds emanating
from a pair of speakers. Sitting in the sweet spot between two speakers is simply being equidistant from each of
the speakers so that they will produce the most accurate picture of the audio.
[Figure: microphones A and B at different distances from a sound source, shown for a 110 Hz wave and a 55 Hz wave.
Distances of 10' are marked along the waves, with phase positions of 90°, 180°, and 270° labeled.]
Modes Of Vibration
Inversely Proportional Law- The frequency (number of cycles per second) of a vibrating string is inversely
proportional to its length. In short, as frequency increases, the length of string decreases. Conversely, as the
frequency decreases, the length of the string increases. If you play a single guitar string, you can produce
progressively higher frequencies by shortening the vibrating length of the string (see diagram below).
[Figure: a string stretched between neck and bridge, shown three times with progressively shorter vibrating lengths.]
The ratio that exists between any two frequencies is called an interval. The interval created when the
string vibrates as a whole (open string) and when it vibrates at half its length has a ratio of 1:2. Therefore, if the
open string is 440 Hertz then the string vibrating at half its length is 880 Hertz or 440 x 2. This interval is called
the octave. Each doubling in frequency produces an octave interval. It represents a fixed ratio, not a specific
number of Hertz between frequencies (for example all octaves being 100 Hertz apart). Therefore, it is logical to
deduce that the intervals express a logarithmic relationship.
The graph below shows the relationship between the open string (vibrating as a whole) and other
vibrating modes 1/2, 1/3, 1/4 etc. expressed in ratio, interval name, and frequency. These ratios and their
resultant intervals are called pure intervals because they exist in natural vibrating systems like string and wind
instruments.
Vibrating mode   Frequency (Hz)   Multiple   Ratio to previous mode   Interval
Whole (open)     67               f x 1
1/2              134              f x 2      1:2                      octave
1/3              201              f x 3      2:3                      fifth
1/4              268              f x 4      3:4                      fourth
1/5              335              f x 5      4:5                      maj. 3rd
The lower number of a ratio expresses the lower frequency while the higher number the higher frequency.
The frequencies of 67 Hertz and 134 Hertz (2x67) create the ratio of 1:2. 134 Hertz (2x67) and 201 Hertz (3x67)
create the ratio of 2:3. 201 Hertz (3x67) and 268 Hertz (4x67) create the ratio of 3:4, etc. The interval names
listed above represent some of the consonant intervals. These intervals have simple ratios and are considered
pleasant or neutral sounding. Dissonant intervals are expressed with more complex ratios and are considered
harsher sounding than consonant intervals. The diminished fifth is a dissonant interval due to its sound and
the ratio of 45:32.
The formulas below demonstrate how to determine the second frequency of an interval when one is provided.

Interval      Ratio   Up or Down   Starting Pitch   Formula
Perfect 5th   3:2     Up (2:3)     A-440            440 x 3/2 = 660 Hz
Perfect 5th   3:2     Down (3:2)   A-440            440 x 2/3 = 293.3 Hz
Perfect 4th   4:3     Up (3:4)     A-440            440 x 4/3 = 586.7 Hz
Major 3rd     5:4     Down (5:4)   A-440            440 x 4/5 = 352 Hz
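The same calculations can be written as a small helper: multiply by the ratio to go up, and by its reciprocal to go down. This is an illustrative sketch. The function and constant names are not from the original notes.

```python
from fractions import Fraction

PERFECT_FIFTH = Fraction(3, 2)
PERFECT_FOURTH = Fraction(4, 3)
MAJOR_THIRD = Fraction(5, 4)


def interval_frequency(start_hz, ratio, direction="up"):
    """Frequency reached by moving a pure interval up or down from start_hz."""
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    factor = ratio if direction == "up" else 1 / ratio
    return float(start_hz * factor)


print(interval_frequency(440, PERFECT_FIFTH, "up"))
print(interval_frequency(440, PERFECT_FIFTH, "down"))
print(interval_frequency(440, PERFECT_FOURTH, "up"))
print(interval_frequency(440, MAJOR_THIRD, "down"))
```

Using exact fractions for the ratios keeps the reciprocal (for example 2/3 for a fifth down) free of rounding until the final conversion to Hertz.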
An equal tempered tuning system divides the octave into twelve equal half steps, one for each note of the
chromatic scale. Intervals are not derived from pure or just ratios but by using a constant frequency multiple
between adjacent notes of the chromatic scale (1.059463, the twelfth root of 2). This makes every half step, and
therefore every interval of a given size, identical in every key. Since all half steps are equal in this system,
we further divide the half step into cents. There are 100 cents in every half step, so 50 cents would equal a
quarter step.
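A short sketch of the tempered system follows, assuming the half-step multiple is the twelfth root of 2 (which rounds to the 1.059463 quoted above) and that cents are measured on a logarithmic scale of 1,200 per octave. The function names are illustrative.

```python
import math

HALF_STEP = 2 ** (1 / 12)  # constant multiple between adjacent notes, about 1.059463


def tempered_frequency(start_hz, half_steps):
    """Frequency a given number of equal-tempered half steps above (or below) start_hz."""
    return start_hz * HALF_STEP ** half_steps


def cents_between(freq_a, freq_b):
    """Size of the interval from freq_a to freq_b in cents (100 per half step)."""
    return 1200.0 * math.log2(freq_b / freq_a)


print(tempered_frequency(440, 12))                     # twelve half steps: one octave
print(tempered_frequency(440, 7))                      # tempered fifth above A-440
print(cents_between(440, tempered_frequency(440, 7)))  # size of the tempered fifth
print(cents_between(440, 660))                         # size of the pure 2:3 fifth
```

Comparing the last two lines shows how far the tempered fifth sits from the pure 2:3 fifth when both are measured in cents.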
Overtone Series
All strings have the ability to vibrate in many different modes simultaneously. Strings can vibrate as a
whole (a below), in halves (b below), in thirds (c below), etc. all at the same time. The entire string is moving in
each mode but various parts of the string may be moving in different directions as demonstrated below.
[Figure: a string between neck and bridge shown in three modes: (a) whole string vibrating; (b) vibrating in
halves; (c) vibrating in thirds.]
A string vibrating in two equal halves (a below) will produce the same frequency that half of the string produces (b
below). A string vibrating in thirds will produce the same frequency that 1/3 of the string produces, etc.
[Figure: (a) the full string between neck and bridge vibrating in halves; (b) a string with half the vibrating
length.]
Partial #   Overtone/Harmonic   Frequency (Hz)   Mode of Vibration   String Length   Ratio to Previous Partial   Interval
1           Fund (f x 1)        67               Whole               1 yd
2           f x 2               134              Halves              1/2 yd          1:2                         Octave
3           f x 3               201              Thirds              1/3 yd          2:3                         Fifth
4           f x 4               268              Fourths             1/4 yd          3:4                         Fourth
5           f x 5               335              Fifths              1/5 yd          4:5                         Maj 3rd
6           f x 6               402              Sixths              1/6 yd          5:6                         Min 3rd
7           f x 7               *469             Sevenths            1/7 yd          6:7                         *
8           f x 8               536              Eighths             1/8 yd          7:8                         **
*The interval between the 6th and 7th partials (created by the ratio 6:7) does not yield a true
B flat in the Equal Tempered System of tuning. A B flat a pure minor third (ratio 5:6) above the
6th partial would have a frequency of 482.4 Hz, and an equal-tempered minor third above it would
be about 478 Hz; the 7th partial, at 469 Hz, is noticeably flatter than either. The interval created
by the ratio 6:7 exists as a pure interval of the overtone series but does not exist in the tempered
tuning system we employ.
**Since the pure ratio 6:7 does not create a tempered B flat, the interval from the 7th
partial to the 8th is not a major 2nd in the tempered system. The interval created
by the ratio 7:8 exists only as a pure interval of the overtone series. A tempered major second
is close to the pure ratio 8:9 (about 1:1.1225 versus 1:1.125).
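The series and the size of each step can be listed with a short sketch. Measuring the steps in cents (1,200 per octave, so 300 for a tempered minor third and 200 for a tempered major second) is a convenience added here, and the function names are illustrative.

```python
import math


def partials(fundamental_hz, count=8):
    """Frequencies of the first `count` partials: whole-number multiples of the fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200.0 * math.log2(ratio)


series = partials(67)
for n in range(1, len(series)):
    lower, upper = series[n - 1], series[n]
    print(f"{n}:{n + 1}  {lower} Hz -> {upper} Hz  {cents(upper / lower):.1f} cents")
```

The 6:7 and 7:8 steps fall well away from the nearest multiples of 100 cents, which is the sense in which they have no counterpart in the tempered system.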
Geometric Waveforms
The different combinations of present partials and their associated amplitudes create various waveforms.
The names of some of the standard geometric waveforms refer to the shapes they create when viewed with an
oscilloscope. Geometric waveforms use static or fixed amplitudes for all of the partials.
Sine Wave
The sine wave has only a single partial. No overtones exist in a sine wave.
Sine Wave
Triangle Wave
A = Amplitude
n = partial number
Square Wave
Sawtooth Wave
http://www.audiocheck.net/audiofrequencysignalgenerator_sinetone.php
Harmonic/Inharmonic Spectra
If all of the partials in a given waveform are whole-number multiples of the fundamental, the wave is said
to have harmonic spectra. (Fig. 7.2) Conversely, if some of the partials have a fractional relationship to the
fundamental the wave has inharmonic spectra (Fig. 7.3).
Creating complex waves by adding together different individual sine waves is called Fourier synthesis (Fig. 7.5).
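Fourier synthesis can be sketched by summing sine partials sample by sample. The amplitude recipes below (odd partials at amplitude 1/n for a square wave, every partial at 1/n for a sawtooth) are the usual textbook recipes and are assumptions here, not formulas taken from these notes. All names are illustrative.

```python
import math


def additive_wave(partial_amplitudes, samples=200):
    """One cycle built by summing sine partials given as {partial_number: amplitude}."""
    wave = []
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        wave.append(sum(a * math.sin(n * t) for n, a in partial_amplitudes.items()))
    return wave


def square_recipe(highest_partial):
    # Assumed standard recipe: odd partials only, amplitude 1/n.
    return {n: 1.0 / n for n in range(1, highest_partial + 1, 2)}


def sawtooth_recipe(highest_partial):
    # Assumed standard recipe: every partial, amplitude 1/n.
    return {n: 1.0 / n for n in range(1, highest_partial + 1)}


sine = additive_wave({1: 1.0})              # a single partial, no overtones
square = additive_wave(square_recipe(99))   # flattens toward +/- pi/4
sawtooth = additive_wave(sawtooth_recipe(99))
print(sine[50], square[50], square[150])
```

Adding more partials sharpens the corners of the square and sawtooth shapes, while the single-partial case remains a plain sine wave.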
Formant Regions
Sympathetic vibration or resonance occurs when a vibrating wave comes in contact with an elastic
medium and sets this medium in motion. Some vibrating spaces, due to the material of construction, size or
shape, will emphasize (add volume to) one or more frequencies due to sympathetic vibration. When these
frequencies are played the vibrating space will begin to sympathetically vibrate, causing an increase in amplitude
of those frequencies. These emphasized frequencies are called formant regions. (Fig. 7.7)
Formant regions can also occur due to the absorption coefficient (ability to absorb acoustical energy) of
materials in a vibrating space. All materials employed in a vibrating space (whether inside a violin or a concert
hall) will have a unique absorption coefficient rating. All ratings are done by frequency range (amount of
absorption in a particular frequency band). Some materials will absorb high frequencies while others are best
suited for low frequency absorption. The combination of all of the materials in a vibrating space and their unique
absorption capabilities will create an overall frequency response with characteristic formant regions.
Another cause of formant regions is the overall dimensions and layout of a vibrating space that creates favorable
or unfavorable interactions between vibrations in the space. Waves emanating from one part of the vibrating
space (incident wave) can reach a boundary and reflect back (reflected wave) in the direction of the incident
wave. In some instances the interaction of the incident and reflected waves will create varying degrees of
constructive interference that in turn creates a formant region. The interaction of the incident and reflected waves
resulting in constructive interference is called a standing wave. Standing waves are created by parallel surfaces.
Standing waves can be reduced or eliminated by avoiding parallel surfaces.
Measuring Amplitude - Objective, Empirical Measurement - Power and Intensity
The amplitude of a sound is the direct result of the amount of energy used to create the sound. One
empirical/objective measurement of amplitude can be accomplished by simply determining the amount of
energy/work spent over the period of one second to create the sound. This is a measurement of power.
W = E/T
W = Power E = Energy (measured in joules) T = Time (in seconds)
Another way of creating an empirical measurement of amplitude is to determine the amount of
energy/work that is dissipated in a vibrating space over the period of one second. This is a measurement of
intensity.
I = W/S
I = Intensity W = Power (measured in watts) S = surface area (in square meters)
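The two formulas can be chained in a minimal sketch. The numbers are hypothetical and chosen only to make the arithmetic easy to follow. The function names are illustrative.

```python
def power_watts(energy_joules, time_seconds):
    """W = E / T: energy spent per second."""
    return energy_joules / time_seconds


def intensity(power_w, surface_area_m2):
    """I = W / S: power spread over a surface area, in watts per square meter."""
    return power_w / surface_area_m2


# Hypothetical source: 10 joules spent over 2 seconds, spread over 2.5 square meters.
source_power = power_watts(10.0, 2.0)
print(source_power)                    # watts
print(intensity(source_power, 2.5))    # watts per square meter
print(intensity(source_power, 10.0))   # same power over a larger area: lower intensity
```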
An analogy to these types of measurements can be seen by looking at the inherent power of a light bulb
(determined by its construction and the electrical energy applied to it) and its resultant intensity in a room. This
intensity relies on the power of the light bulb but also the color of the walls, the size of the space, etc. Therefore,
in order for two rooms to have the same intensity of light they must both be of the exact same construction and
use the exact same power source.
A piano will have inherent power characteristics (determined by its construction and the mechanical
energy applied by the performer) but the perceived intensity will be determined by the distance from the piano and
the construction of the room that the piano is in.
[Figure: a power source inside room boundaries. Caption: intensity determined by room characteristics.]
The graph below demonstrates how incremental doubling of intensity/power is perceived as a progressive
increase in loudness in equal increments.
[Figure: logarithmic change in intensity versus linear loudness perception. Incremental doubling of intensity
(1:2, 2:4, 4:8, 8:16, 16:32, 32:64) produces equal increments of loudness; the listener perceives a progressive
increase in loudness in equal increments.]
Amplitude measurements for all audio signals (electrical or acoustic) reflect the subjective way in which
we perceive changes in amplitude. This system employs logarithmic power/intensity ratios to provide linear
changes in loudness. The increments of amplitude change are called bels, named after Alexander Graham Bell.
The bel represents a 10:1 power/intensity ratio. A 10:1 power ratio results in a doubling of perceived loudness
(this figure is disputed in the literature: loudness perception is affected by many factors and is quite variable and
subjective - see below section on the equal loudness curves). Each bel can be subdivided into 10 decibels or
dB. The decibel represents the unit of measurement that corresponds roughly to the amount a sound must be
raised in level to be heard as louder. Each incremental increase in dB is perceived as an equal change in
loudness that is accomplished with a logarithmic change in intensity.
[Figure: SUBJECTIVE PERCEPTION/LOUDNESS. A ladder rising from the point of reference (threshold of hearing)
through 1, 2, 3, 4, and 5 bels, each step a 10:1 power ratio and subdivided into decibels (dB). Linear (equal)
loudness changes are perceived between one adjacent bel or decibel and another. Each bel is 10x the
intensity/power, or twice the loudness, of the previous bel.]
The diagram above shows the logarithmic relationship from one bel to the next. Note that all
measurements are using the threshold of hearing as the point of reference. The point of reference is an
absolute, empirical measurement (in power or intensity values) that represents the energy required to create the
softest sound an average person can hear. This is the value that all other amplitudes are compared to.
The diagram below demonstrates the logarithmic ratios and their linear equivalents in the decibel. Note
that a doubling of intensity/power (2:1 ratio) will always result in an increase of 3 dB.
Logarithmic Ratios   Decibels (dB)
1:2                  3.01    (doubling intensity = 3 dB increase)
1:3                  4.77
1:4                  6.02
1:5                  6.99
1:6                  7.78
1:7                  8.45
1:8                  9.03
1:9                  9.54
1:10                 10.00   (1 bel)
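These values follow from the definition of the bel as a 10:1 power ratio divided into 10 decibels, which gives the standard relationship dB = 10 x log10(ratio). A minimal sketch with illustrative function names:

```python
import math


def ratio_to_decibels(power_ratio):
    """Decibel value of a power/intensity ratio: 10 * log10(ratio)."""
    return 10.0 * math.log10(power_ratio)


def decibels_to_ratio(decibels):
    """Power/intensity ratio corresponding to a decibel value."""
    return 10.0 ** (decibels / 10.0)


for ratio in range(2, 11):
    print(f"1:{ratio}  {ratio_to_decibels(ratio):.2f} dB")

print(ratio_to_decibels(2 * 2))    # two doublings: about 6 dB
print(decibels_to_ratio(130.0))    # 13 bels expressed as a power ratio
```

Because the scale is logarithmic, multiplying ratios corresponds to adding decibels, so each doubling contributes the same increase of about 3 dB.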
The ear can perceive an incredibly wide range of amplitude change, from the threshold of hearing up to an
intensity ratio of approximately 10,000,000,000,000:1, which represents the threshold of pain. The table
below lists some typical sound sources, their power/intensity ratio compared to the threshold of
hearing, that ratio expressed as a power of ten (log), and the equivalent level in bels and decibels.
Sound                              Power/Intensity Ratio    Log     Bel/dB
Threshold of pain                  10,000,000,000,000:1     10^13   13/130
Jet taking off from 500 ft. away   1,000,000,000,000:1      10^12   12/120
Loud band in a club                100,000,000,000:1        10^11   11/110
Power saw                          10,000,000,000:1         10^10   10/100
Subway                             1,000,000,000:1          10^9    9/90
Heavy highway traffic              100,000,000:1            10^8    8/80
Busy street traffic                10,000,000:1             10^7    7/70
Close normal conversation          1,000,000:1              10^6    6/60
Office background noise            100,000:1                10^5    5/50
Quiet conversation                 10,000:1                 10^4    4/40
Waiting room in doctor's office    1,000:1                  10^3    3/30
Recording studio ambient level     100:1                    10^2    2/20
Breathing                          10:1                     10^1    1/10
Threshold of hearing               1:1                      10^0    0/0
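[Figure: amplitude envelope plotted as amplitude versus time, with rates R1, R2, and R3 and levels L1 and L2
(R = Rate, L = Level). "Finger Down" and "Finger Up" mark the key press and release.]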
As the diagram above indicates, these shapes, when plotted on a graph, consist of time values (rate) and
amplitude values (level). R1 represents how long it takes the sound to go from no sound to the first amplitude
peak (L1). R2 represents the time it takes to go from the first amplitude peak (L1) to the second amplitude
peak (L2). Finally, R3 represents the time it takes to go from L2 to no sound at all.
In order to create envelope shapes on a synthesizer, the rate and level values are used to instruct an
amplifier how and when to open and close. If you want the sound above to sustain as long as the
note is held down, you need to tell the synthesizer that L2 is the sustain level. Now when you depress a key, the
synthesizer runs through R1, L1, and R2, and holds on L2 as long as your finger is down. When your finger is
released from the keyboard, the synthesizer runs through R3 and then the sound stops (see Fig. 1.8 on next
page for various amplitude/intensity envelopes).
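The rate/level behavior described above can be sketched as a function of time. This is a simplified illustration, not how any particular synthesizer implements it: the segments are assumed to be straight lines, the key is assumed to be held until after L2 is reached, and all names and example values are hypothetical.

```python
def envelope_level(t, r1, l1, r2, l2, r3, key_up_time):
    """Amplitude at time t (seconds) for a rate/level envelope that sustains at L2.

    Assumptions: linear segments, and key_up_time is later than r1 + r2.
    """
    if t < 0:
        return 0.0
    if t >= key_up_time:
        # Finger up: fall from L2 to silence over R3.
        elapsed = t - key_up_time
        return max(0.0, l2 * (1.0 - elapsed / r3))
    if t < r1:
        # R1: rise from silence to the first peak L1.
        return l1 * t / r1
    if t < r1 + r2:
        # R2: move from L1 to L2.
        return l1 + (l2 - l1) * (t - r1) / r2
    # Finger still down: hold the sustain level L2.
    return l2


# Hypothetical settings: R1 = 0.1 s, L1 = 1.0, R2 = 0.2 s, L2 = 0.6, R3 = 0.5 s, key released at 1.0 s.
for t in (0.05, 0.1, 0.5, 1.25, 2.0):
    print(t, envelope_level(t, 0.1, 1.0, 0.2, 0.6, 0.5, 1.0))
```

Holding the key longer only lengthens the flat L2 portion, since R3 does not begin until the finger comes up.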
Noise
Irregular or random vibrations create noise. Noise can be created on a synthesizer with a noise generator. The
noise generator produces all frequencies in the audible spectrum and assigns random amplitudes to these
frequencies. Noise generators are defined by color, the color representing the part of the frequency spectrum (as
with light) that is emphasized. White noise is defined as having equal energy per unit frequency. This means
that there is the same amount of energy between 500 Hz and 501 Hz as there is between 1500 Hz and 1501 Hz.
Equal energy per unit frequency means that the noise is spectrally flat: its response is the same for all
frequencies. Pink noise contains equal energy per octave. For example, there is the same amount of energy
between 50 and 100 Hz as there is between 7,500 and 15,000 Hz. Because the lower octaves span far fewer Hertz,
pink noise has more energy per Hertz in the lower part of the frequency spectrum than in the upper part. Pink
noise derives its name by analogy with light: light weighted toward the low-frequency (red) end of the visible
spectrum appears pinkish.
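The difference between the two definitions can be sketched by comparing the energy in a frequency band under each one. These are idealized spectra, and the unit constants and function names are illustrative assumptions.

```python
import math


def white_band_energy(low_hz, high_hz, energy_per_hz=1.0):
    """Idealized white noise: energy is proportional to the width of the band in Hertz."""
    return energy_per_hz * (high_hz - low_hz)


def pink_band_energy(low_hz, high_hz, energy_per_octave=1.0):
    """Idealized pink noise: energy is proportional to the number of octaves in the band."""
    return energy_per_octave * math.log2(high_hz / low_hz)


print(white_band_energy(500, 501), white_band_energy(1500, 1501))  # equal per Hertz
print(pink_band_energy(50, 100), pink_band_energy(7500, 15000))    # equal per octave
print(white_band_energy(50, 100), white_band_energy(7500, 15000))  # white: far more in the upper octave
```

The last line shows why white noise sounds brighter than pink noise: each higher octave is twice as wide in Hertz, so it carries twice the energy when energy per Hertz is constant.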