Acoustic Phonetics 2017-18
Anna Sfakianaki
Phonetician/Linguist (PhD)
Laboratory Teaching Staff, CSD, UoC
CS-578 Digital Speech Signal Processing
Spring Term 2017-18
University of Crete
Anna Sfakianaki, CS-578, 2017-18, University of Crete
A COURSE IN PHONETICS
QUESTIONS
https://home.cc.umanitoba.ca/
1. Formants
Sounds differ from each other in three ways
pitch
loudness/intensity
quality
A vowel sound contains a number of different pitches
simultaneously
pitch at which it was spoken
various overtone pitches that give it its distinctive quality
Vowel Quality: Overtone Structure
Overtones = Formants
The lowest 3 formants distinguish vowels from each other
F1 F2 F3
Articulate [i, e, a, o, u]
without producing sound.
What do you observe?
Pitch of F1 going up for [i, e] and down for [a, o, u]
Formants that characterize different vowels are the result of the different shapes of the vocal tract.
Any body of air will vibrate in a way that depends on its size and shape.
Blow across the top of
an empty bottle
a partially filled bottle
What do you observe?
[Figure: vocal tract shapes for the vowels in heed [i], had [æ], hard, hood and French tu. Adapted from Fant (1960)]
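The bottle experiment can be sketched numerically with the idealized Helmholtz resonator formula, f = (c / 2π) · sqrt(A / (V · L)). The sketch below is only an illustration: the bottle dimensions are hypothetical, the function name `helmholtz_frequency` is made up for this example, and the formula ignores the end correction of the neck. It shows that a smaller body of air (the partially filled bottle) resonates at a higher pitch.

```python
import math

def helmholtz_frequency(neck_area_m2, neck_length_m, cavity_volume_m3,
                        speed_of_sound=343.0):
    """Idealized Helmholtz resonance: f = (c / 2*pi) * sqrt(A / (V * L))."""
    return (speed_of_sound / (2 * math.pi)
            * math.sqrt(neck_area_m2 / (cavity_volume_m3 * neck_length_m)))

# Hypothetical bottle: 1 cm neck radius, 5 cm neck, 0.75 litre body.
neck_area = math.pi * 0.01 ** 2
neck_length = 0.05
empty_volume = 0.75e-3           # all of the body is air
half_full_volume = 0.375e-3      # half of the body is air

f_empty = helmholtz_frequency(neck_area, neck_length, empty_volume)
f_half = helmholtz_frequency(neck_area, neck_length, half_full_volume)

print(f"empty bottle:     {f_empty:.0f} Hz")
print(f"half-full bottle: {f_half:.0f} Hz")
```

Halving the air volume raises the resonance by a factor of sqrt(2) under this model, which parallels how a smaller vocal tract cavity produces a higher-frequency resonance.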
1) Vocal folds open and close, sending out pulses of acoustic energy at different pitches and amplitudes.
One pulse in the vocal tract produces three different waveforms.
In vowels, we actually hear the sum of these waveforms added together.
The air in front of the tongue, a smaller cavity, produces a higher-frequency waveform.
Sum of waveforms
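A minimal sketch of the idea, assuming that each waveform excited by one glottal pulse can be approximated by a decaying sine wave. The three frequencies, the decay rate and the sample rate are hypothetical values chosen for illustration, not measurements from the slides.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed)

def damped_sine(freq_hz, decay_per_s, n_samples, sample_rate=SAMPLE_RATE):
    """One decaying sinusoid, a stand-in for a single cavity resonance."""
    return [
        math.exp(-decay_per_s * n / sample_rate)
        * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]

# Hypothetical resonance frequencies for the three waveforms.
formant_freqs = [500.0, 1500.0, 2500.0]
n_samples = 160  # 10 ms at 16 kHz

components = [damped_sine(f, 300.0, n_samples) for f in formant_freqs]

# What we hear: the sample-by-sample sum of the three waveforms.
summed = [sum(values) for values in zip(*components)]

print(len(summed), "samples in the summed waveform")
```

Plotting `components` and `summed` side by side reproduces the kind of picture shown on this slide.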
SOME HISTORY…
The general theory of formants was stated by the great German
scientist Hermann von Helmholtz (1821-1894) about 150 years ago.
His scientific work covers the disciplines of Physiology, Psychology (theories of vision), Physics (energy, electrodynamics, thermodynamics), philosophy and aesthetics.
He was Hertz’s supervisor and during his studies in 1879 he suggested that Hertz's doctoral dissertation be on testing Maxwell's theory of electromagnetism, resulting in Hertz’s discovery of electromagnetic waves.
Helmholtz invented the Helmholtz resonator to identify the various frequencies or pitches of the pure sine wave components of complex sounds containing multiple tones.
[Figure: Helmholtz Resonator]
The Helmholtz resonator inspired Alexander Graham Bell to invent the
telephone based on the harmonic telegraph principle.
http://www.phonetics.ucla.edu/course/chapter8/speechbird/speechbird.html
2. Acoustic Analysis
It is possible to analyze sounds so that we can measure the actual frequencies of the formants and represent them graphically.
[Figure: average of F1, F2 and F3 frequencies in eight American English vowels: heed, hid, head, had, hod, hawed, hood, who’d]
2.1 Spectrogram
Computer programs can analyze sounds and show their components.
The display produced is called a spectrogram.
In spectrograms
horizontal axis: time
vertical axis: frequency
Spectrograms
Dark bands for concentrations of energy at particular frequencies, showing the source and filter characteristics of speech
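A rough sketch of how a program can build a spectrogram: cut the signal into short overlapping frames, window each frame, and take the magnitude of its discrete Fourier transform. This is a plain-Python illustration under assumed parameters (8 kHz sample rate, 64-sample Hann-windowed frames, a synthetic 1000 Hz tone); it is not how Praat or Wavesurfer are implemented, and the function name `stft_magnitudes` is made up for this example.

```python
import cmath
import math

def stft_magnitudes(signal, frame_len, hop):
    """Return one list of DFT magnitudes (bins 0..frame_len/2) per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

SAMPLE_RATE = 8000   # assumed
FRAME_LEN = 64       # 8 ms frames
tone_hz = 1000.0     # synthetic test tone standing in for speech

signal = [math.sin(2 * math.pi * tone_hz * n / SAMPLE_RATE) for n in range(800)]
spec = stft_magnitudes(signal, frame_len=FRAME_LEN, hop=32)

# Each row of `spec` is one time slice; each column is one frequency bin.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(f"{len(spec)} frames, strongest energy near {peak_hz:.0f} Hz")
```

Rendering `spec` as an image, with larger magnitudes drawn darker, gives the familiar dark bands at the frequencies where energy is concentrated.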
2.2 Computer programs for acoustic analysis (free access)
Praat
http://www.fon.hum.uva.nl/praat/
University of Amsterdam
Wavesurfer
http://www.speech.kth.se/wavesurfer/
KTH (Royal Institute of Technology, Stockholm)
[Figure: chart of tongue height, based on X-ray data]
F1
increases from [i] to [æ] as vowel height decreases.
decreases from [ɑ] to [u] as vowel height increases.
Hence F1 is inversely related to vowel height.
F2
higher for front vowels
lower for back vowels
affected by lip rounding: decrease of F2 & F3
2.5 F1 by F2 plot
[Figure: F1 by F2 vowel plot; axes F2 and F1]
COMPARISON
[Figure: traditional vowel diagram compared with F1 by F2 plot; axes F2 and F1]
“Traditional vowel diagrams express acoustic facts in terms of physiological fantasies.” Oscar Russell (1930s)
Vowel height: F1, not actually tongue height
Backness: front-back dimension + lip rounding
Degree of backness: F1-F2 difference
The closer together F1 and F2, the more “back” a vowel sounds.
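The F1-F2 difference can be computed directly once formant values are available. The sketch below uses rough, illustrative formant values for four English vowels; they are assumptions for the example, not the values plotted on these slides.

```python
# Rough illustrative (F1, F2) values in Hz; assumed, not taken from the slides.
formants_hz = {
    "heed": (280.0, 2250.0),   # front vowel
    "had": (690.0, 1660.0),    # front vowel
    "hod": (710.0, 1100.0),    # back vowel
    "who'd": (310.0, 870.0),   # back vowel
}

# Degree of backness expressed as the distance between F2 and F1.
f2_minus_f1 = {word: f2 - f1 for word, (f1, f2) in formants_hz.items()}

# Smaller difference = F1 and F2 closer together = more "back" sounding.
for word in sorted(f2_minus_f1, key=f2_minus_f1.get, reverse=True):
    print(f"{word:6s} F2 - F1 = {f2_minus_f1[word]:.0f} Hz")
```

With these values the two front vowels end up with a larger F2 - F1 distance than the two back vowels.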
3. Acoustics of Consonants
The acoustic structure of consonants is usually more complicated than that of vowels.
In many cases, there is no distinguishable feature during the consonant articulation itself, e.g. the silent part of [p, t, k].
We have to look for the identity of the consonant at the beginning or
the ending of the vowel beside it.
bab dad gag
3.1 Stops
Each of the stop sounds conveys its quality by its effect on the adjacent vowel.
The formants of [æ] correspond to the particular shape of the vocal tract.
During the production of [bæ] the formants correspond to the particular shape that occurs the moment the lips come apart.
Closure of the lips causes a lowering of all formants.
The syllable [bæb] will begin with formants in a lower position, then they will rapidly rise to the positions of [æ], and finally descend again as the lip closure is formed.
Anticipatory Coarticulation
For the production of e.g. [bib] or [bab], the tongue will be in position for the vowel even when the lips are closed at the beginning of the word.
This happens because the part of the tongue not involved in the formation of the consonant closure is already in position for the following vowel.
The formants at the moment of consonantal release will vary according to the vowel.
The apparent point of origin of the formant for each place of articulation is called the locus of that place of articulation.
The locus depends on adjacent vowels.
3.2 Formant transitions
Faint voicing striations near the baseline for each of the stops [b, d, g] (voice bar).
In all three words, F1 rises from a low position due to consonant closure, hence it does not distinguish one place of articulation from another.
What distinguishes the three stops are the onsets and offsets of F2 and F3.
[bd] [dd] []
[bd]
F2 & F3 start at a lower frequency than in [dd].
F2 & F3 are noticeably rising from a low locus.
[dd]
F2 is fairly steady at the beginning.
F3 drops a little.
[]
Characteristic coming together of F2 & F3: velar pinch
[bd] [dd] []
3.3 Voiceless stops
The release of aspirated stops is marked by a sudden sharp spike: a vertical line.
Period of aspiration noise: absence of energy in F1 & no vertical striations
Frequency & intensity
Whisper [t, t, t, k, k, k, p, p, p]. What do you observe?
[t] > [k] > [p]
Intensity of [p] burst is sometimes so low that there is no evidence of it on a
spectrogram.
[pm] [tn] [k]
Formant transitions also present in aspiration noise.
[pm] : F2 & F3 rising into the vowel.
[tn] : F2 steady, F3 dropping and then rising.
[k] : characteristic velar pinch
3.4 Nasals
A clear mark of a nasal (and a lateral) is an abrupt change in the spectrogram at the time of the formation of the articulatory closure.
A nasal has a formant structure similar to that of a vowel. Differences:
Bands are fainter.
Bands located in particular frequency locations depending on characteristic resonances of the nasal cavities.
F1: around 250 Hz
Large region above F1 with no energy.
F2 etc.: varying according to speaker (here around 2000 Hz).
Place cues sometimes not very clear.
[pm] [tn] [k]
3.5 Voiceless fricatives
Highest frequencies in speech occur over fricatives.
Frequency scale increased to 8000 Hz.
Diphthong [aɪ]: F1 & F2 start close together for low central [a] and move apart for high front [ɪ].
Fricatives: Random energy distributed over a wide range of
frequencies.
fie thigh sigh shy
Voiceless fricatives [f, θ]
Same pattern in [f] and [θ].
Difference: movement of F2 into the following vowel.
Very little movement in [f].
In [θ], F2 starts around 1200 Hz and moves down.
Often confused in noisy settings.
Fallen together in some accents of English, such as London Cockney: fin and thin are both pronounced with [f].
fie thigh sigh shy
Voiceless fricatives [s, ʃ]
The noise in [s] is centered at a high frequency, 5000 – 6000 Hz.
In [ʃ] it is lower, extending down to about 2500 Hz.
Both [s, ʃ] have larger acoustic energy and produce darker patterns than [f, θ].
Both [s, ʃ] are marked with distinctive formant transitions.
The locus of the F2 transition increases throughout the words
[f] < [θ] < [s] < [ʃ] (see arrows in fig.)
Before [ʃ] F2 of [aɪ] is in a position comparable to its location in [i].
fie thigh sigh shy
3.6 Voiced fricatives [v, ð]
Voiced fricatives [v, ð, z, ʒ] have patterns similar to their voiceless counterparts [f, θ, s, ʃ].
Voiced fricatives also have vertical striations indicative of voicing.
Vertical striations due to voicing are apparent throughout [v] and [ð].
The fricative component of [v] is very faint.
F2 higher around [ð] than [v].
Voiced fricatives [z, ʒ]
Fricative energy in higher frequencies very apparent in [z, ʒ].
Voice bar
faint in [z] (vertical striations due to voicing in 6-8 kHz)
hard to see in [ʒ]
F2 transition into the following vowel is
level from [z]
descending from [ʒ]
ever whether fizzer pleasure
TYPES OF SPECTROGRAMS
wide-band spectrograms
narrow-band spectrograms
Wide-band spectrograms
Very accurate in the time dimension
They show each vibration of the vocal folds as a separate vertical line.
They indicate the precise moment of a stop burst with a vertical spike.
Less accurate in the frequency dimension
There are usually several component frequencies present in a single
formant, all of them lumped together in one wide band on the
spectrogram.
Narrow-band spectrograms
More accurate in the frequency dimension (at the expense of
accuracy in the time dimension).
The spikes of stop releases are smeared in the time dimension in the
narrow-band spectrogram.
The frequencies that compose each formant are visible.
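The trade-off between the two types comes from the length of the analysis window: a short window localizes events in time but blurs frequency, and a long window does the opposite. The sketch below uses the rule of thumb that the analysis bandwidth is roughly the reciprocal of the window duration; the exact factor depends on the window shape, and the window lengths and the 100 Hz F0 are assumed example values.

```python
def approx_bandwidth_hz(window_s):
    """Rule of thumb: analysis bandwidth is about 1 / window duration."""
    return 1.0 / window_s

def describe(window_s, f0_hz):
    """Summarize what a given analysis window can and cannot separate."""
    bandwidth = approx_bandwidth_hz(window_s)
    return {
        "bandwidth_hz": bandwidth,
        # Harmonics are F0 apart; they separate only if the band is narrower.
        "resolves_harmonics": bandwidth < f0_hz,
        # Glottal pulses are 1/F0 apart; they separate only if the window is shorter.
        "resolves_glottal_pulses": window_s < 1.0 / f0_hz,
    }

F0_HZ = 100.0                       # assumed speaker pitch
wide_band = describe(0.003, F0_HZ)  # 3 ms window (assumed)
narrow_band = describe(0.030, F0_HZ)  # 30 ms window (assumed)

print("wide-band:  ", wide_band)
print("narrow-band:", narrow_band)
```

Under these assumptions the 3 ms window shows individual vocal fold vibrations but not harmonics, and the 30 ms window shows harmonics but smears the pulses.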
FEMALE VOICE
Women’s voices usually have a higher pitch.
The higher the F0, the more difficult it is to locate formants, because the harmonics are spaced further apart and interfere with the display of formants.
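A small sketch of why a high F0 makes formants harder to locate: the vocal tract resonance is only "sampled" at multiples of F0, so the nearest harmonic can sit far from the true formant peak. The F0 values and the 500 Hz formant below are assumed example numbers, and `nearest_harmonic` is a helper written for this illustration.

```python
def nearest_harmonic(f0_hz, target_hz):
    """Frequency of the harmonic of f0_hz that lies closest to target_hz."""
    k = max(1, round(target_hz / f0_hz))
    return k * f0_hz

FORMANT_HZ = 500.0   # assumed true formant frequency
LOW_F0_HZ = 120.0    # assumed lower-pitched voice
HIGH_F0_HZ = 220.0   # assumed higher-pitched voice

low_error = abs(nearest_harmonic(LOW_F0_HZ, FORMANT_HZ) - FORMANT_HZ)
high_error = abs(nearest_harmonic(HIGH_F0_HZ, FORMANT_HZ) - FORMANT_HZ)

print(f"F0 = {LOW_F0_HZ:.0f} Hz: nearest harmonic is {low_error:.0f} Hz from the formant")
print(f"F0 = {HIGH_F0_HZ:.0f} Hz: nearest harmonic is {high_error:.0f} Hz from the formant")
```

In the worst case the nearest harmonic is half of F0 away from the formant, so the possible error grows with pitch.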
Greek phrase uttered by a male and a female Greek adult.
Λέγε «παππού» πάλι. (Say “grandfather” again)
[Figure: spectrograms of the male and the female speaker]
7. INDIVIDUAL DIFFERENCES
It is important to know what sort of differences exist between different speakers.
1. When trying to measure features that are linguistically significant, one must know how to discount purely individual features.
2. When trying to find out whether a speaker has speech problems.
3. For valid speaker identification in forensic situations.
Same phonetic quality
Similar relative positions
Different absolute values
[Figure: vowels pronounced by 2 speakers of Californian English.]
No simple technique to average out individual characteristics
so that a formant plot shows only the phonetic qualities of
vowels.
F4 indicator of individual’s head size
Express values of other formants as percentages of the mean F4.
F4 values are not usually reported.
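The F4-based idea can be sketched as follows, assuming we have F1, F2 and several F4 measurements per speaker. All numbers are hypothetical, and speaker B is constructed as speaker A scaled by 1.2 purely to show the effect; real speakers do not differ by one uniform factor, which is part of why no simple technique exists.

```python
def normalize_by_f4(vowel_formants, f4_values):
    """Express each formant as a percentage of the speaker's mean F4."""
    mean_f4 = sum(f4_values) / len(f4_values)
    return {
        vowel: tuple(100.0 * f / mean_f4 for f in formants)
        for vowel, formants in vowel_formants.items()
    }

# Hypothetical speaker A: (F1, F2) in Hz per vowel, plus F4 measurements.
speaker_a = {"heed": (280.0, 2250.0), "hod": (710.0, 1100.0)}
speaker_a_f4 = [3500.0, 3500.0]

# Hypothetical speaker B: same vowels, every value 1.2 times higher.
speaker_b = {"heed": (336.0, 2700.0), "hod": (852.0, 1320.0)}
speaker_b_f4 = [4200.0, 4200.0]

norm_a = normalize_by_f4(speaker_a, speaker_a_f4)
norm_b = normalize_by_f4(speaker_b, speaker_b_f4)

for vowel in speaker_a:
    print(vowel, [round(v, 1) for v in norm_a[vowel]],
          [round(v, 1) for v in norm_b[vowel]])
```

After normalization the two hypothetical speakers show the same relative values even though their absolute formant frequencies differ.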
Phoneticians do not really know how to compare acoustic data on the sounds of one individual with those of another.
We cannot write a computer program that will accept any individual’s vowels as input and then output a narrow phonetic transcription.
Read…
…& visit
https://corpus.linguistics.berkeley.edu/acip/
Material for chapter 8 from UC Berkeley Linguistics, “A course in phonetics”, including online exercises
https://soundphysics.ius.edu/?page_id=812
An Interactive eBook on the physics of sound (Indiana University Southeast)
http://zonalandeducation.com/mstm/physics/waves/waveAdder/WaveAdder1.html
Wave Adder
http://www.linguistics.ucla.edu/people/hayes/103/SpectrogramReading/Index.htm
Spectrogram reading practice (by Bruce Hayes, UCLA)
http://home.cc.umanitoba.ca/~robh/howto.html
Monthly Mystery Spectrogram Webzone -Rob Hagiwara's professional
web-space
http://www.youtube.com/watch?v=Gg4IHbiITd0
Introduction to spectrogram analysis (FloridaLinguistics.com)
PHONETICS ASSIGNMENT