
Perception: 1. Ear Physiology 2. Auditory Psychophysics 3. Pitch Perception 4. Music Perception

This document summarizes key points from a lecture on music signal processing and perception. It discusses the physiology of the ear and cochlea's role in transducing sound into nerve impulses. It also examines auditory psychophysics concepts like loudness perception, equal loudness curves, and masking. Pitch perception is explored including place and time models. Finally, aspects of music perception like scene analysis, consonance, rhythm, and sequence perception are summarized.


ELEN E4896 MUSIC SIGNAL PROCESSING

Lecture 3: Perception

1. Ear Physiology
2. Auditory Psychophysics
3. Pitch Perception
4. Music Perception
Dan Ellis

Dept. Electrical Engineering, Columbia University


dpwe@ee.columbia.edu
E4896 Music Signal Processing (Dan Ellis)

http://www.ee.columbia.edu/~dpwe/e4896/
2013-02-04 - 1 /24

1. Ear Physiology

The ear is a very sensitive transducer of air pressure variations into nerve firings
  - just above Brownian motion!?

[Figure: the auditory pathway: outer ear, middle ear, inner ear (cochlea), auditory nerve, midbrain, cortex]

The cochlea is largely understood
  - the brain is more difficult


The Ear

Impedance matching & transduction

[Figure: the ear: pinna, ear canal, eardrum (tympanum), middle ear bones, cochlea (inner ear)]

  - pinna acts as horn
  - eardrum + bones match impedance
  - cochlea transduces to nerve firings

The Cochlea

Complex resonant structure

[Figure: travelling wave on the basilar membrane (BM), driven at the oval window (from ME bones); resonant frequency runs from 16 kHz down to 50 Hz over position 0 to 35 mm. Source: http://www.wadalab.mech.tohoku.ac.jp/FEM_BM-e.html]

  - active feedback to maintain near-ringing state
  - efferent fibers?
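The end points in the cochlea figure (16 kHz at position 0, 50 Hz at 35 mm) can be turned into a rough place-to-frequency map. The sketch below assumes that resonant frequency falls exponentially with distance, so that equal distances cover equal frequency ratios. That log spacing and the name `resonant_freq_hz` are illustrative assumptions rather than anything the slide states, and measured cochlear maps depart from a pure log law at low frequencies.

```python
import math

BM_LENGTH_MM = 35.0   # basilar membrane length, from the figure
F_BASE_HZ = 16000.0   # resonant frequency at position 0 (assumed oval-window end)
F_APEX_HZ = 50.0      # resonant frequency at 35 mm

def resonant_freq_hz(position_mm):
    """Log-interpolate resonant frequency along the BM (assumed mapping)."""
    if not 0.0 <= position_mm <= BM_LENGTH_MM:
        raise ValueError("position must lie on the basilar membrane")
    fraction = position_mm / BM_LENGTH_MM
    return F_BASE_HZ * (F_APEX_HZ / F_BASE_HZ) ** fraction

# Millimetres of membrane per octave under the log-spacing assumption
mm_per_octave = BM_LENGTH_MM / math.log2(F_BASE_HZ / F_APEX_HZ)

for pos in (0.0, 8.75, 17.5, 26.25, 35.0):
    print(f"{pos:5.2f} mm -> {resonant_freq_hz(pos):8.1f} Hz")
print(f"about {mm_per_octave:.1f} mm per octave")
```

Under this assumption each octave occupies roughly 4 mm of membrane, which gives a feel for how finely position encodes frequency.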

Hair Cells

Transduce mechanical motion to nerve firings

[Figure: cross-section of the cochlea: tectorial membrane, Inner Hair Cell (IHC), Outer Hair Cell (OHC), basilar membrane, auditory nerve]

3,000 IHCs driving 20,000 nerves
  - easily damaged


Auditory Nerve

IHC fires near maximum displacement

[Figure: local BM displacement and typical nerve signal (mV) vs. time / ms]

  - cannot fire every cycle
  - some noise

[Figure: firing count vs. cycle angle]

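The firing-count-versus-cycle-angle plot can be imitated with a toy simulation. The sketch below is not a physiological model: the firing probability, the Gaussian phase jitter, and the function name `simulate_period_histogram` are all made-up assumptions, chosen only to show how a fiber that skips many cycles and fires with some noise still produces a histogram peaked near maximum displacement (90 degrees for a sine).

```python
import random

def simulate_period_histogram(n_cycles=2000, fire_prob=0.3,
                              jitter_deg=25.0, n_bins=12, seed=0):
    """Count spikes per phase bin for a fiber locked to the stimulus peak."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    bin_width = 360.0 / n_bins
    counts = [0] * n_bins
    for _ in range(n_cycles):
        if rng.random() >= fire_prob:
            continue                    # the fiber cannot fire every cycle
        # Spike phase: peak displacement (90 degrees) plus some noise
        phase = rng.gauss(90.0, jitter_deg) % 360.0
        index = min(int(phase // bin_width), n_bins - 1)
        counts[index] += 1
    return counts

counts = simulate_period_histogram()
for i, c in enumerate(counts):
    print(f"{i * 30:3d}-{(i + 1) * 30:3d} deg: {c}")
```

Although fewer than a third of the cycles produce a spike, the spikes that do occur cluster in the two bins on either side of 90 degrees, so the stimulus period remains visible in the spike timing.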

Nerve Responses

Onset enhancement

[Figure: spike count vs. time for a 100 ms tone burst]

Frequency selectivity

[Figure: dB SPL vs. (log) frequency, 100 Hz to 10 kHz]

Dynamic range

[Figure: spikes/sec vs. intensity / dB SPL]

  - one fiber: ~ 25 dB dynamic range
  - hearing dynamic range > 100 dB
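To put the two dynamic-range figures side by side, the sketch below converts decibels into plain ratios. The function names are hypothetical, and the last line is arithmetic only: it counts how many non-overlapping 25 dB ranges would tile 100 dB, without claiming that this is how fibers are actually organized.

```python
import math

def db_to_intensity_ratio(level_db):
    """Convert a level difference in dB to a ratio of intensities (power)."""
    return 10.0 ** (level_db / 10.0)

def db_to_pressure_ratio(level_db):
    """Convert a level difference in dB to a ratio of sound pressures."""
    return 10.0 ** (level_db / 20.0)

hearing_range_db = 100.0   # "> 100 dB" from the slide, taken at its lower bound
fiber_range_db = 25.0      # "~ 25 dB" for one fiber

print(f"hearing: intensity ratio {db_to_intensity_ratio(hearing_range_db):.3g}")
print(f"hearing: pressure ratio  {db_to_pressure_ratio(hearing_range_db):.3g}")
print(f"one fiber: intensity ratio {db_to_intensity_ratio(fiber_range_db):.3g}")

# Illustrative arithmetic: 25 dB ranges needed to tile 100 dB end to end
ranges_needed = math.ceil(hearing_range_db / fiber_range_db)
print(f"ranges needed: {ranges_needed}")
```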

Auditory Nerve Ensemble

Ensemble of nerves provides full information (Secker-Walker & Searle 90)
  - similar to constant-Q log-intensity spectrogram

[Figure: "PatSla rectsmoo on bbctmp2 (2001-02-18)", freq / 8ve re 100 Hz vs. time / ms]

Auditory Models

Filterbank + nonlinearity
  - varying (but broad) bandwidth

[Figure: block diagram: sound -> outer/middle ear filtering -> cochlea filterbank -> IHC (one per channel)]

[Figure: "SlaneyPatterson 12 chans/oct from 180 Hz, BBC1tmp (20010218)", channel vs. time / s]
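A minimal sketch of the "filterbank + nonlinearity" idea follows. It borrows the channel spacing from the figure caption (12 channels per octave from 180 Hz), but everything else is an assumption made for illustration: two-pole resonators stand in for the cochlear filters (Patterson-style models use gammatone filters instead), the Q of 8 and the 100 Hz smoothing cutoff are arbitrary, and the outer/middle ear filtering stage is omitted.

```python
import math

def center_freqs(f_low=180.0, chans_per_octave=12, n_channels=60):
    """Log-spaced center frequencies: 12 channels/octave starting at 180 Hz."""
    return [f_low * 2.0 ** (k / chans_per_octave) for k in range(n_channels)]

def resonator(signal, fc, fs, q=8.0):
    """Two-pole resonator whose bandwidth fc/q grows with center frequency."""
    r = math.exp(-math.pi * (fc / q) / fs)           # pole radius from bandwidth
    a1 = 2.0 * r * math.cos(2.0 * math.pi * fc / fs)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def hair_cell(band, fs, cutoff_hz=100.0):
    """Crude IHC stage: half-wave rectify, then one-pole low-pass smoothing."""
    alpha = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    state = 0.0
    out = []
    for v in band:
        state = alpha * state + (1.0 - alpha) * max(v, 0.0)
        out.append(state)
    return out

def auditory_model(signal, fs):
    """Return (center frequencies, per-channel smoothed envelopes)."""
    fcs = center_freqs()
    return fcs, [hair_cell(resonator(signal, fc, fs), fs) for fc in fcs]

fs = 16000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(1600)]
fcs, channels = auditory_model(tone, fs)
mean_level = [sum(ch) / len(ch) for ch in channels]
best_fc = fcs[mean_level.index(max(mean_level))]
print(f"most active channel is centered near {best_fc:.0f} Hz")
```

For a 1 kHz tone the most active channel sits next to 1 kHz, while neighbouring channels also respond because the filters are broad. Stacking the channel envelopes over time gives a picture like the one in the figure.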

2. Auditory Psychophysics

Extensive study of relationship between physical (Φ) and psychological (Ψ) values
  - perception is not direct!

Common across all perceptual modalities
  - proprioceptive force, body positioning
  - vision
  - hearing

Φ - Ψ distinction is important!


Loudness perception

Perception of physical parameter
  - just noticeable difference - jnd
  - magnitude scaling

Weber's law: the jnd is proportional to the magnitude being judged

Loudness:

    L \propto I^{0.3}
    \log(L) = 0.3 \log(I) + \text{const}
    \log_{10}(L) = 0.03 \, \mathrm{dB}(I) + \text{const}
    \mathrm{dB}(I) = 33.3 \log_{10}(L) + \text{const}

[Figure: sketch of log(L) vs. log(I)]

Hartmann (1993) classroom loudness scaling data (tracks 19+20)

[Figure: log(loudness rating) vs. sound level; textbook figure L \propto I^{0.3}, power law fit L \propto I^{0.22}]
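The power law above implies a simple rule of thumb, worked through in the sketch below: with an exponent of 0.3, a 10 dB increase in level roughly doubles loudness. The function names are hypothetical; the two exponents are the textbook value (0.3) and the classroom fit (0.22) quoted on the slide.

```python
import math

def loudness_ratio(level_change_db, exponent=0.3):
    """Loudness ratio for a level change, assuming L proportional to I**exponent."""
    intensity_ratio = 10.0 ** (level_change_db / 10.0)
    return intensity_ratio ** exponent

def level_change_for_loudness_ratio(ratio, exponent=0.3):
    """Level change in dB needed to scale loudness by `ratio`."""
    return 10.0 * math.log10(ratio) / exponent

print(f"+10 dB with exponent 0.3: loudness x{loudness_ratio(10.0):.2f}")
print(f"doubling needs {level_change_for_loudness_ratio(2.0):.1f} dB (exponent 0.3)")
print(f"doubling needs {level_change_for_loudness_ratio(2.0, 0.22):.1f} dB (exponent 0.22)")
```

With the shallower classroom exponent, listeners needed a noticeably larger level change before rating a sound as twice as loud.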

Equal Loudness

Fletcher-Munson curves (1937)

[Figure: equal-loudness curves, intensity / dB SPL (0 to 120) vs. freq / Hz (100 to 10,000)]

  - match intensity to specific 1 kHz tone
  - loudness growth

Masking

Limited dynamic range in cochlea
  - effect within frequency: critical bands

[Figure: intensity / dB vs. log freq, showing the absolute threshold, a masking tone, and the masked threshold]

  - basis of MPEG Audio
  - temporal effects: forward/backward masking

[Figure: level / dB vs. time / ms and freq / Bark, showing the masking tone and the elevated masking threshold skirt]

tracks 23-25
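The masking figure uses a frequency axis in Bark, the critical-band scale. The slide does not say which Hz-to-Bark mapping was used, so the sketch below simply assumes one widely quoted approximation (the Zwicker and Terhardt arctangent formula); `hz_to_bark` is a hypothetical helper name.

```python
import math

def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark (Zwicker-Terhardt style formula)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

for f in (100, 500, 1000, 4000, 10000):
    print(f"{f:6d} Hz -> {hz_to_bark(f):5.2f} Bark")
```

Equal steps in Bark correspond to roughly equal numbers of critical bands, so a masker's threshold skirt has a more uniform shape on this axis than on a linear Hz axis.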

Limits of Hearing

Test what listeners can discriminate
  - two-interval forced-choice: X = A or B?

[Figure: stimulus intervals laid out along time]

Roughly...
  - timing: 2 ms difference, 20 ms ordering
  - tuning: ~1%
  - spectral profile: single components ~ 2 dB
  - phase?
  - tones vs. noise...

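The ~1% tuning limit is easier to relate to music when expressed in cents (hundredths of an equal-tempered semitone). The helper name `ratio_to_cents` below is hypothetical; the conversion itself is the standard 1200 times log2 of the frequency ratio.

```python
import math

def ratio_to_cents(freq_ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq_ratio)

tuning_limit_cents = ratio_to_cents(1.01)          # the ~1% figure above
semitone_cents = ratio_to_cents(2.0 ** (1.0 / 12.0))

print(f"1% frequency difference = {tuning_limit_cents:.1f} cents")
print(f"one semitone            = {semitone_cents:.1f} cents")
```

So a 1% mistuning is roughly a sixth of a semitone.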

3. Pitch Perception

Complex (non-sinusoidal) tones give a single, fused percept
  - despite harmonics resolved by cochlea

[Figure: filterbank output for a complex tone, freq. chan. vs. time/s]

  - percept is of a single pitch
  - .. but pitch does NOT rely on the fundamental (track 37)


Place models of pitch

Hypothesis: pitch results from activation pattern

[Figure: AN excitation vs. frequency channel; low harmonics are resolved, but broader HF channels cannot resolve harmonics]

Correlate with harmonic sieve (Duifhuis et al. 82):

[Figure: pitch strength vs. frequency channel]

  - support: low harmonics are important
  - but: pitch of noisy signals

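The harmonic-sieve idea can be sketched in a few lines: for each candidate fundamental, check how many resolved partials fall through holes placed at its multiples. The scoring rule below (closeness-weighted matches divided by the number of sieve holes plus unmatched partials), the 3% tolerance, and the function names are assumptions invented for this illustration. They are not the criterion used by Duifhuis et al.

```python
def sieve_score(f0, partials_hz, tolerance=0.03):
    """Score how well partials fall through a sieve with holes at multiples of f0."""
    matched_weight = 0.0
    unmatched = 0
    for p in partials_hz:
        h = int(p / f0 + 0.5)                  # nearest harmonic number
        if h < 1:
            unmatched += 1
            continue
        deviation = abs(p - h * f0) / (h * f0)
        if deviation <= tolerance:
            matched_weight += 1.0 - deviation / tolerance
        else:
            unmatched += 1
    # Number of sieve holes up to the highest partial; empty holes dilute the score
    n_holes = max(1, int(max(partials_hz) / f0 + 0.5))
    return matched_weight / (n_holes + unmatched)

def estimate_pitch(partials_hz, candidates_hz=range(50, 501)):
    """Return the candidate fundamental (Hz) with the best sieve score."""
    return max(candidates_hz, key=lambda f0: sieve_score(f0, partials_hz))

# Harmonics 2, 3 and 4 of 200 Hz: the fundamental itself is absent
pitch = estimate_pitch([400.0, 600.0, 800.0])
print(pitch)
```

Penalizing empty holes keeps subharmonics such as 100 Hz from winning, and penalizing unmatched partials keeps 400 Hz from winning, so the sieve settles on 200 Hz even though no 200 Hz component is present.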

Time models of pitch

Autocorrelation neatly unifies pitch phenomena (Meddis & Hewitt 91)

[Figure: per-channel autocorrelation (freq vs. lag / ms) and summary autocorrelation, whose peak marks the common period (pitch)]

  - but: high-frequency modulation evokes weak pitch


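A stripped-down version of the autocorrelation idea is sketched below. Note the simplification: it autocorrelates the raw waveform directly, whereas the model on the slide computes an autocorrelation per cochlear channel and then sums them. The sample rate, search range, and function names are assumptions chosen for the example.

```python
import math

def autocorrelation(signal, lag):
    """Unnormalized autocorrelation of the signal at one lag (in samples)."""
    return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))

def pitch_from_autocorrelation(signal, fs, f_min=50.0, f_max=500.0):
    """Pick the lag with the largest autocorrelation inside the pitch range."""
    min_lag = int(fs / f_max)
    max_lag = int(fs / f_min)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(signal, lag))
    return fs / best_lag

fs = 8000
f0 = 200.0
# Harmonics 2, 3 and 4 only: there is no energy at the 200 Hz fundamental
signal = [sum(math.sin(2.0 * math.pi * h * f0 * n / fs) for h in (2, 3, 4))
          for n in range(800)]

estimate = pitch_from_autocorrelation(signal, fs)
print(estimate)
```

The components at 400, 600 and 800 Hz share a common period of 5 ms, so the autocorrelation peaks at that lag and the estimate is 200 Hz despite the missing fundamental.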

Competing Cues

Perhaps brains use both place & time cues
  - common perceptual strategy: opportunistic combination of information

e.g. Probabilistic combination:

    \arg\max_{\theta} \Pr(\theta \mid x) = \arg\max_{\theta} \frac{\Pr(\theta \mid x_1) \, \Pr(\theta \mid x_2)}{\Pr(\theta)}

if x_1, x_2 are independent given \theta


4. Music Perception

Hearing music involves
  - instruments
  - notes
  - rhythm

Can study with subjective experiments (Grey 75)

Scene Analysis

Detect separate events
  - common onset
  - common harmonicity

[Figure: spectrogram, freq / Hz (0 to 8000) vs. time / s (Pierce 83)]

  - instruments & timbre


Consonance

Musical intervals relate to harmonic proximity

[Figure: Pitch Helix (Warren et al. 2003)]

Rhythm

Sensitive to periodicity
speech? breathing? brain?
Onsets + autocorrelation?
variations in tapping
4/4 vs 3/4

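The "onsets + autocorrelation?" suggestion can be sketched as a toy tempo estimator: mark onsets on a frame grid, autocorrelate the resulting impulse train, and read the beat period off the strongest lag. The frame rate, the 0.3 to 1.0 s period search range, and the function names are assumptions for this example, and the onset times are synthetic rather than detected from audio.

```python
def onset_train(onset_times_s, duration_s, frame_rate=100):
    """Impulse train with a 1.0 in every frame that contains an onset."""
    n_frames = int(round(duration_s * frame_rate))
    train = [0.0] * n_frames
    for t in onset_times_s:
        idx = int(round(t * frame_rate))
        if 0 <= idx < n_frames:
            train[idx] = 1.0
    return train

def tempo_bpm(train, frame_rate=100, min_period_s=0.3, max_period_s=1.0):
    """Estimate tempo from the autocorrelation peak of an onset train."""
    min_lag = int(round(min_period_s * frame_rate))
    max_lag = int(round(max_period_s * frame_rate))

    def acf(lag):
        return sum(train[n] * train[n + lag] for n in range(len(train) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=acf)
    return 60.0 * frame_rate / best_lag

# Synthetic onsets every 0.5 s for 8 s, i.e. 120 beats per minute
onsets = [0.5 * i for i in range(16)]
bpm = tempo_bpm(onset_train(onsets, 8.0))
print(bpm)
```

A lag of one beat and a lag of two beats both line up the onsets. The shorter lag wins here only because it overlaps one more onset pair, which hints at why metrical level (and 4/4 versus 3/4) is ambiguous from periodicity alone.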

Sequences

Perceptual effects of sequences
  - e.g. streaming

[Figure: tone sequence, frequency vs. time; TRT: 60-150 ms, 1 kHz, Δf: 2 octaves]

track 36

Music is built of sequences
  - different sensitivities


Summary

Ear converts air pressure to nerve firings
  - spectral analysis

Brain does a lot with scarce information
  - dealing with the real world

Music is a complex signal
  - multiple sources
  - harmonic structures
  - temporal patterns


References

H. Duifhuis, L. F. Willems, & R. J. Sluyter, "Measurement of pitch in speech: An implementation of Goldstein's theory of pitch perception," J. Acoust. Soc. Am., 71(6):1568-1580, 1982.
H. Fletcher & W. A. Munson, "Relation between loudness and masking," J. Acoust. Soc. Am., 9:1-10, 1937.
B. Gold, N. Morgan, & D. Ellis, Speech and Audio Signal Processing (2nd ed.), Wiley, 2011.
J. M. Grey, An Exploration of Musical Timbre, Ph.D. Dissertation, Stanford CCRMA, No. STAN-M-2, 1975.
W. M. Hartmann, "Auditory demonstrations on compact disk for large N," J. Acoust. Soc. Am., 93(1):1-16, 1993.
R. Meddis & M. J. Hewitt, "Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification," J. Acoust. Soc. Am., 89(6):2866-2882, 1991.
B. C. J. Moore, An Introduction to the Psychology of Hearing (4th ed.), Academic Press, 1997.
J. R. Pierce, The Science of Musical Sound, Scientific American Press, 1983.
H. E. Secker-Walker & C. L. Searle, "Time-domain analysis of auditory-nerve-fiber firing rates," J. Acoust. Soc. Am., 88(3):1427-1436, 1990.
J. D. Warren, S. Uppenkamp, R. D. Patterson, & T. D. Griffiths, "Separating pitch chroma and pitch height in the human brain," PNAS 100(17):10038-42, 2003.
