HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
6.555/HST 582
Apr-07
Cite as: John Fisher. Course materials for HST.582J / 6.555J / 16.456J, Biomedical Signal and Image Processing,
Spring 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded
on [DD Month YYYY].
3. Model Estimation
Problem Setup
Concepts
In many experiments there is some element of randomness that we are unable to explain.
Probability and statistics are mathematical tools for reasoning in the face of such uncertainty.
They allow us to answer questions quantitatively, such as:
Is the signal present or not?
Binary: YES or NO
How certain am I?
Continuous: degree of confidence
[Figure: a signal, followed by three observations of signal(?) + noise]
Coin Flipping
Fairly simple probability modeling problem
Binary hypothesis testing
Many decision systems come down to making a decision on
the basis of a biased coin flip (or N-sided die)
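As a minimal sketch of coin flipping as a binary hypothesis test, the Python below compares two candidate biases for a coin, given k heads in n flips, and picks the bias under which the observed data are more likely. The biases `p0` and `p1`, the function names, and the equal-prior (maximum likelihood) comparison are illustrative assumptions, not taken from the slides.

```python
from math import comb

def binomial_likelihood(k, n, p):
    """Probability of k heads in n independent flips with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def decide_coin(k, n, p0=0.5, p1=0.7):
    """Return 1 if the data are more likely under bias p1, else 0.

    Assumes both hypotheses are equally likely a priori.
    """
    like0 = binomial_likelihood(k, n, p0)
    like1 = binomial_likelihood(k, n, p1)
    return 1 if like1 > like0 else 0

# Hypothetical observations: 5 heads and 9 heads out of 10 flips.
decision_fair_looking = decide_coin(5, 10)
decision_biased_looking = decide_coin(9, 10)
```

With these assumed biases, 5 heads in 10 flips favors the fair coin and 9 heads in 10 flips favors the biased one.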
Bayes Rule
Bayes rule plays an important role in classification,
inference, and estimation.
[Figure: the four outcomes of two coin flips: HH, HT, TH, TT]
[Figure: the outcomes HH, HT, TH, TT arranged by 1st flip and 2nd flip]
Define:
N: the number of trials
$N_A$, $N_B$: the number of times events A and B are observed.
Events A and B are mutually exclusive (i.e. observing one precludes observing the other).
Empirical definition:
Probability is defined as a limit over observations
Axiomatic definition:
Probability is derived from
its properties
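The empirical definition can be illustrated by simulation: the relative frequency N_A / N of an event tends to settle toward its probability as the number of trials N grows. The sketch below assumes event A is "heads" on a coin with a known bias `p_heads`; the function name, the seed, and the trial counts are hypothetical choices made only for illustration.

```python
import random

def relative_frequency(n_trials, p_heads=0.5, seed=0):
    """Estimate P(A) as N_A / N, where A is the event 'heads'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n_a = sum(1 for _ in range(n_trials) if rng.random() < p_heads)
    return n_a / n_trials

# The estimate typically tightens around p_heads as N grows.
estimates = {n: relative_frequency(n) for n in (10, 1_000, 100_000)}
```

Each entry of `estimates` is one finite-N approximation of the limit; no finite run reaches the limit itself.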
4 out of 5 Dentists
What does this statement mean?
How can we attach meaning/significance to the claim?
An example of a frequentist vs. Bayesian viewpoint
The difference (in this case) lies in:
The assumption regarding how the data is generated
The way in which we can express certainty about our answer
Expectation
Given a function of a random variable (i.e. g(X)), we define its expected value as:
$$E[g(X)] = \int g(x)\, p_X(x)\, dx$$
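For a discrete random variable the expected value of g(X) is the probability-weighted sum of g over the possible values. The sketch below is one way to compute it, assuming the distribution is given as a dictionary from values to probabilities; the fair six-sided die and the name `expectation` are illustrative assumptions.

```python
def expectation(g, pmf):
    """E[g(X)] for a discrete X described by {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

# Assumed example: a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}

mean = expectation(lambda x: x, die)               # E[X]
second_moment = expectation(lambda x: x * x, die)  # E[X^2]
variance = second_moment - mean**2                 # E[X^2] - (E[X])^2
```

Choosing g(x) = x gives the mean, and combining it with g(x) = x^2 gives the variance.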
[Figure: joint density $p_{XY}(u, v)$ plotted over the (x, y) plane]
Conditional Density
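[Figure: joint density $p_{XY}(x, y)$ with the line $y = y_o$ marked]
$p_{XY}(x, y_o)$ is the slice of $p_{XY}(x, y)$ along the line $y = y_o$.
The conditional density is this slice normalized by the marginal: $p_{X|Y}(x \mid y_o) = p_{XY}(x, y_o) / p_Y(y_o)$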
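The slice-and-normalize construction can be checked numerically. The sketch below assumes a zero-mean, unit-variance bivariate Gaussian with correlation `rho` as the joint density (an illustrative choice, not from the slides), takes the slice at y = y_o on a grid, and divides by a Riemann-sum approximation of the marginal p_Y(y_o).

```python
from math import exp, pi, sqrt

def joint_density(x, y, rho=0.6):
    """Assumed example: standard bivariate Gaussian with correlation rho."""
    z = (x * x - 2 * rho * x * y + y * y) / (1 - rho * rho)
    return exp(-0.5 * z) / (2 * pi * sqrt(1 - rho * rho))

def conditional_density_on_grid(y0, xs, dx):
    """Return p_{X|Y}(x | y0) on the grid xs, plus the estimate of p_Y(y0)."""
    slice_vals = [joint_density(x, y0) for x in xs]  # p_XY(x, y0)
    p_y0 = sum(slice_vals) * dx                      # approximates the integral over x
    return [v / p_y0 for v in slice_vals], p_y0

dx = 0.01
xs = [-8 + i * dx for i in range(1601)]  # grid from -8 to 8
cond, p_y0 = conditional_density_on_grid(1.0, xs, dx)
```

Dividing by p_Y(y_o) is what turns the slice, which does not integrate to one, into a proper density in x.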
Bayes Rule
For continuous random variables, Bayes rule is essentially the same (again, just an algebraic manipulation of the definition of a conditional density).
$$p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x)\, p_X(x)}{p_Y(y)}$$
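A small numerical sketch of Bayes rule for continuous variables follows. It assumes a standard Gaussian prior on X and an observation Y equal to X plus Gaussian noise of variance `noise_var`; these modeling choices and all names are assumptions for illustration. The posterior is formed on a grid as likelihood times prior, divided by their integral.

```python
from math import exp, pi, sqrt

def gauss(z, mean, var):
    """Gaussian density with the given mean and variance, evaluated at z."""
    return exp(-0.5 * (z - mean) ** 2 / var) / sqrt(2 * pi * var)

def posterior_on_grid(y, xs, dx, noise_var=0.5):
    """p_{X|Y}(x | y) on the grid xs for the assumed Gaussian model."""
    unnorm = [gauss(y, x, noise_var) * gauss(x, 0.0, 1.0) for x in xs]  # p_{Y|X} * p_X
    p_y = sum(unnorm) * dx  # approximates p_Y(y), the normalizer
    return [u / p_y for u in unnorm]

dx = 0.01
xs = [-8 + i * dx for i in range(1601)]
post = posterior_on_grid(1.5, xs, dx)
post_mean = sum(x * p for x, p in zip(xs, post)) * dx
```

For this particular model the posterior mean has the closed form y / (1 + noise_var), which the grid result should match closely.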
Decision Rules
[Figure: densities $p_0(x)$ and $p_1(x)$ plotted against $x$]
Decision rules are functions which map measurements to
choices.
In the binary case we can write it as
Error Types
[Figure: densities $p_0(x)$ and $p_1(x)$ against $x$, with decision regions $R_0$ and $R_1$; a false alarm is marked]
Marginal density of X
[Figure: $p_1(x)$ and the marginal density $p_X(x)$ plotted against $x$]
So given observations of x, how should we select our best guess of $H_i$?
Specifically, what is a good criterion for making that assignment?
Which $H_i$ should we select before we observe x?
Bayes Classifier
[Figure: $p_0(x)$, $p_1(x)$, and the marginal density $p_X(x)$ plotted against $x$]
A reasonable criterion for guessing values of H given
observations of X is to minimize the probability of
error.
The classifier which achieves this minimization is the
Bayes classifier.
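A minimal sketch of a two-class Bayes classifier is given below. It assumes Gaussian class-conditional densities with a shared variance and known priors; the means, variance, priors, and function names are hypothetical. The rule picks the hypothesis with the larger prior-weighted likelihood, which is equivalent to picking the larger posterior because both share the denominator p_X(x).

```python
from math import exp, pi, sqrt

def gauss(x, mean, var):
    """Gaussian density with the given mean and variance, evaluated at x."""
    return exp(-0.5 * (x - mean) ** 2 / var) / sqrt(2 * pi * var)

def bayes_classify(x, prior0=0.5, prior1=0.5, mean0=0.0, mean1=2.0, var=1.0):
    """Return 0 or 1, whichever hypothesis has the larger prior * likelihood."""
    score0 = prior0 * gauss(x, mean0, var)
    score1 = prior1 * gauss(x, mean1, var)
    return 1 if score1 > score0 else 0

# With equal priors the boundary sits midway between the means (x = 1).
equal_prior_decisions = [bayes_classify(x) for x in (0.9, 1.1)]
# A strong prior for H0 pushes the boundary toward the H1 mean.
skewed_decision = bayes_classify(1.5, prior0=0.9, prior1=0.1)
```

Changing the priors moves the decision boundary without touching the class-conditional densities.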
Probability of Misclassification
[Figure: densities $p_0(x)$ and $p_1(x)$ against $x$, with decision regions $R_0$ and $R_1$]
In the second step we simply change the region over which we integrate for one of the terms (these are complementary events).
In the third step we collect terms and note that all underbraced terms in the integrand are non-negative.
To choose the regions that minimize $P_E$ (remember that choosing region 1 effectively chooses region 2), we should set region 1 to be the set of points where the integrand is negative.
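To make the probability of error concrete, the sketch below evaluates it for a single-threshold rule under assumed Gaussian class-conditional densities with a shared standard deviation, then scans candidate thresholds for the smallest error. The densities, priors, threshold grid, and the terms `false_alarm` and `miss` as used here are illustrative assumptions.

```python
from math import erfc, sqrt

def q_function(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

def error_probability(threshold, prior0=0.5, prior1=0.5,
                      mean0=0.0, mean1=2.0, sigma=1.0):
    """P_E for the rule 'decide 1 when x > threshold'."""
    false_alarm = q_function((threshold - mean0) / sigma)  # decide 1, H0 true
    miss = 1.0 - q_function((threshold - mean1) / sigma)   # decide 0, H1 true
    return prior0 * false_alarm + prior1 * miss

# Scan thresholds from -1 to 3 in steps of 0.05 and keep the best one.
candidates = [i * 0.05 for i in range(-20, 61)]
best = min(candidates, key=error_probability)
```

With equal priors and equal variances, the scan lands on the midpoint between the two means, where the two weighted densities cross.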
Derivation
We'll simplify by assuming that $C_{11} = C_{22} = 0$ (there is zero cost to being correct) and that all other costs are positive.
Think of cost as a piecewise constant function of X.
If we divide X into decision regions, we can compute the expected cost as the cost of being wrong times the probability of a sample falling into that region.
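One way to see the effect of unequal costs is to look at a single measurement and pick the choice with the smallest expected cost given the posterior probabilities there. The sketch below takes that pointwise view, which is an assumption about how to present the idea rather than the derivation on the slides. The indexing convention `cost[i][j]` (cost of choosing i when j is true), the zero-based indices, and the example numbers are all hypothetical.

```python
def expected_cost_decision(posteriors, cost):
    """Pick the choice with the lowest expected cost at one measurement.

    posteriors[j] is P(H_j | x); cost[i][j] is the cost of choosing i
    when j is true (an assumed convention, with zero cost on the diagonal).
    """
    risks = [
        sum(cost[i][j] * posteriors[j] for j in range(len(posteriors)))
        for i in range(len(cost))
    ]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks

posteriors = [0.8, 0.2]
equal_costs = [[0, 1], [1, 0]]
unequal_costs = [[0, 10], [1, 0]]  # wrongly choosing 0 is ten times worse

choice_equal, _ = expected_cost_decision(posteriors, equal_costs)
choice_unequal, risks_unequal = expected_cost_decision(posteriors, unequal_costs)
```

With equal costs the more probable hypothesis wins; with the unequal costs above the less probable hypothesis is chosen because the alternative mistake is far more expensive.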
Alternatively
[Figure: $p_1(x)$ against $x$, with decision regions $R_0$ and $R_1$]
[Figure: $p_1(x)$ against $x$, with decision regions $R_1$ and $R_2$]
Okay, so what?
All of this is great. We now know what to do in a few classic cases if some nice person hands us all of the probability models.
In general we aren't given the models. What do we do?
Density estimation to the rescue.
While we may not have the models, often we do have a collection of labeled measurements, that is, a set of $\{x, H_j\}$.
From these we can estimate the class-conditional densities.
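As one hedged illustration of estimating class-conditional densities from labeled measurements, the sketch below fits a Gaussian to each class by its sample mean and variance. The Gaussian form is an assumption made for simplicity (other estimators, such as histograms or kernel methods, are possible), and the synthetic data, seed, and function names are hypothetical.

```python
import random
from math import exp, pi, sqrt
from statistics import mean, variance

def fit_gaussian(samples):
    """Return (sample mean, sample variance) of a list of scalars."""
    return mean(samples), variance(samples)

def gauss(x, mu, var):
    """Gaussian density with mean mu and variance var, evaluated at x."""
    return exp(-0.5 * (x - mu) ** 2 / var) / sqrt(2 * pi * var)

# Synthetic labeled data {x, H_j}: class 0 centered at 0, class 1 at 2.
rng = random.Random(0)
labeled = [(rng.gauss(0.0, 1.0), 0) for _ in range(2000)]
labeled += [(rng.gauss(2.0, 1.0), 1) for _ in range(2000)]

# One (mean, variance) pair per class, estimated from that class's samples.
params = {h: fit_gaussian([x for x, label in labeled if label == h])
          for h in (0, 1)}

def estimated_density(x, h):
    """Estimated class-conditional density of x under hypothesis h."""
    mu, var = params[h]
    return gauss(x, mu, var)
```

The estimated densities can then stand in for the true models in a classifier, with accuracy that depends on how well the assumed form matches the data and on how many labeled samples are available.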
Important issues will be: