H92pqSb3uH Merged
5.1 INTRODUCTION
The term image transforms usually refers to a class of unitary matrices used for representing images. Just as a one-dimensional signal can be represented by an orthogonal series of basis functions, an image can also be expanded in terms of a discrete set of basis arrays called basis images. These basis images can be generated by unitary matrices. Alternatively, a given $N \times N$ image can be viewed as an $N^2 \times 1$ vector. An image transform provides a set of coordinates or basis vectors for the vector space.
For continuous functions, orthogonal series expansions provide series coefficients which can be used for any further processing or analysis of the functions. For a one-dimensional sequence $\{u(n),\ 0 \le n \le N-1\}$ represented as a vector $u$ of size $N$, a unitary transformation is written as
$$v = Au \;\Rightarrow\; v(k) = \sum_{n=0}^{N-1} a(k,n)\,u(n), \quad 0 \le k \le N-1$$ (5.1)
where $A^{-1} = A^{*T}$ (unitary). This gives
$$u = A^{*T}v \;\Rightarrow\; u(n) = \sum_{k=0}^{N-1} v(k)\,a^{*}(k,n), \quad 0 \le n \le N-1$$ (5.2)
Equation (5.2) can be viewed as a series representation of the sequence $u(n)$. The columns of $A^{*T}$, that is, the vectors $a_k^{*} \triangleq \{a^{*}(k,n),\ 0 \le n \le N-1\}^T$, are called the basis vectors of $A$. Figure 5.1 shows examples of basis vectors of several orthogonal transforms encountered in image processing. The series coefficients $v(k)$ give a representation of the original sequence $u(n)$ and are useful in filtering, data compression, feature extraction, and other analyses.
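The transform pair $v = Au$, $u = A^{*T}v$ and its series interpretation can be checked numerically. The following is a minimal sketch, assuming the unitary DFT matrix as one possible choice of $A$; the helper name `unitary_dft_matrix` and the test sequence are illustrative and not part of the text.

```python
import numpy as np

def unitary_dft_matrix(N):
    """Return the N x N matrix with entries exp(-2j*pi*k*n/N) / sqrt(N)."""
    idx = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

N = 8
A = unitary_dft_matrix(N)          # assumed example of a unitary A
u = np.arange(N, dtype=float)      # arbitrary test sequence u(n)

v = A @ u                          # forward transform: v = A u
u_rec = A.conj().T @ v             # inverse transform: u = A^{*T} v

# Series form of the inverse: u = sum_k v(k) a_k^*, where the basis vector
# a_k^* is the k-th column of A^{*T}, i.e. the conjugate of row k of A.
u_series = sum(v[k] * A[k].conj() for k in range(N))

print(np.allclose(A.conj().T @ A, np.eye(N)))            # unitarity check
print(np.allclose(u_rec, u), np.allclose(u_series, u))   # both inverses agree with u
```

Any other unitary matrix could be substituted for `A` without changing the rest of the sketch.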
5.2 TWO-DIMENSIONAL ORTHOGONAL AND UNITARY TRANSFORMS
In the context of image processing a general orthogonal series expansion for an $N \times N$ image $u(m, n)$ is a pair of transformations of the form
$$v(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} u(m,n)\,a_{k,l}(m,n), \quad 0 \le k, l \le N-1$$ (5.3)
$$u(m,n) = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} v(k,l)\,a_{k,l}^{*}(m,n), \quad 0 \le m, n \le N-1$$ (5.4)
where $\{a_{k,l}(m,n)\}$, called an image transform, is a set of complete orthonormal discrete basis functions satisfying the properties
Orthonormality: $$\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} a_{k,l}(m,n)\,a_{k',l'}^{*}(m,n) = \delta(k-k',\, l-l')$$ (5.5)
Completeness: $$\sum_{k=0}^{N-1}\sum_{l=0}^{N-1} a_{k,l}(m,n)\,a_{k,l}^{*}(m',n') = \delta(m-m',\, n-n')$$ (5.6)
The elements $v(k, l)$ are called the transform coefficients and $V \triangleq \{v(k, l)\}$ is called the transformed image. The orthonormality property assures that any truncated series expansion of the form
$$u_{P,Q}(m,n) \triangleq \sum_{k=0}^{P-1}\sum_{l=0}^{Q-1} v(k,l)\,a_{k,l}^{*}(m,n), \quad P \le N,\ Q \le N$$ (5.7)
will minimize the sum of squares error
$$\sigma_e^2 = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} \left| u(m,n) - u_{P,Q}(m,n) \right|^2$$ (5.8)
when the coefficients $v(k, l)$ are given by (5.3). The completeness property assures that this error will be zero for $P = Q = N$ (Problem 5.1).
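The least-squares property of the truncated expansion can be illustrated with a small numerical experiment. The sketch below assumes a randomly generated real orthonormal basis (obtained from a QR factorization) standing in for $a_{k,l}(m,n)$; the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Hypothetical real orthonormal basis: an orthogonal N^2 x N^2 matrix whose
# rows, reshaped to N x N, play the role of the basis arrays a_{k,l}(m, n).
Qmat, _ = np.linalg.qr(rng.standard_normal((N * N, N * N)))
basis = Qmat.T.reshape(N, N, N, N)           # basis[k, l, m, n]

u = rng.standard_normal((N, N))
v = np.einsum('klmn,mn->kl', basis, u)       # analysis sum over m and n

def truncated_expansion(v, basis, P, Q):
    """Partial synthesis sum over k < P, l < Q (basis is real, so a* = a)."""
    return np.einsum('kl,klmn->mn', v[:P, :Q], basis[:P, :Q])

def squared_error(u, u_hat):
    return float(np.sum(np.abs(u - u_hat) ** 2))

P, Q = 2, 3
err_opt = squared_error(u, truncated_expansion(v, basis, P, Q))

# Perturbing the retained coefficients can only increase the error.
v_bad = v + 0.1 * rng.standard_normal(v.shape)
err_bad = squared_error(u, truncated_expansion(v_bad, basis, P, Q))

# Keeping all N^2 terms reproduces the image.
err_full = squared_error(u, truncated_expansion(v, basis, N, N))
print(err_opt <= err_bad, np.isclose(err_full, 0.0))
```

In this sketch the minimum error equals the energy of the discarded coefficients, which is a direct consequence of orthonormality.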
Separable Unitary Transforms
The number of multiplications and additions required to compute the transform coefficients $v(k, l)$ using (5.3) is $O(N^4)$, which is quite excessive for practical-size images. The dimensionality of the problem is reduced to $O(N^3)$ when the transform is restricted to be separable, that is,
$$a_{k,l}(m,n) = a_k(m)\,b_l(n) \triangleq a(k,m)\,b(l,n)$$ (5.9)
where $\{a_k(m),\ k = 0, \ldots, N-1\}$ and $\{b_l(n),\ l = 0, \ldots, N-1\}$ are one-dimensional complete orthonormal sets of basis vectors. Imposition of (5.5) and (5.6) shows that $A \triangleq \{a(k, m)\}$ and $B \triangleq \{b(l, n)\}$ should be unitary matrices themselves, for example,
$$AA^{*T} = A^{T}A^{*} = I$$ (5.10)
Often one chooses $B$ to be the same as $A$ so that (5.3) and (5.4) reduce to
$$v(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} a(k,m)\,u(m,n)\,a(l,n) \;\leftrightarrow\; V = AUA^{T}$$ (5.11)
$$u(m,n) = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} a^{*}(k,m)\,v(k,l)\,a^{*}(l,n) \;\leftrightarrow\; U = A^{*T}VA^{*}$$ (5.12)
For an $M \times N$ rectangular image, the transform pair is
$$V = A_M U A_N^{T}$$ (5.13)
$$U = A_M^{*T} V A_N^{*}$$ (5.14)
where $A_M$ and $A_N$ are $M \times M$ and $N \times N$ unitary matrices, respectively. These are called two-dimensional separable transformations. Unless otherwise stated, we will always imply the preceding separability when we mention two-dimensional unitary transformations. Note that (5.11) can be written as
$$V^{T} = A[AU]^{T}$$ (5.15)
which means (5.11) can be performed by first transforming each column of $U$ and then transforming each row of the result to obtain the rows of $V$.
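The column-then-row evaluation of $V = AUA^{T}$ can be compared against the four-index sum directly. This is a minimal sketch under the assumption of a randomly generated unitary $A$ (the Q factor of a complex random matrix); `transform_direct` and `transform_separable` are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5

# Hypothetical unitary A: the Q factor of a random complex matrix.
A, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
U = rng.standard_normal((N, N))

def transform_direct(A, U):
    """v(k,l) = sum_m sum_n a(k,m) u(m,n) a(l,n), about N^4 multiply-adds."""
    N = U.shape[0]
    V = np.zeros((N, N), dtype=complex)
    for k in range(N):
        for l in range(N):
            for m in range(N):
                for n in range(N):
                    V[k, l] += A[k, m] * U[m, n] * A[l, n]
    return V

def transform_separable(A, U):
    """V^T = A (A U)^T: transform the columns, then the rows, about 2 N^3."""
    columns_done = A @ U                   # 1-D transform of each column of U
    return (A @ columns_done.T).T          # 1-D transform of each row of A U

V = transform_separable(A, U)
U_rec = A.conj().T @ V @ A.conj()          # inverse: U = A^{*T} V A^*

print(np.allclose(V, transform_direct(A, U)), np.allclose(U_rec, U))
```

The explicit loops are kept only to mirror the definition; the separable form is what one would use in practice.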
Basis Images
Let $a_k^{*}$ denote the $k$th column of $A^{*T}$. Define the matrices
$$A_{k,l}^{*} = a_k^{*}\,a_l^{*T}$$ (5.16)
and the matrix inner product of two $N \times N$ matrices $F$ and $G$ as
$$\langle F, G \rangle = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\,g^{*}(m,n)$$ (5.17)
Then (5.4) and (5.3) give a series representation for the image as
$$U = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} v(k,l)\,A_{k,l}^{*}$$ (5.18)
$$v(k,l) = \langle U, A_{k,l}^{*} \rangle$$ (5.19)
Equation (5.18) expresses any image $U$ as a linear combination of the $N^2$ matrices $A_{k,l}^{*}$, $k, l = 0, \ldots, N-1$, which are called the basis images. Figure 5.2 shows $8 \times 8$ basis images for the same set of transforms in Fig. 5.1. The transform coefficient $v(k, l)$ is simply the inner product of the $(k, l)$th basis image with the given image. It is also called the projection of the image on the $(k, l)$th basis image. Therefore, any $N \times N$ image can be expanded in a series using a complete set of $N^2$ basis images. If $U$ and $V$ are mapped into vectors $\mathbf{u}$ and $\mathbf{v}$ by row ordering, then (5.11), (5.12), and (5.16) yield (see Section 2.8, on Kronecker products)
$$\mathbf{v} = (A \otimes A)\,\mathbf{u} \triangleq \mathcal{A}\mathbf{u}$$ (5.20)
$$\mathbf{u} = (A \otimes A)^{*T}\mathbf{v} = \mathcal{A}^{*T}\mathbf{v}$$ (5.21)
where
$$\mathcal{A} \triangleq A \otimes A$$ (5.22)
is a unitary matrix. Thus, given any unitary transform $A$, a two-dimensional separable unitary transformation can be defined via (5.20) or (5.13).
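The basis-image expansion and its Kronecker-product form can be verified numerically. The sketch below assumes a randomly generated unitary $A$ and relies on NumPy's default row-major reshape to play the role of row ordering; the names `basis_image`, `inner`, and `script_A` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
A, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
U = rng.standard_normal((N, N))
V = A @ U @ A.T

def basis_image(A, k, l):
    """A*_{k,l} = a_k^* a_l^{*T}, with a_k^* the conjugate of row k of A."""
    return np.outer(A[k].conj(), A[l].conj())

def inner(F, G):
    """Matrix inner product <F, G> = sum_{m,n} f(m,n) g*(m,n)."""
    return np.sum(F * G.conj())

# Each coefficient is the projection of U on the corresponding basis image.
V_proj = np.array([[inner(U, basis_image(A, k, l)) for l in range(N)]
                   for k in range(N)])

# The image is the coefficient-weighted sum of the N^2 basis images.
U_sum = sum(V[k, l] * basis_image(A, k, l) for k in range(N) for l in range(N))

# Row ordering (NumPy's default C order) turns V = A U A^T into a single
# matrix-vector product with the Kronecker product of A with itself.
script_A = np.kron(A, A)
v_vec = script_A @ U.reshape(-1)

print(np.allclose(V_proj, V), np.allclose(U_sum, U), np.allclose(v_vec, V.reshape(-1)))
```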
Example 5.1
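For the given orthogonal matrix $A$ and image $U$
$$A = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$$
the transformed image, obtained according to (5.11), is
$$V = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 4 & 6 \\ -2 & -2 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 5 & -1 \\ -2 & 0 \end{pmatrix}$$
To obtain the basis images, we find the outer product of the columns of $A^{*T}$, which gives
$$A_{0,0}^{*} = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$
and similarly
$$A_{0,1}^{*} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix} = A_{1,0}^{*T}, \qquad A_{1,1}^{*} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$$
The inverse transformation gives
$$A^{*T}VA^{*} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 5 & -1 \\ -2 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 3 & -1 \\ 7 & -1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$$
which is $U$, the original image.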
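The arithmetic of Example 5.1 is easy to reproduce. The following sketch uses the matrix $A$ and image $U$ of the example; the dictionary `basis` is an illustrative container for the four basis images.

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
U = np.array([[1.0, 2.0], [3.0, 4.0]])

V = A @ U @ A.T                         # forward transform V = A U A^T

# A is real, so the columns of A^{*T} are simply the rows of A, and each
# basis image is the outer product of two such rows.
basis = {(k, l): np.outer(A[k], A[l]) for k in range(2) for l in range(2)}

U_rec = A.T @ V @ A                     # inverse transform for real A

print(np.round(V, 6))
print(np.round(U_rec, 6))
```

Summing `V[k, l] * basis[(k, l)]` over the four index pairs gives the same reconstruction as the matrix form of the inverse.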
Dimensionality of image transforms can also be studied in terms of their Kronecker product separability. An arbitrary one-dimensional transformation
$$y = \mathcal{A}x$$ (5.23)
is called separable if
$$\mathcal{A} = A_1 \otimes A_2$$ (5.24)
This is because (5.23) can be reduced to the separable two-dimensional transformation
$$Y = A_1 X A_2^{T}$$ (5.25)
where $X$ and $Y$ are matrices that map into vectors $x$ and $y$, respectively, by row ordering. If $\mathcal{A}$ is $N^2 \times N^2$ and $A_1$, $A_2$ are $N \times N$, then the number of operations required for implementing (5.25) reduces from $N^4$ to about $2N^3$. The number of operations can be reduced further if $A_1$ and $A_2$ are also separable. Image transforms such as discrete Fourier, sine, cosine, Hadamard, Haar, and Slant can be factored as Kronecker products of several smaller-sized matrices, which leads to fast algorithms for their implementation (see Problem 5.2). In the context of image processing such matrices are also called fast image transforms.
Dimensionality of Image Transforms
The $2N^3$ computations for $V$ can also be reduced by restricting the choice of $A$ to the fast transforms, whose matrix structure allows a factorization of the type
$$A = A^{(1)}A^{(2)} \cdots A^{(p)}$$ (5.26)
where $A^{(i)}$, $i = 1, \ldots, p$ ($p \ll N$) are matrices with just a few nonzero entries (say $r$, with $r \ll N$). Thus, a multiplication of the type $y = Ax$ is accomplished in $rpN$ operations. For Fourier, sine, cosine, Hadamard, Slant, and several other transforms, $p = \log_2 N$, and the operations reduce to the order of $N \log_2 N$ (or $N^2 \log_2 N$ for $N \times N$ images). Depending on the actual transform, one operation can be defined as one multiplication and one addition or subtraction, as in the Fourier transform, or one addition or subtraction, as in the Hadamard transform.
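As one illustration of a sparse factorization, the Hadamard transform of size $N = 2^p$ can be applied in $p$ butterfly stages, each costing $N$ additions or subtractions. The sketch below is an assumed implementation for natural (Kronecker) ordering with $1/\sqrt{N}$ normalization; `hadamard_matrix` and `fast_hadamard` are hypothetical names.

```python
import numpy as np

def hadamard_matrix(p):
    """Orthonormal 2^p x 2^p Hadamard matrix built as a p-fold Kronecker product."""
    H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    H = np.array([[1.0]])
    for _ in range(p):
        H = np.kron(H, H1)
    return H

def fast_hadamard(x):
    """Apply the same transform in p = log2(N) butterfly stages.

    Each stage corresponds to a sparse factor with two nonzero entries per
    row, so it costs N additions or subtractions. N must be a power of two.
    """
    y = np.asarray(x, dtype=float).copy()
    N = y.size
    half = 1
    while half < N:
        blocks = y.reshape(-1, 2, half)            # pairs of adjacent half-blocks
        upper = blocks[:, 0, :] + blocks[:, 1, :]
        lower = blocks[:, 0, :] - blocks[:, 1, :]
        y = np.stack([upper, lower], axis=1).reshape(N)
        half *= 2
    return y / np.sqrt(N)

p = 4
N = 2 ** p
x = np.random.default_rng(0).standard_normal(N)
print(np.allclose(fast_hadamard(x), hadamard_matrix(p) @ x))
print("dense multiply-adds:", N * N, " fast additions/subtractions:", N * p)
```

The dense product needs $N^2$ operations, whereas the staged version needs $N \log_2 N$, in line with the operation counts quoted above.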
Transform Frequency
For a one-dimensional signal $f(x)$, frequency is defined by the Fourier domain variable $\xi$. It is related to the number of zero crossings of the real or imaginary part of the basis function $\exp\{j2\pi\xi x\}$. This concept can be generalized to arbitrary unitary transforms. Let the rows of a unitary matrix $A$ be arranged so that the number of zero crossings increases with the row number. Then in the transformation
$$y = Ax$$
the elements $y(k)$ are ordered according to increasing wave number or transform frequency. In the sequel any reference to frequency will imply the transform frequency, that is, discrete Fourier frequency, cosine frequency, and so on. The term spatial frequency generally refers to the continuous Fourier transform frequency and is not the same as the discrete Fourier frequency. In the case of the Hadamard transform, a term called sequency is also used. It should be noted that this concept of frequency is useful only on a relative basis for a particular transform. A low-frequency term of one transform could contain the high-frequency harmonics of another transform.
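Ordering the rows of a transform matrix by zero crossings can be sketched for the Hadamard matrix, where the count is the sequency. The helper names below are illustrative, and the $8 \times 8$ Hadamard matrix in natural (Kronecker) ordering is an assumed starting point.

```python
import numpy as np

def hadamard_matrix(p):
    """Orthonormal 2^p x 2^p Hadamard matrix in natural (Kronecker) ordering."""
    H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    H = np.array([[1.0]])
    for _ in range(p):
        H = np.kron(H, H1)
    return H

def zero_crossings(row):
    """Number of sign changes between consecutive entries of a basis vector."""
    signs = np.sign(row)
    return int(np.sum(signs[:-1] * signs[1:] < 0))

A = hadamard_matrix(3)                               # N = 8
crossings = [zero_crossings(r) for r in A]           # sequency of each row
order = np.argsort(crossings)
A_ordered = A[order]                                 # rows by increasing sequency

x = np.array([4.0, 3.0, 2.0, 1.0, 1.0, 2.0, 3.0, 4.0])
y = A_ordered @ x                                    # y(k) indexed by transform frequency

print(crossings)
print([zero_crossings(r) for r in A_ordered])
```

Reordering rows of a unitary matrix leaves it unitary, so the reordered transform is still invertible by its conjugate transpose.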
Another important consideration in selecting a transform is its performance in filtering and data compression of images based on the mean square criterion. The Karhunen-Loeve transform (KLT) is known to be optimum with respect to this criterion and is discussed in Section 5.11.
5.3 PROPERTIES OF UNITARY TRANSFORMS
Energy Conservation and Rotation
In the unitary transformation
$$v = Au$$
$$\|v\|^2 = \|u\|^2$$ (5.27)
This is easily proven by noting that
$$\|v\|^2 \triangleq \sum_{k=0}^{N-1}|v(k)|^2 = v^{*T}v = u^{*T}A^{*T}Au = u^{*T}u = \sum_{n=0}^{N-1}|u(n)|^2 \triangleq \|u\|^2$$
Thus a unitary transformation preserves the signal energy or, equivalently, the length of the vector $u$ in the $N$-dimensional vector space. This means every unitary transformation is simply a rotation of the vector $u$ in the $N$-dimensional vector space. Alternatively, a unitary transformation is a rotation of the basis coordinates and the components of $v$ are the projections of $u$ on the new basis (see Problem 5.4). Similarly, for the two-dimensional unitary transformations such as (5.3), (5.4), and (5.11) to (5.14), it can be proven that
$$\sum_{m=0}^{N-1}\sum_{n=0}^{N-1}|u(m,n)|^2 = \sum_{k=0}^{N-1}\sum_{l=0}^{N-1}|v(k,l)|^2$$ (5.28)
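Energy conservation in one and two dimensions can be confirmed with a few lines. This is a minimal sketch assuming a randomly generated unitary $A$; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Hypothetical unitary A: the Q factor of a random complex matrix.
A, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))

# One-dimensional case: ||v||^2 = ||u||^2 for v = A u.
u = rng.standard_normal(N)
v = A @ u
energy_u = np.sum(np.abs(u) ** 2)
energy_v = np.sum(np.abs(v) ** 2)

# Two-dimensional separable case: V = A U A^T.
U = rng.standard_normal((N, N))
V = A @ U @ A.T
energy_U = np.sum(np.abs(U) ** 2)
energy_V = np.sum(np.abs(V) ** 2)

print(np.isclose(energy_u, energy_v), np.isclose(energy_U, energy_V))
```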
Energy Compaction and Variances of Transform Coefficients
If $\mu_u$ and $R_u$ denote the mean and covariance of a random vector $u$, then the corresponding quantities for the transformed vector $v = Au$ are given by
$$\mu_v \triangleq E[v] = E[Au] = A\mu_u$$ (5.29)
$$R_v = E[(v - \mu_v)(v - \mu_v)^{*T}] = A\,E[(u - \mu_u)(u - \mu_u)^{*T}]\,A^{*T} = AR_uA^{*T}$$ (5.30)
The transform coefficient variances are given by the diagonal elements of $R_v$, that is
$$\sigma_v^2(k) = [R_v]_{k,k} = [AR_uA^{*T}]_{k,k}$$ (5.31)
Since $A$ is unitary, it follows that
$$\sum_{k=0}^{N-1}|\mu_v(k)|^2 = \mu_v^{*T}\mu_v = \mu_u^{*T}A^{*T}A\mu_u = \sum_{n=0}^{N-1}|\mu_u(n)|^2$$ (5.32)
$$\sum_{k=0}^{N-1}\sigma_v^2(k) = \mathrm{Tr}[AR_uA^{*T}] = \mathrm{Tr}[R_u] = \sum_{n=0}^{N-1}\sigma_u^2(n)$$ (5.33)
$$\sum_{k=0}^{N-1}E[|v(k)|^2] = \sum_{n=0}^{N-1}E[|u(n)|^2]$$ (5.34)
The average energy $E[|v(k)|^2]$ of the transform coefficients $v(k)$ tends to be unevenly distributed, although it may be evenly distributed for the input sequence $u(n)$. For a two-dimensional random field $u(m, n)$ whose mean is $\mu_u(m, n)$ and covariance is $r(m, n; m', n')$, its transform coefficients $v(k, l)$ satisfy the properties
$$\mu_v(k,l) = \sum_{m}\sum_{n} a(k,m)\,a(l,n)\,\mu_u(m,n)$$ (5.35)
$$\sigma_v^2(k,l) = E[|v(k,l) - \mu_v(k,l)|^2] = \sum_{m}\sum_{n}\sum_{m'}\sum_{n'} a(k,m)\,a(l,n)\,r(m,n;m',n')\,a^{*}(k,m')\,a^{*}(l,n')$$ (5.36)
If the covariance of $u(m, n)$ is separable, that is
$$r(m,n;m',n') = r_1(m,m')\,r_2(n,n')$$ (5.37)
then the variances of the transform coefficients can be written as a separable product
$$\sigma_v^2(k,l) = \sigma_1^2(k)\,\sigma_2^2(l) \triangleq [AR_1A^{*T}]_{k,k}\,[AR_2A^{*T}]_{l,l}$$ (5.38)
where $R_1 = \{r_1(m, m')\}$ and $R_2 = \{r_2(n, n')\}$.
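The variance relations can be explored numerically. The sketch below assumes the unitary DFT as the transform and a first-order Markov covariance $\rho^{|m-m'|}$ as the test model; both choices, and all function names, are illustrative assumptions.

```python
import numpy as np

def unitary_dft_matrix(N):
    idx = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

def markov_covariance(N, rho):
    """Assumed test covariance r(m, m') = rho^|m - m'| with unit variances."""
    idx = np.arange(N)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def coefficient_variances(A, R):
    """Diagonal of R_v = A R_u A^{*T}."""
    return np.real(np.diag(A @ R @ A.conj().T))

N = 8
A = unitary_dft_matrix(N)
R1 = markov_covariance(N, 0.95)
R2 = markov_covariance(N, 0.80)

var1 = coefficient_variances(A, R1)
var2 = coefficient_variances(A, R2)

# Separable 2-D covariance: with row ordering the covariance matrix is
# kron(R1, R2) and the transform matrix is kron(A, A).
script_A = np.kron(A, A)
R_v2d = script_A @ np.kron(R1, R2) @ script_A.conj().T
var_2d = np.real(np.diag(R_v2d)).reshape(N, N)

print(np.isclose(var1.sum(), np.trace(R1)))        # total variance is preserved
print(np.allclose(var_2d, np.outer(var1, var2)))   # separable product of variances
print(np.round(var1, 3))                           # variances of the coefficients
```

Although every input sample has unit variance in this model, the printed coefficient variances differ widely, which is the uneven energy distribution described above.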
Decorrelation
When the input vector elements are highly correlated, the transform coefficients tend to be uncorrelated. This means the off-diagonal terms of the covariance matrix $R_v$ tend to become small compared to the diagonal elements.
With respect to the preceding two properties, the KL transform is optimum, that is, it packs the maximum average energy in a given number of
transform coefficients while completely decorrelating them. These properties are presented in greater detail in Section 5.11.
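Decorrelation and the optimality of the KL transform can be illustrated on a small covariance matrix. The sketch assumes a Markov covariance with $\rho = 0.95$ and uses the orthonormal DCT as an example of a fixed transform; the KL transform is formed from the eigenvectors of $R_u$. All names are illustrative.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT matrix, used here as an example of a fixed transform."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)
    return C

def offdiag_fraction(R):
    """Fraction of the squared entries of R lying off the diagonal."""
    total = np.sum(np.abs(R) ** 2)
    return float((total - np.sum(np.abs(np.diag(R)) ** 2)) / total)

N, rho = 8, 0.95
idx = np.arange(N)
R_u = rho ** np.abs(idx[:, None] - idx[None, :])   # assumed highly correlated input

C = dct_matrix(N)
R_dct = C @ R_u @ C.T

# KL transform: rows of A_kl are the eigenvectors of R_u, largest eigenvalue first.
eigvals, eigvecs = np.linalg.eigh(R_u)
A_kl = eigvecs[:, ::-1].T
R_kl = A_kl @ R_u @ A_kl.T

M = 2   # number of retained coefficients
top_dct = np.sort(np.diag(R_dct))[::-1][:M].sum()
top_kl = np.diag(R_kl)[:M].sum()

print(offdiag_fraction(R_u), offdiag_fraction(R_dct), offdiag_fraction(R_kl))
print(top_kl >= top_dct - 1e-9)    # KL packs at least as much energy into M terms
```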
Other Properties
Unitary transforms have other interesting properties. For example, the determinant and the eigenvalues of a unitary matrix have unity magnitude. Also, the entropy of a random vector is preserved under a unitary transformation. Since entropy is a measure of average information, this means information is preserved under a unitary transformation.
The two-dimensional DFT of an $N \times N$ image $\{u(m, n)\}$ is a separable transform defined as
$$v(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} u(m,n)\,W_N^{km}\,W_N^{ln}, \quad 0 \le k, l \le N-1$$ (5.61)
and the inverse transform is
$$u(m,n) = \frac{1}{N^2}\sum_{k=0}^{N-1}\sum_{l=0}^{N-1} v(k,l)\,W_N^{-km}\,W_N^{-ln}, \quad 0 \le m, n \le N-1$$ (5.62)
where $W_N \triangleq \exp\{-j2\pi/N\}$. The two-dimensional unitary DFT pair is defined as
$$v(k,l) = \frac{1}{N}\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} u(m,n)\,W_N^{km}\,W_N^{ln}, \quad 0 \le k, l \le N-1$$ (5.63)
$$u(m,n) = \frac{1}{N}\sum_{k=0}^{N-1}\sum_{l=0}^{N-1} v(k,l)\,W_N^{-km}\,W_N^{-ln}, \quad 0 \le m, n \le N-1$$ (5.64)
In matrix notation this becomes
$$V = FUF$$ (5.65)
$$U = F^{*}VF^{*}$$ (5.66)
where $F$ is the $N \times N$ unitary DFT matrix. If $U$ and $V$ are mapped into row-ordered vectors $\mathbf{u}$ and $\mathbf{v}$, respectively, then
$$\mathbf{v} = \mathcal{F}\mathbf{u}, \qquad \mathbf{u} = \mathcal{F}^{*}\mathbf{v}$$ (5.67)
$$\mathcal{F} = F \otimes F$$ (5.68)
The $N^2 \times N^2$ matrix $\mathcal{F}$ represents the $N \times N$ two-dimensional unitary DFT. Figure 5.6 shows an original image and the magnitude and phase components of its unitary DFT. Figure 5.7 shows magnitudes of the unitary DFTs of two other images.
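The matrix and Kronecker forms of the two-dimensional unitary DFT can be compared with a library FFT. The sketch below assumes NumPy's `np.fft.fft2`, which uses the same exponent sign as $W_N = \exp\{-j2\pi/N\}$ and applies no scaling in the forward direction; `unitary_dft_matrix` is an illustrative helper.

```python
import numpy as np

def unitary_dft_matrix(N):
    """N x N matrix with entries W_N^{kn} / sqrt(N), W_N = exp(-2j*pi/N)."""
    idx = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

N = 4
F = unitary_dft_matrix(N)
U = np.random.default_rng(0).standard_normal((N, N))

V = F @ U @ F                         # V = F U F (F is symmetric)
U_rec = F.conj() @ V @ F.conj()       # U = F* V F*

# Row-ordered form: v = (F kron F) u.
script_F = np.kron(F, F)
v_vec = script_F @ U.reshape(-1)

# Dividing the unscaled library transform by N gives the unitary version.
print(np.allclose(V, np.fft.fft2(U) / N))
print(np.allclose(U_rec, U), np.allclose(v_vec, V.reshape(-1)))
```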
The properties of the two-dimensional unitary DFT are quite similar to the one-dimensional case and are summarized next.
Symmetric, unitary.
$$\mathcal{F}^{T} = \mathcal{F}, \qquad \mathcal{F}^{-1} = \mathcal{F}^{*} = F^{*} \otimes F^{*}$$ (5.69)
Periodic extensions.
$$v(k+N,\, l+N) = v(k,l) \ \ \forall k, l, \qquad u(m+N,\, n+N) = u(m,n) \ \ \forall m, n$$ (5.70)
Sampled Fourier spectrum. If $\tilde u(m, n) = u(m, n)$ for $0 \le m, n \le N-1$, and $\tilde u(m, n) = 0$ otherwise, then
$$\tilde U\!\left(\frac{2\pi k}{N}, \frac{2\pi l}{N}\right) = \mathrm{DFT}\{u(m,n)\} = v(k,l)$$ (5.71)
where $\tilde U(\omega_1, \omega_2)$ is the Fourier transform of $\tilde u(m, n)$.
Fast transform. Since the two-dimensional DFT is separable, the transformation of (5.65) is equivalent to $2N$ one-dimensional unitary DFTs, each of which can be performed in $O(N \log_2 N)$ operations via the FFT. Hence the total number of operations is $O(N^2 \log_2 N)$.
Conjugate symmetry. The DFT and unitary DFT of real images exhibit conjugate symmetry, that is,
$$v\!\left(\frac{N}{2} \pm k,\, \frac{N}{2} \pm l\right) = v^{*}\!\left(\frac{N}{2} \mp k,\, \frac{N}{2} \mp l\right), \quad 0 \le k, l \le \frac{N}{2} - 1$$ (5.72)
or
$$v(k,l) = v^{*}(N-k,\, N-l), \quad 0 \le k, l \le N-1$$ (5.73)
From this, it can be shown that $v(k, l)$ has only $N^2$ independent real elements. For example, the samples in the shaded region of Fig. 5.8 determine the complete DFT or unitary DFT (see Problem 5.10).
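Conjugate symmetry of the DFT of a real image can be checked numerically. This is a minimal sketch assuming a random real test image, an even $N$, and indices taken modulo $N$ (consistent with the periodic extension of $v$); variable names are illustrative.

```python
import numpy as np

N = 8
U = np.random.default_rng(0).standard_normal((N, N))   # real-valued test image
V = np.fft.fft2(U) / N                                  # unitary DFT of U

# v(k, l) = v*(N - k, N - l), with indices taken modulo N.
neg = (-np.arange(N)) % N
symmetric = np.allclose(V, V[np.ix_(neg, neg)].conj())

# Equivalent form centred on (N/2, N/2), checked for 0 <= k, l <= N/2 - 1.
h = N // 2
centred = all(
    np.isclose(V[h + k, h + l], np.conj(V[h - k, h - l])) and
    np.isclose(V[h + k, h - l], np.conj(V[h - k, h + l]))
    for k in range(h) for l in range(h)
)

# The samples that are their own mirror image must be real.
self_conjugate = [V[0, 0], V[0, h], V[h, 0], V[h, h]]
print(symmetric, centred, np.allclose(np.imag(self_conjugate), 0.0))
```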
Basis images. The basis images are given by definition [see (5.16) and (5.53)]:
$$A_{k,l}^{*} = \phi_k\,\phi_l^{T} = \frac{1}{N}\left\{W_N^{-(km+ln)},\ 0 \le m, n \le N-1\right\}, \quad 0 \le k, l \le N-1$$ (5.74)
Two-dimensional circular convolution theorem. The DFT of the two-dimensional circular convolution of two arrays is the product of their DFTs. Two-dimensional circular convolution of two $N \times N$ arrays $h(m, n)$ and $u_1(m, n)$ is defined as
$$u_2(m,n) = \sum_{m'=0}^{N-1}\sum_{n'=0}^{N-1} h(m-m',\, n-n')_c\,u_1(m',n'), \quad 0 \le m, n \le N-1$$ (5.75)
where
$$h(m,n)_c \triangleq h(m \text{ modulo } N,\ n \text{ modulo } N)$$ (5.76)
The theorem follows from the identity
$$\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} h(m-m',\, n-n')_c\,W_N^{(mk+nl)} = W_N^{(m'k+n'l)}\sum_{i=-m'}^{N-1-m'}\ \sum_{j=-n'}^{N-1-n'} h(i,j)_c\,W_N^{(ik+jl)} = W_N^{(m'k+n'l)}\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} h(m,n)\,W_N^{(mk+nl)}$$ (5.77)
$$= W_N^{(m'k+n'l)}\,\mathrm{DFT}\{h(m,n)\}_N$$
where we have used (5.76). Taking the DFT of both sides of (5.75) and using the preceding result, we obtain†
$$\mathrm{DFT}\{u_2(m,n)\}_N = \mathrm{DFT}\{h(m,n)\}_N\ \mathrm{DFT}\{u_1(m,n)\}_N$$ (5.78)
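The circular convolution theorem can be verified by comparing a direct evaluation of the double sum with a product of DFTs. The sketch below assumes random real test arrays and uses `np.fft.fft2` for the non-unitary DFT; `circular_convolve2d` is an illustrative name.

```python
import numpy as np

def circular_convolve2d(h, u1):
    """u2(m,n) = sum over (m2,n2) of h((m-m2) mod N, (n-n2) mod N) * u1(m2,n2)."""
    N = h.shape[0]
    u2 = np.zeros((N, N))
    for m in range(N):
        for n in range(N):
            for m2 in range(N):
                for n2 in range(N):
                    u2[m, n] += h[(m - m2) % N, (n - n2) % N] * u1[m2, n2]
    return u2

rng = np.random.default_rng(0)
N = 6
h = rng.standard_normal((N, N))
u1 = rng.standard_normal((N, N))

u2_direct = circular_convolve2d(h, u1)             # about N^4 operations

# Product of the two DFTs, followed by an inverse DFT.
spectrum_product = np.fft.fft2(h) * np.fft.fft2(u1)
u2_fft = np.real(np.fft.ifft2(spectrum_product))

print(np.allclose(np.fft.fft2(u2_direct), spectrum_product))
print(np.allclose(u2_direct, u2_fft))
```

The real part is taken only to discard rounding-level imaginary residue, since both inputs are real in this sketch.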
From this and the fast transform property, it follows that an $N \times N$ circular convolution can be performed in $O(N^2 \log_2 N)$ operations. This property is also useful in calculating two-dimensional convolutions such as
$$x_3(m,n) = \sum_{m'=0}^{M-1}\sum_{n'=0}^{M-1} x_2(m-m',\, n-n')\,x_1(m',n')$$ (5.79)
where $x_1(m, n)$ and $x_2(m, n)$ are assumed to be zero for $m, n \notin [0, M-1]$. The region of support for the result $x_3(m, n)$ is $\{0 \le m, n \le 2M-2\}$. Let $N \ge 2M-1$ and define $N \times N$ arrays
$$\tilde h(m,n) \triangleq \begin{cases} x_2(m,n), & 0 \le m, n \le M-1 \\ 0, & \text{otherwise} \end{cases}$$ (5.80)
$$\tilde u_1(m,n) \triangleq \begin{cases} x_1(m,n), & 0 \le m, n \le M-1 \\ 0, & \text{otherwise} \end{cases}$$ (5.81)
Evaluating the circular convolution of $\tilde h(m, n)$ and $\tilde u_1(m, n)$ according to (5.75), it can be seen with the aid of Fig. 5.9 that
$$x_3(m,n) = u_2(m,n), \quad 0 \le m, n \le 2M-2$$ (5.82)
This means the two-dimensional linear convolution of (5.79) can be performed in $O(N^2 \log_2 N)$ operations.
†We denote $\mathrm{DFT}\{x(m, n)\}_N$ as the two-dimensional DFT of an $N \times N$ array $x(m, n)$, $0 \le m, n \le N-1$.
Block circulant operations. Dividing both sides of (5.77) by $N$ and using the definition of Kronecker product, we obtain
$$(F \otimes F)\,\mathcal{H} = \mathcal{D}\,(F \otimes F)$$ (5.83)
where $\mathcal{H}$ is doubly circulant and $\mathcal{D}$ is diagonal, whose elements are given by
$$[\mathcal{D}]_{kN+l,\,kN+l} \triangleq d_{k,l} = \mathrm{DFT}\{h(m,n)\}_N, \quad 0 \le k, l \le N-1$$ (5.84)
Eqn. (5.83) can be written as $\mathcal{F}\mathcal{H} = \mathcal{D}\mathcal{F}$ or
$$\mathcal{F}\mathcal{H}\mathcal{F}^{*} = \mathcal{D}$$ (5.85)
that is, a doubly block circulant matrix is diagonalized by the two-dimensional unitary DFT. From (5.84) and the fast transform property, we conclude that a doubly block circulant matrix can be diagonalized in $O(N^2 \log_2 N)$ operations. The eigenvalues of $\mathcal{H}$, given by the two-dimensional DFT of $h(m, n)$, are the same as operating $N\mathcal{F}$ on the first column of $\mathcal{H}$. This is because the elements of the first column of $\mathcal{H}$ are the elements $h(m, n)$, mapped by row ordering.
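The diagonalization of a doubly block circulant matrix can be checked on a small example. The sketch below builds the $N^2 \times N^2$ matrix explicitly from an assumed random array $h(m, n)$ using row ordering; `doubly_block_circulant` and `script_F` are illustrative names.

```python
import numpy as np

def doubly_block_circulant(h):
    """N^2 x N^2 matrix with entry h((m-m2) mod N, (n-n2) mod N) at
    row m*N + n and column m2*N + n2 (row ordering)."""
    N = h.shape[0]
    m = np.arange(N).reshape(N, 1, 1, 1)
    n = np.arange(N).reshape(1, N, 1, 1)
    m2 = np.arange(N).reshape(1, 1, N, 1)
    n2 = np.arange(N).reshape(1, 1, 1, N)
    return h[(m - m2) % N, (n - n2) % N].reshape(N * N, N * N)

N = 4
h = np.random.default_rng(0).standard_normal((N, N))
H = doubly_block_circulant(h)

idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)   # unitary DFT matrix
script_F = np.kron(F, F)

# script_F is symmetric and unitary, so its inverse is its complex conjugate.
D = script_F @ H @ script_F.conj()
d = np.fft.fft2(h).reshape(-1)              # non-unitary DFT of h in row order

print(np.allclose(D, np.diag(d)))                  # H is diagonalized
print(np.allclose(N * script_F @ H[:, 0], d))      # eigenvalues from the first column
```

Forming `H` explicitly is only practical for small `N`; the point of the property is that the eigenvalues can be obtained from a single FFT of `h`.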
Block Toeplitz operations. Our discussion on linear convolution implies that any doubly block Toeplitz matrix operation can be embedded into a doubly block circulant operation, which, in turn, can be implemented using the two-dimensional unitary DFT.
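The embedding rests on the zero-padding argument used for linear convolution: two $M \times M$ arrays are padded to $N \times N$ with $N \ge 2M - 1$, convolved circularly via the DFT, and the leading $(2M-1) \times (2M-1)$ samples are kept. The following is a minimal sketch of that procedure with illustrative function names and random test arrays.

```python
import numpy as np

def linear_convolve2d(x2, x1):
    """Direct linear convolution of two M x M arrays; output is (2M-1) x (2M-1)."""
    M = x1.shape[0]
    x3 = np.zeros((2 * M - 1, 2 * M - 1))
    for m2 in range(M):
        for n2 in range(M):
            # Each sample of x1 adds a shifted, scaled copy of x2.
            x3[m2:m2 + M, n2:n2 + M] += x1[m2, n2] * x2
    return x3

def linear_convolve2d_fft(x2, x1, N=None):
    """Same result through a circular convolution of zero-padded N x N arrays."""
    M = x1.shape[0]
    if N is None:
        N = 2 * M - 1                      # any N >= 2M - 1 avoids wrap-around
    h_pad = np.zeros((N, N))
    u_pad = np.zeros((N, N))
    h_pad[:M, :M] = x2
    u_pad[:M, :M] = x1
    u2 = np.real(np.fft.ifft2(np.fft.fft2(h_pad) * np.fft.fft2(u_pad)))
    return u2[:2 * M - 1, :2 * M - 1]

rng = np.random.default_rng(0)
M = 3
x1 = rng.standard_normal((M, M))
x2 = rng.standard_normal((M, M))

x3 = linear_convolve2d(x2, x1)
print(np.allclose(x3, linear_convolve2d_fft(x2, x1)))          # N = 2M - 1
print(np.allclose(x3, linear_convolve2d_fft(x2, x1, N=8)))     # a larger N also works
```

Choosing `N` as a power of two is a common practical choice for FFT efficiency, although any `N` of at least `2M - 1` gives the same result.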