
Medical Image Processing

Nikin Baidar

Year 2021
1. Fundamentals of Digital Image Processing
Monochrome image (or simply image) refers to a 2D light intensity function f(x,y) where
x and y denote spatial co-ordinates. The value of f(x, y) at (x, y) is proportional to the
brightness (or gray level) of the image at that point.
A digital image is an image f(x,y) that has been discretized both in spatial coordi-
nates and in brightness.
A digital image is essentially an M×N matrix where the value of each element (pixel) at (x, y) is given by the light intensity function I(x, y). The analog signals used to generate a digital image are discretized both in spatial coordinates and in intensity levels.

Key Stages in Digital Image Processing


The following block diagram represents the key stages in digital image processing:

[Block diagram: problem domain → image acquisition → image enhancement → image restoration → morphological processing → segmentation → object recognition → representation & description, with color image processing and image compression as supporting stages.]

Figure 1.1: Key Stages in Digital Image Processing. Image processing begins with the problem domain.

• Problem domain

• Image acquisition

• Image enhancement

• Image restoration

• Morphological processing

• Segmentation

• Object recognition

• Representation and description

• Image compression∗

• Color image processing∗

Properties of Digital Images


1. (Image) Histogram

2. Entropy

3. Image Quality

4. Noise

Image Histogram
An image histogram is a graphical representation of the distribution of gray levels (pixel
values) in an image. The histogram of a digital image f with intensities [0, L−1] is given
by a discrete function h(r_k) = n_k, where r_k is the kth intensity value and n_k is the
total number of pixels with intensity r_k.
Essentially, an image histogram represents the frequency of every pixel value in an
image. A histogram provides a global description of the appearance of the image that
can be used in the following:

1. Enhancement - contrast enhancement

2. Statistical analysis of an image

3. Image Segmentation

4. Compression

In a dark image, the histogram is concentrated towards the left of the graph; in a bright image it tends towards the right; in a low-contrast image the values cluster around the centre of the range; whilst in a clear, high-contrast image the values are spread throughout the graph and the bars of the histogram are fairly evenly distributed.
Histogram Normalization
Histogram normalization is a common practice. To normalize an image histogram, divide each component of the histogram by the total number of pixels in the image. For an image of resolution M×N with pixel values in [0..L−1], the normalized histogram p(r_k) is given by

p(r_k) = n_k / (MN)
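As a minimal sketch (NumPy and Matplotlib assumed; the random array stands in for a real 8-bit grayscale image), the histogram and its normalized form can be computed as follows:

import numpy as np
import matplotlib.pyplot as plt

# Stand-in for a real 8-bit grayscale image.
image = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)

L = 256
h, _ = np.histogram(image, bins=L, range=(0, L))  # h(r_k) = n_k
p = h / image.size                                # p(r_k) = n_k / (M*N)

plt.bar(range(L), p)
plt.xlabel('Gray level r_k')
plt.ylabel('p(r_k)')
plt.show()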

Image Storage Requirements: Entropy


In information theory, the entropy of a discrete random variable X with possible outcomes x_1, x_2, ..., x_n, which occur with probabilities P(x_1), P(x_2), ..., P(x_n), is given by

H(X) = − Σ_{i=1}^{n} P(x_i) log(P(x_i))

Entropy as defined in information theory provides insight into the number of binary digits (bits) required to encode (represent) a piece of information. In image processing, the information is an image, and entropy represents the number of bits required to represent each pixel such that the information in the image is preserved without any loss. If each pixel is represented with more bits, the image will be more detailed, but it will take more resources in terms of storage and bandwidth in transmission lines. For example, a 16-bit image will have more detail than an 8-bit image, but if the entropy of a given image is 7.98, then representing the image with just 8 bits is enough to extract the information it provides.
Entropy reflects the width of the histogram of an image. A wider histogram (bars that are evenly spread and span the graph) means more randomness and high image contrast. A narrower histogram, on the other hand, indicates homogeneity in the image and low contrast. To put it another way, when the difference between the highest and lowest intensities of an image is large, the image has high entropy and good contrast; if the difference between the maximum and minimum pixel values is relatively small, the image has lower entropy and poor contrast.
Algorithm to compute entropy:

Step 0 Acquire the image.
Step 1 Calculate the image histogram, i.e., the frequency of each gray level in the image.
Step 2 Compute the resolution of the image.
Step 3 Calculate the probability of each gray level and the binary logarithm (log2) of each of those probabilities. The probability p(i) is given by h(i)/resolution, where h(i) is the histogram value for the ith gray level.
Step 4 Multiply each probability p(i) by its logarithm log2(p(i)) and add the products together.
Step 5 Finally, multiply by -1 to obtain the value for entropy.
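A minimal sketch of this algorithm (NumPy assumed; the function name and the random test image are illustrative):

import numpy as np

def compute_entropy(image, L=256):
    # Step 1: histogram, i.e. the frequency of each gray level
    h, _ = np.histogram(image, bins=L, range=(0, L))
    # Step 2: resolution, i.e. the total number of pixels
    resolution = image.size
    # Step 3: probabilities; zero-count levels are dropped so log2 is defined
    p = h[h > 0] / resolution
    # Steps 4-5: sum of p * log2(p), negated
    return -np.sum(p * np.log2(p))

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(compute_entropy(image))  # close to 8 for a uniformly random 8-bit image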

Here's an example of how the size of an image increases as the number of bits used to represent it increases. Consider a grayscale image of size 1024x1024. A grayscale image is composed of discrete pixels with values (gray levels/brightness intensities) in the interval [0..255]. The number of bits per pixel (bpp) required to encode such a grayscale image is log2(255 − 0 + 1) = log2(256) = 8. The size of the image is resolution ∗ bpp, which yields 8388608 bits. To express this in bytes we divide by 8 (as 8 bits make a byte), so the size of the image is 1048576 bytes. Now consider an RGB image with resolution 1024x1024. An RGB image has 3 channels: red, green and blue, and each channel has pixel values in the interval [0..255]. So the number of bits required to encode an RGB pixel is 3 times 8, which yields 24 bits, and the size of such an image is 1024 × 1024 × 24 bits; expressed in bytes (divide by 8) this gives 3145728 bytes. Clearly, the size of the image (with the same resolution) has increased threefold when it is converted from a grayscale image to an RGB image.

TODO: Image Transmission Requirements

Image Quality
Image quality is rather subjective and influences the viewer's visual perception of the features contained in an image. Several attributes contribute to the overall quality of an image:

1. Pixel resolution: Pixel resolution is the number of pixels in an image.

2. Spatial resolution: Spatial resolution in a digital image can be defined as the number of independent pixel values per inch. The spatial resolution of an image is determined by how sampling was carried out; it is the smallest discernible or perceivable detail in an image.
What spatial resolution is needed depends on the information we want to retrieve from the image. For example, in traffic footage, if we just want to distinguish the type of vehicle in the image, whether it is a car or a truck, a relatively low spatial resolution is enough; if we want to read the number plate of a particular vehicle, the image needs a higher resolution.
Measures of spatial resolution:

(a) Dots per inch (DPI)


(b) Pixels per inch (PPI)
(c) Lines per inch (LPI): This is used for printing purposes.

CPI is another measure of spatial resolution (not exactly used in image processing)
that defines the sensitivity of a mouse. CPI stands for Counts per inch, and defines
the number of pixels by which the cursor moves on your screen when you move your
mouse by an inch. So a CPI setting of 800 means that moving your mouse by one inch
will move the cursor by an equivalent of 800 pixels.

3. Intensity Level Resolution (Entropy)


Intensity level resolution refers to the number of intensity levels used to represent
the image. The more intensity levels used, the finer will be the details discernible
in an image. Intensity level resolution is usually given in terms of the number of
bits used to store each intensity level. The table below lists the intensity level
resolution for various images.

# of bits    # of intensity levels    Examples
1            2                        0, 1
2            4                        00, 01, 10, 11
4            16                       0000, 0101, 1111
8            256                      01100110, 01010101
16           65,536                   1010101010101010

4. Contrast & Dynamic range


Contrast is the difference in luminance or colors that makes an object distin-
guishable. Contrast is determined by the difference in color and brightness of an
object and other objects.
Dynamic range is the maximum contrast of an image. Dynamic range is also
referred to as the “contrast ratio”. It is the ratio between the largest and the
smallest pixel values in an image.

5. Brightness

6. Sharpness

7. SNR: Noises and Artifacts


2. Basic Operations on Images

Image Dtypes
Before performing any operations on images, it is important to get a general idea of
image dtypes.

Typecasting Images in Py
Module to use: 'scikit-image', imported as 'skimage'. The 'skimage.util' sub-module provides some functions to typecast images:

Function Description
img_as_float Convert to floating points
img_as_ubyte Convert to 8-bit unsigned integer type.
img_as_uint Convert to 16-bit unsigned integer type.
img_as_int Convert to 16-bit signed integer type.

Note that floating point images must be restricted to the interval [-1..1] even though the datatype itself can exceed this range. To respect this property, amongst others, the 'astype' py command should never be used to typecast images, because it violates the assumptions about the dtype range of images.
import numpy as np
from skimage.util import img_as_float as toFloat

image = np.arange(0, 50, 10, dtype=np.uint8)

# These float values are out of the dtype range for an image.
print(image.astype(float))
# [ 0. 10. 20. 30. 40.]

# The correct way to do it: values are rescaled to [0, 1].
print(toFloat(image))
# [0.         0.03921569 0.07843137 0.11764706 0.15686275]

Basic Arithmetic Operations


Arithmetic Operations: Addition, Subtraction

Bitwise Operations in Image


Bitwise Operations: AND, OR, NOT, XOR
Bitwise operations are used in image manipulation, for example to extract essential parts of an image, and are particularly useful for image masking (a masking sketch follows the list below). Bitwise operations should be applied to input images of the same dimensions.
Here’s a list of bitwise operators provided by OpenCV Python:
• bitwise_and(source1, source2, destination, mask)

• bitwise_or(source1, source2, destination, mask)

• bitwise_not(source, destination, mask)

• bitwise_xor(source1, source2, destination, mask)
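As a hedged sketch of masking with these operators (OpenCV and NumPy assumed; the file name and the circular mask are purely illustrative):

import cv2
import numpy as np

image = cv2.imread('input.png')             # hypothetical input image
mask = np.zeros(image.shape[:2], np.uint8)  # single-channel mask, same height/width
cv2.circle(mask, (image.shape[1] // 2, image.shape[0] // 2), 100, 255, -1)

# Keep only the pixels of `image` where the mask is non-zero.
masked = cv2.bitwise_and(image, image, mask=mask)
inverted = cv2.bitwise_not(image)           # negative of the image
cv2.imwrite('masked.png', masked)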


3. Image Enhancement
Image enhancement refers to strengthening certain aspects of an image, such as its brightness, contrast or sharpness, to improve the quality of the information contained in the image. A short py sketch illustrating this is given below.
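The sketch assumes OpenCV is available; the file name and the alpha/beta values are illustrative. It applies a simple linear point operation, new_pixel = alpha * old_pixel + beta, to raise contrast and brightness:

import cv2

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # hypothetical input

# new_pixel = alpha * old_pixel + beta, clipped to [0, 255]
enhanced = cv2.convertScaleAbs(image, alpha=1.5, beta=20)
cv2.imwrite('enhanced.png', enhanced)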

Point Operations
In point operations, the same transformation is applied to each and every pixel in an image. This means that the transformation of any given pixel is independent of its location and of its neighboring pixels. This is in contrast to neighbourhood operations, where the transformation of a pixel depends on where it is located and on the pixels that surround it.
The transformation function of point operations is given by:

s = c T (r)

where, s is the processed pixel, c is the scaling factor, T is the transformation function
and r is the input pixel.

• Logarithmic Transformation
• Power Law Transformation (Gamma Transformation)
• Contrast Stretching
• Histogram Equalization
• Image Negatives

Logarithmic Transformation
Logarithmic transformation is used for contrast enhancement. The input pixels are replaced with their (natural) logarithmic values. Logarithmic transformation expands the details in the darker regions of an image while compressing the details in the brighter regions.
The logarithmic transformation function is as follows:

s = c log(r + 1)

The input pixel is incremented by 1 in order to handle the case r = 0, for which log(0) is undefined. For logarithmic transformation, the scaling factor c is calculated as:

c = 255 / log(max input pixel value + 1)

Here 255 is the maximum possible pixel value for an 8-bit image; the scaling factor c is chosen so that the maximum output value corresponds to the bit depth used.
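A minimal sketch of the logarithmic transformation (NumPy and OpenCV assumed; the file names are illustrative):

import numpy as np
import cv2

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE).astype(np.float64)

c = 255 / np.log(1 + image.max())        # scaling factor
log_transformed = c * np.log(1 + image)  # s = c * log(r + 1)

cv2.imwrite('log_transformed.png', log_transformed.astype(np.uint8))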
Contrast stretching
Contrast stretching is used to stretch intensity values to cover a desired range of pixel values in a linear fashion. It enhances images whose histograms are narrow.
The contrast stretching function is as follows:

s = (r − c) · (b − a)/(d − c) + a

where s is the processed/output pixel, r is the input pixel, [a, b] is the range of possible pixel values, and c and d are the minimum and maximum intensity values present in the image.
Algorithm:

Step 0. Acquire the image.


Step 1. Compute the image resolution.
Step 2. Initialize the least and the maximum grey level values that can occur in the image.

a = 0 # Lower bound
b = 255 # Upper bound (for 8-bit images;)

Step 3. Calculate the image histogram, i.e., the frequency of all grey levels in the image.
Step 4. Initialize the least and maximum grey level values that are present in the image.

c = min(i) for which H(i) > 0


d = max(i) for which H(i) > 0

Step 5. With all a, b, c, d in place calculate the scaling factor:

scaling_factor = (b-a)/(d-c)

Step 6. Initialize a matrix with same dimensions as the input image.


Step 7. Iterate through each pixel, i, of the input image and set the corresponding values
in the output matrix to the transformed pixel value which is given by:

output_pixel = (input_pixel - c) * scaling_factor + a

Step 8. Display the output image and plot histograms.
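A minimal sketch of this algorithm (NumPy and OpenCV assumed; the function name and file names are illustrative, and the per-pixel loop of Step 7 is vectorized):

import numpy as np
import cv2

def stretch_contrast(image, a=0, b=255):
    # Steps 3-4: least and greatest gray levels present in the image
    c, d = int(image.min()), int(image.max())
    # Step 5: scaling factor
    scale = (b - a) / (d - c)
    # Step 7: s = (r - c) * scale + a, applied to every pixel at once
    stretched = (image.astype(np.float64) - c) * scale + a
    return np.clip(stretched, a, b).astype(np.uint8)

image = cv2.imread('low_contrast.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input
cv2.imwrite('stretched.png', stretch_contrast(image))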

Power law transformation (Or γ Correction)


Power law transformation is, in a sense, the opposite of logarithmic transformation. The transformation function for γ transformation is given by:

s = c (r)γ

where c is the scaling factor (usually set to 1), s is the processed or output pixel, r is the input pixel and γ is a controllable parameter such that
• when γ > 1, the contrast of the light gray areas is enhanced
• when γ < 1, the contrast of the dark gray areas is enhanced
• when γ = 1, the contrast of the original image remains unchanged.
Figure 3.1: Curves for different values of γ.
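A minimal sketch of gamma correction (NumPy and OpenCV assumed; the file name and the two gamma values are illustrative):

import numpy as np
import cv2

def gamma_correct(image, gamma):
    # Normalize to [0, 1], apply s = r**gamma, rescale back to [0, 255].
    normalized = image.astype(np.float64) / 255.0
    return np.uint8(255 * normalized ** gamma)

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)    # hypothetical input
dark_enhanced = gamma_correct(image, 0.5)   # gamma < 1: enhances dark regions
light_enhanced = gamma_correct(image, 2.0)  # gamma > 1: enhances light regions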

Image Negatives
Image negatives, in the good old days, were used to produce images in Film Photog-
raphy.
Image negative is produced by subtracting each pixel from the maximum intensity
value. e.g. for an 8-bit image, the max intensity value is 255, thus each pixel is
subtracted from 255 to produce the image negative. So the transformation function
used to obtain image negative is

s = c ∗ (L − 1) − r

where the scaling factor c is 1, (L − 1) is the maximum possible intensity value, and s and r are the output and input pixel values respectively.
Image negatives have their applications in images where the background is black
and the foreground gray levels are not clearly visible. So converting the background to
white will render the image clear.
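A minimal sketch of the image negative for an 8-bit image (OpenCV assumed; the file names are illustrative):

import cv2

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)  # hypothetical 8-bit image
negative = 255 - image                                 # s = (L - 1) - r with L = 256
cv2.imwrite('negative.png', negative)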
Neighbourhood Operations: Spatial Filtering
With neighbourhood operations, individual pixel values are modified with respect to their location. To put it another way, the final transformation of an individual pixel value also depends on the pixel values of its surroundings. Some simple neighbourhood operations include:

1. Min filter : Set the pixel value to the min in the neighbourhood.
2. Max filter: Set the pixel value to the max in the neighbourhood.
3. Median filter: Set the pixel value to the median of all pixel values in the neigh-
bourhood.

These min, max and median filters are termed order-statistics filters. A short sketch applying them is shown below.
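A minimal sketch of these order-statistics filters (SciPy and OpenCV assumed; the file name and the 3x3 neighbourhood size are illustrative):

from scipy import ndimage
import cv2

image = cv2.imread('noisy.png', cv2.IMREAD_GRAYSCALE)   # hypothetical noisy input

min_filtered = ndimage.minimum_filter(image, size=3)    # min filter over a 3x3 neighbourhood
max_filtered = ndimage.maximum_filter(image, size=3)    # max filter
median_filtered = ndimage.median_filter(image, size=3)  # median filter (good for salt-and-pepper noise)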

Pixel Neighbourhood
• N4 (p) 4-neighbours: 2 horizontal and 2 vertical pixels of the central pixel.

• ND (p) diagonal neighbours: Elements of the major and the minor diagonals of
the central pixel p. Two diagonals meet at the central pixel.

• N8 (p) 8-neighbours: The 2 horizontal and 2 vertical plus the elements of the
major and the minor diagonals of the central pixel p. i.e., N8 (p) = N4 (p) ∪ ND (p).

• NB: the N4 (p) neighbours are closer to p (distance 1) than the ND (p) neighbours (distance √2).

Relationship Between Neighbouring Pixels


Let v be a set of intensity values used to define adjacency and connectivity. In a binary
image, v = {1} if we are referring to adjacency of pixels with value 1 or v = {0}, if we
are referring to the adjacency of pixels with value 0. For a grayscale image, the idea is pretty much the same, but the set v typically contains more elements; for example, v = {180, 181, 182, ..., 200}. If the range of possible intensity values is [0..L-1], then v can be any subset of values in that range.

• 4-adjacency Two pixels, p ∈ v and q ∈ v are 4-adjacent if q ∈ N4 (p).

• 8-adjacency Two pixels, p ∈ v and q ∈ v are 8-adjacent if q ∈ N8 (p).

• m-adjacency Two pixels, p ∈ v and q ∈ v, are m-adjacent if

i) q ∈ N4 (p), OR
ii) q ∈ ND (p) AND the set N4 (p) ∩ N4 (q) contains no pixels whose values are in v.

Showcasing how neighbourhood operations work: Basic Algorithm


1. Initialize the original image and the filter.
2. Start applying the filter starting at the origin (top left) of the image. The central
pixel should overlap the pixel on which the operation is being performed. At the
edges of an image, we lack certain pixels to form a neighbourhood. In those
cases, we use one of a few different approaches:
(a) Add padding - Add pixels, either all white or all black, all around the image.
Typically, zero padding (black pixels) are added around the image.
(b) Replicate border pixels
(c) Truncate the image
(d) Allow pixels to wrap around the image

3. Compute a new set of pixels by performing some sort of operations, between the
original and the filter pixels, such as addition or multiplication.
4. Apply the filter function, it can be things such as minimum, maximum, median,
addition, multiplication. Result of the whichever operation performed is the value
of the new pixel.
5. Replace the original pixel with this new pixel value.
6. Shift the window 1 step/stride to the right (or by the defined number of steps and
in the defined direction)
7. Repeat until every pixel in the image has been processed.

Filter Properties
A filter can have different parameters such as its shape, size, weights and function.
Filter parameters:

• Filter size (Size of the neighbourhood), usual filter size ranges anywhere between
3x3 to 21x21.

• Filter shape, filters do not necessarily need to be squares, they can also be
rectangular, circular and so on.

• Filter weights: The weights of different pixels in a filter can be different.

• Filter function: Pertains to certain operations. Can be linear or non-linear.

Correlation
For this illustration, a 2D filter will be represented as a 1D array such that each row of
the filter will be appended at the end of the array in a top-down approach.
Consider a 3x3 filter F = [a, b, c, d, e, f, g, h, i] placed over the first set of original pixels [j, k, l, m, n, o, p, q, r] at the origin. We multiply the corresponding pixels in the original set and the filter to get [aj, bk, cl, dm, en, fo, gp, hq, ir]. Now, add up all the items in this new set of pixels and replace the central pixel of the original set, i.e. 'n', with this sum, so the processed set of pixels becomes [j, k, l, m, sum, o, p, q, r]. We then shift the window by the defined number of steps/strides, say 1, and repeat this process for every pixel in the original image to generate a new filtered image.
Figure 3.2: ./images/image-origin.jpg

Convolution
Start with pretty much the same (or similar) filter as correlation, F=[a, b, c, d, e, f, g,
h, i]. The only difference is that the filter is reversed (reversed in 1D, flipped in 2D) as
G=[i, h, g, f, e, d, c, b, a]. Apply this filter to the original set of pixels in the image.
Multiply the corresponding pixel values to obtain a new set of pixels [ji, kh, lg, mf, en,
do, cp, qb, ra]. Add up all pixels in this new set and use the sum to replace the central
pixel.
For symmetric filters, there is no difference between correlation and convolution, since the reverse of the filter is the same as the filter itself. For example, F = [a, b, c, d, e, d, c, b, a] is symmetric, so its reverse is the same as F.

g(x, y) = Σ_s Σ_t w(s, t) f(x + s, y + t)

where g(x, y) yields the new value of the central pixel.
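A minimal sketch contrasting the two operations (SciPy assumed; the tiny image and kernel are illustrative, and the kernel is symmetric so both results coincide):

import numpy as np
from scipy import ndimage

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=float)
kernel = np.array([[0, 1, 0],
                   [1, 2, 1],
                   [0, 1, 0]], dtype=float)   # symmetric filter

corr = ndimage.correlate(image, kernel, mode='constant', cval=0)  # no flipping
conv = ndimage.convolve(image, kernel, mode='constant', cval=0)   # kernel flipped
print(np.array_equal(corr, conv))  # True, because the kernel is symmetric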

More Spatial Filtering Operations:


Frequency in images and Image Derivatives
Image derivatives (or spatial differentiation) can be computed using small convolution filters, with sizes ranging from 2x2 to 3x3. Larger masks give a better approximation of the derivative; examples of such filters are Gaussian derivatives.
From the notions of continuous differential calculus, the first-order derivative of a
continuous 1D signal, y=f(x) is given by
dy/dx = lim_{h→0} [f(x + h) − f(x)] / h
In the discrete derivative, the increment h does not tend to zero but takes on a finite value such as 1 (the smallest possible h for image data). This essentially means that the discrete derivative is an approximation of the actual derivative. The following equation is just an approximation of the derivative:

dy/dx ≈ [f(x + 1) − f(x)] / 1 = f(x + 1) − f(x)
The second-order derivative can be approximated as:

d²y/dx² ≈ f(x + 1) + f(x − 1) − 2f(x)
But the concept of differentiation of images is quite different from the differentiation of continuous functions: the derivative of an image is only an approximation of the continuous derivative. For multidimensional functions, such as a scalar field in n directions, f : Rⁿ → R, the (discrete) derivative is calculated as partial derivatives with respect to each of the n directions. For an image I(x,y), the gradient ∇I is obtained through (discrete) partial derivatives of the pixel intensities in the x and the y directions.
 
∇I = (∂I/∂x, ∂I/∂y)
This gradient vector gives a magnitude, which corresponds to the strength of the edge, and a phase, which corresponds to the orientation/direction of the edge (i.e. the direction in which the gray levels change most rapidly).

|∇I| = sqrt((I′x)² + (I′y)²)        θ = tan⁻¹(I′y / I′x)

Figure 3.3: Edges in an image

Step edge: [0. 0. 0. 0. 1. 1. 1. 1.]
Ramp edge: [0. 0. .1 .2 .4 .8 1. 1.]
Roof edge: [0. .1 .2 1. 1. .2 .1 0.]

Low Pass Filtering: Smoothing/Blurring


Box-blur filter: Removes finer details and renders the image blurry.

Gaussian Smoothing (or Gaussian Filtering)


The Gaussian filter of kernel size (2n + 1)x(2n + 1) is given by:

H_ij = (1 / (2πσ²)) exp( −[(i − (n + 1))² + (j − (n + 1))²] / (2σ²) );   1 ≤ i, j ≤ (2n + 1)
Looking at the range, we can see that the filter size should be at least 3. Furthermore, the output of the first iteration must be placed at the center of the filter. To achieve this, think of the origin as the center of a mesh grid: for a filter of size 3, the origin (0,0) is the center and its N4 (p) are (-1,0), (1,0), (0,-1), (0,1). To implement this in code, iterate from -n to n+1 and place the values of the pixels at (i+n, j+n). Here's a py function to create Gaussian filters:

#! /usr/bin/python
import numpy as np

def getGaussianFilter(size, sigma=1.0):
    n = int(size) // 2
    x, y = np.mgrid[-n:n+1, -n:n+1]           # mesh grid centred at the origin
    normalizer = 1 / (2 * np.pi * sigma**2)   # 1/(2*pi*sigma^2)
    filter = normalizer * np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return filter

High Pass Filtering


• Remove blurring from images
• Highlight the edges
• Spatial differentiation

Laplacian Operator
1. Isotropic, or rotation invariant. This means that the results of first applying the
Laplacian operator and later rotating the image will be same as the results of first
rotating the image and then applying the Laplacian operator.

2. One of the simplest sharpening filters

3. Digital implementation

∇²f = ∂²f/∂x² + ∂²f/∂y²

∂²f/∂x² = f(x + 1, y) + f(x − 1, y) − 2f(x, y)
∂²f/∂y² = f(x, y + 1) + f(x, y − 1) − 2f(x, y)
Adding these together, we get

∇²f = f(x + 1, y) + f(x − 1, y) − 2f(x, y) + f(x, y + 1) + f(x, y − 1) − 2f(x, y)
     = −4f(x, y) + f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1)

With this equation, we can compute the standard Laplacian operator kernel:

 0  1  0
 1 -4  1
 0  1  0

Some commonly used variants of the standard Laplacian kernel are:

 1  1  1        -1  2 -1
 1 -8  1   and   2 -4  2
 1  1  1        -1  2 -1

Laplacian filters are very sensitive to noise. So, to counter this, the image is Gaus-
sian smoothed before applying the Laplacian filter. This step reduces high frequency
noise components prior to the spatial differentiation.

Prewitt Operator

import numpy as np
import cv2 as cv

def applyPrewitt(src, size):
    k = size // 2
    # For size 3, y and x are the horizontal and vertical Prewitt kernels.
    x, y = np.mgrid[-k:k+1, -k:k+1].astype(np.float32)
    gradient_x = cv.filter2D(src, cv.CV_32F, x)
    gradient_y = cv.filter2D(src, cv.CV_32F, y)
    gradient_xy = gradient_x + gradient_y
    return {
        'edges_x'  : gradient_x,
        'edges_y'  : gradient_y,
        'edges_xy' : gradient_xy
    }

# src is assumed to be a grayscale image loaded elsewhere.
edges_prewitt = applyPrewitt(src, 3)
edges_x = edges_prewitt.get('edges_x')
edges_y = edges_prewitt.get('edges_y')

# Compute magnitude
magnitude = np.sqrt(edges_x**2 + edges_y**2)

# Orientation
phase = cv.phase(edges_x, edges_y, angleInDegrees=True)

Sobel Operator
• Like the Laplacian and Prewitt operators, Sobel operator is another discrete dif-
ferentiation operator. Discrete differentiation is only an approximation of the con-
tinuous differentiation, therefore the Sobel operator computes only an approxi-
mation of the gradient (derivative) of an image (intensity function).

• Used to detect vertical and horizontal edges.

• Differentiation is sensitive to high frequency noise.

• Combines Gaussian smoothing and differentiation to counter this sensitivity. (With


Laplacian operator, Gaussian smoothing needs to be applied separately before
using the Laplacian operator).
For an image with intensity variations in the x and the y directions, the spatial differentiation is computed as partial derivatives with respect to each of the x and the y directions. The x-derivative corresponds to horizontal changes and the y-derivative corresponds to vertical changes.
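A minimal sketch of the Sobel operator with OpenCV (assumed available; the file name and kernel size are illustrative):

import cv2
import numpy as np

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input

# cv2.Sobel combines Gaussian smoothing and differentiation internally.
grad_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)   # horizontal changes (x-derivative)
grad_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)   # vertical changes (y-derivative)

magnitude = np.sqrt(grad_x**2 + grad_y**2)
direction = cv2.phase(grad_x, grad_y, angleInDegrees=True)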
4. Image Segmentation

Image Segmentation Approaches & Algorithms


Segmentation breaks an image down into various subgroups called image segments. Segmentation helps reduce the complexity of the image, which renders further processing or analysis of the image simpler. The degree of segmentation or subdivision of an image depends on the problem at hand. Segmentation attempts to partition the pixels of an image into groups that strongly correlate with the objects in the image. Different approaches to image segmentation, along with their advantages and disadvantages, are depicted in the chart below:

Image segmentation approaches:

• Edge based techniques
  Advantages: work well when the edges are prominent.
  Disadvantages: cannot be applied to images having many or smooth edges; not suitable for noisy images.

• Region based techniques
  Advantages: better for noisy images where edges are hard to identify.
  Disadvantages: a seed point must be specified; different seeds may give different outputs.

• Thresholding techniques
  Advantages: one of the best and easiest techniques.
  Disadvantages: cannot be applied to images with complex intensity distributions; cannot process images with unimodal histograms.

Figure 4.1: Various approaches to image segmentation. (i) Edge based approaches depend on local changes in image intensity, but cannot be applied to images that have smooth or far too many edges. (ii) Region based segmentation relies on a seed point, from which regions grow by checking whether neighbouring pixel intensities should be added or not, thus separating the regions. (iii) Thresholding based segmentation (a common pixel based method) involves calculating optimum thresholds that separate the different regions. Thresholding performs well on images with a bimodal intensity distribution, but cannot process images with unimodal histograms.
Thresholding Techniques
Thresholding is one of the simplest operations for segmentation. Thresholding is used
to remove an intended object or target object from its background by allocating a
threshold, an intensity value, for every pixel such that each pixel is categorized as
either a background pixel or a pixel that belongs to the object of interest.
For example, if we take a grayscale image and threshold it with a value of T we
make all pixel values >= T into 1, and all pixels < T into 0. The results would be as
follows:

Figure 4.2: Example of thresholding

Thresholding can either be done globally, where a single threshold value is used to segment the entire image, or it can be adaptive. In adaptive thresholding, the image is broken into smaller pieces and a different threshold is used for each piece. Simple thresholding involves manually supplying a global threshold and is more of a trial-and-error approach. Automatic thresholding techniques such as Otsu's method are more dynamic and compute the optimal global threshold value based on the input image.
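A minimal sketch of global and adaptive thresholding with OpenCV (assumed available; the file name, the manual threshold 127 and the 11x11 block size are illustrative):

import cv2

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # hypothetical input

# Simple (manual) global threshold at T = 127.
_, global_thresh = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Adaptive threshold: a different threshold for each 11x11 neighbourhood.
adaptive_thresh = cv2.adaptiveThreshold(image, 255,
                                        cv2.ADAPTIVE_THRESH_MEAN_C,
                                        cv2.THRESH_BINARY, 11, 2)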

Use Cases of Thresholding Techniques


• Document image analysis, where the goal is to extract printed characters, logos,
or other graphical content.
• Map processing, where lines, legends and characters are to be found.
• Scene processing, where a target is to be detected.
• Quality inspection of materials, where defective parts must be delineated.
• In medical applications, it is used to detect defects in body organs.
Otsu’s method: automatic thresholding
Otsu's method is used to perform automatic image thresholding. The algorithm returns a single intensity threshold that separates pixels into two classes, foreground and background. This threshold is determined by minimizing the intraclass intensity variance or, equivalently, maximizing the interclass intensity variance.
The algorithm searches for a threshold that minimizes the intraclass variance, de-
fined as a weighted sum of variances of the two classes.
Step 0. Import a 2D image.
Step 1. Compute the intensity distribution i.e. image histogram and probabilities of each
intensity level.
Step 2. Set up a large random value as the initial intraclass variance.
Step 3. For a given threshold value, calculate:
1. the weight (ω) of the background and the foreground pixels.
ω(t) = (sum of all pixels pertaining to a class) / (image resolution)
Note that: Pixels falling to background or foreground are determined based
on the current threshold (t). For example, in an 8 bit grayscale image, pixel
values from 0 through t − 1 are background pixels and values from t through
28 − 1 = 255 are foreground pixels. Here the assumption is that darker pixels
make the background and the lighter ones make up the foreground, but this
isn’t always true. So, for a grayscale image, the weight of the background is
given by:

ω_b(t) = Σ_{i=0}^{t−1} p(i)

Similarly, the weight of the foreground is given by:

ω_f(t) = Σ_{i=t}^{255} p(i)

where p(i) is the probability of the intensity value i and H(i) is the histogram value, i.e. the frequency of intensity i in the image. The weights must add up to one, i.e. ω_b + ω_f = 1.
2. the mean (µ) of the background and foreground pixels. The mean of the background pixels is given by:

µ_b(t) = ( Σ_{i=0}^{t−1} i · p(i) ) / ω_b(t)

and that of the foreground pixels is given by:

µ_f(t) = ( Σ_{i=t}^{255} i · p(i) ) / ω_f(t)
3. the variance (ν = σ²) of the background and foreground pixels. The variance of the background pixels is given by:

ν_b(t) = ( Σ_{i=0}^{t−1} (i − µ_b)² · p(i) ) / ω_b(t)

and that of the foreground pixels is given by:

ν_f(t) = ( Σ_{i=t}^{255} (i − µ_f)² · p(i) ) / ω_f(t)

Step 4. Compute intraclass (within class) variance for the current threshold. This is sim-
ply the sum of the two variances (background and foreground) multiplied by their
associated weights.
νW (t) = ωb (t) νb (t) + ωf (t) νf (t)

Step 5. Update the stored intraclass (within class) variance if the value for the current threshold is lower than the stored one. The desired threshold corresponds to the one with the minimum ν_W (W for within).
Step 6. Repeat steps 3 to 5 for all possible values of the threshold.
Step 7. Once the optimum threshold value is determined, change all background pixels
to 0 and all foreground pixels to 1 to get the segmented image.
A slight modification that gives a much faster approach (because step 3.3 above can be skipped) is to determine the maximum value of the interclass variance between foreground and background:

ν_B(t) = ν − ν_W(t) = σ² − σ²_W = ω_b(t) ω_f(t) (µ_b(t) − µ_f(t))²

Here, we look for the maximum value of ν_B (B for between) to select the optimum threshold. The threshold that yields the maximum ν_B also yields the minimum ν_W.
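A minimal sketch of Otsu's method using the faster interclass-variance form (NumPy assumed; the random test image is a stand-in, and in practice cv2.threshold with the THRESH_OTSU flag or skimage.filters.threshold_otsu can be used instead):

import numpy as np

def otsu_threshold(image, L=256):
    hist, _ = np.histogram(image, bins=L, range=(0, L))
    p = hist / image.size                       # probabilities p(i)
    best_t, best_var = 0, 0.0
    for t in range(1, L):
        w_b, w_f = p[:t].sum(), p[t:].sum()     # class weights
        if w_b == 0 or w_f == 0:
            continue
        mu_b = (np.arange(0, t) * p[:t]).sum() / w_b   # background mean
        mu_f = (np.arange(t, L) * p[t:]).sum() / w_f   # foreground mean
        var_between = w_b * w_f * (mu_b - mu_f) ** 2   # interclass variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

image = np.random.randint(0, 256, size=(128, 128), dtype=np.uint8)
t = otsu_threshold(image)
segmented = (image >= t).astype(np.uint8)       # foreground = 1, background = 0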

Edge Based Techniques


Edge detection includes a variety of mathematical methods that aim at identifying
edges, curves in a digital image at which contrast changes sharply, or has disconti-
nuities.
With this approach, the boundaries of different regions of the image need to be sufficiently different from each other and from the background, which calls for edge detection based on local discontinuities in the intensity levels. Such approaches use edge detection for segmentation purposes. The term "edge detection" refers to the process of determining the boundary of an object in an image. This is an important step for understanding image features, as edges carry meaningful features and significant information.
• Edges should be detected with minimum error, which means that the detector should be able to accurately detect as many real edges in the image as possible.
Canny Edge Detection
Canny-edge detection involves a multi-stage algorithm that can detect edges in a wide
range of images.

Steps involved in Canny edge detection:

Step 1. Apply the Gaussian filter


Apply the Gaussian filter to smooth the image in order to remove the noise. It is
crucial to filter out noise to prevent false detection caused by it. A Gaussian filter
kernel is convolved over the image. The size of the kernel will have an effect on
the edge detection. Larger kernels will render the detector more insensitive to
noise. In addition, the localization error to detect the edge will slightly increase
with the increase in the kernel size. 3x3 or 5x5 is a good size for most cases.
Step 2. Find the intensity gradients of the image
An edge may point in a variety of directions, so filters are used to detect horizontal, vertical and diagonal edges in the blurred image. How? Edge detection operators such as Prewitt or Sobel return a value for the first derivative in the x and y directions (I_x and I_y). From these, the edge gradient and direction can be determined using

G = sqrt(G_x² + G_y²)        θ = tan⁻¹(G_y / G_x)

Where, G is the magnitude of the gradient and θ is the direction. Depending on


the range in which the value of θ falls, it is set to one of four specific values. For example, if θ ∈ [0°, 22.5°] or θ ∈ [157.5°, 180°], it maps to 0°.
Depending on the rounded value of gradient θ, vertical (90° ), horizontal (0° ),
and the diagonal (45° and 135° ) edges are determined. The mapping functions
is tabulated below:

Table 4.1: Direction Mapping

Range                     Angle    Edge
θ ∉ [22.5°, 157.5°)       0°       Horizontal
θ ∈ [22.5°, 67.5°)        45°      Diagonal
θ ∈ [67.5°, 112.5°)       90°      Vertical
θ ∈ [112.5°, 157.5°)      135°     Diagonal

Step 3. Apply gradient magnitude thresholding or non-maximum suppression


Gradient magnitude thresholding is applied as a means to find the location with
the sharpest change of intensity values. To apply gradient magnitude threshold-
ing, begin with comparing the edge strength of the current pixel with strength
of the pixel in positive and negative gradient directions. If the edge strength of
the current pixel is largest compared to other pixels in the mask with the same
direction, the value will be preserved or else, the value will be suppressed.

• if the rounded gradient angle is θ = 0°, the point will be considered to be on the edge if its gradient magnitude is greater than the magnitudes of the pixels in the east and west directions.
• if the rounded gradient angle is θ = 90°, the point will be considered to be on the edge if its gradient magnitude is greater than the magnitudes of the pixels in the north and south directions.
• if the rounded gradient angle is 135° (i.e. the edge is in the northeast–southwest
direction) the point will be considered to be on the edge if its gradient magni-
tude is greater than the magnitudes at pixels in the northwest and southeast
directions.
• if the rounded gradient angle is 45° (i.e. the edge is in the northwest–southeast
direction) the point will be considered to be on the edge if its gradient magni-
tude is greater than the magnitudes at pixels in the northeast and southwest
directions.

The purpose of this step is to check if the pixels on the same direction are more
or less intense than the ones being processed. Helps to get rid of false edge
detections by thinning the edges.
Step 4. Apply double threshold to determine potential edges
Post non-maximum suppression, remaining edge pixels provide a more accurate
representation of the real edges of objects in the image. However, some false
edge pixels may still exist due to noise and color variations. To account for these
spurious edges, perform double thresholding to filter out edge pixels with a weak
gradient value and preserve edges with high gradient values. This is different
from non-maximum suppression as comparisons are made with respect to pre-
set lower and upper bounds. So if the edge pixel’s gradient value is higher than
the upper bound, it is marked as a strong edge pixel and if on the other hand it
is higher than the lower bound but smaller than the upper bound, it is marked as
a weak edge pixel. Pixels whose gradient values are lower than the lower bound
are suppressed. There is no predefined values for the lower and upper bound;
these are determined experimentally depending on the content of the image.
Step 5. Track edge by hysteresis
As a final step, the edges are detected by suppressing those weak pixels that
are not connected to any strong edges. Usually a weak edge pixel caused from
true edges will be connected to strong edge pixels. So this step eliminates any
spurious weak pixels that may have resulted from noise/color variations. To track
the edge connection, blob analysis is applied by looking at a weak edge pixel
and its 8-connected neighborhood pixels. As long as there is one strong edge
pixel that is involved in the blob, that weak edge point can be identified as one
that should be preserved otherwise it should be suppressed.
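A minimal sketch of Canny edge detection with OpenCV (assumed available; the file names, kernel size and the thresholds 100/200 are illustrative):

import cv2

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # hypothetical input

blurred = cv2.GaussianBlur(image, (5, 5), 1.4)  # Step 1: Gaussian smoothing
# Steps 2-5 (gradients, non-maximum suppression, double threshold, hysteresis)
# are performed internally; 100 and 200 are the lower and upper bounds.
edges = cv2.Canny(blurred, 100, 200)
cv2.imwrite('edges.png', edges)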

Hough Transform
The Hough transform was introduced in the patent "Method and Means for Recognizing Complex Patterns". It must cope with:
• Extraneous data: Which points to fit to?
• Incomplete Data: Only parts of the model is visible.
• Noise
The simplest case of the Hough transform (applied to the output of edge detection) is detecting straight lines. In general, the straight line y = mx + c can be represented as a point (m, c) in the parameter space (the parameters being the slope m and the intercept c).
Line detection algorithm:

Step 1. Quantize parameter space (m,c).


Step 2. Create an accumulator array A(m,c).
Step 3. Set A(m,c) = 0 for all (m,c).
Step 4. For each edge point (xi , yi ),

A(m, c) = A(m, c) + 1 if (m, c) lies on the line:c = −mxi + yi .

So the intersection point will get maximum votes.


Step 5. Find local maxima in A(m,c) to get the straight line.

However, this would give rise to unbounded values of the slope parameter, meaning that m can range from −∞ to +∞. This would mean the accumulator needs to be massive to store all possible values of m. Thus, for computational efficiency, we resort to the Hesse normal form:

r = x cos θ + y sin θ

The intuition is that every vector on the line must be perpendicular to the straight
line of length r that comes from the origin. The intersection point of the function line
and the perpendicular line that comes from the origin is at Po = (r cos θ, r sin θ). So for
any point P on the line, the vector P − Po must be orthogonal to vector Po .
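A minimal sketch of line detection with the Hough transform in its Hesse normal form (OpenCV and NumPy assumed; the file name, Canny thresholds and vote threshold of 150 are illustrative):

import cv2
import numpy as np

image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # hypothetical input
edges = cv2.Canny(image, 100, 200)

# Accumulator resolution: 1 pixel for r, 1 degree for theta; at least 150 votes.
lines = cv2.HoughLines(edges, 1, np.pi / 180, 150)

if lines is not None:
    for r, theta in lines[:, 0]:
        print(f'r = {r:.1f}, theta = {np.degrees(theta):.1f} deg')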

Region Based Segmentation


Region based segmentation consists of partitioning an image into regions that are similar according to a preset criterion. In this approach, the algorithm forms segments by dividing an image into components that have similar pixel characteristics. The algorithm searches for chunks in the input image and, for segmentation purposes, either adds more pixels to a selected chunk or removes pixels from it to get smaller segments and, where appropriate, merges them with other smaller chunks. Two basic techniques stem from region based segmentation:
1. Region Growing. This is a more bottom-up approach where the algorithm starts
with a smaller chunk and adds pixels to it to get an image segment.
• Starting from a set of seed pixels, the regions grow by appending to each
seed point those neighboring pixels that have similar properties such as
gray level, texture, color.
• If the rule applies, keep growing the region for as long as the rule holds.

2. Region Splitting and Merging. This is a top-down approach where the algorithm works in two steps: it starts with a bigger chunk and removes pixels as necessary to get the segment, and similar segments are then merged depending on the set criteria.

• Start out with splitting.


• Consider the whole image as a region. Let R represent the entire image
region and select a predicate (base).
• Check if the regions satisfy a certain condition for homogeneity. If they don't, divide the image into four equal regions (i.e., four quadrants).
• Again, check the rules for each region. If the rules are followed, the regions
are not split, else the image keeps getting split in this way. Stop when all
regions are homogeneous.
• This will result in a data-structure called quad-tree.
• Keep splitting the regions as long as the rule is violated.
• Merging is performed simultaneously. Merging is opposite of region split-
ting. Start out with small regions that have similar characteristics and merge
them.
• Deal with each and every pixel and merge the region if it follows the rule.
• Iterate until no further splitting/merging is possible.

Region growing based techniques are better than edge-based techniques in noisy
images where edges are difficult to detect.
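A minimal sketch of region growing from a single seed (NumPy assumed; the function, the fixed-tolerance homogeneity rule and the random test image are all illustrative):

import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed`, adding 4-neighbours whose gray level
    differs from the seed value by at most `tol`."""
    h, w = image.shape
    seed_value = int(image[seed])
    region = np.zeros((h, w), dtype=np.uint8)
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        if region[y, x]:
            continue
        region[y, x] = 1
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # N4 neighbours
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and abs(int(image[ny, nx]) - seed_value) <= tol):
                queue.append((ny, nx))
    return region

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
segment = region_grow(image, seed=(32, 32), tol=15)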

More on Image Segmentation


Semantic and Instance Segmentation
Semantic segmentation associates every pixel of an image with a class label such as
person, flower, car and so on. It treats multiple objects of the same class as a single
entity. In contrast, instance segmentation treats multiple objects of the same class as
distinct individual instances.

Figure 4.3: Illustration of semantic and instance segmentation

Properties of Image Segments


Image segments, for the most part, should have the following properties:

1. Connectivity and compactness. Compactness is measured by the ratio of the


object area to the area of its smallest bounding box.
2. Regularity of boundaries
3. Homogeneity in terms of colors and texture.
4. Differentiation from neighbour regions.
Morphological Operations
Morphological operations involves processing images based on shapes and struc-
tures. In a morphological operation, the value of each pixel in the output image is
based on comparison of the corresponding pixel in the input image with its neighbors.
Morphological operations are usually implemented to post-process the output of the
segmentation so that imperfections in the segmented image can be eliminated. In
some cases, these are also performed right before segmentation to remove any kind
of noise and prepare the image for segmentation.
Morphological operations are similar to spatial filtering (recall correlation and convolution): the structuring element is moved across every pixel in the original image to give a new pixel whose value depends on the morphological operation performed.

Terminologies in Morphological Operations:

Structuring element: A filter that traverses the image. The structuring element is positioned at all possible locations in the image and compared with the connected pixels.
Fit: When all the pixels in the structuring element cover the pixels of the object, we call it a Fit.
Hit: If the central pixel of the structuring element matches the central pixel of the object, we call it a Hit. The position of the central pixel differs depending on the dimensions of the structuring element.
Miss: When no pixel in the structuring element covers the pixels of the object, we call it a Miss.

The following figure shows a visualization of terminologies explained above.

Figure 4.4: Morphological Operations Visualized


Simple Morphological Operations
Erosion
Erosion involves removing isolated pixels, especially on object boundaries. This morphological operation, in a way, shrinks the original image: erosion enlarges the background and shrinks the foreground. We traverse the structuring element over the image object to perform the erosion operation, and the value of the new pixel is

Pixel(output) = 1 if Fit, 0 otherwise

The following figure shows an example of erosion:

Figure 4.5: Illustration of erosion on an input image using a structuring element.

Here’s another example to show how erosion works. We start with the input image
I on which a 2x2 structuring element F traverses.

Input I =
1 1 1 1
1 1 0 1
0 1 1 1
0 1 0 0

Structuring element F =
1 1
1 0

Starting from the top left we traverse the structuring element across all the pixels
in a left-to-right and top-to-bottom basis and a new pixel value is computed after each
iteration such that the final output becomes:

Eroded E = I ⊖ F =
0 1 0 0
0 0 0 0
0 1 0 0
0 1 0 0
The erosion process [I ⊖ F] for the above input and filter proceeds by sliding F over every position of I; the output pixel is set to 1 only where F fits entirely within the foreground, which yields the eroded result E shown above. [Step-by-step traversal illustration omitted.]

The following figure shows what the output of an erosion operation actually looks like:

Figure 4.6: Example of erosion on the original image (a) with a 3x3 (b) and a 5x5 (c)
structuring elements.

Properties of erosion: Fig. 4.7.


• Can split apart joined objects.
• Can strip away extrusions.

Figure 4.7: Use cases of erosion.


Dilation
Dilation involves dilating or expanding the objects at their boundaries in the image. It works in much the same way as erosion, with the following output condition:

Pixel(output) = replace with the filter if Hit, keep the same otherwise
The following figure represents dilation:

Figure 4.8: Dilation operation on an input image using a 2x2 structuring element

Here’s another example to show how dilation works. We start with the input image
I on which a 2x2 structuring element F traverses. The dot (.) represents the central
pixel viz. (0,0) for a 2x2 filter.

Input I =
0 1 0
1 0 0
0 0 0

Structuring element F =
1. 1
0  0

Starting from the top left we traverse the structuring element across all the pixels
in a left-to-right and top-to-bottom basis and a new pixel value is computed after each
iteration such that the final output becomes:

Dilated D = I ⊕ F =
0 1 1
1 1 0
0 0 0

The dilation process [I ⊕ F] using the substitution method works by placing the central pixel of F over each pixel of I; whenever there is a Hit, the 1s of the structuring element are copied (substituted) into the output at the covered positions, which yields the dilated result D shown above. [Step-by-step traversal illustration omitted.]
The same dilation process can also be performed using the vector substitution
method. In vector substitution method, first we note the positions of 1’s in the input im-
age as well as the structuring element. In the input image, 1’s are located at positions
PositionI (1) = {(0,1), (1,0)} and in the filter, 1’s are located at PositionF (1) = {(0,0),
(0,1)}.
Now to calculate PositionD (1) we perform vector addition PositionI (1)⊕PositionF (1)
as follows:

(0, 1) + (0, 0) = (0, 1)


(0, 1) + (0, 1) = (0, 2)
(1, 0) + (0, 0) = (1, 0)
(1, 0) + (0, 1) = (1, 1)

So Position_D(1) = {(0,1), (0,2), (1,0), (1,1)} can be used to compute the dilated output (notice that the vector-addition and substitution methods give the same result).

Figure 4.9: Dilation Illustration on the original image (a) with a 3x3 (b) and a 5x5 (c)
structuring element.

Dilation Properties:

• Fill holes and repair breaks.
• Repair intrusions.

Figure 4.10: Dilation use cases


Compound Morphological Operations
In practice, most morphological processing is not performed with dilation or erosion alone; instead the two are combined. The two most widely used compound morphological operations are explained below:

Figure 4.11: Output of compound operations on the input.

Opening
• Erosion first then dilation with the same structuring element. Mathematically,
represented as A ◦ B = D(E(A, B)) = (A ⊖ B) ⊕ B
• Removes any narrow connections and lines between two regions.
• Renders the edges sharper and smoother.

Closing
• Dilation first then erosion with the same structuring element. Mathematically,
represented as A • B = E(D(A, B)) = (A ⊕ B) ⊖ B
• Removes noise in the form of small holes i.e. fills in any small black areas or
holes in the image while maintaining the shape and size of the object in the
image.
Morphological Gradient
Dilation and erosion have opposite effects: dilation adds a layer of pixels to the boundaries of regions, while erosion strips one away. The difference between the dilation and the erosion of an image is termed the morphological gradient.
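A minimal sketch of these operations with OpenCV (assumed available; the file name and the 3x3 structuring element are illustrative):

import cv2
import numpy as np

binary = cv2.imread('segmented.png', cv2.IMREAD_GRAYSCALE)   # hypothetical binary image
kernel = np.ones((3, 3), np.uint8)                           # 3x3 structuring element

eroded   = cv2.erode(binary, kernel, iterations=1)
dilated  = cv2.dilate(binary, kernel, iterations=1)
opened   = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)      # erosion then dilation
closed   = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)     # dilation then erosion
gradient = cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel)  # dilation minus erosion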
5. Image Compression
Compression, at its core, is the reduction of the original (file) size of the image. Compression encodes the original image with fewer bits. The primary objective of image compression is to eliminate or minimize redundancy (the occurrence of similar bits) in the image, which allows for optimum resource utilization. For example, smaller images require less bandwidth to be transmitted across a channel, which makes transmission more efficient; similarly, storing a compressed image avoids unnecessary memory use.

Compression Ratio
Compression ratio is the ratio of the original or uncompressed image file to the reduced
or compressed file. The size of the compressed file depends on the compression ratio
as related by the following equation:
Compression ratio (C) = Size of uncompressed file / Size of compressed file
Compression ratio is often written as SIZEU : SIZEC .

Example#1 The original image is 256x256 pixels, single-band (grayscale), 8 bits per pixel. This file is 65,536 bytes (64k). After compression the image file is 6,554 bytes. The compression ratio is SIZE_U : SIZE_C, i.e. 65536 : 6554 = 9.99 ≈ 10 : 1. This is read as "10 to 1 compression" or "x10" compression.
Another way of stating this compression is to use the terminology of bits per pixel. The bpp is given by

bpp = Number of bits / Number of pixels

So in the above case, bpp = (8)(6554) / ((256)(256)), which yields 0.8.
CR is clearly a relative measure, while bpp is an absolute measure that represents the average number of bits needed to encode each pixel of the image. CR is often represented as a normalized ratio such as 2 : 1, meaning that the compressed file is half the size of the original.
For compressed images, as they are usually transformed into different representations, the bpp is evaluated indirectly as the average

bpp = SIZE_C / N_pixels
So, as the number of pixels (Npixels ) remains unchanged, CR can be related with
bpp as follows:
CR = bpp_U / bpp_C
For the above example, CR is given by 8/0.8 = 10. Since it’s a ratio, we write this
as 10 : 1.
Redundancy in Image Data
1. Coding redundancy
• Occurs when the data used to represent the image is not utilized in an opti-
mal fashion.
• If the gray levels of an image are coded in a way that uses more code sym-
bols than absolutely necessary to represent each gray level, then the result-
ing image is said to contain coding redundancy.
2. Interpixel redundancy
• Occurs because adjacent pixels tend to be highly correlated; in most images, the brightness levels do not change rapidly but change gradually.
3. Interband redundancy
• Occurs in color images due to the correlation between bands within an image: if we extract the red, green and blue bands, they look similar.
4. Psychovisual redundancy
• Some information is more important to the human visual system than other
types of information.

Compression System Model


The compression system model consists of two parts: a) the compressor and b) the decompressor. The compressor consists of a preprocessing stage and an encoding stage, whereas the decompressor consists of a decoding stage, optionally followed by a post-processing stage. Preprocessing is performed before encoding in order to prepare the image for the encoding process and may consist of a number of operations depending upon the application. Once the compressed file has been decoded, post-processing steps can be employed to eliminate some of the potentially undesirable artifacts introduced by the compression.

[Block diagram: Input image I(r,c) → Preprocessing (data reduction, mapping) → Encoding (quantization, coding) → Compressed file.]

Figure 5.1: Compressor System

[Block diagram: Compressed file → Decoding (decoding, inverse mapping) → Post-processing → Decoded image I(r,c).]

Figure 5.2: Decompressor System

Types: Lossy and Lossless


Compression can be either lossy or lossless. In lossless compression, all of the infor-
mation in the original image is retained as it is whilst the original size of the image is
reduced. On the other hand, in lossy compressions, some information may be lost and
the quality of the image might degrade up to a certain degree during the process. This
gives an idea that, images that undergo lossless compression take up more space than
those that undergo lossy compression. Raw images have the lowest degree of com-
pression (which makes them lossless) but they do not have viewing support. Usually
PNG, BMP (Microsoft), TIFF, and WebP (exclusive to the web) image formats have
lossless compressions and JPG, JPEG and GIFs (animated) have lossy compres-
sions. Lossless algorithms usually just group similar pixels together which maintains
the same quality as before it was compressed, lossy compression algorithms are more
likely to pretty much remove similar pixels.
The compression algorithm to use always depends on the end goal/application for
which the image is being used. For example, traffic... With medical images, lossless
compression is always preferred because information must be preserved as much as
possible.

Lossless Compression
• Image size is reduced without any quality loss
• Lossless compression algorithms usually do not remove any pixels, instead they
just group similar pixels together. This renders lossless compression reversible.
• The biggest benefit to lossless compression is that the quality of the image is
retained and we can still achieve a smaller file size.
• The downside to lossless compression is that the file size, even though smaller than the original, can still be quite big. These days Google's WebP format is combined with lossless compression to achieve a significant reduction in file size while keeping the image quality the same.
Here is a list of a few lossless algorithms:
• Run-length encoding
• Huffman-coding

• Lempel Ziv-Welch (LZW)

• Arithmetic Coding
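As a minimal, hedged illustration of the simplest of these, run-length encoding, in plain Python (the helper names are illustrative):

def rle_encode(pixels):
    """Run-length encode a flat sequence of pixel values as [value, count] pairs."""
    encoded = []
    for value in pixels:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded

def rle_decode(encoded):
    return [value for value, count in encoded for _ in range(count)]

row = [255, 255, 255, 0, 0, 255, 255, 255, 255]
print(rle_encode(row))                       # [[255, 3], [0, 2], [255, 4]]
assert rle_decode(rle_encode(row)) == row    # lossless: fully reversible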

Lossy Compression
• Some of the data from the original image is lost.
• Images that have undergone lossy compression cannot be restored to the original, meaning that lossy compression algorithms are irreversible.
• If the compression is repeated a number of times, more degradation occurs each time, which isn't the case with lossless compression.

Here is another list of a few lossy algorithms:

• Transform encoding

• Discrete Cosine Transformation (DCT) : JPEG Compression

• Discrete Wavelet Transformation (DWT)

Negative Image Compression


Negative compression occurs when the compression algorithm results in a file larger than the original uncompressed file.
6. Neuroimages Analysis
Before diving deep into neuroimage analysis, let's discuss some key techniques used to record the brain. We will focus on EEG and MRI because of their widespread use in neuroimaging.

MRI
sMRI
sMRI, short for structural MRI, is the most commonly used MRI modality which is
used to record the anatomical structure of tissues i.e. to display the contrast between
different tissues in the brain.

dMRI
Diffusion MRI creates contrast based on the direction of diffusion of water molecules in the brain. This technique is used to map brain structures such as the white matter
tracts in the brain. WM tracts are insulated and transmit electrical signals over long
distances. This allows communications between brain regions with high bandwidth
and speed. Brain regions are made up of gray matter. GM is regarded as the brain’s
substrate for computation and memory. The magnitude and direction of diffusion of
water molecules measured with dMRI are used to provide us with an idea of the struc-
ture and directionality of WM tracts. Mapping WM tracts in the brain is called WM
tractography.

fMRI
Functional MRI, records the activity in the brain i.e. the functional connectivity between
the brain regions. Unlike sMRI, fMRI does not use properties of hydrogen to create a
contrast but uses the blood-oxygen-level-dependent (BOLD) signal. The BOLD signal
measures the amount of blood flow and blood oxygenation in different regions of the
brain over time. An increase in energy consumption in a certain brain region increases
the blood flow and the transportation of oxygen to that brain region. fMRI thus indirectly
records the neural activity, either while a person is performing some task (inside the
MRI scanner, like watching a movie) or while in the resting state, performing no task at
all to understand the neural activity. fMRI signals are spatio-temporal (4D).

EEG
EEG records brain signals using electrodes.
