IA Unit-04
Feature Extraction
Boundary Preprocessing:
Boundary preprocessing in feature extraction for image analytics involves applying various
techniques to enhance the quality and suitability of object boundaries for feature extraction.
These techniques aim to improve the accuracy, robustness, and efficiency of subsequent
feature extraction algorithms. Here are some common boundary preprocessing steps (a minimal sketch combining several of them follows the list):
1. Noise Reduction:
Gaussian Smoothing: Apply Gaussian blur to reduce noise in the image, making the
boundary smoother and more continuous.
Median Filtering: Use median filtering to remove impulse noise while preserving
edge information.
2. Edge Detection:
Canny Edge Detection: Detect edges in the image using the Canny edge detector,
which finds the edges based on gradient magnitude and direction.
Sobel, Prewitt, or Roberts Operators: Apply gradient-based edge detectors to
highlight regions of rapid intensity change.
3. Thresholding:
Global Thresholding: Apply a single threshold to segment the image into foreground
(object) and background regions.
Adaptive Thresholding: Use adaptive methods to set thresholds based on local
image properties, which can handle variations in illumination and contrast.
4. Morphological Operations:
Erosion and Dilation: Perform erosion to shrink the object boundaries and dilation to
expand them. This helps to remove small irregularities and fill gaps in the boundary.
Opening and Closing: Apply opening to remove small objects and closing to fill
small gaps in the object boundary.
5. Contour Detection:
FindContours: Use contour detection algorithms to extract the boundary of objects
directly from the binary image obtained after thresholding.
Active Contour Models (Snakes): Employ active contour models to refine object
boundaries iteratively based on image features and prior knowledge.
6. Boundary Smoothing:
Curve Fitting: Fit parametric curves (e.g., B-splines) to the detected boundary points
to obtain a smoother representation.
Gaussian Smoothing: Apply Gaussian filtering along the boundary to remove small-
scale irregularities.
7. Normalization:
Scale Normalization: Normalize the scale of the boundary points to make them
invariant to size variations.
Rotation Normalization: Align the object boundaries to a common orientation to
make them invariant to rotation.
8. Boundary Segmentation:
Split and Merge Algorithms: Segment the boundary into meaningful segments based
on curvature or other features.
Dynamic Programming: Use dynamic programming algorithms to find optimal
segmentations based on certain criteria.
9. Boundary Completion:
Closing Gaps: Fill small gaps in the boundary caused by noise or segmentation
errors.
Bridge Gaps: Connect disjointed parts of the boundary that belong to the same
object.
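As a concrete illustration, here is a minimal sketch chaining several of these steps, assuming OpenCV (cv2) and a hypothetical grayscale input image object.png:

```python
import cv2
import numpy as np

# hypothetical input: a grayscale image containing a single object
img = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)

# 1. noise reduction: Gaussian smoothing
smoothed = cv2.GaussianBlur(img, (5, 5), 0)

# 2. edge detection: Canny (gradient magnitude and direction with hysteresis)
edges = cv2.Canny(smoothed, 50, 150)

# 3. thresholding: Otsu's method selects a global threshold automatically
_, binary = cv2.threshold(smoothed, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 4. morphology: closing fills small gaps in the object boundary
kernel = np.ones((3, 3), np.uint8)
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

# 5. contour detection: keep the largest external contour as the boundary
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boundary = max(contours, key=cv2.contourArea)
```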
Boundary preprocessing is essential for obtaining accurate and robust features from object
boundaries, which are crucial for various tasks such as object recognition, classification, and
segmentation in image analytics.
Feature Extraction:
Feature extraction is a crucial step in image analytics. It involves transforming raw image
data into a representation that is more suitable for analysis. One important aspect of feature
extraction is background representation, which involves separating the background from the
foreground or objects of interest in an image. Here are some common methods used for
background representation in image analytics:
1. Thresholding: This method involves setting a threshold value and classifying pixels in the
image as either background or foreground based on whether their intensity values are above
or below the threshold. This is particularly effective for images with a clear contrast between
the background and foreground.
2. Background Subtraction: In this technique, a model of the background is created and then subtracted from the original image to isolate the foreground. This is often used in video analytics where the background is relatively static (a minimal sketch appears below).
3. Morphological Operations: Operations such as erosion and dilation can be used to remove
noise and smooth the boundaries between the background and foreground regions.
Morphological operations are particularly useful in segmenting objects from the background.
4. Feature-based Methods: Features like color, texture, and shape can be used to differentiate
between background and foreground. For example, in texture-based methods, statistical
properties of texture can be used to classify regions as background or foreground.
5. Machine Learning Techniques: Supervised and unsupervised machine learning algorithms
can be trained to automatically learn the characteristics of the background and segment it
from the foreground. Techniques such as clustering, decision trees, and neural networks can
be employed for this purpose.
6. Frequency Domain Methods: Techniques such as Fourier transform or wavelet transform
can be used to analyze images in the frequency domain. This can help in separating
background noise from the actual image content.
7. Sparse Coding and Dictionary Learning: These methods aim to represent images as a
linear combination of basic elements (dictionary atoms). The background can be represented
by a subset of these elements.
8. PCA (Principal Component Analysis): PCA can be used to reduce the dimensionality of the
image data. In the reduced space, the background and foreground can be more easily
separated.
The choice of method depends on factors like the complexity of the background, the type of
images being analyzed, computational resources available, and the specific requirements of
the analysis task. Often, a combination of methods may be used to achieve the best results.
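For background subtraction in particular (method 2 above), a minimal sketch using OpenCV's MOG2 subtractor and a hypothetical video file traffic.mp4 might look like this:

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")  # hypothetical video with a mostly static background
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # foreground mask: non-zero where the pixel deviates from the background model
    fg_mask = subtractor.apply(frame)
    # morphological opening removes small noise specks from the mask
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)

cap.release()
```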
Boundary Feature Descriptors:
1. Chain Codes: Chain codes represent the boundary of an object by encoding the direction of the boundary at each point. They are simple and efficient descriptors that can capture the overall shape of an object (a minimal chain-code sketch appears after this list).
2. Fourier Descriptors: Fourier descriptors represent the boundary of an object by
decomposing it into a series of Fourier coefficients. These coefficients capture the frequency
components of the boundary shape and can be used to characterize its overall shape,
curvature, and orientation.
3. Curvature Scale Space (CSS) Descriptors: CSS descriptors represent the boundary of an
object by encoding its curvature at different scales. They capture the local curvature
information along the boundary and are robust to noise and scale variations.
4. Histogram of Oriented Gradients (HOG): HOG descriptors capture the gradient
information along the boundary of an object. They divide the boundary into small regions and
compute histograms of gradient orientations within each region. HOG descriptors are widely
used in object detection and recognition tasks.
5. Shape Context Descriptors: Shape context descriptors represent the spatial distribution of
boundary points around each point on the boundary. They capture both local and global shape
information and are invariant to translation, rotation, and scale.
6. Zernike Moments: Zernike moments are orthogonal moments that capture the shape
information of an object's boundary. They are invariant to rotation and scale and can be used
to characterize the shape of objects with complex boundaries.
7. Hu Moments: Hu moments are a set of seven invariant moments that capture shape
information based on the image's intensity values. They are invariant to translation, rotation,
and scale and can be used to characterize the shape of objects in binary images.
8. Local Binary Patterns (LBP): LBP descriptors capture the local texture information along
the boundary of an object. They encode the local binary patterns of pixel neighborhoods and
are useful for texture classification and segmentation tasks.
9. Scale-Invariant Feature Transform (SIFT): SIFT descriptors capture local features along
the boundary of an object. They are invariant to rotation, scale, and illumination changes and
are widely used in image matching and object recognition tasks.
These boundary feature descriptors can be combined or used in conjunction with other image
features to improve the performance of various computer vision tasks. The choice of
descriptor depends on factors such as the complexity of the objects, the level of invariance
required, and the computational resources available.
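As an example of the first descriptor above, here is a minimal Freeman chain-code sketch, assuming a uint8 binary mask as input and OpenCV for contour extraction; the 8-direction numbering used is one common convention (x to the right, y downward):

```python
import numpy as np
import cv2

# one common 8-direction (Freeman) numbering for unit steps (dx, dy)
DIRS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
        (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code(mask):
    # mask: uint8 binary image (0 background, 255 object)
    # ordered boundary points of the largest object in the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    pts = max(contours, key=cv2.contourArea).squeeze()
    code = []
    for (x0, y0), (x1, y1) in zip(pts, np.roll(pts, -1, axis=0)):
        step = (int(x1 - x0), int(y1 - y0))
        if step in DIRS:  # consecutive contour points are 8-connected
            code.append(DIRS[step])
    return code
```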
Shape Numbers:
In image analytics, feature extraction is a crucial step where relevant information is extracted
from an image to represent it in a more concise and meaningful form. One important aspect
of feature extraction is the use of shape descriptors or shape-based features. Shape descriptors
are numerical representations of the shape characteristics of objects or regions in an image.
One of the key shape descriptors used in feature extraction is the shape number.
The shape number is a simple and intuitive descriptor that provides a measure of how complex a shape is. It is based on the concept of the Euler characteristic, which relates the numbers of vertices, edges, and faces of a polyhedron. For 2D images, it is computed as:
Shape Number = (Number of Regions) − (Number of Holes)
where:
1. Number of Regions: This is the number of distinct enclosed areas or connected components
within the object. In simpler terms, if you imagine an object as a filled shape, each separate
filled area would be a region. For example, in the case of a circular object with a hole inside
it, there would be two regions: one for the outer area and one for the inner hole.
2. Number of Holes: Holes are regions of space within an object's boundary that are not part of
the object itself. They are bounded by the contour of the object but not filled. For instance,
the hole inside a doughnut shape is a classic example.
The shape number therefore gives us a measure of the object's topological complexity. A single solid object has a shape number of 1; additional regions raise the value, while holes lower it, so values that deviate from 1 signal more complex shapes with multiple regions or holes.
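A minimal sketch of this computation, assuming a boolean foreground mask and SciPy's connected-component labeling:

```python
import numpy as np
from scipy import ndimage

def shape_number(mask):
    # mask: boolean 2D array, True on the object
    # number of regions: connected components of the foreground
    _, num_regions = ndimage.label(mask)
    # label the background; components that do not touch the image border are holes
    bg_labels, num_bg = ndimage.label(~mask)
    border = np.concatenate([bg_labels[0], bg_labels[-1],
                             bg_labels[:, 0], bg_labels[:, -1]])
    touching_border = len(set(border.tolist()) - {0})
    num_holes = num_bg - touching_border
    return num_regions - num_holes
```

For example, a solid disk yields 1, while a ring (one region enclosing one hole) yields 0. Typical applications of shape numbers include: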
1. Object Recognition: Shape number can be used as a feature for recognizing and classifying
objects based on their shapes. Objects with similar shape numbers might belong to the same
category or class.
2. Image Segmentation: Shape number can help in segmenting images by distinguishing
between different objects based on their shapes. Objects with different shape numbers can be
separated out during segmentation processes.
3. Quality Control and Inspection: In industrial applications, shape number can be used to
identify defects or irregularities in products based on changes in their shape complexity.
4. Biomedical Image Analysis: In medical imaging, shape number can aid in identifying and
analyzing different structures like cells, tissues, or organs based on their shapes.
Fourier descriptors:
Fourier descriptors are a powerful tool in feature extraction for analyzing the shape of objects
in images. They are based on Fourier transformations, which are widely used in signal
processing to analyze the frequency components of a signal. In the context of images, Fourier
descriptors are used to represent the shape of an object's boundary.
1. Contour Extraction: The first step is to extract the contour of the object of interest from the image. This contour is typically represented as a sequence of points on the boundary of the object. (A minimal sketch combining the steps below follows the list.)
2. Normalization: Before applying Fourier transformations, it's often necessary to normalize
the contour. Normalization involves translating and scaling the contour so that it becomes
invariant to translation, rotation, and scale. This step ensures that the Fourier descriptors
capture the shape information regardless of the object's position, orientation, or size.
3. Complex Representation: Once the contour is normalized, each point on the contour is
represented as a complex number, where the real part corresponds to the x-coordinate and the
imaginary part corresponds to the y-coordinate.
4. Discrete Fourier Transform (DFT): The complex representation of the contour is then
transformed using the Discrete Fourier Transform (DFT). The DFT converts the sequence of
points in the contour into a series of complex coefficients, which represent the frequency
components of the shape.
5. Filtering and Truncation: Since Fourier descriptors capture both shape and noise
information, it's common to filter out the higher frequency components that represent noise or
fine details. This is often done by truncating the Fourier series after a certain number of
coefficients, keeping only the most significant coefficients.
6. Inverse DFT: After filtering, the inverse Discrete Fourier Transform (IDFT) is applied to
reconstruct the contour from the filtered Fourier coefficients.
7. Feature Extraction: The resulting set of Fourier descriptors forms a feature vector that
represents the shape of the object. These descriptors capture important information about the
object's shape, such as its curvature, symmetry, and overall contour.
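A minimal sketch of this pipeline, assuming a uint8 binary mask as input; here the invariances of step 2 are obtained by normalizing the Fourier coefficients themselves, and only coefficient magnitudes are kept (one common alternative to applying the inverse DFT):

```python
import numpy as np
import cv2

def fourier_descriptors(binary_mask, num_coeffs=16):
    # step 1: extract the boundary of the largest object
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).squeeze()  # (N, 2) array of (x, y)
    # step 3: complex representation z = x + iy
    z = contour[:, 0] + 1j * contour[:, 1]
    # step 4: DFT of the boundary sequence
    Z = np.fft.fft(z)
    # translation invariance: drop the DC term
    Z = Z[1:]
    # scale invariance: normalize by the magnitude of the first harmonic
    Z = Z / np.abs(Z[0])
    # rotation and start-point invariance: keep magnitudes only (step 5: truncate)
    return np.abs(Z[:num_coeffs])
```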
Advantages of Fourier descriptors:
Rotation, Translation, and Scale Invariance: Because Fourier descriptors are based on the frequency components of the shape, they can be made invariant to translation, rotation, and scale changes. This makes them robust in situations where the object's position, orientation, or size varies.
Compact Representation: Fourier descriptors provide a compact representation of
shape information. By retaining only a subset of the most significant Fourier
coefficients, the dimensionality of the feature space can be reduced, which is
beneficial for efficient storage and processing.
Discriminative Power: Fourier descriptors capture important shape characteristics of
objects, making them effective for tasks such as object recognition, classification, and
shape matching.
Limitations:
Sensitivity to Noise: Like many other frequency-based methods, Fourier descriptors can be sensitive to noise in the image. Filtering techniques are often used to mitigate this issue.
Limited Local Information: Fourier descriptors represent the entire shape of the
object, but they may not capture local details or texture information within the object.
Regional feature descriptors are techniques used in feature extraction to capture information
from specific regions of interest within an image. These regions could be areas with distinct
visual characteristics, such as corners, blobs, or regions surrounding keypoints detected in the
image. Regional feature descriptors are valuable in various computer vision tasks, including
object recognition, image matching, and image retrieval.
6. Local Self-Similarity Descriptor (LSSD):
LSSD captures the self-similarities within local regions of the image.
It computes histograms of gradient orientations for small patches and measures the
similarity between pairs of patches.
LSSD descriptors are effective for texture classification and image retrieval tasks.
7. Gabor Filters:
Gabor filters are used to extract texture features from different frequency bands and
orientations within local image regions.
These filters produce responses that are then used to compute descriptors representing
texture characteristics.
Gabor descriptors are effective for texture classification, segmentation, and
fingerprint recognition.
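A minimal Gabor filter bank sketch, assuming OpenCV and a hypothetical grayscale input texture.png; the kernel parameters are illustrative:

```python
import cv2
import numpy as np

img = cv2.imread("texture.png", cv2.IMREAD_GRAYSCALE)  # hypothetical texture image

features = []
for theta in np.arange(0, np.pi, np.pi / 4):  # 4 orientations
    kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                lambd=10.0, gamma=0.5, psi=0)
    response = cv2.filter2D(img, cv2.CV_32F, kernel)
    # mean and variance of each filter response serve as simple texture features
    features.extend([response.mean(), response.var()])
```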
Topological descriptors and texture descriptors are important feature extraction techniques in
image analysis and computer vision, each focusing on different aspects of the image. Let's
discuss each of them:
Topological Descriptors:
Topological descriptors aim to capture the spatial relationships and connectivity of objects
within an image. These descriptors provide information about the shape and structure of
objects and their arrangement in the image.
1. Euler Number:
The Euler number, also known as the Euler characteristic, is a topological invariant
that describes the connectivity and number of holes in an object.
It is calculated as χ = V − E + F, where V is the number of vertices, E the number of edges, and F the number of faces (regions) of an object; for a binary image this reduces to the number of connected components minus the number of holes.
This descriptor is useful in distinguishing objects with different topological
properties.
2. Betti Numbers:
Betti numbers are another set of topological invariants that provide more detailed
information about the topology of objects.
They describe the number of connected components, holes, tunnels, and voids in an
object.
Betti numbers can be computed at different scales, providing multi-scale topological
information.
3. Skeletonization:
Skeletonization is a process to extract the medial axis or skeleton of objects in the
image.
The skeleton represents the centerline or main structure of objects, providing a
simplified representation while preserving topological characteristics.
4. Persistent Homology:
Persistent homology is a more advanced topological descriptor that characterizes the
topological features of shapes across different scales.
It identifies topological features such as connected components, loops, and voids, and
tracks their persistence across various scales.
Persistence diagrams or barcodes are often used to represent the persistent topological
features.
Texture Descriptors:
Texture descriptors focus on capturing the spatial arrangement of pixel intensities in an image
to characterize its texture or surface appearance.
Principal Component Analysis (PCA):
PCA is a widely used technique for deriving compact feature descriptors from image data:
1. Dimensionality Reduction:
PCA identifies the principal components (PCs) of the data, which are the directions of
maximum variance.
These PCs form a new orthogonal basis that represents the data in a lower-
dimensional space while preserving as much variance as possible.
In image processing, the data can represent the pixel intensities of images. PCA can
reduce the dimensionality of this data, effectively compressing the image information.
2. Eigenfaces:
One popular application of PCA in image processing is in face recognition, where
PCA is used to derive eigenfaces.
Eigenfaces are the principal components of a set of face images.
By projecting face images onto the eigenfaces space, each face image can be
represented by a set of coefficients, which serve as feature descriptors.
These coefficients capture the most important variations in the face images, such as
lighting conditions, facial expressions, and pose variations.
3. Texture Descriptors:
PCA can also be used to extract texture descriptors from images.
Given a set of texture images, PCA can reduce the dimensionality of the texture
features while preserving the most significant variations.
The resulting principal components can serve as texture descriptors, capturing
important textural information present in the images.
4. Feature Fusion:
PCA can be combined with other feature extraction techniques to create more robust
feature descriptors.
For example, in image classification tasks, PCA can be applied to different types of
features, such as color histograms, texture features, and shape descriptors.
The resulting principal components from each feature type are concatenated or fused
together to form a more comprehensive feature vector.
5. Noise Reduction:
PCA can also be used for denoising images.
By representing the image in a lower-dimensional space spanned by the principal
components, PCA effectively filters out noise components that correspond to smaller
eigenvalues.
The reconstructed image using only the most significant principal components tends
to preserve the essential structures of the image while reducing noise.
6. Visualization:
While not strictly a feature descriptor, PCA can help visualize the distribution of data
in a lower-dimensional space.
In applications such as image clustering or visualization, PCA can reduce high-
dimensional image data to 2 or 3 dimensions, allowing easy visualization and
understanding of the data distribution.
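A minimal PCA sketch via the singular value decomposition, using random data as a stand-in for a real image set; when X holds flattened face images, the rows of Vt play the role of eigenfaces:

```python
import numpy as np

# stand-in data: one flattened 64x64 image per row (n_samples, n_pixels)
X = np.random.rand(100, 64 * 64).astype(np.float32)

# center the data
mean = X.mean(axis=0)
Xc = X - mean

# PCA via SVD: rows of Vt are the principal components
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 20                        # keep the top-k components
components = Vt[:k]           # (k, n_pixels)

# project images into the k-dimensional feature space (the feature descriptors)
features = Xc @ components.T  # (n_samples, k)

# reconstruct compressed / denoised images from the projection
X_rec = features @ components + mean
```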
Advantages of PCA-based features:
Dimensionality Reduction: PCA reduces the dimensionality of the data, which can lead to more efficient computation and storage of feature vectors.
Noise Reduction: PCA can filter out noise components from the data, leading to
cleaner feature representations.
Interpretability: PCA-derived features can sometimes be interpreted in terms of the
underlying patterns or structures in the data.
Computational Efficiency: PCA can significantly reduce the computational cost of
subsequent processing steps, especially in high-dimensional data.
Limitations:
Loss of Information: PCA discards information in the data associated with the lower-ranked principal components, which might be important for certain tasks.
Linearity Assumption: PCA assumes that the data is linearly related, which might
not always hold true for complex data distributions.
Sensitivity to Scaling: PCA is sensitive to the scale of the input features, so data
normalization might be required.
Lack of Interpretability: While PCA-derived features capture the most significant
variations in the data, they might not be directly interpretable in terms of the original
features.
Scale-Invariant Feature Transform (SIFT):
SIFT detects keypoints in scale space and describes each one with a 128-dimensional histogram of gradient orientations. Matching is typically performed using distance metrics such as Euclidean distance or cosine similarity between descriptors.
Keypoints with the closest matching descriptors are considered as corresponding
points between images.
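A minimal detection-and-matching sketch, assuming OpenCV with SIFT available as cv2.SIFT_create and two hypothetical input images; the 0.75 ratio test is Lowe's standard heuristic:

```python
import cv2

img1 = cv2.imread("scene1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical inputs
img2 = cv2.imread("scene2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# match descriptors by Euclidean (L2) distance; keep matches that pass the ratio test
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
```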
Key properties of SIFT:
Scale and Rotation Invariance: SIFT features are invariant to changes in scale and rotation, making them robust to viewpoint changes.
Distinctiveness: SIFT features are highly distinctive, allowing reliable matching even
in cluttered or noisy images.
Localization Accuracy: SIFT accurately localizes keypoints to sub-pixel precision,
which improves matching performance.
Robustness: SIFT is robust to changes in lighting conditions and partial occlusions.
Widely Used: SIFT has been extensively used in various applications and has become
a standard technique in computer vision.