International Journal of Engineering and Technical Research (IJETR)

ISSN: 2321-0869, Volume-2, Issue-11, November 2014

Comparative Study of Various Image Segmentation Techniques
Harshil Shah, Rahul Shah

Abstract: Image segmentation is considered an integral part of the OCR process. Recent advances in OCR are attributed largely to better manipulation of images in terms of segmentation. This paper reviews the progress made. Segmentation methods are grouped under three main techniques. The first technique, the Hough transform, relies on mathematical equations to expedite the segmentation process. The second technique uses connected components to segment the images. The third uses clustering algorithms to group the data points of the image.

Index Terms: CCL, Expected Maximization, Hough Transform, Image Segmentation.

Manuscript received November 05, 2014.
Harshil Shah, BE Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India, +919322065106.
Rahul Shah, BE Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India, +919769568468.

I. INTRODUCTION

Image segmentation [1], [2] is the process of partitioning a digital image into multiple segments. The objective of segmentation is to simplify or change the representation of an image into something that is easier to analyze. Segmentation is based on certain characteristics, such as color, shape and orientation. It can be performed on a two-dimensional bitmap in which each pixel of the image is represented by one bit. Formally, image segmentation is a function that divides the image x into sub-images x_j such that every sub-image belongs to a particular equivalence class defined by a relation. The methods for image segmentation are described below.

Fig. 1 Classification of various segmentation techniques

II. EDGE BASED TECHNIQUE

This method performs segmentation by detecting edges or boundaries between two distinct or contrasting regions. Saha et al. [3] use an edge-based technique to improve the overall efficiency of their system. Their approach transforms the input image with the Hough transform for directional segmentation of lines and words from any type of image. The transform is applied iteratively, from sentences to words and from words to characters. The Hough image is generated from the binarized edge map of the image. The Hough transform exposes several parameters that can be tuned for better results.

A. Algorithm

1. Locate all the feature points in the image space.
2. For each feature point in the image space, plot a set of lines in the Hough space.
3. Plot the intersections in the Hough space into a 2-D accumulator.
4. After all the plotting, find the local maxima in the accumulator.
5. If required, map each maximum back into the image space.

The Hough plane is represented in terms of (ρ, θ) instead of (x, y). This representation avoids the case in which the line is parallel to the y-axis, where the slope of the line tends to infinity. The equation of the line thus becomes

$$x\cos\theta + y\sin\theta = \rho \qquad (1)$$

Fig. 2 Alternative representation of a straight line in the (ρ, θ) plane.
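The voting procedure of the Hough transform in Section II.A can be illustrated with a short sketch. This is not the implementation of [3]; it is a minimal NumPy version under stated assumptions: the input is a binary edge map, the function names `hough_accumulator` and `strongest_line` are hypothetical, and the resolutions `delta_rho` and `delta_theta` are illustrative stand-ins for the deltaRho and deltaTheta tuning parameters named in the paper.

```python
import numpy as np

def hough_accumulator(edge_map, delta_rho=1.0, delta_theta=np.pi / 180):
    """Vote in a (rho, theta) accumulator for every non-zero pixel of a binary edge map."""
    height, width = edge_map.shape
    thetas = np.arange(0.0, np.pi, delta_theta)
    # Integer offset so that negative rho values map to non-negative row indices.
    max_rho = int(np.ceil(np.hypot(height, width)))
    n_rho = int(np.ceil(2 * max_rho / delta_rho)) + 1
    accumulator = np.zeros((n_rho, len(thetas)), dtype=np.int64)

    ys, xs = np.nonzero(edge_map)              # step 1: feature points
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    theta_idx = np.arange(len(thetas))
    for x, y in zip(xs, ys):
        rhos = x * cos_t + y * sin_t           # equation (1) for every theta
        rho_idx = np.round((rhos + max_rho) / delta_rho).astype(int)
        accumulator[rho_idx, theta_idx] += 1   # steps 2-3: one vote per theta
    return accumulator, thetas, max_rho

def strongest_line(accumulator, thetas, max_rho, delta_rho=1.0):
    """Step 4, simplified: return (rho, theta) of the single global maximum."""
    rho_i, theta_i = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return rho_i * delta_rho - max_rho, thetas[theta_i]
```

A real system would look for several local maxima rather than one global maximum. Neighbouring θ bins can tie on short lines, so the recovered angle is only accurate to about the chosen `delta_theta`.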


B. Preprocessing and Tuning

The parameters used for tuning are deltaRho, deltaTheta, startTheta, endTheta, connectDistance and pixelCount. Before being transformed, the image goes through a number of stages: it is pre-processed and binarized (using the Otsu algorithm [4]). Edge detection of the objects is then performed using various masks. The masks used are as follows:

[Four edge-detection mask matrices; the entries are not recoverable from the source.]

All the marked white lines in the Hough-transformed images are segmented using the CCL algorithm. In the algorithm [6], each white pixel searches for its white neighbors. The 8-connected neighbors are searched, and a non-recursive function call is used to reduce system resource usage and time complexity. Lastly, a bounding box is created that envelops the recognized words.

III. SHAPE BASED TECHNIQUE

Another technique that is useful in segmentation is shape based. These techniques take into consideration the homogeneity of a particular area (forming a region). The paper [5] uses discriminative learning for connected-component-based classification. A self-tunable multilayer perceptron (MLP) classifier is trained to distinguish between text and non-text connected components, using shape and context information as the feature vector.

A. Shape of the connected component

In most documents, the non-text components are larger than the text components. Thus, size information plays a key role in classification. Size alone is not sufficient, however, so the shape of the text and non-text components, which can be learned by the MLP classifier, is also used.

To generate the feature vector, each connected component is rescaled to a 40×40 pixel window. Only downscaling is performed: if the length or height is greater than 40, it is downscaled to 40; if it is less than 40, the component is fitted to the center of the window. The advantage of doing so is that the shapes of smaller and larger components remain distinguishable.

Together with the raw rescaled connected component, the shape-based feature vector also includes four size-based features:

1. Normalized length: the ratio of the length of the component to the length of the input image.
2. Normalized height: the ratio of the height of the component to the height of the input image.
3. Aspect ratio of a component: the ratio of its length to its height.
4. The ratio of the number of foreground pixels to the total rescaled area.

B. Surrounding context of connected component

Generally, text components are aligned horizontally in the document, unlike non-text components. Hence, the surrounding components are also used to build the feature vector. Each connected component, together with its surrounding context area, is rescaled to a 40×40 window to generate the context-based feature vector. The surrounding context area is not fixed for all connected components; it is a function of the component's length (l) and height (h). For each connected component, the context area has dimensions 5×l by 2×h. The size of the context-based feature vector is 1600.

Hence the total size of the feature vector is 3204, consisting of the raw rescaled shape (1600), the raw rescaled context (1600), and four size-based features.

C. Classification

For classification, the paper makes use of Auto-MLP, a self-tuning classifier that can automatically adjust its learning parameters. For a set of candidate MLP classifiers, learning parameters are chosen from a parameter space that has been sampled according to a probability distribution function. All of these MLPs are trained for a few epochs, and then half of them are selected for the next generation based on performance. After training, the MLP classifier labels each connected component as text or non-text based on the classification probabilities.

IV. CLUSTERING BASED TECHNIQUES

Apart from edge-based and shape-based methods, there are techniques derived from data mining that facilitate the process of segmentation. The paper [6] uses the K-Means and EM algorithms, which are useful for segmenting images.

A. K-Means Algorithm

The K-Means algorithm is an example of an unsupervised clustering algorithm. It classifies the input data points into different clusters based on their Minkowski distance:

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^{p} \right)^{1/p} \qquad (2)$$

The algorithm assumes that the pixels of the image form a vector space and tries to cluster them naturally according to their intensities. The points are clustered around centroids μ_i, with i ranging from 1 to k, so as to minimize the distance of the data points from the centroids of their respective clusters. The algorithm uses an iterative approach to cluster the data points. Here the data points are simply the pixel intensities.

The algorithm is given below.

1. Compute the histogram of the pixel intensities of the image.
2. Randomly select k data points to act as the initial cluster centroids.
3. Repeat steps 4 and 5 until the cluster labels of the image no longer change.
4. Cluster the points based on the distance of their intensities from the centroid intensities:

$$c^{(i)} = \arg\min_{j} \lVert x^{(i)} - \mu_j \rVert^{2}$$

5. Compute the new centroid for each of the clusters.
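The K-Means steps above can be sketched on a flat array of gray-level intensities. This is a minimal illustration and not the code of [6]. The function name `kmeans_intensity`, the `max_iter` cap and the fixed `seed` are assumptions made for the example. The histogram of step 1 is skipped and the raw pixel values are clustered directly. The distance used is the absolute intensity difference, which is what the Minkowski distance reduces to for scalar data.

```python
import numpy as np

def kmeans_intensity(pixels, k, max_iter=100, seed=0):
    """Cluster scalar pixel intensities into k groups; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float).ravel()
    # Step 2: pick k distinct intensity values as the initial centroids
    # (requires at least k distinct intensities in the input).
    centroids = rng.choice(np.unique(pixels), size=k, replace=False)
    labels = np.full(pixels.shape, -1)
    for _ in range(max_iter):                      # step 3: iterate
        # Step 4: assign every pixel to its nearest centroid.
        distances = np.abs(pixels[:, None] - centroids[None, :])
        new_labels = np.argmin(distances, axis=1)
        if np.array_equal(new_labels, labels):     # labels stopped changing
            break
        labels = new_labels
        # Step 5: move each centroid to the mean of its assigned pixels.
        for j in range(k):
            members = pixels[labels == j]
            if members.size:                       # leave empty clusters unchanged
                centroids[j] = members.mean()
    return labels, centroids
```

For an image, the returned labels can be reshaped to the image dimensions to obtain a segmentation map, for example `labels.reshape(image.shape)`.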

The new centroid of each cluster is the mean of the points assigned to it:

$$\mu_j = \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}}$$

where c^{(i)} is the cluster label assigned to data point x^{(i)} and m is the number of data points.

The parameter on which the above algorithm is tuned is k, the number of clusters to be formed for the given set of data points. The characters in a text are grouped into the same cluster because most of the characters have the same intensity.

B. Expected Maximization

In unsupervised learning, one of the most widely used algorithms is Expected Maximization (EM). The data model depends on hidden variables, and the method computes the maximum a posteriori (MAP) estimate of the parameters. In Expected Maximization, the steps are performed iteratively until consecutive iterations give the same value. The Expectation Step (E step) computes the probability of the hidden variables given the observed data. The Maximization Step (M step) then maximizes, with respect to the parameters, the expected log probability found in the previous step. The E step and M step are repeated until the result converges. The parameters calculated in the M step are used in the next E step. The above explanation can be expressed mathematically as follows.

Given a training dataset {x^{(1)}, x^{(2)}, ..., x^{(m)}} and a model p(x, z; θ), where z is the latent variable, we have:

$$\ell(\theta) = \sum_{i=1}^{m} \log p(x^{(i)}; \theta) = \sum_{i=1}^{m} \log \sum_{z^{(i)}} p(x^{(i)}, z^{(i)}; \theta)$$

It is evident from the above equation that the log probability is described in terms of x, z and θ. But since z, the hidden variable, is not known, we approximate its value. The approximations are derived mathematically using the E and M steps and are given below.

Expectation Step, for each i:

$$Q_i(z^{(i)}) := p(z^{(i)} \mid x^{(i)}; \theta)$$

Maximization Step:

$$\theta := \arg\max_{\theta} \sum_{i} \sum_{z^{(i)}} Q_i(z^{(i)}) \log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}$$

where Q_i is the posterior distribution of the z^{(i)}'s given the x^{(i)}'s.

Theoretically, the Expected Maximization algorithm is an alternative to the K-Means algorithm in which the membership of a point in a given cluster is partial rather than integral.

V. COMPARISON OF VARIOUS APPROACHES

Table 1. Comparison of Various Approaches

| Author | Work done on | Concept used | Data set | Result |
|---|---|---|---|---|
| Satadal Saha, Subhadip Basu, Mita Nasipuri and Dipak Kr. Basu | Business Card Reader (BCR) system | Hough Transform | 45 documents containing 812 lines comprising 7300+ words | 85.70% |
| Syed Saqib Bukhari, Mayce Ibrahim Ali Al Azawi, Faisal Shafait, and Thomas M. Breuel | Subset of UW-III, ICDAR 2009 page segmentation competition test images and circuit diagrams | Machine Learning on Connected Components | 95 documents and 18 images | 97.25 |
| Suman Tatiraju, Avi Mehta | Gray-scaled images | K-means, Expected Maximization | - | - |

VI. CONCLUSION

The above survey shows that remarkable work has been done on image segmentation, but there is still scope for improvement. One key area for improvement is the segmentation of cursive handwriting in images. We hope that this discussion clarifies the approaches and methodologies involved and will aid future researchers.

ACKNOWLEDGEMENT

Special thanks to Mrs. Khushali Deulkar for guiding us through the implementation of the algorithm and enriching the quality of the research.

REFERENCES

[1] Richard G. Casey and Eric Lecolinet, "A Survey of Methods and Strategies to Image Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 7, 1999, p. 690-706.
[2] Archana A. Shinde, D. G. Chougule, "Text Pre-processing and Text Segmentation for OCR," IJCSET, Volume 2, Issue 1, 2012, p. 810-812.
[3] Satadal Saha, et al., "A Hough Transform based Technique for Text Segmentation," Journal of Computing, Volume 2, Issue 2, February 2010, ISSN 2151-9617, p. 134-141.
[4] Nobuyuki Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Sys., Man., Cyber. 9 (1), 1979, p. 62-66.
[5] Syed Saqib Bukhari, et al., "Document Image Segmentation using Discriminative Learning over Connected Components," DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, p. 183-190.
[6] Suman Tatiraju, Avi Mehta, "Image Segmentation using k-means clustering, EM and Normalized Cuts," University of California Irvine.

Harshil Shah, B.E. in Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India.
Rahul Shah, B.E. in Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India.