Chapter Final Chapter
Chapter 9
By: Kinfe W.
Chapter outlines
• Introduction
Digital Image classification
• Supervised classification
• popular classifier
• Minimum distance
• Maximum likelihood
• Parallelepiped
• Unsupervised classification
• Sequential clustering
• Isodata clustering
Introduction
• Uses of image classification
• Direct uses
• Produce a map of land-use/land-cover
• Indirect use
• Classification is an intermediate step, and may form only one of several data
layers in a GIS
• Water map vs water quality GIS
Why classify?
Make sense of a landscape
Place landscape into categories (classes)
Forest, Agriculture, Water, etc
Classification scheme = structure of classes
Depends on interest of users
• Can be few or many categories, depending on the purpose of the map and available resources
Image Classification
Classification
II. To obtain awareness in the data with respect to ground cover and surface characteristics.
black = water
yellow = open/field
dark green = dense forest
light green = sparse forest
bronze = mixed urban
red = dense urban
Image classification
• Image classification is the process by which pixels that have similar spectral characteristics, and are consequently assumed to belong to the same class, are identified and assigned a unique colour.
• Classification is the process of sorting pixels into a finite number of individual classes, or categories of data, based on their data file values.
• It is used to smooth out small, insignificant variations and simplify an image into a thematic map of land cover.
• For the first part of the classification process, the computer system must be trained to recognize
patterns in the data.
• Training is the process of defining the criteria by which these patterns are recognized.
Multispectral classification
• Assigning each pixel in a remotely sensed image a label describing a real-world object.
• That is, automatically categorizing all pixels in an image into land cover classes or themes.
• The output is a classified map, a form of digital thematic map.
• Grouping of similar pixels
• Separation of dissimilar ones
• Assigning a class label to each pixel
• Resulting in a manageable number of classes
Unsupervised Classification
Unsupervised classification does not use training data as the basis for classification. Instead, it examines unknown pixels in an image and aggregates them into a number of clusters based on natural groupings present in the image values.
The classification assumes that DN values within a given cover type should be close together in spectral space, while data in different classes should be comparatively well separated.
The categories that unsupervised classification identifies are not land cover or land use classes, but spectral classes or clusters. Some cover types may encompass multiple clusters. For example, agricultural land may contain sugar cane, rice, wheat, etc., which look different spectrally.
The analyst must assign labels to each of the clusters after unsupervised classification, using other sources of data.
Cont.…
A. Unsupervised classification: it is a technique that groups the pixels into clusters based upon the
distribution of the digital numbers in the image.
Threshold value
• The identities of land cover types to be specified as classes within a scene are generally not known a priori
because ground reference information is lacking or surface features within the scene are not well defined.
• The computer is required to group pixels with similar spectral characteristics into unique clusters according to some statistically determined criteria.
• The analyst then combines and relabels the spectral clusters into information classes.
Unsupervised classification
Cont…
• Advantages
• Requires no prior knowledge of the region
• Human error is minimized
• Unique classes are recognized as distinct units
• Disadvantages
• Classes do not necessarily match informational categories of interest
• Limited control of classes and identities
• Spectral properties of classes can change with time
Isodata clustering
• The iterative self-organizing data analysis technique (ISODATA)
• ISODATA is iterative because it makes a large number of passes through the remote sensing dataset, until specified results are obtained, instead of just two passes.
• It does not allocate its initial mean vectors based on an analysis of the pixels. Instead, an initial arbitrary assignment of all Cmax clusters takes place along an n-dimensional vector that runs between very specific points in feature space.
• The analyst specifies the following parameters:
• Maximum number of clusters to be identified (Cmax)
• Maximum percentage of pixels whose class values are allowed to be unchanged between iterations (T)
• Maximum number of times ISODATA is to classify pixels and recalculate cluster mean vectors (M)
• Minimum number of members in a cluster
• Maximum standard deviation for a cluster
• Minimum distance between cluster means
Cont`d
• Phase 1: ISODATA cluster building using many passes through the dataset
A B
A. ISODATA initial distribution of five hypothetical mean vectors, using ± standard deviations in both bands as beginning and ending points.
B. In the first iteration, each candidate pixel is compared to each cluster mean and assigned to the cluster whose mean is closest in Euclidean distance.
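The two panels above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `initial_means` and `assign_to_nearest` and the two-band pixel values are hypothetical, and the initial means are spaced evenly between (mean − 1 standard deviation) and (mean + 1 standard deviation) in every band.

```python
import numpy as np

def initial_means(pixels, n_clusters):
    """Place n_clusters means evenly on the line from (mean - std) to (mean + std)."""
    mu = pixels.mean(axis=0)
    sigma = pixels.std(axis=0)
    # t runs from 0 to 1; each row of the result is one initial mean vector.
    t = np.linspace(0.0, 1.0, n_clusters)[:, None]
    return (mu - sigma) + t * (2.0 * sigma)

def assign_to_nearest(pixels, means):
    """Label each pixel with the index of the closest mean (Euclidean distance)."""
    # distances has shape (n_pixels, n_clusters)
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return distances.argmin(axis=1)

# Hypothetical two-band pixel values (rows = pixels, columns = bands).
pixels = np.array([[10.0, 12.0], [11.0, 13.0], [50.0, 55.0],
                   [52.0, 53.0], [90.0, 95.0], [91.0, 94.0]])
means = initial_means(pixels, n_clusters=5)   # panel A: five arbitrary means
labels = assign_to_nearest(pixels, means)     # panel B: first-pass assignment
```

Because the initial means depend only on the band means and standard deviations, no analysis of individual pixels is needed before the first pass.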
Cont`d
• During the second iteration, a new mean is calculated for each cluster based on the actual spectral locations of the pixels assigned to it, instead of the initial arbitrary assignment.
• After the new cluster mean vectors are selected, every pixel in the scene is assigned to one of the new clusters.
• This split-merge-assign process continues until there is little change in class assignment between iterations (the T threshold is reached) or the maximum number of iterations (M) is reached.
Example: k-means
[Figure: three scatter plots of Band 1 (x-axis) against Band 2 (y-axis)]
1. First iteration. The cluster centers are set at random. Pixels will be assigned to the nearest center.
2. Second iteration. The centers move to the mean-center of all pixels in this cluster.
3. N-th iteration. The centers have stabilized.
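The three steps in the figure can be written as a short loop. The sketch below is a plain k-means, not the full ISODATA split-and-merge procedure. The function name `kmeans`, the parameters `max_iter` and `change_threshold`, and the pixel values are all assumptions made for illustration.

```python
import numpy as np

def kmeans(pixels, n_clusters, max_iter=20, change_threshold=0.02, seed=0):
    """Cluster pixels (rows = pixels, columns = bands) with basic k-means."""
    rng = np.random.default_rng(seed)
    # Step 1: cluster centers are set at random (here: randomly chosen pixels).
    start = rng.choice(len(pixels), size=n_clusters, replace=False)
    centers = pixels[start].astype(float)
    labels = np.full(len(pixels), -1)
    for _ in range(max_iter):
        # Assign every pixel to the nearest center (Euclidean distance).
        distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        changed = np.mean(new_labels != labels)
        labels = new_labels
        # Step 3: stop when (almost) no pixel changes cluster.
        if changed <= change_threshold:
            break
        # Step 2: move each center to the mean of the pixels assigned to it.
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return labels, centers

# Hypothetical two-band data with two obvious spectral groups.
pixels = np.array([[10.0, 12.0], [11.0, 13.0], [12.0, 11.0],
                   [90.0, 95.0], [91.0, 94.0], [92.0, 96.0]])
labels, centers = kmeans(pixels, n_clusters=2)
```

The cluster numbers returned are arbitrary spectral classes; the analyst still has to attach a land cover label to each one.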
Sequential Clustering
[Figure: scatter plot of 15 numbered points in X-Y feature space]
B. Supervised classification: initially, the operator outlines sample or training areas for each surface class (from ancillary data or ground truth).
• The computer then generates statistical parameters from the training areas and compares the digital numbers of every pixel in the image with these statistical parameters.
• Every pixel is evaluated and assigned to the class it most closely resembles statistically.
“Training”
Classified Image
Landsat ETM+
Digital color infrared
Acquired: April 21, 2003
Spatial resolution: 30 meters
Landsat TM
Digital color infrared
Acquired: February 17, 1989
Spatial resolution: 30 meters
Landsat MSS
Digital color infrared
Acquired: March 14, 1975
Spatial resolution: 57 meters
Corona
Panchromatic (b/w) film
Acquired: March 2, 1969
Spatial Resolution: 3 meters
Supervised classification
Cont`d
• Many of the classification tools can also be accessed through the Signature Editor.
• Signature editor: allows you to create, manage, evaluate, edit, and classify signatures.
• Threshold dialog: allows you to evaluate the accuracy of the classification process.
• Accuracy assessment dialog: allows you to evaluate the accuracy of the classification process.
• Classification Steps
5. Accuracy Assessment
• Supervised Classification Algorithms
i. Minimum Distance to Mean Classifier: every pixel is assigned a class based on its
distance from the mean of each class.
ii. Parallelepiped (box) Classifier: the data file values of the candidate pixel are
compared to upper and lower limits. These limits can be either:
the minimum and maximum data file values of each band in the signature,
the mean of each band, plus and minus a number of standard deviations, or
any limits that you specify, based on your knowledge of the data and signatures.
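As a rough illustration of the minimum distance to mean rule listed above, the following sketch computes one mean vector per class from training pixels and assigns each unknown pixel to the class with the nearest mean. The function names, class names, and pixel values are hypothetical.

```python
import numpy as np

def class_means(training_pixels, training_labels):
    """Return the class names and one mean vector per class."""
    classes = np.unique(training_labels)
    means = np.array([training_pixels[training_labels == c].mean(axis=0)
                      for c in classes])
    return classes, means

def minimum_distance_classify(pixels, classes, means):
    """Assign each pixel to the class whose mean is closest (Euclidean distance)."""
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]

# Hypothetical two-band training data for two classes.
training_pixels = np.array([[20.0, 10.0], [22.0, 12.0],
                            [60.0, 80.0], [62.0, 78.0]])
training_labels = np.array(["water", "water", "forest", "forest"])

classes, means = class_means(training_pixels, training_labels)
unknown = np.array([[25.0, 15.0], [58.0, 75.0]])
predicted = minimum_distance_classify(unknown, classes, means)
```

This rule always assigns a class, even to pixels far from every mean, unless a distance threshold is added.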
Parallelepiped (supervised)
• For each training region determine the range of values observed in each band.
• These ranges form a spectral box (or parallelepiped) which is used to classify this class type.
• Assign each new image pixel to the parallelepiped it fits into best.
• Pixels outside all boxes can be left unclassified or assigned to the closest box.
• Problems arise with classes that exhibit high correlation between bands. This creates long 'diagonal' data sets that do not fit well into a box.
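The box rule described above can be sketched as follows, using the minimum and maximum training values in each band as the limits. The function names and data are hypothetical. When boxes overlap, this sketch simply returns the first matching box, which is one possible choice among several.

```python
import numpy as np

def fit_boxes(training_pixels, training_labels):
    """Return {class name: (lower limits, upper limits)} from per-band min and max."""
    boxes = {}
    for c in np.unique(training_labels):
        samples = training_pixels[training_labels == c]
        boxes[str(c)] = (samples.min(axis=0), samples.max(axis=0))
    return boxes

def parallelepiped_classify(pixel, boxes, unclassified="unclassified"):
    """Return the first class whose box contains the pixel in every band."""
    for name, (lower, upper) in boxes.items():
        if np.all(pixel >= lower) and np.all(pixel <= upper):
            return name
    # The pixel falls outside all boxes.
    return unclassified

# Hypothetical two-band training data for two classes.
training_pixels = np.array([[20.0, 10.0], [22.0, 12.0],
                            [60.0, 80.0], [62.0, 78.0]])
training_labels = np.array(["water", "water", "forest", "forest"])
boxes = fit_boxes(training_pixels, training_labels)

inside_water = parallelepiped_classify(np.array([21.0, 11.0]), boxes)
inside_forest = parallelepiped_classify(np.array([61.0, 79.0]), boxes)
outside_all = parallelepiped_classify(np.array([40.0, 40.0]), boxes)
```

Replacing the min and max limits with the band mean plus and minus a number of standard deviations only changes `fit_boxes`; the decision rule stays the same.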
Parallelepiped example
• Probability contours are created around each training area and a pixel assigned to a class
depending upon the value of the probability contours that encompass it.
• The maximum likelihood classifier is generally considered to be the most powerful but is
also considered the most computer intensive.
• Using this algorithm, pixel 1 belongs to class A, pixel 2 to class B and pixel 4 to class C.
Pixel 3 has a higher probability of belonging to class B than class C.
Training samples
• Training samples are sets of pixels that represent what is recognized as a potential class.
• Size: the general rule is that if training data are collected from n bands, then 10n pixels of training data should be collected for each class, but the total sample should be less than 100.
• Using a class from a thematic raster layer from an image file of the same area.
Select appropriate classification algorithm
• Various supervised classification algorithms may be used to assign an unknown pixel to one of the
classes
• The choice of a particular classifier depends on the nature of the input data and the output required.
Parametric
Parametric classification algorithms assume that the observed measurement vectors Xc obtained for each class in each spectral band during the training phase are Gaussian (normally distributed).
Maximum likelihood classifier
Non parametric
Parallelepiped classifier
[Figure: equiprobability contours]
Maximum likelihood classifier
• This rule is based on the probability that a pixel belongs to a particular class.
• The basic rule assumes that these probabilities are equal for all classes and that the input bands have normal distributions.
• If you have a priori knowledge that the probabilities are not equal for all classes, you can specify weight factors for particular classes.
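One common way to write this rule, assuming Gaussian classes, is to score each class c with ln(p_c) − 0.5 ln|Σ_c| − 0.5 (x − μ_c)ᵀ Σ_c⁻¹ (x − μ_c) and pick the highest score, where μ_c and Σ_c are the training mean vector and covariance matrix and p_c is the optional weight factor. The sketch below follows that form. The function names, class names, and pixel values are hypothetical, and it is not presented as the implementation of any particular software package.

```python
import numpy as np

def train_signatures(training_pixels, training_labels):
    """Return {class name: (mean vector, covariance matrix)} from training pixels."""
    signatures = {}
    for c in np.unique(training_labels):
        samples = training_pixels[training_labels == c]
        signatures[str(c)] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return signatures

def max_likelihood_classify(pixel, signatures, priors=None):
    """Return the class with the highest Gaussian log-likelihood score."""
    best_class, best_score = None, -np.inf
    for name, (mean, cov) in signatures.items():
        # Equal weights for all classes unless priors are supplied.
        prior = 1.0 if priors is None else priors[name]
        diff = pixel - mean
        _, logdet = np.linalg.slogdet(cov)
        score = (np.log(prior) - 0.5 * logdet
                 - 0.5 * diff @ np.linalg.solve(cov, diff))
        if score > best_score:
            best_class, best_score = name, score
    return best_class

# Hypothetical two-band training data (four pixels per class).
training_pixels = np.array([[20.0, 10.0], [22.0, 12.0], [21.0, 14.0], [19.0, 11.0],
                            [60.0, 80.0], [62.0, 78.0], [61.0, 83.0], [58.0, 79.0]])
training_labels = np.array(["water"] * 4 + ["forest"] * 4)
signatures = train_signatures(training_pixels, training_labels)

first = max_likelihood_classify(np.array([21.0, 12.0]), signatures)
second = max_likelihood_classify(np.array([60.0, 80.0]), signatures,
                                 priors={"water": 0.3, "forest": 0.7})
```

Each class needs enough training pixels for its covariance matrix to be invertible, which is one reason training sample size matters for this classifier.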
Cont`d
• Advantage
• Disadvantage
Tends to overclassify signatures with relatively large values in the covariance matrix
Basic Steps in supervised classification
Post classification
• Can check non-training regions with more ground truth if available.
Accuracy Assessment
• Chapter Outline
• Overall accuracy:
• KAPPA(K^)
• Error matrix
• Commission errors:
• Omission errors
Accuracy assessment
• Accuracy assessment is a general term for comparing the classification to
geographical data that are assumed to be true.
• Three types of random pixel selection can be distinguished:
Random: no rules are used.
Stratified random: the number of points is stratified according to the distribution of thematic layer classes.
Equalized random: each class has an equal number of random points.
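The difference between the three schemes is easiest to see in how many points each class receives. The sketch below is an illustration only: the function name `sample_points`, its parameters, and the 10 × 10 class map are hypothetical, and rounding in the stratified case may make the total differ slightly from the requested number.

```python
import numpy as np

def sample_points(class_map, n_points, scheme="random", seed=0):
    """Return flat pixel indices chosen by one of the three sampling schemes."""
    rng = np.random.default_rng(seed)
    flat = class_map.ravel()
    if scheme == "random":
        # No rules: every pixel is equally likely to be chosen.
        return rng.choice(flat.size, size=n_points, replace=False)
    classes, counts = np.unique(flat, return_counts=True)
    if scheme == "stratified":
        # Points per class are proportional to the class area.
        quota = np.round(n_points * counts / flat.size).astype(int)
    elif scheme == "equalized":
        # Every class gets the same number of points.
        quota = np.full(len(classes), n_points // len(classes))
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    chosen = []
    for c, q in zip(classes, quota):
        candidates = np.flatnonzero(flat == c)
        chosen.append(rng.choice(candidates, size=min(q, candidates.size),
                                 replace=False))
    return np.concatenate(chosen)

# Hypothetical thematic layer: 70 pixels of class 1, 20 of class 2, 10 of class 3.
class_map = np.repeat([1, 2, 3], [70, 20, 10]).reshape(10, 10)
stratified = sample_points(class_map, 20, scheme="stratified")
equalized = sample_points(class_map, 18, scheme="equalized")
simple = sample_points(class_map, 20, scheme="random")
```

With this map, stratified sampling of 20 points gives 14, 4, and 2 points to the three classes, while equalized sampling of 18 points gives 6 to each.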
• From the accuracy assessment cell array, two kinds of reports can be
derived:
The error matrix simply compares the reference points to the classified points in a c × c matrix, where c is the number of classes.
The accuracy report calculates statistics of the percentages of accuracy, based upon the
results of the error matrix.
Confusion matrix
• A confusion matrix shows the number of correct and incorrect predictions made by the classification model compared to the actual outcomes (target values) in the data.
Confusion matrix    Target Class 1    Target Class 2
Model Class 1       a                 b                 Class 1 predictive value = a/(a+b)
Model Class 2       c                 d                 Class 2 predictive value = d/(c+d)
Accuracy = (a+d)/(a+b+c+d)
Error matrix
Omission error
Error of exclusion
Pixels are not assigned to their appropriate class; that is, pixels are omitted from the actual class they belong to.
Commission error
Error of inclusion
7 out of 32 were omitted from the grass category.
The grass land category is the one most confused with other land cover.
Cont`d
• Representation of map accuracy:
Overall accuracy (of the classification or map): the number of correctly classified pixels (sum of the major diagonal) divided by the number of reference pixels.
                     Grass land   Agricultural land   Forest land   Bush land   Total
Grass land           25           5                   10            3           43
Agricultural land    2            50                  6             5           63
Forest land          3            4                   60            5           72
Bush land            2            2                   2             100         106
Total                32           61                  78            113         284
• User's accuracy (1 − commission error): divide the number of correctly classified pixels in each category by the number of pixels that were classified in that category (row total).
• UA = 25/43 × 100 = 58.1%
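The accuracy measures can be checked directly from the error matrix above. The sketch below assumes that rows are the classified categories and columns are the reference categories, which matches the row total of 43 used for user's accuracy and the 7 out of 32 omitted grass pixels. Producer's accuracy (1 − omission error) and the kappa formula are standard definitions rather than values quoted in the slides, and the variable names are illustrative.

```python
import numpy as np

labels = ["Grass land", "Agricultural land", "Forest land", "Bush land"]
# Rows = classified category, columns = reference category (assumed orientation).
matrix = np.array([[25,  5, 10,   3],
                   [ 2, 50,  6,   5],
                   [ 3,  4, 60,   5],
                   [ 2,  2,  2, 100]])

total = matrix.sum()                   # 284 reference pixels
correct = np.diag(matrix)              # correctly classified pixels per class
row_totals = matrix.sum(axis=1)        # pixels classified in each category
col_totals = matrix.sum(axis=0)        # reference pixels in each category

overall_accuracy = correct.sum() / total        # 235 / 284
users_accuracy = correct / row_totals           # 1 - commission error
producers_accuracy = correct / col_totals       # 1 - omission error

# Kappa compares observed agreement with the agreement expected by chance.
expected = (row_totals * col_totals).sum() / total**2
kappa = (overall_accuracy - expected) / (1 - expected)

print(f"Overall accuracy: {overall_accuracy:.1%}")            # 82.7%
print(f"Grass user's accuracy: {users_accuracy[0]:.1%}")      # 58.1%
print(f"Grass producer's accuracy: {producers_accuracy[0]:.1%}")  # 78.1%
print(f"Kappa: {kappa:.3f}")                                  # 0.759
```

The grass land class shows why both measures are reported: 78.1% of the reference grass pixels were found, but only 58.1% of the pixels mapped as grass are actually grass.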
Applications of remote sensing
Forestry
1) reconnaissance mapping:
• Forest cover updating, depletion monitoring, and measuring biophysical properties
of forest stands.
• Forest cover type discrimination
• Agroforestry mapping
2) Commercial forestry
• Clear cut mapping / regeneration assessment
• Burn delineation
• Infrastructure mapping / operations support
• Forest inventory
• Biomass estimation
• Species inventory
Cont`d
3) Environmental monitoring
• Species inventory
• Watershed protection
• Coastal protection
• Forest health
Land Use/Cover
Land cover refers to the surface cover on the ground, whether vegetation, urban infrastructure,
water, bare soil or other.
• Identifying, delineating and mapping land cover is important for global monitoring studies,
resource management, and planning activities.
Land use refers to the purpose the land serves, for example, recreation, wildlife habitat, or agriculture.
Applications of remote sensing in land use
• Routing and logistics planning for seismic / exploration / resource extraction activities
• Target detection - identification of landing strips, roads, clearings, bridges, land/water interface
Agriculture
• Maps have been the basic ingredient of all military planning throughout history.
• Military strategists use maps to locate opposing forces, plan operations, and coordinate logistics.
• Since topography is vital for operations, defense forces prefer topographic maps over other types of maps.
Questions and comments are welcome.