L9 Spatial Interpolation
Figure 1: Global (L) and Local (R) interpolation. The circle shows the spatial divisions of calculation.
Spatial interpolation methods can also be classified by how exactly they fit the sample
points: exact and approximate.
Exact interpolation generates a surface that passes through the control (sample) points.
Approximate interpolation predicts values at the sample locations that may differ from their
known values.
Figure 2: Exact (L) and approximate (R) interpolation. Yellow dots are sample points
Exact methods are most appropriate when a high degree of certainty is attached to the mea-
surements made at the observed data points. Approximate methods are more appropriate where
there is a degree of uncertainty surrounding the measurements made at the sample points. The
approximate method provides a smoother interpolated surface compared to the exact method.
Another, broader classification distinguishes deterministic interpolation methods from
stochastic interpolation methods.
Deterministic interpolation methods can be used when there is sufficient knowledge about
the geographical surface being modelled. This knowledge enables us to choose a simple
mathematical function to do the interpolation. Because such methods are simple, they cannot
handle the randomness of the data, nor can they provide any assessment of errors in the
predicted values.
Stochastic methods incorporate the concept of randomness and offer an assessment of pre-
diction errors with estimated variances. Kriging is one of the best examples of the stochastic
interpolation method.
Figure 3: Classification of spatial interpolation methods: Among these methods, the highlighted ones
are only discussed here.
3 Trend Surface
Trend Surface Analysis is a global polynomial interpolation method that fits sample points with
a smooth surface defined by a mathematical function. It assumes that the variable being studied
can be approximated by a polynomial equation. Following are the linear, quadratic and cubic
equations for the trend surface:
Linear:
$$z(x_0, y_0) = b_0 + b_1 x + b_2 y \tag{1}$$
Quadratic:
$$z(x_0, y_0) = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2 \tag{2}$$
Cubic:
$$z(x_0, y_0) = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2 + b_6 x^3 + b_7 x^2 y + b_8 x y^2 + b_9 y^3 \tag{3}$$
where $z(x_0, y_0)$ is the value at the unknown point; $x, y$ represent the known sampling locations;
and the $b_i$ are the coefficients estimated from the known locations.
A sample problem is given in the Lecture 10 slides.
Problem 1: There are five known sampling locations and one unknown location, as shown in
the figure:
Figure 4: Black-coloured numbers are the ID numbers given to the samples. Red-coloured numbers are
the elevation values of the sample points. The point with zero ID is the unknown point.
The points’ IDs, their x,y coordinates and the corresponding elevation values (z) are shown
in the following figure:
Figure 5: Six points and their details are listed. The point with ID 0 is the unknown point.
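By rearranging and substituting the above values in Equation (7), we get the coefficients
of the linear polynomial function:
$$\begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} n & \sum x & \sum y \\ \sum x & \sum x^2 & \sum xy \\ \sum y & \sum xy & \sum y^2 \end{bmatrix}^{-1} \begin{bmatrix} \sum z \\ \sum zx \\ \sum zy \end{bmatrix} \tag{8}$$
$$= \begin{bmatrix} 5 & 17 & 17 \\ 17 & 81 & 54 \\ 17 & 54 & 79 \end{bmatrix}^{-1} \begin{bmatrix} 31 \\ 88 \\ 98 \end{bmatrix} \tag{9}$$
By calculating the inverse, the matrix form can be written as:
$$= \begin{bmatrix} 0.093 & -0.011 & -0.012 \\ -0.011 & 0.024 & -0.013 \\ -0.012 & -0.013 & 0.024 \end{bmatrix} \begin{bmatrix} 31 \\ 88 \\ 98 \end{bmatrix} \tag{10}$$
$$\begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} 0.739 \\ 0.497 \\ 0.836 \end{bmatrix} \tag{11}$$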
After obtaining the values of b0 , b1 and b2 , we can substitute them in the equation z(x0 , y0 ) =
b0 + b1 x + b2 y to get the interpolated value at the unknown location.
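The linear fit above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function name `fit_linear_trend` and the sample coordinates are hypothetical (they are not the values from Figure 5), chosen so that the points lie exactly on a known plane.

```python
import numpy as np

def fit_linear_trend(x, y, z):
    """Solve the normal equations for z = b0 + b1*x + b2*y."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    n = len(z)
    # Matrix of sums on the left-hand side, laid out as in Equation (8)
    A = np.array([
        [n,       x.sum(),       y.sum()],
        [x.sum(), (x * x).sum(), (x * y).sum()],
        [y.sum(), (x * y).sum(), (y * y).sum()],
    ])
    rhs = np.array([z.sum(), (z * x).sum(), (z * y).sum()])
    # Solving the system directly avoids forming the inverse explicitly
    return np.linalg.solve(A, rhs)

# Hypothetical samples that lie exactly on the plane z = 1 + 2x + 3y
xs = [0.0, 1.0, 0.0, 2.0, 3.0]
ys = [0.0, 0.0, 1.0, 2.0, 1.0]
zs = [1 + 2 * xi + 3 * yi for xi, yi in zip(xs, ys)]

b0, b1, b2 = fit_linear_trend(xs, ys, zs)
z0 = b0 + b1 * 1.5 + b2 * 0.5  # interpolated value at the point (1.5, 0.5)
print(b0, b1, b2, z0)
```

Because the hypothetical samples contain no noise, the recovered coefficients reproduce the plane; with real measurements the fitted surface would only approximate the sample values.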
Disadvantages:
• With higher-order polynomial functions, the calculation becomes more complex.
• Trend surface analysis can be sensitive to outliers in the data, which can lead to
overfitting.
Advantages:
• This method is useful when the surface varies gradually from region to region over the
area of interest (e.g. pollution in an area).
• One of the major advantages of this method is that it is simple and easy to compute.
4 Interpolation using Thiessen polygon
Thiessen polygons (otherwise known as Voronoi polygons or Voronoi diagrams) are used to
determine a sampling point's proximity and neighbourhood. Every spatially distributed point in
a 2D space influences the area around it. Finding the area of influence around every sampling
point divides the study area into discrete regions. Each region is assigned the value held by
the point representing it. The polygon around each sampling point is also known as a Voronoi
cell. Thiessen polygons assume that any location within a Voronoi cell is closer to that cell's
known point than to any other known point in the area.
The second step is to draw lines connecting the points to their nearest neighbours.
The third step is to find the perpendicular bisector of each line drawn in the previous step.
The last step is to connect the bisectors of the lines radiating from each sampling point to
each other, resulting in a polygon around each point.
A sample problem is given in the Lecture 10 slides. There are 5 known sampling locations
and one unknown location, as shown in the figure. The unknown point (0) is geographically
included in the Voronoi cell of point (1).
From the above figure, it is understood that the unknown point is within the Voronoi cell of
the sample point (1). Hence, it is assigned the value 10.
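Since being inside a Voronoi cell is equivalent to being nearest to that cell's sample point, the assignment can be sketched without constructing the polygons at all. The following is a minimal illustration, assuming NumPy; the function name `thiessen_value`, the coordinates, and all values except the 10 at sample point 1 are hypothetical.

```python
import numpy as np

def thiessen_value(known_xy, known_z, query_xy):
    """Return (index, value) of the known point nearest to query_xy."""
    known_xy = np.asarray(known_xy, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    # Euclidean distance from the query point to every known point
    d = np.linalg.norm(known_xy - query_xy, axis=1)
    nearest = int(np.argmin(d))
    return nearest, known_z[nearest]

# Hypothetical layout; index 0 plays the role of sample point (1) with value 10
known_xy = [(2, 2), (6, 1), (7, 5), (3, 6), (5, 8)]
known_z = [10, 14, 12, 16, 11]

idx, value = thiessen_value(known_xy, known_z, (3, 3))
print(idx, value)
```

The result changes abruptly when the query point crosses a cell boundary, which is the stepwise behaviour of this method.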
4.2 Strength and limitations of Thiessen polygon
It is found that the sample points in sparsely sampled regions have a larger area of influence
than those in the more densely sampled areas. This method will give more reliable results when
the sampling density is higher.
Results are poor when areas of interest are around the edges of the Voronoi cell because the
method does not consider the behaviour of the data; it only takes the boundary of the Voronoi
cell. This boundary can change when the sampling density is higher.
This method is unsuitable for continuous data like pollution, temperature, etc. The calcula-
tion of the Thiessen polygon method is simple and fast. The typical application of this method
is to map the rainfall data of a region by assuming the rain gauge stations as the sampling
points.
In inverse distance weighting (IDW), the values at the unknown points are calculated as a
weighted average of the values available at the known points. The inverse of the distance to
each known point, raised to a power, is taken as the weight in the calculation. The value at an
unknown point $(x_0, y_0)$ is calculated by:
$$z(x_0, y_0) = \frac{\sum_{i=1}^{n} z_i \frac{1}{d_i^k}}{\sum_{i=1}^{n} \frac{1}{d_i^k}} \tag{12}$$
where $z_i$ is a known value; $d_i$ is the distance between point $i$ and the point $(x_0, y_0)$; $n$ is
the number of known samples; $k$ is the power.
The power $k$ always takes a value ≥ 1. With $k = 1$, the interpolated surface is smoothed out.
With a power higher than 1, the nearby data have the most influence, and the surface will
have more detail (be less smooth). An optimal value for the power is the one at which the
mean absolute error is at its minimum. In general, the value of $k$ is taken as 2.
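Equation (12) translates almost directly into code. The sketch below assumes NumPy; the function name `idw` and the two sample points are hypothetical, and the special case for a zero distance is an added safeguard rather than part of the formula.

```python
import numpy as np

def idw(known_xy, known_z, query_xy, k=2):
    """Inverse distance weighted estimate at query_xy with power k."""
    known_xy = np.asarray(known_xy, dtype=float)
    known_z = np.asarray(known_z, dtype=float)
    d = np.linalg.norm(known_xy - np.asarray(query_xy, dtype=float), axis=1)
    # Assumed safeguard: a query on top of a sample returns that sample's value
    if np.any(d == 0):
        return float(known_z[np.argmin(d)])
    w = 1.0 / d ** k  # weights 1 / d_i^k
    return float(np.sum(w * known_z) / np.sum(w))

# Hypothetical samples at distances 1 and 2 from the query point (0, 0)
samples_xy = [(1, 0), (0, 2)]
samples_z = [10.0, 20.0]

est_k2 = idw(samples_xy, samples_z, (0, 0), k=2)
est_k1 = idw(samples_xy, samples_z, (0, 0), k=1)
print(est_k2, est_k1)
```

With k = 2 the nearer sample dominates more strongly than with k = 1, so the estimate moves closer to the nearer sample's value as the power increases.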
A sample problem is given in the Lecture 10 slides. There are 5 known sampling locations
and one unknown location, as shown in the figure.
Figure 13: Sample points are given in red and their distances from the unknown point are
shown in blue.
6 Kriging
Kriging is a geostatistical interpolation technique. Geostatistics is often treated as synonymous
with Kriging, which is a statistical version of interpolation. The prediction is made using
spatial autocorrelation, which can be defined as the degree to which values of a variable at one
location in a spatial dataset are correlated with values at nearby locations. Kriging differs
from other local interpolation methods because it can assess prediction quality with estimated
prediction errors. Interpolation by Kriging considers the spatial relation or dependencies
between the unknown points and the sample points. In addition, it considers the spatial
relationship among the sample points themselves.
The general formula for kriging, a geostatistical interpolation method used to estimate
values at unobserved locations based on spatial autocorrelation and a set of observed data
points, can be expressed as follows:
$$z(x_0, y_0) = \sum_{i=1}^{N} W_i \times z_i \tag{13}$$
where $z(x_0, y_0)$ is the value $z$ at an unobserved location; $z_i$ is the measured value at the $i$-th
location; $W_i$ is an unknown weight for the measured value at the $i$-th location; $N$ is the
number of measured values.
The kriging weights Wi are determined through a variogram model, which characterizes the
spatial autocorrelation of the data. Here, Wi is calculated based on the distance between the
measured points and the prediction location, considering their overall spatial dependence.
The spatial dependence is measured in Kriging by plotting a semi-variogram. The semi-variogram
is a graph with the distance between the sampling points on the x-axis and the semi-variance
between the points on the y-axis.
For example, for sampling locations $i$ and $j$ separated by a distance $h$, the semi-variance
$\gamma(h)$ is calculated as:
$$\gamma(h) = \frac{1}{2N} \sum (\text{value}_i - \text{value}_j)^2 \tag{14}$$
where the sum runs over the $N$ pairs $(i, j)$ separated by the distance $h$. $N$ is taken as 1 when
only one pair of samples is considered. If there are more pairs of samples with distance $h$
between them, then $N$ equals the number of pairs used in the semi-variance calculation.
Dividing by $N$ averages the calculated values.
There is a lag distance for each pair of data points, which represents how far apart those two
points are in the spatial domain. For example, considering 11 sample points as in Figure 14,
you can calculate the semivariance for each pair of points at various lag distances. The number
of distinct pairs, and hence of lag distances, is given by the following formula:
$$N_{lag} = \frac{N(N-1)}{2} \tag{15}$$
where $N$ is the number of sample points (in this case, 11). For 11 sample points, you will
have $\frac{11(11-1)}{2} = 55$ distinct pairs of points. That means you can calculate 55
semivariances for different lag distances, resulting in 55 points in the variogram cloud if you
plot semivariance against lag distance.
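The variogram cloud can be sketched with the standard library alone. In this illustration the function name `variogram_cloud` and the 11 randomly generated sample points are hypothetical; each pair contributes one (lag distance, semi-variance) point, using Equation (14) with N = 1.

```python
from itertools import combinations
import math
import random

def variogram_cloud(points):
    """points: list of (x, y, z). Returns one (lag, semivariance) tuple per pair."""
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        h = math.hypot(x2 - x1, y2 - y1)  # lag distance of this pair
        gamma = 0.5 * (z1 - z2) ** 2      # Equation (14) with N = 1
        cloud.append((h, gamma))
    return cloud

# Hypothetical data: 11 random sample points with random attribute values
random.seed(0)
points = [
    (random.uniform(0, 100), random.uniform(0, 100), random.uniform(0, 50))
    for _ in range(11)
]

cloud = variogram_cloud(points)
print(len(cloud))  # number of pairs, N(N - 1) / 2
```

Plotting the first element of each tuple against the second would give the semivariogram cloud described in the text.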
The principle of spatial autocorrelation (variogram) is rooted in the observation that
geographic features or phenomena are often influenced by their surroundings. Things that are
physically closer to each other tend to share common characteristics or attributes. In the
semivariogram cloud, the pairs at the far left of the x-axis are closer together and have more
similar values (smaller semi-variance). Moving to the right along the x-axis, the points in the
semivariogram cloud show more spread. This pattern, a tight cloud of points with small
semi-variance on the left and a more scattered cloud with larger semi-variance on the right,
shows that spatial dependence exists.
When the number of sampling points is large, a semivariogram cloud is difficult to manage. In
such cases, a data preprocessing step called 'binning' or spatial aggregation is used. The first
part of the binning process is to group pairs of sample points into lag classes. For example, if
the lag distance (i.e., the distance between points) is 100 meters, then all the pairs of points
separated by less than 100 meters are grouped into the lag distance class of 0–100 m. Similarly,
those pairs of points separated by between 100 and 200 meters are grouped into the lag class of
100–200 m, and so on. The binning process will result in numerous bins that contain pairs of
sample points with similar distances between them. The next step is to compute the average
semivariance by:
$$\gamma(h) = \frac{1}{2N} \sum_{i=1}^{N} \left[ z(x_i) - z(x_i + h) \right]^2$$
where $\gamma(h)$ is the average semi-variance between sample points separated by lag $h$; $N$ is
the number of pairs of sample points in the bin; and $z$ is the attribute value.
Figure 16: Binning: Each circle represents a specific distance from a sampling point
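The binning procedure can be sketched as follows, assuming NumPy. The function name `binned_semivariogram`, its parameters, and the three collinear sample points are hypothetical. A pair whose distance falls exactly on a class boundary is placed in the upper class here, which is one possible convention.

```python
import numpy as np

def binned_semivariogram(xy, z, lag_width, n_lags):
    """Average semi-variance per lag class of width lag_width."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)        # every pair of points once
    h = np.linalg.norm(xy[i] - xy[j], axis=1)  # distance of each pair
    sq = (z[i] - z[j]) ** 2                    # squared value differences
    bins = (h // lag_width).astype(int)        # lag class index of each pair
    lags, gammas, counts = [], [], []
    for b in range(n_lags):
        mask = bins == b
        n = int(mask.sum())
        if n == 0:
            continue                           # skip empty lag classes
        lags.append(h[mask].mean())            # mean distance within the bin
        gammas.append(sq[mask].sum() / (2 * n))
        counts.append(n)
    return np.array(lags), np.array(gammas), counts

# Hypothetical collinear samples: pair distances are 50, 100 and 150 m
xy = [(0, 0), (50, 0), (150, 0)]
z = [1.0, 3.0, 7.0]
lags, gammas, counts = binned_semivariogram(xy, z, lag_width=100, n_lags=3)
print(lags, gammas, counts)
```

Returning the pair count per bin is a design choice that makes it easy to spot lag classes supported by too few pairs to be reliable.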
• Circular
• Spherical
• Exponential
• Gaussian
• Linear
Figure 17: Models for fitting semivariograms
A spherical model shows a progressive decrease of spatial dependence until some distance,
beyond which spatial dependence disappears. An exponential model exhibits a less gradual
pattern than a spherical model: spatial dependence decreases exponentially with increasing
distance and disappears completely at an infinite distance. In the Gaussian model, spatial
dependence gradually decreases as distance increases, following a bell-shaped curve. How-
ever, unlike the spherical model, it does not reach a complete plateau but continues to decline
smoothly. In the linear model, the semi-variance increases linearly as the distance increases. Similar to the
spherical model, the spatial dependence in a circular pattern decreases with increasing distance
from the centre. However, it doesn’t necessarily level off at a specific range as the spherical
model does. After fitting the model, we get a smooth curve according to the selected model.
This curve can be characterized by three elements: nugget, range and sill.
Nugget
The nugget is the value at which the semi-variogram model intercepts the y-axis.
Figure 18: a) Semi-variogram when the nugget is zero, with the range (a) and sill (C) marked. b) Semi-variogram
when the nugget or intercept (C0) is non-zero, with the range (a) and sill (C + C0) marked.
6.4 Variogram Models and their range, sill and nugget characteristics
Variogram models are used to fit the points on the variogram plot mathematically. They
quantify the relationship between the variance of the variable and the distance between data
points. The variogram can be mathematically represented using various models, such as
spherical, exponential, Gaussian, linear, etc. The choice of the variogram model and the
estimation of its parameters (range, sill, and nugget) are critical in geostatistical analysis
because they impact the quality of spatial predictions and interpolation. These different
variogram model types have different properties; some are better at modelling certain data
sets than others. In practice, it is good to try fitting different variogram models to the data
and determine which one fits best. Taking a as the range, C + C0 as the sill, h as the distance
between points and C0 as the intercept on the y-axis (the nugget), various models can be
characterized as follows:
Figure 19: Variogram model characterization
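As an illustration, four of these models can be written as short functions. The sketch assumes NumPy and uses commonly quoted textbook forms; the exact expressions in Figure 19 may use a different scaling of the range (some texts, for instance, put a factor of 3 in the exponent of the exponential and Gaussian models), so treat these as assumptions. The function names and the parameter values in the usage lines are hypothetical.

```python
import numpy as np

def linear_model(h, c0, c, a):
    """Linear rise from the nugget c0 to the sill c0 + c at the range a."""
    h = np.asarray(h, dtype=float)
    return np.where(h <= a, c0 + c * h / a, c0 + c)

def spherical_model(h, c0, c, a):
    """Spherical model: reaches the sill c0 + c exactly at the range a."""
    h = np.asarray(h, dtype=float)
    inside = c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h <= a, inside, c0 + c)

def exponential_model(h, c0, c, a):
    """Exponential model: approaches the sill c0 + c asymptotically."""
    h = np.asarray(h, dtype=float)
    return c0 + c * (1.0 - np.exp(-h / a))

def gaussian_model(h, c0, c, a):
    """Gaussian model: flat near the origin, approaches the sill asymptotically."""
    h = np.asarray(h, dtype=float)
    return c0 + c * (1.0 - np.exp(-((h / a) ** 2)))

# Hypothetical parameters: nugget 1, partial sill 4, range 10
h = np.array([0.0, 5.0, 10.0, 20.0])
print(linear_model(h, 1.0, 4.0, 10.0))
print(spherical_model(h, 1.0, 4.0, 10.0))
```

Note that these functions return the nugget at h = 0, whereas the semi-variance of a point with itself is zero by definition; a kriging routine would normally set the diagonal terms to zero separately.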
6.5 Interpolation
Once you fit a variogram to a suitable model, the semi-variance $\gamma(h)$ between any two points
can be calculated by the equation given along with Figure (19). Values of $a$, $c$ and $c_0$
for the equation can be derived from the fitted variogram, and the equation can then be used to
find $\gamma(h)$ between the unknown point and the known points (i.e., $\gamma(h_{01}), \gamma(h_{02}), \ldots, \gamma(h_{0n})$)
as well. After modelling the spatial dependence or autocorrelation with a variogram, we can
interpolate using the fitted model. The fundamental equation for interpolation is
$z(x_0, y_0) = \sum_{i=1}^{N} W_i \times z_i$. The weights in this equation can be derived by
solving a set of simultaneous equations.
For example, if we assume that there are three known points and we need to interpolate for
an unknown point $z(x_0, y_0)$, then the following four simultaneous equations are to be solved
for $W_1$, $W_2$, $W_3$ and $\lambda$:
$$\begin{aligned}
W_1 \gamma(h_{11}) + W_2 \gamma(h_{12}) + W_3 \gamma(h_{13}) + \lambda &= \gamma(h_{10}) \\
W_1 \gamma(h_{21}) + W_2 \gamma(h_{22}) + W_3 \gamma(h_{23}) + \lambda &= \gamma(h_{20}) \\
W_1 \gamma(h_{31}) + W_2 \gamma(h_{32}) + W_3 \gamma(h_{33}) + \lambda &= \gamma(h_{30}) \\
W_1 + W_2 + W_3 + 0 &= 1
\end{aligned}$$
where $\gamma(h_{ij})$ is the semi-variance between known points $i$ and $j$; $\gamma(h_{i0})$ is the semi-variance
between known point $i$ and the point to be estimated; $\lambda$ is a Lagrange multiplier, which is
introduced to enforce the constraint that the weights sum to one. Once the weights are obtained
by solving the simultaneous equations, $z(x_0, y_0)$ can be estimated by:
$$z(x_0, y_0) = z_1 W_1 + z_2 W_2 + z_3 W_3 \tag{17}$$
A sample problem is given in the Lecture 10 slides. There are five known sampling locations
and one unknown location, as shown in the figure:
The figure shows the layout of the sample points and the table shows the distance between
the points. It is also given in the question that the semi-variances between the sample points
are calculated and plotted against the distance between the points. The semi-variance-distance
plot has been fitted with a linear variogram model:
$$\gamma(h) = c_0 + c\,\frac{h}{a}, \quad 0 < h \leq a \tag{18}$$
The plot between the semivariance and the distance is given below:
Solution: The semi-variance is calculated and presented along with the question for all
sample points with known values. It is plotted in Figure (21). The calculation of semi-variance
needs two known values when using the formula $\gamma(h) = 0.5 \times \text{average}\left((\text{value}_i - \text{value}_j)^2\right)$.
Therefore, $\gamma(h)$ between the unknown point and a known sample cannot be calculated
directly. Consequently, a mathematical relationship between $\gamma(h)$ and $h$ is required that
can be applied to any pair of points. The semi-variance is therefore calculated for all the
pairs of points using Figure (21) and the linear variogram Equation (18).
The $\gamma(h)$ values for all possible pairs of points are shown below:
Figure 22: Calculated semivariance value
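The general equation for ordinary kriging using 5 points to estimate the value at an unknown
point $(x_0, y_0)$ is:
$$z(x_0, y_0) = \sum_{i=1}^{5} W_i \times z_i \tag{19}$$
$$= W_1 z_1 + W_2 z_2 + W_3 z_3 + W_4 z_4 + W_5 z_5 \tag{20}$$
Here, there are five unknown weights, which can be estimated using six simultaneous equations.
(There is an extra term, the Lagrange term ($\lambda$), to be included while solving for the weights, so
there are six unknowns.)
$$\begin{aligned}
\gamma(h_{11}) W_1 + \gamma(h_{12}) W_2 + \gamma(h_{13}) W_3 + \gamma(h_{14}) W_4 + \gamma(h_{15}) W_5 + \lambda &= \gamma(h_{01}) \\
\gamma(h_{21}) W_1 + \gamma(h_{22}) W_2 + \gamma(h_{23}) W_3 + \gamma(h_{24}) W_4 + \gamma(h_{25}) W_5 + \lambda &= \gamma(h_{02}) \\
\gamma(h_{31}) W_1 + \gamma(h_{32}) W_2 + \gamma(h_{33}) W_3 + \gamma(h_{34}) W_4 + \gamma(h_{35}) W_5 + \lambda &= \gamma(h_{03}) \\
\gamma(h_{41}) W_1 + \gamma(h_{42}) W_2 + \gamma(h_{43}) W_3 + \gamma(h_{44}) W_4 + \gamma(h_{45}) W_5 + \lambda &= \gamma(h_{04}) \\
\gamma(h_{51}) W_1 + \gamma(h_{52}) W_2 + \gamma(h_{53}) W_3 + \gamma(h_{54}) W_4 + \gamma(h_{55}) W_5 + \lambda &= \gamma(h_{05}) \\
W_1 + W_2 + W_3 + W_4 + W_5 + 0 &= 1
\end{aligned}$$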
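These equations can be re-written in matrix multiplication form:
$$\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ W_4 \\ W_5 \\ \lambda \end{bmatrix} =
\begin{bmatrix}
\gamma(h_{11}) & \gamma(h_{12}) & \gamma(h_{13}) & \gamma(h_{14}) & \gamma(h_{15}) & 1 \\
\gamma(h_{21}) & \gamma(h_{22}) & \gamma(h_{23}) & \gamma(h_{24}) & \gamma(h_{25}) & 1 \\
\gamma(h_{31}) & \gamma(h_{32}) & \gamma(h_{33}) & \gamma(h_{34}) & \gamma(h_{35}) & 1 \\
\gamma(h_{41}) & \gamma(h_{42}) & \gamma(h_{43}) & \gamma(h_{44}) & \gamma(h_{45}) & 1 \\
\gamma(h_{51}) & \gamma(h_{52}) & \gamma(h_{53}) & \gamma(h_{54}) & \gamma(h_{55}) & 1 \\
1 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}^{-1}
\begin{bmatrix} \gamma(h_{01}) \\ \gamma(h_{02}) \\ \gamma(h_{03}) \\ \gamma(h_{04}) \\ \gamma(h_{05}) \\ 1 \end{bmatrix}$$
$$= \begin{bmatrix}
0 & 7.672 & 12.150 & 8.479 & 14.653 & 1 \\
7.672 & 0 & 9.823 & 13.978 & 15.234 & 1 \\
12.150 & 9.823 & 0 & 11.643 & 6.404 & 1 \\
8.479 & 13.978 & 11.643 & 0 & 9.872 & 1 \\
14.653 & 15.234 & 6.404 & 9.872 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}^{-1}
\begin{bmatrix} 4.420 \\ 5.128 \\ 7.935 \\ 8.855 \\ 11.592 \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ W_4 \\ W_5 \\ \lambda \end{bmatrix} =
\begin{bmatrix} 0.397 \\ 0.318 \\ 0.182 \\ 0.094 \\ 0.009 \\ -1.161 \end{bmatrix}$$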
Finally, the value at the unknown point is calculated as:
$$z(x_0, y_0) = W_1 z_1 + W_2 z_2 + W_3 z_3 + W_4 z_4 + W_5 z_5 \tag{21}$$
The estimation variance at the unknown point is:
$$s^2 = 4.420(0.397) + 5.128(0.318) + 7.935(0.182) + 8.855(0.094) + 11.592(0.009) - 1.161 = 4.605$$
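The worked example can be checked numerically. The sketch below assumes NumPy and uses the semi-variance values quoted above; the variable names are illustrative. It solves the system directly rather than forming the inverse, which gives the same weights.

```python
import numpy as np

# Semi-variances between the five known points (values from the worked example)
gamma = np.array([
    [0.000,  7.672, 12.150,  8.479, 14.653],
    [7.672,  0.000,  9.823, 13.978, 15.234],
    [12.150, 9.823,  0.000, 11.643,  6.404],
    [8.479, 13.978, 11.643,  0.000,  9.872],
    [14.653, 15.234, 6.404,  9.872,  0.000],
])
# Semi-variances between the unknown point and each known point
gamma0 = np.array([4.420, 5.128, 7.935, 8.855, 11.592])

n = len(gamma0)
A = np.ones((n + 1, n + 1))  # border of ones carries the sum-to-one constraint
A[:n, :n] = gamma
A[n, n] = 0.0
b = np.append(gamma0, 1.0)

solution = np.linalg.solve(A, b)
weights, lam = solution[:n], solution[n]

# Estimation variance: weighted semi-variances plus the Lagrange term
variance = weights @ gamma0 + lam
print(np.round(weights, 3), round(float(lam), 3), round(float(variance), 3))
```

The printed weights, Lagrange term and variance should agree with the values above up to rounding. Given the measured values z1 to z5, the estimate of Equation (21) would be `weights @ z`.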
7 Universal Kriging
Universal kriging is an extension of ordinary kriging, which is a method for estimating the
value of a variable at an unmeasured location based on the values observed at nearby locations.
Ordinary kriging assumes that the variable being studied is stationary (i.e., its statistical
properties do not change across the study area). Universal kriging addresses this limitation by
incorporating a trend component into the kriging model. The trend component can be realized
using a trend surface. This is mathematically achieved by including a mathematical function,
such as a polynomial or spline, in the ordinary kriging equations.
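If we are using a linear trend component, the equations will be as follows:
$$\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ W_4 \\ W_5 \\ \lambda \\ b_1 \\ b_2 \end{bmatrix} =
\begin{bmatrix}
0 & 7.672 & 12.150 & 8.479 & 14.653 & 1 & 69 & 76 \\
7.672 & 0 & 9.823 & 13.978 & 15.234 & 1 & 59 & 64 \\
12.150 & 9.823 & 0 & 11.643 & 6.404 & 1 & 75 & 52 \\
8.479 & 13.978 & 11.643 & 0 & 9.872 & 1 & 86 & 73 \\
14.653 & 15.234 & 6.404 & 9.872 & 0 & 1 & 88 & 53 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
69 & 59 & 75 & 86 & 88 & 0 & 0 & 0 \\
76 & 64 & 52 & 73 & 53 & 0 & 0 & 0
\end{bmatrix}^{-1}
\begin{bmatrix} 4.420 \\ 5.128 \\ 7.935 \\ 8.855 \\ 11.592 \\ 1 \\ 69 \\ 67 \end{bmatrix}$$
$$\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ W_4 \\ W_5 \\ \lambda \\ b_1 \\ b_2 \end{bmatrix} =
\begin{bmatrix} 0.387 \\ 0.311 \\ 0.188 \\ 0.093 \\ 0.021 \\ -1.154 \\ 0.009 \\ -0.010 \end{bmatrix}$$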
The estimation variance at the unknown point will be:
$$s^2 = 4.420(0.387) + 5.128(0.311) + 7.935(0.188) + 8.855(0.093) + 11.592(0.021) - 1.154 + 0.009(69) - 0.010(67)$$
$$s^2 = 4.661$$
The standard error is:
$$s = 2.159$$
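The universal kriging system can be assembled in the same way as the ordinary kriging one, with two extra rows and columns holding the coordinates. The sketch below assumes NumPy and reads the coordinate columns and the last two right-hand-side entries (69, 67) from the matrix above as the x, y coordinates of the known points and of the unknown point; that interpretation and the variable names are assumptions.

```python
import numpy as np

# Semi-variances between the five known points (values from the worked example)
gamma = np.array([
    [0.000,  7.672, 12.150,  8.479, 14.653],
    [7.672,  0.000,  9.823, 13.978, 15.234],
    [12.150, 9.823,  0.000, 11.643,  6.404],
    [8.479, 13.978, 11.643,  0.000,  9.872],
    [14.653, 15.234, 6.404,  9.872,  0.000],
])
gamma0 = np.array([4.420, 5.128, 7.935, 8.855, 11.592])

# Coordinates taken from the last two columns of the matrix (assumed x, y)
xy = np.array([[69, 76], [59, 64], [75, 52], [86, 73], [88, 53]], dtype=float)
xy0 = np.array([69.0, 67.0])  # assumed location of the unknown point

n = len(gamma0)
F = np.column_stack([np.ones(n), xy])  # drift terms: 1, x, y
A = np.zeros((n + 3, n + 3))
A[:n, :n] = gamma
A[:n, n:] = F
A[n:, :n] = F.T
b = np.concatenate([gamma0, [1.0], xy0])

sol = np.linalg.solve(A, b)
weights, lam, b1, b2 = sol[:n], sol[n], sol[n + 1], sol[n + 2]

# Estimation variance, following the same expression as the worked example
variance = weights @ gamma0 + lam + b1 * xy0[0] + b2 * xy0[1]
print(np.round(weights, 3), round(float(variance), 3), round(float(np.sqrt(variance)), 3))
```

Because the worked example substitutes coefficients rounded to three decimals, and b1 and b2 are multiplied by coordinates of around 70, the variance printed here may differ from 4.661 in the second decimal place.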
Estimation variance or error
Another significant difference is regarding the quantification of the reliability of the estimation.
Kriging can provide a variance measure for each estimated point.
The choice between Kriging and IDW depends on the specific characteristics of the data,
the underlying spatial processes, and the level of detail required in the interpolation. Kriging
is more sophisticated and considers spatial correlation, making it suitable for situations with
complex spatial structures. At the same time, IDW is more straightforward and ideal for cases
where spatial correlation is not a crucial consideration.