Deutsch Course 2016-10-10
Uncertainty Assessment
A Short Course
October 2016
Clayton V. Deutsch
This schedule will be modified to meet the objectives and pace of the instructor/participants
All rights reserved. The entire set of course notes can be reproduced and distributed in their entirety
(with this notice intact). Small excerpts can be distributed with acknowledgement given to the source.
Geostatistics for Mineral Resources Uncertainty Assessment
Logistical Details
• Requests:
– Ask questions right away
– Be on time
– No unauthorized recording
• Software
– Software is essential
– There are commercial alternatives
– This course is not on software
Lec 01 – page 1
Background
• Geostatistics is an approach and a toolkit that applies statistical and numerical
analysis principles to geological variables
• The focus of this course is the fundamental principles of:
– Geological heterogeneity modeling
– Uncertainty assessment
– Decision-making
• Participants should leave with:
– Appreciation for the place of geostatistics in support of geological modeling
– Knowledge of the limitations of geostatistics
– Background necessary to dig into more details
• This course conveys selected concepts from geostatistics. Time is limited:
– Citation Program in Applied Geostatistics (four weeks plus project)
– Masters (M.Eng. and M.Sc.) Degrees (two years plus thesis)
– Doctorate (Ph.D.) Degree (two more years plus thesis)
• Background knowledge in statistics and mathematics would make this short
course easier, but background material will be given as we proceed
• Review the breadth of modern geostatistics drilling into some important
principles along the way
Numerical Modeling
• Historically, science involved (1) extensive data collection and physical
experimentation, then (2) deduction of laws consistent with the data
• Now, science is much more concerned with (1) understanding and
quantifying physical laws, and (2) numerical modeling for inference
• We now accept that uncertainty cannot be removed (account of E. Teller’s
statement of how science has changed)
“then we believed that everything could be predicted, now we
know that we can only predict in a probabilistic sense”
• In general:
– Numerical modeling has become more important than physical experimentation,
– Inductive reasoning has become more popular than deductive reasoning,
– Uncertainty is quantified and managed rather than ignored.
• Numerical modeling is ubiquitous in modern
science and engineering (virtually all design
is on the computer…)
Lec 01 – page 2
Historical Perspective
• A gambler's dispute in 1654 led to the creation of a
mathematical theory of probability by two French
mathematicians, Blaise Pascal and Pierre de Fermat
History of Geostatistics
• D. Krige and H. Sichel studied reserve estimation problems in South
Africa from the 1950s, establishing the problem
• Professor Georges Matheron (1930-2000) built the major concepts of the
theory for estimating resources he named Geostatistics
• The monumental Traité de géostatistique appliquée (Editions Technip,
France, 1962-63) defines the fundamental tools of linear geostatistics:
variography, variances of estimation and dispersion, and kriging.
• Two of Matheron's first students (Journel and David) would leave for the
USA and Canada and start new centers of geostatistical research
Lec 01 – page 3
Geostatistics
• Business Need: make the best possible decisions in presence of
uncertainty. One of the biggest uncertainties is the numerical
description of the subsurface.
Lec 01 – page 4
Why Geostatistics?
• The best approach to model the spatial distribution of geological
properties at the present time
• Better modeling of variability:
– Controllable degree of spatial variability
– Estimates are more reliable
• Framework to integrate data:
– Geological interpretation
– Direct measurements (hard data) and secondary variables (soft data)
– Data representing different measurement supports
• Assess uncertainty in process performance due to uncertainty in
geological model
• Practicality / consistency with data
• Repeatability / audit-trail
• Easy to merge incremental data
(Figure: example geological model coloured by Rock Types 1 through 6)
Lec 01 – page 5
Stochastic Modeling
(Schematic: reality versus model – the distribution of rock/fluid properties is represented by multiple stochastic models, which are processed through a transfer function to give a distribution of possible responses)
Lec 01 – page 6
Some Comments on Uncertainty
• Uncertainty exists because of incomplete data:
– Cannot be avoided,
– Can be reduced by consideration of all relevant data, and
– Can be managed
Lec 01 – page 7
Geostatistics for Mineral Resources Uncertainty Assessment
Stratigraphic Coordinates
– Erosion
– Filling preexisting topography
• Handled automatically in most
software
(Schematic: onlap, erosion, and combination stratigraphic correlation styles)
Lec 02 – page 1
Vein Coordinates
• Define the hangingwall and footwall by surfaces
• Choose a gridding system within
• Assign grades
• Convert back to regular 3-D system
More Prerequisites
• A geological understanding of the site being modeled is essential
Lec 02 – page 2
Cumulative Distribution Function (CDF)
F(z) = Prob{Z ≤ z}
• CDF is used most of the time, but it may be convenient to use PDF
Lec 02 – page 3
Distribution Models
• Parametric distribution models have an analytical expression for the
probability (cdf or pdf), which is completely determined by a few
parameters
– The normal or Gaussian distribution
is the most important to us (see later):
Quantiles
• The p-quantile of the distribution F(z) is the value z_p for which:
F(z_p) = Prob{Z ≤ z_p} = p ∈ [0, 1]
• Thus, the quantile can be expressed as the inverse of the CDF:
q(p) = F^{-1}(p)   (the inverse function, not 1/F)
• The probability value lends meaning to a particular number, e.g., the first quartile q(0.25)
Lec 02 – page 4
The Gaussian Distribution
• The Gaussian distribution is a constant of the universe.
– AKA normal distribution, error function or the curve
– Unparalleled mathematical tractability – particularly in high dimensions
• Central Limit Theorem: the sum of a great number of independent equally
distributed (not necessarily Gaussian) standardized random variables tends
to be normally distributed
• The standardized variable: y = (z − m_z) / σ_z
• The probability density: f(y) = (1/√(2π)) e^{−y²/2}
Lec 02 – page 5
Expected Values
• The expected value is a statistical operator that is the probability weighted
average of the random variable:
E{Z} = ∫ z f(z) dz
E{Z²} = ∫ z² f(z) dz
• Expected value of a constant is the constant:
E{a} = ∫ a f(z) dz = a ∫ f(z) dz = a   (since ∫ f(z) dz = 1)
• The expected value is a linear operator:
E{aZ} = a E{Z}
E{X + Y} = E{X} + E{Y}
• The mean is the expected value: m = E{Z}
• The mean is the correct effective value for most variables
• The variance is a second order moment defined as:
Var{Z} = σ² = E{[Z − m]²} = E{Z²} − m²
• The variance is a measure of the spread of the data from the mean.
• The standard deviation is the square root of the variance
Lec 02 – page 6
Representative Statistics
• Account of Dr. George Gallup
Lec 02 – page 7
Representative Statistics
• Historical mapping algorithms correct for preferential sampling: no
need for declustering in inverse distance, OK, …
• There is a need for representative proportions and histograms in
modern geostatistics:
– Checking models
– Global resource assessment
– As an input to simulation
• Simulation does not correct for preferential sampling even though
kriging is used inside simulation
• Cokriging with secondary data does not correct the distribution:
correlation with the rank order of the data is used – the conditional
distributions are not used directly
Principle of Declustering
• Assign differential weights to data based on proximity to other data
• Note the example below for five data equal weighted (left) and
weighted (right)
z̄ = Σ_{i=1}^{n} w_i z_i / Σ_{i=1}^{n} w_i        s² = Σ_{i=1}^{n} w_i (z_i − z̄)² / Σ_{i=1}^{n} w_i
Lec 02 – page 8
Example of Declustering
• Location map of 122 wells. The gray scale shows the underlying “true”
distribution of porosity (inaccessible in practice)
• Histogram of 122 well data with the true reference histogram shown as the
black line (inaccessible in practice)
• Note the greater proportion of data between 25 and 30% porosity and the sparsity of data
in the 0 to 20% porosity range
(Figure: histogram of the 122 well data with the true distribution overlain; x-axis Porosity %, y-axis Frequency)
(Figure: declustered histogram using polygonal weighting; x-axis Porosity %)
Lec 02 – page 9
Cell Declustering
• Polygonal declustering works well when the borders / limits of the area
of interest are well defined
• Another technique, called Cell Declustering, is more robust in 3-D and
when the limits are poorly defined:
– divide the volume of interest into a grid of cells l=1,…,L
– count the occupied cells Lo and the number in each cell nl, l=1,…, Lo
– weight inversely by the number in each cell (standardize by Lo); see the sketch below
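A minimal Python sketch of these steps in 2-D, averaging several random origin offsets; the coordinates, values, and cell size are illustrative and this is not the course software:

```python
import numpy as np
from collections import Counter

def cell_decluster(x, y, cell_size, n_offsets=10, seed=0):
    """Cell declustering weights: inverse of the number of data sharing a cell,
    averaged over random grid origins and standardized to sum to n."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(len(x))
    for _ in range(n_offsets):
        ox, oy = rng.uniform(0.0, cell_size, size=2)       # random origin offset
        ix = np.floor((x - ox) / cell_size).astype(int)
        iy = np.floor((y - oy) / cell_size).astype(int)
        cells = list(zip(ix.tolist(), iy.tolist()))
        counts = Counter(cells)                            # data per occupied cell
        w += np.array([1.0 / counts[c] for c in cells])    # inverse of cell count
    w *= len(x) / w.sum()                                  # standardize the weights
    return w

# clustered sample: three close data in a high area, two isolated data
x = [0.0, 1.0, 2.0, 50.0, 90.0]
y = [0.0, 1.0, 0.0, 50.0, 90.0]
z = np.array([28.0, 27.0, 29.0, 15.0, 10.0])
w = cell_decluster(x, y, cell_size=20.0)
print((w * z).sum() / w.sum())   # declustered mean, lower than the naive 21.8
```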
Cell Declustering
• Fixing the cell size and changing the origin often leads to different declustering
weights. To avoid this artifact
– A number of different origin locations are considered for the same cell size
– Weights are averaged for each origin offset
• Cell size should be the spacing of the data in the sparsely sampled areas
• The declustered mean is often plotted versus the cell size for a range of cell
sizes – a diagnostic plot that may help get the cell size
(Figure: diagnostic plot of declustered mean versus cell size and declustered histogram; summary statistics include median 22.02, lower quartile 16.22, minimum 6.06)
Lec 02 – page 10
Cell Declustering: Some Notes
• Perform an areal 2-D declustering when drilling is vertical
• Consider 3-D declustering when there are horizontal or highly deviated
wells that preferentially sample certain stratigraphic intervals
• The shape of the cells depends on the geometric configuration of the
data - adjust the shape of the cells to conform to major directions of
preferential sampling
• Choosing the cell size
– Choose the cell size so that there is approximately one datum per cell in the
sparsely sampled areas
– When there are many data and it is known that the high- or low-valued
areas have been oversampled then the cell size can be selected such that
the weights give the minimum (or maximum) declustered mean of the data
– Consider 10-20 origin offsets to stabilise the results relative to the origin
Declustering
                          Mean             Standard Deviation
Reference                 18.72            7.37
Equal Weighted            22.26 (+18.9%)   5.58 (−24.3%)
Polygonal Declustering    19.44 (+3.8%)    6.90 (−6.4%)
Cell Declustering         20.02 (+6.9%)    6.63 (−10.0%)
Lec 02 – page 11
Review of Main Points
• Stationarity
• Probability distributions
• Summary statistics
• Build weighted distribution / summary statistics:
– Polygons of influence
– Cell declustering
– Global estimation (accumulate weight to each data)
• Calibrate to secondary data when weighting is deemed inadequate:
• Other considerations / comments:
– No recipe for correct application
– Want to go beyond a limited sample to the population
– Must decluster facies proportions
– Future geostatistical analysis will depend heavily on simple statistics
inferred early in the modeling efforts
Lec 02 – page 12
Geostatistics for Mineral Resources Uncertainty Assessment
Variogram Calculation
• Spatial Dependence
• Variogram and Covariance
• Calculation of the Variogram
• Basic Interpretation Principles
Spatial Correlation
• The three maps are remarkably
similar: all three have the same 140
data, same histograms and same
range of correlation, and yet their
spatial variability/continuity is quite
different
Lec 03 – page 1
The Variogram
2γ(h) = E{[Z(u) − Z(u + h)]²}
• Variogram as a function of
distance in different directions
Lec 03 – page 2
Variogram Lags
(Schematic: pairs of data separated by the lag vector h, from the tail location u to the head location u + h)
γ(h) = σ² − C(h) = σ²[1 − ρ(h)]
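A minimal Python sketch of an omnidirectional experimental semivariogram built from such pairs, with simple lag binning only (the azimuth, bandwidth, and lag tolerances discussed in the following slides are omitted):

```python
import numpy as np

def experimental_variogram(coords, values, lag, nlags, lag_tol=None):
    """gamma(h): average of 0.5*[z(u) - z(u+h)]^2 over data pairs in each lag bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    lag_tol = lag / 2.0 if lag_tol is None else lag_tol
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)          # use each pair once
    d, sq = d[iu], sq[iu]
    lags, gammas, npairs = [], [], []
    for k in range(1, nlags + 1):
        in_bin = np.abs(d - k * lag) <= lag_tol     # lag distance plus tolerance
        if in_bin.any():
            lags.append(k * lag)
            gammas.append(sq[in_bin].mean())
            npairs.append(int(in_bin.sum()))
    return np.array(lags), np.array(gammas), npairs

# usage with assumed 2-D coordinates xy (n x 2) and values z:
#   lags, gam, npairs = experimental_variogram(xy, z, lag=50.0, nlags=10)
```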
Lec 03 – page 3
Covariance / Variogram
(Figure: variogram γ(h) and covariance C(h) versus distance lag h; C(0) = σ², and the covariance may be positive, zero, or negative)
Covariance is more intuitive,
but less commonly used due to
historical reasons.
Calculating Variograms
• Application
– Secondary data
– Checking models
Lec 03 – page 4
Calculating Variograms
• Angle (azimuth) tolerance
• Bandwidth
• Lag tolerance
(Schematic: lag spacing, azimuth tolerance, and bandwidth for experimental variogram calculation)
Lec 03 – page 5
Choosing Directions
• Consider the vertical (or down hole) variogram first
• Consider the omnidirectional (all directions taken together) next
• Look at maps and sections to identify directions – consider the
orientation of geological site and/or the data
• Create a “neutral” model using IDS or kriging
• Consider multiple directions before choosing a set of 3
perpendicular directions (major horizontal direction & two
perpendicular to major direction)
(Schematic: azimuth, azimuth tolerance, and bandwidth measured relative to the X axis (East))
Lec 03 – page 6
Variogram Interpretation
• Variogram interpretation
consists of explaining the
variability over different
distance scales.
(Figure: variogram versus distance h, showing the nugget effect, range, and sill)
Lec 03 – page 7
Comments on the Nugget Effect
• The nugget effect is the variance between pairs that are touching, but not
overlapping
Trends (1/4)
• Indicates a trend (fining upward, …)
• Could be interpreted as a fractal, fit a power law
function
• Model to the theoretical sill; the data will ensure that
the trend appears in the final model
• May have to explicitly account for the trend in later
simulation/modeling
(Figure: vertical and horizontal normal score variograms rising above the sill, indicating a trend)
Lec 03 – page 8
Cyclicity (2/4)
(Figure: vertical and horizontal normal score variograms illustrating cyclicity)
Lec 03 – page 9
Zonal Anisotropy (4/4)
• Compare the apparent vertical sill with the apparent horizontal sill
• When the vertical variogram reaches a higher sill:
– likely due to additional variance from stratification/layering
• When the vertical variogram reaches a lower sill:
– likely due to an areal trend – areal zoning
(Figure: vertical and horizontal variograms with different apparent sills; the horizontal variogram flattens at the variability within strata)
Some Variograms
Lec 03 – page 10
Review of Main Points
Lec 03 – page 11
Geostatistics for Mineral Resources Uncertainty Assessment
Variogram Modeling
Lec 04 – page 1
Common Variogram Models
– Nugget effect: no spatial correlation; should be a small component of the overall variance
– Spherical: a commonly encountered variogram shape
– Exponential: similar to spherical but rises more steeply and reaches the sill asymptotically
– Gaussian: implies short scale continuity; parabolic behavior at the origin instead of linear
E{Y_i(u)} = 0
E{Y_i(u)²} = 1
E{Y_i(u) · Y_j(u′)} = 0   ∀ i ≠ j, ∀ u, u′
• Each Y factor has its own variogram γ_i(h), i = 0, ..., nst, with different shape, anisotropy, ...
Lec 04 – page 2
LMR – Nested Structures
• The variogram of the combined variable depends on the contributions of
the factors:
Z(u) = m + Σ_{i=0}^{nst} a_i Y_i(u)
γ_Z(h) = Σ_{i=0}^{nst} a_i² γ_i(h) = Σ_{i=0}^{nst} c_i γ_i(h)
• To model a variogram:
1. Choose nst based on complexity (in practice, no more than four)
2. Choose shape of each structure
3. Choose angles and ranges of each structure
4. Choose contribution of each structure
5. Iteratively adjust as necessary (see the sketch below)
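A minimal Python sketch of evaluating such a nested model, assuming an isotropic nugget plus spherical structures (anisotropy and other structure types are omitted; the ranges and contributions are illustrative):

```python
import numpy as np

def spherical(h, a):
    """Spherical structure: reaches its sill of 1.0 at the range a."""
    h = np.asarray(h, dtype=float)
    s = 1.5 * (h / a) - 0.5 * (h / a) ** 3
    return np.where(h < a, s, 1.0)

def nested_variogram(h, nugget, structures):
    """gamma_Z(h) = c0 + sum_i c_i * Sph(h; a_i), with (c_i, a_i) per structure."""
    h = np.asarray(h, dtype=float)
    gam = np.where(h > 0.0, nugget, 0.0)      # nugget effect (zero at h = 0)
    for c, a in structures:
        gam = gam + c * spherical(h, a)
    return gam

# nugget of 0.1 plus two spherical structures; contributions sum to a sill of 1.0
h = np.linspace(0.0, 1200.0, 7)
print(nested_variogram(h, nugget=0.1, structures=[(0.4, 200.0), (0.5, 1000.0)]))
```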
Lec 04 – page 3
A Complicated Case
(Figure/table: experimental points and fitted variogram models in the vertical, major horizontal, and minor horizontal directions; a nugget effect plus nested spherical structures with different ranges in each direction)
• Published variograms
• Characteristic signatures for different settings make it possible to apply with limited data
Lec 04 – page 4
Inference in Presence of
Sparse Data
• Most often there are inadequate data to infer a reliable horizontal variogram.
• Horizontal wells have not significantly helped with horizontal variograms:
– hard core data and high quality well logs are rarely collected from horizontal wells
– horizontal wells rarely track the stratigraphic coordinate
• At present, we depend on analogue data deemed relevant to the site being
considered, such as other more extensively sampled reservoirs, geological
process simulation, or outcrop measurements
Lec 04 – page 5
Semi-Automatic Fitting
• There is software that mimics the iterative procedure performed by the
modeler, varying the different parameters that define a variogram model.
• The nugget effect, sill contributions, structure types and ranges, are fit to
experimental variogram points in up to three directions simultaneously.
• The user can fix any subset of parameters.
Lec 04 – page 6
Review of Main Points
• Variogram modeling is one of the most important steps in a
geostatistical study - all spatial inference depends on model of spatial
variability / continuity
Lec 04 – page 7
Geostatistics for Mineral Resources Uncertainty Assessment
Theory of Kriging
• Linear Estimation
• Derivation of Kriging
• Comments on Kriging
• Cross Validation
Context
• The goal is to compute a best estimate at an unsampled location
• Consider the data as differences from their mean values
y(u_i) = z(u_i) − m(u_i) = y_i ,   i = 1, ..., n
• Statistical inference and a decision of stationarity provides the required
information:
E{Y(u)} = 0
E{Y(u)²} = σ²   ∀ u ∈ A
E{Y(u_i) · Y(u_j)} = C(u_i − u_j) = C_ij
• Covariances are calculated from the variogram model
Lec 05 – page 1
Estimation
• Consider an unsampled location
y*(u_0) = z*(u_0) − m(u_0) = y* = ?
• The only basis to estimate different than the mean is when there are data
indicating that the local area is higher or lower than average
Error Variance
• We require a measure of the goodness of an estimate to get the best
• Minimizing the squared error is reasonable
σ²_E = E{(y* − y)²}
• The true value is not precisely known, but we can still calculate the
error variance in expected value:
• This error variance can be calculated using the variogram / covariance
• The mean or expected value (the local conditional one) is always the
estimate that minimizes the squared error criterion
• The estimate that minimizes this error variance by construction will be
called kriging in geostatistics
Lec 05 – page 2
Recall Variogram to Covariance
• The variogram is defined by: 2γ(h) = E{[Y(u) − Y(u + h)]²}
• The covariance is defined by: C(h) = E{Y(u) · Y(u + h)}
• So: γ(h) = C(0) − C(h), where C(0) = sill = σ²
Kriging
• Take the partial derivatives and set them equal to zero to find the
minimum:
Lec 05 – page 3
Kriging System of Equations
Summary of Kriging
Lec 05 – page 4
Cross Validation with Kriging
• Cross validation deletes one input data at a time and re-estimates
that point using the remaining data.
• This process is as difficult as the actual estimation of the unsampled
locations – remove the entire well/drillhole
• Sometimes the result is pessimistic
• The main use of these tools is to detect blunders and problems in the
input data before bad estimates are made
Lec 05 – page 5
Continuous Variable Display
• Continuous variables are relatively straightforward – with enough data
• Display points and summary statistics
Lec 05 – page 6
Review of Main Points
Lec 05 – page 7
Geostatistics for Mineral Resources Uncertainty Assessment
Practice of Kriging
• Recall
• Ordinary Kriging
• Universal Kriging
• Smoothing
Simple Kriging
• Local estimation: y*(u_0) = z*(u_0) − m(u_0) = Σ_{i=1}^{n} λ_i y_i
• Linear estimate with minimum error variance:
σ²_E = E{(y* − y)²} = σ² − 2 Σ_{i=1}^{n} λ_i C_{i0} + Σ_{i=1}^{n} Σ_{j=1}^{n} λ_i λ_j C_{ij}
• Nice system of equations:
Σ_{j=1}^{n} λ_j C_{ij} = C_{i0} ,   i = 1, ..., n
• Simple kriging variance: σ²_SK = σ² − Σ_{i=1}^{n} λ_i C_{i0}
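A minimal Python sketch of simple kriging at one location, assuming an isotropic nugget-plus-spherical covariance C(h) = σ² − γ(h); the data values, locations, and variogram parameters are illustrative and this is not the course software:

```python
import numpy as np

def cova(h, sill=1.0, nugget=0.1, a=300.0):
    """Covariance from a nugget + spherical variogram: C(h) = sill - gamma(h)."""
    h = np.asarray(h, dtype=float)
    sph = np.where(h < a, 1.5 * (h / a) - 0.5 * (h / a) ** 3, 1.0)
    gam = np.where(h > 0.0, nugget + (sill - nugget) * sph, 0.0)
    return sill - gam

def simple_krige(data_xy, data_z, mean, est_xy, sill=1.0, nugget=0.1, a=300.0):
    """Simple kriging estimate and variance at one unsampled location."""
    data_xy = np.asarray(data_xy, dtype=float)
    data_z = np.asarray(data_z, dtype=float)
    d_dd = np.linalg.norm(data_xy[:, None, :] - data_xy[None, :, :], axis=-1)
    d_d0 = np.linalg.norm(data_xy - np.asarray(est_xy, dtype=float), axis=-1)
    C = cova(d_dd, sill, nugget, a)             # left-hand side C_ij
    c0 = cova(d_d0, sill, nugget, a)            # right-hand side C_i0
    lam = np.linalg.solve(C, c0)                # kriging weights lambda_i
    est = mean + lam @ (data_z - mean)          # z* = m + sum lam_i (z_i - m)
    var = sill - lam @ c0                       # sigma^2_SK = sigma^2 - sum lam_i C_i0
    return est, var

# three data around an unsampled location at (50, 50)
print(simple_krige([(0, 0), (100, 0), (0, 150)], [1.2, 0.8, 1.5],
                   mean=1.0, est_xy=(50, 50)))
```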
Lec 06 – page 1
Types of Kriging
• Kriging is a procedure for constructing a minimum error variance
linear estimate at a location where the true value is unknown
z*(u_0) − m = Σ_{i=1}^{n} λ_i [z(u_i) − m]
Lec 06 – page 2
Ordinary Kriging
• Assume mean is constant but unknown:
• Estimation variance:
Ordinary Kriging
• Lagrange Formalism
Lec 06 – page 3
Types of Kriging
• Kriging with a trend model (KT) considers that m is unknown and that
it has a more complex trend of known shape but unknown parameters.
m(u) = Σ_{l=0}^{L} a_l f_l(u)
where m(u) is the local mean, a_l, l = 0, ..., L are unknown coefficients of
the trend model, and f_l(u) are low order monomials of the coordinates
(x, y, z, x², y², z², xy, xz, and yz)
• Kriging with an external drift is an extension of KT. Considers a single
trend function f1(u) defined at each location from some external
(secondary) variable.
Kriging with External Drift: any secondary variable can be used
Lec 06 – page 4
Kriging
• All versions of kriging are elaborations on the basic linear regression algorithm
and corresponding estimator:
Z*_SK(u) − m(u) = Σ_{α=1}^{n} λ_α(u) [Z(u_α) − m(u_α)]
where Z(u) is the RV model at location u, the u_α's are the n data locations,
m(u) = E{Z(u)} is the location-dependent expected value of RV Z(u), and
Z*_SK(u) is the linear regression estimator, also called the "simple kriging" (SK) estimator.
• The SK weights λ_α(u) are given by the normal equations:
Σ_{β=1}^{n} λ_β(u) C(u_β, u_α) = C(u, u_α) ,   α = 1, ..., n
• Some Remarks:
– there are many types of kriging where specific constraints or methods of application
are considered
– the weights λ_α(u) account for (1) the proximity of the data to the location being
estimated and (2) the clustering of the data
– traditionally used for mapping
– modern use is in the construction of local distributions of uncertainty for stochastic
simulation algorithms
Variance of Estimates
• Let’s calculate how smooth the estimates are:
Var{Y*(u)} = E{[Y*(u) − m]²} = E{(Y*)²}
= E{ Σ_{i=1}^{n} Σ_{j=1}^{n} λ_i λ_j Y_i Y_j } = Σ_{i=1}^{n} Σ_{j=1}^{n} λ_i λ_j E{Y_i Y_j} = Σ_{i=1}^{n} Σ_{j=1}^{n} λ_i λ_j C_{i,j}
= Σ_{i=1}^{n} λ_i C_{i,0}   (applying the kriging equations)
= σ² − σ²_SK ,   since σ²_SK = σ² − Σ_{i=1}^{n} λ_i C_{i,0}
Lec 06 – page 5
Variance and Smoothing
• However, these variances are too low and the degree of smoothing is
equal to the missing variance.
• The difference between the global variance, C(0) = σ², and the kriged variance is the
missing variance.
Lec 06 – page 6
Geostatistics for Mineral Resources Uncertainty Assessment
Principles of Simulation
• Preliminary remarks
• Motivation for simulation
• Setup
Why Simulation?
• Geological formations are heterogeneous at all scales
• Access to formations is costly
• Direct measurements are relatively widely spaced
Lec 07 – page 1
Some Initial Remarks
• Experimental mathematics is well established
• Drop a needle on floor of hardwood strips to calculate …
• (Re)invented many times in last centuries. On the computer:
• The name Monte Carlo Simulation (MCS) was coined by von Neumann
and Ulam as part of the secret research on the Manhattan Project
• Monte Carlo Simulation (MCS) aims to transfer uncertainty:
Initial Example
• What is the sum of the numbers showing on the face of three fair cubic
dice if they were thrown randomly?
• The input uncertainty and transfer function are known
• Simulation proceeds by:
– Draw three random numbers and associated outcomes from input CDF
– Process through the transfer function
– Repeat some number of times (a hundred?) – see the sketch below
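A minimal Python sketch of this exercise with a fixed seed for repeatability; the space of outcomes is small (mean 10.5, 216 equally likely outcomes), so the simulation only approximates the exact answer:

```python
import numpy as np

rng = np.random.default_rng(seed=7)
n_draws = 100                                    # "some number of times (a hundred?)"
rolls = rng.integers(1, 7, size=(n_draws, 3))    # three fair dice per draw
totals = rolls.sum(axis=1)                       # transfer function: the sum
print(totals.mean(), totals.std())               # compare with the exact mean of 10.5
vals, counts = np.unique(totals, return_counts=True)
print(dict(zip(vals.tolist(), (counts / n_draws).tolist())))  # approximate distribution
```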
Lec 07 – page 2
Space of Uncertainty
• The space of uncertainty for the initial example of three dice is small
6 × 6 × 6 = 6³ = 216 outcomes – could be solved exactly
Lec 07 – page 3
Uniform Random Numbers
• A long sequence of random numbers is required for simulation.
• 1927: Tippett published a table of 40,000 digits from census reports.
• 1939: Kendall and Babington-Smith create a mechanical device
(machine) to generate random numbers.
• 1946: Von Neumann proposed the middle square method.
• 1948: Lehmer introduced the linear congruential method.
• 1955: RAND Corporation published digits from electronic noise.
• 1965: MacLaren and Marsaglia combined two congruential generators.
• 1989: Wikramaratna proposed the additive congruential method.
Problem Setup
• Geostatistical Simulation: z_k^l(u) ,  k = 1, ..., K;  l = 1, ..., L;  u ∈ A
Lec 07 – page 4
Scenarios
• Realizations are generated one-for-one and not branched
1. Model Setup
– What are we going to model, steps, algorithms
2. Prior Parameter Uncertainty
– What are the base case parameters and uncertainty
3. Data Uncertainty
– Fill missing values and sample data error
4. Simulate Realizations
– Apply simulation engine with specified workflow, parameters and data
5. Process in Transfer Function
– Calculate what we are interested in (resources, reserves…)
6. Report Uncertainty and Sensitivity
– Show base case plus uncertainty, evaluate where the uncertainty is coming from
Lec 07 – page 5
1. Model Setup
• Specify hierarchical workflow for how a realization of all variables will be
assembled
• Deterministic to stochastic
• Facies proportions
• Variograms
Lec 07 – page 6
3. Data Uncertainty
• Impute missing values (probably consider decorrelation anyway)
4. Simulate Realizations
• Apply geostatistical workflow in scripted fashion with specified
parameters and data
Lec 07 – page 7
5. Process in Transfer Function
• Compute anything we want on all realizations
Lec 07 – page 8
Merging of RT and Grade
• Normally take the first RT realization with the first grade realization, and so on
• A simple template (cookie
cutter) is often used
• The results could be merged
with soft boundaries
– Merge results that share data
across the boundary
(Figure: distribution of the true variable (inaccessible) and the sampling distribution of a quantile)
Lec 07 – page 9
Uncertainty in CDF Value
• The mean and variance of these distributions can be determined
theoretically.
– where the indicator function is 1 if random drawing is less than or equal to
the value F and 0 otherwise. The mean or expected value of F* is the true
underlying cdf value, that is, F.
• The mean of F* is calculated as:
F* = (1/L) Σ_{l=1}^{L} i^l(u) ,   with E{F*} = F
• The variance follows because the indicators are independent with E{i} = E{i²} = F:
σ²_{F*} = (1/L) [ E{i²} − (E{i})² ] = (1/L) (F − F²) = F(1 − F) / L
• The distribution shape will approach Gaussian as L increases
• Probability to be inside interval can be easily calculated
Number of Realizations
• The sampling distribution of the probability can be derived
• The following table summarizes the results for common probability
and tolerance values (see the sketch below for how such values are computed)
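A minimal Python sketch of that calculation, using Var{F*} = F(1 − F)/L with a Gaussian approximation; the probability and tolerance values here are illustrative:

```python
from scipy.stats import norm

def realizations_needed(F, tol, prob):
    """Smallest L so that F* is within +/- tol of the true cdf value F with the
    stated probability, using Var{F*} = F(1 - F)/L and a Gaussian approximation."""
    z = norm.ppf(0.5 + prob / 2.0)            # two-sided standard normal quantile
    return (z / tol) ** 2 * F * (1.0 - F)

# e.g. estimating the probability F = 0.9 within +/- 0.05, 95% of the time
print(round(realizations_needed(F=0.9, tol=0.05, prob=0.95)))   # about 138 realizations
```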
Lec 07 – page 10
Final Remarks
• Promote best practice of geological modeling and geostatistics
– Simulation is required
– Principles are well established; details of implementation are not
• Steps in simulation:
• Realizations:
Lec 07 – page 11
Geostatistics for Mineral Resources Uncertainty Assessment
Overview of Techniques
1. Digitization of limits on 2-D sections (reconciled to 3-D) or in 3-D directly
– Always the best approach if possible
– Requires close data relative to scale of variability
2. Semi-deterministic (implicit) modeling with signed distance functions
– Easier than digitization, but less control
– Systematically erode and dilate boundaries
3. Process mimicking models
– Applicable in a few situations
– Nice looking and predictive models
4. Object based models
– Applicable in some situations
– Requires clear object sizes and shapes
5. Cell based models
– Sequential indicator, truncated Gaussian, multiple point statistics
– Statistically control variability between cells
Lec 08 – page 1
General Comments
• Geological sites are modeled hierarchically
– Large scale limits / container (surfaces, digitized limits,…)
– Major heterogeneities (often categorical variables)
– Small scale features (often continuous variables such as mass/volume
fractions and rate constants)
• Categorical variables:
– Based on some geological characteristic of the rock that is large scale
– Normally do not consider more than 5 to 7 at one time
– Categories have different statistics (univariate or spatial)
– Reasonable consistency within and variation between
– Mutually exclusive at the scale of modeling (and exhaustive)
– Spatially consistent/coherent with some degree of predictability
– Must be enough data within each category for inference
– Contact analysis is performed – hard boundaries are preferred
Deterministic Interpretation
• Digitize contacts and reconcile in 3-D
• Not a geostatistical subject, but important
Lec 08 – page 2
Boundary Modeling with Distance
Functions
• The limits of each domain must be determined
• Deterministic model with a digitized solid model or surfaces
• Partially stochastic with distance functions
Lec 08 – page 3
Process Mimicking
• Process mimicking or event based modeling is becoming more common
• Very deposit specific
• Some vein-type deposits and laterites
• Motivation:
– Visually attractive
– Clean geologic shapes
– Realistic in right circumstances
• Steps:
1. Model large scale features
2. Model small scale features
3. Populate with petrophysical properties
Lec 08 – page 4
Cell Based Techniques
• Assign a category to a gridded model
• Statistical control with surrounding cells
through variograms or higher order
statistics
• Could work through one or more
continuous variables – truncated (pluri)
Gaussian
• Trend modeling is important for large
scale features
Lec 08 – page 5
Geostatistics for Mineral Resources Uncertainty Assessment
• Multivariate Gaussian
• Congenial properties
• Local conditional distributions
Normal Distribution
Lec 09 – page 1
Central Limit Theorem (CLT)
• Theorem: the sum of a great number of independent equally distributed (not
necessarily Gaussian) standardized random variables tends to be normally
distributed
• The standard (m = 0, σ = 1) normal probability density function: f(z) = (1/√(2π)) e^{−z²/2}
• For a product of positive random variables: Y = Π_{i=1}^{n} Y_i
X = ln Y = Σ_{i=1}^{n} ln Y_i = Σ_{i=1}^{n} X_i   →  the CLT applies to the X's, so Y tends to a lognormal distribution
Lec 09 – page 2
Bivariate Gaussian Distribution
• The relationship between the two variables is defined by a single
parameter: the correlation coefficient
• The probability contours are elliptical. Some special parameters:
– Conditional expectation is a linear function of the conditioning event:
E{Y | X = x} = m_Y + ρ_{X,Y} (σ_Y / σ_X) (x − m_X)
– Conditional variance is independent of the conditioning event (see the sketch below):
Var{Y | X = x} = σ_Y² (1 − ρ²_{X,Y})
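A minimal Python sketch of these two results (the parameter values are illustrative):

```python
def bigauss_conditional(x, mx, my, sx, sy, rho):
    """Conditional mean and variance of Y given X = x for a bivariate Gaussian:
    linear conditional expectation, conditional variance independent of x."""
    mean = my + rho * (sy / sx) * (x - mx)
    var = sy ** 2 * (1.0 - rho ** 2)
    return mean, var

print(bigauss_conditional(x=1.5, mx=0.0, my=0.0, sx=1.0, sy=1.0, rho=0.7))
# -> (1.05, 0.51): the conditional variance does not depend on the value of x
```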
Lec 09 – page 3
Multivariate Gaussian Distribution
Lec 09 – page 4
Application of Multivariate Gaussian
Distribution
• Assuming Gaussianity within reasonably defined rock types is a reasonable
approximation
• Let’s make the variable Gaussian and assume multivariate Gaussianity…
• We would only need correlations/covariances/variograms
• We would only need to perform simple (co)kriging to get mean and variance
of all conditional distributions
• All distributions of uncertainty would be Gaussian with mean and variance
specified by:
– Linear combination of data in simple (co)kriging
– Homoscedastic kriging variance
• But….
– Real data distributions are not Gaussian
– Multivariate relationships are not homoscedastic and Gaussian
– Constraints, Heteroscedasticity, and Non-linearity are encountered in practice
• Everything we do not explicitly control in mathematics tends toward a
Gaussian distribution
Lec 09 – page 5
Gaussian Transform
• Univariate transformation guarantees a univariate Gaussian distribution
• There is no guarantee of a multivariate Gaussian distribution:
– Non-linearity is not removed
– Constraints are not removed
• The proportional effect / heteroscedasticity is largely removed by transformation
and reintroduced by back transformation
(Figure: cross plots in original units and after normal score transformation)
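A minimal Python sketch of the univariate normal score transform (declustering weights are optional; the back transform is the reverse quantile lookup). The data values are illustrative:

```python
import numpy as np
from scipy.stats import norm

def normal_score_transform(z, weights=None):
    """Map each datum to the standard normal quantile that matches its
    (optionally declustered) cumulative probability."""
    z = np.asarray(z, dtype=float)
    w = np.ones_like(z) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(z)
    p = np.cumsum(w[order]) / w.sum()
    p -= 0.5 * w[order] / w.sum()        # plot at the midpoint of each cumulative step
    y = np.empty_like(z)
    y[order] = norm.ppf(p)               # standard normal quantiles
    return y

z = np.array([3.1, 7.4, 0.8, 12.0, 5.5])
print(normal_score_transform(z))         # symmetric about zero for equal weights
```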
Lec 09 – page 6
More Examples
(Figure: further example cross plots in original units and after normal score transformation; correlations shown: 0.98, 0.97, 0.25, 0.04)
Quantile Backtransformation
• Quantiles can be back transformed at any time
• Non-standard Gaussian distributions are valid – just back transform non-
standard Gaussian to a possible shape in real data units (Z):
z^l = F_Z^{-1}( G( y_SK + σ_SK · G^{-1}(p^l) ) ) ,   l = 1, ..., L
Lec 09 – page 7
Another Picture of Transformation
Lec 09 – page 8
Geostatistics for Mineral Resources Uncertainty Assessment
Gaussian Simulation
• Estimation is locally
accurate and smooth,
appropriate for visualizing
trends, inappropriate for
engineering calculations
where extreme values are
important, and does not
assess global uncertainty
• Simulation reproduces
histogram, honors spatial
variability (variogram),
appropriate for flow
simulation, allows an
assessment of uncertainty
with alternative realizations
possible
Lec 10 – page 1
Preliminary Remarks
• Kriging is smooth – but is entirely based on the data
• Kriging does not reproduce the histogram and variogram
• Simulation is designed to draw realizations that reproduce the data,
histogram and variogram
• The average of multiple simulated realizations is very close to kriging
Prerequisites
• Work within statistically homogeneous stationary populations
– Model facies/RTs first
– Chronostratigraphic or unfolded grid framework
• Must have clean data that are positioned correctly with manageable
outliers and error content
• Need to understand special data:
– trends
– production data
– seismic data
• Do not leave data out for checking when building the final model
• Simulation considers grid nodes and not grid blocks:
– Assign a property at cell center at the scale of the data
– Normally paint entire cell based on central value
– Sometimes refine the grid dynamically after simulation
Lec 10 – page 2
Simulation from High Dimensional
Multivariate Distributions
• MCS or simulation from a univariate distribution is easy
Lec 10 – page 3
More Sequential Simulation
Sequential Simulation
• Transform variables one-at-a-time to a Gaussian distribution
– Use representative distribution
• Establish the covariance between all pairs of variables / locations
– Variogram of normal scores
• Predict local uncertainty
• Simulation
– Loop over all locations
– Estimate local uncertainty
– Sample uncertainty
• Repeat with different random numbers for multiple realizations
• Back transform to original units (a minimal sketch of the sequential loop follows)
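A minimal 1-D Python sketch of this sequential loop, assuming normal score data (mean 0, unit sill), simple kriging using all previously simulated nodes plus the original data (no search neighbourhood or multiple grid), and data locations that do not coincide with grid nodes; the covariance function and data values are illustrative:

```python
import numpy as np

def sgs_1d(grid_x, data_x, data_y, cov, seed=0):
    """Sequential Gaussian simulation of one realization on a 1-D grid."""
    rng = np.random.default_rng(seed)
    known_x, known_y = list(data_x), list(data_y)
    sim = np.empty(len(grid_x))
    for i in rng.permutation(len(grid_x)):          # random path over the grid nodes
        x0 = grid_x[i]
        xs = np.array(known_x)
        C = cov(np.abs(xs[:, None] - xs[None, :]))  # data-to-data covariances
        c0 = cov(np.abs(xs - x0))                   # data-to-node covariances
        lam = np.linalg.solve(C, c0)                # simple kriging weights
        mean = lam @ np.array(known_y)              # SK mean (global mean is zero)
        var = max(cov(0.0) - lam @ c0, 0.0)         # SK variance
        sim[i] = rng.normal(mean, np.sqrt(var))     # draw from the local Gaussian
        known_x.append(x0)                          # later nodes are conditioned on it
        known_y.append(sim[i])
    return sim

cov = lambda h: np.exp(-3.0 * np.asarray(h, dtype=float) / 100.0)  # range ~ 100 m
grid = np.arange(0.0, 500.0, 10.0)
realization = sgs_1d(grid, data_x=[55.0, 312.0], data_y=[1.0, -0.5], cov=cov)
print(realization[:5])
```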
Lec 10 – page 4
Some Implementation Details
• The sequence of grid nodes to populate:
– Random to avoid potential for artifacts
– Multiple grid to constrain large scale structure
Check Results
• Check in normal units and in original units
• Reproduce data at their locations?
• Reproduce histogram, N(0,1), over many realizations?
• Reproduce variogram?
• Average of many realizations equal to the kriged value?
Lec 10 – page 5
Different Simulation Algorithms
• There are alternative Gaussian simulation techniques:
– Matrix Approach: LU Decomposition of an N × N matrix
– Turning Bands: simulate the variable on 1-D lines and combine in 3-D
– Spectral and other Fourier techniques
– Fractal techniques
– Moving Average techniques
Lec 10 – page 6
Geostatistics for Mineral Resources Uncertainty Assessment
Practice of Simulation
• Implementation
• Checking
• Programs
Lec 11 – page 1
Normal Scores Transformation
(Figure: graphical normal scores transformation – the cumulative probability F(z_k) of each data value z_k is matched to the corresponding standard normal quantile)
Lec 11 – page 2
Two-Part Search or Assign Data to
Nodes
Two-Part?
• search for previously simulated nodes and then original data in two steps
then treat the same when constructing the conditional distribution
• honor the data at their locations even if they can not be seen in the final model
• necessary for cross-sectional or small-area models
Assign Data to Grid Nodes:
• explicitly honor data - data values will appear in final 3-D model
• improves the CPU speed of the algorithm: searching for previously simulated
nodes and original data is accomplished in one step
Lec 11 – page 3
Number of Data to Consider
• Reasons for more:
– theoretically better
– more accurate estimate of the conditional mean and variance
– better reproduction of the variogram
• Reasons for less:
– CPU time is proportional to N3
– memory requirements proportional to N2
– negative weights are commonly encountered when data are screened
– using fewer data places less emphasis on the assumption of stationarity
• So, choose between 12 to 48 depending on:
– 2-D versus 3-D
– range of variogram relative to grid node spacing
– CPU time restrictions
Type of Kriging
• Simple Kriging (SK) is equivalent to normal equations and is theoretically
correct:
z*_SK(u) = Σ_{i=1}^{n} λ_i z(u_i) + [ 1 − Σ_{i=1}^{n} λ_i ] m_global
Lec 11 – page 4
Ergodic Fluctuations
• A statistical domain is ergodic when it is large with respect to the scale of
variability
• Expect some statistical fluctuations in the input statistics when the
domain is not ergodic
Lec 11 – page 5
Review of Main Points
• Kriged estimates are too smooth and inappropriate for most engineering
applications
• Simulation corrects for this smoothness and ensures that the variogram /
covariance is honored
• There are many different simulation algorithms → sequential Gaussian is simple
and most widely used
• Use Gaussian / normal distribution for a consistent multivariate distribution
Lec 11 – page 6
Geostatistics for Mineral Resources Uncertainty Assessment
Quantifying Importance
• Impact of a variable on the local uncertainty.
• Remove the variable from the system, and reassess kriging variance
• Importance – the deviation in kriging variance observed when that
variable is removed from the system.
Importance_i = ( σ²_{K,i} − σ²_K ) / σ²_K
where i denotes the variable(s) removed, σ²_{K,i} is the kriging variance with variable i removed, and σ²_K is the kriging variance when all variables are considered.
Lec 12 – page 1
Importance Measures
(Schematic: input variables passed through a transfer function f(y) to a response variable)
We need a tool to visualize, summarize, analyse and present
uncertainty and sensitivity
Lec 12 – page 2
First Goal
Ideal Characteristics:
• Simple
• Informative
• Easy to understand (essential information)
• Easy to use ("friendly")
Tornado Chart
Lec 12 – page 3
Contribution to Uncertainty in Response?
(Figure: scatter plots of response versus two inputs, with R² = 0.92 and R² = 0.23)
Lec 12 – page 4
Geostatistics for Mineral Resources Uncertainty Assessment
• Taxonomy of techniques
• Data exploration and preparation
• Model building
Lec 13 – page 1
I.a Transform Compositional Data
• Modeled variables commonly have
constraints which must be obeyed
– Compositional components must not sum to
greater than 100%
– Net measurements must not exceed associated
gross (acid soluble versus total)
• Consider a set of variables that are a
composition (perhaps with a filler variable)
with last variable as reference:
z_k ,  k = 1, ..., K   with z_k ≥ 0 and Σ_{k=1}^{K} z_k = 1
Program: logratio
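A minimal Python sketch of one such transform, the additive log-ratio with the last (reference or filler) variable as denominator; this is one common choice and not necessarily what the logratio program implements:

```python
import numpy as np

def alr_forward(Z, eps=1e-6):
    """Additive log-ratio transform of compositions (rows sum to 1);
    the last component is the reference."""
    Z = np.clip(np.asarray(Z, dtype=float), eps, None)   # guard zeros before the log
    return np.log(Z[:, :-1] / Z[:, -1:])

def alr_backward(X):
    """Back transform log-ratios to compositions that sum to 1."""
    e = np.exp(np.asarray(X, dtype=float))
    comp = np.hstack([e, np.ones((len(e), 1))])          # reference component
    return comp / comp.sum(axis=1, keepdims=True)        # re-close the composition

Z = np.array([[0.60, 0.25, 0.15],
              [0.10, 0.40, 0.50]])
print(alr_backward(alr_forward(Z)))                      # recovers Z (within the eps guard)
```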
Lec 13 – page 2
I.c Summarize Multivariate Relationships
• Cross plots are a pale shadow of full multivariate space, but…
• Correlation matrix is interesting – may consider optimal ordering
• Multidimensional scaling may help group or screen the variables
• MDS could be followed by clustering
• Avoids overfitting
• Influence of each can be determined by sampling
Program: supersec
Lec 13 – page 3
I.e Cluster Data Observations
• Group observations that are close
in multivariate space
• Main algorithms:
– Hierarchical
– K-means
– Gaussian Mixture Model
Program: cluster
• Several clustering
algorithms
• Dendrogram plot (for hier.)
• Summary plots
– Quadratic: Ŷ = b_0 + Σ_i b_i X_i + Σ_k Σ_{l≥k} c_{kl} X_k X_l
– ACE
Lec 13 – page 4
II.a Estimate with Unequally Sampled Data
• Many situations where data are never (or rarely) equally sampled
– Diamond drill holes vs. reverse circulation.
– Blast hole samples vs. exploration drill hole samples.
– Legacy data
Lec 13 – page 5
II.c Impute Missing Data
• Multivariate transforms may only be executed on equally sampled
(homotopic) observations
• Multiple imputation is the best approach in most circumstances
Program: decorrelate
Lec 13 – page 6
II.e Simulate with Nonlinear Decorrelation
• Transform the multiple variables with a multivariate nonlinear transform
to decorrelate and make multivariate Gaussian
• Projection Pursuit Multivariate Transform (PPMT) now widely used
Lec 13 – page 7
Comments on Workflows
• These workflows are stable and
well established
• CCG has tools for all of these
workflows
Lec 13 – page 8
Geostatistics for Mineral Resources Uncertainty Assessment
Background
• PCA is a classic dimension reduction and
decorrelation technique that was
developed by Pearson (1901) and
Hotelling (1933)
– Adapted to geostatistics by Joreskog et al
(1976) , as well as Davis and Greenes
(1986)
Lec 14 – page 1
Linear Decorrelation Workflow
• An elegant workflow for multiGaussian data
– Covariance matrix fully explains the multivariate distribution
• Decorrelation leads to independence
PCA Theory
• The primary objective is often to transform correlated variables into
orthogonal principal components.
Z : z_{αi} ,  α = 1, ..., n ,  i = 1, ..., K
• The covariance matrix of the data is given as Σ_Z = (1/n) Zᵀ Z
Lec 14 – page 2
Data Exploration with PCA
• PCA is often used for applications beyond decorrelation, such as
sensitivity analysis and dimension reduction
Σ_Z = V D Vᵀ
where V is the K × K matrix of orthogonal eigenvectors (columns v_1, ..., v_K) and D is the diagonal matrix of eigenvalues d_11, ..., d_KK
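A minimal Python sketch of this decomposition and rotation; the two-variable data set is synthetic and for illustration only:

```python
import numpy as np

def pca_decorrelate(Z):
    """Center the data, eigen-decompose Sigma = V D V^T, and rotate onto the
    orthogonal principal components (sorted by decreasing eigenvalue)."""
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)
    Sigma = np.cov(Zc, rowvar=False)           # K x K covariance matrix
    d, V = np.linalg.eigh(Sigma)               # eigenvalues and eigenvectors
    order = np.argsort(d)[::-1]
    d, V = d[order], V[:, order]
    scores = Zc @ V                            # uncorrelated principal components
    explained = d / d.sum()                    # variance proportions, d_i / tr(D)
    return scores, V, explained

rng = np.random.default_rng(0)
Z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)
scores, V, explained = pca_decorrelate(Z)
print(np.round(np.corrcoef(scores, rowvar=False), 3))   # off-diagonals near zero
print(np.round(explained, 3))                           # e.g. roughly [0.9, 0.1]
```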
Lec 14 – page 3
Nickel Laterite Dimension Reduction
• The variability of the first four principal
components of the Nickel laterite
system according to d_i / tr(D)
Scree Plots
• Help decide # of PCs to use
– Choose up to and including point on an “elbow”
– Sensitive to sample size
Lec 14 – page 4
Data Sphereing
• Sphereing is closely related to PCA
– PCA: , where
– Sphereing (or simply, Sphere) refers here to the
standardization of the rotated variables according
to:
– Sphere-R (reverse projection) refers here to the
standardization of the rotated variables before
projecting the variables back onto the original basis:
Lec 14 – page 5
MAF Background
Lec 14 – page 6
Transformed Loadings
• Loadings of Sphere and Sphere-R
transformation explain the nature of those
techniques
– Relate to variance of the h = 0 multivariate
system
• Loadings of MAF appear more unorganized
– Relate to variance of the h > 0 multivariate
system
• MAF rotation is more easily understood after viewing the transformed variograms
Lec 14 – page 7
Geostatistics for Mineral Resources Uncertainty Assessment
• Averaging schema
• Linearizing transforms
• Multiscale simulation
• Measurement error
• Case study
Geometallurgy
• One perspective
Lec 15 – page 1
Geometallurgical Data
Geometallurgical
Geostatistics
• Proposed approach for
challenging data is data
transformation followed by
either:
1. A correlated multivariate
simulation workflow
Lec 15 – page 2
Data Transformation Methods
Lec 15 – page 3
Bitumen Recovery
• Another variable exhibiting non-linear behaviour is bitumen recovery
• Increasing the concentration of fines for oilsands flotation sharply
decreases bitumen recovery in the froth
• Response is percolation-like: little effect of fines on bitumen recovery up
to 12% fines, then recovery drops substantially
Sanford, E. C. (1983). Processibility of Athabasca oil sand: Interrelationship between oil sand fine solids, process aids, mechanical energy and oil sand age after mining. The Canadian Journal of Chemical Engineering, 61(4), 554-567.
Lec 15 – page 4
Methods to Infer Nonlinearity
• Ideally use direct
measurement of blends of
known ore types to infer
nonlinearity
• With multiscale
measurements, nonlinear
regression may be used,
however error will bias the
regression
Lec 15 – page 5
Impact of Measurement Error
• Measurement error, particularly in the case of small scale
correspondence measurements, is expected
• Results in regression attenuation, a reduction in signal
strength due to measurement error (Spearman, 1904)
• Error in measurements may lead to bias in models: a power of 3.5 fit instead of 5.5 for the case shown, although it appears unbiased
Lec 15 – page 6
Attenuation of Power Law Models
• If measurement error is present on the small scale measurements, then
the fit will be biased for substantial amounts of error, even in a linear
model
• This is expected behavior, we could consider disattenuation
Lec 15 – page 7
Disattenuation of Power Law Models
• Attenuation can be observed in a
Monte Carlo experiment, like the
situation shown here
Lec 15 – page 8
Nonlinear Inference with Multivariate
Multiscale Data
• Require realizations of Z at the small scale v with:
– the correct spatial regionalization (variogram)
– the correct bivariate distribution with the known correlation
linear variable
• Two evident approaches:
– Drawing realizations of correlated probabilities using
probability field (p-field) simulation
– Semiparametric Bayesian updating (BU) using a familiar
framework from Barnett and Deutsch’s multiple imputation
method
Lec 15 – page 9
Effect of Bivariate Information
• Increasing bivariate information decreased the mean squared error of
the inferred nonlinear re-expression (mean squared error of fit Z(V)
compared to true Z(V)) when all other parameters held constant
Lec 15 – page 10
Bivariate Distribution Scaling Algorithm
• Scale a bivariate distribution (or up to ~4 dimensional multivariate)
iteratively
• Susceptible to curse of
dimensionality
Lec 15 – page 11
Geometallurgical Case Study
• Several of the described techniques
are demonstrated using a case study
from a copper-molybdenum porphyry
deposit
• Quantifying mill throughput is critical
for this large tonnage operation
– A model of grinding indices and rock
properties is required in addition to
copper and molybdenum grades
• Data consists of 18.5 km of drilling
– Geology includes 13 lithologies and 8
alterations
– Modeled domain of highly altered, high
grade, and immediately surrounding
host rock is displayed
• Not the focus here
Data Inventory
• Data continued:
– Assays: copper, molybdenum, silver, iron, sulfur, acid soluble copper and
cyanide (2 m intervals)
• Interval data also included rock quality designation (RQD), core recovery,
point load test strength and fractures frequencies
– Metallurgical data: comminution and whole rock analysis (30 m intervals)
• Includes BMWi, semi-autogenous grinding (SAG) and critical semi-autogenous grinding index (Axb)
• Data available on 2m intervals are composited to 15m (bench scale) for
modeling
• Due to the absence of experimental data comparing assay and
geometallurgical variables, the two are modeled separately
– Comparison will be made between using models that are constructed using
30m intervals and downscaled 15m intervals for geometallurgical variables
Lec 15 – page 12
Model Workflow
• Capturing parameter and data
uncertainty at several steps,
the workflow may be
summarized as:
1. Normal score the variables
2. Impute missing values
3. Simulate using ICCK (with
SGSIM) using supersecondary
variables
4. Correct realizations to have
the correct histogram
5. Check reproduction
Model Checking
• Model checking typically includes:
– Visual inspection of realization and e-type
– Histogram reproduction
– Multivariate reproduction
– Variogram reproduction
(Figure: a realization and the e-type (average) model)
Lec 15 – page 13
Model Checking (2)
(Figure: data and realization comparison plots)
Lec 15 – page 14
Variability of Throughput Rate
• Regions of high Axb corresponding to less energy intensive ore and low
Axb (more energy intensive ore) are observed in the pit limits
• The temporal variation of Axb is examined in an intermediate pit
expansion over a period of 300 operating days
– An extraction sequence is applied to 20 high-resolution realizations
– The average daily Axb was calculated for each day
Downscaling
• The realizations of Axb that are used in the previous slide are
constructed using the 30m measurements of Axb, not
measurements on a 15m scale
• Consider downscaling to 15m
– Use an assumption of linearity since data is not available to support
non-linear averaging schema
• Downscaling of the variogram displayed below
– Leads to a variance increase of 30%
Lec 15 – page 15
Increased Variability of Throughput
• The downscaled histogram is
stable and reproduces the
increase in variance of 30%
• The resulting realizations of Axb
show the effect of the
substantially increased variance
Lec 15 – page 16
Geostatistics for Mineral Resources Uncertainty Assessment
• Post processing
• Principles of data spacing
Reporting
• Practitioners may wonder how to report in presence of multiple
realizations – people still want just one model!?!
• Reporting within one zone (time period, bench, stope, …):
• These tables are for one zone. The zones could be changed to achieve
certain targets for tonnage, strip ratio, metal content…
• Assigning probability intervals to all tonnage/grade numbers is an important
step
Lec 16 – page 1
By Time Period
• Present uncertainty in different
time periods
• Uncertainty in NPV/ROI-type
statistics would require
realizations, which we have
• Schematic to the right shows
error bounds and multiple
realizations
• Should exceed P90 10% of the
time and fall below P10 10% of the
time
• It is fairly standard to assemble
uncertainty in this manner
– What about the source of
uncertainty? RT or grade?
– Sensitivity versus Uncertainty
Uncertainty
• P10/P50/P90 results for the
10 zones
• Realizations are shown
• Results are consistent:
– 10% above P90
– 10% below P10
• Could process through
cash flow analysis…
• Summarize by expected
value and show
uncertainty
Lec 16 – page 2
Review of the Approach
The Efficient Frontier
• Originally proposed by H. Markowitz (1952) as a way to select between
investment portfolios
* Figure is a simplified example of the efficient frontier originally presented by H. Markowitz in 1952 in “Portfolio Selection"
Lec 16 – page 3
The Efficient Frontier
A suggested approach for using the efficient frontier methodology
1. Find the pits along the efficient frontier
2. Review and understand the changes between the pits along the frontier
3. Use the efficient frontier and the review of changes to better inform
future decision making processes.
Lec 16 – page 4
Surfaces Along the Frontier
2. Review the changes in the pits along the efficient frontier
(Figures: efficient frontier plot; cross-sections of the depth of pit showing what is changing and how much the pits change; surface plots of the difference in depth, pit outlines, and the decimal probability to be ore, shown south to north)
Lec 16 – page 5
Small 2D Examples
Small 2D examples show some similarities and hint at other shapes to the
frontier
Similar Features:
• Smooth slope where sides of the pit
are being shaved away
Other Features:
• Strong Jumps in the Frontier
where larger portions
of the pit are dropped
Lec 16 – page 6
Quantifying Uncertainty
• Principle: uncertainty can be quantified with geostatistical simulation and
is scale dependent.
Sources of Uncertainty
• Principle: uncertainty
can be explained by
data spacing and other
local factors.
Lec 16 – page 7
Predicting How Many Data are Needed
• Principle: local factors can be used for
estimating the data spacing that yields a
target uncertainty.
• Studied with resampling and resimulation
Lec 16 – page 8
Optimal Spacing
• Principle: the cost of uncertainty should be estimated to determine a
target data spacing and associated uncertainty.
Reasonable Spacing
• Principle: if the cost of uncertainty cannot be estimated, characteristics of
data spacing and uncertainty can be used to determine a reasonable
target.
Lec 16 – page 9
Geostatistics for Mineral Resources Uncertainty Assessment
Resource Classification
• Purpose
• Geometric criteria
• Probabilistic support
• Recommendations
Background
• Canadian Securities Commissions (ASC, BCSC, CVMQ, OSC) and the
Toronto Stock Exchange (TSE) have adopted National Instrument 43-101: Standards of Disclosure for Mineral Projects.
• NI 43-101 applies to all oral statements and written disclosure of
scientific or technical information, including disclosure of a mineral
resource or mineral reserve.
• NI 43-101 defers to the Canadian Institute of Mining, Metallurgy and
Petroleum for definitions and guidelines. In particular, the CIM
Standards on Mineral Resources and Reserves Definitions and
Guidelines adopted by CIM Council on August 20, 2000.
• Council of Mining and Metallurgical Institutes (CMMI) of which CIM is a
member, have developed a Resource/Reserve classification, definition
and reporting system that is similar for Australia, Canada, Great Britain,
South Africa and the United States.
• The Joint Ore Reserves Committee (JORC) of the Australasian Institute
of Mining and Metallurgy (AusIMM) and the Australian Institute of
Geoscientists and the Minerals Council of Australia has received broad
international acceptance.
Lec 17 – page 1
Reserve Definitions
• A Mineral Reserve is the economically mineable part of a Measured or Indicated
Mineral Resource demonstrated by at least a Preliminary Feasibility Study. This
Study must include adequate information on mining, processing, metallurgical,
economic and other relevant factors that demonstrate, at the time of reporting,
that economic extraction can be justified. A Mineral Reserve includes diluting
materials and allowances for losses that may occur when the material is mined.
– A ‘Proven Mineral Reserve’ is the economically mineable part of a Measured Mineral
Resource demonstrated by at least a Preliminary Feasibility Study. This Study must
include adequate information on mining, processing, metallurgical, economic, and
other relevant factors that demonstrate, at the time of reporting, that economic
extraction is justified.
– A ‘Probable Mineral Reserve’ is the economically mineable part of an Indicated, and in
some circumstances a Measured Mineral Resource demonstrated by at least a
Preliminary Feasibility Study. This Study must include adequate information on mining,
processing, metallurgical, economic, and other relevant factors that demonstrate, at
the time of reporting, that economic extraction can be justified.
Historical Approaches
Lec 17 – page 2
Geometric Measures
(Equations: geometric classification measures relating drillhole spacing d, search radius r_s, and the number and length of drillholes n and L per 10 000 m² area)
Lec 17 – page 3
Uncertainty
• Geometric methods for classification are understandable, but do not
give an actual measure of uncertainty or risk.
• Professionals in ore reserve estimation increasingly want to quantify
the uncertainty/risk in their estimates.
• The JORC code:
Application to Mining
• Volume:
– Large variability at small scale
– Variability reduces at large scale
– Volume translates to “time”
• Geological continuity:
– Greater continuity – less variability
– Continuity depends on facies/rock type
• Data configuration
– Drilling is not uniform
– Areas of greater drillhole density will
have less uncertainty
• High/low grades
– Variability/uncertainty often depends
on grade level
– High grades can be more variable
Lec 17 – page 4
Classification of Resources
and Reserves
• Three aspects of probabilistic classification:
1. Volume
2. Measure of “+/-” uncertainty
3. Probability to be within the “+/-” measure of uncertainty
• Format for uncertainty reporting is clear and understandable, e.g.:
– monthly production volumes where the true grade will be within 15% of the
predicted grade 90% of the time are defined as measured
– quarterly production volumes where the true grade will be within 15% of
the predicted grade 90% of the time are defined as indicated
• Drillhole density and other geometric measures are understandable,
but do not really tell us the uncertainty or risk associated with a
prediction
• There are no established rules or guidelines to decide on these three
parameters; that remains in the hands of the Qualified Person
1. Volume
• “Volumes relevant to technical and economic evaluation”
• Consider a regular grid of volumes; classify central point by specified
volume around it rather than complex volume within a mine plan that
could change
Lec 17 – page 5
2. Measure of “+/-” uncertainty
• Analytical procedures can achieve a very tight tolerance:
– 103 grams at most
– the entire sample is accessible for measurement
– independent repeat measurements reduce uncertainty 2=2s/n
• Mineral deposits are different:
– 10¹² grams or more
– less than 1 billionth of the sample available for analysis
– data are correlated because of the geological setting
– must predict relatively small local volumes between available data
• Practical guidelines for +/- uncertainty:
– 5 % lower limit – very continuous orebody
– 25% upper limit – prediction quite poor indeed with more than 25% error
– 15% is a suitable compromise and is becoming accepted
Lec 17 – page 6
Schematic Illustration
(Schematic: 2. predicted grade z* and chosen measure of uncertainty ±15%; 3. probability to be within the measure of uncertainty)
Uncertainty
• Predictions (z* values) can be from geostatistical or more traditional methods
• Geostatistical procedures are used to construct probability distributions of
uncertainty:
– parameters vary locally and within facies
– there are a number of different techniques including simulation
• Uncertainty is predicted at different scales (small blocks, approx. monthly
volumes, approx. quarterly volumes,…)
• Distributions of uncertainty can be checked:
– Predict uncertainty at locations where we have drillholes
– Construct different probability intervals (e.g., 10, 50, and 90%)
– Count the number of times the true value is in these intervals
– Should be in the interval the correct percentage (10, 50, and 90%)
• Geostatistical procedures have been shown to reliably predict the uncertainty
due to incomplete sampling
Lec 17 – page 7
Concerns with Probabilistic Approach
• There are four main concerns:
1. Uncertainty is model-dependent and stationarity-dependent.
Uncertainty can be changed dramatically by minor changes to these
decisions.
2. Many parameters affect the distributions of uncertainty in a non-
intuitive and non-transparent manner.
3. Uncertainty in the histogram and
spatial continuity parameters is not
commonly considered, but can have
a large effect on large mining-scale
uncertainty.
4. Choosing the parameters of uncertainty
for classification cannot be universal
and is very deposit specific
• These concerns are serious enough that a geometric-based
classification scheme that is backed up by uncertainty may be the
most robust
Lec 17 – page 8
More Drilling and a Change in Modeling
Lec 17 – page 9
More on Example of the Importance of
Parameters
• Realizations and
probability to be
within 50% of
expected
• Significant change
in probabilities (see
below)
Discussion
• Geometric criteria are easy to understand and are recommended in most
cases
• Three aspects of probabilistic classification:
1. Volume (monthly and quarterly)
2. Measure of “+/-” uncertainty (15%)
3. Probability to be within the “+/-” measure of uncertainty (80%)
• Quantitative uncertainty should be used to support choice of geometric
criteria
• There will always be a need for Qualified Persons to render expert
opinions in different areas related to geological interpretation, sampling,
and estimation procedures.
• The rather loose definitions in current usage will be tightened up by
lending probabilistic meaning to classification that is easily understood
by technical staff, management, and the investment community.
Lec 17 – page 10
Geostatistics for Mineral Resources Uncertainty Assessment
• Remarks
• Olympic Dam
Lec 18 – page 1
Prediction Model Steps
• Following Boisvert et al. (2013), the first section of the Olympic Dam
project requires predicting plant performance from measured variables:
1. Remove unimportant and redundant variables.
2. Quantile to quantile univariate transformation to a Gaussian distribution.
3. Merge the variables (level 1). This step reduces the 112 input variables to
23 merged variables.
4. Merge the variables (level 2). This step reduces the 23 merged variables to
4.
5. Regression on the 4 variables and prediction of the plant performance
variables: DWi, BMWi, Cu recovery, U3O8 recovery, acid consumption, and
net recovery
6. Back transform the estimated variables.
7. Determine uncertainty in the model.
• Geostatistical simulation is performed to spatially model inputs to the
plant prediction model
Boisvert, J.B., Rossi, M.E., Ehrig, K., Deutsch, C.V. (2013) Geometallurgical Modeling at Olympic Dam Mine, South Australia, Math Geosci, 45: 901-925
Amalgamation
• Given the 804 observations, using all 204 variables in a regression
model would lead to overfitting
– Redundant and unimportant variables are identified, reducing the number to
112
– Amalgamation steps are used to further condense the variables to 4
subcategories
• A linear model based on 4 amalgamated variables provides a robust
predictive model that is used for predicting recovery and plant
performance
Lec 18 – page 2
Amalgamation (2)
LEVEL 1: Reduce 112 secondary variables to 23 variables (16 merged variables + 7 variables retained)
– Merged 1-3: Head Grade Variables
– Merged 4-6: Mineralogy
– Merged 7-16: Association Variables
– Retained: Cu, Ag, Au, S, U3O8, SG, ratios
Retained Merged_1 Merged_2 Merged_3 Merged_4 Merged_5 Merged_6 Merged_7 Merged_8 Merged_9
Cu(wt%) Co(ppm) Ba(wt%) La(wt%) Uran_Wt% Chal_Wt% Sul_Wt% Bran_Pyr_assoc Cof_Bran_assoc Uran_Cof_assoc
U3O8(ppm) Mo(ppm) Fe(wt%) Mg(wt%) Cof_Wt% Born_Wt% A_Sol_Wt% Bran_Chalcopy_assoc Cof_Uran_assoc Uran_Chalcopy_assoc
SG Pb(ppm) Al(wt%) Mn(wt%) Bran_Wt% Chal_Wt% A_Insol_Wt% Bran_Bornite_assoc Cof_Pyr_assoc Uran_Bornite_assoc
K:Al Zn(ppm) Si(wt%) Na(wt%) Pyr_Wt% Bran_Chalcocite_assoc Cof_Chalcopy_assoc Uran_A_Sol_assoc
Ag(ppm) K(wt%) P(wt%) Bran_A_Sol_assoc Cof_Chalcocite_assoc Uran_A_Insol_assoc
Au(ppm) Ca(wt%) Ti(wt%) Bran_A_Insol_assoc Cof_Sulphides_assoc
Badj%S S(wt%) Ce(wt%) Bran_Free_Surf_assoc Cof_A_Sol_assoc
CO2(wt%) Cof_A_Insol_assoc
F(wt%) Cof_Free_Surf_assoc
Merged_10 Merged_11 Merged_12 Merged_13 Merged_14 Merged_15 Merged_16
Pyr_Cof_assoc Chalcopy_Bran_assoc Bornite_Cof_assoc Chalcocite_Chalcopy_assoc Sulphides_Uran_assoc A_Sol_Bran_assoc A_Insol_Bran_assoc
Pyr_Uran_assoc Chalcopy_Cof_assoc Bornite_Pyr_assoc Chalcocite_Bornite_assoc Sulphides_Pyr_assoc A_Sol_Cof_assoc A_Insol_Cof_assoc
Pyr_Chalcopy_assoc Chalcopy_Uran_assoc Bornite_Chalcopy_assoc Chalcocite_Sulphides_assoc Sulphides_Chalcopy_assoc A_Sol_Uran_assoc A_Insol_Uran_assoc
Pyr_Sulphides_assoc Chalcopy_Pyr_assoc Bornite_Chalcocite_assoc Chalcocite_A_Sol_assoc Sulphides_Bornite_assoc A_Sol_Pyr_assoc A_Insol_Chalcopy_assoc
Pyr_A_Sol_assoc Chalcopy_Bornite_assoc Bornite_Sulphides_assoc Chalcocite_A_Insol_assoc Sulphides_A_Sol_assoc A_Sol_Chalcopy_assoc A_Insol_Bornite_assoc
Pyr_Free_Surf_assoc Chalcopy_Chalcocite_assoc Bornite_A_Sol_assoc Chalcocite_Free_Surf_assoc Sulphides_A_Insol_assoc A_Sol_Bornite_assoc A_Insol_Sulphides_assoc
Chalcopy_Sulphides_assoc Bornite_A_Insol_assoc A_Sol_Chalcocite_assoc A_Insol_A_Sol_assoc
Chalcopy_A_Sol_assoc Bornite_Free_Surf_assoc A_Sol_Sulphides_assoc A_Insol_Free_Surf_assoc
Chalcopy_A_Insol_assoc A_Sol_A_Insol_assoc
Chalcopy_Free_Surf_assoc A_Sol_Free_Surf_assoc
LEVEL 2: Reduce to 4 final variables (how to merge variables):
– A: Cu(wt%), U3O8(ppm), SG, K:Al, Ag(ppm), Au(ppm), Badj%S (the retained variables)
– B: Merged_1, Merged_2, Merged_3
– C: Merged_4, Merged_5, Merged_6
– D: Merged_7 to Merged_16
Response Surface
• Fit a response surface to six performance (metallurgical) variables
Variable A contains individual variables retained.
Variable B contains the remainder of the head assays.
Variable C contains all mineralogy variables.
Variable D contains all association variables.
Lec 18 – page 3
Sensitivity Analysis
Rule Induction
• A rule is an association of 2 or more variables with a plant performance variable; induced rules are listed with their two reported values (the first is R²):
SG + Chalcopy_Wt% + Na = DWi (0.93, 96)
Na + Fe = DWi (0.91, 110)
Na + Fe + Bran_Pyr = DWi (0.91, 110)
Na + Fe + Sulph_Born = DWi (0.91, 110)
Na + Fe + Chalci_Free = DWi (0.91, 110)
Na + Fe + Sulph_Uran = DWi (0.91, 110)
Ti + A_Sol_Chalcopy + A_Sol_A_Insol = BMWi (0.90, 73)
Si + Chalcopy_Free + Uran_Wt% = BMWi (0.91, 66)
Si + Chalcopy_Free + A_Sol_Uran = BMWi (0.92, 66)
Chalcopy_Free + A_Sol_Uran + SG = BMWi (0.92, 66)
Ti + Born_A_Sol + Mn = BMWi (0.91, 65)
P + Cof_Uran + Ti = BMWi (0.91, 75)
Cof_Free + Sulph_A_Insol + Ti = BMWi (0.90, 71)
P + Chalcopy_Pyr + Ti = BMWi (0.91, 63)
Lec 18 – page 4
Spatial Modeling of Variables
• 112 variables must be spatially mapped to utilize the developed models
• There are two main difficulties:
– the compositional nature of the variables must be accounted for
– many of the variables are correlated and require methodologies that can be
applied to a large number of variables but are effective in reproducing the
multivariate relationships.
• 27M cell model creates computational challenges for 112 variables
Lec 18 – page 5
Head Grade Simulation
• Head grades are simulated using logratios and PCA
– Updated workflow would likely use PPMT and/or MAF
(Figures: data versus realization maps and cross plots of the simulated head grades)
Lec 18 – page 6
Spatial Prediction of Plant Performance
• Maps of Copper (left) and Uranium (right) recovery
Lec 18 – page 7