
Unit 2A Error Analysis and Statistics

WAYS TO TEST FOR SAMPLES

Uploaded by

hbkgosana
Copyright
© © All Rights Reserved
Error Analysis and Statistics

Unit 2A
Outline
1.1 Random and systematic errors

1.2 Gaussian distribution

1.3 Mean value and standard deviation

1.4 Variance and other measures of precision

1.5 Confidence intervals

1.6 Significance testing

1.7 Comparison of standard deviation (F-test)

1.8 Testing for outliers


Error Analysis and Statistics
• No measurement can be conducted in a chemical analysis without some kind of
associated uncertainty or error.
• Because error is an inevitable part of conducting analyses, we must try to
minimize it (by employing appropriate sampling techniques, using our apparatus
correctly and calibrating our instruments) and also estimate the size of the
errors present in our analyses in order to assess their significance.
• The following section will therefore summarize the principal statistical techniques
used to evaluate errors occurring in analytical chemistry that you should know and
be able to apply.
• The associated reference materials used to compile this section are Chapters 3 & 4
(Harris & Lucy, 2020) and a paper by Miller & Miller (1988) titled Basic Statistical
Methods for Analytical Chemistry.
1.1 Random and systematic errors
• Two types of error affect chemical analyses: systematic errors and random
errors.
• Random errors (or indeterminate errors) arise from uncontrolled variables
present during the measurement and cannot be eliminated but only reduced by
means of a better experimental design.
• An example of a random error is the fluctuation in an air pressure reading
caused by a change in airflow when someone opens or closes a door in the lab
while you are taking the reading.
• They affect each observation independently and unpredictably.
• Random errors are uncovered when replicate measurements are performed and
they affect what we call the precision or repeatability of the results.
1.1 Random and systematic errors
• Systematic errors (or determinate errors) are a result of flaws in the equipment or
design of the experiment and cause replicate measurements to deviate from the
true value (accepted or correct value for the measured quantity).
• In this case, the error will be reproduced in the same way every time you repeat
the experiment or measurement.
• An example of a systematic error introduced in an experiment would be when a
pH meter that wasn't properly calibrated is used to record pH changes of a
solution.
• Systematic errors affect the accuracy of results.
• They may be additive (constant) or multiplicative (proportional).
• Poor calibration of instruments is a common source of systematic errors.
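The difference between random error, additive systematic error and multiplicative systematic error can be sketched with a small simulation. This is only an illustration: the true pH of 7.00, the offset of 0.20, the scale factor of 1.02 and the noise level of 0.05 are made-up numbers, and the function and parameter names are hypothetical.

```python
import random
import statistics

def simulate(true_value, n, offset=0.0, scale=1.0, noise_sd=0.0, seed=0):
    """Return n simulated readings of true_value.

    offset   -> additive (constant) systematic error
    scale    -> multiplicative (proportional) systematic error
    noise_sd -> standard deviation of the random error
    All names and values here are hypothetical illustration choices.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [scale * true_value + offset + rng.gauss(0.0, noise_sd)
            for _ in range(n)]

true_ph = 7.00
random_only = simulate(true_ph, 1000, noise_sd=0.05)
additive = simulate(true_ph, 1000, offset=0.20, noise_sd=0.05)
proportional = simulate(true_ph, 1000, scale=1.02, noise_sd=0.05)

for label, data in [("random only", random_only),
                    ("additive", additive),
                    ("proportional", proportional)]:
    # The mean reflects accuracy, the standard deviation reflects precision.
    print(f"{label:12s} mean = {statistics.mean(data):.3f}  "
          f"s = {statistics.stdev(data):.3f}")
```

In this sketch the systematic errors shift the mean away from the true value while leaving the spread unchanged, whereas the random error sets the spread but averages out over many replicates.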
1.1 Random and systematic errors
• Consider the picture below of a hypothetical dart board representing the spread
of measurements collected during different analyses (the centre circle represents
the true value). Which picture represents collected data that is accurate and
precise; accurate but not precise; precise but not accurate; and neither accurate
nor precise?

[Figure: four dart boards, labelled A to D]

[Answer: A represents an analysis with measurements that were neither accurate nor precise; B represents
a set of data that is fairly accurate but not precise; C illustrates a set of data that is accurate and precise
whilst D describes a situation where the measurements collected were precise but not accurate.]
Important terms

• Replicate measurements: Samples of the same size that are analysed in exactly the
same way. In any testing protocol, a critical decision is how many measurements to
make/items to test.
• Precision: Describes the internal agreement between results that were obtained in
the same way (how close or similar are the values collected). It is estimated by
evaluating standard deviations or confidence limits.
• Accuracy: A measure of how close the results are to the true or accepted value
for the measured quantity. It is estimated by comparing results to those obtained
using other methods and other laboratories, or through the use of standard
reference materials (SRMs). Estimates are available for standard methods
(ISO, ASTM, DIN, BS, SANS).
• True value: A theoretical value referring to the measured quantity without any
error.
1.2 Gaussian distribution

• As mentioned previously, experimental measurements always contain some amount
of error, and conclusions from results can never be drawn with absolute certainty.
• However, statistics can help us accept or
reject certain conclusions based on the
probability of them being correct.
• The next few slides will take a look at which
parameters are commonly used in analytical
chemistry to assess the results we obtain
from replicate measurements.
1.2 Gaussian distribution
• It is important to note that all of these parameters and tests rely on an
assumption about the distribution of the measurements collected during an analysis.
• When an experiment is repeated many times and only random errors are present in
the measurements, the results tend to cluster symmetrically about an average value
(this is observed experimentally and is supported by the Central Limit Theorem).
• The more times an experiment is repeated, the closer the shape of the
distribution of measurements comes to the bell-shaped curve we call a Gaussian
distribution.
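The clustering described above can be illustrated with a short simulation. This is a sketch under stated assumptions, not part of the original slides: each reading is taken to be a hypothetical true value of 10.0 disturbed by 12 small, independent uniform errors (an arbitrary choice), so that the Central Limit Theorem applies. For a Gaussian distribution, roughly 68% of values fall within one standard deviation of the mean and roughly 95% within two.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_VALUE = 10.0         # hypothetical true value

def one_reading():
    """True value plus 12 small, independent random disturbances."""
    return TRUE_VALUE + sum(rng.uniform(-0.1, 0.1) for _ in range(12))

def fraction_within(data, k):
    """Fraction of readings within k sample standard deviations of the mean."""
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)

results = {}
for n in (5, 50, 5000):
    data = [one_reading() for _ in range(n)]
    results[n] = data
    # Print: number of replicates, mean, fraction within 1s, fraction within 2s.
    print(n,
          round(statistics.mean(data), 3),
          round(fraction_within(data, 1), 2),
          round(fraction_within(data, 2), 2))
```

With only a handful of replicates the fractions are erratic; with thousands they settle close to the Gaussian values, which is why small data sets are treated as samples of a larger population.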
1.2 Gaussian distribution
• Chemical analyses very rarely make use of large numbers of measurements,
though, and rely more often on smaller data sets.
• However, these smaller sets of results are still useful and can help us estimate
properties of a hypothetical larger set.
• In a sense you can come to think of it as the smaller data set being a sample of
the larger data set (also called the population data).
• This distinction will be important to keep in mind for the tests and parameters
we are about to look at.

Important terms
• Population data set: A large set of data containing all possible data values.
• Sample data set: A smaller set of data which is part of or a subset of the
population data set.
1.3 Mean value and standard deviation
• Chemists typically perform between two and five replicates of a chemical
measurement using the same analytical procedure.
• As already mentioned, when only a small number of measurements are collected,
we refer to this data set as a sample of the larger population data set.
• The values obtained from a small sample data set are rarely identical, but if a
Gaussian (normal) distribution of results is assumed, a mean value, x̄ (an average),
can be calculated that estimates the true value of the measured quantity (the
central value of the distribution, around which the other values are spread
symmetrically, provided systematic errors are absent).
1.3 Mean value and standard deviation
• A mean can be calculated for the sample of data (x̄) and the population data (μ).
• Going forward we will focus on mostly evaluating small sample sizes of data as this
is what we will typically work with in chemical analyses.
• To calculate the sample mean we begin by taking the sum of all the measurements
and dividing that total by N, the number of measurements as given by the
equation below (note that in some texts, the N value is not given as the uppercase
letter but as the lowercase letter, n).

$$ \bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} $$
1.3 Mean value and standard deviation
• The median is the middle result or value from a set of measurements that have
been arranged in ascending order (smallest to largest) e.g. for values 2, 3, 6, 8, 13,
the median value would be 6.
• The standard deviation, s, measures the spread of repeated measurements
(replicates) in a sample data set i.e. how clustered they are around the mean.
• The smaller the standard deviation, the more closely the data points are clustered
around the mean.
• This value can be calculated by using the following equation (standard deviation
can also be computed on a calculator using automated functions).

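$$ s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}} = \sqrt{\frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}}{N-1}} = \sqrt{\frac{\sum x_i^2 - N\bar{x}^2}{N-1}} $$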

• Note that the (xᵢ − x̄) term represents the deviation of each measured value
(xᵢ) from the mean.
• The N − 1 term is the number of degrees of freedom; using it in place of N gives
a sample standard deviation that more closely estimates the population standard
deviation (σ).
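The mean and the standard deviation equation above can be sketched in Python as follows. The function names are hypothetical, and the data are the five values used in the median example (2, 3, 6, 8, 13). Both the definition form and the sum-of-squares shortcut form are included so that their agreement can be checked.

```python
import math

def sample_mean(values):
    """Sum of the measurements divided by N."""
    return sum(values) / len(values)

def sample_std(values):
    """Definition form: sqrt(sum((x_i - mean)^2) / (N - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    m = sample_mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))

def sample_std_shortcut(values):
    """Shortcut form: sqrt((sum(x_i^2) - (sum(x_i))^2 / N) / (N - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

values = [2, 3, 6, 8, 13]
print(sample_mean(values))
print(sample_std(values))
print(sample_std_shortcut(values))
```

The shortcut form is convenient on a calculator, but it subtracts two large, nearly equal numbers, so in floating-point arithmetic the definition form is usually the safer choice when the spread is tiny compared with the mean.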
1.3 Mean value and standard deviation
Example: The ratio of the number of atoms of the isotopes 35Cl and 37Cl was measured in
eight different samples to help improve the reported atomic mass of chlorine on the periodic
table. Below you will find a summary of the measured ratios. Find the mean, median and
standard deviation for the set of data.

Sample   35Cl/37Cl ratio     Sample   35Cl/37Cl ratio
1        3.167               5        3.169
2        3.165               6        3.166
3        3.168               7        3.163
4        3.164               8        3.162

See also the example on p.69 of Harris & Lucy (2020).
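As a way to check a hand calculation for this example, the eight ratios can be run through Python's standard `statistics` module. This is a sketch rather than the worked solution from the course; `statistics.stdev` uses N − 1 in the denominator, matching the sample standard deviation defined earlier.

```python
import statistics

# 35Cl/37Cl ratios for samples 1 to 8, taken from the table above
ratios = [3.167, 3.165, 3.168, 3.164, 3.169, 3.166, 3.163, 3.162]

mean = statistics.mean(ratios)
median = statistics.median(ratios)   # N is even: average of the two middle values
s = statistics.stdev(ratios)         # sample standard deviation (N - 1)

print(f"mean   = {mean:.4f}")
print(f"median = {median:.4f}")
print(f"s      = {s:.4f}")
```

For this data set the mean and median both come out at 3.1655 and the standard deviation at about 0.0024.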


1.3 Mean value and standard deviation
Practice Problem
Find the mean, median and standard deviation for the following set of measurement
values: 822, 856, 785, 796, 786.

[Answer: mean (x̄) = 809; median = 796; s = 30]


Important terms

• Mean: The mean is the 'average' of all the measurements and is found by adding all
the values together and dividing by the number of measurements. It indicates what is
considered as the most representative value for a data set of measurements.
• Median: The 'middle' number, the measurement that is centrally listed in the data
when arranged from lowest to highest value. If two values are in the middle (i.e., N is
even), then the median is calculated by adding the values together and dividing by two
(calculating the average of the two values).
• Standard deviation: This acts as a measure of how close replicate measurements are
to one another in either a sample data set or the population data set i.e. how clustered
measurements are around the mean given a normal distribution of results.
• Degrees of freedom: Defined as the number of members in a statistical sample that
provide an independent measure of the precision for a data set.
1.4 Variance and other measures of precision
• The sample standard deviation is the most common way to report the precision of
an experiment, but there are several other measures used in analytical chemistry
that you should be familiar with.
• Variance is simply the square of the standard deviation as shown by the equation
below.
$$ s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1} = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}}{N-1} = \frac{\sum x_i^2 - N\bar{x}^2}{N-1} $$

• It is a measure of the dispersion among a set of data values.


• The higher the variance value calculated, the more scattered the measured values
in the analysis.
• Relative standard deviation (RSD) is calculated by simply dividing the standard
deviation by the mean value.
• This value is sometimes expressed in parts per thousand (by multiplying by 1000).

$$ \mathrm{RSD} = s_r = \frac{s}{\bar{x}} $$
1.4 Variance and other measures of precision
• The relative standard deviation when multiplied by 100% is called the coefficient
of variation (CV).

$$ \mathrm{CV} = \frac{s}{\bar{x}} \times 100\,\% $$

• The standard deviation of the mean is given by

$$ s_{\bar{x}} = \frac{s}{\sqrt{N}} $$

• The spread or range (w) of the sample data is the calculated difference between
the largest value measured in the data set and the smallest, e.g., for values 2.3,
2.5, 2.8, 2.6, the spread will be 2.8 - 2.3 = 0.5.
1.4 Variance and other measures of precision
• The ratio of the number of atoms of the isotopes 35Cl and 37Cl was measured
in eight different samples to help improve the reported atomic mass of chlorine on
the periodic table. Below you will find a summary of the measured ratios. Using
the calculated mean and standard deviation, find the variance, relative standard
deviation, coefficient of variation and spread for the data set.

Sample   35Cl/37Cl ratio     Sample   35Cl/37Cl ratio
1        3.167               5        3.169
2        3.165               6        3.166
3        3.168               7        3.163
4        3.164               8        3.162
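The measures of precision in this section can be computed for the chlorine data with a few lines of Python. This is a sketch for checking hand calculations, with hypothetical variable names. The standard deviation of the mean is not asked for in the exercise; it is included for completeness and is taken in its standard form, s/√N.

```python
import math
import statistics

# 35Cl/37Cl ratios for samples 1 to 8, taken from the table above
ratios = [3.167, 3.165, 3.168, 3.164, 3.169, 3.166, 3.163, 3.162]

n = len(ratios)
mean = statistics.mean(ratios)
s = statistics.stdev(ratios)          # sample standard deviation (N - 1)

variance = s ** 2                     # square of the standard deviation
rsd = s / mean                        # relative standard deviation
rsd_ppt = rsd * 1000                  # RSD in parts per thousand
cv_percent = rsd * 100                # coefficient of variation, in %
sem = s / math.sqrt(n)                # standard deviation of the mean
spread = max(ratios) - min(ratios)    # range, w

print(f"variance = {variance:.2e}")
print(f"RSD      = {rsd:.2e} ({rsd_ppt:.2f} ppt)")
print(f"CV       = {cv_percent:.3f} %")
print(f"s_mean   = {sem:.2e}")
print(f"spread   = {spread:.3f}")
```

Because RSD and CV are dimensionless, they allow the precision of data sets with different means or units to be compared directly, which the variance and the spread do not.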
