Annotated
• Rounding a number
• Error propagation
Rounding
• Numbers with infinite binary digits cannot be stored exactly in computers.
• Example: $\frac{1}{10}$ in base 10
• sum of powers of 2: $\frac{1}{10} = 2^{-4} + 2^{-5} + 2^{-8} + 2^{-9} + 2^{-12} + 2^{-13} + \cdots$
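Right-hand side
$= (2^{-4} + 2^{-5}) + 2^{-4}(2^{-4} + 2^{-5}) + 2^{-8}(2^{-4} + 2^{-5}) + 2^{-12}(2^{-4} + 2^{-5}) + \cdots$
$= (2^{-4} + 2^{-5})(1 + 2^{-4} + 2^{-8} + 2^{-12} + \cdots)$
$= (2^{-4} + 2^{-5}) \times \frac{1}{1 - 2^{-4}} = \frac{3}{32} \times \frac{16}{15} = \frac{1}{10}$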
• Base 2: 0.000110011001100… (1100 repeats)
• x → round(x)
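The expansion of 1/10 can be checked with exact rational arithmetic. The sketch below is only an illustration: the helper `partial_sum` is a made-up name, and Python's `fractions` module stands in for pencil-and-paper arithmetic. It shows that the partial sums approach 1/10 without ever reaching it, and that the double stored for `0.1` is therefore not exactly 1/10.

```python
from fractions import Fraction

def partial_sum(n_terms):
    """Sum the first n_terms groups (2^-4 + 2^-5) * 2^(-4k), exactly."""
    head = Fraction(1, 2**4) + Fraction(1, 2**5)
    return sum(head * Fraction(1, 2**(4 * k)) for k in range(n_terms))

tenth = Fraction(1, 10)

# Gap between 1/10 and the partial sums: shrinks by a factor 16 per group,
# but never becomes zero.
gaps = [tenth - partial_sum(n) for n in (1, 5, 10)]

# The double nearest to 1/10, as an exact fraction and as hex digits.
stored = Fraction(0.1)
stored_hex = (0.1).hex()

print([float(g) for g in gaps])
print(stored == tenth)
print(stored_hex)
```

The hex form shows the repeating pattern 1001 (hex digit 9) cut off after 52 bits, with the last digit rounded up.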
Rounding modes in IEEE standard
• Modes: round down, round up, round to nearest (illustrated for a format that can only store 1 digit)
• Default: round to nearest
• Example: 1/10 in base 2: $1.10011001100\ldots_2 \times 2^{-4}$ (1100 repeats)
• IEEE double precision: 52 bits for significand (“normalized”); with the implicit leading 1, 53 significant bits
• $|\mathrm{round}(x) - x|$ is less than $2^{-52} \times 2^{e}$ for any rounding mode, where $e$ is the exponent of $x$ ($e = -4$ for 1/10)
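The rounding modes can be tried out on a toy format that keeps a single digit after the decimal point. This is a sketch under assumptions: Python's `decimal` module works in base 10, not in IEEE binary arithmetic, and `round_one_digit` is a hypothetical helper. `ROUND_HALF_EVEN` plays the role of the default round to nearest.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN

def round_one_digit(value, mode):
    """Round a decimal string to one digit after the point with the given mode."""
    return Decimal(value).quantize(Decimal("0.1"), rounding=mode)

x = "0.17"
down = round_one_digit(x, ROUND_FLOOR)         # round down
up = round_one_digit(x, ROUND_CEILING)         # round up
nearest = round_one_digit(x, ROUND_HALF_EVEN)  # round to nearest

print(down, up, nearest)
```

Whatever the mode, the result is one of the two representable neighbours of x, so the error is below the spacing between them.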
Error
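• Absolute error: $|\mathrm{round}(x) - x|$
• In IEEE double precision, if $x = \pm(1.a_1a_2\ldots a_{52}a_{53}\ldots)_2 \times 2^{e}$ where $-1022 \le e \le 1023$, then $\mathrm{round}(x)$ is either $\pm(1.a_1a_2\ldots a_{52})_2 \times 2^{e}$ or $\pm(1.a_1a_2\ldots a_{52} + 0.00\ldots1)_2 \times 2^{e}$
• $|\mathrm{round}(x) - x|$ is less than $2^{-52} \times 2^{e}$ for any rounding mode
• $2^{-52}$ is the machine epsilon for IEEE double precision (the smallest $\epsilon > 0$ such that $1 + \epsilon$ is representable)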
• For round to nearest, $|\mathrm{round}(x) - x|$ is at most $2^{-53} \times 2^{e}$ (equality only when $x$ lies exactly halfway between two doubles)
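The absolute error bounds can be checked for x = 1/10 in exact arithmetic. A minimal sketch, assuming CPython (where dividing two integers gives the correctly rounded double) and a hypothetical helper `abs_rounding_error`; the exponent is read off the rounded value, which matches the exponent of x for this example.

```python
import math
import sys
from fractions import Fraction

def abs_rounding_error(numerator, denominator):
    """Return (|round(x) - x|, e) for x = numerator/denominator, computed exactly."""
    x = Fraction(numerator, denominator)
    rounded = numerator / denominator   # double nearest to x (round to nearest)
    _, exp = math.frexp(rounded)        # rounded = m * 2**exp with 0.5 <= |m| < 1
    e = exp - 1                         # exponent in the 1.a1a2... * 2**e convention
    return abs(Fraction(rounded) - x), e

err, e = abs_rounding_error(1, 10)
bound_any_mode = Fraction(2) ** (-52 + e)   # 2^-52 * 2^e
bound_nearest = Fraction(2) ** (-53 + e)    # 2^-53 * 2^e

print(e, float(err), float(bound_nearest), float(bound_any_mode))

# Machine epsilon: 1 + 2^-52 is the next double after 1, while 1 + 2^-53 rounds back to 1.
print(sys.float_info.epsilon == 2.0**-52, 1.0 + 2.0**-53 == 1.0)
```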
Error
• Relative error: $|\mathrm{round}(x) - x| / |x|$
• We already know $|\mathrm{round}(x) - x| < 2^{-52} \times 2^{e}$
• Note $x = \pm a \times 2^{e}$, where $1 \le a < 2$
• Thus, $\frac{|\mathrm{round}(x) - x|}{|x|} \le \frac{2^{-52} \times 2^{e}}{2^{e}} = 2^{-52}$
• For round to nearest, $|\mathrm{round}(x) - x| / |x|$ is less than $2^{-53}$
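The relative error bound for round to nearest can be spot-checked on a few fractions. This is an illustrative sketch only: the sample values and the helper `rel_rounding_error` are made up, and it relies on CPython's integer division producing the correctly rounded double.

```python
from fractions import Fraction

def rel_rounding_error(numerator, denominator):
    """Exact relative error |round(x) - x| / |x| for x = numerator/denominator."""
    x = Fraction(numerator, denominator)
    rounded = Fraction(numerator / denominator)  # exact value of the stored double
    return abs(rounded - x) / abs(x)

samples = [(1, 10), (1, 3), (2, 3), (22, 7), (1, 1000)]
worst = max(rel_rounding_error(n, d) for n, d in samples)
unit_roundoff = Fraction(1, 2**53)

print(float(worst), float(unit_roundoff))
```

The bound does not depend on the exponent e, which is why relative error is the natural measure for floating-point rounding.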
Error propagation
• Input errors -> calculations -> output errors
• Calculations:
• ω: operations +, −, ×, ÷
Error propagation in multiplication
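A short derivation sketches why multiplication is benign. It assumes the convention $\mathrm{Rel}(x_A) = (x_T - x_A)/x_T$, where $x_T$ is the true value and $x_A$ the approximation; that definition is an assumption here, not quoted from the slides.

```latex
x_A = x_T\,\bigl(1 - \mathrm{Rel}(x_A)\bigr), \qquad
y_A = y_T\,\bigl(1 - \mathrm{Rel}(y_A)\bigr)

\mathrm{Rel}(x_A y_A)
  = \frac{x_T y_T - x_A y_A}{x_T y_T}
  = 1 - \bigl(1 - \mathrm{Rel}(x_A)\bigr)\bigl(1 - \mathrm{Rel}(y_A)\bigr)
  = \mathrm{Rel}(x_A) + \mathrm{Rel}(y_A) - \mathrm{Rel}(x_A)\,\mathrm{Rel}(y_A)
```

When both relative errors are small, the product term is negligible, so the relative error of the product is roughly the sum of the two input relative errors.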
Error propagation in addition and subtraction
• Relative errors do not always stay small under addition and subtraction. Even if Rel($x_A$) and Rel($y_A$) are close to 0, Rel($x_A + y_A$) or Rel($x_A - y_A$) can be far from 0. Undesirable!
• Error of $x_A - y_A$ = −0.0004814…
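A small made-up example shows the loss of accuracy in subtraction. The numbers below are hypothetical and chosen for easy arithmetic, and `rel` assumes the convention Rel = (true − approximate)/true. Exact fractions are used so that no floating-point rounding interferes.

```python
from fractions import Fraction

def rel(true_value, approx_value):
    """Relative error (x_T - x_A) / x_T."""
    return (true_value - approx_value) / true_value

# Hypothetical data: x is known to about 4 digits, y is known exactly.
x_t, x_a = Fraction("10.001"), Fraction("10.002")
y_t, y_a = Fraction("10.000"), Fraction("10.000")

rel_x = rel(x_t, x_a)                   # about -1e-4
rel_y = rel(y_t, y_a)                   # 0
rel_diff = rel(x_t - y_t, x_a - y_a)    # relative error of the difference
rel_sum = rel(x_t + y_t, x_a + y_a)     # relative error of the sum

print(float(rel_x), float(rel_y), float(rel_diff), float(rel_sum))
```

Here the inputs are accurate to about 0.01%, yet the difference is off by 100%. The sum is harmless in this example because both numbers are positive; adding numbers of opposite sign behaves like the subtraction.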
• Hard case: $f(x) = \log(x)$, $f(x) = 10x^7 + 8x^6 + 4x^5 + 19x^3 + 10x^2 + 1$
Function evaluation
• Assume f is continuous and has a derivative f′ around xA.
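• Mean value theorem: $f(b) - f(a) = f'(c)(b - a)$ for some $c$ between $a$ and $b$
• Take $b = x_T$, $a = x_A$: $f(x_T) - f(x_A) = f'(c)(x_T - x_A)$
• In general, $x_A \approx x_T \Rightarrow c \approx x_A \approx x_T$, so $f(x_T) - f(x_A) \approx f'(x_A)(x_T - x_A)$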
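The derivative-based estimate of the error can be compared with the actual error for f(x) = log(x). The values of `x_t` and `x_a` below are hypothetical, picked only to illustrate the approximation f(x_T) − f(x_A) ≈ f′(x_A)(x_T − x_A).

```python
import math

x_t = 2.0     # hypothetical true input
x_a = 2.001   # hypothetical approximate input

# Actual error in the function value.
actual = math.log(x_t) - math.log(x_a)

# Estimate from the derivative f'(x) = 1/x evaluated at x_a.
estimate = (1.0 / x_a) * (x_t - x_a)

print(actual, estimate, abs(actual - estimate))
```

The two values agree to roughly the square of the input error, as expected from a first-order approximation.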
Example 2.3.2 in [Atkinson-Han]
• Ideal gas law: $PV = nRT$. Assume $P = V = n = 1$. Then, $T = \frac{1}{R}$
• In the MKS measurement system (unit: $\mathrm{J \cdot K^{-1} \cdot mol^{-1}}$), $R = 8.3143 + \epsilon$, $|\epsilon| \le 0.0012$
• Goal: evaluate $T$
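One way to work this example is to apply the derivative estimate to f(R) = 1/R, with f′(R) = −1/R². The sketch below is not the textbook's worked solution; the variable names are illustrative. It computes the approximate T and compares the linearized error bound with the worst case over the allowed range of ϵ.

```python
R_A = 8.3143        # approximate gas constant from the slide
EPS_MAX = 0.0012    # bound on |epsilon| from the slide

T_A = 1.0 / R_A     # approximate value of T

# Linearized bound: |T_T - T_A| is about |f'(R_A)| * |epsilon| <= EPS_MAX / R_A^2.
linear_bound = EPS_MAX / R_A**2

# Worst case over the two extremes of the allowed interval for R.
worst_case = max(abs(1.0 / (R_A - EPS_MAX) - T_A),
                 abs(1.0 / (R_A + EPS_MAX) - T_A))

print(T_A, linear_bound, worst_case)
```

The linearized bound and the true worst case differ only slightly, which is what c ≈ x_A ≈ x_T predicts.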
Example 2.3.3 in [Atkinson-Han]
• $f(x) = b^{x}$, where $b$ is a positive constant.
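Applying the derivative estimate to f(x) = b^x, with f′(x) = (ln b) b^x, suggests Rel(b^{x_A}) ≈ x_T (ln b) Rel(x_A). The sketch below checks this numerically. The values of `b`, `x_t` and `x_a` are hypothetical, and Rel is assumed to mean (true − approximate)/true.

```python
import math

b = 10.0      # hypothetical base
x_t = 2.0     # hypothetical true exponent
x_a = 2.001   # hypothetical approximate exponent

rel_x = (x_t - x_a) / x_t

# Amplification factor suggested by the derivative estimate.
condition = x_t * math.log(b)
predicted = condition * rel_x

# Actual relative error in the function value.
actual = (b**x_t - b**x_a) / b**x_t

print(rel_x, predicted, actual)
```

With these numbers the relative error of the output is several times that of the input; the factor x_T ln b grows with both the exponent and the base.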