300 Float

Uploaded by

subhrojit.nandy.27105

Numerical Analysis

Grinshpan

Fixed-point and floating-point representations of numbers

A fixed-point representation of a number can be thought of as consisting of three parts: the sign
field, the integer field, and the fractional field. One way to store a number using a 32-bit format is
to reserve 1 bit for the sign, 15 bits for the integer part, and 16 bits for the fractional part.
A number whose representation exceeds 32 bits would have to be stored inexactly.

On a computer, 0 is used to represent + and 1 is used to represent −.

Example. The 32-bit string

1 | 000000000101011 | 1010000000000000
represents $(-101011.101)_2 = -43.625$.
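
The decoding in this example can be checked with a short Python sketch. The function name `decode_fixed` is hypothetical, and the only assumption is the 1 | 15 | 16 field layout described above.

```python
def decode_fixed(bits: str) -> float:
    """Decode a 32-bit string laid out as sign | 15-bit integer | 16-bit fraction."""
    bits = bits.replace("|", "").replace(" ", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 32 binary digits")
    sign = -1 if bits[0] == "1" else 1   # 0 stands for +, 1 stands for -
    magnitude = int(bits[1:], 2)         # integer and fractional fields read as one integer
    return sign * magnitude / 2**16      # the binary point sits 16 places from the right


print(decode_fixed("1 | 000000000101011 | 1010000000000000"))  # -43.625
```

Reading the two magnitude fields as a single integer and then dividing by 2^16 is just one convenient way to place the binary point.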

The fixed-point notation, although not without virtues, is usually inadequate for numerical
analysis, as its range and accuracy are both too limited.

Example. In the format just discussed, the largest number is

0 | 111111111111111 | 1111111111111111
or $(2^{15} - 1) + (1 - 2^{-16}) = 2^{15}(1 - 2^{-31}) \approx 32768$, and the smallest positive number is
0 | 000000000000000 | 0000000000000001
or $2^{-16} \approx 0.000015$. Note that $2^{-16}$ is precisely the gap between two adjacent fixed-point
numbers.
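
These extreme values can be verified exactly with rational arithmetic. The following is a minimal sketch using Python's `fractions` module; the constant names are illustrative and not part of the notes.

```python
from fractions import Fraction

INTEGER_BITS = 15   # width of the integer field
FRACTION_BITS = 16  # width of the fractional field

# Smallest positive number: only the last fractional bit is set.
gap = Fraction(1, 2**FRACTION_BITS)

# Largest number: all 31 magnitude bits are set.
largest = Fraction(2**(INTEGER_BITS + FRACTION_BITS) - 1, 2**FRACTION_BITS)

# Both closed forms given in the text agree with the bit pattern.
assert largest == (2**15 - 1) + (1 - gap)
assert largest == 2**15 * (1 - Fraction(1, 2**31))

print(largest)  # 2147483647/65536
print(gap)      # 1/65536
```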

The floating-point notation is far more flexible. Any $x \neq 0$ may be written in the form
$\pm(1.b_1 b_2 b_3 \ldots)_2 \times 2^n$,
called the normalized representation of x. The normalized representation is achieved by
choosing the exponent n so that the binary point “floats” to the position after the first
nonzero digit. This is the binary version of scientific notation.
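
As an illustration, the normalized form of a number can be recovered in Python with the standard `math.frexp` function. The helper name `normalize` is hypothetical. Note that Python floats carry 53 bits of precision, so this sketch shows the normalization step only, not the 32-bit storage format.

```python
import math


def normalize(x: float):
    """Return (sign, significand, n) with x = sign * significand * 2**n and 1 <= significand < 2."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only finite nonzero numbers have a normalized form")
    m, e = math.frexp(abs(x))  # abs(x) = m * 2**e with 0.5 <= m < 1
    sign = -1 if x < 0 else 1
    return sign, 2 * m, e - 1  # shift one place so the leading digit is 1


print(normalize(-53.5))  # (-1, 1.671875, 5)
```

Here 1.671875 is the decimal value of $(1.101011)_2$, so $-53.5 = -(1.101011)_2 \times 2^5$.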

To store a normalized number in 32-bit format one reserves 1 bit for the sign, 8 bits for the
signed exponent, and 23 bits for the portion $b_1 b_2 b_3 \ldots b_{23}$ of the fractional part of the
number. The leading bit 1 is not stored (as it is always 1 for a normalized number) and is
referred to as a “hidden bit”.

The 8-bit exponent field is used to store integer exponents −126 ≤ n ≤ 127.
We will discuss later how exactly this is done.

Example. The 32-bit string

1 | 8 bits storing n=5 | 10101100000000000000000
represents $(-1.101011)_2 \times 2^5 = (-110101.1)_2 = -53.5$.

Example. The 32-bit word

0 | 8 bits storing n=0 | 00000000000000000000000
represents $(1.0)_2 = 1$.

Example. The largest normalized number that fits into 32 bits is

0 | 8 bits storing n=127 | 11111111111111111111111
or $(1.11111111111111111111111)_2 \times 2^{127} = (2^{24} - 1) 2^{104} \approx 3.40 \times 10^{38}$.
The smallest normalized positive number that fits into 32 bits is

0 | 8 bits storing n=-126 | 00000000000000000000000
or $(1.00000000000000000000000)_2 \times 2^{-126} = 2^{-126} \approx 1.18 \times 10^{-38}$.
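
The four examples above can be reproduced with exact rational arithmetic. In this sketch the exponent n is passed as a plain integer, since the notes defer the encoding of the 8-bit exponent field. The function name `decode_float` is hypothetical.

```python
from fractions import Fraction


def decode_float(sign_bit: int, n: int, fraction_bits: str) -> Fraction:
    """Value of sign | exponent n | 23 stored fraction bits, with the hidden bit restored."""
    if len(fraction_bits) != 23 or set(fraction_bits) - {"0", "1"}:
        raise ValueError("expected exactly 23 binary digits")
    if not -126 <= n <= 127:
        raise ValueError("exponent outside the range -126..127")
    # (1.b1 b2 ... b23)_2 = (2**23 + stored bits read as an integer) / 2**23
    significand = Fraction(2**23 + int(fraction_bits, 2), 2**23)
    value = significand * Fraction(2) ** n
    return -value if sign_bit else value


print(decode_float(1, 5, "10101100000000000000000"))  # -107/2
print(decode_float(0, 0, "0" * 23))                   # 1
```

Using `Fraction` rather than floats keeps every value exact, so the closed forms $(2^{24} - 1) 2^{104}$ and $2^{-126}$ can be compared with equality.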

The precision of a floating-point format is the number of positions reserved for binary digits
plus one (for the hidden bit). In the examples considered here the precision is 23 + 1 = 24.

One says that x is a floating-point number if it can be represented exactly using a given
floating-point format. For instance, $1/3 = (0.010101\ldots)_2$ cannot be a floating-point
number as its binary representation is nonterminating.
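
A rough way to test this definition is sketched below. It assumes that only the normalized numbers described in these notes count (precision 24, exponent between −126 and 127) and leaves zero and any other special values aside. The function name `is_float32_number` is hypothetical.

```python
from fractions import Fraction


def is_float32_number(x: Fraction) -> bool:
    """True if x is exactly a normalized number with precision 24 and -126 <= n <= 127."""
    if x == 0:
        return False                      # zero has no normalized representation
    p, q = abs(x.numerator), x.denominator
    if q & (q - 1):
        return False                      # denominator is not a power of 2: nonterminating expansion
    n = p.bit_length() - q.bit_length()   # exponent of the leading binary digit
    odd_part = p // (p & -p)              # strip trailing zero bits from the numerator
    return odd_part.bit_length() <= 24 and -126 <= n <= 127


print(is_float32_number(Fraction(1, 3)))     # False
print(is_float32_number(Fraction(-107, 2)))  # True
```

The first check captures the argument in the text: a fraction in lowest terms has a terminating binary expansion only if its denominator is a power of 2.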

The gap between 1 and the next normalized floating-point number is known as machine
epsilon. In our setting, this gap is $(1 + 2^{-23}) - 1 = 2^{-23}$. Note that this is not the same as
the smallest positive floating-point number. Unlike in the fixed-point scenario, the spacing
between the floating-point numbers is not uniform, but varies from one dyadic interval
$[2^n, 2^{n+1})$ to another. As we move away from the origin, the numbers become less dense:

[Figure omitted; axes: horizontal 0 to 16, vertical −1 to 1.]
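
The non-uniform spacing can be tabulated with a few lines of Python. This sketch assumes the 24-bit precision used in these notes and ignores the limits on the exponent. With 23 stored fraction bits, adjacent numbers in $[2^n, 2^{n+1})$ differ by $2^{n-23}$. The function name `spacing` is hypothetical.

```python
import math

FRACTION_BITS = 23  # stored fraction bits; the precision is FRACTION_BITS + 1


def spacing(x: float) -> float:
    """Gap between adjacent floating-point numbers in the dyadic interval containing x > 0."""
    if x <= 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("x must be positive and finite")
    _, e = math.frexp(x)  # x = m * 2**e with 0.5 <= m < 1, so 2**(e-1) <= x < 2**e
    n = e - 1
    return math.ldexp(1.0, n - FRACTION_BITS)  # 2**(n - 23)


machine_epsilon = spacing(1.0)
assert machine_epsilon == 2.0**-23

for x in (1.0, 2.0, 4.0, 8.0, 16.0):
    print(x, spacing(x))
```

Each time x crosses a power of 2, the gap doubles, which is why the numbers thin out away from the origin.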
