IF2130!02!2020-Representasi Informasi - Integer Dan String

This document discusses computer data representation and organization. It covers binary and hexadecimal numbering systems, byte ordering conventions, integer and floating point representation, character encoding, boolean logic operations, and other fundamental concepts. Programming languages like C provide operators like &, |, ^, ~ to perform bit-level logic and shift operations on integral data types. Memory is organized into words and addressed at the byte level.
IF2130 – Organisasi dan Arsitektur Komputer (Computer Organization and Architecture)
Source: Greg Kesden, CMU 15-213, 2012

Information Representation
Achmad Imam Kistijantoro (imam@informatika.org)
Anggrahita Bayu Sasmita
Rahmat Mulyawan
Infall Syafalni
Binary Representation

Source: CMU 15-213 slides


Encoding Byte Values
 Byte = 8 bits
   Binary: 00000000₂ to 11111111₂
   Decimal: 0 to 255
   Hexadecimal: 0x00 to 0xFF
   Example hexadecimal values: 0xdeadbeef, 0xc0ffeeee
Byte-Oriented Memory Organization

 Programs refer to locations in virtual memory
   Conceptually a very large array of bytes
   Actually implemented as a hierarchy of several memory types
   The system provides a private address space to each process
   A program runs without interfering with other programs
 Compiler + run-time system control allocation
   Where the various program objects should be stored
   All allocation takes place within a single virtual address space
Machine Words
 Machines have a "word size"
   Nominal size of integer-valued data
   Including addresses
 Many machines at the time of the source slides (2012) use 32-bit (4-byte) words
   Limits addresses to 4 GB
   Too small for memory-intensive applications
 High-end systems use 64-bit (8-byte) words
   Potential address space ≈ 1.8 × 10^19 bytes
   x86-64 machines support 48-bit addresses: 256 terabytes
 Machines support multiple data formats
   Fractions or multiples of the word size
   Always an integral number of bytes
Word-Oriented Memory Organization

 Addresses specify byte locations
   The address of a word is the address of its first byte
   Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
Data Representation
Byte Ordering
 How should the bytes within a multi-byte word be ordered in memory?
 Conventions
   Big endian: Sun, PPC Mac, Internet
     Least significant byte has the highest address
   Little endian: x86
     Least significant byte has the lowest address
Examining Data Representations
 Code to print the byte representation of data
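The slide's own code did not survive extraction, so the following is only a minimal sketch of the idea, not the original listing. The names `show_bytes` and `is_little_endian` are assumptions chosen for illustration: the first prints each byte of an object with its address, the second checks which byte of the integer 1 is stored first.

```c
#include <stdio.h>
#include <stddef.h>

/* Print every byte of an object, lowest address first. */
void show_bytes(const unsigned char *start, size_t len) {
    for (size_t i = 0; i < len; i++)
        printf("%p\t0x%.2x\n", (const void *)(start + i), start[i]);
}

/* Hypothetical helper: returns 1 if the least significant byte
   of an unsigned int is stored at the lowest address. */
int is_little_endian(void) {
    unsigned int probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

Calling `show_bytes((const unsigned char *)&a, sizeof a)` for `int a = 15213;` should list 6d 3b 00 00 on a little-endian machine and the reverse order on a big-endian one.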
Representing Integers
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3B6D

Bytes are listed from lowest to highest address.

int A = 15213;
  IA32, x86-64:  6D 3B 00 00
  Sun:           00 00 3B 6D

long int C = 15213;
  IA32:          6D 3B 00 00
  x86-64:        6D 3B 00 00 00 00 00 00
  Sun:           00 00 3B 6D

int B = -15213;   (two's complement representation)
  IA32, x86-64:  93 C4 FF FF
  Sun:           FF FF C4 93
Representing Pointers
int B = -15213;
int *P = &B;

Bytes of P, from lowest to highest address:
  Sun:     EF FF FB 2C
  IA32:    D4 F8 FF BF
  x86-64:  0C 89 EC FF FF 7F 00 00

Different compilers and machines assign different locations to objects.


Representing Strings
char S[6] = "18243";
 Strings in C
   Represented by an array of characters
   Each character encoded in ASCII format
     Standard 7-bit encoding of the character set
     Character "0" has code 0x30
     Digit i has code 0x30+i
   String should be null-terminated
     Final character = 0
 Compatibility
   Byte ordering is not an issue

Bytes of S, from lowest to highest address:
  Linux/Alpha:  31 38 32 34 33 00
  Sun:          31 38 32 34 33 00


Boolean Algebra
 Developed by George Boole in 19th Century
 Algebraic representation of logic
 Encode “True” as 1 and “False” as 0
◼ And: A&B = 1 when both A=1 and B=1
◼ Or: A|B = 1 when either A=1 or B=1
◼ Not: ~A = 1 when A=0
◼ Exclusive-Or (Xor): A^B = 1 when either A=1 or B=1, but not both
General Boolean Algebras
 Operate on Bit Vectors
 Operations applied bitwise

    01101001        01101001        01101001
  & 01010101      | 01010101      ^ 01010101      ~ 01010101
  ----------      ----------      ----------      ----------
    01000001        01111101        00111100        10101010

 All of the Properties of Boolean Algebra Apply


Example: Representing & Manipulating Sets
 Representation
   Width-w bit vector represents subsets of {0, …, w–1}
   a_j = 1 if j ∈ A

   01101001   { 0, 3, 5, 6 }
   76543210   (bit positions)

   01010101   { 0, 2, 4, 6 }
   76543210   (bit positions)
 Operations
 & Intersection 01000001 { 0, 6 }
 | Union 01111101 { 0, 2, 3, 4, 5, 6 }
 ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
 ~ Complement 10101010 { 1, 3, 5, 7 }
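As a rough illustration of the set operations above, the sketch below stores a subset of {0, …, 7} in one byte. The type name `bitset8` and the helper names are assumptions made for this example, not part of the slides.

```c
#include <stdint.h>

/* Hypothetical type: a subset of {0,...,7}; bit j is 1 when j is a member. */
typedef uint8_t bitset8;

/* Membership test; assumes j < 8. */
int set_contains(bitset8 s, unsigned j) { return (s >> j) & 1u; }

bitset8 set_intersection(bitset8 a, bitset8 b) { return a & b; }
bitset8 set_union(bitset8 a, bitset8 b)        { return a | b; }
bitset8 set_symdiff(bitset8 a, bitset8 b)      { return a ^ b; }

/* The cast drops the high bits that integer promotion adds to ~a. */
bitset8 set_complement(bitset8 a)              { return (bitset8)~a; }
```

With `0x69` for { 0, 3, 5, 6 } and `0x55` for { 0, 2, 4, 6 }, these helpers reproduce the four rows of the table.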
Bit-Level Operations in C
 Operations &, |, ~, ^ available in C
   Apply to any "integral" data type
     long, int, short, char, unsigned
   View arguments as bit vectors
   Arguments applied bit-wise
 Examples (char data type)
   ~0x41 = 0xBE
     ~01000001₂ = 10111110₂
   ~0x00 = 0xFF
     ~00000000₂ = 11111111₂
   0x69 & 0x55 = 0x41
     01101001₂ & 01010101₂ = 01000001₂
   0x69 | 0x55 = 0x7D
     01101001₂ | 01010101₂ = 01111101₂
Contrast: Logic Operations in C
 Contrast to Logical Operators
 &&, ||, !
 View 0 as “False”
 Anything nonzero as “True”
 Always return 0 or 1
 Early termination
 Examples (char data type)
 !0x41 = 0x00
 !0x00 = 0x01
 !!0x41 = 0x01

 0x69 && 0x55 = 0x01


 0x69 || 0x55 = 0x01
 p && *p (avoids null pointer access)
 Watch out for && vs. & (and || vs. |): one of the more common oopsies in C programming
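A small sketch of why mixing up the two operator families matters. The function names are assumptions for illustration; the values 0x0F and 0xF0 are chosen because they are both "true" yet share no bits.

```c
#include <stddef.h>

/* Bitwise AND combines the operands bit by bit. */
int bitwise_and(int a, int b) { return a & b; }

/* Logical AND only asks whether both operands are nonzero; result is 0 or 1. */
int logical_and(int a, int b) { return a && b; }

/* Early termination: *p is evaluated only when p is non-null. */
int safe_nonzero(const int *p) { return p && *p; }
```

`logical_and(0x0F, 0xF0)` is 1 while `bitwise_and(0x0F, 0xF0)` is 0, so an `if` written with `&` instead of `&&` can silently take the wrong branch.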
Shift Operations
 Left shift: x << y
   Shift bit-vector x left y positions
   Throw away extra bits on left
   Fill with 0's on right
 Right shift: x >> y
   Shift bit-vector x right y positions
   Throw away extra bits on right
   Logical shift: fill with 0's on left
   Arithmetic shift: replicate most significant bit on left
 Undefined behavior
   Shift amount < 0 or ≥ word size

Argument x     01100010    10100010
<< 3           00010000    00010000
Log. >> 2      00011000    00101000
Arith. >> 2    00011000    11101000
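The 8-bit examples can be reproduced with the sketch below. C does not let the programmer pick the kind of right shift directly, so the arithmetic variant is built here from unsigned operations; the helper names and the restriction to shift amounts 0 to 7 are assumptions of this example.

```c
#include <stdint.h>

/* Left shift, keeping only the low 8 bits. */
uint8_t shl8(uint8_t x, unsigned k) { return (uint8_t)(x << k); }

/* Logical right shift: unsigned operands are always filled with 0s. */
uint8_t shr_logical8(uint8_t x, unsigned k) { return (uint8_t)(x >> k); }

/* Arithmetic right shift: after the logical shift, refill the top k bits
   with 1s when the original sign bit (bit 7) was set. */
uint8_t shr_arith8(uint8_t x, unsigned k) {
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80u)
        shifted |= (uint8_t)~(0xFFu >> k);
    return shifted;
}
```

For x = 10100010₂ (0xA2), `shr_logical8(x, 2)` gives 0x28 and `shr_arith8(x, 2)` gives 0xE8, matching the right-hand column of the table.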
Encoding Integers
Unsigned:
$$B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$$
Two's complement:
$$B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i$$

short int x = 15213;
short int y = -15213;

 C short is 2 bytes long
      Decimal   Hex     Binary
x      15213    3B 6D   00111011 01101101
y     -15213    C4 93   11000100 10010011

 Sign bit
   For two's complement, the most significant bit indicates the sign
   0 for nonnegative
   1 for negative
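The two encoding functions can be evaluated directly from their definitions. The sketch below is one possible transcription; the function names `b2u` and `b2t` mirror the slide notation, while the parameter layout (a bit pattern plus a width w between 1 and 32) is an assumption of this example.

```c
#include <stdint.h>

/* B2U: sum of x_i * 2^i over bits 0 .. w-1 (assumes w <= 32). */
int64_t b2u(uint32_t bits, unsigned w) {
    int64_t value = 0;
    for (unsigned i = 0; i < w; i++)
        if ((bits >> i) & 1u)
            value += (int64_t)1 << i;
    return value;
}

/* B2T: bits 0 .. w-2 as in B2U, but bit w-1 has weight -2^(w-1)
   (assumes 1 <= w <= 32). */
int64_t b2t(uint32_t bits, unsigned w) {
    int64_t value = b2u(bits, w - 1);
    if ((bits >> (w - 1)) & 1u)
        value -= (int64_t)1 << (w - 1);
    return value;
}
```

For the pattern C4 93 with w = 16, `b2t` yields -15213 while `b2u` yields 50323, which is -15213 + 65536.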
Two's Complement Encoding Example (Cont.)
x =  15213: 00111011 01101101
y = -15213: 11000100 10010011

 Weight   x bit   x term   y bit   y term
      1     1         1      1          1
      2     0         0      1          2
      4     1         4      0          0
      8     1         8      0          0
     16     0         0      1         16
     32     1        32      0          0
     64     1        64      0          0
    128     0         0      1        128
    256     1       256      0          0
    512     1       512      0          0
   1024     0         0      1       1024
   2048     1      2048      0          0
   4096     1      4096      0          0
   8192     1      8192      0          0
  16384     0         0      1      16384
 -32768     0         0      1     -32768
 Sum              15213            -15213
Numeric Ranges
 Unsigned values
   UMin = 0                   (000…0)
   UMax = $2^w - 1$             (111…1)
 Two's complement values
   TMin = $-2^{w-1}$            (100…0)
   TMax = $2^{w-1} - 1$         (011…1)
 Other values
   Minus 1                    (111…1)
Values for W = 16
Decimal Hex Binary
UMax 65535 FF FF 11111111 11111111
TMax 32767 7F FF 01111111 11111111
TMin -32768 80 00 10000000 00000000
-1 -1 FF FF 11111111 11111111
0 0 00 00 00000000 00000000
Values for Different Word Sizes
W       8      16        32               64
UMax    255    65,535    4,294,967,295    18,446,744,073,709,551,615
TMax    127    32,767    2,147,483,647    9,223,372,036,854,775,807
TMin    -128   -32,768   -2,147,483,648   -9,223,372,036,854,775,808

 Observations
   |TMin| = TMax + 1
     Asymmetric range
   UMax = 2 * TMax + 1
 C programming
   #include <limits.h>
   Declares constants, e.g., ULONG_MAX, LONG_MAX, LONG_MIN
   Values are platform specific
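A short sketch that prints a few of the `<limits.h>` constants so the two observations can be checked on a given platform. The function name `print_ranges` is an assumption; the printed values depend on the machine and compiler.

```c
#include <limits.h>
#include <stdio.h>

/* Print the ranges of int and long on this platform. */
void print_ranges(void) {
    printf("INT_MIN   = %d\n",  INT_MIN);
    printf("INT_MAX   = %d\n",  INT_MAX);
    printf("UINT_MAX  = %u\n",  UINT_MAX);
    printf("LONG_MIN  = %ld\n", LONG_MIN);
    printf("LONG_MAX  = %ld\n", LONG_MAX);
    printf("ULONG_MAX = %lu\n", ULONG_MAX);
}
```

On a two's complement machine the output should satisfy |INT_MIN| = INT_MAX + 1 and UINT_MAX = 2 * INT_MAX + 1.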
Unsigned & Signed Numeric Values
X       B2U(X)   B2T(X)
0000      0        0
0001      1        1
0010      2        2
0011      3        3
0100      4        4
0101      5        5
0110      6        6
0111      7        7
1000      8       –8
1001      9       –7
1010     10       –6
1011     11       –5
1100     12       –4
1101     13       –3
1110     14       –2
1111     15       –1

 Equivalence
   Same encodings for nonnegative values
 Uniqueness
   Every bit pattern represents a unique integer value
   Each representable integer has a unique bit encoding
 ⇒ Can invert mappings
   $U2B(x) = B2U^{-1}(x)$: bit pattern for an unsigned integer
   $T2B(x) = B2T^{-1}(x)$: bit pattern for a two's complement integer
Today: Bits, Bytes, and Integers
 Representing information as bits
 Bit-level manipulations
 Integers
 Representation: unsigned and signed
 Conversion, casting
 Expanding, truncating
 Addition, negation, multiplication, shifting
 Summary
 Representations in memory, pointers, strings
Mapping Between Signed & Unsigned

T2U (two's complement → unsigned): x → T2B → X → B2U → ux
  Maintain same bit pattern

U2T (unsigned → two's complement): ux → U2B → X → B2T → x
  Maintain same bit pattern

 Mappings between unsigned and two's complement numbers: keep bit representations and reinterpret
Mapping Signed ↔ Unsigned
Bits    Signed   Unsigned
0000      0         0
0001      1         1
0010      2         2
0011      3         3
0100      4         4
0101      5         5
0110      6         6
0111      7         7
1000     -8         8
1001     -7         9
1010     -6        10
1011     -5        11
1100     -4        12
1101     -3        13
1110     -2        14
1111     -1        15

T2U reads the table from the Signed column to the Unsigned column; U2T reads it in the opposite direction.
Mapping Signed ↔ Unsigned (Cont.)
 Bit patterns 0000 to 0111: the signed and unsigned values are equal
 Bit patterns 1000 to 1111: the two values differ by 16 ($2^w$ for w = 4)
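For w = 4 the mapping in the table amounts to adding or subtracting 16 for the patterns whose top bit is set. The sketch below spells this out with plain arithmetic; the names `t2u4` and `u2t4` are assumptions modeled on the slide notation.

```c
/* T2U for w = 4: negative values gain 2^4 = 16 (assumes -8 <= x <= 7). */
int t2u4(int x) { return x < 0 ? x + 16 : x; }

/* U2T for w = 4: values of 8 and above lose 16 (assumes 0 <= ux <= 15). */
int u2t4(int ux) { return ux >= 8 ? ux - 16 : ux; }
```

The two functions are inverses of each other over their stated ranges, which is what "maintain same bit pattern" implies.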
Relation between Signed & Unsigned

T2U: x → T2B → X → B2U → ux (maintain same bit pattern)

Sign of each bit weight, from position w–1 down to 0:
  ux:  +  +  +  •••  +  +  +
  x:   –  +  +  •••  +  +  +

The large negative weight of bit w–1 becomes a large positive weight.
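Conversion Visualized
 2's comp. → unsigned
   Ordering inversion
   Negative → big positive
[Figure: the two's complement range (TMin … –2, –1, 0 … TMax) drawn next to the unsigned range (0 … TMax, TMax + 1 … UMax – 1, UMax); nonnegative values map to themselves and negative values map to the upper half of the unsigned range.]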
Signed vs. Unsigned in C
 Constants
   By default are considered to be signed integers
   Unsigned if they have "U" as a suffix: 0U, 4294967259U
 Casting
   Explicit casting between signed & unsigned is the same as U2T and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
   Implicit casting also occurs via assignments and procedure calls
tx = ux;
uy = ty;
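A minimal sketch of the two casts as functions. One caveat that goes beyond the slide: the C standard defines `int` to `unsigned` conversion as reduction modulo 2^w, whereas `unsigned` to `int` is implementation-defined for values above `INT_MAX`; common two's complement compilers keep the bit pattern, and the example assumes that behavior. The function names are hypothetical.

```c
#include <limits.h>

/* T2U: well-defined in C, the value is reduced modulo 2^w. */
unsigned to_unsigned(int x) { return (unsigned)x; }

/* U2T: implementation-defined when ux > INT_MAX; assumed here to keep
   the bit pattern, as on typical two's complement platforms. */
int to_signed(unsigned ux) { return (int)ux; }
```

Under that assumption, `to_unsigned(-1)` is `UINT_MAX` and converting the result back gives -1 again.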
Casting Surprises
 Expression evaluation
   If there is a mix of unsigned and signed in a single expression, signed values are implicitly cast to unsigned
   Including comparison operations <, >, ==, <=, >=
   Examples for W = 32: TMIN = -2,147,483,648, TMAX = 2,147,483,647

Constant1        Constant2            Relation   Evaluation
0                0U                   ==         unsigned
-1               0                    <          signed
-1               0U                   >          unsigned
2147483647       -2147483647-1        >          signed
2147483647U      -2147483647-1        <          unsigned
-1               -2                   >          signed
(unsigned) -1    -2                   >          unsigned
2147483647       2147483648U          <          unsigned
2147483647       (int) 2147483648U    >          signed
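Some rows of the table can be checked with two tiny comparison functions. This is a sketch that assumes a 32-bit `int`; the function names are hypothetical, and compilers may warn about the mixed comparison, which is exactly the situation the slide describes.

```c
/* Both operands are int: an ordinary signed comparison. */
int less_signed(int a, int b) { return a < b; }

/* a is implicitly converted to unsigned before the comparison. */
int less_mixed(int a, unsigned b) { return a < b; }
```

`less_signed(-1, 0)` is 1, but `less_mixed(-1, 0U)` is 0 because -1 is converted to the largest unsigned value first.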
Summary
Casting Signed ↔ Unsigned: Basic Rules
 Bit pattern is maintained
 But reinterpreted
 Can have unexpected effects: adding or subtracting $2^w$
 Expression containing signed and unsigned int
   int is cast to unsigned!
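Sign Extension
 Task:
   Given a w-bit signed integer x
   Convert it to a (w+k)-bit integer with the same value
 Rule:
   Make k copies of the sign bit:
   $X' = x_{w-1}, \ldots, x_{w-1}, x_{w-1}, x_{w-2}, \ldots, x_0$, where the first k entries are copies of the MSB
[Figure: the w-bit vector X is widened to the (w+k)-bit vector X' by prepending k copies of its most significant bit.]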
Sign Extension Example

short int x = 15213;
int ix = (int) x;
short int y = -15213;
int iy = (int) y;

      Decimal   Hex           Binary
x      15213    3B 6D         00111011 01101101
ix     15213    00 00 3B 6D   00000000 00000000 00111011 01101101
y     -15213    C4 93         11000100 10010011
iy    -15213    FF FF C4 93   11111111 11111111 11000100 10010011

 Converting from a smaller to a larger integer data type
   C automatically performs sign extension for signed types
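The example can be reproduced with fixed-width types so that the sizes do not depend on the platform. The helper names below are assumptions; each returns the widened 32-bit pattern as an unsigned value so it can be compared against the hex column.

```c
#include <stdint.h>

/* Widening a signed 16-bit value: the compiler sign-extends. */
uint32_t widen_signed(int16_t x) { return (uint32_t)(int32_t)x; }

/* Widening an unsigned 16-bit value: the compiler zero-extends. */
uint32_t widen_unsigned(uint16_t x) { return (uint32_t)x; }
```

The same 16-bit pattern C4 93 therefore widens to FF FF C4 93 when treated as signed and to 00 00 C4 93 when treated as unsigned.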
Summary:
Expanding, Truncating: Basic Rules
 Expanding (e.g., short int to int)
   Unsigned: zeros added
   Signed: sign extension
   Both yield the expected result
 Truncating (e.g., unsigned to unsigned short)
   Unsigned/signed: bits are truncated
   Result reinterpreted
   Unsigned: mod operation
   Signed: similar to mod
   For small numbers yields the expected behaviour
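A sketch of truncation from 32 to 16 bits. The unsigned case is a plain conversion. For the signed case, C leaves out-of-range narrowing implementation-defined, so this example reinterprets the low 16 bits by hand instead; the helper names are assumptions.

```c
#include <stdint.h>

/* Unsigned truncation: keeps x mod 2^16. */
uint16_t trunc_u16(uint32_t x) { return (uint16_t)x; }

/* Signed truncation: keep the low 16 bits, then read them as two's
   complement by subtracting 2^16 when bit 15 is set. */
int32_t trunc_s16(int32_t x) {
    uint16_t low = (uint16_t)x;
    return low >= 0x8000u ? (int32_t)low - 65536 : (int32_t)low;
}
```

Small values such as 15213 and -15213 survive unchanged, while 53191 becomes -12345 because its bit 15 is set.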
Unsigned Addition
Operands: w bits              u
                            + v
True sum: w+1 bits            u + v
Discard carry: w bits         UAdd_w(u, v)

 Standard addition function
   Ignores carry output
 Implements modular arithmetic
   $s = UAdd_w(u, v) = (u + v) \bmod 2^w$
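In C, addition on `unsigned` operands already behaves like UAdd_w, so a sketch only needs to wrap the operator. The overflow test relies on the fact that a wrapped sum is smaller than either operand; the function names are assumptions.

```c
#include <limits.h>

/* Unsigned addition wraps around modulo 2^w by definition in C. */
unsigned uadd(unsigned u, unsigned v) { return u + v; }

/* The carry was discarded exactly when the result is smaller than u. */
int uadd_overflows(unsigned u, unsigned v) { return u + v < u; }
```

For instance, `uadd(UINT_MAX, 1u)` is 0 and `uadd_overflows(UINT_MAX, 1u)` reports the wraparound.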
Visualizing (Mathematical) Integer Addition

 Integer addition
   4-bit integers u, v
   Compute true sum Add4(u, v)
   Values increase linearly with u and v
   Forms a planar surface
[Figure: surface plot of Add4(u, v) against u and v.]
Visualizing Unsigned Addition

 Wraps around
   If true sum ≥ $2^w$
   At most once
[Figure: surface plot of UAdd4(u, v) against u and v; true sums between $2^w$ and $2^{w+1}$ overflow and wrap around to the modular sum.]
Two’s Complement Addition
Operands: w bits              u
                            + v
True sum: w+1 bits            u + v
Discard carry: w bits         TAdd_w(u, v)

 TAdd and UAdd have identical bit-level behavior
   Signed vs. unsigned addition in C:
int s, t, u, v;
s = (int) ((unsigned) u + (unsigned) v);
t = u + v;
   Will give s == t
TAdd Overflow
 Functionality
   True sum requires w+1 bits
   Drop off MSB
   Treat remaining bits as 2's comp. integer

True sum (w+1 bits)    Value               TAdd result (w bits)
0 111…1                $2^w - 1$           PosOver
0 100…0                $2^{w-1}$           PosOver
0 011…1                $2^{w-1} - 1$       011…1
0 000…0                0                   000…0
1 100…0                $-2^{w-1}$          100…0
1 011…1                $-2^{w-1} - 1$      NegOver
1 000…0                $-2^w$              NegOver
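Because overflowing signed addition is undefined behavior in C, a sketch of TAdd has to go through unsigned arithmetic, and the overflow check has to be done before adding. The function names are assumptions, and the final conversion back to `int` is implementation-defined (it keeps the bit pattern on common two's complement compilers).

```c
#include <limits.h>

/* Wrapping two's complement sum, computed on unsigned values to avoid
   undefined behavior; the cast back to int is implementation-defined. */
int tadd(int u, int v) { return (int)((unsigned)u + (unsigned)v); }

/* Returns 1 for positive overflow, -1 for negative overflow, 0 otherwise.
   The comparisons are arranged so that they cannot overflow themselves. */
int tadd_overflow(int u, int v) {
    if (u > 0 && v > 0 && u > INT_MAX - v) return 1;
    if (u < 0 && v < 0 && u < INT_MIN - v) return -1;
    return 0;
}
```

Adding 1 to `INT_MAX` is reported as positive overflow and, under the stated assumption, wraps to `INT_MIN`.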
Visualizing 2’s Complement Addition

 Values
   4-bit two's comp.
   Range from -8 to +7
 Wraps around
   If sum ≥ $2^{w-1}$
     Becomes negative
     At most once
   If sum < $-2^{w-1}$
     Becomes positive
     At most once
[Figure: surface plot of TAdd4(u, v) against u and v, with the NegOver and PosOver regions marked.]
Multiplication
 Goal: computing the product of w-bit numbers x, y
   Either signed or unsigned
 But exact results can be bigger than w bits
   Unsigned: up to 2w bits
     Result range: $0 \le x \cdot y \le (2^w - 1)^2 = 2^{2w} - 2^{w+1} + 1$
   Two's complement min (negative): up to 2w–1 bits
     Result range: $x \cdot y \ge (-2^{w-1}) \cdot (2^{w-1} - 1) = -2^{2w-2} + 2^{w-1}$
   Two's complement max (positive): up to 2w bits, but only for $(TMin_w)^2$
     Result range: $x \cdot y \le (-2^{w-1})^2 = 2^{2w-2}$
 So, maintaining exact results…
   would need to keep expanding the word size with each product computed
   is done in software, if needed
     e.g., by "arbitrary precision" arithmetic packages
Unsigned Multiplication in C
Operands: w bits              u
                            * v
True product: 2*w bits        u · v
Discard w bits: w bits        UMult_w(u, v)

 Standard multiplication function
   Ignores high-order w bits
 Implements modular arithmetic
   $UMult_w(u, v) = (u \cdot v) \bmod 2^w$
Signed Multiplication in C
Operands: w bits              u
                            * v
True product: 2*w bits        u · v
Discard w bits: w bits        TMult_w(u, v)

 Standard multiplication function
   Ignores high-order w bits
   Some of which are different for signed vs. unsigned multiplication
   Lower bits are the same
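The claim that the low-order bits agree can be tried out with w = 16, where the full 32-bit product fits comfortably in a wider type. This is a sketch with hypothetical helper names; both functions compute the exact product first and then keep the low 16 bits.

```c
#include <stdint.h>

/* UMult_16: exact unsigned product, truncated to its low 16 bits. */
uint16_t low16_unsigned(uint16_t u, uint16_t v) {
    return (uint16_t)((uint32_t)u * (uint32_t)v);
}

/* Low 16 bits of the exact signed product; the conversion to uint16_t
   reduces the value modulo 2^16. */
uint16_t low16_signed(int16_t x, int16_t y) {
    return (uint16_t)((int32_t)x * (int32_t)y);
}
```

The pattern C4 93 is 50323 as unsigned and -15213 as signed; multiplying either reading by 3 leaves the same low 16 bits.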
Power-of-2 Multiply with Shift
 Operation
   u << k gives $u \cdot 2^k$
   Both signed and unsigned

Operands: w bits              u
                            * $2^k$   (bit pattern 0 ••• 010 ••• 00, a single 1 followed by k zeros)
True product: w+k bits        $u \cdot 2^k$   (the bits of u followed by k zeros)
Discard k bits: w bits        UMult_w(u, $2^k$), TMult_w(u, $2^k$)

 Examples
   u << 3 == u * 8
   (u << 5) - (u << 3) == u * 24   (parentheses are required, since - binds tighter than << in C)
 Most machines shift and add faster than multiply
   Compiler generates this code automatically
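The two examples can be written as functions to confirm the identities. The names `times8` and `times24` are assumptions; unsigned operands are used so that large inputs wrap around instead of triggering undefined behavior.

```c
/* u * 8 as a single left shift. */
unsigned times8(unsigned u) { return u << 3; }

/* u * 24 = u * 32 - u * 8; the parentheses matter because
   binary minus has higher precedence than <<. */
unsigned times24(unsigned u) { return (u << 5) - (u << 3); }
```

Without the parentheses, `u << 5 - u << 3` would parse as `(u << (5 - u)) << 3`, which is a different computation.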
Unsigned Power-of-2 Divide with Shift
 Quotient of unsigned by power of 2
   u >> k gives $\lfloor u / 2^k \rfloor$
   Uses logical shift
   Dividing by $2^k$ moves the binary point k positions to the left; the shift discards the k bits to its right

          Division       Computed   Hex     Binary
x         15213          15213      3B 6D   00111011 01101101
x >> 1    7606.5         7606       1D B6   00011101 10110110
x >> 4    950.8125       950        03 B6   00000011 10110110
x >> 8    59.4257813     59         00 3B   00000000 00111011
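A one-line sketch that matches the table. The function name is hypothetical and k is assumed to be smaller than the width of `unsigned`.

```c
/* Floor of u / 2^k via a logical right shift (assumes k < width of unsigned). */
unsigned udiv_pow2(unsigned u, unsigned k) { return u >> k; }
```

For unsigned operands the shift and the `/` operator always agree, since both discard the fractional part.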
Signed Power-of-2 Divide with Shift
 Quotient of signed by power of 2
   x >> k gives $\lfloor x / 2^k \rfloor$
   Uses arithmetic shift
   Rounds in the wrong direction when x < 0 (down, away from 0, instead of toward 0)
   Result: RoundDown($x / 2^k$)

          Division       Computed   Hex     Binary
y         -15213         -15213     C4 93   11000100 10010011
y >> 1    -7606.5        -7607      E2 49   11100010 01001001
y >> 4    -950.8125      -951       FC 49   11111100 01001001
y >> 8    -59.4257813    -60        FF C4   11111111 11000100
Correct Power-of-2 Divide
 Quotient of negative number by power of 2
   Want $\lceil x / 2^k \rceil$ (round toward 0)
   Compute as $\lfloor (x + 2^k - 1) / 2^k \rfloor$
     In C: (x + (1<<k)-1) >> k
     Biases dividend toward 0

Case 1: No rounding
   The low k bits of the dividend are all 0
   Adding $2^k - 1$ only sets those low k bits, which fall to the right of the binary point after dividing by $2^k$
   Biasing has no effect

Correct Power-of-2 Divide (Cont.)

Case 2: Rounding
   At least one of the low k bits of the dividend is 1
   Adding $2^k - 1$ carries into bit k, so the bits to the left of the binary point are incremented by 1
   Biasing adds 1 to the final result

Arithmetic: Basic Rules
 Addition:
   Unsigned/signed: normal addition followed by truncation, same operation at the bit level
   Unsigned: addition mod $2^w$
     Mathematical addition + possible subtraction of $2^w$
   Signed: modified addition mod $2^w$ (result in proper range)
     Mathematical addition + possible addition or subtraction of $2^w$
 Multiplication:
   Unsigned/signed: normal multiplication followed by truncation, same operation at the bit level
   Unsigned: multiplication mod $2^w$
   Signed: modified multiplication mod $2^w$ (result in proper range)
Why Should I Use Unsigned?
 Don't use just because a number is nonnegative
   Easy to make mistakes (i >= 0 is always true for an unsigned i, so this loop never terminates)
unsigned i;
for (i = cnt-2; i >= 0; i--)
    a[i] += a[i+1];
   Can be very subtle (sizeof yields an unsigned value, so i-DELTA is evaluated as unsigned and is never negative)
#define DELTA sizeof(int)
int i;
for (i = CNT; i-DELTA >= 0; i-= DELTA)
    . . .
 Do use when performing modular arithmetic
   Multiprecision arithmetic
 Do use when using bits to represent sets
   Logical right shift, no sign extension
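One possible way to keep an unsigned index in the first loop without the always-true test is to decrement inside the condition. This is a sketch of a common idiom rather than something taken from the slides; the function name `suffix_sums` is an assumption.

```c
#include <stddef.h>

/* After the call, a[i] holds a[i] + a[i+1] + ... + a[cnt-1].
   The test i-- > 0 checks i before decrementing, so the body runs
   for i = cnt-2 down to 0 and the loop stops without testing i >= 0. */
void suffix_sums(int *a, size_t cnt) {
    if (cnt < 2)
        return;
    for (size_t i = cnt - 1; i-- > 0; )
        a[i] += a[i + 1];
}
```

For the array {1, 2, 3, 4} the result is {10, 9, 7, 4}.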
