
CS 312 Lecture - 7d - Machine Level Programming-Data

The document discusses basic data types used in machine-level programming like integers, floating point numbers, arrays and structs. It describes how arrays are allocated and accessed in memory, including multi-dimensional nested arrays. Examples of array declaration and initialization are provided.

Uploaded by

Hypazia

Carnegie Mellon

Machine-Level Programming IV: Data

Topics:
• Arrays
• Structs

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1



Basic Data Types


Integral
• Stored and operated on in general-purpose (integer) registers
• Signed vs. unsigned depends on the instructions used

Intel        ASM  Bytes     C
byte         b    1         [unsigned] char
word         w    2         [unsigned] short
double word  l    4         [unsigned] int
quad word    q    8         [unsigned] long int (x86-64)

Floating Point
• Stored and operated on in floating-point registers

Intel        ASM  Bytes     C
Single       s    4         float
Double       l    8         double
Extended     t    10/12/16  long double


Array Allocation
• Basic Principle
T A[L];
  • Array named A, of data type T and length L
  • Contiguously allocated region of L * sizeof(T) bytes in memory

Declaration       Element boundaries (x = starting address)
char string[12];  x ... x + 12
int val[5];       x, x + 4, x + 8, x + 12, x + 16, x + 20
double a[3];      x, x + 8, x + 16, x + 24
char *p[3];       x, x + 8, x + 16, x + 24
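
The L * sizeof(T) rule can be checked with sizeof. The following is a minimal sketch, assuming an x86-64 Linux (LP64) target where int is 4 bytes and pointers are 8 bytes. The helper names (bytes_string, val_offset, and so on) are made up for illustration and are not part of the lecture.

```c
#include <stddef.h>

/* The four arrays from the slide. */
char   string[12];
int    val[5];
double a[3];
char  *p[3];

/* Each helper returns L * sizeof(T) for one of the arrays above. */
size_t bytes_string(void) { return sizeof string; }  /* 12 * sizeof(char)   */
size_t bytes_val(void)    { return sizeof val; }     /*  5 * sizeof(int)    */
size_t bytes_a(void)      { return sizeof a; }       /*  3 * sizeof(double) */
size_t bytes_p(void)      { return sizeof p; }       /*  3 * sizeof(char *) */

/* Byte offset of val[i] from the start of val (x + 4*i when int is 4 bytes). */
size_t val_offset(size_t i) { return i * sizeof val[0]; }
```

On the assumed target the four sizes come out as 12, 20, 24 and 24 bytes, matching the end addresses x + 12, x + 20, x + 24 and x + 24 listed above.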


Array Access
• Basic Principle

int val[5];    contents:            1  5  2  1  3
               element boundaries:  x, x + 4, x + 8, x + 12, x + 16, x + 20

Reference   Type    Value
val[4]      int     3
val         int *   x
val+1       int *   x + 4
&val[2]     int *   x + 8
val[5]      int     ??
*(val+1)    int     5
val + i     int *   x + 4*i
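
The rows of the table can be reproduced in C by measuring byte distances from the start of val. This is a sketch only, assuming 4-byte int as on x86-64; offset_from_x and same_address are hypothetical helper names.

```c
#include <stddef.h>

int val[5] = { 1, 5, 2, 1, 3 };

/* Byte distance from the start of val (address x in the table) to q. */
ptrdiff_t offset_from_x(const int *q) {
    return (const char *)q - (const char *)val;
}

/* val + i and &val[i] denote the same address. */
int same_address(size_t i) {
    return (val + i) == &val[i];
}
```

val[5] is deliberately left out: it lies past the end of the array, which is why the table lists its value as ??.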


Array Example
#define ZLEN 5
typedef int zip_dig[ZLEN];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

Array          Contents    Element addresses  End
zip_dig cmu;   1 5 2 1 3   16 20 24 28 32     36
zip_dig mit;   0 2 1 3 9   36 40 44 48 52     56
zip_dig ucb;   9 4 7 2 0   56 60 64 68 72     76

• Declaration "zip_dig cmu" is equivalent to "int cmu[5]"
• Example arrays were allocated in successive 20-byte blocks
  • Not guaranteed to happen in general

Array Accessing Example


int get_digit(zip_dig z, int digit)
{
    return z[digit];
}

• Register %rdi contains starting address of array
• Register %rsi contains array index
• Desired digit at %rdi + 4*%rsi
• Use memory reference (%rdi,%rsi,4)

Memory Reference Code

# %rdi = z
# %rsi = digit
movl (%rdi,%rsi,4), %eax   # z[digit]


Array Loop Example


void zincr(zip_dig z) {
    size_t i;
    for (i = 0; i < ZLEN; i++)
        z[i]++;
}

    # %rdi = z
    movl    $0, %eax            # i = 0
    jmp     .L3                 # goto middle
.L4:                            # loop:
    addl    $1, (%rdi,%rax,4)   # z[i]++
    addq    $1, %rax            # i++
.L3:                            # middle
    cmpq    $4, %rax            # i:4
    jbe     .L4                 # if <=, goto loop
    ret


Multidimensional (Nested) Arrays

• Declaration
T A[R][C];
  • 2D array of data type T
  • R rows, C columns
  • Type T element requires K bytes

A[0][0]     • • •   A[0][C-1]
   •                    •
   •                    •
A[R-1][0]   • • •   A[R-1][C-1]

• Array Size
  • R * C * K bytes
• Arrangement
  • Row-Major Ordering

int A[R][C];

Memory layout, low to high addresses (4*R*C bytes in total):
A[0][0] ... A[0][C-1] | A[1][0] ... A[1][C-1] | ... | A[R-1][0] ... A[R-1][C-1]
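
Row-major ordering can be observed by reading a 2D array through a flat pointer to the same memory. The sketch below uses a made-up 3 x 4 array; the values, dimensions and helper names are illustrative assumptions, and the 48-byte total assumes 4-byte int.

```c
#include <stddef.h>

#define R 3   /* rows    (example value) */
#define C 4   /* columns (example value) */

int A[R][C] = {
    {  0,  1,  2,  3 },
    { 10, 11, 12, 13 },
    { 20, 21, 22, 23 },
};

/* Total size of the array: R * C * K bytes, with K = sizeof(int). */
size_t total_bytes(void) { return sizeof A; }

/* Read A[i][j] through a flat view of the same memory.
   Row-major order places A[i][j] at flat index i*C + j. */
int flat_read(size_t i, size_t j) {
    const int *flat = (const int *)A;
    return flat[i * C + j];
}
```

Each row occupies C consecutive ints, so moving from one row to the next skips C elements in the flat view.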

Nested Array Example


#define PCOUNT 4
zip_dig pgh[PCOUNT] =
    {{1, 5, 2, 0, 6},
     {1, 5, 2, 1, 3},
     {1, 5, 2, 1, 7},
     {1, 5, 2, 2, 1}};

zip_dig pgh[4];

Row      Contents    Start address
pgh[0]   1 5 2 0 6   76
pgh[1]   1 5 2 1 3   96
pgh[2]   1 5 2 1 7   116
pgh[3]   1 5 2 2 1   136
(end of array: 156)

• "zip_dig pgh[4]" is equivalent to "int pgh[4][5]"
  • Variable pgh: array of 4 elements, allocated contiguously
  • Each element is an array of 5 ints, allocated contiguously
• "Row-Major" ordering of all elements in memory

Nested Array Row Access

• Row Vectors
  Given a nested array declaration T A[R][C], you can think of it as an array of arrays.
  • A[i] is an array of C elements
  • Each element of type T requires K bytes
  • Starting address of A[i] is A + i * (C * K)

Row starting addresses for int A[R][C] (K = 4):

Row      Elements                     Starting address
A[0]     A[0][0] ... A[0][C-1]        A
A[i]     A[i][0] ... A[i][C-1]        A+(i*C*4)
A[R-1]   A[R-1][0] ... A[R-1][C-1]    A+((R-1)*C*4)


Nested Array Element Access

• Array Elements
  • A[i][j] is an element of type T, which requires K bytes
  • Address is A + i * (C * K) + j * K = A + (i * C + j) * K

int A[R][C];

Item      Address
A[0]      A
A[i]      A+(i*C*4)
A[i][j]   A+(i*C*4)+(j*4)
A[R-1]    A+((R-1)*C*4)
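
The address formula can be written as a small function and compared against what the compiler does for a real array. This is a sketch under stated assumptions: element_offset, measured_offset, and the R_DEMO x C_DEMO array are hypothetical names and sizes chosen for illustration.

```c
#include <stddef.h>

/* Byte offset of A[i][j] from A, for C columns of K-byte elements:
   i * (C * K) + j * K = (i * C + j) * K. */
size_t element_offset(size_t C, size_t K, size_t i, size_t j) {
    return (i * C + j) * K;
}

/* Example dimensions, used only to compare against a real array. */
#define R_DEMO 3
#define C_DEMO 7
int A[R_DEMO][C_DEMO];

/* The same offset, measured from the addresses the compiler generates. */
size_t measured_offset(size_t i, size_t j) {
    return (size_t)((const char *)&A[i][j] - (const char *)A);
}
```

For instance, with C = 5 and K = 4 (the zip_dig case), element [1][3] sits (1*5 + 3) * 4 = 32 bytes past the start of the array.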

Multi-Level Array Example


zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

#define UCOUNT 3
int *univ[UCOUNT] = {mit, cmu, ucb};

• Variable univ denotes array of 3 elements
• Each element is a pointer (8 bytes)
• Each pointer points to array of ints

univ entry   Address   Stored pointer   Points to
univ[0]      160       36               mit
univ[1]      168       16               cmu
univ[2]      176       56               ucb

Array   Contents    Element addresses  End
cmu     1 5 2 1 3   16 20 24 28 32     36
mit     0 2 1 3 9   36 40 44 48 52     56
ucb     9 4 7 2 0   56 60 64 68 72     76


Element Access in Multi-Level Array


int get_univ_digit(size_t index, size_t digit)
{
    return univ[index][digit];
}

salq $2, %rsi              # 4*digit
addq univ(,%rdi,8), %rsi   # p = univ[index] + 4*digit
movl (%rsi), %eax          # return *p
ret

• Computation
  • Element access Mem[Mem[univ+8*index]+4*digit]
  • Must do two memory reads
    • First get pointer to row array
    • Then access element within array

Array Element Accesses


Nested array

int get_pgh_digit(size_t index, size_t digit)
{
    return pgh[index][digit];
}

Multi-level array

int get_univ_digit(size_t index, size_t digit)
{
    return univ[index][digit];
}

Accesses look similar in C, but the address computations are very different:

Nested array:        Mem[pgh+20*index+4*digit]
Multi-level array:   Mem[Mem[univ+8*index]+4*digit]
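
The two address computations can be spelled out in C with explicit byte arithmetic. This is an illustrative sketch, not compiler output: the _explicit function names are made up, and sizeof(zip_dig) and sizeof(int) stand in for the constants 20 and 4, which hold when int is 4 bytes.

```c
#include <stddef.h>

#define ZLEN 5
#define PCOUNT 4
#define UCOUNT 3
typedef int zip_dig[ZLEN];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

zip_dig pgh[PCOUNT] =
    {{1, 5, 2, 0, 6},
     {1, 5, 2, 1, 3},
     {1, 5, 2, 1, 7},
     {1, 5, 2, 2, 1}};

int *univ[UCOUNT] = {mit, cmu, ucb};

/* Nested array: one memory read at pgh + 20*index + 4*digit. */
int pgh_digit_explicit(size_t index, size_t digit) {
    const char *base = (const char *)pgh;
    return *(const int *)(base + sizeof(zip_dig) * index + sizeof(int) * digit);
}

/* Multi-level array: first read the row pointer univ[index],
   then read the int at that pointer + 4*digit. */
int univ_digit_explicit(size_t index, size_t digit) {
    const char *row = (const char *)univ[index];      /* first memory read  */
    return *(const int *)(row + sizeof(int) * digit); /* second memory read */
}
```

The nested version needs only arithmetic on a fixed base address, while the multi-level version cannot form the element address until the row pointer has been loaded.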


N X N Matrix Code

• Fixed dimensions
  • Know value of N at compile time

#define N 16
typedef int fix_matrix[N][N];
/* Get element a[i][j] */
int fix_ele(fix_matrix a, size_t i, size_t j)
{
    return a[i][j];
}

• Variable dimensions, explicit indexing
  • Traditional way to implement dynamic arrays

#define IDX(n, i, j) ((i)*(n)+(j))
/* Get element a[i][j] */
int vec_ele(size_t n, int *a, size_t i, size_t j)
{
    return a[IDX(n,i,j)];
}

• Variable dimensions, implicit indexing
  • Now supported by gcc

/* Get element a[i][j] */
int var_ele(size_t n, int a[n][n], size_t i, size_t j) {
    return a[i][j];
}

16 X 16 Matrix Access
• Array Elements
  • Address A + i * (C * K) + j * K
  • C = 16, K = 4

/* Get element a[i][j] */
int fix_ele(fix_matrix a, size_t i, size_t j) {
    return a[i][j];
}

# a in %rdi, i in %rsi, j in %rdx
salq $6, %rsi               # 64*i
addq %rsi, %rdi             # a + 64*i
movl (%rdi,%rdx,4), %eax    # M[a + 64*i + 4*j]
ret


n X n Matrix Access
• Array Elements
  • Address A + i * (C * K) + j * K
  • C = n, K = 4
  • Must perform integer multiplication, because n is not known at compile time

/* Get element a[i][j] */
int var_ele(size_t n, int a[n][n], size_t i, size_t j)
{
    return a[i][j];
}

# n in %rdi, a in %rsi, i in %rdx, j in %rcx
imulq %rdx, %rdi            # n*i
leaq (%rsi,%rdi,4), %rax    # a + 4*n*i
movl (%rax,%rcx,4), %eax    # M[a + 4*n*i + 4*j]
ret
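
The explicit-index and variable-length-array versions read the same element, and the byte offset follows the imulq/leaq/movl steps above. The sketch below assumes a compiler that accepts C99 variable-length array parameters (gcc does); the 3 x 3 demo matrix and var_offset are hypothetical additions for illustration.

```c
#include <stddef.h>

#define IDX(n, i, j) ((i)*(n)+(j))

/* Explicit indexing over a flat int array. */
int vec_ele(size_t n, int *a, size_t i, size_t j) {
    return a[IDX(n, i, j)];
}

/* Implicit indexing; n must be declared before a in the parameter list. */
int var_ele(size_t n, int a[n][n], size_t i, size_t j) {
    return a[i][j];
}

/* Example matrix, used only for this sketch. */
int demo[3][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };

/* Byte offset of a[i][j]: 4*(n*i) + 4*j when int is 4 bytes. */
size_t var_offset(size_t n, size_t i, size_t j) {
    return (n * i) * sizeof(int) + j * sizeof(int);
}
```

With n = 16 the multiplication by n*4 = 64 becomes the single shift seen in the fixed-size version; with a run-time n the multiply has to be performed.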


Structure Representation

struct rec {
    int a[4];
    size_t i;
    struct rec *next;
};

Layout of the block that r points to:

Field   Start offset   End offset
a       0              16
i       16             24
next    24             32

• Structure represented as block of memory
  • Big enough to hold all of the fields
• Fields ordered according to declaration
  • Even if another ordering could yield a more compact representation
• Compiler determines overall size + positions of fields
  • Machine-level program has no understanding of the structures in the source code
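
The compiler-chosen offsets are visible from C through offsetof and sizeof. A minimal sketch, assuming an x86-64 (LP64) target where size_t and pointers are 8 bytes; the off_* and rec_size helpers are made-up names.

```c
#include <stddef.h>

struct rec {
    int a[4];
    size_t i;
    struct rec *next;
};

/* Offsets and size as determined by the compiler. */
size_t off_a(void)    { return offsetof(struct rec, a); }
size_t off_i(void)    { return offsetof(struct rec, i); }
size_t off_next(void) { return offsetof(struct rec, next); }
size_t rec_size(void) { return sizeof(struct rec); }
```

On the assumed target these return 0, 16, 24 and 32, the same boundaries shown in the layout.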


Generating Pointer to Structure Member


struct rec {
    int a[4];
    size_t i;
    struct rec *next;
};

Layout: a at offset 0, i at offset 16, next at offset 24, end at offset 32.
Element a[idx] is at address r+4*idx.

• Generating Pointer to Array Element
  • Offset of each structure member determined at compile time
  • Compute as r + 4*idx

int *get_ap(struct rec *r, size_t idx)
{
    return &r->a[idx];
}

# r in %rdi, idx in %rsi
leaq (%rdi,%rsi,4), %rax
ret


Following Linked List

struct rec {
    int a[4];
    int i;
    struct rec *next;
};

Layout: a at offset 0, i at offset 16, next at offset 24, end at offset 32.
Field i selects which element of a is written ("Element i").

• C Code

void set_val(struct rec *r, int val)
{
    while (r) {
        int i = r->i;
        r->a[i] = val;
        r = r->next;
    }
}

Register   Value
%rdi       r
%rsi       val

.L11:                           # loop:
    movslq  16(%rdi), %rax      #   i = M[r+16]
    movl    %esi, (%rdi,%rax,4) #   M[r+4*i] = val
    movq    24(%rdi), %rdi      #   r = M[r+24]
    testq   %rdi, %rdi          #   Test r
    jne     .L11                #   if !=0 goto loop

Structures & Alignment

struct S1 {
    char c;
    int i[2];
    double v;
} *p;

• Unaligned Data

Field   Start   End
c       p       p+1
i[0]    p+1     p+5
i[1]    p+5     p+9
v       p+9     p+17

• Aligned Data
  • Primitive data type requires K bytes
  • Address must be multiple of K

Field       Start   End     Note
c           p+0     p+1     p+0 is a multiple of 8
(3 bytes)   p+1     p+4     padding
i[0]        p+4     p+8     p+4 is a multiple of 4
i[1]        p+8     p+12
(4 bytes)   p+12    p+16    padding
v           p+16    p+24    p+16 is a multiple of 8; end p+24 is a multiple of 8

Alignment Principles
• Aligned Data
  • Primitive data type requires K bytes
  • Address must be multiple of K
  • Required on some machines; advised on x86-64
• Motivation for Aligning Data
  • Memory accessed by (aligned) chunks of 4 or 8 bytes (system dependent)
    • Inefficient to load or store datum that spans quad word boundaries
    • Virtual memory trickier when datum spans 2 pages
• Compiler
  • Inserts gaps in structure to ensure correct alignment of fields


Specific Cases of Alignment (x86-64)

• 1 byte: char, …
  • no restrictions on address
• 2 bytes: short, …
  • lowest 1 bit of address must be 0 (binary)
• 4 bytes: int, float, …
  • lowest 2 bits of address must be 00 (binary)
• 8 bytes: double, long, char *, …
  • lowest 3 bits of address must be 000 (binary)
• 16 bytes: long double (GCC on Linux)
  • lowest 4 bits of address must be 0000 (binary)
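
The "lowest bits must be zero" rule is equivalent to masking the address with K - 1. The following is a sketch with made-up helper names; is_aligned only works for K that is a power of two, and the values returned by the align_of_* helpers depend on the target (the slide's numbers assume x86-64 with GCC on Linux).

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if addr is a multiple of k, where k is a power of two.
   This checks that the lowest log2(k) bits of addr are all 0. */
int is_aligned(uintptr_t addr, uintptr_t k) {
    return (addr & (k - 1)) == 0;
}

/* Alignment the compiler actually uses for a few types (C11 _Alignof). */
size_t align_of_short(void)  { return _Alignof(short); }
size_t align_of_int(void)    { return _Alignof(int); }
size_t align_of_double(void) { return _Alignof(double); }
```

For example, address 0x1004 ends in binary 100, so it passes the 4-byte check but fails the 8-byte check.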


Satisfying Alignment with Structures

struct S1 {
    char c;
    int i[2];
    double v;
} *p;

• Within structure:
  • Must satisfy each element's alignment requirement
• Overall structure placement
  • Each structure has alignment requirement K
    • K = Largest alignment of any element
  • Initial address & structure length must be multiples of K
• Example:
  • K = 8, due to double element

Field       Start   End     Note
c           p+0     p+1     p+0 is a multiple of 8
(3 bytes)   p+1     p+4     padding
i[0]        p+4     p+8     p+4 is a multiple of 4
i[1]        p+8     p+12
(4 bytes)   p+12    p+16    padding
v           p+16    p+24    p+16 is a multiple of 8; end p+24 is a multiple of 8
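
The padded layout of S1 can be confirmed with offsetof, sizeof and _Alignof. This sketch assumes an x86-64 target (4-byte int, 8-byte double aligned to 8); the s1_* helper names are hypothetical.

```c
#include <stddef.h>

struct S1 {
    char c;
    int i[2];
    double v;
};

size_t s1_off_c(void) { return offsetof(struct S1, c); }
size_t s1_off_i(void) { return offsetof(struct S1, i); }  /* after 3 bytes of padding */
size_t s1_off_v(void) { return offsetof(struct S1, v); }  /* after 4 bytes of padding */
size_t s1_size(void)  { return sizeof(struct S1); }
size_t s1_align(void) { return _Alignof(struct S1); }     /* K for the whole struct */
```

The 7 bytes of padding are exactly the gap between the 17 bytes of field data and the 24-byte total.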

Meeting Overall Alignment Requirement

• For largest alignment requirement K
• Overall structure must be multiple of K

struct S2 {
    double v;
    int i[2];
    char c;
} *p;

Field       Start   End
v           p+0     p+8
i[0]        p+8     p+12
i[1]        p+12    p+16
c           p+16    p+17
(7 bytes)   p+17    p+24    trailing padding

Total size 24 is a multiple of K=8
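
The trailing padding is what rounds the structure size up to a multiple of K. A minimal sketch of that rounding, assuming x86-64 sizes; round_up and s2_payload_end are illustrative helpers, not anything the compiler exposes.

```c
#include <stddef.h>

struct S2 {
    double v;
    int i[2];
    char c;
};

/* Round n up to the next multiple of k (k > 0). */
size_t round_up(size_t n, size_t k) {
    return (n + k - 1) / k * k;
}

/* Offset just past the last field, before any trailing padding. */
size_t s2_payload_end(void) {
    return offsetof(struct S2, c) + sizeof(char);
}
```

Here the fields end at offset 17, and rounding 17 up to a multiple of 8 gives the 24-byte size that sizeof reports on the assumed target.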


Arrays of Structures

struct S2 {
    double v;
    int i[2];
    char c;
} a[10];

• Overall structure length multiple of K
• Satisfy alignment requirement for every element

Element   Start   End
a[0]      a+0     a+24
a[1]      a+24    a+48
a[2]      a+48    a+72
• • •

Layout of a[1]:

Field       Start   End
v           a+24    a+32
i[0]        a+32    a+36
i[1]        a+36    a+40
c           a+40    a+41
(7 bytes)   a+41    a+48    padding

Accessing Array Elements

struct S3 {
    short i;
    float v;
    short j;
} a[10];

• Compute array offset 12*idx
  • sizeof(S3), including alignment spacers
• Element j is at offset 8 within structure
• Assembler gives offset a+8
  • Resolved during linking

Element   Start
a[0]      a+0
• • •     a+12, ...
a[idx]    a+12*idx

Layout of a[idx]:

Field       Start
i           a+12*idx
(2 bytes)   padding
v           a+12*idx+4
j           a+12*idx+8
(2 bytes)   padding

short get_j(int idx)
{
    return a[idx].j;
}

# %rdi = idx
leaq (%rdi,%rdi,2),%rax     # 3*idx
movzwl a+8(,%rax,4),%eax
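
The offset 12*idx + 8 is sizeof(struct S3) * idx plus the offset of j within the structure. The sketch below recomputes it and compares it with the address the compiler produces; j_offset and measured_j_offset are hypothetical helpers, and the constants assume 2-byte short and 4-byte float aligned to 4.

```c
#include <stddef.h>

struct S3 {
    short i;
    float v;
    short j;
} a[10];

short get_j(int idx) {
    return a[idx].j;
}

/* Byte offset of a[idx].j from a: sizeof(struct S3)*idx + offset of j. */
size_t j_offset(size_t idx) {
    return sizeof(struct S3) * idx + offsetof(struct S3, j);
}

/* The same offset, measured from the actual element address. */
size_t measured_j_offset(size_t idx) {
    return (size_t)((const char *)&a[idx].j - (const char *)a);
}
```

The assembly splits the same computation into 3*idx (leaq) and a scale factor of 4 in the memory operand, with the constant a+8 folded into the displacement.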

Saving Space
• Put large data types first

struct S4 {
    char c;
    int i;
    char d;
} *p;

struct S5 {
    int i;
    char c;
    char d;
} *p;

• Effect (K=4)

S4:   c | 3 bytes | i | d | 3 bytes    (12 bytes)
S5:   i | c | d | 2 bytes              (8 bytes)
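
The effect of field ordering shows up directly in sizeof. A small sketch, assuming 4-byte int with 4-byte alignment; the helper names are made up for illustration.

```c
#include <stddef.h>

struct S4 {   /* small, large, small: padding after c and after d */
    char c;
    int i;
    char d;
};

struct S5 {   /* large first: only trailing padding is needed */
    int i;
    char c;
    char d;
};

size_t s4_size(void)     { return sizeof(struct S4); }
size_t s5_size(void)     { return sizeof(struct S5); }
size_t bytes_saved(void) { return s4_size() - s5_size(); }
```

Both structures hold the same 6 bytes of data; only the padding differs.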


Summary
• Arrays
  • Elements packed into contiguous region of memory
  • Use index arithmetic to locate individual elements
• Structures
  • Elements packed into single region of memory
  • Access using offsets determined by compiler
  • May require internal and external padding to ensure alignment
