03 - Dynamic Arrays and Linked Lists
Building Blocks
● An important concept in this course is that of the abstract data type (or ADT).
An ADT is simply a data type as viewed by the user of that data type (e.g. a
programmer using it in their code).
● We will explore several common ADTs in this course, including stacks, queues,
trees, priority queues, graphs, etc.
● However, it will also be our goal in this course to implement many ADTs. At this
point, we will need to start worrying about implementation details, including
specifically how data is stored in memory, what algorithms are used to implement
operations, etc. This view of a data type, from the implementor’s perspective, is
known as a data structure.
○ In other words, data structures serve as the building blocks of ADTs.
● Two of the most essential of these building blocks are the dynamic array and
the linked list. We’ll be able to use these to build a number of different ADTs.
Thus, these are the first data structures we’ll explore here.
Dynamic arrays
● Arrays are a very useful data type. One of the main reasons arrays are so useful
is the fact that they are stored in a contiguous block of memory (i.e. a collection
of memory where the memory blocks have sequential/consecutive addresses).
● Because of this, arrays allow direct access (also called random access) to the
data they contain. That is, they allow each individual element of data to be
accessed directly in the same amount of time it takes to access any other
element of the data, regardless of how many individual elements are stored.
● For example, just think about a large array allocated in a C program, e.g. one
with room for 1,000,000 elements.
● In this array, we can access any element simply by using its array subscript.
Moreover, it will take the same amount of time to access any individual element,
regardless of where in the array that element lives. For example, these two
instructions will take the same amount of time:
array[0] = 0;
array[999999] = 0;
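● As a minimal sketch of this idea, the code below allocates a large array with
malloc() and writes to its first and last elements. The element count of
1,000,000 and the helper names make_array and touch_ends are assumptions made
for illustration; they are chosen only so that the subscript 999999 is valid.

```c
#include <assert.h>
#include <stdlib.h>

#define ARRAY_LEN 1000000  /* assumed element count, so array[999999] is valid */

/* Allocates one contiguous block of ARRAY_LEN ints; returns NULL on failure. */
int *make_array(void) {
    return malloc(ARRAY_LEN * sizeof(int));
}

/*
 * Writes to the first and last elements. Each write only needs the address
 * base + index * sizeof(int), so neither one depends on the array's length.
 */
void touch_ends(int *array) {
    assert(array != NULL);
    array[0] = 0;
    array[ARRAY_LEN - 1] = 0;
}
```

● A caller would check the pointer returned by make_array() against NULL before
using it, and pass it to free() when finished.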
● One of the main drawbacks of a standard array (technically called a static array)
is that it has a fixed size, and this size must be specified when the
array is created (e.g. specified in the call to malloc() in a C program).
● This might be a problem if we don’t know exactly how much data we’ll want to
store in the array. For example, if we end up needing to store more data than we
initially allocated memory for, then we will need to allocate more memory.
● A dynamic array is a data structure that, unlike a static array, doesn’t have a
fixed capacity. Instead, a dynamic array has a variable size and can grow as
needed as more elements are inserted into it.
● In other words, a dynamic array is an array that hides the details of managing the
underlying memory storage (e.g. allocating more memory when needed) behind
a simple interface.
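● One way to picture such an interface in C is sketched below. The fields data,
size, and capacity follow the terminology used in these notes, but the names
struct dynarray, dynarray_create, dynarray_get, and dynarray_free, as well as
the starting capacity of 4, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical dynamic array of ints. */
struct dynarray {
    int *data;     /* underlying data storage array */
    int size;      /* number of elements currently stored */
    int capacity;  /* number of elements data has room for */
};

/* Creates an empty dynamic array; the starting capacity of 4 is arbitrary. */
struct dynarray *dynarray_create(void) {
    struct dynarray *da = malloc(sizeof(struct dynarray));
    if (da == NULL) {
        return NULL;
    }
    da->data = malloc(4 * sizeof(int));
    if (da->data == NULL) {
        free(da);
        return NULL;
    }
    da->size = 0;
    da->capacity = 4;
    return da;
}

/* Returns the element at index idx, which must be a valid index. */
int dynarray_get(const struct dynarray *da, int idx) {
    assert(da != NULL && idx >= 0 && idx < da->size);
    return da->data[idx];
}

/* Releases the storage array and then the structure itself. */
void dynarray_free(struct dynarray *da) {
    if (da == NULL) {
        return;
    }
    free(da->data);
    free(da);
}
```

● The point of the sketch is that a user only calls these functions and never
touches malloc() or free() directly.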
● A more challenging situation is when a user wants to insert into a dynamic array
whose size is equal to its capacity. In this situation, there isn’t any available
space in which we can just insert the new element, so we need to increase the
capacity of the underlying data storage array (i.e. data) in order to make space
for the new element.
● In this situation, where size is equal to its capacity, insertion into the dynamic
array requires several steps. We’ll walk through an example to illustrate all of
them. Imagine that before the insert operation was called, our dynamic array
looked like this (again, assuming for simplicity that we’re only storing integers):
size = 4
capacity = 4
data: 5, 13, 8, 31
● Let’s now imagine that the user calls the insert operation to insert the value 16
at the end of the array. Our first step here will be to allocate a new array that has
twice the capacity of the current underlying data storage array. For the moment,
this new array will not be part of the dynamic array structure itself. In other
words, the underlying data storage array (i.e. data) will not change quite yet.
● Once the new, bigger array (let’s call it new_data) is allocated, we’ll copy all of
the values from the current data storage array into the corresponding locations in
the new array (i.e. copying from data to new_data):
size = 4
capacity = 4
data: 5, 13, 8, 31
new_data: 5, 13, 8, 31 (followed by four unused slots)
● Once the stored values are copied over to the new array, we actually no longer
need the old data storage array, so we can free it and update the dynamic array
to use the new array (i.e. new_data) as its underlying data storage array
(updating the dynamic array’s capacity accordingly):
size = 4
capacity = 8
free(old data: 5, 13, 8, 31)
data (formerly new_data): 5, 13, 8, 31
● Finally, with the additional capacity we’ve gained by doubling the capacity of the
underlying data storage array, we have space to insert the new value. We can
do that now, increasing the dynamic array’s size by 1 to reflect the new stored
value:
size = 5
capacity = 8
data: 5, 13, 8, 31, 16
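● The steps of this walkthrough can be sketched in C roughly as follows. This is
a minimal illustration, not a definitive implementation: the names
struct dynarray and dynarray_insert and the 0 / -1 return convention are
assumptions, and the sketch only inserts at the end of the array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical dynamic array of ints (field names follow the notes). */
struct dynarray {
    int *data;
    int size;
    int capacity;
};

/*
 * Inserts val at the end of the array, doubling the capacity first if the
 * array is full. Returns 0 on success or -1 if allocation fails.
 * Assumes capacity is at least 1, since doubling 0 would not help.
 */
int dynarray_insert(struct dynarray *da, int val) {
    assert(da != NULL && da->capacity > 0);
    if (da->size == da->capacity) {
        /* Step 1: allocate a new array with twice the capacity. */
        int new_capacity = 2 * da->capacity;
        int *new_data = malloc(new_capacity * sizeof(int));
        if (new_data == NULL) {
            return -1;
        }
        /* Step 2: copy each stored value to the same index in new_data. */
        for (int i = 0; i < da->size; i++) {
            new_data[i] = da->data[i];
        }
        /* Step 3: free the old array and switch over to the new one. */
        free(da->data);
        da->data = new_data;
        da->capacity = new_capacity;
    }
    /* Step 4: there is now room, so store the value and grow size by 1. */
    da->data[da->size] = val;
    da->size++;
    return 0;
}
```

● Starting from a full array holding 5, 13, 8, 31 with capacity 4, calling
dynarray_insert with the value 16 would leave size at 5 and capacity at 8.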
● We’ll formally analyze the performance of the dynamic array a little later in the
course. For now, though, try to think about what some advantages of the
dynamic array are and what some of its disadvantages are.
Linked lists
● Linked lists (like dynamic arrays) are a linear data structure. In other words,
data in a linked list (like data in a dynamic array) forms a linear sequence, with
the individual data elements placed one after the other within the data structure.
● A linked list differs from a dynamic array, however, in that each individual value in
a linked list is stored in its own small structure called a link, and these individual
links are chained together into a sequence by having each one “point” to the next
(and sometimes the previous) link in the list.
● In other words, each link in a linked list stores exactly one value and (at a
minimum) points to the next link in the list. Thus, a simple link structure needs at
least two fields:
○ value – This is where the value associated with the link is stored.
○ next – This points to the next link in the list (or to NULL, if there is no next
link).
● A linked list in which each link points only to the next link in the list is known as a
singly-linked list.
● Importantly, a linked list always contains exactly as many links as it has stored
values, and links are allocated and freed as values are added and removed,
respectively, from the list.
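● A simple link like the one described above might be declared in C as in the
following sketch. The fields value and next come from the description; storing
an int, and the helper names link_create and link_set_next, are assumptions
made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* One link of a singly-linked list of ints. */
struct link {
    int value;          /* the value associated with this link */
    struct link *next;  /* the next link in the list, or NULL if none */
};

/* Allocates a link holding val with no next link; returns NULL on failure. */
struct link *link_create(int val) {
    struct link *l = malloc(sizeof(struct link));
    if (l == NULL) {
        return NULL;
    }
    l->value = val;
    l->next = NULL;
    return l;
}

/* Chains two links together by making first point to second. */
void link_set_next(struct link *first, struct link *second) {
    assert(first != NULL);
    first->next = second;
}
```

● Each stored value gets its own call to link_create, and each removed value
would have its link passed to free().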
● The simplest form of linked list keeps track only of the first element in the list,
which is known as the head of the list. For example, here’s what a singly-linked
list would look like if we were keeping track of just the head (e.g. in a pointer
called head):
head -> [value | next] -> [value | next] -> [value | next] -> NULL
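● With only a head pointer, every operation on the list starts at the head and
follows next fields until it reaches NULL. The sketch below counts the links
this way. The names struct list and list_length are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

struct link {
    int value;
    struct link *next;
};

/* A singly-linked list that keeps track only of its head. */
struct list {
    struct link *head;  /* first link, or NULL if the list is empty */
};

/* Counts the links by walking from the head until next is NULL. */
int list_length(const struct list *list) {
    assert(list != NULL);
    int count = 0;
    for (const struct link *cur = list->head; cur != NULL; cur = cur->next) {
        count++;
    }
    return count;
}
```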
● There are several variations on this simple singly-linked list. One of these
involves keeping track of both the head of the list and the tail, or last element.
This allows us to have easy access to both ends of the list. It would look
something like this:
head -> [value | next] -> [value | next] -> [value | next] -> NULL
tail points to the last link
● In a doubly-linked list, each link also points to the previous link (e.g. through
a prev field), so it’s easy to move both forward and backward in the list,
whereas in a singly-linked list, it’s really only possible to move forward from one
link to the next one.
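● To make the difference concrete, here is a minimal sketch of a doubly-linked
list that tracks both its head and its tail and is walked backward from the
tail. The names struct dlink, struct dlist, and dlist_copy_backward are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One link of a doubly-linked list of ints. */
struct dlink {
    int value;
    struct dlink *prev;  /* previous link, or NULL at the head */
    struct dlink *next;  /* next link, or NULL at the tail */
};

/* A list that keeps track of both ends. */
struct dlist {
    struct dlink *head;
    struct dlink *tail;
};

/*
 * Copies up to max values into out, starting at the tail and following
 * prev fields toward the head. Returns the number of values copied.
 */
int dlist_copy_backward(const struct dlist *list, int *out, int max) {
    assert(list != NULL && out != NULL);
    int n = 0;
    for (const struct dlink *cur = list->tail;
         cur != NULL && n < max;
         cur = cur->prev) {
        out[n] = cur->value;
        n++;
    }
    return n;
}
```

● A singly-linked list has no prev field, so the same backward walk would need
a fresh pass from the head for every step.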
● We can always insert a new element at any point in the list, either between two
existing links, at the head of the list, or at the tail of the list. We’ll step through an
example here where we insert a new link between two existing links, pointing out
the ways insertion would differ if we were inserting at the head or tail.
[value: 8 | next]
● Note here that if we were inserting the value 8 at the end of the list (i.e. at the
tail), the link containing 4 would have a next field that pointed to NULL, so our
new link containing 8 would also have its next field set to point to NULL. In
other words, regardless of where a new link is being inserted, its next field will
always be set to point to the same place as the next field of the link after which
the new one is being inserted.
● Again, note that if we were inserting a new link at the head of the list, we’d
update the entire list’s head pointer to point to the new link, since there would be
no other link before the new one.
● Now the new value is fully inserted into the list, as we can both reach it from a
previous link and reach the next link from it.
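● The insertion steps discussed above can be sketched in C as follows. This is
one possible arrangement, not a definitive implementation: the names
struct list and list_insert_after, the convention that a NULL prev means
"insert at the head", and the 0 / -1 return values are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>

struct link {
    int value;
    struct link *next;
};

struct list {
    struct link *head;
};

/*
 * Inserts val in a new link placed right after prev. If prev is NULL, the
 * new link becomes the head. Returns 0 on success, -1 if allocation fails.
 */
int list_insert_after(struct list *list, struct link *prev, int val) {
    assert(list != NULL);
    struct link *new_link = malloc(sizeof(struct link));
    if (new_link == NULL) {
        return -1;
    }
    new_link->value = val;
    if (prev == NULL) {
        /* No link before the new one: it points to the old head,
         * and the list's head pointer is updated to point to it. */
        new_link->next = list->head;
        list->head = new_link;
    } else {
        /* The new link points where prev pointed (NULL at the tail),
         * and then prev is updated to point to the new link. */
        new_link->next = prev->next;
        prev->next = new_link;
    }
    return 0;
}
```

● The order of the two pointer updates matters: if prev->next were overwritten
first, the rest of the list would no longer be reachable.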
● Imagine we’re working with a singly-linked list that contains (at least) the
following links:
... -> [value: 4 | next] -> [value: 8 | next] -> [value: 16 | next] -> ...
● If there were no link before the one being removed here (e.g. if the link
containing 8 were the head of the list), we’d simply update the list’s head pointer
to point around the link being removed, i.e. to the link after it. Similarly, if we
were removing the link at the end of the list (i.e. the tail), we would update the
previous link’s next field to point to NULL.
● At this point, the link that we’re removing is effectively taken out of the list, since
we can’t reach it from any other link. All we need to do now is free that link:
free([value | next])    [value | next] -> [value | next]
● Note that we didn’t even need to make any modifications to the link being
removed (including updating its next field), since once the link is freed, its
memory no longer belongs to the list and its fields must not be accessed again.
● Again, if we were working within a doubly-linked list, we’d have to update the
prev field of the link after the one being removed.
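● For a singly-linked list, the removal steps discussed above might look like
the following sketch. Finding the link by searching for a value, the names
struct list and list_remove, and the 1 / 0 return values are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct link {
    int value;
    struct link *next;
};

struct list {
    struct link *head;
};

/*
 * Removes the first link whose value equals val.
 * Returns 1 if a link was removed, 0 if no link held that value.
 */
int list_remove(struct list *list, int val) {
    assert(list != NULL);
    struct link *prev = NULL;
    struct link *cur = list->head;
    /* Walk forward, remembering the link before the current one. */
    while (cur != NULL && cur->value != val) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == NULL) {
        return 0;
    }
    if (prev == NULL) {
        /* Removing the head: the head pointer moves to the next link. */
        list->head = cur->next;
    } else {
        /* Point around the removed link (this is NULL at the tail). */
        prev->next = cur->next;
    }
    /* The link is now unreachable from the list, so it can be freed. */
    free(cur);
    return 1;
}
```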
● Just like with the dynamic array, we’ll more formally analyze the performance of
the linked list a little later in the course. Again, though, spend a minute now to
think about what some of the advantages and disadvantages of the linked list
might be.