Data Structures C1
Contents
1 Introduction 1
1.1 What this book is, and what it isn’t . . . . . . . . . . . . . . . . 1
1.2 Assumed knowledge . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2.1 Big Oh notation . . . . . . . . . . . . . . . . . . . . . . . 1
1.2.2 Imperative programming language . . . . . . . . . . . . . 3
1.2.3 Object oriented concepts . . . . . . . . . . . . . . . . . . 4
1.3 Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Tips for working through the examples . . . . . . . . . . . . . . . 6
1.5 Book outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.6 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.7 Where can I get the code? . . . . . . . . . . . . . . . . . . . . . . 7
1.8 Final messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
I Data Structures 8
2 Linked Lists 9
2.1 Singly Linked List . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1.1 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.2 Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.3 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.4 Traversing the list . . . . . . . . . . . . . . . . . . . . . . 12
2.1.5 Traversing the list in reverse order . . . . . . . . . . . . . 13
2.2 Doubly Linked List . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.2 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.3 Reverse Traversal . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Smartzworld.com 1 jntuworldupdates.org
Smartworld.asia Specworld.in
3.7.2 Postorder . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.7.3 Inorder . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.7.4 Breadth First . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Heap 32
4.1 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.3 Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.4 Traversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5 Sets 44
5.1 Unordered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.1.1 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.2 Ordered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6 Queues 48
6.1 A standard queue . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.2 Priority Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.3 Double Ended Queue . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
7 AVL Tree 54
7.1 Tree Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
7.2 Tree Rebalancing . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
7.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
II Algorithms 62
8 Sorting 63
8.1 Bubble Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
8.2 Merge Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
8.3 Quick Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
8.4 Insertion Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
8.5 Shell Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
8.6 Radix Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
8.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9 Numeric 72
9.1 Primality Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.2 Base conversions . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.3 Attaining the greatest common denominator of two numbers . . 73
9.4 Computing the maximum value for a number of a specific base
consisting of N digits . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.5 Factorial of a number . . . . . . . . . . . . . . . . . . . . . . . . 74
9.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
10 Searching 76
10.1 Sequential Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
10.2 Probability Search . . . . . . . . . . . . . . . . . . . . . . . . . . 76
10.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
11 Strings 79
11.1 Reversing the order of words in a sentence . . . . . . . . . . . . . 79
11.2 Detecting a palindrome . . . . . . . . . . . . . . . . . . . . . . . 80
11.3 Counting the number of words in a string . . . . . . . . . . . . . 81
11.4 Determining the number of repeated words within a string . . . . 83
11.5 Determining the first matching character between two strings . . 84
11.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
A Algorithm Walkthrough 86
A.1 Iterative algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 86
A.2 Recursive Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 88
A.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
B Translation Walkthrough 91
B.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
D Testing 97
D.1 What constitutes a unit test? . . . . . . . . . . . . . . . . . . . . 97
D.2 When should I write my tests? . . . . . . . . . . . . . . . . . . . 98
D.3 How seriously should I view my test suite? . . . . . . . . . . . . . 99
D.4 The three A’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
D.5 The structuring of tests . . . . . . . . . . . . . . . . . . . . . . . 99
D.6 Code Coverage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
D.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Chapter 1
Introduction
1. Big Oh notation
2. An imperative programming language
3. Object oriented concepts
Figure 1.1 shows some of the run times to demonstrate how important it is to
choose an efficient algorithm. For the sanity of our graph we have omitted cubic
O(n^3) and exponential O(2^n) run times. Cubic and exponential algorithms
should only ever be used for very small problems (if ever!); avoid them if at all
possible.
The following list explains some of the most common big Oh notations:
O(1) constant: the run time doesn’t depend on the size of the input, e.g. adding
a node to the tail of a linked list where we always maintain a pointer to
the tail node.
O(n) linear: the run time is proportional to n.
O(log n) logarithmic: normally associated with algorithms that break the problem
into smaller chunks on each invocation, e.g. searching a balanced binary
search tree.
O(n log n) linearithmic: usually associated with an algorithm that breaks the problem
into smaller chunks on each invocation, and then takes the results of these
smaller chunks and stitches them back together, e.g. merge sort, or quick
sort in the average case.
O(n^2) quadratic: e.g. bubble sort.
O(n^3) cubic: very rare.
O(2^n) exponential: incredibly rare.
If you encounter either of the latter two items (cubic and exponential) this is
really a signal for you to review the design of your algorithm. While prototyp-
ing algorithm designs you may just have the intention of solving the problem
irrespective of how fast it works. We would strongly advise that you always
review your algorithm design and optimise where possible—particularly loops
and recursive calls—so that you can get the most efficient run times for your
algorithms.
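To make the growth rates listed earlier concrete, the following minimal Python sketch tabulates an idealised step count for each of them. The names GROWTH_RATES and steps are our own illustrative choices, and the counts ignore constant factors and lower-order terms, so they show only the shape of the growth, not real run times.

```python
import math

# Idealised step counts for each growth rate named in the list.
# Constant factors and lower-order terms are deliberately ignored.
GROWTH_RATES = {
    "O(1)": lambda n: 1,
    "O(log n)": lambda n: math.log2(n),
    "O(n)": lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)": lambda n: n ** 2,
    "O(n^3)": lambda n: n ** 3,
    "O(2^n)": lambda n: 2 ** n,
}


def steps(n):
    """Map each growth rate to its idealised step count for input size n."""
    return {name: rate(n) for name, rate in GROWTH_RATES.items()}


if __name__ == "__main__":
    # Print one row per input size so the rates can be compared side by side.
    for n in (8, 16, 32, 64):
        print(n, {name: round(count) for name, count in steps(n).items()})
```

Even at n = 64 the cubic and exponential rows dwarf the others, which is why those two rates are treated as a design warning.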
The biggest asset that big Oh notation gives us is that it allows us to es-
sentially discard things like hardware. If you have two algorithms, one with a
quadratic run time and the other with a logarithmic run time, then the
logarithmic algorithm will be faster than the quadratic one once the data set
becomes suitably large. This applies even if the quadratic algorithm is run on a
machine that is far faster than the one running the logarithmic algorithm. Why?
Because big Oh notation isolates a key factor in algorithm analysis: growth. An
algorithm with a quadratic run time grows faster than one with a logarithmic run
time, so at some point as n → ∞ the logarithmic algorithm will become faster
than the quadratic algorithm.
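The hardware argument can be checked with a small Python sketch. It assumes an idealised cost model in which a quadratic algorithm performs exactly n^2 operations and a logarithmic one performs log2(n) operations; the function names and the machine speeds are hypothetical and chosen only for illustration.

```python
import math


def quadratic_seconds(n, ops_per_second):
    """Idealised run time of an algorithm that performs n^2 operations."""
    return (n ** 2) / ops_per_second


def logarithmic_seconds(n, ops_per_second):
    """Idealised run time of an algorithm that performs log2(n) operations."""
    return math.log2(n) / ops_per_second


def crossover(fast_ops, slow_ops):
    """Smallest power-of-two n at which the logarithmic algorithm on the slow
    machine finishes before the quadratic algorithm on the fast machine."""
    n = 2
    while logarithmic_seconds(n, slow_ops) >= quadratic_seconds(n, fast_ops):
        n *= 2
    return n


if __name__ == "__main__":
    # Assumed speeds: the quadratic algorithm gets a machine 1000 times faster.
    print(crossover(fast_ops=1_000_000_000, slow_ops=1_000_000))
```

Under these assumptions the logarithmic algorithm is already ahead at n = 128, despite the thousandfold hardware handicap.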
Big Oh notation also acts as a communication tool. Picture the scene: you
are having a meeting with some fellow developers within your product group.
You are discussing prototype algorithms for node discovery in massive networks.
Several minutes elapse after you and two others have discussed your respective
algorithms and how they work. Does this give you a good idea of how fast each
respective algorithm is? No. The result of such a discussion will tell you more
about the high level algorithm design rather than its efficiency. Replay the scene
back in your head, but this time as well as talking about algorithm design each
respective developer states the asymptotic run time of their algorithm. Using
the latter approach you not only get a good general idea about the algorithm
design, but also key efficiency data which allows you to make better choices
when it comes to selecting an algorithm fit for purpose.
Some readers may actually work in a product group where they are given
budgets per feature. Each feature carries a budget that represents its upper
time bound. If you save some time in one feature it doesn’t necessarily
give you a buffer for the remaining features. Imagine you are working on an
application, and you are in the team that is developing the routines that
spin up everything that is required when the application is started.
Everything is great until your boss comes in and tells you that the start up
time should not exceed n ms. The efficiency of every algorithm that is invoked
during start up in this example is absolutely key to a successful product. Even
if you don’t have these budgets you should still strive for optimal solutions.
Taking a quantitative approach to many software development properties
will make you a far better programmer: measuring one’s work is critical to
success.
1. C++
2. C#
3. Java
The reason that we are explicit in this requirement is simple—all our imple-
mentations are based on an imperative thinking style. If you are a functional
programmer you will need to apply various aspects from the functional paradigm
to produce efficient solutions with respect to your functional language whether
it be Haskell, F#, OCaml, etc.
Two of the languages that we have listed (C# and Java) target virtual
machines which provide various things like security sand boxing, and memory
management via garbage collection algorithms. It is trivial to port our imple-
mentations to these languages. When porting to C++ you must remember to
use pointers for certain things. For example, when we describe a linked list
node as having a reference to the next node, this description is in the context
of a managed environment. In C++ you should interpret the reference as a
pointer to the next node and so on. For programmers who have a fair amount
of experience with their respective language these subtleties will present no is-
sue, which is why we really do emphasise that the reader must be comfortable
with at least one imperative language in order to successfully port the pseudo-
implementations in this book.
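As an illustration of the "reference to the next node" wording, here is a minimal sketch in Python, which is garbage collected like the managed languages mentioned above but is not one of the languages the book lists. The names Node, LinkedList, add and to_list are hypothetical. In C++ the next, head and tail fields would typically be pointers to a node instead.

```python
class Node:
    """A singly linked list node; `next` is a reference to the following node."""

    def __init__(self, value):
        self.value = value
        self.next = None  # no following node yet


class LinkedList:
    """Keeps references to both ends so that appending does not need a traversal."""

    def __init__(self):
        self.head = None
        self.tail = None

    def add(self, value):
        """Append a value at the tail in O(1) using the maintained tail reference."""
        node = Node(value)
        if self.head is None:
            self.head = node       # first node is both head and tail
        else:
            self.tail.next = node  # link the old tail to the new node
        self.tail = node

    def to_list(self):
        """Follow the next references from head to tail, collecting the values."""
        values = []
        current = self.head
        while current is not None:
            values.append(current.value)
            current = current.next
        return values
```

Because the tail reference is always kept up to date, add never walks the list, which is the constant-time behaviour described for tail insertion.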
It is essential that you are familiar with primitive imperative language
constructs before reading this book; otherwise you will get lost. Some algo-
rithms presented in this book can be confusing to follow even for experienced
programmers!
1.3 Pseudocode
Throughout this book we use pseudocode to describe our solutions. For the
most part interpreting the pseudocode is trivial as it looks very much like a
more abstract C++, or C#, but there are a few things to point out:
1. Pre-conditions should always be enforced
2. Post-conditions represent the result of applying algorithm a to data struc-
ture d
Immediately after the algorithm signature we list any pre- or post-condi-
tions.
algorithm AlgorithmName(n)
    Pre: n is the value to compute the factorial of
         n ≥ 0
    Post: the factorial of n has been computed
    // ...
end AlgorithmName
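One possible translation of this template into a real language is sketched below in Python. The function name factorial and the choice of ValueError to enforce the pre-condition are our assumptions; the pseudocode itself does not prescribe how a violated pre-condition should be reported.

```python
def factorial(n):
    """Pre:  n is the value to compute the factorial of, n >= 0
    Post: the factorial of n has been computed and is returned
    """
    # Enforce the pre-condition before doing any work.
    if n < 0:
        raise ValueError("pre-condition violated: n must be >= 0")

    result = 1
    for i in range(2, n + 1):
        result *= i
    return result  # post-condition: result is the factorial of n
```

Checking the pre-condition first means a caller finds out about invalid input immediately, not through a wrong result later on.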
The reader doesn’t have to read the book sequentially from beginning to
end: chapters can be read independently from one another. We suggest that
in part 1 you read each chapter in its entirety, but in part 2 you can get away
with just reading the section of a chapter that describes the algorithm you are
interested in.
Each of the chapters on data structures initially presents the algorithms con-
cerned with:
1. Insertion
2. Deletion
3. Searching
The previous list represents what we believe to be, in the vast majority of
cases, the most important operations for each respective data structure.
For all readers we recommend that before looking at any algorithm you
quickly look at Appendix E which contains a table listing the various symbols
used within our algorithms and their meaning. One keyword that we would like
to point out here is yield. You can think of yield in the same light as return.
The return keyword causes the method to exit and returns control to the caller,
whereas yield returns each value to the caller. With yield, control only returns
to the caller once all the values to be returned have been exhausted.
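The difference between the two keywords can be sketched in Python, whose generators offer a yield keyword. Treat the mapping as an assumption: Python hands over yielded values lazily, one at a time as the caller asks for them, so its control flow may differ in detail from the pseudocode keyword described above. The function names are hypothetical.

```python
def first_even(values):
    """`return` exits the function as soon as a single value has been found."""
    for value in values:
        if value % 2 == 0:
            return value
    return None  # no even value was present


def all_evens(values):
    """`yield` hands each matching value to the caller, one after another."""
    for value in values:
        if value % 2 == 0:
            yield value


if __name__ == "__main__":
    numbers = [1, 4, 6, 8]
    print(first_even(numbers))       # a single value
    print(list(all_evens(numbers)))  # every value that was yielded
```

A traversal algorithm is the typical candidate for yield, because the caller wants every node's value and not just the first one.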
1.6 Testing
All the data structures and algorithms have been tested using a minimised
test-driven development style on paper to flesh out the pseudocode algorithm.
We then transcribe these tests into unit tests, satisfying them one by one. When
all the test cases have been progressively satisfied we consider that algorithm
suitably tested.
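As a rough sketch of what such transcribed test cases might look like, the following uses Python's standard unittest module. The function under test, contains, and the three cases are hypothetical examples of ours and are not taken from the book's own test suite.

```python
import unittest


def contains(items, target):
    """Hypothetical function under test: a sequential search over a list."""
    for item in items:
        if item == target:
            return True
    return False


class ContainsTests(unittest.TestCase):
    """Each test method is one case worked out on paper, then transcribed."""

    def test_empty_list_contains_nothing(self):
        self.assertFalse(contains([], 3))

    def test_finds_value_that_is_present(self):
        self.assertTrue(contains([1, 2, 3], 3))

    def test_reports_value_that_is_absent(self):
        self.assertFalse(contains([1, 2, 3], 7))
```

Saved in a file, these cases can be run with python -m unittest. Each case can be added and satisfied in turn before moving on to the next one.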
For the most part, algorithms have fairly obvious cases which need to be
satisfied. Some, however, have many areas which can prove to be more complex
to satisfy. With such algorithms we will point out the test cases which are tricky
and the corresponding portions of pseudocode within the algorithm that satisfy
each respective case.
As you become more familiar with the actual problem you will be able to
intuitively identify areas which may cause problems for your algorithm’s imple-
mentation. In some cases this will yield an overwhelming list of concerns which
will greatly hinder your ability to design an algorithm. When you are bom-
barded with so many concerns, look at the overall problem again
and sub-divide it into smaller problems. Solving the smaller problems
and then composing them is a far easier task than clouding your mind with too
many little details.
The only type of testing that we use in the implementation of all that is
provided in this book is unit testing. Because unit tests are such a core
part of creating more stable software we invite the reader to view
Appendix D which describes testing in more depth.
If you always follow these key points, you will get the most out of this book.
1 All readers are encouraged to provide suggestions, feature requests, and bugs so we can
Part I
Data Structures