Data Structures Through C

A field of computer education, a family of computer-literate people: anyone who has graduated in Computers or Electronics has graduated with BPB books, and people who have achieved this acumen and expertise share in the perpetual quest for excellence by BPB. With such adornments as 6000 publications, over 90 million books sold over the years and about the same number of readers, it is a matter of no surprise that BPB titles are prescribed as standard courseware at most of the leading schools, institutes and universities in India.

Our chairman Shri G.C. Jain has been honoured with the PADMASHREE award by the Hon’ble President of India for his great contribution in spreading IT education in India. There is no doubt that BPB has reached out and changed the lives of millions of people. All over India and Asia, it is its endeavour to do more.

www.bpbonline.com
Computer Books | E Books | Video Courses
Data Structures Through C in Depth
Second Revised & Updated Edition
by
S.K. Srivastava
Deepali Srivastava

BPB PUBLICATIONS
20 Ansari Road, Darya Ganj, New Delhi-110002
REPRINTED 2021
SECOND REVISED & UPDATED EDITION 2011
Copyright © 2011 BPB Publications, INDIA
ISBN 10: 81-7656-741-8
ISBN 13: 978-81-7656-741-1
All Rights Reserved. No part of this publication can be stored in a retrieval system or
reproduced in any form or by any means without the prior written permission of the
publishers.
The Author and Publisher of this book have tried their best to ensure that the programs,
procedures and functions described in the book are correct. However, the author and the
publishers make no warranty of any kind, expressed or implied, with regard to these
programs or the documentation contained in the book. The Author & Publisher shall not be
liable in any event for any damages, incidental or consequential, in connection with, or arising
out of the furnishing, performance or use of these programs, procedures and functions.
Product names mentioned are used for identification purposes only and may be trademarks of
their respective companies.
All trademarks referred to in the book are acknowledged as properties of their respective
owners.
Published by Manish Jain for BPB Publications, 20 Ansari Road, Darya Ganj, New Delhi-
110002, and Printed by Adinath Printer, New Delhi.
To our Parents
and
our lovely daughter
DEVANSHI
The study of data structures is an integral part of any course in Computer Science. This book focuses on various
data structures and their implementation in the C language. The prerequisite for this book is a basic knowledge of
programming in C. We assume that the reader is familiar with structured programming concepts and the syntax
of C. This book can be used by students for self-study, as the concepts are explained in a step-by-step manner
followed by clear and easy-to-comprehend complete programs. We have followed a figure-oriented approach in
this book, so there are numerous figures and tables throughout the book to illustrate the working of algorithms.
This is the second edition of the book; most of the topics have been thoroughly rewritten and many new topics
have been added, but we have tried to retain the presentation style and simplicity of the previous edition. A brief
synopsis of the book follows-
• Chapter 1 is an introduction to data structures, abstract data types, algorithms and efficiency.
• Chapter 2 covers fundamental topics of C, like arrays, pointers and structures. It helps in refreshing your
knowledge of C, which is essential for understanding the implementations of various algorithms given in the
book.
• Chapter 3 introduces the linked list data structure and discusses its different variations.
• Chapter 4 covers the data structures stacks and queues and their different implementations. Applications of
stacks are discussed in detail.
• Chapter 5 discusses the concept of recursion. Many students have problems in understanding recursive
solutions, so we decided to add a chapter on recursion in the new edition. This chapter explains recursion by
tracing different recursive examples.
• Chapter 6 deals with an important data structure, Trees. Traversal, insertion and deletion in binary search trees,
threaded trees, AVL trees, red black trees and B-trees are explained in detail with various examples.
• Chapter 7 introduces the data structure Graphs. It covers different implementations of graphs and different
algorithms related to the graph data structure.
• Chapter 8 covers different sorting methods with their analysis.
• Chapter 9 describes various searching and hashing techniques.
• Chapter 10 discusses storage management concepts.
The exercises given at the end of the chapters are thoughtful and promote deep understanding of the subject. They
help in testing your understanding of the concepts and in applying those concepts. We recommend that readers
go through all the exercises and try to solve them before looking at the solutions. Solutions for all the exercises
are provided in the book and the companion disc. The disc contains all the programs given in the book. We have
included some ‘demo’ programs in the disc which demonstrate the stepwise working of a program. You can run
these programs, input your data and see for yourselves how these algorithms work.
We would like to express our gratitude to our teachers who introduced us to the world of computers,
programming and data structures. We are thankful to our publishers for considering our work and letting this
book see the light of day. We would like to thank our family and friends for their continuous love and support.
Our thanks also go to all the readers of the first edition who sent us mails of appreciation and constructive criticism.
We look forward to receiving many more mails from our readers and hope that this book proves helpful to those
whom it is meant for.
bpb@vsnl.com
Chapter 1. Introduction 1
1.1 DataType 1
1.2 Abstract datatypes 1
1.3 Data structures 2
1.3.1 Linear and Non linear data structures 3
1.3.2 Static and dynamic data structures 3
1.4 Algorithms 4
1.4.1 Greedy algorithm 4
1.4.2 Divide and conquer algorithm 4
1.4.3 Backtracking 4
1.4.4 Randomized algorithms 4
1.5 Analysis of algorithms 4
1.5.1 Big O notation 6
1.5.1.1 Rules for O notation 6
Chapter 2. Arrays, Pointers and Structures 10
2.1 Arrays 10
2.1.1 One Dimensional Array 10
2.1.1.1 Declaration of 1-D Array 10
2.1.1.2 Accessing 1-D Array Elements 11
2.1.1.3 Processing 1-D Arrays 11
2.1.1.4 Initialization of 1-D Array 12
2.1.1.5 1-D Arrays and Functions 14
2.1.1.5.1 Passing Individual Array Elements to a Function 14
2.1.1.5.2 Passing whole 1-D Array to a Function 14
2.1.2 Two Dimensional Arrays 15
2.1.2.1 Declaration and Accessing Individual Elements of a 2-D Array 15
2.1.2.2 Processing 2-D Arrays 15
2.1.2.3 Initialization of 2-D Arrays 16
2.2 Pointers 18
2.2.1 Declaration of a Pointer Variable 18
2.2.2 Assigning Address to a Pointer Variable 19
2.2.3 Dereferencing Pointer Variables 19
2.2.4 Pointer to Pointer 20
2.2.5 Pointers and One Dimensional Arrays 21
2.2.6 Pointers and Functions 23
2.2.7 Returning More Than One Value from a Function 24
2.2.8 Function Returning Pointer 25
2.2.9 Passing a 1-D Array to a Function 26
2.2.10 Array of Pointers 27
2.3 Dynamic Memory Allocation 27
2.3.1 malloc( ) 28
2.3.2 calloc( ) 29
2.3.3 realloc( ) 29
2.3.4 free( ) 30
2.4 Structure 31
2.4.1 Defining a Structure 31
2.4.2 Declaring Structure Variables 32
2.4.2.1 With Structure Definition 32
2.4.2.2 Using Structure Tag 32
2.4.3 Initialization of Structure Variables 33
2.4.4 Accessing Members of a Structure 33
2.4.5 Assignment of Structure Variables 34
2.4.6 Array of Structures 34
2.4.7 Arrays within Structures 35
2.4.8 Nested Structures (Structure within Structure) 36
2.4.9 Pointers to Structures 38
2.4.10 Pointers within Structures 39
2.4.11 Structures and Functions 39
2.4.11.1 Passing Structure Members as Arguments 39
2.4.11.2 Passing Structure Variable as Argument 40
2.4.11.3 Passing Pointers to Structures as Arguments 40
2.4.11.4 Returning a Structure Variable from Function 41
2.4.11.5 Returning a Pointer to Structure from a Function 42
2.4.11.6 Passing Array of Structures as Argument 42
2.4.11.7 Self Referential Structures 43
Exercise 43
Solutions 509
Index 522
Introduction
Generally any problem that has to be solved by the computer involves the use of data. If data is arranged in
some systematic way then it gets a structure and becomes meaningful. This meaningful or processed data is
called information. It is essential to manage data in such a way that it can produce information. There can be
many ways in which data may be organized or structured. To provide an appropriate structure to your data you
need to know about data structures. Data structures can be viewed as a systematic way to organize data so that it
can be used efficiently. The choice of proper data structure can greatly affect the efficiency of our program.
List ADT
A list contains elements of the same type arranged in sequential order, and the following operations can be
performed on the list.
Initialize( ) - Initialize the list to be empty.
get( ) - Return an element from the list at any given position.
Stack ADT
A stack contains elements of the same type arranged in sequential order, and the following operations can be
performed on the stack.
Queue ADT
A queue contains elements of the same type arranged in sequential order, and the following operations can be
performed on the queue.
Initialize( ) - Initialize the queue to be empty.
Enqueue( ) - Insert an element at the end of the queue.
Dequeue( ) - Remove and return the first element of the queue, if the queue is not empty.
Peek( ) - Return the first element of the queue without removing it, if the queue is not empty.
size( ) - Return the number of elements in the queue.
isEmpty( ) - Return true if the queue is empty, otherwise return false.
isFull( ) - Return true if no more elements can be inserted, otherwise return false.
From these definitions we can clearly see that the definitions do not specify how these ADTs will be
represented and how the operations will be carried out. There can be different ways to implement an ADT, for
example, the list ADT can be implemented using arrays, singly linked lists or doubly linked lists. Similarly the
stack ADT and queue ADT can also be implemented using arrays or linked lists. The representation and
implementation details of these ADTs are given in Chapters 3 and 4.
The different implementations of ADT are compared for time and space efficiency and the implementation best
suited for the requirements of the user is used. For example, if someone wants to use a list in a program which
requires lots of insertions and deletions from the list, then it is better to use the linked list implementation of the
list.
We can say that ADT is a specification language for data structures. ADT is a logical description while Data
Structure is concrete i.e. an ADT tells us what is to be done and data structures tells us how to do it. The actual
storage or representation of data and implementation of algorithms is done in data structures.
A program that uses a data structure is generally called a client, and the program that implements it is known
as the implementation. The specification of an ADT is called the interface and it is the only thing visible to the client
programs that use the data structure. The client programs view the data structure as an ADT, i.e. they have access only to
the interface; the way data is represented and operations are implemented is not visible to the client. For
example if someone wants to use a stack in the program, he can simply use push and pop operations without
any knowledge of how they are implemented. Some examples of clients of stack ADT are programs of balanced
parentheses, infix to postfix, postfix evaluation.
We may change the representation or algorithm but the client code will not be affected as long as the ADT
interface remains the same. For example if the implementation of stack is changed from array to linked list, the
client program should work in the same way. This helps the user of a data structure focus on his program rather
than going into the details of the data structure.
Data structures can be nested i.e. a data structure may be made up of other data structures which may be of
primitive types or user defined types. Some of the advantages of data structures are-
(i) Efficiency - Proper choice of data structures makes our program efficient. For example suppose we have
some data and we need to organize it properly and perform search operations. If we organize our data in an array,
then we will have to search sequentially, element by element. If the item to be searched is present at the end, then
we will have to visit all the elements before reaching it. So use of an array is not very efficient here.
There are better data structures which can make the search process efficient like ordered array, binary search
tree or hash tables. Different data structures give different efficiency.
(ii) Reusability - Data structures are reusable, i.e. once we have implemented a particular data structure, we can
use it wherever it is required. Implementations of data structures can be compiled into libraries which
can be used by different clients.
(iii) Abstraction - We have seen that a data structure is specified by an ADT which provides a level of
abstraction. The client program uses the data structure through the interface only without getting into the
implementation details.
Some common operations that are performed on Data structures are-
(i) Insertion ‘
(ii) Deletion
(iii) Traversal
(iv) Search
There can be other operations also and the details of these operations depend on different data structures.
1.4 Algorithms
An algorithm is a procedure having well defined steps for solving a particular problem. The data stored in the
data structures is manipulated by using different algorithms, so the study of data structures includes the study of
algorithms. Some of the common approaches of algorithm design are-
1.4.3 Backtracking
In some problems we have several options, where any one might lead to the solution. We will take an option
and try, and if we do not reach the solution, we will undo our action and select another one.
The running time generally depends on the size of input, for example any sorting algorithm will take less
time to sort 10 elements and more time for 100000 elements. So the time efficiency is generally expressed in
terms of size of input. If the size of input is n, then f(n) which is a function of n denotes the time complexity.
Thus to compare any two algorithms we will find out this function for both algorithms and then compare the
rate of growth of these two functions. It is important to compare the rates of growth because an algorithm may
seem better for small input but as the input becomes large it may take more time than others.
The function f(n) may be found out by identifying some key operations in the algorithm which account for
most of the running time. Other operations are not counted as they take very little time compared to these key
operations and are not executed more often than the key operations. For example, in searching we may count the
number of comparisons and in sorting we may count the swaps in addition to comparisons. We are interested
only in the growth rate of functions so the exact computation of f(n) is not necessary.
Let us take an example where time complexity is given by the following function-
f(n) = 5n² + 6n + 12
If n=10
% of running time due to the term 5n² : (500 / (500 + 60 + 12)) * 100 = 87.41 %
% of running time due to the term 6n : (60 / (500 + 60 + 12)) * 100 = 10.49 %
% of running time due to the term 12 : (12 / (500 + 60 + 12)) * 100 = 2.09 %
The following table shows the growth rate of all the terms of the function
f(n) = 5n² + 6n + 12

[Table: percentage of the running time contributed by each term of f(n) for increasing values of n; as n grows, the share of the 5n² term approaches 100 % while the shares of 6n and 12 fall towards 0 %.]
We can see that as n grows, the dominant term 5n² accounts for most of the running time and we can ignore
the smaller terms. Calculating the exact function f(n) for the time complexity may be difficult. So the terms which
do not significantly change the magnitude of the function can be dropped from the function. In this way we can get
an approximation of the time efficiency and we are satisfied with this approximation because this is very close
to the exact value when n becomes large. This approximate measure of complexity is known as asymptotic
complexity.
There are some standard functions whose growth rates are known. We find out the complexity of our
algorithm and compare it with these known functions. The growth rates of some
known functions are shown in the table.
This implies that f(n) does not grow faster than g(n), or g(n) is an upper bound on the function f(n).
(i) 4n + 3 is O(n) as there exist constants 5 and 3 such that 4n + 3 <= 5n for all n >= 3.
Here f(n) = 4n + 3, g(n) = n, c = 5 and n0 = 3.
(ii) 5n² + 2n + 6 is O(n²) as there exist constants 6 and 4 such that 5n² + 2n + 6 <= 6n² for all n >= 4.
Here f(n) = 5n² + 2n + 6, g(n) = n², c = 6 and n0 = 4.
f(n) is O(C*h(n))
=> there exist constants a and b such that
f(n) <= a*C*h(n) for all n >= b
Taking c = a*C and n0 = b,
f(n) <= c*h(n) for all n >= n0
So f(n) is O(h(n)).
(7) Any polynomial P(n) of degree m is O(nᵐ)
Proof:
P(n) = a0 + a1n + a2n² + ......... + amnᵐ
Let c0 = |a0|, c1 = |a1|, c2 = |a2|, ........., cm = |am|
(8) nᵃ is O(nᵇ) only if a <= b, i.e. n² is O(n²) or O(n³) or O(n⁴) but n³ is not O(n²).
(9) All logarithms grow at the same rate, i.e. while computing the O notation, the base of the logarithm is not
important. To justify this we’ll prove that log_a n is O(log_b n) and log_b n is O(log_a n) for all a, b > 1.
log_a n is O(log_b n)
Let x = log_a n and y = log_b n
Then aˣ = n and bʸ = n, so aˣ = bʸ
L’Hospital’s rule can be used for computing this form of limit and it states that-
If lim f(n) = ∞ and lim g(n) = ∞ as n → ∞, then lim f(n)/g(n) = lim f'(n)/g'(n).
If f(n) = log n and g(n) = n, then f'(n) = 1/n and g'(n) = 1, and lim f'(n)/g'(n) = lim (1/n) = 0 as n → ∞.
So log n is O(n).
This is the case in algorithms where a set of data is repeatedly divided into half and the middle element is
processed, e.g. binary search, binary tree traversal. -
(iv) O(n log n) linear logarithmic
This is the case when a set of data is repeatedly divided into half and each half is processed independently, e.g.
best case of quick sort. Ss
(v) O(n?) quadratic
This is the case when the full set of data has to be traversed for each element of the set, or we can say that all pairs
of data items are processed, e.g. worst case of bubble sort. Quadratic algorithms are practical only for small sets of
data because n² increases rapidly as n increases.
(vi) O(n³) cubic
This is the case when all triplets of data items are processed.
~ (vi) O(2") exponential
This is the case when all possible subsets of a set of data are generated.
(viii) O(n!)
This is the case when all possible permutations of a set of data are generated.
Arrays, Pointers and
Structures
2.1 Arrays
An array is a collection of similar type of data items and each data item is called an element of the array. The
data type of the elements may be any valid data type like char, int or float. The elements of array share the same
variable name but each element has a different index number known as subscript.
Consider a situation when we want to store and display the age of 100 employees. We can take an array
variable age of type int. The size of this array variable is 100 so it is capable of storing 100 integer values. The
individual elements of this array are-
age[0], age[1], age[2], age[3], age[4], .............. age[98], age[99]
In C, the subscripts start from zero, so age[0]is the first element, age[1] is the second element of array and so
on.
Arrays can be single dimensional or multidimensional. The number of subscripts determines the dimension
of the array. A one-dimensional array has one subscript; a two-dimensional array has two subscripts and so on.
The two-dimensional arrays are known as matrices.
Here array_name denotes the name of the array and it can be any valid C identifier; data_type is the data
type of the elements of the array. The size of the array specifies the number of elements that can be stored in the
array. It may be a positive integer constant or constant integer expression. Here are some examples of array
declarations-
int age[100];
float salary[15];
char grade[20];
Here age is an integer type array, which can store 100 elements of integer type. The array salary is a float
type array of size 15 and can hold float values, and the third one is a character type array of size 20 and can hold
characters. The individual elements of the above arrays are-
age[0], age[1], age[2], ............. age[99]
salary[0], salary[1], salary[2], ............. salary[14]
grade[0], grade[1], grade[2], ............. grade[19]
#define SIZE 10
main ()
{
    int size = 10;
    float sal[SIZE];    /*Valid*/
    int marks[size];    /*Not valid*/
The use of symbolic constant to specify the size of array makes it convenient to modify the program if the
size of array is to be changed later, because the size has to be changed only at one place, in the #define
directive.
Here 0 is the lower bound and 4 is the upper bound of the array.
The subscript can be any expression that yields an integer value. It can be any integer constant, integer
variable, integer expression or return value(int) from a function call. For example, if i and j are integer
variables then these are some valid subscripted array elements-
arr[5], arr[i], arr[i+j], arr[2*j], arr[i++]
A subscripted array element is treated as any other variable in the program. We can store values in them,
print their values or perform any operation that is valid for any simple variable of the same data type. For
example if arr and sal are two arrays of sizes 5 and 10 respectively, then these are valid statements-
int arr[5];
float sal[10];
int i, sum;
scanf("%d", &arr[1]);      /*Input value into arr[1]*/
printf("%f", sal[3]);      /*Print value of sal[3]*/
arr[4] = 25;               /*Assign a value to arr[4]*/
arr[4]++;                  /*Increment the value of arr[4] by 1*/
sal[5] += 200;             /*Add 200 to sal[5]*/
sum = arr[0]+arr[1]+arr[2]+arr[3]+arr[4]; /*Add all the values of array arr*/
i = 2;
scanf("%f", &sal[i]);      /*Input value into sal[2]*/
printf("%f", sal[i]);      /*Print value of sal[2]*/
printf("%f", sal[i++]);    /*Print value of sal[2] and increment the value of i*/
There is no check on the bounds of the array. For example, if we have an array arr of size 5, the valid subscripts
are only 0, 1, 2, 3, 4 and if someone tries to access elements beyond these subscripts, like arr[5], arr[10],
the compiler will not show any error message but this may lead to run time errors. So it is the responsibility of the
programmer to provide array bounds checking wherever needed.
Here array_name is the name of the array variable, size is the size of the array and value1,
value2, ......... valueN are the constant values known as initializers, which are assigned to the array elements
one after another. These values are separated by commas and there is a semicolon after the closing brace. For
example-
int marks[5] = {99, 78, 50, 45, 67};
. While initializing a 1-D array, it is optional to specify the size of the array. If the size is omitted during
initialization then the compiler assumes the size of array equal to the number of initializers. For example-
int marks[] = {99, 78, 50, 45, 67, 89};
float sal[] = {2525.50, 3000.75, 1200.00};
Here the size of array marks is assumed to be 6 and that of sal is assumed to be 3.
If during initialization the number of initializers is less than the size of the array, then all the remaining elements
of the array are assigned value zero. For example-
int marks[5] = {99, 78};
We can’t copy all the elements of an array to another array by simply assigning it to the other array. For
example if we have two arrays a and b, then-
int a[5] = {1, 2, 3, 4, 5};
int b[5];
b = a;    /*Not valid*/
We will have to copy all the elements of the array one by one, using a for loop.
for(i=0; i<5; i++)
    b[i] = a[i];
In the following program we will find out the largest and smallest number in an integer array.
/*P2.2 Program to find the largest and smallest number in an array*/
#include<stdio.h>
main()
{
    int i, arr[10] = {25, 80, 50, 45, 67, 89, 12, 34, 90, 72};
    int small, large;
    small = large = arr[0];
    for(i=1; i<10; i++)
    {
        if(arr[i] < small)
            small = arr[i];
        if(arr[i] > large)
            large = arr[i];
    }
    printf("Smallest = %d, Largest = %d\n", small, large);
}
We have taken the value of the first element as the initial value of small and large. Inside the for loop, we
start comparing from the second element onwards, so this time we have started the loop from 1 instead of 0.
The following program will reverse the elements of an array.
/*P2.3 Program to reverse the elements of an array*/
#include<stdio.h>
main()
{
    int i, j, temp;
    int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    for(i=0, j=9; i<j; i++, j--)
    {
        temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }
    printf("After reversing, the array is : ");
    for(i=0; i<10; i++)
        printf("%d ", arr[i]);
    printf("\n");
}
In the for loop we have used the comma operator and taken two variables i and j. The variable i is initialized
with the lower bound and j is initialized with the upper bound. After each pass of the loop, i is incremented while
j is decremented. Inside the loop, arr[i] is exchanged with arr[j]. So arr[0] will be exchanged with arr[9],
arr[1] with arr[8], arr[2] with arr[7] and so on.
func(int val[10])
It is optional to specify the size of the array in the formal argument, for example we may write the function
definition as-
func(int val[])
Here rowsize specifies the number of rows and columnsize represents the number of columns in the array.
The total number of elements in the array is rowsize * columnsize. For example-
int arr[4][5];
Here arr is a 2-D array with 4 rows and 5 columns. The individual elements of this array can be accessed by
applying two subscripts, where the first subscript denotes the row number and the second subscript denotes the
column number. The starting element of this array is arr[0][0] and the last element is arr[3][4]. The total
number of elements in this array is 4*5 = 20.
The remaining elements in each row will be assigned values 0, so the values of elements will be-
mat[0][0] : 11    mat[0][1] : 0     mat[0][2] : 0
mat[1][0] : 12    mat[1][1] : 13    mat[1][2] : 0
mat[2][0] : 3     mat[2][1] : 15    mat[2][2] : 0
mat[3][0] : 0     mat[3][1] : 0     mat[3][2] : 0
In 2-D arrays, it is optional to specify the first dimension while initializing but the second dimension should
always be present. For example-
int mat[][3] = {
                 {1, 10},
                 {2, 20, 200},
                 {3, 30},
                 {4, 40, 400}
               };
Here the first dimension is taken as 4 since there are 4 rows in the initialization list.
A 2-D array is also known as a matrix. The next program adds two matrices; the order of both the matrices
should be same.
/*P2.7 Program for addition of two matrices.*/
#define ROW 3
#define COL 4
#include<stdio.h>
main ()
{
    int i, j, mat1[ROW][COL], mat2[ROW][COL], mat3[ROW][COL];
    printf("Enter matrix mat1(%dx%d) row-wise :\n", ROW, COL);
    for(i=0; i<ROW; i++)
        for(j=0; j<COL; j++)
            scanf("%d", &mat1[i][j]);
    printf("Enter matrix mat2(%dx%d) row-wise :\n", ROW, COL);
    for(i=0; i<ROW; i++)
        for(j=0; j<COL; j++)
            scanf("%d", &mat2[i][j]);
    /*Addition*/
    for(i=0; i<ROW; i++)
        for(j=0; j<COL; j++)
            mat3[i][j] = mat1[i][j] + mat2[i][j];
    printf("The resultant matrix mat3 is :\n");
    for(i=0; i<ROW; i++)
    {
        for(j=0; j<COL; j++)
            printf("%5d", mat3[i][j]);
        printf("\n");
    }
}
Now we will write a program to multiply two matrices. Multiplication of matrices requires that the number
of columns in the first matrix should be equal to the number of rows in the second matrix. Each element of a row of
the first matrix is multiplied with the corresponding element of a column of the second matrix and the products are
added to get an element of the resultant matrix. If we multiply
two matrices of order m x n and n x p then the multiplied matrix will be of order m x p. For example-
#include<stdio.h>
#define ROW1 3    /* number of rows in the first matrix */
#define COL1 4
#define ROW2 COL1
#define COL2 2
main()
{
    int mat1[ROW1][COL1], mat2[ROW2][COL2], mat3[ROW1][COL2];
    int i, j, k;
    printf("Enter matrix mat1(%dx%d) row-wise :\n", ROW1, COL1);
    for(i=0; i<ROW1; i++)
        for(j=0; j<COL1; j++)
            scanf("%d", &mat1[i][j]);
    printf("Enter matrix mat2(%dx%d) row-wise :\n", ROW2, COL2);
    for(i=0; i<ROW2; i++)
        for(j=0; j<COL2; j++)
            scanf("%d", &mat2[i][j]);
    /*Multiplication*/
    for(i=0; i<ROW1; i++)
        for(j=0; j<COL2; j++)
        {
            mat3[i][j] = 0;
            for(k=0; k<COL1; k++)
                mat3[i][j] += mat1[i][k] * mat2[k][j];
        }
    printf("The resultant matrix mat3 is :\n");
    for(i=0; i<ROW1; i++)
    {
        for(j=0; j<COL2; j++)
            printf("%5d", mat3[i][j]);
        printf("\n");
    }
}
2.2 Pointers
A pointer is a variable that stores a memory address. Like all other variables it also has a name, has to be declared
and occupies some space in memory. It is called pointer because it points to a particular location in memory by
storing the address of that location. The use of pointers makes the code more efficient and compact. Some of
the uses of pointers are-
(i) Accessing array elements.
(ii) Returning more than one value from a function.
(iii) Accessing dynamically allocated memory
(iv) Implementing data structures like linked lists, trees, and graphs.
Here pname is the name of the pointer variable, which should be a valid C identifier. The asterisk ‘*’
preceding this name informs the compiler that the variable is declared as a pointer. Here data_type is known
as the base type of the pointer. Let us take some pointer declarations-
int *iptr;
float *fptr;
char *cptr, ch1, ch2;
Here iptr is a pointer that should point to variables of type int; similarly fptr and cptr should point to
variables of float and char type respectively. Here the type of the variable iptr is ‘pointer to int’ (int *), or we
can say that the base type of iptr is int. We can also combine the declaration of simple variables and pointer
variables, as we have done in the third declaration statement where ch1 and ch2 are declared as variables of
type char.
Pointers are also variables so the compiler will reserve space for them and they will also have some address. All
pointers irrespective of their base type will occupy the same space in memory since all of them contain addresses
only. The size of a pointer depends on the architecture and may vary on different machines; in our discussion
we will take the size of pointer to be 4 bytes.
2.2.2 Assigning Address to a Pointer Variable
When we declare a pointer variable it contains garbage value i.e. it may be pointing anywhere in the memory.
So we should always assign an address before using it in the program. The use of an unassigned pointer may
give unpredictable results and even cause the program to crash. Pointers may be assigned the address of a
variable using assignment statement. For example-
int *iptr, age = 30;
float *fptr, sal = 1500.50;
iptr = &age;
fptr = &sal;
Now iptr contains the address of variable age i.e. it points to variable age, similarly fptr points to
variable sal. Since iptr is declared as a pointer of type int, we should assign address of only integer
variables to it. If we assign address of some other data type then compiler won’t show any error but the output
will be incorrect.
We can also initialize the pointer at the time of declaration, but in this case the variable should be declared
before the pointer. For example-
int age=30, *iptr=&age;
float sal=1500.50, *fptr=&sal;
It is also possible to assign the value of one pointer variable to the other, provided their base type is same.
For example, if we have an integer pointer p1 then we can assign the value of iptr to it as-
p1 = iptr;
Now both pointer variables iptr and p1 contain the address of variable age and point to the same variable
age.
We can assign the constant zero to a pointer of any type. A symbolic constant NULL, which denotes the value zero, is defined in the header file stdio.h. Assigning NULL to a pointer guarantees that it does not point to any valid memory location. This can be done as-
ptr = NULL;
In our program, if we place ‘*’ before p1 then we can access the variable whose address is stored in p1.
Since p1 contains the address of variable a, we can access the variable a by writing *p1. Similarly we can access
variable b by writing *p2. So we can use *p1 and *p2 in place of the variable names a and b anywhere in our program. Let us see some examples-
*p1 = 9;          is equivalent to    a = 9;
(*p1)++;          is equivalent to    a++;
a = *p1 + *p2;    is equivalent to    a = a + b;
Here pa is a pointer variable, which contains the address of the variable a, and ppa is a pointer to pointer variable, which contains the address of the pointer variable pa.
We know that *pa gives the value of a; similarly *ppa will give the value of pa. Now let us see what value will be given by **ppa.
**ppa
=>  *(*ppa)
=>  *pa       (Since *ppa gives pa)
=>  a         (Since *pa gives a)
Hence we can see that **ppa will give the value of a. So to access the value of a through the pointer to pointer, we can use the double indirection operator. The code given next will make this concept clear.
pa = &a;
ppa = &pa;
printf("Address of a = %p\n",&a);
printf("Value of pa = Address of a = %p\n",pa);
printf("Value of *pa = Value of a = %d\n",*pa);
printf("Address of pa = %p\n",&pa);
printf("Value of ppa = Address of pa = %p\n",ppa);
printf("Value of *ppa = Value of pa = %p\n",*ppa);
printf("Value of **ppa = Value of a = %d\n",**ppa);
printf("Address of ppa = %p\n",&ppa);
Here 5000 is the address of first element, and since each element (type int) takes 4 bytes, address of next
element is 5004, and so on. The address of first element of the array is also known as the base address of the
array. Thus it is clear that the elements of an array are stored sequentially in memory one after another.
In C language, pointers and arrays are closely related. We can access the array elements using pointer expressions. Actually the compiler also accesses the array elements by converting subscript notation to pointer notation. Following are the main points for understanding the relationship of pointers with arrays.
1. Elements of an array are stored in consecutive memory locations.
2. The name of an array is a constant pointer that points to the first element of the array, i.e. it stores the address of the first element, also known as the base address of the array.
3. According to pointer arithmetic, when a pointer variable is incremented, it points to the next location of its base type.
For example-
int arr[5] = {5,10,15,20,25};
Here arr is an array that has 5 elements, each of type int.
for(i=0; i<5; i++)
{
    printf("Value of arr[%d] = %d\t",i,arr[i]);
    printf("Address of arr[%d] = %p\n",i,&arr[i]);
}
The name of the array arr denotes the address of the 0th element of the array, which is 2000. The address of the 0th element can also be given by &arr[0], so arr and &arr[0] represent the same address. The name of an array is a constant pointer, and according to pointer arithmetic, when an integer is added to a pointer we get the address of the next element of the same base type. Hence (arr+1) will denote the address of the next element arr[1]. Similarly (arr+2) denotes the address of arr[2] and so on. In other words, we can say that the pointer expression (arr+1) points to the 1st element of the array, (arr+2) points to the 2nd element of the array, and so on.
arr    ->  Points to 0th element  ->  &arr[0]  ->  2000
arr+1  ->  Points to 1st element  ->  &arr[1]  ->  2004
arr+2  ->  Points to 2nd element  ->  &arr[2]  ->  2008
arr+3  ->  Points to 3rd element  ->  &arr[3]  ->  2012
arr+4  ->  Points to 4th element  ->  &arr[4]  ->  2016
#include<stdio.h>
main()
{
Here a and b are variables declared in the function main() while x and y are declared in the function value(). These variables reside at different addresses in memory. Whenever the function value() is called, two variables named x and y are created and initialized with the values of variables a and b. This type of parameter passing is called call by value, since we are only supplying the values of the actual arguments to the called function. Any operation performed on variables x and y in the function value() will not affect variables a and b.
Before calling the function value(), the value of a is 5 and the value of b is 8. The values of a and b are copied into x and y. Since the memory locations of x, y and a, b are different, when the values of x and y are incremented there will be no effect on the values of a and b. Therefore after calling the function, a and b are the same as before calling the function and have the values 5 and 8.
Although C does not use call by reference, we can simulate it by passing addresses of variables as arguments to the function. To accept the addresses inside the function, we will need pointer variables. Here is a program that simulates call by reference by passing the addresses of variables a and b.
/*P2.14 Call by reference*/
#include<stdio.h>
void ref(int *p, int *q);
main()
Before calling the function ref(), the value of a is 5 and the value of b is 8. The values of the actual arguments are copied into the pointer variables p and q, and here the actual arguments are the addresses of variables a and b. Since p contains the address of variable a, we can access variable a inside ref() by writing *p; similarly variable b can be accessed by writing *q.
Now (*p)++ means the value at address 2000 (which is 5) is incremented. Similarly (*q)++ means the value at address 2004 (which is 8) is incremented. Now the value of *p = 6 and *q = 9. When we come back to main(), we see that the values of variables a and b have changed. This is because the function ref() knew the addresses of a and b, so it was able to access them indirectly.
So in this way we can simulate call by reference by passing the addresses of arguments. This method is mainly useful when the called function has to return more than one value to the calling function.
}
func1(int x, int y, int *ps, int *pd, int *pp)
{
    *ps = x+y;
    *pd = x-y;
    *pp = x*y;
In func1(), variables a and b are passed by value while variables sum, diff and prod are passed by reference. The function func1() gets the addresses of the variables sum, diff and prod, so it accesses these variables indirectly using pointers and changes their values.
For example-
float *func1(int,char);   /*This function returns a pointer to float*/
int *func2(int,float);    /*This function returns a pointer to int*/
While returning a pointer, make sure that the memory address returned will exist even after the termination of the function. For example, a function of this form should not be written-
main()
{
    int *ptr;
    ptr = func();
}
int *func()
{
    int x, *p = &x;
    return p;
}
Here we are returning a pointer which points to a local variable. We know that a local variable exists only
inside the function. Suppose the variable x is stored at address 2500, the value of p will be 2500 and this value
will be returned by the function func (). As soon as func () terminates, the local variable x will cease to exist.
The address returned by func () is assigned to pointer variable ptr inside main(), so now ptr will contain
address 2500. When we dereference ptr, we are trying to access the value of a variable that no longer exists. So never return a pointer which points to a local variable. Now let us take another program that uses a function returning a pointer.
/*P2.16 Program to show a function that returns pointer*/
#include<stdio.h>
int *fun(int *p, int n);
main()
{
    int arr[5] = {3,6,9,10,12}, n = 5, *ptr;
    ptr = fun(arr,n);
    printf("%p\t%d\n",ptr,*ptr);
}
int *fun(int *p, int n)
{
    p = p + 2;
    return p;
}
}
func(int a[5])
{
}
func(int a[])
{
}
func(int *a)
{
}
In all the three cases the compiler reserves space only for a pointer variable inside the function. In the
function call, the array name is passed without any subscript or address operator. Since array name represents
the address of first element of array, this address is assigned to the pointer variable in the function. So inside the
function we have a pointer that contains the base address of the array. In the above program, the argument a is declared as a pointer variable whose base type is int, and it is initialized with the base address of the array arr.
We have studied that if we have a pointer variable containing the base address of an array, then we can access any array element either by pointer notation or by subscript notation. So inside the function we can access any ith element of arr by writing *(a+i) or a[i]. Since we are directly accessing the original array, all the changes made to the array in the called function are reflected in the calling function. The following program will illustrate the point that we have discussed.
/*P2.18 When an array is passed to a function, the receiving argument is declared as a
pointer*/
#include<stdio.h>
func(float f[],int *i,char c[5]);
main()
{
    float f_arr[5] = {1.5,2.5,3.5,4.5,5.5};
    int i_arr[5] = {1,2,3,4,5};
    char c_arr[5] = {'a','b','c','d','e'};
    printf("Inside main(): ");
    printf("Size of f_arr = %u\t",sizeof(f_arr));
    printf("Size of i_arr = %u\t",sizeof(i_arr));
    printf("Size of c_arr = %u\n",sizeof(c_arr));
    func(f_arr,i_arr,c_arr);
}
func(float f[],int *i,char c[])
{
    printf("Inside func(): ");
    printf("Size of f = %u\t",sizeof(f));
    printf("Size of i = %u\t",sizeof(i));
    printf("Size of c = %u\n",sizeof(c));
}
Inside the function func(), variables £, i and c are declared as pointers.
Here pa is declared as an array of pointers. Every element of this array is a pointer to an integer. pa[i] gives the value of the ith element of pa, which is the address of some int variable, and *pa[i] gives the value of that int variable. The array of pointers can also contain addresses of elements of another array.
2.3.1 malloc( )
Declaration : void *malloc(size_t size);
This function is used to allocate memory dynamically. The argument size specifies the number of bytes to be allocated. The type size_t is defined in stdlib.h as unsigned int. On success, malloc() returns a pointer to the first byte of allocated memory. The returned pointer is of type void *, which can be type cast to the appropriate type of pointer. It is generally used as-
ptr = (datatype *)malloc(specified size);
Here ptr is a pointer of type datatype, and specified size is the size in bytes required to be reserved in
memory. The expression (datatype *) is used to typecast the pointer returned by malloc (). For example-
int *ptr;
ptr = (int *)malloc(12);
This allocates 12 contiguous bytes of memory space and the address of first byte is stored in the pointer
variable ptr. The allocated memory contains garbage value. We can use sizeof operator to make the program
portable and more readable.
ptr = (int *)malloc(3*sizeof(int));
#include<stdlib.h>
main()
{
    int i,n;
    int *p;
    printf("Enter the number of integers to be entered : ");
    scanf("%d",&n);
    p = (int *)malloc(n*sizeof(int));
    if(p==NULL)
    {
        printf("Memory not available\n");
        exit(1);
    }
The function malloc () returns a void pointer and we have studied that a void pointer can be assigned to any
type of pointer without typecasting. But we have used typecasting because it is a good practice to do so and
moreover it also ensures compatibility with C++.
2.3.2 calloc( )
Declaration > void *calloc(size_t n, size_t size);
The calloc() function is used to allocate multiple blocks of memory. It is similar to the malloc() function except for two differences. The first one is that it takes two arguments: the first argument specifies the number of blocks and the second one specifies the size of each block. For example-
ptr = (int *)calloc(5,sizeof(int));
This allocates 5 blocks of memory, each block contains 4 bytes and the starting address is stored in the
pointer variable ptr, which is of type int *. An equivalent malloc () call would be-
ptr = (int *)malloc(5*sizeof(int));
Here we have to do the calculation ourselves by multiplying, but calloc() function does the calculation
for us.
The other difference between calloc() and malloc() is that the memory allocated by malloc() contains garbage values while the memory allocated by calloc() is initialized to zero. Even so, it is better to explicitly initialize the elements whenever there is a need to do so.
Like malloc(), calloc() also returns NULL if there is not sufficient memory available in the heap.
2.3.3 realloc( )
Declaration : void *realloc(void *ptr, size_t newsize);
We may want to increase or decrease the memory allocated by malloc() or calloc(). The function
realloc() is used to change the size of the memory block. It alters the size of the memory block without
losing the old data. This is known as reallocation of memory.
This function takes two arguments, first is a pointer to the block of memory that was previously allocated by
malloc() or calloc() and second one is the new size for that block. For example-
ptr = (int *)malloc(size);
This statement allocates the memory of the specified size and the starting address of this memory block is
stored in the pointer variable ptr. If we want to change the size of this memory block, then we can use
realloc() as-
ptr = (int *)realloc(ptr,newsize);
This statement allocates the memory space of newsize bytes, and the starting address of this memory block
is stored in the pointer variable ptr. The newsize may be smaller or larger than the old size. If the newsize is
larger, then the old data is not lost and the newly allocated bytes are uninitialized. The starting address
contained in ptr may change if there is not sufficient memory at the old address to store all the bytes
consecutively. This function moves the contents of old block into the new block and the data of the old block is
not lost. On failure, realloc() returns NULL.
/*P2.21 Program to understand the use of realloc() function*/
#include<stdio.h>
#include<stdlib.h>
main()
{
int i,*ptr;
ptr = (int *)malloc(5*sizeof(int));
if(ptr==NULL)
2.3.4 free( )
Declaration : void free(void *p);
The dynamically allocated memory is not automatically released; it will exist till the end of program. If we have
finished working with the memory allocated dynamically, it is our responsibility to release that memory so that
it can be reused. The function free() is used to release the memory space allocated dynamically. The memory
released by free() is made available to the heap again and can be used for some other purpose. For example-
free(ptr);
Here ptr is a pointer variable that contains the base address of a memory block created by malloc() or-
calloc(). Once a memory location is freed it should not be used. We should not try to free any memory
location that was not allocated by malloc(), calloc() or realloc().
When the program terminates, all the memory is released automatically by the operating system but it is a
good practice to free whatever has been allocated dynamically. We won’t get any errors if we don’t free the
dynamically allocated memory, but this would lead to memory leak i.e. memory is slowly leaking away and can
be reused only after the termination of program. For example consider this function-
void func()
{
int *ptr;
ptr = (int *)malloc(10*sizeof(int));
Here we have allocated memory for 10 integers through malloc(), so each time this function is called, space for 10 integers would be reserved. We know that local variables vanish when the function terminates, and since ptr is a local pointer variable, it will be deallocated automatically at the termination of the function. But the space allocated dynamically is not deallocated automatically, so that space remains there and can't be used, leading to memory leaks. We should free the memory space by putting a call to free() at the end of the function.
Since the memory space allocated dynamically is not released after the termination of the function, it is valid to return a pointer to dynamically allocated memory. For example-
int *func()
{
    int *ptr;
    ptr = (int *)malloc(10*sizeof(int));
    return ptr;
}
Here we have allocated memory through malloc() in func(), and returned a pointer to this memory. Now
the calling function receives the starting address of this memory, so it can use this memory. Note that, now the
call to function free () should be placed in the calling function when it has finished working with this memory.
Here func() is declared as a function returning pointer. Recall that it is not valid to return address of a local
variable since it vanishes after the termination of function.
2.4 Structure
An array is a collection of elements of the same type, but in many real life applications we may need to group different types of logically related data. For example, if we want to create a record of a person that contains the name, age and height of that person, then we can't use an array because the three data elements are of different types. To store these related fields of different data types we can use a structure, which is capable of storing heterogeneous data. Data of different types can be grouped together under a single name using structures. The data elements of a structure are referred to as members.
struct tagname{
    datatype member1;
    datatype member2;
    ..................
    datatype memberN;
};
Here struct is a keyword, which tells the compiler that a structure is being defined. member1, member2, ......, memberN are members of the structure and are declared inside curly braces. There should be a semicolon after the closing curly brace. These members can be of any data type like int, char, float, array, pointer or another structure type. tagname is the name of the structure and it is used further in the program to declare variables of this structure type.
Definition of a structure provides one more data type in addition to the built in data types. We can declare
variables of this new data type that will have the format of the defined structure. It is important to note that
definition of a structure template does not reserve any space in memory for the members; space is reserved only
when actual variables of this structure type are declared. Although the syntax of declaration of members inside the template is identical to the syntax we use in declaring variables, these members are not variables; they don't occupy any memory until a structure variable is declared.
Here student is the structure tag and there are three members of this structure, viz. name, rollno and marks. A structure template can be defined globally or locally, i.e. it can be placed before all functions in the program or it can be present locally in a function. If the template is global then it can be used by all functions, while if it is local then only the function containing it can use it.
Here stu1, stu2 and stu3 are variables of type struct student. When we declare a variable while defining the structure template, the tagname is optional. So we can also declare them as-
struct {
    char name[20];
    int rollno;
    float marks;
}stu1,stu2,stu3;
If we declare variables in this way, then we'll not be able to declare other variables of this structure type anywhere else in the program, nor can we send these structure variables to functions. If a need arises to declare a variable of this type elsewhere in the program, then we'll have to write the whole template again. So although the tagname is optional, it is always better to specify a tagname for the structure.
Here the values of the members of stu1 will be "Mary" for name, 25 for rollno, 98 for marks. The values of the members of stu2 will be "John" for name, 24 for rollno, 67.5 for marks.
Here the members rollno and marks of stul will be initialized to zero: This is equivalent to the initialization-
struct student stu1 = {"Mary",0,0};
Here on the left side of the dot there should be a variable of structure type and on right hand side there
should be the name of a member of that structure. For example consider the following structure-
struct student{
    char name[20];
    int rollno;
    float marks;
};
struct student stu1,stu2;
struct student {
char name[20];
int rollno;
float marks;
};
main ()
Unary, relational, arithmetic and bitwise operators are not allowed with structure variables. We can use these operators with the members, provided the member is not itself a structure.
Here stuarr is an array of 10 elements, each of which is a structure of type struct student, which means each element of stuarr has 3 members: name, rollno and marks. These structures can be accessed through subscript notation. To access the individual members of these structures we'll use the dot operator as usual.
int i;
struct student stuarr[10];
for(i=0; i<10; i++)
{
main()
{
    int i,j;
    struct student stuarr[3];
for(i=0; i<3; i++)
{
    printf("Data of student %d\n",i+1);
    printf("Name:%s,Roll number:%d\nMarks:",stuarr[i].name,stuarr[i].rollno);
    for(j=0; j<4; j++)
        printf("%d ",stuarr[i].submarks[j]);
    printf("\n");
}
struct tag1{
    member1;
    member2;
    struct tag2{
        member1;
        member2;
        ...........
        member m;
    }var1;
    ...........
    member n;
}var2;
Here we have defined the template of the structure date inside the structure student; we could have defined it outside and declared its variables inside the structure student using the tag. But remember, if we define the inner structure outside, then this definition should always come before the definition of the outer structure. Here, in this case, the date structure should be defined before the student structure.
struct date{
int day;
int month;
int year;
};
‘struct student{
char name[20];
int rollno;
float marks;
struct date birthdate;
}stul,stu2;
The advantage of defining date structure outside is that we can declare variables of date type anywhere
else also. Suppose we define a structure teacher, then we can declare variables of date structure inside it as-
struct teacher {
char name[20];
int age;
float salary;
struct date birthdate, joindate;
};
The nested structures may also be initialized at the time of declaration. For example-
struct teacher t1 = {"Sam",34,9000,{8,12,1970},{1,7,1995}};
Nesting of a structure within itself is not valid. For example, the following structure definition is invalid-
struct person{
char name[20];
int age;
float height;
struct person father; /*Invalid*/
}emp;
The nesting of structures can be extended to any level. The following example shows nesting at level three, i.e. the first structure is nested inside a second structure and the second structure is nested inside a third structure.
struct time
{
int hr;
int min;
int sec;
};
struct date
{
int day;
int month;
int year;
struct time t;
};
struct student
{
char name[20];
struct date dob; /*Date of birth*/
}stul,stu2;
Here ptr is a pointer variable that can point to a variable of type struct student. We will use the &
operator to access the starting address of a structure variable, so ptr can point to stu by-
ptr = &stu;
There are two ways of accessing the members of structure through the structure pointer.
We know ptr is a pointer to a structure, so by dereferencing it we can get the contents of structure variable.
Hence *ptr will give the contents of stu. So to access members of a structure variable stu we can write-
(*ptr).name
(*ptr).rollno
(*ptr).marks
Here parentheses are necessary because dot operator has higher precedence than the * operator. This syntax
is confusing so C has provided another facility of accessing structure members through pointers. We can use the
arrow operator (->) which is formed by hyphen symbol and greater than symbol. We can access the members
as-
ptr->name
ptr->rollno
ptr->marks
The arrow operator has the same precedence as that of the dot operator and it also associates from left to right.
/*P2.26 Program to understand pointers to structures*/
#include<stdio.h>
struct student{
char name[20];
int rollno;
int marks;
};
main ()
We can also have pointers that point to individual members of a structure variable.
For example-
int *p = &stu.rollno;
float *ptr = &stu.marks;
The expression &stu.rollno is equivalent to &(stu.rollno) because the precedence of dot operator is
more than that of address operator.
Here it is necessary to define the structure template globally because it is used by both functions to declare
variables.
The name of a structure variable is not a pointer unlike arrays, so when we send a structure variable as an
argument to a function, a copy of the whole structure is made inside the called function and all the work is done
on that copy. Any changes made inside the called function are not visible in the calling function since we are
only working on a copy of the structure variable, not on the actual structure variable.
Structure variables can be returned from functions like any other variable. The returned value can be assigned to a structure variable of the appropriate type.
/*P2.30 Program to understand how a structure variable is returned from a function*/
#include<stdio.h>
struct student{
char name[20];
int rollno;
int marks;
};
void display(struct student);
struct student change(struct student stu);
main()
{
    struct student stu1 = {"John",12,87};
    struct student stu2 = {"Mary",18,90};
    stu1 = change(stu1);
    stu2 = change(stu2);
    display(stu1);
    display(stu2);
}
struct student change(struct student stu)
{
    stu.marks = stu.marks + 5;
    stu.rollno = stu.rollno - 10;
    return stu;
}
void display(struct student stu)
{
    printf("Name - %s\t",stu.name);
    printf("Rollno - %d\t",stu.rollno);
    printf("Marks - %d\n",stu.marks);
}
dec_marks(stuarr) ;
for(i=0; i<3; i++)
display(stuarr[i]);
Here ptr1 and ptr2 are structure pointers that can point to structure variables of type struct tag, so struct tag is a self referential structure. These types of structures are helpful in implementing data structures like linked lists and trees.
Exercise
Find the output of the following programs.
(1) main()
{
int i,size=5,arr[size];
for(i=0; i<size; i++)
scanf("%d",&arr[i]);
for(i=0; i<size; i++)
printf("%d ",arr[i]);
}
(2) main()
{
int arr[4]={2,4,8,10},i=4,j;
while(i)
{
    j = arr[i-1];
    i--;
}
printf("j = %d\n",j);
}
(3) main()
{
int i=0,sum=0,arr[6]={4,2,6,0,5,10};
while(arr[i])
{
sum = sum+arr[i];
i++;
}
printf("sum = %d\n",sum);
}
(6) main()
{
int i,arr[5]={25,30,35,40,45},*p;
p = arr;
for(i=0; i<5; i++)
    printf("%d\t%d\t",*(p+i),p[i]);
}
(7) main()
{
    int i,arr[5]={25,30,35,40,45};
    for(i=0; i<5; i++)
    {
        printf("%d ",*arr);
        arr++;
(8) main()
{
int i,arr[5] = {25,30,35,40,45},*p = arr;
for(i=0; i<5; i++)
{
    printf("%d ",*p);
    p++;
}
}
(9) main()
{
int arr[5]={25,30,35,40,45},*p;
for(p=&arr[0]; p<arr+5; p++)
    printf("%d ",*p);
}
(10) main()
{
int arr[10]={25,30,35,40,55,60,65,70,85,90},*p;
for(p=arr+2; p<arr+8; p=p+2)
    printf("%d ",*p);
}
(11) main()
{
int i,arr[10]={25,30,35,40,55,60,65,70,85,90};
int *p = arr+9;
for(i=0; i<10; i++)
    printf("%d ",*(p-i));
}
(12) main()
{
int arr[10]={25,30,35,40,55,60,65,70,85,90},*p;
for(p=arr+9; p>=arr; p--)
    printf("%d ",*p);
main ()
{
int a=2,b=6;
func(a, &b) ;
printf("a = %d, b = %d\n",a,b);
}
void func(int x,int *y)
{
int temp;
temp = x;
x = *y;
*y = temp;
}
(16) main ()
{
struct A {
int marks;
char grade;
};
struct A A1,B1;
A1.marks = 80;
A1.grade = 'A';
printf("Marks = %d\t",A1.marks);
printf("Grade = %c\t",A1.grade);
B1 = A1;
printf("Marks = %d\t",B1.marks);
printf("Grade = %c\n",B1.grade);
}
func (var) ;
printf("%d\n",var.i);
}
void func(struct tag var)
{
var.i++;
}
A list is a collection of elements of a similar type. There are two ways of maintaining a list in memory. The first way is to store the elements of the list in an array, but arrays have some restrictions and disadvantages. The second way of maintaining a list in memory is through a linked list. Now let us study what a linked list is, and after that we will see how it overcomes the limitations of arrays.
Figure 3.2
We can see that the nodes are scattered here and there in memory, but still they are connected to each other
through the link part, which also maintains their linear order. Now let us see the picture of memory if the same
list of integers is implemented through array.
struct node{
    char name[10];
    int code;
    float salary;
    struct node *link;
};
struct node{
    struct student stu;
    struct node *link;
};
In this chapter we will perform all operations on linked lists that contain only an integer value in the info part of
their nodes. :
In array we could perform all the operations using the array name and index. In the case of linked list, we
will perform all the operations with the help of the pointer start because it is the only source through which
we can access our linked list. The list will be considered empty if the pointer start contains NULL value. So
our first job is to declare the pointer start and initialize it to NULL. This can be done as-
struct node *start;
start = NULL;
Now we will discuss the following operations on a single linked list.
(i) Traversal of a linked list
(ii) Searching an element
(iii) Insertion of an element
(iv) Deletion of an element
(v) Creation of a linked list
(vi) Reversal of a linked list
The main() function and declarations are given here, and the code of the other functions is given with the explanation.
/*P3.1 Program of single linked list*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
    int info;
    struct node *link;
};
struct node *create_list (struct node *start);
void display(struct node *start);
void count(struct node *start);
void search(struct node *start,int data);
struct. node *addatbeg(struct node *start,int data);
struct node *addatend(struct node *start,int data);
struct node *addafter(struct node *start,int data,int item);
struct node *addbefore(struct node *start,int data,int item);
struct node *addatpos(struct node *start,int data,int pos);
struct node *del(struct node *start,int data);
struct node *reverse(struct node *start) ;
main ()
{
struct node *start=NULL;
int choice, data,item, pos;
while(1)
{
printf("1.Create List\n") ;
printf("2.Display\n");
printf("3.Count\n");
printf("4.Search\n");
printf("5.Add to empty list / Add at beginning\n");
printf("6.Add at end\n");
printf("7.Add after node\n");
printf("8.Add before node\n");
printf("9.Add at position\n");
printf ("10.Delete\n") ;
printf("11.Reverse\n") ;
printf("12.Exit\n");
printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch(choice)
{
case 1:
start = create_list(start);
break;
case 2:
display (start) ;
break;
case 3:
count (start) ;
break;
case 4:
printf("Enter the element to be searched : ");
scanf("%d", &data) ;
search(start,data) ;
break;
case 5:
printf("Enter the element to be inserted : ");
scanft("%d", &data) ; : :
start => daddatbeg(start, data);
break;
case 6:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
start = addatend(start,data);
break;
case 7:
printf("Enter the element to be inserted : ");
scanf("%d",&data);
printf("Enter the element after which to insert : ");
scanf ("%d", &item) ;
start = addafter(start,data,item) ;
break;
case 8:
printf ("Enter the element to be inserted : ");
scanf ("%d", &data) ;
printf("Enter the element before which to insert: ");
scanf ("%d", &item) ;
start = addbefore(start,data,item);
break;
case 9:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
printf("Enter the position at which to insert : ");
scanf ("%d", &pos) ;
start = addatpos(start,data,pos);
break;
case 10:
printf ("Enter the element to be deleted : ");
scanf ("%d", &data) ;
start = del(start, data);
break;
case 11:
start = reverse(start);
break;
case 12:
exit (1);
default:
printf("Wrong choice\n");
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
In the function main(), we have taken an infinite loop and inside the loop we have written a switch statement. In the different cases of this switch statement, we have implemented the different operations on the linked list. To come out of the infinite loop and exit the program we have used the function exit().
The structure pointer start is declared in main() and initialized to NULL. We send this pointer to all the functions because start is the only way of accessing the linked list. Functions like those of insertion, deletion and reversal will change the linked list, and the value of start might change, so these functions return the value of start. Functions like display(), count() and search() do not change the linked list, so their return type is void.
To traverse the list we take a pointer p and initialize it with start. Now p points to the first node of the linked list, and we can access the info part of the first node by writing p->info. Next we have to shift the pointer p forward so that it points to the next node. This can be done by assigning the address of the next node to p as-
p = p->link;
Now p has the address of the next node. Similarly we can visit each node of the linked list through this assignment, until p has NULL value, which is the link part of the last node. So the linked list can be traversed as-
while(p != NULL)
{
	printf("%d ", p->info);
	p = p->link;
}
Let us take an example to understand how the assignment p = p->link makes the pointer p move forward.
From now onwards we will not show the addresses, we will show only the info part of the list in the figures.
Start
. Figure 3.4
In figure 3.4, node A is the first node so start points to it, node E is the last node so its link is NULL.
Initially p points to node A, p->info gives 11 and p->link points to node B
After the statement p = p->link;
p points to node B, p->info gives 22 and p->1link points to node C
After the statement p = p->link;
p points to node C, p->info gives 33 and p->link points to node D
After the statement p = p->link;
p points to node D, p->info gives 44 and p->1link points to node E
After the statement p = p->link;
p points to node E, p->info gives 55 and p->link is NULL
After the statement p = p->link;
p becomes NULL, i.e. we have reached the end of the list so we come out of the loop.
The following function display() displays the contents of the linked list.
void display(struct node *start)
{
struct node *p;
	if(start == NULL)
	{
		printf("List is empty\n");
		return;
	}
	p = start;
	printf("List is :\n");
while(p != NULL)
{
printf("%d ",p->info);
p = p->link;
}
	printf("\n");
}/*End of display()*/
Don't think of using start for moving forward. If we use start = start->link instead of p = p->link, then we will lose start, and that is the only means of accessing our list. The following function count() finds out the number of elements of the linked list.
void count (struct node *start)
{
	struct node *p;
	int cnt = 0;
	p = start;
	while(p != NULL)
	{
		p = p->link;
		cnt++;
	}
	printf("Number of elements are %d\n",cnt);
}/*End of count()*/
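The menu of program P3.1 also calls a search() function, whose body does not appear above. As a sketch of how such a search can work, the core logic below walks the list while keeping a position counter; it is factored here into a function find_pos() that returns the 1-based position (or -1 if the element is absent), from which the menu's void search() that printf()s the result is easily derived. The names find_pos and push_front are illustrative, not from the text.

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
	int info;
	struct node *link;
};

/* Illustrative core of search(): returns the 1-based position of data,
   or -1 if it is not present in the list. */
int find_pos(struct node *start, int data)
{
	struct node *p = start;
	int pos = 1;
	while (p != NULL)
	{
		if (p->info == data)
			return pos;
		p = p->link;
		pos++;
	}
	return -1;
}

/* Illustrative helper to build a list for the example */
struct node *push_front(struct node *start, int data)
{
	struct node *tmp = malloc(sizeof(struct node));
	tmp->info = data;
	tmp->link = start;
	return tmp;
}
```

The menu version would simply printf("%d found at position %d\n", ...) or a "not found" message based on this return value.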
The link part of the node contains a garbage value; we will assign an address to it separately in the different cases. In our explanation we will refer to this new node as node T.
The order of the two statements (tmp->link = start; start = tmp;) is important. First we should make the link of T equal to start, and only after that should we update start. Let's see what happens if the order of these two statements is reversed.
start = tmp;
tmp->link = start;
Link of tmp will point to itself because start now has the address of tmp. So if we reverse the order, then the link of node T will point to itself and we will be stuck in an infinite loop when the list is processed.
The following function addatbeg() adds a node at the beginning of the list.
struct node *addatbeg(struct node *start,int data)
{
struct node *tmp;
tmp = (struct node *)malloc(sizeof (struct node) );
tmp->info = data;
tmp->link = start;
start = tmp;
return start;
}/*End of addatbeg()*/
After insertion
T is the only node
start points to T
link of T is NULL
Figure 3.6 Insertion in an empty list
When the list is empty, the value of start will be NULL. The new node that we are adding will be the only node in the list. Since it is the first node, start should point to this node, and it is also the last node so its link should be NULL.
tmp->link = NULL;
start = tmp;
Since initially start was NULL, we can write start instead of NULL in the first statement, so now these two
statements can be written as-
tmp->link = start;
start = tmp;
These two statements are same as in the previous case (3.1.3.1), so we can see that this case reduces to the
previous case of insertion in the beginning and the same code can be written for both the cases.
After insertion
Node T is the last node
Node P is the second last node
Link of node T is NULL
Link of P points to node T
Figure 3.7 Insertion at the end of the list
Suppose we have a pointer p pointing to the node P. These are the two statements that should be written for
this insertion-
p->link = tmp;
tmp->link = NULL;
So in this case we should have a pointer p pointing to the last node of the list. The only information about the
linked list that we have is the pointer start. So we will traverse the list till the end to get the pointer p and then
do the insertion. This is how we can obtain the pointer p.
p = start;
while(p->link != NULL)
	p = p->link;
In the traversal of the list (3.1.1) our terminating condition was (p != NULL), because there we wanted the loop to terminate when p becomes NULL. Here we want the loop to terminate when p is pointing to the last node, so the terminating condition is (p->link != NULL).
The following function addatend() inserts a node at the end of the list.
struct node *addatend(struct node *start,int data)
{
struct node *p,*tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = data;
	p = start;
	while(p->link != NULL)
		p = p->link;
p->link = tmp;
tmp->link = NULL;
return start;
}/*End of addatend()*/
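Note that addatend() dereferences start in the while loop condition, so it assumes the list is non-empty (in program P3.1, insertion into an empty list goes through menu option 5 instead). If a single function is wanted that also handles the empty list, a guarded variant might look like the following sketch (the name addatend_safe is illustrative, not from the text).

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
	int info;
	struct node *link;
};

/* Sketch: insertion at the end that also works when the list is empty */
struct node *addatend_safe(struct node *start, int data)
{
	struct node *p, *tmp;
	tmp = malloc(sizeof(struct node));
	tmp->info = data;
	tmp->link = NULL;
	if (start == NULL)       /* empty list: new node is the only node */
		return tmp;
	p = start;
	while (p->link != NULL)  /* walk to the last node */
		p = p->link;
	p->link = tmp;
	return start;
}
```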
After insertion
Node T is between nodes P and Q
Link of node T points to node Q
Link of node P points to node T
Suppose we have two pointers p and q pointing to nodes P and Q respectively. The two statements that
should be written for insertion of node T are-
tmp->link = q;
p->link = tmp;
Before insertion address of node Q is in p->link, so instead of pointer q we can write p->link. Now the
two statements for insertion can be written as-
tmp->link = p->link;
p->link = tmp;
Note: The order of these two statements is important; if you write them in the reverse order then you will lose your links. The address of node Q is in p->link, so suppose we write the statement (p->link = tmp;) first; then we will lose the address of node Q, there is no way to reach node Q and our list is broken. So first we should assign the address of node Q to the link of node T by writing (tmp->link = p->link;). Now we have stored the address of node Q, so we are free to change p->link.
Now we will see three cases of insertion in between the nodes-
1. Insertion after a node
2. Insertion before a node
3. Insertion at a given position
The two statements of insertion(tmp->link = p->link; p->link = tmp;) will be written in all the
three cases, but the way of finding the pointer p will be different.
struct node *addafter(struct node *start,int data,int item)
{
	struct node *tmp,*p;
	p = start;
	while(p != NULL)
	{
		if(p->info == item)
		{
			tmp = (struct node *)malloc(sizeof(struct node));
			tmp->info = data;
			tmp->link = p->link;
			p->link = tmp;
			return start;
		}
		p = p->link;
	}
	printf("%d not present in the list\n",item);
	return start;
}/*End of addafter()*/
Let us see what happens if item is present in the last node and we have to insert after the last node. In this case p points to the last node and its link is NULL, so tmp->link is automatically assigned NULL and we don't need a special case for insertion at the end. This function will work correctly even if we insert after the last node.
In this case we are given a value from the list, and we have to insert the new node before the node that contains
this value. Suppose the node Q contains the given value, and node P is its predecessor. We have to insert the
new node T before node Q, i.e. T is to be inserted between nodes P and Q. In the function given below, data is
the new value to be inserted and item is the value contained in node Q. For writing the two statements of
insertion ( tmp->link = p->link; p->link = tmp; ) we need to find the pointer p which points to the
predecessor of the node that contains item. Since item is present in Q and we have to find pointer to node P, so
here the condition for searching would be if(p->link->info == item) and the terminating condition of the loop would be (p->link != NULL).
struct node *addbefore(struct node *start,int data,int item)
{
	struct node *tmp,*p;
if(start == NULL )
{
printf("List is empty\n");
return start;
}
/*If data to be inserted before first node*/
if(item == start->info)
{ ;
		tmp = (struct node *)malloc(sizeof(struct node));
		tmp->info = data;
		tmp->link = start;
		start = tmp;
		return start;
	}
	p = start;
while (p->link!=NULL)
{
		if(p->link->info == item)
		{
tmp = (struct node *)malloc(sizeof (struct node) );
tmp->info = data;
tmp->link = p->link;
p->link = tmp;
return start;
}
p = p->link;
}
	printf("%d not present in the list\n",item);
	return start;
}/*End of addbefore()*/
If the node is to be inserted before the first node, then that case has to be handled separately, because we have to update start in this case. If the list is empty, start will be NULL and the term start->info will create problems, so before checking the condition if(item == start->info) we should check for an empty list.
The way of finding pointer p is different. If we have to insert at the first position we will have to update
start so that case is handled separately.
struct node *addatpos(struct node *start,int data,int pos)
{
struct node *tmp, *p;
	int i;
tmp = (struct node *)malloc(sizeof (struct node) );
	tmp->info = data;
if (pos==1)
{
tmp->link = start;
start = tmp;
		return start;
	}
	p = start;
	for(i=1; i<=pos-2 && p!=NULL; i++)
		p = p->link;
	if(p == NULL)
		printf("There are less than %d elements\n",pos);
	else
	{
		tmp->link = p->link;
		p->link = tmp;
	}
	return start;
}/*End of addatpos()*/
In all the cases, at the end we should call free (tmp) to physically remove node T from the memory.
After deletion
Node P is the first node
start points to node P
After deletion
start is NULL
tmp = start;
start = NULL;
In the second statement, we can write start->link instead of NULL. So this case reduces to the first one (3.1.5.1).
Before deletion
Link of P points to node T
Link of T points to node Q
After deletion
Node Q is after node P
Link of P points to node Q
The address of q is stored in tmp->link, so instead of q we can write tmp->link in the above statement.
p->link = tmp->link;
The value to be deleted is in node T and we need a pointer to its predecessor which is node P, so as in
addbefore() our condition for searching will be if (p->link->info == data). Here data is the element
to be deleted.
	p = start;
while(p->link!= NULL)
{
if (p->link->info == data)
{
tmp = p->link;
p->link = tmp->link;
free (tmp) ;
return start;
}
p = p->link;
Since the link of node T is NULL, in the above statement instead of NULL we can write tmp->link. Hence we can write this statement as-
p->link = tmp->link;
Figure 3.14
(i) Node D is the first node so start points to it.
(ii) Node A is the last node so its link is NULL.
(iii) Link of D points to C, link of C points to B and link of B points to A.
Now let us see how we can make a function for the reversal of a linked list. We will take three pointers
prev, ptr and next. Initially the pointer ptr will point to start and prev will be NULL. In each pass, first
the link of pointer ptr is stored in pointer next and after that the link of ptr is changed so that it points to its
previous node. The pointers prev and ptr are moved forward. Here is the function reverse() that reverses a
linked list.
struct node *reverse(struct node *start)
{
struct node *prev, *ptr, *next;
prev = NULL;
	ptr = start;
while (ptr!=NULL)
{
next = ptr->link;
ptr->link = prev;
		prev = ptr;
ptr = next;
}
start = prev;
return start;
}/*End of reverse()*/
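A quick way to convince yourself that reverse() works is to build a small list and check the order of the elements before and after the call. The sketch below uses an illustrative push_front() helper (not from the text) together with the reverse() function as given above.

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
	int info;
	struct node *link;
};

/* Illustrative helper to build a list front-to-back */
struct node *push_front(struct node *start, int data)
{
	struct node *tmp = malloc(sizeof(struct node));
	tmp->info = data;
	tmp->link = start;
	return tmp;
}

/* reverse() as in the text: three pointers, links turned one by one */
struct node *reverse(struct node *start)
{
	struct node *prev, *ptr, *next;
	prev = NULL;
	ptr = start;
	while (ptr != NULL)
	{
		next = ptr->link;  /* remember the rest of the list */
		ptr->link = prev;  /* reverse this node's link */
		prev = ptr;
		ptr = next;
	}
	start = prev;          /* prev now points to the new first node */
	return start;
}
```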
The following example shows all the steps of reversing a single linked list.
(Figure: the list after each pass; in every pass the link of the node pointed to by ptr is reversed, and prev and ptr move one node forward until ptr becomes NULL.)
Figure 3.15 Doubly linked list
The basic logic for all the operations is the same as in a single linked list, but here we have to do a little extra work because there is one more pointer that has to be updated each time. The main() function for the program of doubly linked list is-
/*P3.2 Program of doubly linked list*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
struct node *prev;
int info;
struct node *next;
};
struct node *create_list(struct node *start) ;
void display(struct node *start); wt
struct node *addtoempty(struct node *start,int data) ; amen
struct node *addatbeg(struct node *start,int data); =
struct node *addatend(struct node *start,int data);
struct node *addafter(struct node *start,int data,int item);
struct node *addbefore(struct node *start,int data,int item);
struct node *del(struct node *start,int data);
struct node *reverse(struct node *start);
main()
{
	struct node *start = NULL;
	int choice, data, item;
	while(1)
	{
		/* menu printf()s and scanf("%d",&choice) as in program P3.1 */
		switch(choice)
{
case 1:
start=create_list(start) ;
break;
case 2:
display(start);
break;
case 3:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
start=addtoempty (start, data) ;
break;
case 4:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
start=addatbeg(start,data);
break;
case 5:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ; °
start=addatend(start, data) ;
break;
case 6:
printf("Enter the element to be inserted : ");
scanf("%d", &data) ;
printf("Enter the element after which to insert : ");
scanf ("%d", &item) ; :
start=addafter(start,data,item);
break;
case 7:
printf("Enter the element to be inserted : ");
scanf("%d", &data) ;
printf("Enter the element before which to insert :.");
scanf("%d",&item) ;
start=addbefore(start,data,item) ;
break;
case 8:
printf("Enter the element to be deleted : ");
scanf("%d",&data);
start=del (start, data) ;
break;
case 9:
start=reverse(start);
break;
case 10:
exit(1);
default:
printf ("Wrong choice\n");
}/*End of switch*/
}/*End of while*/
}/*End of main ()*/
The next part of node T should point to node P, and address of node P is in start so we should write-
tmp->next = start;
Node T is inserted before node P so prev part of node P should now point to node T.
start->prev = tmp;
Now node T has become the first node so start should point to it.
start = tmp;
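Putting these three statements together, a sketch of addatbeg() for the doubly linked list follows; it assumes the list is non-empty, since the empty-list case is handled by addtoempty() in this program.

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
	struct node *prev;
	int info;
	struct node *next;
};

/* Sketch: insertion at the beginning of a non-empty doubly linked list */
struct node *addatbeg(struct node *start, int data)
{
	struct node *tmp = malloc(sizeof(struct node));
	tmp->info = data;
	tmp->prev = NULL;    /* T will be the first node */
	tmp->next = start;   /* next of T points to the old first node */
	start->prev = tmp;   /* prev of the old first node points to T */
	start = tmp;
	return start;
}
```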
In single linked list this case had reduced to the case of insertion at the beginning but here it is not so.
struct node *addtoempty(struct node *start,int data)
{
. struct node *tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
tmp->info = data;
tmp->prev = NULL;
tmp->next = NULL;
start=tmp;
return start;
}/*End of addtoempty()*/
Now we will see how we can write the function addafter() for doubly linked list. We are given a value
and the new node is to be inserted after the node containing this value.
Suppose node P contains this value so we have to add new node after node P. As in single linked list here also
we can traverse the list and find a pointer p pointing to node P. Now in the four insertion statements we can
replace q by p->next .
tmp->prev = p;   =>   tmp->prev = p;
tmp->next = q;   =>   tmp->next = p->next;
q->prev = tmp;   =>   p->next->prev = tmp;
p->next = tmp;   =>   p->next = tmp;
Note that p->next should be changed at the end because we are using it in the previous statements.
In single linked list we had seen that the case of inserting after the last node was handled automatically. But
here when we insert after the last node the third statement (p->next->prev = tmp;) will create problems.
The pointer p points to last node so its next is NULL hence the term p->next->prev is meaningless here. To
avoid this problem we can put a check like this-
if(p->next != NULL)
	p->next->prev = tmp;
struct node *addafter(struct node *start,int data,int item)
{
	struct node *tmp,*p;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = data;
	p = start;
	while(p != NULL)
	{
if(p->info == item)
{
tmp->prev = p;
tmp->next = p->next;
			if(p->next != NULL)
				p->next->prev = tmp;
p->next = tmp;
			return start;
}
p = p->next;
}
printf("%d not present in the list\n",item) ;
\ return start;
}/*End of addafter()*/
Now we will see how to write the function addbefore() for doubly linked list. In this case suppose we have to insert the new node before node Q, so we will traverse the list and find a pointer q to node Q. In single linked list we had to find the pointer to the predecessor, but here there is no need to do so because we can get the address of the predecessor from q->prev. So just replace p by q->prev in the four insertion statements.
tmp->prev = p;   =>   tmp->prev = q->prev;
tmp->next = q;   =>   tmp->next = q;
q->prev = tmp;   =>   q->prev = tmp;
p->next = tmp;   =>   q->prev->next = tmp;
q->prev should be changed at the end because it is being used in the other statements. Thus the third statement should be the last one.
tmp->prev = q->prev;
tmp->next = q;
q->prev->next = tmp;
q->prev = tmp;
As in single linked list, here also we will have to handle the case of insertion before the first node separately.
struct node *addbefore(struct node *start,int data,int item)
{
struct node *tmp,*q;
if (start==NULL)
{
printf("List is empty\n");
return start;
}
if(start->info == item)
{
tmp = (struct node *)malloc(sizeof(struct node));
tmp->info = data;
tmp->prev = NULL;
tmp->next = start;
start->prev = tmp;
start = tmp;
return start;
}
q = start;
while (q!=NULL)
{
		if(q->info == item)
{
			tmp = (struct node *)malloc(sizeof(struct node));
tmp->prev = q->prev;
tmp->next = q;
			q->prev->next = tmp;
q->prev = tmp;
			return start;
		}
		q = q->next;
	}
	printf("%d not present in the list\n",item);
	return start;
}/*End of addbefore()*/
Node T
Figure 3.20 Deletion of the first node
tmp will be assigned the address of the first node.
tmp = start;
start will be updated so that now it points to node P
start = start->next;
Now node P is the first node so its prev part should contain NULL.
start->prev = NULL;
Figure 3.21 Deletion of the only node
The two statements for deletion will be -
tmp = start;
start = NULL;
In single linked list we had seen that this case reduced to the previous one. Let us see what happens here. We can write start->next instead of NULL in the second statement, but even then this case does not reduce to the previous one. This is because of the third statement of the previous case: since start becomes NULL, the term start->prev is meaningless.
Before deletion, node T is between nodes P and Q. After deletion, next of node P should point to node Q and prev of node Q should point to node P-
p->next = q;
q->prev = p;
In single linked list, this case reduced to the previous case but here it won’t.
struct node *del(struct node *start,int data)
{
struct node *tmp;
if(start == NULL)
{
printf("List is empty\n");
return start;
}
	if(start->next == NULL)	/*Deletion of the only node*/
	{
		if(start->info == data)
		{
			tmp = start;
			start = NULL;
			free(tmp);
			return start;
		}
		else
		{
			printf("Element %d not found\n",data);
			return start;
		}
	}
if(start->info == data) /*Deletion of first node*/
{
tmp = start;
start = start->next;
start->prev = NULL;
free(tmp) ;
; return start;
}
tmp = start->next; /*Deletion in between*/
while (tmp->next!=NULL )
{
if(tmp->info == data)
{
			tmp->prev->next = tmp->next;
			tmp->next->prev = tmp->prev;
free (tmp) ;
return start;
}
tmp = tmp->next;
}
if(tmp->info == data) /*Deletion of last node*/
{
tmp->prev->next = NULL;
free(tmp) ;
return start; /
}
printf("Element %d not found\n",data) ;
return start;
}/*End of del()*/
Figure 3.24
In the reversed list-
(i) start points to Node D.
(ii) Node D is the first node so its prev is NULL.
(iii) Node A is the last node so its next is NULL.
(iv) next of D points to C, next of C points to B and next of B points to A.
(v) prev of A points to B, prev of B points to C, prev of C points to D.
For making the function of reversal of doubly linked list we will need only two pointers.
struct node *reverse(struct node *start)
{
	struct node *p1,*p2;
	p1 = start;
	p2 = p1->next;
	p1->next = NULL;
	p1->prev = p2;
while (p2!=NULL)
{
p2->prev = p2->next;
		p2->next = p1;
		p1 = p2;
p2 = p2->prev;
}
	start = p1;
printf("List reversed\n");
return start;
}/*End of reverse()*/
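Note that reverse() above dereferences p1->next unconditionally, so it assumes the list has at least two nodes. A guarded variant simply returns early otherwise; the sketch below (the name reverse_d is illustrative) adds that check around the same swapping logic.

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
	struct node *prev;
	int info;
	struct node *next;
};

/* Sketch: reversal of a doubly linked list, guarded for short lists */
struct node *reverse_d(struct node *start)
{
	struct node *p1, *p2;
	if (start == NULL || start->next == NULL)
		return start;            /* empty or single node: nothing to do */
	p1 = start;
	p2 = p1->next;
	p1->next = NULL;             /* old first node becomes the last node */
	p1->prev = p2;
	while (p2 != NULL)
	{
		p2->prev = p2->next;     /* swap prev and next of this node */
		p2->next = p1;
		p1 = p2;
		p2 = p2->prev;           /* old next, now saved in prev */
	}
	return p1;                   /* p1 is the new first node */
}
```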
In a doubly linked list we have an extra pointer which consumes extra space, and maintenance of this pointer makes the operations lengthier and more time consuming. So doubly linked lists are beneficial only when we frequently need the predecessor of a node.
Figure 3.25
Each node has a successor and all the nodes form a ring. Now we can access any node of the linked list without going back and starting the traversal again from the first node, because the list is in the form of a circle and we can go from the last node to the first node.
We take an external pointer that points to the last node of the list. If we have a pointer last pointing to the last node, then last->link will point to the first node.
Node P NodeZ
Figure 3.26
In the figure 3.26, the pointer last points to node Z and last->link points to node P. Let us see why we
have taken a pointer that points to the last node instead of first node. Suppose we take a pointer start pointing
to first node of circular linked list. Take the case of insertion of a node in the beginning.
Figure 3.27
For insertion of node T in the beginning we need the address of node Z, because we have to change the link
of node Z and make it point to node T. So we will have to traverse the whole list. For insertion at the end it is
obvious that the whole list has to be traversed. If instead of pointer start we take a pointer to the last node
then in both the cases there won't be any need to traverse the whole list. So insertion in the beginning or at the end takes constant time irrespective of the length of the list.
If the circular list is empty, the pointer last is NULL, and if the list contains only one element then the link of last points to last itself.
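For example, addtoempty() for the circular list only has to make the single new node point to itself. A sketch consistent with the declarations in program P3.3:

```c
#include <stdio.h>
#include <stdlib.h>

struct node
{
	int info;
	struct node *link;
};

/* Sketch: insertion into an empty circular list (last is NULL on entry) */
struct node *addtoempty(struct node *last, int data)
{
	struct node *tmp = malloc(sizeof(struct node));
	tmp->info = data;
	last = tmp;
	last->link = last;   /* only node: its link points to itself */
	return last;
}
```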
Now let us see different operations on the circular linked list. The algorithms are similar to those of the single linked list, but we have to make sure that after completing any operation the link of the last node points to the first node.
/*P3.3 Program of circular linked list*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
	int info;
	struct node *link;
};
struct node *create_list(struct node *last);
void display(struct node *last);
struct node *addtoempty(struct node *last,int data);
struct node *addatbeg(struct node *last,int data);
struct node *addatend(struct node *last,int data);
struct node *addafter(struct node *last,int data,int item);
struct node *del(struct node *last,int data);
main()
{
int choice, data, item;
struct node *last=NULL;
while(1)
{
printf("1.Create List\n");
printf("2.Display\n");
printf£("3.Add to empty list\n");
printf("4.Add at beginning\n");
printf£("5.Add at end\n") ;
printf("6.Add after\n");
printf("7.Delete\n");
printf("8.Quit\n");
printf("Enter your choice : ");
scanf("%d",&choice);
switch (choice)
{
case 1:
last=create_list(last);
break;
case 2:
display (last);
break;
case 3:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
last=addtoempty(last,data);
break;
case 4:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
last=addatbeg(last,data);
break;
case 5:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
last=addatend(last,data) ;
break;
case 6:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
printf("Enter the element after which to insert : ");
scanf ("%d", &item) ;
last=addafter(last,data, item);
break;
case 7:
printf("Enter the element to be deleted : ");
scanf("%d", &data) ;
last=del (last, data) ;
break;
case 8:
exit(1);
default:
printf("Wrong choice\n") ;
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
void display(struct node *last)
{
	struct node *p;
	if(last == NULL)
	{
		printf("List is empty\n");
		return;
	}
	p = last->link;
	do
	{
		printf("%d ",p->info);
		p = p->link;
	}while(p != last->link);
	printf("\n");
}/*End of display()*/
Figure 3.27 Insertion at the beginning of the list
Before insertion, P is the first node so last->link points to node P.
After insertion, the link of node T should point to node P, and the address of node P is in last->link-
tmp->link = last->link;
We know that last->link always points to the first node; here T is the new first node, so last->link should point to node T.
last->link = tmp;
The order of the above statements is important.
struct node *addatend(struct node *last,int data)
{
	struct node *tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = data;
	tmp->link = last->link;
	last->link = tmp;
	last = tmp;
return last;
}/*End of addatend()*/
After deletion, the link of node Z should point to node A, so last->link should point to node A. The address of node A is in the link of node T.
last->link = tmp->link;
There will be only one node in the list if link of last node points to itself. After deletion the list will become
empty so NULL is assigned to last.
last = NULL;
Before deletion, last points to node T and last->link points to node A; p is a pointer to node P. After deletion, the link of node P should point to node A. The address of node A is in last->link.
p->link = last->link;
struct node *del(struct node *last,int data)
{
	struct node *p,*tmp;
	if(last == NULL)
	{
		printf("List is empty\n");
		return last;
	}
	if(last->link == last && last->info == data)	/*Deletion of the only node*/
	{
		tmp = last;
		last = NULL;
		free(tmp);
		return last;
	}
	if(last->link->info == data)	/*Deletion of first node*/
	{
		tmp = last->link;
		last->link = tmp->link;
		free(tmp);
		return last;
	}
	p = last->link;
	while(p->link != last)	/*Deletion in between*/
	{
if(p->link->info == data)
{
tmp = p->link;
p->link = tmp->link;
free (tmp) ;
return last;
}
p = p->link;
}
	if(last->info == data)	/*Deletion of last node*/
	{
tmp = last;
p->link = last->link;
		last = p;
free(tmp) ;
return last;
}
printf("Element %d not found\n", data) ;
return last;
}/*End of del()*/
We have studied circular lists which are singly linked. Double linked lists can also be made circular. In this
case the next pointer of last node points to first node, and the prev pointer of first node points to the last node.
/*P3.4 Program of single linked list with header node*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
	int info;
	struct node *link;
};
struct node *create_list(struct node *head);
void display(struct node *head);
struct node *addatend(struct node *head,int data) ;
struct node *addbefore(struct node *head,int data,int item);
struct node *addatpos(struct node *head,int data,int pos);
struct node *del(struct node *head,int data);
struct node *reverse(struct node *head);
main ()
{
int choice,data, item,pos;
struct node *head;
head = (struct node *)malloc(sizeof (struct node) );
head->info = 0;
head->link = NULL;
head = create_list (head) ;
while(1)
{
printf("1.Display\n");
printf("2.Add at end\n");
printf ("3.Add before node\n") ;
printf("4.Add at position\n") ;
printf ("5.Delete\n");
printf("6.Reverse\n") ;
printf("7.Quit\n");
printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch(choice)
{
case 1:
display (head) ;
break;
case 2:
printf("Enter the element to be inserted : ");
scant ("%d", &data) ;
head = addatend(head, data) ;
break;
case 3:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
printf("Enter the element before which to insert : ");
scanf("%d",&item) ;
head = addbefore (head, data, item) ;
break;
case 4:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
printf("Enter the position at which to insert : yh
scanf("%d",&pos) ;
head = addatpos(head,data,pos);
break;
case 5:
printf("Enter the element to be deleted : ");
scanf ("%d",&data) ;
head = del (head, data) ;
break;
case 6:
head = reverse (head);
break;
case 7:
exit(1);
default:
printf("Wrong choice\n\n");
}/*End of switch */
}/*End of while */
}/*End of main()*/
}
p = p->link;
}
printf("%d not present in the list\n",item) ;
return head;
}/*End of addbefore()*/
struct node *addatpos(struct node *head,int data,int pos)
{
struct node *tmp,*p;
	int i;
tmp = (struct node *)malloc(sizeof (struct node) )j;
tmp->info = data;
p = head;
	for(i=1; i<=pos-1; i++)
	{
		p = p->link;
if (p==NULL)
{
printf("There are less than %d elements\n",pos) ;
return head;
}
}
tmp->link = p->link;
p->link = tmp;
return head;
}/*End of addatpos()*/
The header node can be attached to circular linked lists and doubly linked lists also. The following figure shows a circular single linked list with a header node.
List with 3 nodes Empty list
Figure 3.36 Circular single linked list with header
Here the external pointer points to the header node rather than to the last node.
The following figures show doubly linked list and doubly linked circular list with header nodes.
Figure 3.37
(Figure: a sorted linked list shown initially and after inserting 29, 6 and 50; each element is inserted at its proper place.)
The new node has to be inserted after the node which is pointed by pointer p. The two lines of insertion are
same as in single linked list.
tmp->link = p->link;
p->link = tmp;
If the insertion is to be done at the end, then also the above statements will work.
Other functions like display(), count() etc. will remain the same. The function search() will be altered a little, because here we can stop our search as soon as we find an element with a value larger than the given element to be searched. Functions like addatbeg(), addatend(), addafter(), addbefore(), addatpos() don't make sense here, because if we use these functions then the sorted order of the list might get disturbed. The function insert_s() decides where the element has to be inserted and inserts it at the proper place. The process of deletion is same as in single linked list.
/*P3.5 Program of sorted linked list*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
	int info;
	struct node *link;
};
struct node *insert_s(struct node *start,int data);
void search(struct node *start,int data);
void display(struct node *start);
main ()
{
int choice, data;
struct node *start = NULL;
while(1)
{
printf("1.Insert\n");
printf("2.Display\n") ;
printf("3.Search\n");
printf("4.Exit\n");
printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch (choice)
{
case 1:
printf("Enter the element to be inserted : ");
scanf ("%d", &data) ;
start = insert_s(start, data) ;
break;
case 2:
display (start);
break;
case 3:
printf("Enter the element to be searched : ");
scanf ("%d", &data) ;
search(start, data);
break;
case 4:
exit(1);
default:
printf ("Wrong choice\n");
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
struct node *insert_s(struct node *start,int data)
{
	struct node *tmp,*p;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = data;
	if(start == NULL || data < start->info)	/*Insertion before first node*/
	{
		tmp->link = start;
		start = tmp;
		return start;
	}
	else
{
p = start;
while (p->link!=NULL && p->link->info < data)
p = p->link;
tmp->link = p->link;
p->link = tmp;
}
return start;
}/*End of insert_s()*/
void search(struct node *start,int data)
{
struct node *p;
int pos;
if(start==NULL || data < start->info)
{
		printf("%d not found in list\n",data);
return;
}
p = start;
pos = 1;
while(p!=NULL && p->info<=data)
{
		if(p->info == data)
		{
			printf("%d found at position %d\n",data,pos);
			return;
		}
ta Structures Through C in t'
86
p= p->link;
postt+;
}
printf("%d not found in list\n",data) ;
}/*End of search()*/
void display(struct node *start)
{
	struct node *q;
	if(start == NULL)
	{
		printf("List is empty\n");
		return;
	}
	q = start;
	printf("List is :\n");
	while(q!=NULL)
	{
		printf("%d ",q->info);
		q = q->link;
	}
	printf("\n");
}/*End of display()*/
Figure 3.39
The following function will sort a single linked list through selection sort technique by exchanging data.
void selection(struct node *start)
{
	...
}/*End of selection()*/
The terminating condition for the outer loop is (p->link!=NULL), so it will terminate when p points to the last
node, i.e. it will work till p reaches the second last node. The terminating condition for the inner loop is (q!=NULL), so
it will terminate when q becomes NULL, i.e. it will work till q reaches the last node. After each iteration of the outer
loop, the smallest element from the unsorted elements will be placed at its proper place. In figure 3.40, the
shaded portion shows the elements that have been placed at their proper place.
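Following the loop conditions described above, the whole technique can be sketched as a runnable fragment. This is a sketch rather than the verbatim listing; the helper push_front() and the test values are ours, while the node structure matches the earlier programs.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
};

/* Helper (ours): insert at the beginning, used only to build a test list */
struct node *push_front(struct node *start, int data)
{
    struct node *tmp = malloc(sizeof(struct node));
    tmp->info = data;
    tmp->link = start;
    return tmp;
}

/* Selection sort by exchanging the info parts of the nodes */
void selection(struct node *start)
{
    struct node *p, *q;
    int tmp;
    /* outer loop stops when p points to the last node */
    for (p = start; p != NULL && p->link != NULL; p = p->link) {
        /* inner loop stops when q becomes NULL */
        for (q = p->link; q != NULL; q = q->link) {
            if (p->info > q->info) {
                tmp = p->info;
                p->info = q->info;
                q->info = tmp;
            }
        }
    }
}
```

After each pass of the outer loop the node pointed to by p holds the smallest of the remaining elements, exactly as the shaded portions of figure 3.40 illustrate.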
The pointer variable end is NULL in the first iteration of the outer loop, so the inner loop will terminate when p
points to the last node, i.e. the inner loop will work only till p reaches the second last node. After the first
iteration, the value of end is updated and is made equal to q. So now end points to the last node. This time the inner loop
will terminate when p points to the second last node, i.e. the inner loop will work only till p reaches the third last
node.
After each iteration of the outer loop, the pointer end moves one node back towards the beginning. Initially end
is NULL, after the first iteration it points to the last node, after the second iteration it points to the
second last node and so on. The terminating condition for the outer loop is taken as (end!=start->link), so the outer loop will
terminate when end points to the second node, i.e. the outer loop will work only till end reaches the third node.
First pass (pointer end is NULL): 4<5 no exchange, 5>2 exchange, 3>2 exchange.
}/*End of bubble()*/
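Based on this description, bubble sort by exchanging data can be sketched as follows. This is our sketch, not the verbatim listing; push_front() is a helper of ours for building a test list, and end marks the start of the already sorted tail of the list.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
};

/* Helper (ours): insert at the beginning, used only to build a test list */
struct node *push_front(struct node *start, int data)
{
    struct node *tmp = malloc(sizeof(struct node));
    tmp->info = data;
    tmp->link = start;
    return tmp;
}

/* Bubble sort by exchanging data; end points to the first sorted node */
void bubble(struct node *start)
{
    struct node *p, *q, *end = NULL;
    int tmp;
    if (start == NULL)
        return;
    /* outer loop stops when end reaches the second node */
    while (end != start->link) {
        /* inner loop stops when p->link reaches end */
        for (p = start; p->link != end; p = p->link) {
            q = p->link;
            if (p->info > q->info) {
                tmp = p->info;
                p->info = q->info;
                q->info = tmp;
            }
        }
        end = p;   /* p is now the last unsorted node; end moves one node back */
    }
}
```

Each pass bubbles the largest remaining element to just before end, so end walks backwards from NULL towards the beginning, as described above.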
In both the loops, pointers p and q are initialized and changed in the same way as in selection sort by
exchanging data. Now let us see what is to be done if p->info is greater than q->info.
Node R -> Node P -> Node A -> ... -> Node S -> Node Q -> Node B
The positions of nodes P and Q have to be exchanged, i.e. node P should be between nodes S and B and node Q
should be between nodes R and A
(i) Node P should be before node B, so link of node P should point to node B
p->link = q->link;
(ii) Node Q should be before node A, so link of node Q should point to node A
q->link = p->link;
(iii) Node Q should be after node R, so link of node R should point to node Q
r->link = q;
(iv) Node P should be after node S, so link of node S should point to node P
s->link = p;
For writing the first two statements we will need a temporary pointer, since we are exchanging
p->link and q->link.
tmp = p->link;
p->link = q->link;
q->link = tmp;
If p points to the first node, then r also points to the first node, i.e. nodes R and P are the same, so in this
case there is no need to write the third statement (r->link = q;).
We need the third statement only if the pointer p is not equal to start. So it can be written as-
if(p!=start)
	r->link = q;
If start points to node P, then start needs to be updated and now it should point to node Q-
if(p==start)
	start = q;
After writing the above statements the linked list will look like this-
The positions of nodes P and Q have changed and this is what we want because the value in node P was
more than value in node Q. Now we will bring the pointers p and q back to their positions to continue with our
sorting process. For this we will exchange the pointers p and q with the help of a temporary pointer.
tmp = p; p = q; q = tmp;
}
}
return start;
}/*End of selection_l()*/
In the previous figures, we have taken the case when p and q point to non-adjacent nodes, and we have
written our code according to this case only. Now let us see whether this code will work when p and q point to
adjacent nodes. The pointers p and q will point to adjacent nodes only in the first iteration of the inner loop. In that
case s and q point to the same node. The following figure shows this situation-
You can see that the code we have written will work in this case also. So there is no need to consider a
separate case when nodes p and q are adjacent.
When we sort a linked list by exchanging data, the links are not disturbed so start remains the same. But when
sorting is done by rearranging links, the value of start might change, so it is necessary to return the value of
start at the end of the function.
		}
	}
	return start;
}/*End of bubble_l()*/
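The link-rearranging steps worked out above can be assembled into a complete sketch of bubble sort by rearranging links. This is our sketch, not the verbatim listing: since in bubble sort q is always p->link, the adjacent-node form of the exchange (two statements instead of four) is enough, and a pointer r trails p to play the role of node R. The helper push_front() is ours.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
};

/* Helper (ours): insert at the beginning, used only to build a test list */
struct node *push_front(struct node *start, int data)
{
    struct node *tmp = malloc(sizeof(struct node));
    tmp->info = data;
    tmp->link = start;
    return tmp;
}

/* Bubble sort by rearranging links; start may change, so it is returned */
struct node *bubble_l(struct node *start)
{
    struct node *p, *q, *r, *end = NULL, *tmp;
    if (start == NULL)
        return start;
    while (end != start->link) {
        r = p = start;
        while (p->link != end) {
            q = p->link;
            if (p->info > q->info) {
                /* exchange positions of adjacent nodes p and q */
                p->link = q->link;
                q->link = p;
                if (p != start)
                    r->link = q;   /* predecessor now points to q */
                else
                    start = q;     /* q becomes the new first node */
                /* bring p and q back to their roles to continue */
                tmp = p; p = q; q = tmp;
            }
            r = p;
            p = p->link;
        }
        end = p;
    }
    return start;
}
```

A call must be written as start = bubble_l(start), since the first node can change during sorting.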
3.7 Merging
If there are two sorted linked lists, then the process of combining these sorted lists into another list of sorted
order is called merging. The following figure shows two sorted lists and a third list obtained by merging them.
Merged List
Figure 3.42
If there is any element that is common to both the lists, it will be inserted only once in the third list. For
merging, both the lists are scanned from left to right. We will take one element from each list, compare them
and then take the smaller one in third list. This process will continue until the elements of one list are finished.
Then we will take the remaining elements of unfinished list in third list. The whole process for merging is
shown in the figure 3.43. We’ve taken two pointers p1 and p2 that will point to the nodes that are being
compared. There can be three cases while comparing p1->info and p2->info
1. If (p1->info) < (p2->info)
The new node that is added to the resultant list has info equal to p1->info. After this we will make p1 point to
the next node of first list.
2. If (p2->info) < (pl->info)
The new node that is added to the resultant list has info equal to p2->info. After this we will make p2 point to
the next node of second list.
3. If (p1->info) == (p2->info)
The new node that is added to the resultant list has info equal to p1->info (or p2->info). After this we will
make p1 and p2 point to the next nodes of the first list and second list respectively. The procedure of merging is
shown in figure 3.43.
while(p1!=NULL && p2!=NULL)
{
	if(p1->info < p2->info)
	{
		start3 = insert(start3,p1->info);
		p1 = p1->link;
	}
	else if(p2->info < p1->info)
	{
		start3 = insert(start3,p2->info);
		p2 = p2->link;
	}
	else if(p1->info == p2->info)
	{
		start3 = insert(start3,p1->info);
		p1 = p1->link;
		p2 = p2->link;
	}
}
The above loop will terminate when either list finishes. Now we have to add the remaining nodes of
the unfinished list to the resultant list. If the second list has finished, then we will insert all the nodes of the first list in
the resultant list as-
while(p1!=NULL)
{
	start3 = insert(start3,p1->info);
	p1 = p1->link;
}
If the first list has finished, then we will insert all the nodes of the second list in the resultant list as-
while(p2!=NULL)
{
	start3 = insert(start3,p2->info);
	p2 = p2->link;
}
struct node *insert_s(struct node *start,int data)
{
	struct node *p,*tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = data;
	/*list empty or data to be added in beginning*/
	if(start == NULL || data<start->info)
	{
		tmp->link = start;
		start = tmp;
		return start;
	}
	else
	{
		p = start;
		while(p->link!=NULL && p->link->info < data)
			p = p->link;
		tmp->link = p->link;
		p->link = tmp;
	}
	return start;
}/*End of insert_s()*/
struct node *insert(struct node *start,int data)
{
	struct node *p,*tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = data;
	if(start == NULL)	/*If list is empty*/
	{
		tmp->link = start;
		start = tmp;
		return start;
	}
	else	/*Insert at the end of the list*/
	{
		p = start;
		while(p->link!=NULL)
			p = p->link;
		tmp->link = p->link;
		p->link = tmp;
	}
	return start;
}/*End of insert()*/
The function insert_s() is the same as the one we made for the sorted linked list and it inserts nodes in
ascending order. We have used this function to create the two sorted lists that are to be merged. The function
insert() is a simple function that inserts nodes at the end of a linked list. We have used this function to insert
nodes in the third list.
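Putting the pieces together, the whole merge can be written as one self-contained function. This is a sketch rather than the verbatim program: the function name merge() and the test values are ours, while insert() and the node structure follow the text.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
};

/* Insert a node at the end of the list, as in the text */
struct node *insert(struct node *start, int data)
{
    struct node *p, *tmp = malloc(sizeof(struct node));
    tmp->info = data;
    tmp->link = NULL;
    if (start == NULL)
        return tmp;
    for (p = start; p->link != NULL; p = p->link)
        ;
    p->link = tmp;
    return start;
}

/* Merge two sorted lists into a third sorted list; a common element is taken once */
struct node *merge(struct node *p1, struct node *p2)
{
    struct node *start3 = NULL;
    while (p1 != NULL && p2 != NULL) {
        if (p1->info < p2->info) {
            start3 = insert(start3, p1->info);
            p1 = p1->link;
        } else if (p2->info < p1->info) {
            start3 = insert(start3, p2->info);
            p2 = p2->link;
        } else {                 /* equal: take once, advance both */
            start3 = insert(start3, p1->info);
            p1 = p1->link;
            p2 = p2->link;
        }
    }
    /* take the remaining nodes of whichever list is unfinished */
    for (; p1 != NULL; p1 = p1->link)
        start3 = insert(start3, p1->info);
    for (; p2 != NULL; p2 = p2->link)
        start3 = insert(start3, p2->info);
    return start3;
}
```

Both input lists are left unchanged; the merged list is built entirely from new nodes.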
3.8 Concatenation
Suppose we have two single linked lists and we want to append one at the end of another. For this the link of
last node of first list should point to the first node of the second list. Let us take two single linked lists and
concatenate them.
start1: Node A -> Node B -> Node C -> Node D        start2: Node P -> ... -> Node R
(a) First Circular Linked List    (b) Second Circular Linked List
last1->link = last2->link;
We will lose the address of node A, so before writing this statement we should save last1->link.
ptr = last1->link;
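These steps can be collected into one small function for concatenating two circular lists. This is our sketch (the function name concat() is ours); it takes pointers to the last nodes of the two lists, as in the figure, and returns the last node of the combined list. The third statement, last2->link = ptr, closes the combined circle through the saved pointer.

```c
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
};

/* Concatenate two circular lists given pointers to their last nodes;
   returns the last node of the combined circular list */
struct node *concat(struct node *last1, struct node *last2)
{
    struct node *ptr;
    if (last1 == NULL)          /* first list empty */
        return last2;
    if (last2 == NULL)          /* second list empty */
        return last1;
    ptr = last1->link;          /* save address of first node of list 1 */
    last1->link = last2->link;  /* last of list 1 points to first of list 2 */
    last2->link = ptr;          /* last of list 2 points back to first of list 1 */
    return last2;
}
```

After the call, traversing from the returned node's link visits all nodes of the first list and then all nodes of the second.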
Each node has the form [ coefficient | exponent | link ]; the info part of a node holds the coefficient and exponent of one term, so a node [ 5 | 4 | ] represents the term 5x^4. The terms are stored in
descending order based on the exponent. An empty list will represent the zero polynomial. The following program
shows creation of polynomial linked lists and their addition and multiplication.
main()
{
struct node *startl = NULL,*start2 = NULL;
printf("Enter polynomial 1 :\n"); startl = create(startl) ;
printf("Enter polynomial 2 :\n"); start2 = create(start2);
printf("Polynomial 1 is : "); display(start1);
	printf("Polynomial 2 is : "); display(start2);
	poly_add(start1, start2);
	poly_mult(start1, start2);
}/*End of main()*/
struct node *insert_s(struct node *start,int co,int ex)
{
	struct node *ptr,*tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->coef = co;
	tmp->expo = ex;
	/*list empty or exponent greater than the first one*/
	if(start==NULL || ex > start->expo)
	{
		tmp->link = start;
		start = tmp;
	}
	else
	{
		ptr = start;
		while(ptr->link!=NULL && ptr->link->expo >= ex)
			ptr = ptr->link;
		tmp->link = ptr->link;
		ptr->link = tmp;
	}
	return start;
}/*End of insert_s()*/
struct node *insert(struct node *start,int co,int ex)
{
	struct node *ptr,*tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->coef = co;
	tmp->expo = ex;
	if(start==NULL)	/*If list is empty*/
	{
		tmp->link = start;
		start = tmp;
	}
	else	/*Insert at the end of the list*/
	{
		ptr = start;
		while(ptr->link!=NULL)
			ptr = ptr->link;
		tmp->link = ptr->link;
		ptr->link = tmp;
	}
	return start;
}/*End of insert()*/
			p1 = p1->link;
			p2 = p2->link;
		}
	}
	/*if poly2 has finished and elements left in poly1*/
	while(p1!=NULL)
	{
		start3 = insert(start3,p1->coef,p1->expo);
		p1 = p1->link;
	}
	/*if poly1 has finished and elements left in poly2*/
	while(p2!=NULL)
	{
		start3 = insert(start3,p2->coef,p2->expo);
		p2 = p2->link;
	}
	printf("Added polynomial is : ");
	display(start3);
}/*End of poly_add()*/
void poly_mult(struct node *p1, struct node *p2)
{
	struct node *start3;
	struct node *p2_beg = p2;
	start3 = NULL;
	if(p1 == NULL || p2 == NULL)
	{
		printf("Multiplied polynomial is zero\n");
		return;
	}
	while(p1!=NULL)
	{
		p2 = p2_beg;
		while(p2!=NULL)
		{
			start3 = insert_s(start3,p1->coef*p2->coef,p1->expo+p2->expo);
			p2 = p2->link;
		}
		p1 = p1->link;
	}
	printf("Multiplied polynomial is : ");
	display(start3);
}/*End of poly_mult()*/
The pointers p1 and p2 will point to the current nodes in the polynomials which will be added. The process
of addition is shown in the figure 3.46. Both polynomials are traversed until one polynomial finishes. We can
have three cases-
1. If (p1->expo) > (p2->expo)
The new node that is added to the resultant list has coefficient equal to p1->coef and exponent equal to
p1->expo. After this we will make p1 point to the next node of polynomial 1.
2. If (p2->expo) > (p1->expo)
The new node that is added to the resultant list has coefficient equal to p2->coef and exponent equal to
p2->expo. After this we will make p2 point to the next node of polynomial 2.
3. If (p1->expo) == (p2->expo)
The new node that is added to the resultant list has coefficient equal to (p1->coef + p2->coef) and exponent
equal to p1->expo (or p2->expo). After this we will make p1 and p2 point to the next nodes of polynomial 1
and polynomial 2 respectively. The procedure of polynomial addition is shown in figure 3.46.
while(p1!=NULL && p2!=NULL)
{
	if(p1->expo > p2->expo)
	{
		p3_start = insert(p3_start,p1->coef,p1->expo);
		p1 = p1->link;
	}
	else if(p2->expo > p1->expo)
	{
		p3_start = insert(p3_start,p2->coef,p2->expo);
		p2 = p2->link;
	}
	else if(p1->expo == p2->expo)
	{
		p3_start = insert(p3_start,p1->coef+p2->coef,p1->expo);
		p1 = p1->link;
		p2 = p2->link;
	}
}
The above loop will terminate when either polynomial finishes. Now we have to add the remaining
nodes of the unfinished polynomial to the resultant list. If polynomial 2 has finished, then we will put all the
terms of polynomial 1 in the resultant list as-
while (p1!=NULL)
{
p3_start = insert (p3_start,pl->coef,p1->expo) ;
	p1 = p1->link;
}
If polynomial 1 has finished, then we will put all the terms of polynomial 2 in the resultant list as-
while(p2!=NULL)
{
	p3_start = insert(p3_start,p2->coef,p2->expo);
	p2 = p2->link;
}
We can see the advantage of storing the terms in descending order of their exponents. If it was not so then
we would have to scan both the lists many times.
Exercise
In all the problems assume that we have an integer in the info part of nodes.
1, Write a function to count the number of occurrences of an element in a single linked list.
2. Write a function to find the smallest and largest element of a single linked list.
3. Write a function to check if two linked lists are identical. Two lists are identical if they have the same number of
elements and the corresponding elements in both lists are the same.
4. Write a function to create a copy of a single linked list.
5. Given a linked list L, write a function to create a single linked list that is the reverse of the list L. For example if
the list L is 1->2->3->4->5 then the new list should be 5->4->3->2->1. The list L should remain
unchanged.
6. Write a program to swap adjacent elements of a single linked list
(i) by exchanging info part
(ii) by rearranging links.
For example if a linked list is 1->2->3->4->5->6->7->8, then after swapping adjacent elements it should
become 2->1->4->3->6->5->8->7.
7. Write a program to swap adjacent elements of a double linked list by rearranging links.
8. Write a program to swap the first and last elements of a single linked list
(i) by exchanging info part.
(ii) by rearranging links.
9. Write a function to move the largest element to the end of a single linked list.
10. Write a function to move the smallest element to the beginning of a single linked list.
11. Write a function for deleting all the nodes from a single linked list which have a value N.
12. Given a single linked list L1 which is sorted in ascending order, and another single linked list L2 which is
not sorted, write a function to print the elements of the second list according to the first list. For example if the first
list is 1->2->5->7->8, then the function should print the 1st, 2nd, 5th, 7th and 8th elements of the second list.
13. Write a program to remove first node of the list and insert it at the end, without changing info part of any
node.
14. Write a program to remove the last node of the list and insert it in the beginning, without changing the info part
of any node.
15. Write a program to move a node N positions forward in a single linked list.
16. Write a function to delete a node from a single linked list. The only information we have is a pointer to the
node that has to be deleted.
17. Write functions to insert a node just before and just after a node pointed to by a pointer p, without using the
pointer start.
18. What is wrong in the following code that attempts to free all the nodes of a single linked list?
p=start;
while (p!=NULL)
{
free(p);
p=p->link;
}
Write a function Destroy () that frees all the nodes of a single linked list.
19. Write a function to remove duplicates from a sorted single linked list.
20. Write a function to remove duplicates from an unsorted single linked list.
21. Write a function to create a linked list that is intersection of two single linked lists, i.e. it contains only the
elements which are common to both the lists.
22. Write a function to create a linked list that is union of two single linked lists, i.e. it contains all elements of
both lists and if an element is repeated in both lists, then it is included only once.
23. Given a list L1, delete all the nodes having negative numbers in info part and insert them into list L2 and all
the nodes having positive numbers into list L3. No new nodes should be allocated.
24. Given a linked list L1, create two linked lists, one having the even numbers of L1 and the other having the
odd numbers of L1. Don't change the list L1.
25. Write a function to delete alternate nodes(even numbered nodes) from a single linked list. For example if the
list is 1->2->3->4->5->6->7 then the resulting list should be 1->3->5->7.
26. Write a function to get the nth node from the end of a single linked list, without counting the elements or
reversing the list.
27. Write a function to find out whether a single linked list is NULL terminated or contains a cycle/loop. If the list
contains a cycle, find the length of the cycle and the length of the whole list. Find the node that causes the cycle
i.e. the node at which the cycle starts. This node is pointed to by two different nodes of the list. Remove the cycle
from the list and make it NULL terminated.
28. Write a function to find out the middle node of a single linked list without counting all the elements of the
list.
29. Write a function to split a single linked list into two halves.
30. Write a function to split a single linked list into two lists at a node containing the given information.
31. Write a function to split a single linked list into two lists such that the alternate nodes(even numbered nodes)
go to a new list.
32. Write a function to combine the alternate nodes of two null terminated single linked lists. For example if the
first list is 1->2->3->4 and the second list is 5->7->8->9 then after combining them the first list should be
1->5->2->7->3->8->4->9 and second list should be empty. If both lists are not of the same length, then the
remaining nodes of the longer list are taken in the combined list. For example if the first list is 1->2->3->4 and
the second list is 5->7 then the combined list should be 1->5->2->7->3->4.
33. Suppose there are two null terminated single linked lists which merge at a given point and share all the
nodes after that merge point (Y shaped lists). Write a function to find the merge point (intersection point).
34. Create a double linked list in which info part of each node contains a digit of a given number. The digits
should be stored in reverse order, i.e. the least significant digit should be stored in the first node and the most
significant digit in the last node. If the number is 5468132 then the linked list should be
2->3->1->8->6->4->5. Write a function to add two numbers represented by linked lists.
35. Modify the program in the previous problem so that now in each node of the list we can store 4 digits of the
given number. For example if the number is 23156782913287 then the linked list would be
3287->8291->1567->23.
36. Write a function to find whether a linked list is palindrome or not.
37. Construct a linked list in which each node has the following information about a student - rollno, name,
marks in 3 subjects. Enter records of different students in the list. Traverse this list and calculate the total marks and
percentage of each student. Count the number of students who scored passing marks (above 40 percent).
38. Modify the previous program so that now the names are inserted in alphabetical order in the list. Make this
program menu driven with the following menus-
(i) Create list (ii) Insert (iii) Delete (iv) Modify (v) Display record (vi) Display result
The Delete menu should have the facility of entering the name of a student so that the record of that student is
deleted. The Display record menu should ask for the roll no of a student and display all information. Display result
should display the number of students who have passed. The Modify menu has the facility of modifying a record
given the roll number.
Stacks and Queues
In linked list and arrays, insertions and deletions can be performed at any place of the list. There can be
situations when there is a need of a data structure in which operations are allowed only on the ends of the list
and not in the middle. Stack and Queue are data structures which fulfill these requirements. Stack is a linear list
in which insertions and deletions are allowed only at one end while queue is a linear list in which insertion is
performed on one end and deletion is performed on the other end.
4.1 Stack
Stack is a linear list in which insertions and deletions are allowed only at one end, called top of the stack. We
can see examples of stack in our daily life like stack of trays in a cafeteria, stack of books or stack of tennis
balls. In all these cases we can see that any object can be removed or added only at the top.
(a) Empty stack (b) Push A (c) Push B (d) Push C (e) Push D
(i) Push E
Figure 4.2
We can see that the element which is pushed last is popped first from the stack. In the example of figure 4.2,
D is pushed at last but it was the first one to be popped. The behaviour of stack is like last in first out, so it is
also called LIFO(Last In First Out) data structure.
Before pushing any element we must check whether there is space in the stack or not. If there is not enough
space then stack is said to be in overflow state and the new element can’t be pushed. Similarly before pop
operation if stack is empty and pop operation is attempted, then stack is said to be in underflow state.
Since stack is a linear list, it can be implemented using arrays or linked lists. In the next two sections we will
study these two implementations of stack.
main ()
{
int choice, item;
while(1)
{
		printf("1.Push\n");
		printf("2.Pop\n");
		printf("3.Display the top element\n");
		printf("4.Display all stack elements\n");
		printf("5.Quit\n");
		printf("Enter your choice : ");
		scanf("%d",&choice);
		switch(choice)
		{
			case 1:
printf("Enter the item to be pushed : ");
				scanf("%d",&item);
push (item) ;
break;
case 2:
item = pop();
printf("Popped item is : %d\n",item);
break;
			case 3:
				printf("Item at the top is : %d\n",peek());
break;
case 4:
display ();
break;
case 5:
exit(1);
default:
printf ("Wrong choice\n");
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
int pop()
{
int item;
	if(isEmpty())
{
printf("Stack Underflow\n");
exit(1);
}
item = stack_arr[top];
	top = top-1;
return item;
}/*End of pop()*/
int peek()
{
	if(isEmpty())
	{
		printf("Stack Underflow\n");
		exit(1);
	}
return stack_arr[top] ;
}/*End of peek()*/
int isEmpty()
{
if (top==-1)
return 1;
else
return 0;
}/*End of isEmpty()*/
int isFull()
{
if (top==MAX-1)
return 1;
else
return 0;
}/*End of isFull()*/
void display()
{
	int i;
if(isEmpty() )
	{
printf("Stack is empty\n");
return;
}
printf("Stack elements :\n\n");
	for(i=top; i>=0; i--)
		printf("%d\n",stack_arr[i]);
	printf("\n");
}/*End of display()*/
The function push() pushes an item on the stack, the function pop() pops an item from the stack and
returns the popped item to main(). The function peek() returns top item without removing it from the stack,
so the value of top remains unchanged. The function display () displays all the elements of the stack.
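For reference, a minimal push() consistent with the functions above can be sketched as follows. This is our sketch rather than the book's listing; the globals stack_arr, top and MAX are repeated so the fragment compiles on its own.

```c
#include <stdio.h>

#define MAX 10

int stack_arr[MAX];
int top = -1;

/* Returns 1 if the stack is full, 0 otherwise */
int isFull(void)
{
    return top == MAX - 1;
}

/* Push item on the stack after checking for overflow */
void push(int item)
{
    if (isFull()) {
        printf("Stack Overflow\n");
        return;
    }
    top = top + 1;
    stack_arr[top] = item;
}
```

Note the order of operations: top is incremented first, then the item is stored at the new top position, which is the reverse of what pop() does.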
We will take the beginning of linked list as the top of the stack. For push operation, a node will be inserted in
the beginning of the list. For pop operation, first node of the list will be deleted. If we take the end of the list as
_top of the stack then for each push and pop operation we will have to traverse the whole list. We will take a
pointer top that points to the first node of linked list. This pointer top is same as the pointer start that we had
taken in single linked list.
For pushing an element on stack we follow the procedure of insertion in beginning of the linked list. The
function push() would be similar to the function addatbeg() of single linked list. The stack will overflow
only when there is no space left for dynamic memory allocation and in this case call to function malloc () will
return NULL. So inside the function push (), we will check for this overflow condition.
For pop operation, we will delete the first element of linked list. The underflow condition will arise when
linked list is empty i.e. when top is equal to NULL. So inside the function pop (), we will check for this
underflow condition.
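Following this description, push() and pop() for the linked stack can be sketched as below. This is our sketch, not the verbatim listing; push() mirrors addatbeg() of the single linked list, and the malloc() NULL check implements the overflow test mentioned above.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
} *top = NULL;

/* Push: insert a node at the beginning of the list */
void push(int item)
{
    struct node *tmp = malloc(sizeof(struct node));
    if (tmp == NULL) {              /* overflow: no memory left */
        printf("Stack Overflow\n");
        return;
    }
    tmp->info = item;
    tmp->link = top;
    top = tmp;
}

/* Pop: delete the first node of the list and return its info */
int pop(void)
{
    struct node *tmp;
    int item;
    if (top == NULL) {              /* underflow: stack is empty */
        printf("Stack Underflow\n");
        exit(1);
    }
    tmp = top;
    item = tmp->info;
    top = top->link;
    free(tmp);
    return item;
}
```

Because both operations work only on the first node, push and pop take constant time regardless of how many elements are on the stack.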
main()
{ ;
int choice, item;
while(1)
{
		printf("1.Push\n");
		printf("2.Pop\n");
		printf("3.Display item at the top\n");
		printf("4.Display all items of the stack\n");
		printf("5.Quit\n");
		printf("Enter your choice : ");
		scanf("%d",&choice);
switch(choice)
{
case 1:
				printf("Enter the item to be pushed : ");
				scanf("%d",&item);
push (item) ;
break;
case 2:
item = pop();
printf("Popped item is : %d\n",item);
break;
			case 3:
				printf("Item at the top is : %d\n",peek());
break;
case 4:
display();
break;
case 5:
			case 5:
				exit(1);
			default:
printf ("Wrong choice\n") ;
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
int peek()
{
if (isEmpty() )
{
printf("Stack Underflow\n") ;
exit(1);
}
return top->info;
}/*End of peek()*/
int isEmpty ()
{
if (top==NULL)
return 1;
else
return 0;
}/*isEmpty()*/
void display()
{
struct node *ptr;
	ptr = top;
if (isEmpty() )
{
printf("Stack is empty\n");
return;
}
printf("Stack elements :\n");
while (ptr!=NULL)
{
		printf("%d\n",ptr->info);
ptr = ptr->link;
}
	printf("\n");
}/*End of display()*/
4.2 Queue
Queue is a linear list in which elements can be inserted only at one end, called the rear of the queue, and deleted only
at the other end, called the front of the queue. We can see examples of queue in daily life like a queue of people
waiting at a counter or a queue of cars etc. In the queue of people and queue of cars, the person or car that enters
first in the queue will be out first. The behaviour of queue is first in first out, so it is also called FIFO (First In
First Out) data structure. The following example shows that the new element is inserted at the end called rear
and the deletion is done at the other end called front.
Figure 4.5
In a queue, the insertion operation is known as enqueue and deletion is known as dequeue. If insert
operation is attempted and there is not enough space in the queue, then this situation is called overflow and the
new element can’t be inserted. If queue is empty and delete operation is attempted, then this situation is called
underflow.
Figure 4.6
From the figure 4.6, we can note the following things-
(i) At any time the number of elements in the queue is equal to (rear-front+1), except initially empty queue.
(ii) When front is equal to rear, there is only one element in the queue ( (b),(f),(h),(j) ),except initially empty
queue.
(iii) When front becomes equal to (rear+1), the queue becomes empty ( (g) and (k) ). So we can see that the
queue is empty in two situations, when initially front is equal to -1 or when front becomes equal to
(rear+1). These are the two conditions of queue underflow.
(iv) When rear becomes equal to 4 it can't be incremented further. After case (i), the value of rear becomes 4, so
now it is not possible to insert any element in the queue. Hence we can say that if the size of the array is MAX, then
it is not possible to insert elements after rear becomes equal to MAX-1. This is the condition for queue
overflow.
The function insert () will insert an item in the queue and the function del () will delete an item from the
queue. Inside the function insert (), first we will check the condition of overflow and then insert the element.
Inside the function del(), first we will check the condition for underflow and then delete the element. The
function peek () returns the item at the front of the queue without removing it.
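An insert() following this description can be sketched as below. This is our sketch, not the book's listing; the globals queue_arr, front, rear and MAX are repeated so the fragment compiles on its own, and setting front to 0 on the first insertion marks the queue as non-empty.

```c
#include <stdio.h>

#define MAX 10

int queue_arr[MAX];
int rear = -1;
int front = -1;

/* Returns 1 if no more elements can be inserted */
int isFull(void)
{
    return rear == MAX - 1;
}

/* Insert item at the rear of the queue after checking for overflow */
void insert(int item)
{
    if (isFull()) {
        printf("Queue Overflow\n");
        return;
    }
    if (front == -1)       /* first insertion into an empty queue */
        front = 0;
    rear = rear + 1;
    queue_arr[rear] = item;
}
```

Insertion only ever moves rear forward, which is exactly why the wasted-space problem discussed later arises once rear reaches MAX-1.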
/*P4.3 Program of queue using array*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 10
int queue_arr[MAX];
int rear = -1;
int front = -1;
void insert(int item);
int del();
int peek();
void display();
int isFull();
int isEmpty();
main()
{
	int choice, item;
	while(1)
	{
		printf("1.Insert\n");
		printf("2.Delete\n");
int del ()
{
int item;
if (isEmpty() )
{
		printf("Queue Underflow\n");
		exit(1);
	}
	item = queue_arr[front];
	front = front+1;
	return item;
}/*End of del()*/
int peek()
{
	if(isEmpty())
	{
		printf("Queue Underflow\n");
		exit(1);
	}
	return queue_arr[front];
}/*End of peek()*/
int isEmpty()
{
	if(front==-1 || front==rear+1)
		return 1;
	else
		return 0;
}/*End of isEmpty()*/
int isFull()
{
if (rear==MAX-1)
return 1;
else
		return 0;
}/*End of isFull()*/
void display()
{
int i;
	if(isEmpty())
{
printf("Queue is empty\n");
return;
}
printf("Queue is :\n\n");
	for(i=front; i<=rear; i++)
		printf("%d ",queue_arr[i]);
	printf("\n");
}/*End of display() */
There is a drawback in this array implementation of queue. Consider the situation when rear is at the last
position of the array and front is not at the 0th position.
front=3, rear=7
[   |   |   | 20 | 25 | 30 | 35 | 40 ]
 [0] [1] [2] [3]  [4]  [5]  [6]  [7]
There are 3 spaces for adding the elements but we cannot insert any element in queue because rear is at the
last position of array. One solution to avoid this wastage of space is that we can shift all the elements of the
_ array to the left and adjust the values of front and rear accordingly.
front=0, rear=4
[ 20 | 25 | 30 | 35 | 40 |   |   |   ]
 [0]  [1]  [2]  [3]  [4] [5] [6] [7]
This is practically not a good approach because shifting of elements will consume a lot of time. Another
efficient solution to this problem is a circular queue. We will study about it later in this chapter.
Figure 4.7
/* P4.4 Program of queue using linked list*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
int info;
struct node *link;
}*front = NULL, *rear = NULL;
void insert(int item);
int del();
int peek();
int isEmpty();
void display();
main ()
{
int choice, item;
while(1)
{
		printf("1.Insert\n");
printf ("2.Delete\n") ;
printf("3.Display the element at the front\n");
printf("4.Display all elements of the queue\n");
		printf("5.Quit\n");
		printf("Enter your choice : ");
scanf("%d", &choice);
switch (choice)
{
case 1:
printf("Input the element for adding in queue : ");
scanf ("%d", &item) ;
insert (item) ;
break;
case 2:
printf("Deleted element is %d\n",del());
break;
case 3:
printf("Element at the front of queue is %d\n",peek());
				break;
case 4:
display ();
break;
case 5:
exit(1);
default :
				printf("Wrong choice\n");
		}/*End of switch*/
}/*End of while*/
}/*End of main()*/
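The insert() for this linked queue can be sketched as follows; it is our sketch, not the verbatim listing. A new node is linked after rear, and front is set only when the queue was empty, so that del() and peek() above keep working unchanged.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *link;
} *front = NULL, *rear = NULL;

/* Insert item at the rear of the queue */
void insert(int item)
{
    struct node *tmp = malloc(sizeof(struct node));
    if (tmp == NULL) {          /* overflow: no memory left */
        printf("Queue Overflow\n");
        return;
    }
    tmp->info = item;
    tmp->link = NULL;
    if (front == NULL)          /* queue was empty: new node is also the front */
        front = tmp;
    else
        rear->link = tmp;       /* append after the current rear */
    rear = tmp;
}
```

Keeping both front and rear pointers makes insertion and deletion constant-time operations, with no traversal of the list.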
int del()
{
struct node *tmp;
int item;
	if(isEmpty())
{
printf("Queue Underflow\n");
exit (1);
}
tmp = front;
item = tmp->info;
front = front->link;
free(tmp);
return item;
}/*End of del()*/
int peek()
{
if (isEmpty())
{
printf£("Queue Underflow\n");
exit (1);
}
return front->info;
}/*End of peek()*/
int isEmpty()
{
if (front==NULL)
return 1;
	else
return 0;
}/*End of isEmpty()*/
void display() L s
{
	struct node *ptr;
	ptr = front;
	if(isEmpty())
	{
		printf("Queue is empty\n");
		return;
	}
printf ("Queue elements :\n\n");
	while(ptr!=NULL)
	{
		printf("%d ",ptr->info);
		ptr = ptr->link;
	}
	printf("\n");
}/*End of display()*/
We can implement a queue with a circular linked list also; here we take only one variable rear.
/*P4.5 Program of queue using circular linked list*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
	int info;
struct node *link;
}*rear = NULL;
void insert (int item);
int del();
void display();
int isEmpty ();
int peek();
main()
{
int choice, item;
while(1)
{ °
printf ("1.Insert\n");
printf("2.Delete\n") ;
printf("3.Peek\n") ;
printf ("4.Display\n") ;
		printf("5.Quit\n");
		printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch(choice)
{
case 1:
printf("Enter the element for insertion : ");
scanf ("%d",&item) ;
insert (item) ;
break;
case 2:
printf ("Deleted element is %d\n",del());
break;
case 3:
				printf("Item at the front of queue is %d\n",peek());
break;
case 4:
display ();
break;
case 5:
exit (1);
default:
printf ("Wrong choice\n");
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
void insert(int item)
{
	struct node *tmp;
	tmp = (struct node *)malloc(sizeof(struct node));
	tmp->info = item;
	if(isEmpty())	/*If queue is empty*/
	{
		rear = tmp;
		tmp->link = rear;
	}
	else
	{
		tmp->link = rear->link;
		rear->link = tmp;
		rear = tmp;
	}
}/*End of insert()*/
int del()
{
int item;
struct node *tmp;
if (isEmpty() )
{
printf ("Queue underflow\n") ;
exit (1);
}
	if(rear->link == rear)	/*If only one element*/
	{
		tmp = rear;
		rear = NULL;
	}
	else
	{
		tmp = rear->link;
		rear->link = rear->link->link;
	}
item = tmp->info;
free(tmp) ;
return item;
}/*End of del()*/
int peek()
{
if (isEmpty() )
{
printf ("Queue underflow\n") ; :
exit (1);
}
return rear->link->info;
}/* End of peek() */
int isEmpty()
{
if (rear==NULL)
return 1;
else
return 0;
}/*End of isEmpty()*/
void display()
{
struct node *p;
	if(isEmpty())
{
printf("Queue is empty\n");
return;
}
printf("Queue is :\n");
p = rear->link;
do ‘
{
printf("%d ",p->info) ;
p = p->link;
}while(p!=rear->link);
	printf("\n");
}/*End of display()*/
[Figure 4.8: circular queue with front = 3, rear = 7]
Now after the (n-1)th position, the 0th position occurs. If we want to insert an element, it can be inserted at the 0th
position.
[Figure 4.9: after inserting an element, rear wraps around - front = 3, rear = 0]
The insertion and deletion operations in a circular queue can be performed in a manner similar to that of
queue but we have to take care of two things. If value of rear is MAX-1, then instead of incrementing rear
we will make it zero and then perform insertion. Similarly, when the value of front is MAX-1, it will
not be incremented but will be reset to zero. Let us take an example and see various operations on a circular
queue.
[Figure 4.10: cases (a)-(o) showing successive insertions and deletions in a circular queue of size 5]
(i) As in simple queue, here also if front is equal to rear there is only one element, except initially empty
queue (Cases (b) and (d)).
(ii) The circular queue will be empty in three situations, when initially front is equal to -1, or when front
becomes equal to (rear+1), or when front is equal to 0 and rear is equal to MAX-1 (Cases (a), (e) and the
case when all elements are deleted from case (o)).
(iii) The overflow condition in circular queue has changed. Here overflow will occur only when all the positions
of the array are occupied i.e. when the array is full. The array will be full in two situations, when front is equal to
0 and rear is equal to MAX-1 (Case (n)), or when front is equal to (rear+1) (Case (j)).
From the last two points we can see that the condition front==(rear+1) is true in both cases, when the
queue is empty and when the queue is full. Similarly the condition (front==0 && rear==MAX-1 ) is true in
both cases. We should make some change in our procedure so that we can differentiate between an empty queue
and a full queue.
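One simple way to remove this ambiguity is to keep a separate count of the elements in the queue, so that empty and full can be tested directly. This is not the approach used in the program below, which resets front to -1 instead; the sketch here uses illustrative names and modulo arithmetic for the wraparound.

```c
#include <stdio.h>

#define MAX 5

int cqueue_arr[MAX];
int front = 0, rear = -1;
int count = 0;   /* number of elements currently in the queue */

/* With a count, the empty and full tests are unambiguous */
int isEmpty(void) { return count == 0; }
int isFull(void)  { return count == MAX; }

void insert(int item)
{
	if(isFull())
	{
		printf("Queue Overflow\n");
		return;
	}
	rear = (rear + 1) % MAX;   /* index 0 comes after MAX-1 */
	cqueue_arr[rear] = item;
	count++;
}

int del(void)   /* caller should check isEmpty() first */
{
	int item = cqueue_arr[front];
	front = (front + 1) % MAX;
	count--;
	return item;
}
```

Because count alone decides emptiness and fullness, front and rear never need to be reset to -1.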
When the only element of the queue is deleted, front and rear are reset to -1. We can check for empty
queue just by checking the value of front, if front is -1 then the queue is empty otherwise not. Let us take an
example and see how this is done-
[Figure 4.11: front and rear are reset to -1 when the only element of the queue is deleted]
main ()
{
int choice, item;
while(1)
{
printf("1.Insert\n") ;
printf("2.Delete\n");
printf("3.Peek\n");
printf("4.Display\n") ;
printf("5.Quit\n");
printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch (choice)
{
case 1:
printf("Input the element for insertion : ");
scanf ("%d", &item) ;
insert (item) ;
break;
case 2:
printf("Element deleted is : %d\n",del());
break;
case 3:
printf("Element at the front is : %d\n",peek());
we break;
case 4:
display ();
break;
case 5:
exit(1);
default:
printf ("Wrong choice\n") ;
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
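The insertion operation for this program follows the rules described above: on the first insertion front is set to 0, and when rear is MAX-1 it wraps around to 0. The following is a sketch consistent with the surrounding functions; MAX and cqueue_arr are assumed to be declared as in the other array programs of this chapter.

```c
#include <stdio.h>

#define MAX 5

int cqueue_arr[MAX];
int front = -1, rear = -1;

void insert(int item)
{
	/* Queue is full when front==0 and rear==MAX-1, or front==rear+1 */
	if((front == 0 && rear == MAX-1) || (front == rear+1))
	{
		printf("Queue Overflow\n");
		return;
	}
	if(front == -1)      /* first element: queue was empty */
		front = 0;
	if(rear == MAX-1)    /* wrap rear around to the start of the array */
		rear = 0;
	else
		rear = rear + 1;
	cqueue_arr[rear] = item;
}
```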
int del()
{
	int item;
	if(isEmpty())
{
printf("Queue Underflow\n") ;
exit(1);
}
item = cqueue_arr[front];
	if(front==rear) /*queue has only one element*/
	{
		front = -1;
		rear = -1;
	}
	else if(front==MAX-1)
		front = 0;
	else
		front = front+1;
return item;
}/*End of del()*/
int isEmpty()
{
if (front==-1)
return 1;
else
return 0;
}/*End of isEmpty()*/
int peek()
{
	if(isEmpty())
{
printf ("Queue Underflow\n") ;
exit (1);
}
return cqueue_arr[front];
}/*End of peek()*/
void display()
{
	int i;
	if(isEmpty())
	{
		printf("Queue is empty\n");
		return;
	}
	printf("Queue elements :\n");
	i = front;
if (front<=rear)
{
while (i<=rear)
printf£("%d ",cqueue_arr[i++]);
}
else
{
while (i<=MAX-1)
printf ("%d ",cqueue_arr[it++]);
		i = 0;
while (i<=rear)
printf("%d ",cqueue_arr[it++]);
}
	printf("\n");
}/*End of display() */
4.4 Deque
Deque (pronounced as ‘deck’ or ‘DQ’) is a linear list in which elements can be inserted or deleted at either end
of the list. The term deque is a short form of double ended queue. Like circular queue, we will implement the
deque using a circular array (index 0 comes after n-1). For this we have to take an array deque_arr[] and two
variables front and rear. The four operations that can be performed on deque are-
(i) Insertion at the front end.
(ii) Insertion at the rear end.
(iii) Deletion from the front end.
(iv) Deletion from the rear end.
[Figure 4.12: operations on a deque of size 7 - (a) Empty queue (b) Insert 10, 15, 20 at the rear end (c) Insert 25 at the front end (d) Insert 30, 35 at the front end (e) Delete from the rear end (f) Delete from the front end (g) Delete from the front end (h) Delete from the rear end (i) Delete from the rear end]
Insertion at the rear end and deletion from the front end are performed in a similar way as in the circular queue. To
insert at the front end, the variable front is decreased by 1 and then insertion is performed at the new position
given by front. If the value of front is 0 then instead of decrementing, it is made equal to MAX-1. To delete
from the rear end, the variable rear is decreased by 1. If the value of rear is 0, then it is not decremented but
it is made equal to MAX-1. The example in figure 4.12 shows various operations on a deque.
main ()
{ : :
int choice, item;
while(1)
{
printf("1.Insert at the front end\n");
printf("2.Insert at the rear end\n");
printf("3.Delete from front end\n") ;
printf("4.Delete from rear end\n");
printf("5.Display\n") ;
printf("6.Quit\n");
printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch (choice)
{
case 1:
printf("Input the element for adding in queue : ");
scanf ("%d",&item) ;
insert_frontEnd (item) ;
break;
case 2:
printf("Input the element for adding in queue : ");
scanf ("%d", &item) ;
insert_rearEnd (item) ;
break;
case 3:
printf("Element deleted is : %d\n",delete_frontEnd()) ; Hh
break;
case 4:
printf("Element deleted is : %d\n",delete_rearEnd());
break;
case 5:
display ();
break;
case 6:
exit(1);
default:
printf ("Wrong choice\n") ;
}/*End of switch*/
	printf("front = %d, rear = %d\n",front,rear);
	display();
	}/*End of while*/
}/*End of main()*/
	if(isFull())
	{
		printf("Queue Overflow\n");
		return;
	}
int delete_frontEnd()
{
	int item;
	if(isEmpty())
	{
		printf("Queue Underflow\n");
		exit(1);
	}
	item = deque_arr[front];
	if(front==rear) /*Queue has only one element*/
	{
		front = -1;
		rear = -1;
	}
	else if(front==MAX-1)
		front = 0;
	else
		front = front+1;
	return item;
}/*End of delete_frontEnd()*/
int delete_rearEnd()
{
	int item;
	if(isEmpty())
	{
		printf("Queue Underflow\n");
		exit(1);
	}
	item = deque_arr[rear];
	if(front==rear) /*Queue has only one element*/
	{
		front = -1;
		rear = -1;
	}
	else if(rear==0)
		rear = MAX-1;
	else
		rear = rear-1;
	return item;
}/*End of delete_rearEnd()*/
int isFull()
{
	if((front==0 && rear==MAX-1) || (front==rear+1))
		return 1;
	else
		return 0;
}/*End of isFull()*/
int isEmpty()
{
	if(front == -1)
return 1;
else
return 0;
}/*End of isEmpty() */
void display()
{
	int i;
	if(isEmpty())
{
printf("Queue is empty\n");
return;
}
printf("Queue elements :\n");
	i = front;
if (front<=rear)
while (i<=rear)
printf£("%d ",deque_arr[i++]);
else
{
while (i<=MAX-1)
			printf("%d ",deque_arr[i++]);
		i = 0;
while (i<=rear)
printf("%d ",deque_arr[it++]);
}
	printf("\n");
}/*End of display()*/
The four operations of deque are named as push (insert at front end), pop (delete from front end), inject (insert at
rear end) and eject (delete from rear end). Deque can be of two types-
1. Input restricted deque
2. Output restricted deque
In an input restricted deque, elements can be inserted at only one end but deletion can be performed from both
ends. The functions valid in this case are insert_rearEnd(), delete_frontEnd(), delete_rearEnd().
In an output restricted deque, elements can be inserted at both ends but deletion is allowed only at one end. The
functions valid in this case are insert_frontEnd(), insert_rearEnd(), delete_frontEnd().
4.5 Priority Queue
[Figure 4.13: node of a priority queue with information, priority and link fields]
Here priority number 1 means the highest priority. If the priority number of an element is 2 then it means it has
priority more than the element which has priority number 3. Insertion of an element would be performed in a
similar way as in a sorted linked list. Here we insert the new element on the basis of the priority of the element.
The delete operation will be the deletion of the first element of the list, because it has more priority than the other
elements of the queue.
/*P4.8 Program of priority queue using linked list*/
#include<stdio.h>
#include<stdlib.h>
Stacks and Queues 131
struct node
{
int priority;
int info;
struct node *link;
}*front = NULL;
void insert(int item, int item_priority) ;
int del();
void display() ;
int isEmpty ();
main()
{
	int choice, item, item_priority;
while(1)
{
printf("1.Insert\n") ;
printf("2.Delete\n") ;
printf("3.Display\n") ;
printf ("4.Quit\n");
printf("Enter your choice : ");
scanf("%d", &choice) ;
switch (choice)
{
case 1:
printf("Input the item to be added in the queue : ");
scanf ("%d", &item) ;
printf("Enter its priority : ");
scanf ("%da",&item_priority);
insert (item, item_priority) ;
break;
case 2:
printf("Deleted item is %d\n",del());
break;
case 3:
display ();
break;
case 4:
exit(1);
default:
printf ("Wrong choice\n") ;
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
void insert(int item,int item_priority)
{
	struct node *tmp,*p;
	tmp = (struct node *)malloc(sizeof(struct node));
	if(tmp == NULL)
	{
		printf("Memory not available\n");
		return;
	}
	tmp->info = item;
	tmp->priority = item_priority;
	/*Queue is empty or item to be added has priority more than first element*/
	if(isEmpty() || item_priority < front->priority)
	{
		tmp->link = front;
		front = tmp;
	}
	else
	{
		p = front;
		while(p->link!=NULL && p->link->priority<=item_priority)
			p = p->link;
		tmp->link = p->link;
		p->link = tmp;
	}
}/*End of insert()*/
int del()
{
	struct node *tmp;
	int item;
	if(isEmpty())
	{
		printf("Queue Underflow\n");
		exit(1);
	}
	tmp = front;
	item = tmp->info;
	front = front->link;
	free(tmp);
	return item;
}/*End of del()*/
int isEmpty()
{
if (front==NULL)
	return 1;
else
return 0;
}/*End of isEmpty()*/
void display()
{
	struct node *ptr;
	ptr = front;
	if(isEmpty())
		printf("Queue is empty\n");
	else
	{
		printf("Queue is :\n");
		printf("Priority    Item\n");
while(ptr!=NULL)
{
			printf("%5d\t%5d\n",ptr->priority,ptr->info);
ptr = ptr->link;
		}
	}
}/*End of display() */
The priority queue that we have used is a max-priority queue or descending priority queue. The other priority
queue is the min-priority or ascending priority queue, in which the element with the lowest priority is processed first.
The best implementation of a priority queue is through a heap tree, which we will study in the next chapter.
4.6 Applications of stack
Some of the applications of stack that will be discussed are-
1. Reversal of a string.
2. Checking validity of an expression containing nested parentheses.
3. Function calls.
4. Conversion of infix expression to postfix form.
5. Evaluation of postfix form.
We can reverse a string by pushing each character of the string on the stack. After the whole string is pushed on
the stack, we can start popping the characters from the stack and get the reversed string.
[Figure 4.14: pushing the characters of the string "SPOT" on a stack and popping them off in reverse order]
We have pushed the string “SPOT” on the stack and we get the reversed string “TOPS”.
/*P4.9 Program of reversing a string using stack*/
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#define MAX 20
int top = -1;
char stack[MAX];
char pop(); of ;
void push(char) ; . ,
main()
{
	char str[20];
unsigned int i;
printf("Enter the string : " );
gets (str);
/*Push characters of the string str on the stack*/
for(i=0; i<strlen(str); i++)
push(str[i]);
/*Pop characters from the stack and store in string str*/
	for(i=0; i<strlen(str); i++)
str[i] = pop();
	printf("Reversed string is : ");
	puts(str);
}/*End of main()*/
void push(char item)
{
if (top==(MAX-1) )
{
printf("Stack Overflow\n") ;
		return;
}
	stack[++top] = item;
}/*End of push()*/
char pop()
{
if (top==-1)
{
printf("Stack Underflow\n") ;
exit(1);
	}
	return stack[top--];
}/*End of pop()*/
We can use a stack to check the validity of an expression that uses nested parentheses. An expression will be
valid if it satisfies these two conditions-
1. The total number of left parentheses should be equal to the total number of right parentheses in the
expression.
2. For every right parenthesis there should be a left parenthesis of the same type.
Some valid and invalid expressions are given below-
[A-B* (C+D) Invalid
(1+5} Invalid
{5*(9-2)} Valid
[A+B-(C%D}] Invalid
[A/ (B-C) *D] Valid
The procedure for checking validity of an expression containing nested parentheses is -
1. Initially take an empty stack.
2. Scan the symbols of expression from left to right.
3. If the symbol is a left parenthesis then push it on the stack.
4. If the symbol is right parenthesis
If the stack is empty
Invalid : Right parentheses are more than left parentheses.
else
Pop an element from stack.
If popped parenthesis does not match the parenthesis being scanned
Invalid : Mismatched parentheses.
5. After scanning all the symbols
If stack is empty
Valid : Balanced Parentheses.
else
Invalid : Left parentheses more than right parentheses.
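The above procedure can be written as a function like the one below. The program that follows assumes a function named check(); this sketch handles the three types of parentheses and returns 1 for a valid expression and 0 for an invalid one.

```c
#include <string.h>

#define MAX 50

static char stack[MAX];
static int top = -1;

/* Do a left and a right parenthesis form a matching pair? */
static int match(char left, char right)
{
	return (left == '(' && right == ')') ||
	       (left == '{' && right == '}') ||
	       (left == '[' && right == ']');
}

/* Returns 1 if the parentheses in exp are balanced, 0 otherwise */
int check(char exp[])
{
	int i;
	top = -1;                       /* start with an empty stack */
	for(i = 0; i < (int)strlen(exp); i++)
	{
		if(exp[i] == '(' || exp[i] == '{' || exp[i] == '[')
			stack[++top] = exp[i];  /* push left parenthesis */
		else if(exp[i] == ')' || exp[i] == '}' || exp[i] == ']')
		{
			if(top == -1)           /* more right than left */
				return 0;
			if(!match(stack[top--], exp[i]))
				return 0;           /* mismatched parentheses */
		}
	}
	return (top == -1);             /* valid only if stack is empty */
}
```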
[Figure: stack traces while checking expressions - when the scanned symbol is '}' and the popped symbol is '(' they don't match, so the expression is invalid; if the stack is not empty after all symbols are scanned, the expression is invalid; if the stack is empty at the end, the expression is valid]
main()
{
	char exp[MAX];
	int valid;
	printf("Enter an algebraic expression : ");
	gets(exp);
	valid = check(exp);
	if(valid==1)
		printf("Valid expression\n");
	else
		printf("Invalid expression\n");
}/*End of main()*/
void push(char item)
{
	if(top==(MAX-1))
	{
		printf("Stack Overflow\n");
		return;
	}
	top = top+1;
	stack[top] = item;
}/*End of push()*/
char pop()
{
	if(top==-1)
	{
		printf("Stack Underflow\n");
		exit(1);
	}
	return(stack[top--]);
}/*End of pop()*/
[Figure 4.15: activation records on the stack - main() calls f1(), f1() calls f2(), f2() calls f3(); the AR of each function is pushed when the function is called, and popped when it completes, in the reverse order of the calls]
So '^' has the highest priority and '+', '-' have the lowest priority. If two operators have the same priority (like '*' and
'/') then the expression is scanned from left to right and whichever operator comes first will be evaluated first.
This property is called associativity and we've assumed left to right associativity for all our operators.
Now if we apply these precedence rules to our expression 9+6/3, then we see that division should be
performed before addition. Let us take an arithmetic expression and see how it can be evaluated using our
precedence rules.
A + B*C - D*E/F - G/H - I
The numbers indicate the sequence of evaluation of operators. If we want to override the precedence rules
then we can use parentheses. Anything that is between parentheses will be evaluated first. For example if in the
above expression we want to evaluate H-I before G/H then we can enclose H-I within parentheses.
A + B*C - D*E/F - G/(H-I)
Here we can see that the expression inside the parentheses is evaluated first and after that, evaluation is on the
basis of operator precedence.
This is how we can evaluate arithmetic expressions manually. We have to know the precedence rules and
take care of parentheses. The expression has to be scanned many times and every time we have to reach
different place in expression for evaluation.
Now let us see how a computer can evaluate these expressions. If we use the same procedure then there will be
repeated scanning from left to right, which is inefficient. It would be nice if we could transform the above
expression into some form which does not have parentheses and in which all operators are arranged in proper order
according to their precedence. In that case we could evaluate the expression by just scanning the new
transformed expression once from left to right. The Polish notations are used for this purpose, so let us see what
these notations are.
The great Polish mathematician Jan Lukasiewicz had given a technique for representation of arithmetic
expressions in the early 1920s. According to the two notations that he gave, the operator can be placed before or
after the operands. Although these notations were not very readable for humans, they proved very useful for
compiler designers in generating machine language code for evaluating arithmetic expressions.
The conventional method of representing an arithmetic expression is known as infix because the operator is
placed in between the operands. If the operator is placed before the operands then this notation is called prefix
or Polish notation. If the operator is placed after the operands then this notation is called postfix or reverse Polish
notation. So now we have three ways to represent an arithmetic expression-
Infix A+B
Prefix +AB
Postfix AB+
Let us see how we can convert an infix expression into its equivalent postfix expression. The rules of
parentheses and precedence would remain the same. If there are parentheses in the expression, then the portion
inside the parentheses will be converted first. After this the operations are converted according to their
precedence. After we've converted a particular operation to its postfix, it will be considered as a single operand.
We will use square brackets to represent these types of intermediate operands. If there are operators of equal
precedence then the operator which is on the left is converted first. Let us take an example-
A-B*C+D*E/ (F+G)
A-B*C+D*E/[FG+]        Convert F+G to FG+
A-[BC*]+D*E/[FG+]      Convert B*C to BC*
A-[BC*]+[DE*]/[FG+]    Convert D*E to DE*
A-[BC*]+[DE*FG+/]      Convert [DE*]/[FG+] to DE*FG+/
[ABC*-]+[DE*FG+/]      Convert A-[BC*] to ABC*-
ABC*-DE*FG+/+          Convert [ABC*-]+[DE*FG+/] to ABC*-DE*FG+/+
The conversion from infix to prefix also uses the same rules, but here the operator is placed before the
operands. The following examples show conversion of infix form to prefix form.
We can see that the prefix and postfix forms of any expression are not mirror image of each other. In all the
three forms the relative positions of the operands remain the same.
In the prefix and postfix forms, parentheses are not needed and the operators and operands are arranged in
proper order according to their precedence levels. Now let us take an infix expression and covert it to postfix
and then see how this postfix expression can be evaluated. For simplicity we will take numerical constants of
single digit only.
3+5*(7-4)^2
3+5*[74-]^2
3+5*[74-2^]
3+[574-2^*]
3574-2^*+
To evaluate this postfix expression, it is scanned from left to right and as soon as we get an operator we
apply it to the last two operands.
3574-2^*+    Apply '-' to 7 and 4
35[3]2^*+    Apply '^' to 3 and 2
35[9]*+      Apply '*' to 5 and 9
3[45]+       Apply '+' to 3 and 45
48
So we can see that in a postfix expression, the operators are evaluated in the same sequence as they appear in
the expression. While evaluating a postfix expression we are not concerned about any precedence rules or
parentheses.
Now let us see how we can write a program for the two step process of converting the infix form to postfix
form and evaluating the postfix expression. The stack data structure proves helpful here and is used in both the
steps.
[Table: stepwise conversion of the infix expression to postfix, showing the scanned symbol, the contents of the stack and the postfix string after each step]
* In step(2) the operator ‘+’ is simply pushed on the stack, since the stack is empty.
* In step(4), first we will compare the precedence of '/' with that of the top operator of stack which is '+'. Since the
precedence of '+' is less than that of '/', we will not pop any operator and simply push '/' on the stack.
* In step(6), we will compare the precedence of '*' with that of the top operator of stack which is '/'. Since the
precedence of '*' is equal to that of '/', operator '/' is popped and added to postfix. Now the top operator of stack is
'+'. Now we will compare the precedence of '*' with that of '+'. Since the precedence of '+' is less than that of '*',
now operator '*' is pushed on the stack.
* In step(9), '(' is simply pushed on the stack, since we have assumed that the precedence of '(' is least.
* In step (11), we have to pop all operators till left parentheses, so ‘+’ is popped and added to postfix.
* In step(12), precedence of ‘*’ is greater than that of ‘-’, so ‘*’ is popped and added to postfix. Now top operator on
stack is ‘+’ and its precedence is equal to that of ‘-’, so it is also popped and added to postfix. After this ‘-’ is pushed
on the stack.
* In step(14), all the symbols of the expression have been scanned, so now we have to pop all the operators from the
stack and add to postfix. The only operator remaining on stack is ‘-’, so it is popped and added to postfix.
Let us work out a few more examples to fully comprehend the process.
Infix : A-B/C*D*E+F/G
[Tables: stepwise conversion of two more infix expressions to postfix, showing the scanned symbol, the stack contents and the postfix string at each step]
1. Scan the symbols of array postfix one by one from left to right.
(a) If symbol is operand
Push it on the stack
(b) If symbol is operator
Pop two elements from stack and apply the operator to these two elements
Suppose first A is popped and then B is popped
result = B operator A
Push result on the stack
2. After all the symbols of postfix have been scanned, pop the only element left in the stack and it is the value of
postfix arithmetic expression.
Let us take a postfix expression and evaluate it-
Infix : 7+5*3^2/(9-2^2)+6*4
Postfix : 7532^*922^-/+64*+
[Table: stepwise evaluation of the postfix expression, showing the stack contents after each symbol is scanned]
After evaluation of the postfix expression its value is 40. Let us take the same postfix expression in infix form
and evaluate it.
7+5*3^2/(9-2^2)+6*4
7+5*3^2/(9-4)+6*4
7+5*3^2/5+6*4
7+5*9/5+6*4
7+45/5+6*4
7+9+6*4
7+9+24
16+24
40
We can see the same result, but in postfix evaluation we have no need to take care of parentheses and the
sequence of evaluation.
/* P4.11 Program for conversion of infix to postfix and evaluation of: postfix.
It will evaluate only single digit numbers*/
#include<stdio.h>
#include<string.h>
#include<math.h>
#include<stdlib.h>
#define BLANK ' '
#define TAB '\t'
#define MAX 50
void push(long int symbol);
long int pop();
void infix_to_postfix();
long int eval_post();
int priority(char symbol);
int isEmpty ();
int white_space();
char infix[MAX],postfix[MAX];
long int stack[MAX];
int top;
main()
{
long int value;
	top = -1;
	printf("Enter infix expression : ");
	gets(infix);
	infix_to_postfix();
	printf("Postfix : %s\n",postfix);
	value = eval_post();
	printf("Value of postfix expression : %ld\n",value);
}/*End of main()*/
void infix_to_postfix()
{
unsigned int i,p = 0;
char next;
char symbol;
for(i=0; i<strlen(infix); i++)
{
symbol = infix[i];
if (!white_space (symbol) )
{
switch (symbol)
{
case '(':
push(symbol) ;
break;
case ')':
while( (next=pop())!='(' )
postfix[p++] = next;
break;
case '+':
case '-':
case '*':
case '/':
case '%':
case '^':
while(!isEmpty() && priority(stack[top] )>=priority(symbol) )
		postfix[p++] = pop();
push(symbol) ;
break;
default: /*if an operand comes*/
	postfix[p++] = symbol;
}
}
}
while(!isEmpty() )
postfix[p++] = pop();
postfix[p] = '\0'; /*End postfix with '\0' to make it a string*/
}/*End of infix_to_postfix()*/
int isEmpty()
{
	if(top==-1)
return 1;
else
return 0;
}/*End of isEmpty()*/
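The helper functions priority() and white_space() used by infix_to_postfix() can be written as follows. The numeric priority values are one consistent choice (not necessarily the original listing's values), with '(' given the least priority as assumed in the discussion above; BLANK and TAB are the macros defined at the top of the program.

```c
#define BLANK ' '
#define TAB '\t'

int white_space(char symbol)
{
	return (symbol == BLANK || symbol == TAB);
}

/* Higher number means higher precedence; '(' is lowest so that
   it is never popped by an ordinary operator */
int priority(char symbol)
{
	switch(symbol)
	{
	case '(':           return 0;
	case '+': case '-': return 1;
	case '*': case '/':
	case '%':           return 2;
	case '^':           return 3;
	default:            return 0;
	}
}
```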
In the above discussion we have taken all our operators as left associative for simplicity, i.e. when two
operators have the same precedence then first we evaluated the operator which was on the left. But the
exponentiation operation is generally right associative. The value of the expression 2^2^3 should be 2^8=256 and not
4^3=64. So A^B^C should be converted to ABC^^, while we are converting it to AB^C^. So let us see what
change we can make in our procedure if we have any operator which is right associative. We will define two
priorities in this case-
Operator        In-stack Priority    Incoming Priority
^                      3                    4
*, /, %                2                    2
+, -                   1                    1
For left associative operators both priorities will be same but for right associative operators incoming priority
should be more than the in-stack priority. In our program we will make two functions instead of function
priority(). Now whenever an operator comes while scanning infix, then pop the operators which have in-
stack priority greater than or equal to the incoming priority of the symbol operator.
while(!isEmpty() && instack_pr(stack[top])>=incoming_pr(symbol))
postfix[p++] = pop();
Here instack_pr() and incoming_pr() are functions that return the in-stack and incoming priorities of
operators.
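With the priorities from the table above, the two functions can be sketched as:

```c
/* In-stack priority: what an operator is worth once it is on the stack */
int instack_pr(char symbol)
{
	switch(symbol)
	{
	case '^':           return 3;
	case '*': case '/':
	case '%':           return 2;
	case '+': case '-': return 1;
	default:            return 0;   /* '(' and anything else */
	}
}

/* Incoming priority: for '^' it is higher than the in-stack priority,
   which is what makes '^' right associative */
int incoming_pr(char symbol)
{
	switch(symbol)
	{
	case '^':           return 4;
	case '*': case '/':
	case '%':           return 2;
	case '+': case '-': return 1;
	default:            return 0;
	}
}
```

With these values, an incoming '^' is never popped against a '^' already on the stack, so A^B^C is converted to ABC^^.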
Exercise
Recursion is a process in which a problem is defined in terms of itself. The problem is solved by repeatedly
breaking it into smaller problems, which are similar in nature to the original problem. The smaller problems are
solved and their solutions are applied to get the final solution of the original problem. To implement recursion
in programming, a function should be capable of calling itself. A recursive function is a function that
calls itself.
main()
{
	.............
	rec();
	.............
}/*End of main()*/
void rec()
{
	.............
	rec();
	.............
}/*End of rec()*/
Here the function rec() is calling itself inside its own function body, so rec() is a recursive function.
When main() calls rec(), the code of rec() will be executed and since there is a call to rec() inside rec(),
again rec() will be executed. It seems that this process will go on infinitely but in practice, a terminating
condition is written inside the recursive function which ends this recursion. This terminating condition is also
known as exit condition or the base case. This is the case when function will stop calling itself and will finally
start returning.
Recursion proceeds by repeatedly breaking a problem into smaller versions of the same problem, till finally
we get the smallest version of the problem which is simple enough to solve. The smallest version of problem
can be solved without recursion and this is actually the base case.
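For example, the following small function counts down from n and stops at the base case n==0. Here the numbers are collected into a buffer with sprintf only so that the result can be inspected easily; printf would serve equally well.

```c
#include <stdio.h>

/* Writes "n n-1 ... 1 " into buf and returns the position in buf
   just after the last character written */
char *countdown(int n, char *buf)
{
	if(n == 0)                      /* base case: stop the recursion */
		return buf;
	buf += sprintf(buf, "%d ", n);
	return countdown(n-1, buf);     /* smaller version of the same problem */
}
```

Calling countdown(3, buf) leaves "3 2 1 " in buf; without the n==0 test the function would call itself forever.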
[Figure 5.2: tracing the calls of a recursive function - main() calls func(5), func(5) calls func(3), func(3) calls func(1), and each instance returns to the one that called it]
So we can see that the recursive functions are called in a manner similar to that of regular functions, but here
the same function is called each time. When execution of an instance of recursive function is finished, we return
to the previous instance where we had left it.
We know each function has some local variables that exist only inside that function. The local variables of
the called function are active while the local variables of the caller function are kept on hold or suspended. The
same case occurs in recursion but here the caller function and called function are copies of the same function.
When a function is called recursively, then for each instance a new set of formal parameters and local
variables(except static) is created. Their names are same as declared in function but their memory locations are
different and they contain different values. These values are remembered by the compiler till the end of function
call, so that these values are available while returning. In the example of figure 5.2, we saw that there were three
instances of func (), but each instance had its own copy of formal parameter n.
The number of times that a function calls itself is known as the recursive depth of that function. In the
example of figure 5.2, the depth of recursion is 3.
n! = 1             if n = 0
n! = n * (n-1)!    if n > 0
Now we will write a program, which finds out the factorial using a recursive function.
/*P5.1 Program to find the factorial of a number by recursive method*/
#include<stdio.h>
long int fact(int n);
main()
{
	int num;
	printf("Enter a number : ");
	scanf("%d",&num);
	printf("Factorial of %d is %ld\n",num,fact(num));
}/*End of main()*/
long int fact(int n)
{
	if( n==0 )
		return 1;
	return (n * fact(n-1));
}/*End of fact()*/
argument of n-1. This process of calling fact () continues till it is called with an argument of 0. Suppose we
want to find out the factorial of 3.
Initially main() calls fact(3)
Since 3>0, fact(3) calls fact(2)        Winding phase
Since 2>0, fact(2) calls fact(1)
Since 1>0, fact(1) calls fact(0)
First time when fact () is called, its argument is the actual argument given in the main() which is 3. So in
the first invocation of fact () the value of n is 3. Inside this first invocation, there is a call to fact () with
argument n-1, so now fact () is invoked for the second time and this time the argument is 2. Now the second
invocation calls third invocation of fact () and this time argument is 1. The third invocation of fact () calls
the fourth invocation with an argument of 0.
When fact () is called with n=0, the condition inside if statement becomes true, i.e. we have reached the
base case, so now the recursion stops and the statement return 1 is executed. The winding phase terminates
here and the unwinding phase begins and control starts returning towards the original call.
Now every invocation of fact () will return a value to the previous invocation of fact (). These values are —
returned in the reverse order of function calls:
Value returned by fact(0) to fact(1) = 1
Value returned by fact(1) to fact(2) = 1 * fact(0) = 1*1 = 1        Unwinding phase
Value returned by fact(2) to fact(3) = 2 * fact(1) = 2*1 = 2
Value returned by fact(3) to main()  = 3 * fact(2) = 3*2 = 6
The function fact() is called 4 times, each function call is different from another and the argument
supplied is different each time.
In the program we have taken long int as the return value of fact (), since the value of factorial for even
small value of n exceeds the range of int data type.
return (n + summation(n-1));
}/*End of summation() */
In the previous two examples we had found out the product and sum of numbers from 1 to n, now we will write
a recursive function to display these numbers. This function has to only display the numbers so it will not return
any value and will be of type void.
We have written two functions display1() and display2 (), these functions are traced in figures 5.4 and
5.5. Let us see which one gives us the desired output.
void displayl(int n)
{
	if(n==0)
return;
printf("%d ",n);
displayl(n-1);
}/*End of displayl1()*/
void display2(int n)
{
if (n==0)
return;
display2(n-1);
	printf("%d ",n);
}/*End of display2()*/
The function display1() is traced in figure 5.4. Initially the printf() inside the first invocation will be
executed and in that invocation the value of n is 3, so first 3 is displayed. After this the printf() inside the second
invocation is executed and 2 is displayed, and then the printf() inside the third invocation is executed and 1 is
displayed. In the fourth invocation the value of n becomes 0, we have reached the base case and the recursion is
stopped. So this function displays the values from n to 1, but we wanted to display the values from 1 to n.
[Figure 5.4: Trace of the recursive calls of display1(3)]
154 Data Structures Through C in Depth
[Figure 5.5: Trace of the recursive calls of display2(3)]
In display2(), the printf() is after the recursive call, so the printf() calls are not executed in the winding phase; they are executed in the unwinding phase. In the unwinding phase, the calls return in reverse order, so first the printf() of the third invocation is executed, and in this invocation the value of n is 1, so 1 is displayed. After this the printf() of the second invocation is executed, and in this invocation the value of n is 2, so 2 is displayed. At last the printf() of the first invocation is executed and 3 is displayed.
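To compare the two orders side by side, here is a self-contained variant that records the output into a buffer instead of printing (the names display1r and display2r are ours, the logic is the same as above):

```c
#include <stdio.h>
#include <string.h>

static char out[64];   /* collects what would have been printed */

/* records n, n-1, ..., 1 : the value is recorded before the recursive call */
void display1r(int n)
{
    if (n == 0)
        return;
    sprintf(out + strlen(out), "%d ", n);
    display1r(n - 1);
}

/* records 1, 2, ..., n : the value is recorded after the recursive call */
void display2r(int n)
{
    if (n == 0)
        return;
    display2r(n - 1);
    sprintf(out + strlen(out), "%d ", n);
}
```

Calling display1r(3) leaves "3 2 1 " in the buffer, while display2r(3) leaves "1 2 3 ".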
5.4.4 Display and Summation of series
Let us take a simple series of numbers from 1 to n. We want to write a recursive function that displays this series as well as finds the sum of this series. For example, if the number of terms is 5, then our function should give this output:
1 + 2 + 3 + 4 + 5 = 15
We have already seen how to display and find the sum of numbers from 1 to n so let us combine that logic to
write a function for our series.
/*P5.3 Program to display and find out the sum of series
  1 + 2 + 3 + ...... + n */
#include<stdio.h>
int rseries(int n);
main()
{
	int n;
	printf("Enter number of terms : ");
	scanf("%d", &n);
	printf("= %d\n", rseries(n));
}/*End of main()*/
Let us find out whether the function rseries() gives us the desired output or not. This function will return the sum of the series but will not display any term of the series. This is because in the unwinding phase, when control returns to the previous invocation, the return statement (return n + rseries(n-1)) is executed, and so the function returns without executing the printf(). To make the function work correctly we can write it like this -
int rseries(int n)
{
	int sum;
	if(n == 0)
		return 0;
	sum = (n + rseries(n-1));
	printf("%d ", n);
	return sum;
}/*End of rseries()*/
The printf () is before the return statement and after the recursive call, so it will always be executed in the
unwinding phase.
void Rdisplay(int n)
{
	if(n/10 == 0)
	{
		printf("%d",n);
		return;
	}
	printf("%d",n%10);
	Rdisplay(n/10);
}/*End of Rdisplay()*/
	if(rem < 10)
		printf("%d",rem);
	else
		printf("%c",rem-10+'A');
}/*End of convert()*/
For simplicity we have checked divisibility by all numbers starting from 2 (prime and non prime), but we will get only prime factors. For example, 6 is a non prime factor of 84, but by the time 6 is tried it has already been removed as a factor of 2 and a factor of 3.
Here nothing is to be done in the unwinding phase. The value returned by the last recursive call just becomes
the return value of previous recursive calls, and finally it becomes the return value of the first recursive call.
	int nterms, i;
	printf("Enter number of terms : ");
	scanf("%d", &nterms);
	for(i=0; i<nterms; i++)
		printf("%d ", fib(i));
	printf("\n");
}/*End of main()*/
int fib(int n)
{
	if(n==0 || n==1)
		return 1;
	return( fib(n-1) + fib(n-2) );
}/*End of fib()*/
Here the function fib() is called twice inside its own function body. The following figure shows the recursive calls of fib() when it is called with argument 5.
[Figure 5.6: Recursion tree of fib(5)]
This implementation of Fibonacci is very inefficient because it performs the same computations repeatedly; for example, in the above example it computes the value of fib(2) three times.
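A standard remedy, not used in the listing above, is to cache each result so that every fib(k) is computed only once (memoization). A sketch, with fib_init() and the memo array being our own additions:

```c
/* memo[k] holds fib(k) once computed, -1 while still unknown */
static long memo[50];

void fib_init(void)
{
    int i;
    for (i = 0; i < 50; i++)
        memo[i] = -1;
}

long fibm(int n)
{
    if (n == 0 || n == 1)
        return 1;                 /* same base case as fib() above */
    if (memo[n] != -1)
        return memo[n];           /* answer already cached */
    memo[n] = fibm(n - 1) + fibm(n - 2);
    return memo[n];
}
```

With the cache, the number of recursive calls grows linearly in n instead of exponentially.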
int divisibleBy9(long int n)
{
	int sumOfDigits;
	if(n==0 || n==9)
		return 1;
	if(n < 9)
		return 0;
	sumOfDigits = 0;
	while(n > 0)
	{
		sumOfDigits += n%10;
		n /= 10;
	}
	return divisibleBy9(sumOfDigits);
}/*End of divisibleBy9()*/
This function returns 1 if the number is divisible by 9, otherwise it returns 0.
Now let us make a function that checks divisibility by 11. A number is divisible by 11 if and only if the difference of the sums of digits at odd positions and even positions is either zero or divisible by 11.
test(91628153) -> [28 (9+6+8+5) - 7 (1+2+1+3)] = 21
test(21) -> [2 - 1] = 1
test(1) -> Not divisible by 11
The function will be recursively called for the difference of the sums of digits in odd and even positions. The recursion stops when the number reduces to one digit. If that digit is 0, then the number is divisible by 11, otherwise it is not divisible by 11.
int divisibleBy11(long int n)
{
	int s1=0, s2=0, diff;
	if(n == 0)
		return 1;
	if(n < 10)
		return 0;
	while(n > 0)
	{
		s1 += n%10;
		n /= 10;
		s2 += n%10;
		n /= 10;
	}
	diff = s1>s2 ? (s1-s2) : (s2-s1);
	return divisibleBy11(diff);
}/*End of divisibleBy11()*/
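Putting the two digit-based checks together (bodies as reconstructed above, using long parameters):

```c
/* returns 1 if n is divisible by 9, else 0 */
int divisibleBy9(long n)
{
    long sumOfDigits = 0;
    if (n == 0 || n == 9)
        return 1;
    if (n < 9)
        return 0;
    while (n > 0) {
        sumOfDigits += n % 10;   /* accumulate the digits of n */
        n /= 10;
    }
    return divisibleBy9(sumOfDigits);
}

/* returns 1 if n is divisible by 11, else 0 */
int divisibleBy11(long n)
{
    long s1 = 0, s2 = 0, diff;
    if (n == 0)
        return 1;
    if (n < 10)
        return 0;                /* single nonzero digit: not divisible */
    while (n > 0) {
        s1 += n % 10; n /= 10;   /* digits at odd positions from the right */
        s2 += n % 10; n /= 10;   /* digits at even positions */
    }
    diff = s1 > s2 ? s1 - s2 : s2 - s1;
    return divisibleBy11(diff);  /* recurse on the difference */
}
```

For example, 91628153 reduces to 21, then to 1, so it is not divisible by 11, while 121 reduces to 0 and is divisible.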
1. We can move only one disk from one pillar to another at a time.
2. A larger disk cannot be placed on a smaller disk.
Suppose the number of disks on pillar A is n. First we will solve the problem for n=1, n=2, n=3 and then we
will develop a general procedure for the solution. Here A is the source pillar, C is the destination pillar and B is
the temporary pillar.
For n=3
[Diagram: the pillars A, B, C after each of the seven moves]
(i) Move disk 1 from pillar A to C (A->C)
(ii) Move disk 2 from pillar A to B (A->B)
(iii) Move disk 1 from pillar C to B (C->B)
(iv) Move disk 3 from pillar A to C (A->C)
(v) Move disk 1 from pillar B to A (B->A)
(vi) Move disk 2 from pillar B to C (B->C)
(vii) Move disk 1 from pillar A to C (A->C)
These were the solutions for n=1, n=2, n=3. From these solutions we can observe that first we move n-1 disks from the source pillar (A) to the temporary pillar (B), and then move the largest (nth) disk to the destination pillar (C). So the general solution for n disks can be written as-
1. Move n-1 disks from the source pillar to the temporary pillar.
2. Move the nth disk from the source pillar to the destination pillar.
3. Move n-1 disks from the temporary pillar to the destination pillar.
The base case is when we have to move only one disk; in that case we can simply move the disk from the source to the destination pillar and return. The recursion tree for n=5 is given below.
[Figure: Recursion tree of tofh(5, A, B, C); each call tofh(1, ...) produces a single move, e.g. tofh(1, A, B, C) ---- A->C]
The recursive function for solving Tower of Hanoi can be defined as-
void tofh(int ndisk, char source, char temp, char dest)
{
	if(ndisk == 1)
	{
		printf("Move Disk %d from %c-->%c\n", ndisk, source, dest);
		return;
	}
	tofh(ndisk-1, source, dest, temp);
	printf("Move Disk %d from %c-->%c\n", ndisk, source, dest);
	tofh(ndisk-1, temp, source, dest);
}/*End of tofh()*/
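The same recursive structure can be used to count the moves instead of printing them, confirming that n disks need 2^n - 1 moves (countMoves is our illustrative name, not from the book):

```c
/* same shape as tofh(), but counts moves instead of printing them */
long countMoves(int ndisk)
{
    if (ndisk == 1)
        return 1;                 /* one move for the smallest disk */
    /* n-1 disks to temp, the largest disk, then n-1 disks to dest */
    return countMoves(ndisk - 1) + 1 + countMoves(ndisk - 1);
}
```

For n = 3 this gives 7, matching the seven moves listed earlier.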
5.5 Recursive Data Structures
A recursive definition of a data structure defines the data structure in terms of itself. Some data structures like
strings, linked lists, trees can be defined recursively. In these types of data structures we can take advantage of
the recursive structure, and so the operations on these data structures can be naturally implemented using
recursive functions. This type of recursion is called structural recursion.
A string can be defined recursively as-
1. A string may be an empty string.
2. A string may be a character followed by a smaller string (one character less).
The string "leaf" is the character 'l' followed by the string "eaf"; similarly the string "eaf" is the character 'e' followed by the string "af", the string "af" is the character 'a' followed by the string "f", and the string "f" is the character 'f' followed by the empty string. So while defining recursive operations on strings, the case of the empty string will serve as the base case for terminating the recursion.
Similarly we can define a linked list recursively as-
(i) A linked list may be an empty linked list.
(ii) A linked list may be a node followed by a smaller linked list (one node less).
If we have a linked list containing nodes N1, N2, N3, N4 then we can say that
linked list (N1-> N2-> N3-> N4) is node N1 followed by linked list (N2-> N3-> N4),
linked list (N2-> N3-> N4) is node N2 followed by linked list (N3-> N4),
linked list (N3-> N4) is node N3 followed by linked list (N4),
linked list (N4) is node N4 followed by an empty linked list.
While implementing operations on linked lists we can take the case of the empty linked list as the base case for stopping the recursion.
In this chapter we will study some recursive procedures to operate on strings and linked lists. In the next chapter we will study another data structure called the tree, which can also be defined recursively. Some operations on trees are best defined recursively, so before going to that it is better if you understand recursion on linked lists.
We have seen the recursive definition of strings and the base case. Now we will write some recursive functions for operations on strings. The first one is to find the length of a string.
The length of the empty string is 0 and this will be our terminating condition. In the general case, the function is recursively called for a smaller string, which does not contain the first character of the string.
int length(char *str)
{
	if(*str == '\0')
		return 0;
	return (1 + length(str+1));
}/*End of length()*/
We can print a string by printing the first character of the string followed by printing of the smaller string; if the string is empty there is nothing to be printed, so we will do nothing and return (base case).
void display(char *str)
{
	if(*str == '\0')
		return;
	putchar(*str);
	display(str+1);
}/*End of display()*/
By changing the order of the printing statement and the recursive call we can get the function for displaying the string in reverse order.
void Rdisplay(char *str)
{
	if(*str == '\0')
		return;
	Rdisplay(str+1);
	putchar(*str);
}/*End of Rdisplay()*/
Note that for displaying string in standard order we have placed the printing statement before the recursive
call while during the display of numbers from 1 to n (Section 5.4.3), we had placed the printing statement after
the recursive call. We hope that you can understand the reason for this and if not then trace and find out.
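The linked list functions that follow assume the usual singly linked node declared in the linked list chapter, along these lines (the push() helper is ours, added only for building example lists):

```c
#include <stdlib.h>

struct node
{
    int info;
    struct node *link;
};

/* helper for examples: insert a value at the front of the list */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(struct node));
    n->info = value;
    n->link = head;
    return n;
}
```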
Similarly we can make a function to find out the sum of the elements of a linked list.
int sum(struct node *ptr)
{
if (ptr==NULL)
return 0;
return ptr->info + sum(ptr->link);
}/*End of sum() */
Next we will make a function for printing the elements of a linked list. We can print a list by printing the first element of the list followed by printing of the smaller list; if the list is empty there is nothing to be printed, so we will do nothing and return.
void display(struct node *ptr)
{
	if(ptr == NULL)
		return;
	printf("%d ", ptr->info);
	display(ptr->link);
}/*End of display()*/
We are just walking down the list till we reach NULL, and printing the info part in the winding phase. This function will be invoked as -
display(start);
Next we will make a function to print the list in reverse order, i.e. it will print all the elements starting from
the last element.
void Rdisplay(struct node *ptr)
{
	if(ptr == NULL)
		return;
	Rdisplay(ptr->link);
	printf("%d ", ptr->info);
}/*End of Rdisplay()*/
The next function searches for an element in the list; if it is present it returns 1, otherwise it returns 0.
int search(struct node *ptr, int item)
{
	if(ptr == NULL)
		return 0;
	if(ptr->info == item)
		return 1;
	return search(ptr->link, item);
}/*End of search()*/
The next function inserts a node at the end of the linked list.
struct node *insertLast(struct node *ptr, int item)
{
	struct node *temp;
	if(ptr == NULL)
	{
		temp = malloc(sizeof(struct node));
		temp->info = item;
		temp->link = NULL;
		return temp;
	}
	ptr->link = insertLast(ptr->link, item);
	return ptr;
}/*End of insertLast()*/
The next function deletes the last node of the list.
struct node *deleteLast(struct node *ptr)
{
	if(ptr->link == NULL)
	{
		free(ptr);
		return NULL;
	}
	ptr->link = deleteLast(ptr->link);
	return ptr;
}/*End of deleteLast()*/
The next function reverses the list and returns a pointer to the new first node.
struct node *reverse(struct node *ptr)
{
	struct node *temp;
	if(ptr->link == NULL)
		return ptr;
	temp = reverse(ptr->link);
	ptr->link->link = ptr;
	ptr->link = NULL;
	return temp;
}/*End of reverse()*/
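A quick self-contained check of the recursive reversal (the node type is repeated and mknode() is our helper for building the example list):

```c
#include <stdlib.h>

struct node { int info; struct node *link; };

struct node *reverse(struct node *ptr)
{
    struct node *temp;
    if (ptr->link == NULL)      /* last node becomes the new head */
        return ptr;
    temp = reverse(ptr->link);  /* reverse the rest of the list */
    ptr->link->link = ptr;      /* make the next node point back to us */
    ptr->link = NULL;           /* we become the (temporary) tail */
    return temp;
}

/* helper for the example: allocate one node */
struct node *mknode(int v, struct node *next)
{
    struct node *n = malloc(sizeof(struct node));
    n->info = v;
    n->link = next;
    return n;
}
```

Reversing the list 1 -> 2 -> 3 yields 3 -> 2 -> 1, with the old head as the new tail.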
[Figure 5.8: Activation records on the stack. main() called: push AR of main(); main() calls fact(3): push AR of fact(3); fact(3) calls fact(2): push AR of fact(2); fact(2) calls fact(1): push AR of fact(1). In the unwinding phase the records are popped in reverse order.]
In the winding phase, the stack grows as new activation records are created and pushed for each invocation
of the function. In the unwinding phase, the activation records are popped from the stack in LIFO manner till
the original call returns.
Recursive solutions involve more execution overhead than their iterative counterparts, but their main advantage is that they simplify the code and make it more compact and elegant. Recursive algorithms are easier to understand because the code is shorter and clearer.
Recursion should be used when the underlying problem is recursive in nature or when the data structure on
which we are operating is recursively defined like trees. Iteration should be used when the problem is not
inherently recursive, or the stack space is limited.
For some problems which are very complex, iterative algorithms are harder to implement and it is easier to
solve them recursively. In these cases, recursion offers a better way of writing our code which is both logical
and easier to understand and maintain. So sometimes it may be worth sacrificing efficiency for code readability.
Recursion can be removed by maintaining our own stack or by using an iterative version.
In non-void functions, if the recursive call appears in the return statement and that call is not part of an expression, then the call is tail recursive.
int GCD(int a, int b)
{
	if(b == 0)
		return a;
	return GCD(b, a%b); /*Tail recursive call*/
}/*End of GCD()*/
long fact(int n)
{
	if(n == 0)
		return 1;
	return(n * fact(n-1)); /*Not a tail recursive call*/
}/*End of fact()*/
Here the call fact(n-1) appears in the return statement, but it is not a tail recursive call because the call is part of an expression. Now let us see some functions which have more than one recursive call.
void tofh(int ndisk, char source, char temp, char dest)
{
	if(ndisk == 1)
	{
		printf("Move Disk %d from %c-->%c\n", ndisk, source, dest);
		return;
	}
	tofh(ndisk-1, source, dest, temp); /*Not a tail recursive call*/
	printf("Move Disk %d from %c-->%c\n", ndisk, source, dest);
	tofh(ndisk-1, temp, source, dest); /*Tail recursive call*/
}/*End of tofh()*/
In a function where the two recursive calls appear in the two branches of an if-else (as in a recursive binary search), only one recursive call is executed in each invocation, and that call is the last one executed inside the function. So both calls in such a function are tail recursive.
The two recursive calls in the fibonacci() function (P5.9) are not tail recursive because these calls are part of an expression; after returning from one call, its return value has to be added to the return value of the other recursive call.
A function is tail recursive if all the recursive calls in it are tail recursive. In the examples given earlier, the tail recursive functions are display1(), GCD(), binary_search(). The functions display2(), tofh(), fact() are not tail recursive.
Tail recursive functions can easily be written using loops: just as there is nothing to be done after an iteration of a loop finishes, in tail recursive functions there is nothing to be done after the current recursive call finishes execution. Some compilers automatically convert tail recursion to iteration to improve performance.
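For example, the tail recursive GCD() above becomes a loop mechanically: the call GCD(b, a%b) turns into a re-assignment of the parameters (GCDloop is our name for the iterative version):

```c
/* iterative form of the tail recursive GCD(): each recursive call
   becomes one more trip around the loop with updated parameters */
int GCDloop(int a, int b)
{
    while (b != 0) {
        int r = a % b;   /* what would have been passed as the 2nd argument */
        a = b;
        b = r;
    }
    return a;            /* corresponds to the base case return */
}
```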
In tail recursive functions, the last work that a function does is a recursive call, so no operation is left pending after the recursive call returns. In non-void tail recursive functions (like GCD) the value returned by the last recursive call is the value of the function. Hence in tail recursive functions, there is nothing to be done in the unwinding phase.
Since there is nothing to be done in the unwinding phase, we can jump directly from the last recursive call to the place where the recursive function was first called. So there is no need to store the return addresses of previous recursive calls or the values of their local variables, parameters, return values etc. In other words, there is no need to push a new AR for each recursive call.
Some modern compilers can detect tail recursion and perform tail recursion optimization. They do not push a new activation record when a recursive call occurs; rather they overwrite the previous activation record with the current one, while retaining the original return address. So we have only one activation record on the stack at a time, and this is for the currently executing recursive call. This improves performance by reducing the time and memory requirements. Now it does not matter how deep the recursion is; the space required will always be constant.
Since tail recursion can be implemented efficiently by compilers, we should try to make our recursive functions tail recursive whenever possible.
A recursive function can be written as a tail recursive function using an auxiliary parameter. The result is accumulated in this parameter, in such a way that there is no pending operation left after the recursive call. For example, we can rewrite the factorial function that we wrote earlier as a tail recursive function.
long TRfact(int n, int result)
{
	if(n == 0)
		return result;
	return TRfact(n-1, n*result);
}/*End of TRfact()*/
This function should be called as TRfact(n,1). We can make a helper function to initialize the value of result to 1. The use of this helper function hides the auxiliary parameter.
long TailRecursiveFact(int n)
{
	return TRfact(n,1);
}/*End of TailRecursiveFact()*/
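A self-contained sketch of the pair above (we widen the accumulator to long so intermediate products fit):

```c
/* tail recursive factorial: the running product is carried in 'result',
   so nothing is left pending after the recursive call */
long TRfact(int n, long result)
{
    if (n == 0)
        return result;
    return TRfact(n - 1, n * result);
}

/* helper that hides the auxiliary parameter */
long TailRecursiveFact(int n)
{
    return TRfact(n, 1);
}
```

TailRecursiveFact(5) unwinds as TRfact(5,1) -> TRfact(4,5) -> TRfact(3,20) -> TRfact(2,60) -> TRfact(1,120) -> TRfact(0,120).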
Functions which are not tail recursive are called augmentive recursive functions; these functions have to finish the pending work after the recursive call finishes.
In indirect recursion, a function calls another function, which in turn calls the first one, for example-
void f1()
{
	.......
	f2();
}
void f2()
{
	.......
	f1();
}
The chain of functions in indirect recursion may involve any number of functions; for example f1() calls f2(), f2() calls f3(), f3() calls f4(), and f4() calls f1(). If a function calls itself directly, i.e. f1() is called inside its own function body, then that recursion is direct recursion. All the examples that we have seen in this chapter use direct recursion. Indirect recursion is complex and is rarely used.
Exercise
Find the output of programs 1 to 16.
1. main()
{
	printf("%d %d\n", func1(3,8), func2(3,8));
}
func1(int a, int b)
{
	if(a > b)
		return 0;
	return b + func1(a, b-1);
}
func2(int a, int b)
{
	if(a > b)
		return 0;
	return a + func2(a+1, b);
}
	if(a > b)
		return 1000;
	return a + func(a+1, b);
}
3. main()
{
	printf("%d\n", func(6));
	printf("%d\n", func1(6));
}
int func(int a)
{
	if(a == 10)
		return a;
	return a + func(a+1);
}
int func1(int a)
{
	if(a == 0)
		return a;
	return a + func1(a+1);
}
4. main()
{
	printf("%d\n", func(4,8));
	printf("%d\n", func(3,8));
}
int func(int a, int b)
{
	if(a == b)
		return a;
	return a + b + func(a+1, b-1);
}
5. main()
{
	func1(10,18);
	printf("\n");
	func2(10,18);
}
void func1(int a, int b)
{
	if(a > b)
		return;
	printf("%d ", b);
	func1(a, b-1);
}
void func2(int a, int b)
{
	if(a > b)
		return;
	func2(a, b-1);
	printf("%d ", b);
}
6. main()
{
	func1(10,18);
	printf("\n");
	func2(10,18);
}
8. main()
{
	printf("%d\n", count(17243));
}
int count(int n)
{
	if(n == 0)
		return 0;
	else
		return 1 + count(n/10);
}
9. main()
{
	printf("%d\n", func(14837));
}
int func(int n)
{
	return (n) ? n%10 + func(n/10) : 0;
}
10. main()
{
	printf("%d\n", count(123212, 2));
}
int count(long int n, int d)
{
	if(n == 0)
		return 0;
	else if(n%10 == d)
		return 1 + count(n/10, d);
	else
		return count(n/10, d);
}
11. main()
{
	int arr[6] = {.....};
	printf("%d\n", func(arr, 6));
}
int func(int arr[], int size)
{
	if(size == 0)
		return 0;
	else if(arr[size-1]%2 == 0)
		return 1 + func(arr, size-1);
	else
		return func(arr, size-1);
}
12. main()
{
	int arr[6] = {.....};
	printf("%d\n", func(arr, 6));
}
int func(int arr[], int size)
{
	if(size == 0)
		return 0;
	return arr[size-1] + func(arr, size-1);
}
13. main()
{
	char str[100], a;
	printf("Enter a string : ");
	gets(str);
	printf("Enter a character : ");
	scanf("%c", &a);
	printf("%d\n", f(str, a));
}
int f(char *s, char a)
{
	if(*s == '\0')
		return 0;
	if(*s == a)
		return 1 + f(s+1, a);
	return f(s+1, a);
}
14. main()
{
	func1(4);
	func2(4);
}
void func1(int n)
{
	int i;
	if(n == 0)
		return;
	for(i=1; i<=n; i++)
		printf("*");
	printf("\n");
	func1(n-1);
}
void func2(int n)
{
	int i;
	if(n == 0)
		return;
	func2(n-1);
	for(i=1; i<=n; i++)
		printf("*");
	printf("\n");
}
15. main()
{
	int arr[6] = {.....};
	printf("%d\n", func(arr, 6));
}
16. main()
{
	int arr[6] = {.....};
	printf("%d\n", func(arr, 0, 5));
}
int func(int arr[], int low, int high)
{
	int mid, left, right;
	if(low == high)
		return arr[low];
	mid = (low + high)/2;
	left = func(arr, low, mid);
	right = func(arr, mid+1, high);
	if(left < right)
		return left;
	else
		return right;
}
19    25    Add
 9    50    Add
 4   100
 2   200
 1   400    Add
     ---
     475
Now to get the product we add those values of the right-hand column for which the corresponding left-column values are odd. So 25, 50 and 400 are added to get 475, which is the product of 19 and 25.
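This halve-and-double table translates directly into a recursive function (a sketch; the name peasant is ours):

```c
/* Russian peasant multiplication: halve a, double b, and add b
   whenever a is odd -- exactly the 'Add' rows of the table above */
long peasant(long a, long b)
{
    if (a == 0)
        return 0;
    if (a % 2 == 1)                    /* odd row: this b contributes */
        return b + peasant(a / 2, 2 * b);
    return peasant(a / 2, 2 * b);      /* even row: skip this b */
}
```

peasant(19, 25) adds 25, 50 and 400, giving 475 as in the table.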
32. Write recursive functions to find the values of floor(log2 N) and ceil(log2 N).
33. Write a recursive function to find the Binomial coefficient C(n,k) which is defined as-
C(n,0)=1
C(n,n)=1
C(n,k) = C(n-1,k-1) + C(n-1,k)
34. Write a recursive function to compute Ackermann’s function A(m,n) which is defined as-
A(m,n) = n+1                   if m=0
A(m,n) = A(m-1, 1)             if m>0, n=0
A(m,n) = A(m-1, A(m,n-1))      otherwise
41. In the program of the previous problem, make changes so that spaces, punctuation marks, and uppercase/lowercase differences are ignored. The strings "A man, a plan, a canal - Panama!" and "Live Evil" should be recognized as palindromes.
42. Write a function to convert a positive integer to string.
43. Write a function to convert a string of numbers to an integer.
44. Write a function to print all possible permutations of a string. For example, if the string is "abc" then all possible permutations are abc, acb, bac, bca, cba, cab.
45. Write a function to print these pyramids of numbers.
1          1234      4321
12         123       321
123        12        21
1234       1         1
46. A triangular number is the number of dots required to fill an equilateral triangle. The first 4 triangular numbers are 1, 3, 6, 10.
*      *        *          *
       * *      * *        * *
                * * *      * * *
                           * * * *
The data structures that we have studied till now, like linked list, stack, and queue, are all linear data structures. Linked list is a good data structure from the memory point of view, but its main disadvantage is that it is a linear data structure. Every node has information about the next node only, so searching in a linked list is sequential, which is very slow: O(n). For searching any element in the list, we have to visit all elements of the list that come before that element. If we have a linked list of 100000 elements and the element to be searched is not present or is present at the end of the list, then we have to visit all 100000 elements. Now we will study a non linear data structure in which data is organized in a hierarchical manner. A tree structure represents a hierarchical relationship among its elements. It is very useful for information retrieval, and searching in it is very fast. Before going to the definition of trees, let us become familiar with the common terms used.
6.1 Terminology
We will use the tree given below in figure 6.1 to describe the terms associated with trees.
Level - The level of any node is defined as the distance of that node from the root. The root node is at distance 0 from itself, so it is at level 0. The level number of any other node is 1 more than the level number of its parent node. Node M is at level 0; nodes B, C, R, E are at level 1; nodes K, G, H, I, J, F, L are at level 2; nodes A, N, O, P, U, D, S, T are at level 3.
Height - The total number of levels in a tree is the height of the tree. So the height is equal to one more than the largest level number of the tree. It is also sometimes known as the depth of the tree. The largest level number in the example tree is 3, so its height is 3+1 = 4.
(Some texts define the height of a tree to be equal to the largest level, and some take the root at level 1. In this book we'll use level and height as defined here.)
Siblings - Two or more nodes which have the same parent are called siblings or brothers. B, C, R, E are siblings since their parent is M. All siblings are at the same level, but it is not necessary that nodes at the same level are siblings. For example, K and I are at the same level but they are not siblings as their parents are different.
Path - A path is defined as a sequence of nodes N1, N2, ....., Nm such that each node Ni is the parent of Ni+1 for 1 <= i < m. In a tree there is only one path between two nodes. The path length is defined as the number of edges on the path (m-1).
Ancestor and Descendent - Any node Na is said to be an ancestor of node Nm if node Na lies on the unique path from the root node to node Nm. For example, node E is an ancestor of node S. If node Na is an ancestor of node Nm, then node Nm is said to be a descendent of node Na.
Subtree - A tree may be divided into subtrees which can further be divided into subtrees. The first node of the
subtree is called the root of the subtree. For example, the tree in figure 6.1 may be divided into 4 subtrees -
subtree BKGHANOP rooted at B, subtree C rooted at C, subtree RIJU rooted at R, subtree EFLDST rooted at
E. The subtree BKGHANOP can be further divided into three subtrees — subtree KAN rooted at K, subtree G
rooted at G, subtree HOP rooted at H. We can see that a subtree rooted at any node X consists of all the
descendents of X. The root of the subtree is used to name the subtree so instead of saying subtree BKGHANOP
we generally say subtree B.
Degree - The number of subtrees or children of a node is called its degree. For example, the degree of node M is 4, of B is 3, of F is 2, of L is 1, and of S is 0. The degree of a tree is the maximum degree of the nodes of the tree.
The degree of the example tree is 4.
Forest - A forest is a set of n disjoint trees where n > 0. If the root of a tree is removed we get a forest
consisting of its subtrees. If in the example tree we remove the root M, then we get a forest with four trees.
The following five trees are examples of binary trees-
[Figure 6.3: Five example binary trees, numbered (1) to (5)]
In figure 6.3, the trees (2), (3) and (4) are similar as they have the same shape, while trees (2) and (4) are copies as they have the same shape and data. Tree (1) has a different shape so it is not similar to the other trees.
In binary trees, we define left and right descendents. Any node Nj is a left descendent of node Ni if Nj belongs to the left subtree of Ni; similarly, any node Nj is a right descendent of node Ni if Nj belongs to the right subtree of Ni.
The trees which have the minimum number of nodes for a given height are called skew trees. These trees have h nodes and each tree has only one path. Some skew trees are shown below.
[Figure 6.4: Left-skewed and right-skewed binary trees]
A binary tree in which each node has either 0 or 2 children is a strictly binary tree, i.e. there is no node with one child.
A binary tree can be converted into an extended binary tree by adding a special node wherever a link is empty; these added nodes are external nodes and the original nodes of the tree are internal nodes. The following figure shows a binary tree and the corresponding extended binary tree.
[Figure 6.6: A binary tree and the corresponding extended binary tree]
In the figure, external nodes are shown by squares and internal nodes by circles. We can see that all the external nodes are leaf nodes while the internal nodes are non leaf nodes.
The extended binary tree is a strictly binary tree i.e. each node has either 0 or 2 children.
The path length for any node is the number of edges traversed from that node to the root node. This path
length is equal to the level number of that node. Path length of a tree is the sum of path lengths of all the nodes
of the tree.
Internal path length of a binary tree is the sum of the path lengths of all internal nodes, which is equal to the sum of the levels of all internal nodes. The internal path length of the tree given in figure 6.6 is-
I = 0+1+1+2+2+3 = 9
External path length of a binary tree is the sum of the path lengths of all external nodes, which is equal to the sum of the levels of all external nodes. The external path length of the tree given in figure 6.6 is-
E = 2+2+3+3+3+4+4 = 21
The internal path length and external path length will be maximum when the tree is skewed(as in figure 6.4)
and minimum when the tree is a complete binary tree(section 6.7). The following property shows the relation
between internal and external path lengths.
Property 9 - In an extended binary tree, if E is the external path length, I is the internal path length and n is the number of internal nodes, then E = I + 2n.
Proof : This property can be proved by induction on the number of internal nodes. If tree contains only root
then E =I =n = 0 and the theorem is correct for n=0. If n=1, then there is only one internal node which is the
root node. The root node will have two children which are external nodes. So the internal path length is 0 and
external path length is 2 i.e. E=2, I=0 and n=1. Hence the property is true for n=1 also.
Suppose we have a binary tree T that has k internal nodes. Let E and I be external and internal path lengths
of this tree. Let A be an internal node in this tree such that both children of A are external nodes. Let p be the
level of node A i.e. there are p edges from root to node A. We delete both children of node A from the tree.
Node A is not an internal node now, so the number of internal nodes is k-1. Suppose E' and I' are the external and internal path lengths of this resulting tree T'.
Two external nodes are deleted which are at level (p+1) so external path length decreases by 2(p+1) and
node A becomes an external node so external path length increases by p.
E' = E - 2(p+1) + p ............(1)
Node A is not an internal node now, so the internal path length decreases by p.
I' = I - p ............(2)
Induction hypothesis: We assume that the property is true for the tree T' that we obtain after deleting the two children of A.
E' = I' + 2(k-1)
Substituting the values of E' and I' from (1) and (2)-
E - 2(p+1) + p = I - p + 2(k-1)
On simplifying this equation we get-
E = I + 2k
So we have proved that if the property is true for tree T' with k-1 internal nodes then it is also true for tree T with k internal nodes.
The total number of nodes n in a full binary tree of height h is 2^h - 1, so its height is h = log2(n+1). Now suppose all nodes in a full binary tree are numbered from 1 to n, starting from the node on level 0, then the nodes on level 1, and so on. Nodes on the same level are numbered from left to right. The root node is numbered 1.
[Figure 6.8: Full binary tree with nodes numbered 1 to 15]
From the figure we can see that if the number of any node is k, then the number of its left child is 2k, the number of its right child is 2k+1, and the number of its parent is floor(k/2).
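These numbering rules translate directly into index arithmetic (1-based, as in figure 6.8; the function names are ours):

```c
/* 1-based node numbering of a full binary tree */
int leftChild(int k)  { return 2 * k; }
int rightChild(int k) { return 2 * k + 1; }
int parent(int k)     { return k / 2; }   /* integer division is floor(k/2) */
```

For example, node 3 has children 6 and 7, and both 6 and 7 map back to parent 3.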
The following three trees are complete binary trees.
The number of nodes will be minimum when the last level has only one node. In this case the total number of nodes will be-
total nodes in a full binary tree of height (h-1) + one node
= (2^(h-1) - 1) + 1
= 2^(h-1)
(In some texts, strictly binary trees, full binary trees and complete binary trees are defined in a different way. In
this book we’ll use these terms as defined here.)
[Figure 6.10: Full binary tree of 15 nodes, numbered 1 to 15]
We'll store the nodes in the array named tree from index 1 onwards, so if a node is numbered k then the
data of this node is stored in tree[k]. The root node is stored in tree[1] and its left and right children in
tree[2] and tree[3] respectively, and so on. So the full binary tree in figure 6.10 can be stored in the array as-
tree[0] is left unused, and tree[1] to tree[15] hold the nodes numbered 1 to 15 in order.
We can easily extend this scheme for other binary trees also. For this we consider any binary tree as a
complete binary tree with some missing nodes. The nodes that are missing are also numbered and the array
locations corresponding to them are left empty. For example-
Figure 6.11
If a node is at index k in the array then its left child will be at index 2k and right child will be at index 2k+1.
Its parent will be at index floor(k/2). We can see that lots of locations in the array are wasted if there are many
missing nodes.
We have left the location 0 empty; it may be used for some other purpose. If we want to store nodes in the
array starting from the index 0, then we can number the nodes from 0 to n-1. The root node is numbered 0 and
stored at index 0 in the array. Now if a node is at index k in the array then its left child will be at index 2k+1 and its
right child will be at index 2k+2. Its parent will be at index floor((k-1)/2).
The sequential representation is efficient from the execution point of view because we can calculate the index of the
parent and the indices of the left and right children from the index of a node. There is no need to have explicit pointers
pointing to other nodes; the relationships are implicit.
We know that the maximum number of nodes possible in a binary tree of height h is 2^h - 1, so the size of the
array needed is equal to 2^h - 1. This size will be minimum if h is minimum, and it will be maximum if h is
maximum. The height is minimum for complete binary trees, where it is ceil(log2(n+1)), so in this case the size of array
needed is-
2^ceil(log2(n+1)) - 1
The height is maximum (equal to n) for skewed binary trees, and so in this case the size of array needed is 2^n - 1.
We have seen that a lot of space is wasted if there are many missing nodes. The maximum wastage occurs in
the case of a right-skewed binary tree of height h; it would require an array of size 2^h - 1 out of which only n
positions will be occupied. For example, a right-skewed tree having 20 nodes (height = 20) would require an
array of size 1048575 but only 20 of these locations would be used. A complete binary tree of 20 nodes (height
= 5) would at most require an array of size 31. So this method is not very efficient in terms of space for trees
other than complete and full binary trees.
Sequential representation is a static representation and the size of the tree is restricted because of the limitation
of the array size. The size of the array has to be known in advance; if the array size is taken too small then overflow
may occur, and if the array size is too large then space may be wasted.
Insertion and deletion of nodes requires a lot of movement of nodes in the array, which consumes a lot of time,
so it is suitable only for data that won't change frequently.
Figure 6.12
The nodes and their memory addresses are shown in the figure. For example, the node A is located at address
400 and its left and right child pointers contain the addresses 250 and 780, which are the addresses of its left
child B and right child C. For leaf nodes, the left and right pointers are NULL. We can see that the nodes of the
tree are scattered here and there in memory, but they are still connected to each other through the lchild and
rchild pointers, which also maintain the hierarchical order of the tree. The logical view of the linked
representation of the binary tree in figure 6.12 can be shown as-
Figure 6.13
To access the nodes of the tree, we will maintain a pointer that will point to the root node of the tree.
struct node *root;
We insert one more node in the tree and this will lead to removal of one null link, but two new null links will
appear; hence the total null links of the tree T will increase by 1. So the nodes and null links of the new tree T' are
k' and p', where k' = k+1, p' = p+1
We can write this as-
k = k' - 1
p = p' - 1
Putting these values in equation (i) we get
p' = k' + 1
So we have proved that if the property is true for a tree T with k nodes then it is also true for tree T' with
k+1 nodes. Hence the property is proved.
- Visit A
- Traverse left subtree of A in preorder
  - Visit B
  - Traverse left subtree of B in preorder
    - Visit D
  - Traverse right subtree of B in preorder
    - Visit E
    - Traverse left subtree of E in preorder
      - Visit H
    - Traverse right subtree of E in preorder
      - Visit I
- Traverse right subtree of A in preorder
  - Visit C
  - Traverse left subtree of C in preorder
    - Visit F
  - Traverse right subtree of C in preorder
    - Visit G
    - Traverse left subtree of G in preorder
      - Visit J
    - Traverse right subtree of G in preorder
      - Empty

Preorder Traversal : A B D E H I C F G J
Let us take another example of a binary tree and apply each traversal.
Preorder: A BDHECFIGJK
Inorder :DHBEAIFCJGK
Postorder:H DEBIFJKGCA
Figure 6.17
The three traversals have been defined recursively, so they can be implemented using a stack. If we
implement them recursively then an implicit stack is used, and if we implement them non recursively then we
have to use an explicit stack. We'll see both implementations of all three traversals.
We assume that a binary tree already exists and is represented using the linked representation, where the
pointer root is a pointer to the root node of the tree. We will take an integer value as the information in each
node of the tree, and visiting a node would mean printing this value.
The recursive functions for the three traversals are given below. All these functions are passed the address of
the root.
void preorder(struct node *ptr)
{
	if(ptr==NULL)    /*Base Case*/
		return;
	printf("%d ",ptr->info);
	preorder(ptr->lchild);
	preorder(ptr->rchild);
}/*End of preorder()*/
void inorder (struct node *ptr)
{
if (ptr==NULL) /*Base Case*/
return;
inorder (ptr->lchild) ;
	printf("%d ",ptr->info);
inorder (ptr->rchild) ;
}/*End of inorder()*/
void postorder(struct node *ptr)
{
if (ptr==NULL) /*Base Case*/
return;
	postorder(ptr->lchild);
	postorder(ptr->rchild);
	printf("%d ",ptr->info);
}/*End of postorder()*/
In the explanation when we say push or pop a node, we would actually mean push or pop the address of that
node.
- Figure 6.18
If we apply this process to the tree given in figure 6.18, the steps would be-
- Push 50
- Pop 50 : Visit 50
  Push right and left child of 50 : Push 60, Push 40
- Pop 40 : Visit 40
  40 has no right child so push its left child : Push 30
- Pop 30 : Visit 30
  Push right and left child of 30 : Push 35, Push 25
- Pop 25 : Visit 25
  25 has no right child so push its left child : Push 20
- Pop 20 : Visit 20
  20 has no children so nothing is pushed
- Pop 35 : Visit 35
  Push right and left child of 35 : Push 36, Push 33
- Pop 33 : Visit 33
  33 has no children so nothing is pushed
- Pop 36 : Visit 36
  36 has no children so nothing is pushed
- Pop 60 : Visit 60
  60 has no left child so push its right child : Push 70
- Pop 70 : Visit 70
  Push right and left child of 70 : Push 80, Push 65
- Pop 65 : Visit 65
  65 has no children so nothing is pushed
- Pop 80 : Visit 80
  80 has no children so nothing is pushed
Preorder traversal : 50 40 30 25 20 35 33 36 60 70 65 80
		if(ptr->lchild!=NULL)
			push_stack(ptr->lchild);
	}
	printf("\n");
}/*End of nrec_pre()*/
Since we move along leftmost path and push nodes, we know that whenever we pop a node, its left subtree
traversal is finished. So a node is visited just after popping it from the stack.
If we apply this process to the tree given in figure 6.18 the steps would be-
(i) Move along the leftmost path rooted at 50
. - Push 50
- Push 40
- Push 30
- Push 25
- 20 is leftmost node, it has no right subtree : Visit 20 and pop
- 25 popped : it has no right subtree : Visit 25 and pop
- 30 popped : it has right subtree : Visit 30 and move to its right subtree
		while(ptr->rchild==NULL)
		{
			printf("%d ",ptr->info);
			if(stack_empty())
				return;
			ptr = pop_stack();
		}
		printf("%d ",ptr->info);
		ptr = ptr->rchild;
	}
	printf("\n");
}/*End of nrec_in()*/
20 25 33 36 35 30 40 65 80 70 60 50
While writing the function for postorder we will use a secondary pointer q to know whether the right subtree
has been traversed or not.
The function for non recursive postorder traversal is-
void nrec_post(struct node *root)
{
	struct node *q, *ptr = root;
	if(ptr==NULL)
	{
		printf("Tree is empty\n");
		return;
	}
	q = root;
	while(1)
	{
		while(ptr->lchild!=NULL)
		{
			push_stack(ptr);
			ptr = ptr->lchild;
		}
		while(ptr->rchild==NULL || ptr->rchild==q)
		{
If the right subtree of ptr has already been traversed, then the node visited recently would be the root of right
subtree of ptr. The pointer q is used to store the address of recently visited node.
This traversal can be implemented using a queue that will store the addresses of the nodes. The procedure for
level order traversal is-
Pre: BDHE Pre: CFIGJK
In: DHBE In: IFCJGK
Left subtree of A:
From preorder we get node B as root
From inorder we get nodes D, H in left subtree of B, and node E in right subtree of B.
Right subtree of A :
From preorder we get C as the root |
From inorder we get nodes I, F in left subtree of C, and nodes J, G, K in right subtree
of C.
_ Figure 6.20
Postorder: HIDJEBKFGCA |
Inorder : HDIBEJAKFCG
Node A is the last node in postorder traversal, so it will be the root of the tree. From inorder, we see that nodes to
the left of the root node A are H, D, I, B, E, J, so these nodes form the left subtree of A; similarly nodes
K, F, C, G form the right subtree of A since they are to the right of A.
Post: HIDJEB Post: KFGC
In: HDIBEJ .-. In: KFCG
Left subtree of A :
From postorder we get node B as root.
From inorder we get nodes H, D, I in left subtree of B, and nodes E, J in right subtree of B.
Right subtree of A :
From postorder we get node C as root.
From inorder we get nodes K, F in left subtree of C, and node G in right subtree of C. a
Figure 6.21
Now we’ll see a quicker method of creating the tree from preorder and inorder traversal. In preorder traversal,
scan the nodes one by one and keep inserting them in the tree. In inorder traversal, underline the nodes which
have been inserted. To insert a node in its proper position in the tree, we will look at that node in the inorder
traversal and insert it according to its position with respect to the underlined nodes. ‘
Preorder: A B D G H E I C F J K
Inorder : G D H B E I A C J F K
Insert A:
GDHBEIACJFK
A is the first node in preorder, hence it is root of the tree.
Insert B:
GDHBEIACJFK
B is to left of A, hence it is left child of A.
Insert D:
GDHBEIACJFK
D is to the left of B, hence D is left child of B.
Insert G:
GDHBEIACJFK
G is to the left of D, hence G is left child of D.
Insert H:
GDHBEIACJFK
H is to the left of B and right of D, hence H is right child of D.
Insert E:
GDHBEIACJFK
E is to the left of A and right of B, hence E is right child of B.
Insert I:
GDHBEIACJFK
I is to the left of A and right of E, hence I is right child of E.
Insert C:
GDHBEIACJFK
C is to the right of A, hence C is right child of A.
Insert F:
GDHBEIACJFK
F is to the right of C, hence F is right child of C.
Insert J:
GDHBEIACJFK
J is to the left of F and right of C, hence J is left child of F.
Insert K:
GDHBEIACJFK
K is to the right of F, hence K is right child of F.
Creation of a tree from postorder and inorder by this method is the same as creation of a tree from preorder and
inorder. The only difference is that in postorder we will start scanning the nodes from the right side, i.e. the last
node in the postorder will be inserted first and the first node will be inserted last.
The function for constructing a tree from inorder and preorder is given next. The inorder and preorder
traversals are stored in linked lists. Structure for nodes of tree is declared as struct treenode and structure
for nodes of list is declared as struct listnode.
struct treenode *construct(struct listnode *inptr, struct listnode *preptr,int num)
{
struct treenode *temp;
struct listnode *q;
	int i, j;
if (num==0)
return NULL; |
temp=(struct treenode *)malloc(sizeof(struct treenode)) ;
temp->info=preptr->info;
temp->lchild = NULL;
temp->rchild = NULL;
if (num==1) /*if only one node in tree*/
return temp;
	q = inptr;
for(i=0; q->info != preptr->info; i++)
q = q->next;
/*Now q points to root node in inorder list and number of nodes in its
left subtree is i*/
/*For left subtree*/ :
temp->lchild = construct (inptr, preptr->next, i);
/*For right subtree*/
	for(j=1; j<=i+1; j++)
preptr=preptr->next;
	temp->rchild = construct(q->next, preptr, num-i-1);
return temp;
}/*End of construct ()*/
The function for constructing a tree from inorder and postorder is-
struct treenode *construct (struct listnode *inptr,struct listnode *postptr, int num)
{
struct treenode *temp;
	struct listnode *q, *ptr;
	int i, j;
	if(num==0)
return NULL;
ptr=postptr;
	for(i=1; i<num; i++)
ptr = ptr—->next;
/*Now ptr points to last node, of postorder which is root*/
	temp = (struct treenode *)malloc(sizeof(struct treenode));
	temp->info = ptr->info;
temp->lchild = NULL;
temp->rchild = NULL;
if(num==1)/*if only one node in tree*/
return temp;
	q = inptr;
	for(i=0; q->info != ptr->info; i++)
		q = q->next;
	/*Now i denotes the number of nodes in left subtree
	  and q points to root node in inorder list*/
	/*For left subtree*/
	temp->lchild = construct(inptr, postptr, i);
/*For right subtree*/
	for(j=1; j<=i; j++)
postptr = postptr->next;
temp->rchild = construct (q->next, postptr, num-i-1);
return temp;
}/*End of construct()*/
int height(struct node *ptr)
{
	int h_left, h_right;
	if(ptr == NULL)    /*Base Case*/
		return 0;
	h_left = height(ptr->lchild);
	h_right = height(ptr->rchild);
	if(h_left > h_right)
		return 1 + h_left;
	else
		return 1 + h_right;
}/*End of height()*/
The height of the empty tree is 0 and this serves as the base case. In the recursive case, we can find the
height of a tree by adding 1 to the height of its left or right subtree (whichever is greater). For example, suppose we
have to find the height of the given tree-
Figure 6.22
The heights of the left and right subtrees rooted at H are 2 and 3 respectively. So the height of this tree can be
obtained by adding 1 to the height of the right subtree, and hence the height of this tree is 4. The heights of the left and right
subtrees can be found in the same way.
6.11 Expression tree
Any algebraic expression can be represented by a tree in which the non-leaf nodes are operators and the leaf nodes
are the operands. Almost all arithmetic operators are unary or binary, so expression trees are generally
binary trees. The left child represents the left operand while the right child represents the right operand. In the
case of the unary minus operator, the node will have only one child. Every algebraic expression represents a unique
tree. Let us take an algebraic expression and the corresponding expression tree.
We can see that the leaf nodes are the variables or constants and all the non-leaf nodes are the operators.
The parentheses of the algebraic expression do not appear in the tree, but the tree retains the purpose of the
parentheses; for example, the + operator is applied to operands a and b and then the / operator is applied to a+b
and c. Now we'll take an expression tree and apply each traversal.
Algebraic expression : (a—b*c)/(d+e/f)
The preorder and postorder traversals give the corresponding prefix and postfix expressions of the given
algebraic expression. The inorder traversal gives us an expression in which the order of operators and operands
is the same as in the given algebraic expression but without the parentheses. We can get a parenthesized infix
expression by following the given procedure-
Starting from the root node, the above recursive procedure will give us this expression-
((a-(b*c))/(d+(e/f)))
This expression has many surplus pairs of parentheses, but it exactly represents the algebraic expression
(a-b*c)/(d+e/f).
Figure 6.25
A binary search tree is a binary tree that may be empty, and if it is not empty then it satisfies the following
properties-
1. All the keys in the left subtree of the root are less than the key in the root.
2. All the keys in the right subtree of the root are greater than the key in the root.
3. Left and right subtrees of the root are also binary search trees.
We have assumed that all the keys of the binary search tree are distinct, although there can be binary search trees
with duplicates, and in that case the operations explained here have to be changed.
For simplicity we assume that the data part contains only an integer value, which also serves as the key. The
following trees are examples of binary search trees.
Figure 6.26
In all these trees we can see that for each node N in the tree, all the keys in left subtree of node N are smaller
than key of node N and all the keys in right subtree of node N are greater than the key of node N.
Figure 6.27
The different traversals for the binary search tree in figure 6.27 are-
Preorder : 67 34 12 10 45 38 60 80 78 95 86
Inorder : 10 12 34 38 45 60 67 78 80 86 95
Postorder: 10 12 38 60 45 34 78 86 95 80 67
Level order :67 34 80 12 45 78 95 10 38 60 86
Note that the inorder traversal of a binary search tree gives us all keys of that tree in ascending order.
The function for searching a node non recursively is given below. This function is passed the address of the root
of the tree and the key to be searched. It returns address of the node having the desired key or NULL if such a
node is not found.
struct node *search_nrec(struct node *ptr, int skey)
{
	while(ptr!=NULL)
{
if(skey < ptr->info)
ptr = ptr->lchild; /*Move to left child*/
else if(skey > ptr->info)
ptr = ptr->rchild; /*Move to right child*/
else /*skey found*/
return ptr;
}
return NULL; /*skey not found*/
}/*End of search_nrec()*/
The search process can also be described recursively. To search a key in the tree, first the key is compared
with the key in the root node. If the key is found there, then the search is successful. If the key is less than the key
in the root, then the search is performed in the left subtree because we know that all keys less than the root are stored in
the left subtree. Similarly, if the key is greater than the key in the root then the search is performed in the right subtree.
In this way the search is carried out recursively. The recursive process stops when we reach a base case of
the recursion. Here we have two base cases: first when we find the desired key (successful search) and second when
we reach a NULL subtree (unsuccessful search). The recursive function for searching a node is given below.
This function is passed the address of the root of the tree and the value of the key to be searched.
struct node *search(struct node *ptr,int skey)
{
if (ptr==NULL)
{
printf("key not found\n");
return NULL;
}
	else if(skey < ptr->info)    /*search in left subtree*/
		return search(ptr->lchild, skey);
	else if(skey > ptr->info)    /*search in right subtree*/
		return search(ptr->rchild, skey);
else /*skey found*/
return ptr;
}/*End of search()*/
Searching in binary search tree is efficient because we just have to traverse a branch of the tree and not all
the nodes sequentially as in linked list. The running time is O(h) where h is the height of the tree.
Both these operations run in O(h) time, where h is the height of the tree.
To insert 39, first it is searched in the tree and the search terminates when we get a NULL right subtree of
38, so it is inserted as the right child of node 38. Let us create a binary search tree from the given keys-
Creating a binary search tree from the keys : 50, 30, 60, 38, 35, 55, 22, 59, 94, 13, 98
Figure 6.30
Note that if we alter the sequence of data we will get different binary search trees for the same data.
The non recursive function for the insertion of a node is given below. This function is passed the address of the
root of the tree and the key to be inserted. The root of the tree might change inside the function so it returns the
address of the root node.
struct node *insert_nrec(struct node *root,int ikey)
printf("Duplicate key");
return root;
	}
}
This function will be called as root = insert_nrec(root, ikey);
To insert in empty tree, root is made to point to the new node. As we move down the tree, we keep track of
the parent of the node because this is required for the insertion of new node. The pointer ptr walks down the
path and the parent of the ptr is maintained through pointer par. When the search terminates unsuccessfully,
ptr is NULL and par points to the node whose link should be changed to insert the new node. The new node
tmp is made the left or right child of the parent par.
The insertion of a node in a binary search tree can also be described recursively. First the key to be inserted is
compared with the key of the root node. If the key is found there then there is nothing to be done, so we return. If the
key to be inserted is less than the key in the root node then it is inserted in the left subtree, otherwise it is inserted in the
right subtree. The same process is followed for both subtrees till we reach a situation where we have to insert the
node in an empty subtree.
The recursive function for the insertion of a node is given below. This function is passed the address of the root
of the tree and the key to be inserted. It returns the address of the root of the tree. The two base cases which stop the
recursion are - when we find the key or when we reach a NULL subtree.
struct node *insert(struct node *ptr,int ikey)
{
if (ptr==NULL) /*Base Case*/
{
		ptr = (struct node *)malloc(sizeof(struct node));
ptr->info = ikey;
ptr->lchild = NULL;
ptr->rchild = NULL;
}
else if(ikey < ptr->info) /*Insertion in left subtree*/
ptr->lchild = insert(ptr->lchild, ikey);
	else if(ikey > ptr->info)    /*Insertion in right subtree*/
		ptr->rchild = insert(ptr->rchild, ikey);
	else
printf("Duplicate key\n"); /*Base Case*/
	return ptr;
}/*End of insert()*/
Insertion operation takes O(h) time where h is the height of the tree.
Case A:
To delete a leaf node N, the link to node N is replaced by NULL. If the node is the left child of its parent then the
left link of its parent is set to NULL, and if the node is the right child of its parent then the right link of its parent is
set to NULL. Then the node is deallocated using free().
Figure 6.32 Deletion of 80
To delete 80, the right link of 67 is set to NULL.
If the leaf node to be deleted has no parent i.e. it is the root node then pointer root is set to NULL.
struct node *case_a(struct node *root,struct node *par,struct node *ptr)
{
	if(par==NULL)    /*root node to be deleted*/
		root = NULL;
	else if(ptr==par->lchild)
		par->lchild = NULL;
	else
		par->rchild = NULL;
	free(ptr);
	return root;
}/*End of case_a()*/
Case B:
In this case, the node to be deleted has only one child. After deletion this single child takes the place of the
deleted node. For this we just change the appropriate pointer of the parent node so that after deletion it points to
the child of deleted node. After this, the node is deallocated using free (). Suppose N is the node to be deleted,
P is its parent and C is its child.
If N is the left child of P, then after deletion the node C becomes the left child of P.
If N is the right child of P, then after deletion the node C becomes the right child of P.
The node 82 is to be deleted from the tree. Node 82 is right child of its parent 67, so the single child 78 takes
the place of 82 by becoming the right child of 67.
The node 34 is to be deleted from the tree. Node 34 is left child of its parent 59, so the single child 45 takes
the place of 34 by becoming the left child of 59.
Figure 6.35 Deletion of 59
The node 59 is to be deleted from the tree. After searching we will find that its parent is NULL because it is
the root node. After deletion the single child 34 will become the new root of the tree.
struct node *case_b(struct node *root,struct node *par,struct node *ptr)
{
	struct node *child;
	/*Initialize child*/
	if(ptr->lchild!=NULL)    /*node to be deleted has left child*/
		child = ptr->lchild;
	else    /*node to be deleted has right child*/
		child = ptr->rchild;
	if(par==NULL)    /*node to be deleted is root node*/
		root = child;
	else if(ptr==par->lchild)    /*node is left child of its parent*/
		par->lchild = child;
	else    /*node is right child of its parent*/
		par->rchild = child;
	free(ptr);
	return root;
}/*End of case_b()*/
First we check whether the node to be deleted has left child or right child. If it has only left child then we
store the address of left child and if it has only right child then we store the address of right child in pointer
variable child.
If the node to be deleted is the root node then we assign child to root, so the child of the deleted node (root) will become
the new root node. Otherwise, we check whether the node to be deleted is the left child or the right child of its parent. If
it is the left child then we assign child to the lchild part of its parent, otherwise we assign child to the rchild part of
its parent.
Case C:
This is the case when the node to be deleted has two children. Here we have to find the inorder successor of the
node. The data of the inorder successor is copied to the node and then the inorder successor is deleted from the
tree.
Inorder successor of a node can be deleted by case A or case B because it will have either one right child or
no child, it can’t have two children.
Inorder successor of a node is the node that comes just after that node in the inorder traversal of the tree.
Inorder successor of a node N, having a right child is the leftmost node in the right subtree of the node N. To
find the inorder successor of a node N we move to the right child of N and keep on moving left till we find a
node with no left child. So we can see why inorder successor can’t have a left child.
Here node N having the key 81 is to be deleted from the tree. Its inorder successor is node S having the key
89, so the data of node S is copied to node N and now node S is to be deleted from the tree. Node S can be
deleted using case A because it has no child.
Here node N having the key 67 is to be deleted from the tree. Its inorder successor is node S having the key -
78, so the data of node S is copied to node N and now node S has to be deleted from the tree. Node S can be
deleted using case B because it has only one child.
	ptr->info = succ->info;
	if(succ->lchild==NULL && succ->rchild==NULL)
		root = case_a(root, parsucc, succ);
	else
		root = case_b(root, parsucc, succ);
	return root;
}/*End of case_c()*/
In the following function del_nrec1(), we have not made separate functions for the different cases. All the
cases are handled in this function only. Case A and case B are regarded as one case, because in case A the
child will be initialized to NULL.
else
tmp = ptr;
if(ptr->lchild!=NULL) /*only left child*/
ptr = ptr->lchild;
else if (ptr->rchild!=NULL) /*only right child*/
ptr = ptr->rchild;
		else    /*no child*/
ptr = NULL;
free(tmp);
}
}
	return ptr;
}/*End of del()*/
We can use the inorder predecessor instead of the inorder successor in case C, and in that case we get a different
binary search tree. The inorder predecessor of a node N is the rightmost node in the left subtree of node N. Like
other operations, deletion also takes O(h) running time where h is the height of the tree. The main() function
for all recursive operations is given next.
/*P6.3 Recursive operations in Binary Search Tree*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
struct node *lchild;
	int info;
struct node *rchild;
};
struct node *search(struct node *ptr,int skey);
struct node *insert (struct node *ptr,int ikey);
struct node *del(struct node *ptr,int dkey);
struct node *Min(struct node *ptr);
struct node *Max(struct node *ptr);
void preorder(struct node *ptr);
void inorder(struct node *ptr);
void postorder(struct node *ptr);
int height (struct node *ptr);
main()
{
struct node *root=NULL, *ptr;
int choice,k;
while(1)
{
		printf("\n");
printf ("1.Search\n") ;_
printf("2.Insert\n") ;
printf("3.Delete\n") ;
printf("4.Preorder Traversal\n");
printf("5.Inorder Traversal\n") ;
printf("6.Postorder Traversal\n") ;
printf("7.Height of tree\n");
printf("8.Find minimum and maximum\n") ;
		printf("9.Quit\n");
printf("Enter your choice : ");
		scanf("%d",&choice);
switch(choice)
{
case 1:
printf("Enter the key to be searched : ");
scanf ("%d", &k) ;
				ptr = search(root,k);
				if(ptr==NULL)
					printf("Key not present\n");
				else
					printf("Key present\n");
break;
			case 2:
printf("Enter the key to be inserted : ");
scanf ("%d", &k) ;
				root = insert(root,k);
break;
case 3:
printf("Enter the key to be deleted : ");
				scanf("%d",&k);
				root = del(root,k);
break;
case 4:
preorder (root) ;
break;
case 5:
inorder (root) ;
break;
case 6:
postorder (root);
break;
case 7:
				printf("Height of tree is %d\n", height(root));
				break;
case 8:
				ptr = Min(root);
				if(ptr!=NULL)
					printf("Minimum key is %d\n",ptr->info);
				ptr = Max(root);
				if(ptr!=NULL)
					printf("Maximum key is %d\n",ptr->info);
break;
case 9:
				exit(1);
default:
printf ("Wrong choice\n");
		}/*End of switch*/
	}/*End of while*/
}/*End of main()*/
The functions for traversals and height are same as given in binary tree.
The first node in inorder traversal has no predecessor and the last node has no successor. So the left pointer of the
first node and the right pointer of the last node in inorder traversal contain NULL. For consistency, the lthread
field of the first node and the rthread field of the last node are set to true. All the declarations and main() function for
the program of threaded tree are given below.
/*P6.5 Insertion, Deletion and Traversal in fully in-threaded Binary Search Tree*/
#include<stdio.h>
#include<stdlib.h>
typedef enum {false,true} boolean;
struct node *in_succ(struct node *p);
struct node *in_pred(struct node *p);
struct node *insert(struct node *root,int ikey);
struct node *del(struct node *root,int dkey);
struct node *case_a(struct node *root,struct node *par,struct node *ptr);
struct node *case_b(struct node *root,struct node *par,struct node *ptr);
struct node *case_c(struct node *root,struct node *par,struct node *ptr);
void inorder(struct node *root) ;
void preorder(struct node *root);
struct node
{
struct node *left;
boolean lthread;
int info;
boolean rthread;
	struct node *right;
};
main()
{
int choice,num;
struct node *root=NULL;
while(1)
{
		printf("\n");
		printf("1.Insert\n");
printf("2.Delete\n");
		printf("3.Inorder Traversal\n");
printf("4.Preorder Traversal\n");
		printf("5.Quit\n");
printf("Enter your choice : ");
scanf("%d",&choice) ;
switch (choice)
{
case 1:
				printf("Enter the number to be inserted : ");
scanf("%d", &num) ;
root = insert (root, num);
break;
case 2:
				printf("Enter the number to be deleted : ");
scanf("%d",&num) ;
root = del(root,num) ;
break;
case 3:
inorder (root) ;
break;
case 4:
preorder (root);
break;
case 5:
exit(1);
default:
printf ("Wrong choice\n");
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
6.13.5.1 Insertion
As in a binary search tree, here also first we will search for the key value in the tree; if it is already present then
we return, otherwise the new key is inserted at the point where the search terminates. In a BST, search terminates
either when we find the key or when we reach a NULL left or right pointer. Here all left and right NULL
pointers are replaced by threads except the left pointer of the first node and the right pointer of the last node. So here the search
will be unsuccessful when we reach a NULL pointer or a thread.
The new node tmp will be inserted as a leaf node so its left and right pointers both will be threads.
tmp = (struct node *)malloc(sizeof(struct node));
tmp->info = ikey;
tmp->lthread = true;
tmp->rthread = true;
Trees
Case 1 : Insertion in empty tree
Both left and right pointers of tmp will be set to NULL and new node becomes the root.
root = tmp;
tmp->left = NULL;
tmp->right = NULL;
Before insertion, the left pointer of parent was a thread, but after insertion it will be a link pointing to the new
node.
par->lthread = false;
par->left = tmp;
The following example shows a node being inserted as left child of its parent.
Before insertion, the right pointer of parent was a thread, but after insertion it will be a link pointing to the new
node.
par->rthread = false;
par->right = tmp;
The following example shows a node being inserted as right child of its parent.
struct node *insert(struct node *root,int ikey)
{
struct node *tmp,*par,*ptr;
int found = 0;
ptr = root; par = NULL;
while(ptr!=NULL)
{
if(ikey==ptr->info)
{
found = 1;
break;
}
par = ptr;
if(ikey < ptr->info)
{
if (ptr->lthread == false)
ptr = ptr->left;
else
break;
}
else
{
if(ptr->rthread == false)
ptr = ptr->right;
else
break;
}
}
if (found)
printf("Duplicate key");
else
{
tmp = (struct node *)malloc(sizeof (struct node));
tmp->info = ikey;
tmp->lthread = true;
tmp->rthread = true;
if (par==NULL)
{
root = tmp;
tmp->left = NULL;
tmp->right = NULL;
}
else if(ikey < par->info)
{
tmp->left = par->left;
tmp->right = par;
par->lthread = false;
par->left = tmp;
}
else
{
tmp->left = par;
tmp->right = par->right;
par->rthread = false;
par->right = tmp;
}
}
return root;
}/*End of insert()*/
6.13.5.2 Deletion
First the key to be deleted is searched, and then there are different cases for deleting the node in which the key is found.
struct node *del(struct node *root,int dkey)
{
struct node *par,*ptr;
int found = 0;
ptr = root; par = NULL;
while (ptr !=NULL)
{
if (dkey==ptr->info)
{
found = 1;
break;
}
par = ptr;
if(dkey < ptr->info)
{
if (ptr->lthread==false)
ptr = ptr->left;
else
break;
}
else
{
if (ptr->rthread==false)
ptr = ptr->right;
else
break;
}
}
if(found==0)
printf("dkey not present in tree");
else if(ptr->lthread==false && ptr->rthread==false) /*2 children*/
root = case_c(root,par,ptr);
else if(ptr->lthread==false) /*only left child*/
root = case_b(root,par,ptr);
else if(ptr->rthread==false) /*only right child*/
root = case_b(root,par,ptr);
else /*no child*/
root = case_a(root,par,ptr);
return root;
}/*End of del()*/
In BST, for deleting a leaf node the left or right pointer of the parent was set to NULL. Here instead of setting the pointer to NULL it is made a thread.
If the leaf node to be deleted is the left child of its parent, then after deletion the left pointer of the parent should become a thread pointing to its predecessor. The node which was the inorder predecessor of the leaf node before deletion will become the inorder predecessor of the parent node after deletion.
par->lthread = true; par->left = ptr->left;
Delete 17. Inorder before : 5 10 14 16 17 20 30, after : 5 10 14 16 20 30
Figure 6.43 Deletion of 17
struct node *case_a(struct node *root, struct node *par,struct node *ptr )
{
if(par==NULL) /*key to be deleted is in root node*/
root = NULL;
else if (ptr == par->left)
{
par->lthread = true;
par->left = ptr->left;
}
else
{
par->rthread = true;
par->right = ptr->right;
}
free(ptr);
return root;
}7/*End of case_a()*/
If the node to be deleted has a left subtree, then after deletion the right thread of its predecessor should point to its successor.
Delete 16. Inorder before : 5 10 13 14 15 16 20 30, after : 5 10 13 14 15 20 30
}/*End of case_b()*/
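Only the closing line of the case_b() listing survives here. The following is an assumed reconstruction sketched from the rules just described, not the book's exact listing; the bodies of in_succ() and in_pred() (declared at the top of the program) are included so the sketch is self-contained:

```c
#include<stdlib.h>

typedef enum {false,true} boolean;

struct node
{
	struct node *left;
	boolean lthread;
	int info;
	boolean rthread;
	struct node *right;
};

/*Inorder successor : the right thread itself, or the leftmost node of the right subtree*/
struct node *in_succ(struct node *p)
{
	if(p->rthread==true)
		return p->right;
	p = p->right;
	while(p->lthread==false)
		p = p->left;
	return p;
}

/*Inorder predecessor : the left thread itself, or the rightmost node of the left subtree*/
struct node *in_pred(struct node *p)
{
	if(p->lthread==true)
		return p->left;
	p = p->left;
	while(p->rthread==false)
		p = p->right;
	return p;
}

/*Sketch : delete node ptr which has exactly one child; par is its parent*/
struct node *case_b(struct node *root,struct node *par,struct node *ptr)
{
	struct node *child,*s,*p;
	if(ptr->lthread==false)      /*node has only a left child*/
		child = ptr->left;
	else                         /*node has only a right child*/
		child = ptr->right;
	s = in_succ(ptr);            /*find these before unlinking ptr*/
	p = in_pred(ptr);
	if(par==NULL)                /*node to be deleted is the root*/
		root = child;
	else if(ptr==par->left)
		par->left = child;
	else
		par->right = child;
	if(ptr->lthread==false)      /*left subtree : predecessor's right thread now points to successor*/
		p->right = s;
	else                         /*right subtree : successor's left thread now points to predecessor*/
		s->left = p;
	free(ptr);
	return root;
}
```

When the deleted node had only a left subtree, the right thread of its predecessor is redirected to the deleted node's successor; when it had only a right subtree, the left thread of its successor is redirected to the predecessor, so the inorder chain of threads stays intact.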
struct node *case_c(struct node *root,struct node *par,struct node *ptr)
{
struct node *succ,*parsucc;
parsucc = ptr; succ = ptr->right; /*find inorder successor and its parent*/
while(succ->lthread==false)
{
parsucc = succ; succ = succ->left;
}
ptr->info = succ->info;
if(succ->lthread==true && succ->rthread==true)
root = case_a(root,parsucc,succ);
else
root = case_b(root,parsucc,succ);
return root;
}/*End of case_c()*/
Header node
In an empty tree, the left pointer of header node will be a thread pointing to itself.
(a) Average number of comparisons to reach a node = (1+2+3+4+5+6+7+8+9)/9 = 45/9 = 5
(b) Average number of comparisons to reach a node = (1+2+2+3+3+3+3+4+4)/9 = 25/9 = 2.77
Figure 6.47
In tree (a), we need one comparison to reach node 1, two comparisons to reach node 2, nine comparisons to
reach node 9. Therefore, the average number of comparisons to reach a node is 5. In tree (b), we need one
comparison to reach node 4, two comparisons each to reach node 2 or 6, three comparisons each to reach node
1, 3, 5 or 8, four comparisons each to reach node 7 or 9. In this tree, average number of comparisons is 2.77.
We can see that the tree (b) is better than tree (a) from searching point of view. In fact tree (a) is a form of
linear list and searching is O(n). Both the trees have same data but their structures are different because of
different sequence of insertion of elements. It is not possible to control the order of insertion so the concept of
height balanced binary search trees came in. The technique for balancing a binary search tree was introduced by
Russian mathematicians G. M. Adelson-Velskii and E. M. Landis in 1962. The height balanced binary search
tree is called AVL tree in their honour.
The main aim of AVL tree is to perform efficient search, insertion and deletion operations. Searching is
efficient when the heights of left and right subtrees of the nodes are almost same. This is possible in a full or
complete binary search tree, which is an ideal situation and is not always achievable. This ideal situation is very
nearly approximated by AVL trees.
An AVL tree is a binary search tree where the difference in the height of left and right subtrees of any node
can be at most 1. Let us take a binary search tree and find out whether it is AVL tree or not.
Figure 6.48
For the leaf nodes 12, 45, 65 and 96, left and right subtrees are empty so difference of heights of their subtrees is 0.
For node 20, height of left subtree is 2 and height of right subtree is 3, so difference is 1.
For node 15, height of left subtree is 1 and height of right subtree is 0, so difference is 1.
For node 56, height of left subtree is 1 and height of right subtree is 2, so difference is 1.
For node 78, height of left subtree is 1 and height of right subtree is 1, so difference is 0.
In this binary search tree, the difference in the height of left and right subtrees of any node is at most 1 and
so it is an AVL tree. Now let us take another binary search tree.
226 Data Structures through C in Depth
Figure 6.49
For leaf nodes 2 and 30, left and right subtrees are empty so difference is 0.
For node 10, height of left subtree is 3 and height of right subtree is 3, so difference is 0.
For node 5, height of left subtree is 2 and height of right subtree is 0, so difference is 2.
For node 3, height of left subtree is 1 and height of right subtree is 0, so difference is 1.
For node 16, height of left subtree is 0 and height of right subtree is 2, so difference is 2.
For node 28, height of left subtree is 0 and height of right subtree is 1, so difference is 1.
This tree is not an AVL tree since there are two nodes for which difference in heights of left and right
subtrees exceeds 1.
Each node of an AVL tree has a balance factor, which is defined as the difference between the heights of left
subtree and right subtree of a node.
Balance factor of a node = Height of its left subtree - Height of its right subtree
From the definition of AVL tree, it is obvious that the only possible values for the balance factor of any node
are -1, 0, 1.
A node is called right heavy or right high if height of its right subtree is one more than height of its left
subtree. A node is called left heavy or left high if height of its left subtree is one more than height of its right
subtree. A node is called balanced if the heights of its right and left subtrees are same. The balance factor is 1
for left high, -1 for right high and 0 for balanced node.
So while writing the program for AVL tree, we will take an extra member in the structure of the tree node,
which will store the balance factor of the node.
struct node
{
struct node *lchild;
int info;
struct node *rchild;
int balance;
};
As in binary search tree, we will take an integer value in the info part of the node, which will serve as the key.
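The definition of balance factor can be cross-checked by recomputing subtree heights recursively. The following sketch is illustrative only; height(), balance_factor() and is_avl() are assumed names and are not part of the book's program, which stores the balance factor in the node instead of recomputing it:

```c
#include<stdlib.h>

struct node
{
	struct node *lchild;
	int info;
	struct node *rchild;
	int balance;
};

/*Height of a subtree, taken as 0 for an empty tree*/
int height(struct node *p)
{
	int hl,hr;
	if(p==NULL)
		return 0;
	hl = height(p->lchild);
	hr = height(p->rchild);
	return 1 + (hl>hr ? hl : hr);
}

/*Balance factor of a node = height of its left subtree - height of its right subtree*/
int balance_factor(struct node *p)
{
	return height(p->lchild) - height(p->rchild);
}

/*A binary search tree is an AVL tree if every node has balance factor -1, 0 or 1*/
int is_avl(struct node *p)
{
	int bf;
	if(p==NULL)
		return 1;
	bf = balance_factor(p);
	if(bf<-1 || bf>1)
		return 0;
	return is_avl(p->lchild) && is_avl(p->rchild);
}
```

Applied to the tree of figure 6.49, is_avl() reports failure because nodes 5 and 16 have balance factors 2 and -2.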
The three trees in figure 6.50 are examples of AVL trees. The balance factor of each node is shown outside
the node. In all these trees, the balance factor of each node is -1, 0 or 1.
Figure 6.50
Let us see some trees that are binary search trees but not AVL trees.
Figure 6.51
Tree T1    Tree T2
Figure 6.52 Right Rotation
In the tree T1, A is left child of node P and subtrees AL, AR are subtrees of node A. The subtree PR is right
subtree of node P. Since T1 is a binary search tree, we can write-
keys(AL) < key(A) < keys(AR) < key(P) < keys(PR) .......... (1)
The inorder traversal of tree T1 will be-
inorder(AL), A, inorder(AR), P, inorder (PR)
Now we perform a right rotation about the node P and the tree T1 will be transformed to tree T2 as shown in
figure 6.52. The changes that take place are -
(i) Right subtree of A becomes left subtree of P
(ii) P becomes right child of A
(iii) A becomes the new root of the tree
If tree T2 has to satisfy the property of a binary search tree then the relationship among the keys should be-
keys(AL) < key(A) < keys(AR) < key(P) < keys(PR)
We have seen that this relation is true (from (1)), so tree T2 is also a binary search tree. The inorder traversal of tree T2 will be-
inorder(AL), A, inorder(AR), P, inorder (PR)
This is the same as the inorder traversal of tree T1.
Here T1 could be a binary search tree or a subtree of any node of a binary search tree. Let us illustrate this rotation with an example.
Figure 6.53 Right rotation about node 25
Here we are performing right rotation about node 25. If we compare this figure with the figure 6.52, then
node 25 is the node P and node 15 is the node A. The subtree AL consists of nodes 4 and 6 while subtree AR
consists of only node 18. The subtree PR consists of only node 30. The changes that take place dueto right
rotation are-
(i) Right subtree of node 15 becomes left subtree of node 25.
(ii) Node 25 becomes the right child of node 15.
(iii) The subtree which was rooted at node 25 is now rooted at node 15. So previously, the root of left subtree of
node 50 was node 25, while after rotation the root of left subtree of node 50 is node 15.
Let us see two more examples of right rotation.
Here right rotation is performed about node 54. The subtree PR consists of nodes 59 and 60, subtree AL
consists of nodes 29, 15, 30, 8 and subtree AR consists of nodes 45, 39, 52.
Here right rotation is performed about node 50. The subtrees PR and AR are empty while subtree AL
consists of node 15.
The function for right rotation is simple; it takes a pointer to node P, performs rotation and returns pointer to
node A which is the new root of the subtree initially rooted at P.
struct node *RotateRight(struct node *pptr)
{
struct node *aptr;
aptr = pptr->lchild; /*A is left child of P*/
pptr->lchild = aptr->rchild; /*Right child of A becomes left child of P*/
aptr->rchild = pptr; /*P becomes right child of A*/
return aptr; /*A is the new root of the subtree initially rooted at P*/
}/*End of RotateRight()*/
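The pointer manipulations can be exercised in isolation. The demo below repeats RotateRight() together with an illustrative make_node() helper (the helper is not from the book) and uses a simplified form of the subtree of figure 6.53:

```c
#include<stdlib.h>

struct node
{
	struct node *lchild;
	int info;
	struct node *rchild;
	int balance;
};

struct node *RotateRight(struct node *pptr)
{
	struct node *aptr;
	aptr = pptr->lchild;         /*A is left child of P*/
	pptr->lchild = aptr->rchild; /*Right child of A becomes left child of P*/
	aptr->rchild = pptr;         /*P becomes right child of A*/
	return aptr;                 /*A is the new root of the subtree*/
}

/*Illustrative helper : allocate a leaf node*/
struct node *make_node(int info)
{
	struct node *p = (struct node *)malloc(sizeof(struct node));
	p->info = info;
	p->lchild = NULL; p->rchild = NULL;
	p->balance = 0;
	return p;
}
```

Rotating right about 25 in the subtree 25(15(6,18),30) gives 15(6,25(18,30)); the inorder sequence 6 15 18 25 30 is unchanged, as the discussion above requires.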
Figure 6.56 Left Rotation
In the tree T1, A is right child of node P and subtrees AL, AR are subtrees of node A. The subtree PL is left
subtree of node P. Since T1 is a binary search tree, we can write-
keys(PL) < key(P) < keys(AL) < key(A) < keys(AR) .......... (2)
The inorder traversal of this tree would be-
inorder(PL), P, inorder(AL), A, inorder(AR)
Now we perform a left rotation about the node P and the tree T1 will be transformed to tree T2. The changes
that take place are -
(i) Left subtree of A becomes right subtree of P.
(ii) P becomes left child of A.
(iii) A becomes the new root of the tree.
If tree T2 has to satisfy the property of a binary search tree then the relationship among the keys should be-
keys(PL) < key(P) < keys(AL) < key(A) < keys(AR)
We have seen that this relation is true (from (2)), so tree T2 is also a binary search tree. The inorder traversal of
binary search tree T2 will be-
inorder(PL), P, inorder(AL), A, inorder(AR)
This is the same as the inorder traversal of the tree T1. Here is an example of left rotation.
Here we are performing left rotation about node 35. If we compare it with the figure 6.56, then node 35 is
node P, node 45 is node A. The subtree AL consists of only node 40 and subtree AR consists of nodes 49 and
55. The subtree PL consists of only node 25. The changes that take place due to rotation are-
(i) Left subtree of node 45 becomes right subtree of node 35.
(ii) Node 35 becomes the left child of node 45
(iii) The subtree which was rooted at node 35 is now rooted at node 45.
Let us see two more examples of left rotation.
Here we are performing left rotation about node 65. The subtree PL consists of nodes 60, 53 and 62, subtree AL
consists of nodes 72, 69, 74 and subtree AR consists of nodes 80, 79, 85, 90.
Figure 6.59 Left rotation about node 15
Here we are performing left rotation about node 15. The subtrees PL and AL are empty while subtree AR
consists of node 50. The function for left rotation is-
struct node *RotateLeft(struct node *pptr)
{
struct node *aptr;
aptr = pptr->rchild; /*A is right child of P*/
pptr->rchild = aptr->lchild; /*Left child of A becomes right child of P*/
aptr->lchild = pptr; /*P becomes left child of A*/
return aptr; /*A is the new root of the subtree initially rooted at P*/
}/*End of RotateLeft()*/
Now we will see how to perform insertion and deletion operations in AVL tree. The main() function and
other function declarations required for the AVL tree program are given next. The definitions of the functions
are given with the explanation of the procedures.
/*P6.6 Program of AVL tree*/
#include<stdio.h>
#include<stdlib.h> Na,
#define FALSE 0 }
#define TRUE 1
struct node
{
struct node *lchild;
int info;
struct node *rchild;
int balance;
};
void inorder(struct node *ptr);
struct node *RotateLeft(struct node *pptr);
struct node *RotateRight (struct node *pptr);
struct node *insert(struct node *pptr,int ikey);
struct node *insert_left_check(struct node *pptr,int *ptaller);
struct node *insert_right_check(struct node *pptr,int *ptaller);
struct node *insert_LeftBalance(struct node *pptr);
struct node *insert_RightBalance(struct node *pptr);
struct node *del(struct node *pptr, int dkey) ;
struct node *del_left_check(struct node *pptr,int *pshorter) ;
struct node *del_right_check(struct node *pptr,int *pshorter) ;
struct node *del_LeftBalance(struct node *pptr, int *pshorter) ;
struct node *del_RightBalance(struct node *pptr,int *pshorter);
main()
{
int choice, key;
struct node *root = NULL;
while(1)
{
printf("\n");
printf("1.Insert\n") ;
printf("2.Delete\n");
printf("3.Inorder Traversal\n") ;
printf("4.Quit\n");
printf("Enter your choice : ");
scanf("%d", &choice) ;
switch(choice)
{
case 1:
printf ("Enter the key to be inserted : ");
scanf ("%d", &key) ;
root = insert (root,key) ;
break;
case 2:
printf("Enter the key to be deleted : ");
scanf ("%d", &key) ;
root = del(root, key) ;
break;
case 3:
inorder (root) ;
break;
case 4:
exit (1);
default:
printf ("Wrong choice\n") ;
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
Figure 6.61 [(i) 77 inserted in T, (ii) 58 inserted in T, (iii) 35 inserted in T, (iv) 99 inserted in T; the tree diagrams with balance factors are not reproduced]
The nodes are inserted in appropriate place following the same procedure as in binary search tree. In all the
four cases of figure 6.61, we can see that the balance factors of only ancestor nodes are affected i.e. only those
nodes are affected which are in the path from inserted node to the root node. If the value of balance factor of
each node on this path is -1, 0, or 1 after insertion, then the tree is balanced and has not lost its AVL tree
property after the insertion of new node. Therefore, there is no need to do any extra work. If the balance factor
of any node in this path becomes 2 or -2 then the tree becomes unbalanced. The first node on this path that
becomes unbalanced(balance factor 2 or -2) is marked as the pivot node. So a pivot node is the nearest
unbalanced ancestor of the inserted node.
In the first two cases(insertion of 77 and 58) the tree remains AVL after the insertion of new node, while in
the last two cases(insertion of 35 and 99) the balance factors of some nodes become 2 or -2, thus violating the
AVL property. In third case 48 is marked as the pivot node since it is the nearest unbalanced ancestor of the
inserted node, similarly in fourth case 94 is marked as the pivot node.
After the insertion of new node in the tree, we examine the nodes in the path starting from newly inserted
node to the root. For each node in this path, the following three cases can occur-
(A) Before insertion the node was balanced(0) and now it becomes left heavy(1) or right heavy(-1). The height
of subtree rooted at this node changes and hence the balance factor of the parent of this node will change so we
need to check the balance factor of the parent node. So in this case, we update the balance factor of the node and
then check the balance factor of the next node in the path.
(B) Before insertion the node was left heavy(1) or right heavy(-1) and the insertion is done in the shorter
subtree, so the node becomes balanced(0). We will update the balance factor of the node. The height of subtree
rooted at this node does not change and hence the balance factors of the ancestors of this node will remain
unchanged. Therefore, we can stop the procedure of checking balance factors i.e. there is no need to go upto the
root and examine balance factors of all the nodes.
(C) Before insertion the node was left heavy(1) or right heavy(-1), and the insertion is done in the heavy
subtree, so the node becomes unbalanced( 2 or -2). Since the balance factor of 2 or -2 is not permitted in an
AVL tree, we will not directly update the balance factor unlike the other two cases. This node is marked as the
pivot node and balancing is required. The balancing is done by performing rotations at this pivot node and
updating the balance factors accordingly. The balancing is done in such a way that the height of the subtree
rooted at pivot node is same before and after insertion. Therefore, the balance factors of ancestors will not
change and we can stop the procedure of checking balance factors.
Now let us apply these cases in the four insertions that we have done-
(i) Insertion of 77 in tree T — Nodes to be checked are 78, 80 and 72. Firstly, node 78 is examined, before
insertion it was right heavy and now it has become balanced, this is Case B, we’ll change the balance factor to 0
and stop checking of balance factors. Balance factors of all the ancestors of this node will remain same as they
were before insertion. This is because the height of the subtree rooted at 78 did not change after insertion.
(ii) Insertion of 58 in tree T — Nodes to be checked are 62, 56 and 72. Firstly, node 62 is examined, before
insertion it was balanced and now it has become left heavy, this is Case A, so we’ll change the balance factor to
1 and now check node 56. Node 56 was left heavy before insertion and now it became balanced(Case B), so
we’ll change the balance factor of node 56 to 0 and now we can stop procedure of checking balance factors.
(iii) Insertion of 35 in tree T — Nodes to be checked are 29, 48, 56 and 72. Firstly node 29 is examined, before
insertion it was balanced and now it has become right heavy, this is Case A, so we’ll change the balance factor
to -1 and check the next node which is 48. This node was left heavy before insertion and insertion is done in its
left subtree so now it has become unbalanced(pivot node), this is case C so now balancing will have to be done
and after balancing we can stop checking of balance factors.
(iv) Insertion of 99 in T — Nodes to be checked are 98, 94, 80 and 72. Firstly node 98 is examined, before
insertion it was balanced and now it has become right heavy, this is Case A, so we’ll change the balance factor
to -1 and check next node which is 94. This node was right heavy before insertion, and insertion is done in its
right subtree so now it has become unbalanced(pivot node), this is case C so now balancing will have to be done
and after balancing we can stop checking of balance factors.
This was an outline of the insertion process; now we will look at how we can write the function for insertion. In binary search tree, we had written both recursive and non-recursive functions for insertion. Here we will write the recursive version because we can easily check the balance factors of ancestors in the unwinding phase. We will take the recursive insert function of BST as the base and make some additions in it so that the tree remains balanced after insertion. Here is the function for insertion in an AVL tree-
struct node *insert(struct node *pptr,int ikey)
{
static int taller;
if(pptr==NULL) /*Base case*/
{
pptr = (struct node *)malloc(sizeof(struct node));
pptr->info = ikey;
pptr->lchild = NULL; pptr->rchild = NULL;
pptr->balance = 0;
taller = TRUE;
}
else if(ikey < pptr->info)
{
pptr->lchild = insert(pptr->lchild,ikey);
if(taller==TRUE)
pptr = insert_left_check(pptr,&taller);
}
else if(ikey > pptr->info)
{
pptr->rchild = insert(pptr->rchild,ikey);
if(taller==TRUE)
pptr = insert_right_check(pptr,&taller);
}
else /*Duplicate key*/
{
printf("Duplicate key\n");
taller = FALSE;
}
return pptr;
}/*End of insert()*/
There are two recursive calls in this function. After the recursion stops, we will pass through each node in the
path starting from inserted node to the root node i.e. in the unwinding phase we will pass through each ancestor of the inserted node.
The functions insert_left_check() and insert_right_check() are written after the recursive calls
so they will be called in the unwinding phase. The function insert_left_check() will be called to check
and update the balance factors when insertion is done in left subtree of the current node. If insertion is done in
right subtree of the current node then the function insert_right_check() will be called for this purpose.
These functions are not always called; they are called only when the flag taller is TRUE. Now let us see what this taller is and why these functions are called conditionally.
In the three cases that we have studied earlier, we know that we can stop checking if case B or case C occurs
at any ancestor node. So taller is a flag which is initially set to true when the node is inserted and when any
of the last two cases(B or C) are encountered we will make it false(inside insert_left_check() or insert_right_check()). The process of checking balance factors stops when taller becomes false.
Now we will try to understand the working of this function with the help of an example. Let us take the
example of insertion of node 58 in tree T (figure 6.61(ii)). The recursive calls are shown in the figure 6.62.
There are 4 invocations of insert () and the recursion stops when insert () is called with pptr as NULL.
In this case, a new node is allocated and its address is assigned to pptr. All the fields of this node are set to
proper values. The value of flag taller is set to true. The value of pptr is returned and the unwinding phase
begins.
The control returns to the previous invocation(3rd call) of insert() and since taller is true, the function
insert_left_check() is called(insertion in left subtree of 62). Inside this function, the appropriate action
will be taken according to the case that occurs. We have seen earlier that case A occurs at this point so inside
insert_left_check() the balance factor of node 62 will be reset. We know that when case A occurs we do
not stop the procedure of checking balance factors so the value of flag taller will remain true.
Now control returns to the previous invocation(2nd call) of insert() and since the value of taller is true,
the function insert_right_check() is called(insertion in right subtree of 56). This time case B occurs so
inside insert_right_check(), we’ll reset the balance factor of node 56 and now we have to stop the process
of checking balance factors. For this we will make the value of taller false inside the function
insert_right_check(). This will ensure that the functions insert_left_check() and
insert_right_check() will not be called in the remaining invocations of insert ().
Now control returns to the first invocation of insert () and since now the value of taller is false, the
function insert_left_check() will not be called. After this the control returns to main().
If the key to be inserted is already present, then it is not inserted in the tree and so there is no need of
checking the balance factors in the unwinding phase. Hence, the value of taller is made false in the case of
duplicate key.
Figure 6.62 [Recursive invocations of insert() for the insertion of 58 in tree T: main() calls insert() with the root (pptr->info is 72), the inner recursive calls receive the nodes 56 and 62, and the innermost call receives pptr as NULL. In this innermost call the new node is allocated and initialized: pptr->info = ikey; pptr->lchild = NULL; pptr->rchild = NULL; pptr->balance = 0; taller = TRUE;]
Now let us study in detail the different cases to be handled inside the functions insert_left_check() and
insert_right_check().
Case L_A:
Before insertion : bf(P) = 0, After insertion : bf(P) = 1
Case L_B:
Before insertion : bf(P) = -1, After insertion : bf(P) = 0
Total height = h+2 Total height = h+2
Figure 6.64
Height of the subtree rooted at node P has not increased, this means that the balance factors of parent and
ancestors of node P will not change, so we can stop the process of checking balance factors. Therefore, the
value of taller is made FALSE.
Case L_C:
Before insertion : bf(P) = 1
Figure 6.65
After insertion, the balance factor of P will become 2 and this is not a permissible value for balance factor in an
AVL tree so we will not update it. The node P becomes unbalanced and it is the first node to become
unbalanced so it is the pivot node. Now left balancing is required to restore the AVL property of the tree.
Now let us see how we can balance the tree in this case by performing rotations. We will further explore the left subtree of the pivot node. Let A be the root of subtree PL, and AL, AR be the left and right subtrees of A. Therefore, the subtree rooted at pivot node before insertion is-
Figure 6.66
The balance factor of A before insertion will definitely be zero (1 or -1 is not possible); we'll discuss the reason for it afterwards. Now we can have two cases depending on whether the insertion is done in AL or AR.
Insertion in AL    Insertion in AR
Figure 6.67
Currently we are checking the balance factor of the node P. It means that we have travelled till node A and updated all balance factors. So at this point, the balance factor of A will be 1 if insertion is done in AL or -1 if insertion is done in AR. The balance factor of the pivot node is the same as it was before insertion (it is still 1); we have not updated it to 2. Now we have two subcases depending on the balance factor of node A.
Case L_C1:
Insertion done in left subtree of left child of node P (in AL)
Before insertion : bf(P) = 1, bf(A) = 0
After insertion, updated balance factor of A = 1
A single right rotation about P is performed
Figure 6.68
Case L_C2:
Insertion done in right subtree of left child of node P (in AR)
Before insertion : bf(P) = 1, bf(A) = 0
After insertion updated balance factor of A = -1
LeftRight rotation performed
In this case, we need to explore the right subtree of A. Let B be the root of subtree AR, and let BL, BR be left
and right subtrees of B. Before insertion, the balance factor of B will definitely be zero.
Figure 6.71
Here we can have two cases depending on whether insertion is done in BL or BR. It is possible that the subtree AR is empty before insertion; in this case the newly inserted node is none other than B and this is our third case. So we have 3 subcases-
L_C2a: New node is inserted in BR ( bf(B) = -1)
L_C2b : New node is inserted in BL ( bf(B) = 1)
L_C2c : B is the newly inserted node ( bf(B) = 0)
These 3 subcases are considered just to calculate the resulting balance factor of the nodes; otherwise the
rotation performed is similar in all three cases.
A single rotation about the pivot node will not balance the tree so we have to perform double rotation here. First, we will perform a left rotation about node A, and then we will perform a right rotation about the pivot node P.
Case L_C2a:
New node is inserted in BR,
After insertion, updated balance factor of B = -1
Total height = h+2 Total height = h+3 Total height = h+3 Total height = h+2
Figure 6.72
After balancing : bf(P) = 0, bf(A) = 1, bf(B) = 0
Case L_C2b:
New node is inserted in BL
After insertion, updated balance factor of B = 1
Total height = h+2 Total height = h+3 Total height = h+3 Total height = h+2
Figure 6.73
After balancing : bf(P) = -1, bf(A) = 0, bf(B) = 0
Case L_C2c:
B is the newly inserted node, so balance factor of B = 0
In the figure 6.71, if the newly inserted node is B, it means that AR was empty before insertion or we can say
that value of h is zero. Since AL and PR also have height h, they will also be empty before insertion.
Figure 6.74
After balancing : bf(P) = 0, bf(A) = 0, bf(B) = 0
Here are two examples of case L_C2 (LeftRight rotation).
In all the cases of left balancing, the height of the subtree rooted at the pivot node does not change (the height before insertion and after balancing is h+2) and so taller is made false. The function insert_left_check() is-
struct node *insert_left_check(struct node *pptr,int *ptaller)
{
switch (pptr->balance)
{
case 0: /*Case L_A : was balanced*/
pptr->balance = 1; /*now left heavy*/
break;
case -1: /*Case L_B: was right heavy*/
pptr->balance = 0; /*now balanced*/
*ptaller = FALSE;
break;
case 1: /*Case L_C: was left heavy*/
pptr = insert_LeftBalance(pptr) ; /*Left Balancing*/
*ptaller = FALSE;
}
return pptr;
}/*End of insert_left_check()*/
struct node *insert_LeftBalance(struct node *pptr)
{
struct node *aptr,*bptr;
aptr = pptr->lchild;
if(aptr->balance==1) /*Case L_C1 : Insertion in AL*/
{
pptr->balance = 0;
aptr->balance = 0;
pptr = RotateRight(pptr);
}
else /*Case L_C2 : Insertion in AR*/
{
bptr = aptr->rchild;
switch (bptr->balance)
{
case -1: /*Case L_C2a : Insertion in BR*/
pptr->balance = 0;
‘aptr->balance = 1;
break;
case. 1: /*Case L_C2b : Insertion in BL*/
pptr->balance = -1;
aptr->balance = 0;
break;
case 0: /*Case L_C2c : Bis the newly inserted node*/
pptr->balance = 0;
aptr->balance = 0;
}
bptr->balance = 0;
pptr->lchild = RotateLeft(aptr);
pptr = RotateRight(pptr);
}
return pptr;
}/*End of insert_LeftBalance()*/
In the function insert_LeftBalance() we have updated the balance factors before performing rotations. This
is because rotations are done through functions RotateRight() and RotateLeft() and after calling these
functions, pptr no longer points to node P but it points to the new root of the subtree.
The following information in the box summarizes all of the cases explained, and it includes the situation
when the insertion is done in right subtree.
Recraem mommmenmanare wig a cigar
2,2,8 202 SABIAN Moshe RARE SSRASM
ore Pa Bas IES IGI PalARIE HII Hh ARLHS OD, 8 te ons 6 8wo SREP Sage PELE IS PP MAREE IA SERRE EIIS BRIER OREM IHele
ae
A. new node is inserted and taller = TRUE
<
P is the node whose balance factor is being checked.
=5
&eS
If Insertion in left subtree of the node P (Insertion in PL)
Case L_A: If node P was balanced before insertion
Nodz P becomes left heavy
Case L_B : If node P was right heavy before insertion
Node P becomes balanced, taller = FALSE
Case L_C : If node P was left heavy before insertion
Node P becomes unbalanced, Left Balancing required, taller = FALSE
P is the pivot node and its left child is A
' Case L_C1 : If insertion in left subtree of A (in AL )
IE
TERRE
BPEL
SEER Right Rotation (right about P)
Case L_C2 : If insertion in right subtree of A (in AR )
LeftRight Rotation(left about A then right about P)
B is the right child of A
Case L_C2a : If insertion in right subtree of B (in BR)
bf(P) = 0, bf(A) = 1
Case L_C2b: If insertion in left subtree of B( in BL )
bf(P) =-1, bf(A) = 0
Case L_C2c : If B is the newly inserted node.
bf(P) = 0, bf(A) = 0
Case R_B:
Before insertion : bf(P) = 1
Case R_C1:
Insertion done in right subtree of right child of node P (in AR).
After insertion updated balance factor of A = -1
Left rotation about node P is performed.
Figure 6.83
There can be 3 subcases-
R_C2a: New node is inserted in BR ( bf(B) = -1)
R_C2b : New node is inserted in BL ( bf(B) = 1)
R_C2c : B is the newly inserted node ( bf(B) = 0)
Like case L_C2, here also a double rotation is performed. First we perform a right rotation about node A, and then a left rotation about the pivot node P. We can see that this is the mirror image of the LeftRight rotation of case L_C2.
Case R_C2a :
New node is inserted in BR.
After insertion updated balance factor of B = -1
Total height = h+3 Total height = h+3 Total height = h+2 Total height = h+2
Figure 6.84
After balancing : bf(P) = 1, bf(A) = 0, bf(B) = 0
Case R_C2b :
New node is inserted in BL
After insertion updated balance factor of B = 1
Total height = h+2 Total height = h+3 Total height = h+3 Total height = h+2
Figure 6.85
After balancing : bf(P) = 0, bf(A) = -1, bf(B) =0
Case R_C2c:
B is the newly inserted node, so balance factor of B = 0
Figure 6.86
pptr->balance = 0;
aptr->balance = 0;
pptr = RotateLeft(pptr) ;
Insert 40
Insert 35
P : 40, Case L_A
Insert 50
Insert 58
P : 50, Case R_A
P : 40, Case R_A
Insert 48
P : 50, Case L_B
Insert 42
P : 48, Case L_A
P : 50, Case L_A
P : 40, Case R_C2b
Rotate right about 50, rotate left about 40
Insert 60
P : 58, Case R_A
P : 50, Case R_C1
Rotate left about 50
Insert 30
P : 35, Case L_A
P : 40, Case L_A
P : 48, Case L_A
Insert 33
P : 30, Case R_A
P : 35, Case L_C2c
Rotate left about 30, rotate right about 35
Insert 25
P : 30, Case L_A
P : 33, Case L_A
P : 40, Case L_C1
Rotate right about 40
There is no need to remember all the cases while inserting in an AVL tree. We just need to update the balance factors of the ancestor nodes, and if any node becomes unbalanced we can perform the appropriate rotations.
(A) Before deletion the node was balanced and after deletion it becomes left heavy or right heavy. We will
update the balance factor of the node. In this case, the balance factors of the ancestors of this node will remain
unchanged. Therefore, we can stop the procedure of checking balance factors.
(B) Before deletion, the node was left heavy or right heavy and the deletion is performed in the heavy subtree,
so after deletion the node becomes balanced. We will update the balance factor of the node and then check the
balance factor of the next node in the path.
(C) Before deletion, the node was left heavy or right heavy and the deletion is performed in the shorter subtree, so after deletion the node becomes unbalanced. In this case balancing is required, which is performed using the same rotations that were done in insertion. There are 3 subcases in case C. In one subcase, we will stop the procedure of checking balance factors, while in the other two subcases we will continue our checking.
The function for deletion of a node from AVL tree is shown below-
struct node *del(struct node *pptr, int dkey)
{
	struct node *tmp, *succ;
	static int shorter;
	if (pptr == NULL)
	{
		shorter = FALSE; /*item not found, no checking of balance factors needed*/
		return pptr;
	}
	if (dkey < pptr->info)
	{
		pptr->lchild = del(pptr->lchild, dkey);
		if (shorter == TRUE)
			pptr = del_left_check(pptr, &shorter);
	}
	else if (dkey > pptr->info)
	{
		pptr->rchild = del(pptr->rchild, dkey);
		if (shorter == TRUE)
			pptr = del_right_check(pptr, &shorter);
	}
	else /*dkey == pptr->info*/
	{
		if (pptr->lchild != NULL && pptr->rchild != NULL) /*two children*/
		{
			succ = pptr->rchild;
			while (succ->lchild != NULL)
				succ = succ->lchild;
			pptr->info = succ->info;
			pptr->rchild = del(pptr->rchild, succ->info);
			if (shorter == TRUE)
				pptr = del_right_check(pptr, &shorter);
		}
		else
		{
			tmp = pptr;
			if (pptr->lchild != NULL) /*only left child*/
				pptr = pptr->lchild;
			else if (pptr->rchild != NULL) /*only right child*/
				pptr = pptr->rchild;
			else /*no children*/
				pptr = NULL;
			free(tmp);
			shorter = TRUE;
		}
	}
	return pptr;
}/*End of del()*/
This function is similar to the function for deletion in a BST, except for a few changes required to retain the balance of the tree after deletion. The flag shorter serves the same purpose as taller did in insert(). It is initially set to TRUE when the node is deleted and it is made FALSE inside del_left_check() or del_right_check(). The process of checking balance factors stops when shorter becomes FALSE. When the item to be deleted is not found in the tree, shorter is also made FALSE, because in this case we do not want any checking of balance factors in the unwinding phase.
The node P will become unbalanced after deletion, so we will not update the balance factor of P directly. P
becomes the pivot node and balancing is required. We will explore the right subtree of P. Let A be the root of
PR, and let AL, AR be left and right subtrees of A.
Note that during insertion we explored the subtree in which the insertion was done, while here we are exploring the subtree other than the one from which the node was deleted. So, if deletion is performed in the left subtree of the pivot node, then right balancing is required.
The balance factors of only those nodes which lie on the path from the deleted node to the root node will be affected by the deletion. Presently we are at the point where we have deleted the node and updated all the balance factors in PL, and now we have found that P has become unbalanced. The balance factors of nodes in PR will remain unaffected by the deletion, since PR is not in the path from the deleted node to the root. Node A is the root of PR, so its balance factor will be the same before and after deletion. Now we can have three cases depending on the three different values of the balance factor of A.
Note that in insertion we had only two cases because there the balance factor of A before insertion was definitely 0. This was so because there A was in the path that was affected by insertion. In deletion, node A could have any of the three values (-1, 0, 1) before deletion, since it is not in the path that is affected by deletion.
Case L_C1:
Before deletion : bf(P) = -1, bf(A) = 0
Left rotation about P is performed
After balancing : bf(P) = -1, bf(A) = 1
Total height = h+3 Total height = h+3 Total height = h+3
Figure 6.92
Figure 6.94
We can have three subcases depending on the three different balance factors of B. The rotation performed is the same in all three cases (RightLeft rotation) but the resulting balance factors of P and A are different.
Case L_C3a :
Before deletion : bf(P) = -1, bf(A) = 1, bf(B) = 0
Total height = h+3 Total height = h+3 Total height = h+3
Figure 6.95
After balancing : bf(P) = 0, bf(A) = 0, bf(B) =0
Case L_C3b:
Before deletion : bf(P) =-1, bf(A) = 1, bf(B) = 1
Total height = h+3 Total height = h+3 Total height = h+3 Total height = h+2
Figure 6.96
After balancing : bf(P) = 0, bf(A) = -1, bf(B) = 0
Case L_C3c:
Before deletion : bf(P) = -1, bf(A) = 1, bf(B) =-1
Total height = h+3 Total height = h+3 Total height = h+3 Total height = h+2
Figure 6.97
The height of the subtree decreases so the value of shorter remains TRUE in case L_C3.
return pptr;
}/*End of del_left_check()*/
pptr->balance = 0;
aptr->balance = 0;
pptr = RotateLeft(pptr) ;
}
else /*Case L_C3*/
{
bptr = aptr->lchild;
switch (bptr->balance)
{
case 0: /*Case L_C3a*/
pptr->balance = 0;
aptr->balance = 0;
break;
case 1: /*Case L_C3b*/
pptr->balance = 0;
aptr->balance = -1;
break;
case -1: /*Case L_C3c*/
pptr->balance = 1;
aptr->balance = 0;
}
bptr->balance = 0;
pptr->rchild = RotateRight (aptr) ;
pptr = RotateLeft(pptr);
}
return pptr;
}/*End of del_RightBalance()*/
Case R_B:
Before deletion : bf(P) = -1
Case R_C:
Before deletion : bf(P) = 1
Node P becomes unbalanced, balancing required.
Case R_C1:
Before deletion : bf(P) = 1, bf(A) =0
Right rotation about P performed
Case R_C3:
Before deletion : bf(P) = 1, bf(A) =-1
LeftRight rotation performed
Case R_C3a:
Before deletion : bf(P) = 1, bf(A) =-1, bf(B) =0
Case R_C3b:
Before deletion : bf(P) = 1, bf(A) =-1, bf(B) = 1
Total height =h+3 Total height = h+3 Total height=h+3 Total height = h+2
Figure 6.103
After balancing : bf(P) = -1, bf(A)=0, bf(B) = 0
Case R_C3c:
Before deletion : bf(P) = 1, bf(A) = -1, bf(B) = -1
switch (pptr->balance)
{
case 0: /*Case R_A : was balanced*/
pptr->balance = 1; /*now left heavy*/
*pshorter = FALSE;
break;
case -1: /*Case R_B : was right heavy*/
pptr->balance = 0; /*now balanced*/
break;
case 1: /*Case R_C : was left heavy*/
pptr = del_LeftBalance(pptr, pshorter); /*Left Balancing*/
}
return pptr;
}/*End of del_right_check()*/
struct node *del_LeftBalance(struct node *pptr,int *pshorter)
{
struct node *aptr, *bptr;
aptr = pptr->lchild;
if (aptr->balance == 0) /*Case R_C1*/
{
	pptr->balance = 1;
	aptr->balance = -1;
	pptr = RotateRight(pptr);
	*pshorter = FALSE;
}
else if (aptr->balance == 1) /*Case R_C2*/
{
	pptr->balance = 0;
	aptr->balance = 0;
	pptr = RotateRight(pptr);
}
else /*Case R_C3*/
{
	bptr = aptr->rchild;
	switch (bptr->balance)
{
case 0: /*Case R_C3a*/
pptr->balance = 0;
aptr->balance = 0;
break;
case 1: /*Case R_C3b*/
pptr->balance = -1;
aptr->balance = 0;
break;
case -1: /*Case R_C3c*/
pptr->balance = 0;
aptr->balance = 1;
}
bptr->balance = 0;
pptr->lchild = RotateLeft(aptr) ;
pptr = RotateRight (pptr) ;
}
return pptr;
}/*End of del_LeftBalance()*/
All the cases and subcases of deletion are shown in figure 6.105.
Note that in the case of insertion, after making one rotation (single or double), the variable taller was made FALSE and there was no need to proceed further. From the discussion of deletion, we can see that shorter is not always made FALSE after rotations. For example, rotations are performed in cases L_C2, L_C3, R_C2, R_C3 but shorter is not made FALSE. This means that even after performing a rotation we may have to proceed and check the balance factors of other nodes. So, in deletion one rotation may not suffice to balance the tree, unlike the case of insertion. In the worst case, each node in the path from the deleted node to the root node may need balancing.
A node is deleted and shorter = TRUE
P is the node whose balance factor is being checked.
Figure 6.105
Now we will take an AVL tree and delete some nodes from it one by one-
Delete 60
Delete 48
48 is not a leaf node, so its inorder successor 50 is copied at its place and then the node 50 is deleted from the tree.
Delete 30
P : 33, Case L_A
Delete 35
P : 33, Case R_B
P : 40, Case L_A
Delete 58
Figure 6.106
The root node is black in both trees so property P1 is satisfied for both of them. None of the red nodes has a
red child so property P3 is also satisfied for them. The property P4 is also satisfied for both the trees. The black
height of first tree is 2 and the black height of second tree is 3. Now let us see some trees that do not satisfy
these properties.
Figure 6.107
In the first tree, property P3 is violated since node 68 is red and it has a red child. We will refer to this problem as the double red problem. In the second tree, property P4 is violated for nodes 28 and 85. We will refer to this problem as the black height problem. In the third tree, property P1 is violated since the root is red.
According to property P4, all paths from a node N to any external node have the same number of black nodes. This is true for the root node as well, i.e. any path from the root node to any external node will have the same number of black nodes. Now suppose the black height of the root node is k.
If a path from the root to an external node consists of all black nodes, then it will be the shortest possible path. If in a path the black and red nodes occur alternately, then it will be the longest possible path. Since the black height of the root node is k, each path will have k black nodes. Therefore, the total number of nodes in the shortest possible path will be k and the total number of nodes in the longest possible path will be 2k. Thus we see that no path in a red black tree can be more than twice as long as another path of the tree. Hence a red black tree always remains balanced. The height of a red black tree with n internal nodes is at most 2log2(n+1).
In the structure of a node of a red black tree, we will take an extra member color that can take two values, red or black. We will also need to maintain a parent pointer that will point to the parent of the node.
struct node
{
	enum {black,red} color;
	int info;
	struct node *lchild;
	struct node *rchild;
	struct node *parent;
};
Before proceeding further, let us define some terms that we will be using in the insertion and deletion operations.
Grandparent - Grandparent of a node is the parent of the parent node.
Sibling - Sibling of a node is the other child of the parent node.
Uncle - Uncle is the sibling of the parent node.
Nephews - Nephews of a node are the children of the sibling node.
Near Nephew - If the node is a left child, then the left child of the sibling node is the near nephew, and if the node is a right child, then the right child of the sibling node is the near nephew.
Far Nephew - The nephew which is not the near nephew is the far nephew. In figure 6.108, the nephew that appears closer to the node is the near nephew and the other one is the far nephew.
Figure 6.108
In the tree given in figure 6.108, we have marked the relatives of node 70. Its parent is node 60, grandparent
is node 50, uncle is node 40, sibling is node 54, near nephew is node 55 and far nephew is node 53.
6.15.1 Searching
Since a red black tree is a binary search tree, the procedure of searching is the same as in a binary search tree. The searching operation does not modify the tree, so there is no need of any extra work.
6.15.2 Insertion
Initially the key is inserted in the tree following the same procedure as in a binary search tree. Each node in a red black tree should have a color, so now we have to decide upon the color of this newly inserted node. If we color this new node black, then it is certain that property P4 will be violated, i.e. we'll have the black height problem (you can verify this yourself). If we color the new node red, then property P1 or P3 may be violated. The property P1 will be violated if this new node is the root node, and property P3 will be violated (double red problem) if the new node's parent is red. Thus when we give red color to the new node, there are some cases where no property is violated after insertion, while if we color it black it is guaranteed that property P4 will be violated. Therefore, the better option is to color the newly inserted node red.
We have decided to color the new node red, so we will never have violation of property P4. If the node is
inserted in an empty tree, i.e. new node is the root node, then property P1 will be violated and to fix this
problem, we will have to color the new node black. If new node’s parent is black then no property will be
violated. If new node’s parent is red, then we will have a double red problem and now we will study this case in
detail.
We have two situations, depending on whether the parent is left child or right child. In both situations, we
have symmetric cases. The outline of all the subcases in this case is given below.
Parent is Red
-Parent is left child,
Case L_1: Uncle is red
Recolor
Case L_2: Uncle is Black
Case L_2a : Node is right child
Rotate left about parent and transform to case L_2b
Case L_2b : Node is left child
Recolor and Rotate right about grandparent.
-Parent is right Child
Case R_1 : Uncle is red
Recolor
Case R_2 : Uncle is black
Case R_2a : Node is left child
Rotate right about parent and transform to case R_2b
Case R_2b : Node is right child
Recolor and Rotate left about grandparent.
Now let us take all the cases one by one. In the figures, we have marked current node as N, its parent as P, its
grandparent as G, and its uncle as U. Initially the current node is the newly inserted node, but we will see that in
some cases, the double red problem is moved upwards and so the current node also moves up.
The left and right rotations performed are similar to those performed in AVL trees.
Case L_1: Parent is left child and Uncle is red
In this case, we just recolor the parent and uncle to black and recolor grandparent to red. The following figures
show this recoloring. The same recoloring is done if the node is right child.
Figure 6.109
Looking at the recolored figures it seems that now we do not have any double red problem, but remember
that this is only a part of the tree and the node D might have a parent and grandparent. If the parent of node D is
black, then we are done but if the parent of node D is red, then again we have a double red problem at node D.
We have just moved the double red problem one level up.
Now the grandparent node is the node that needs to be checked for double red problem. So after recoloring,
we make the grandparent as the new current node, we have marked it N in the recolored figure. Any of the 6
cases(L_1, L_2a, L_2b, R_1, R_2a, R_2b) may be applicable to this new current node.
It might happen that we repeatedly encounter this case(or case R_1) and we move the double red problem
upwards by recoloring and ultimately we end up coloring the root red. In that case, we can just color the root
black and our double red problem would be removed (insertion of 40 and insertion of 51 given in the examples
of insertion).
Case L_2a : Parent is left child, Uncle is black and Node is right child
Figure 6.110 fe
This case is transformed to case L_2b by performing a rotation. We perform a left rotation about the parent
node B, and after rotation node C becomes the parent node, and node B becomes the current node. Now parent
is left child, node is left child and uncle is black and this is case L_2b.
Case L_2b : Parent is left child, Uncle is black and Node is left child
Figure 6.111
In this case parent is recolored black, grandparent is recolored red and a right rotation is performed about the
grandparent node and this removes the double red problem.
The other three cases occur when the parent is a right child; they are symmetric to the previous three cases and only the figures of these cases are given.
Case R_1 : Uncle is red
Figure 6.112
Case R_2: Uncle is black
Case R_2a : Parent is right child, uncle is black and node is left child
Figure 6.113
Case R_2b : Parent is right child, uncle is black and node is right child
Figure 6.114
Now we will see examples of inserting some nodes in an initially empty red black tree. The explanation of
each insertion is given at the end. While inserting, we will first check the color of the parent, if it is black there
is nothing to be done. If parent is red, we have a double red problem and we will check the color of uncle. If
uncle is red, we will recolor and move up, and if uncle is black, we will perform appropriate rotations and
recoloring.
Insert 50    Insert 60    Insert 70    Insert 40
Insert 54
Insert 30, 45
Insert 51
6.15.3 Deletion
Initially the deletion of a node is performed following the same procedure as in a binary search tree. We know that in a BST, if the node to be deleted has two children, then its information is replaced by the information of its inorder successor and then the successor is deleted. The successor node will have either no child or only a right child. Thus the case when the node to be deleted has two children is reduced to the case when the node to be deleted has only one right child or no child.
Therefore, in our discussion, we will take the case of only those nodes that have only one right child or no child. This means that the node to be deleted is always a parent of an external node, i.e. the node to be deleted has either one external node as a left child or two external nodes as children.
Suppose A is the node that has to be deleted, then we can think of six cases some of which are not possible.
( i ) ( ii ) ( iii )
Figure 6.115
( iv ) ( v ) ( vi )
Impossible case, because this means node A was violating property P4.
So, the three possible cases are (i), (iv) and (v). The actions to be taken in these three possible cases are-
- In Case (i), node is red with no children
Delete the node
- In Case (iv), node is black with no children
Delete the node and restore the black height property.
- In Case (v), node is black and has a red child
Delete the node and color the child black
Now we will study in detail the case (iv), when the node to be deleted is a black node without any children. We will take N as the current node, i.e. it is the root of the subtree which is one black node short. Initially N will be the external node that replaced the node to be deleted, and as we proceed we might move the current node upwards, i.e. we might move the black height problem upwards. There can be different cases depending on the color of the sibling, nephews and parent.
Case 1 : N’s sibling is Red
Case 2 : N’s sibling is black, both nephews are black.
Case 2a : Parent is Red
Case 2b : Parent is Black
Case 3 : N's sibling is black, at least one nephew is red
Case 3a : Far nephew is black, other nephew will be red.
Case 3b : Far nephew is red, other nephew may be either red or black.
If the node is a left child, then we will name the cases as L_1, L_2a, L_2b, L_3a, L_3b, and if the node is a right child then we'll have symmetric cases R_1, R_2a, R_2b, R_3a, R_3b.
Case L_1: N’s sibling is Red
In this case, the parent and both nephews will definitely be black (by property P3).
Figure 6.116
A left rotation is performed and the new sibling is D which is black, so this case is now converted to case 2 or case 3.
Case L_2 : N's sibling and both nephews are black
Figure 6.117
Node B is shaded, which means that the parent can be of either color. In this case, we color the sibling red. After that, we take different steps depending on the color of the parent.
Case L_2a : Parent is red
If the parent is red, then after coloring the sibling red we have a double red problem. To remove this double red problem we color the parent black.
Figure 6.118
In the initial figure, the subtree rooted at node A was one black node short; in the final figure we have introduced a black node in its path, so now it is not a black node short. Since we colored node F red, the black height of the other subtree did not change.
Thus by removing the double red problem, we have removed the problem of shorter black height also and there is no need to proceed further. Note that when we enter case 2 from case 1, then we will enter case 2a only and not case 2b.
Case L_2b : Parent is black
Figure 6.119
In this case the sibling is colored red and now the subtree rooted at A and subtree rooted at F both are one black
node short. So we can say that the subtree rooted at B is one black node short. Now we make node B the
current node and any of the cases may apply to this node. Thus, in this case we have moved the problem of
shorter black height upwards.
Case L_3: Sibling is black, at least one nephew is red
Case L_3a : Sibling is black, Far nephew is black, other nephew will be red.
This case is converted to case L_3b by performing a right rotation.
P
Ha Figure 6.120
‘ aN
The near nephew D is recolored black, sibling F is recolored red and a right rotation is performed about the sibling node. After rotation, node A's far nephew is node F which is red, and so this case is converted to case L_3b.
Case L_3b : Sibling is black, Far nephew is red, other nephew may be either red or black.
Figure 6.121
The parent node B and far nephew G are recolored black and node F is given the color of node B. Then a left
rotation is performed about parent node and the black height problem is solved.
The figures of other symmetric cases are given next.
‘Case R_1 : N’s sibling is Red
Figure 6.122
Case R_2a : N’s sibling and both nephews are black, parent is red
Figure 6.123
Case R_2b : N’s sibling and both nephews are black, parent is black
Figure 6.124
Case R_3a: Sibling is black, Far nephew is black, other nephew will be red.
Figure 6.125
Case R_3b : Sibling is black, Far nephew is red, other nephew may be either red or black.
Figure 6.126
Now we will see some examples of deletion.
(i) Delete 55
Here the node to be deleted is a red node so there is no violation of any property after deletion.
(ii) Delete 72
Here the node to be deleted is a black node with a red child, so after deletion the child is painted black.
(iii) Delete 70
Node's sibling is red, so case R_1 applies. Perform a right rotation and now the sibling is 55, which is black. Both nephews are also black, the parent is red and this is case R_2a. Color the sibling red and the parent black and we are done.
(iv) Delete 45
Here N has a black sibling. The far nephew is red, so case R_3b applies and a right rotation and recoloring is done.
(v) Delete 53
Here node N has a black sibling, the far nephew is black and the near nephew is red, so case L_3a applies, which is converted to case L_3b.
(vii) Delete 75
Here node N's sibling is red, so first case R_1 applies, which is converted to case R_3 after a rotation. After this, case R_3a applies, which is converted to case R_3b.
(viii) Delete 25
Here node N's sibling, both nephews and the parent are black, so case L_2b applies. Sibling node 45 is colored red and then case L_3a applies, which is converted to case L_3b.
In the implementation, the external nodes are represented by a single sentinel node to save space, and this sentinel node also serves as the parent of the root node. The sentinel node is like any other node of the tree and it is colored black, while the values of its other fields are insignificant.
/*P6.7 Program of Red black tree*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
while(1)
{
printf("1.Insert\n");
printf("2.Delete\n");
printf("3.Inorder Traversal\n");
printf("4.Display\n");
printf("5.Quit\n");
printf("Enter your choice : ");
scanf ("%d",&choice) ;
switch(choice)
{
case 1:
printf("Enter the number to be inserted : ");
scanf("%d", &num);
insert (num) ;
break;
case 2:
printf("Enter the number to be deleted : ");
scanf("%d", &num);
del (num) ;
break;
case 3:
inorder (root) ;
break;
case 4:
display (root,1);
break;
case 5:
exit(1);
default:
printf ("Wrong choice\n");
}/*End of switch */
}/*End of while */
}/*End of main()*/
int find(int item,struct node **loc)
{
if(par == grandPar->rchild)
uncle = grandPar->lchild;
if (uncle->color == red) /*Case R_1*/
{
par->color = black;
uncle->color = black;
grandPar->color = red;
nptr = grandPar;
}
}
root->color = black;
}/*End of insert_balance()*/
else
sib = nptr->parent->lchild;
if( sib->color == red )/*Case R_1*/
{
sib->color = black;
nptr->parent->color = red;
RotateRight (nptr->parent) ;
sib = nptr->parent->lchild;
}
if (sib->rchild->color==black && sib->lchild->color==black)
{
sib->color=red;
if (nptr->parent->color == red) /*Case R_2a*/
{
nptr->parent->color = black;
return;
}
else S
nptr=nptr->parent; /*Case R_2b*/
}
else
}
if (pptr->parent == sentinel)
root = aptr;
else if(pptr == pptr->parent->1child)
pptr->parent->lchild = aptr;
else
pptr->parent->rchild = aptr;
aptr->lchild = pptr;
pptr->parent = aptr;
}/*End of RotateLeft()*/
Figure 6.129
Node 45 does not have a right child because 2*6+1 = 13 is greater than the heap size, which is 12. In location 0 of the array we can store a sentinel value.
Now we will see how to insert and delete elements from a heap, but before that let us have a look at the main() function and other declarations needed for our program of heap.
/*P6.8 Program for insertion and deletion in heap*/
#include <stdio.h>
#define MAX_VAL 9999 /*All values in heap must be less than this value*/
void insert(int num, int arr[],int *p_hsize);
int del_root(int arr[], int *p_hsize);
void restoreUp(int arr[], int loc);
void restoreDown(int arr[],int i, int size);
void buildHeap(int arr[],int size );
main()
{
We have taken an array named arr which will be used to represent the heap. The variable hsize is used to denote the number of elements in the heap and it is initialized to zero. We have defined a symbolic constant MAX_VAL which is stored in the 0th index of the array and it serves as the sentinel value. All the values in the heap should be less than this constant.
The function display () can be written as-
void display(int arr[], int hsize)
{
	int i;
	if (hsize == 0)
	{
		printf("Heap is empty\n");
		return;
	}
	for (i=1; i<=hsize; i++)
		printf("%d ", arr[i]);
	printf("\n");
	printf("Number of elements = %d\n", hsize);
}/*End of display()*/
Insertion in this way ensures that the resulting tree fulfills the structure property i.e. tree remains a complete
binary tree but the second property may be violated. For example in this case we can see that 70 is greater than
its parent 46, which is a violation of the heap order property. To restore the heap order property we perform a
Insert 90
45 is less than 90, move 45 down; 80 is less than 90, move 80 down
Insert 32
Insert 32 in this heap tree; 37 is greater than 32, hence 32 is at its proper place
We can write the insert() function as-
void insert(int num, int arr[], int *p_hsize)
{
	(*p_hsize)++;
	arr[*p_hsize] = num;
	restoreUp(arr, *p_hsize);
}/*End of insert()*/
void restoreUp(int arr[], int loc)
{
	int i = loc;
	int k = arr[i];
	int par = i/2;
	while (arr[par] < k)
	{
		arr[i] = arr[par];
		i = par;
		par = i/2;
	}
	arr[i] = k;
}/*End of restoreUp()*/
We come out of the while loop when the exact place for the key is found. Suppose the key is the largest and has to be placed in the root node; the values of i and par will be 1 and 0 respectively. In this case also the while loop terminates, because we have stored a very large sentinel value in location 0 of the array. If we do not store this value in arr[0], then we have to add one more condition in the while loop-
while(par>=1 && arr[par]<k)
So the use of the sentinel avoids checking of one condition per loop iteration. Now we will take some keys and insert them in an initially empty heap tree.
Figure 6.131
For insertion we move along a single path from a leaf node towards the root, hence the complexity is O(h). The height of a heap is ⌊log2(n+1)⌋, so the complexity is O(log n). The best case of insertion is when the key can be inserted at the last position, i.e. there is no need to move it up. The worst case is when the key to be inserted is the maximum; it needs to be inserted in the root.
6.16.2 Deletion
Any node can be deleted from the heap, but deletion of the root node is the meaningful operation because it contains the maximum value.
Suppose we have to delete the root node from a heap of size n. The key in the root can be assigned to some variable so that it can be processed. Then we copy the key of the last leaf node, i.e. the key in location arr[n], to the root node. After this the size of the heap is decreased to n-1, hence the last leaf node is deleted from the heap. Let us see how we delete the root from the heap in the given figure.
The resulting structure after deleting the root in this way is a complete binary tree, so the first property still holds true, but the heap order property may be violated if the key in the root node is smaller than any of its children. In the above example, the key 37 is not at its proper place because it is smaller than both of its children. The procedure restoreDown will move the key down and place it at its proper place.
Suppose the key k violates the heap order property. Compare k with both the left and right child. If both children are smaller than k, then we are done. If one child is greater than k, then move this greater child up. If both left and right children are greater than k, then move the larger of the two children up. After moving the child up, we try
to insert the key k in this child’s place and for this we compare it with its new children. The procedure stops
when both children of k are smaller than k, or when we reach a leaf node.
Now let us see how we can place the key 37 in the appropriate place using the procedure restoreDown.
Let us see one more example of deletion of root from the heap.
int del_root(int arr[], int *p_hsize)
{
	int max = arr[1]; /*Save the element present at the root*/
	arr[1] = arr[*p_hsize]; /*Place the last element in the root*/
	(*p_hsize)--; /*Decrease the heap size by 1*/
	restoreDown(arr,1,*p_hsize);
	return max;
}/*End of del_root()*/
void restoreDown(int arr[],int i,int hsize)
{
	int num = arr[i];
	int lchild = 2*i, rchild = 2*i+1;
	while(rchild <= hsize)
	{
		if(num >= arr[lchild] && num >= arr[rchild])
		{
			arr[i] = num;
			return;
		}
		if(arr[rchild] >= arr[lchild])
		{
			arr[i] = arr[rchild];  i = rchild;
		}
		else
		{
			arr[i] = arr[lchild];  i = lchild;
		}
		lchild = 2*i;  rchild = 2*i+1;
	}
	/*If number of nodes is even*/
	if(lchild==hsize && num<arr[lchild])
	{
		arr[i] = arr[lchild];  i = lchild;
	}
	arr[i] = num;
}/*End of restoreDown()*/
When the number of nodes in the heap is odd, all nodes have either 2 children or are leaf nodes, and when the number of nodes is even then there is one node that has only a left child. For example if the number of nodes is 10, then the node at index 5 has only a left child, which is at index 10. In the function restoreDown(), we check for this condition separately.
For deletion we move on a single path from the root node towards a leaf, hence the complexity is O(h). The height of a heap is ⌊log2(n+1)⌋, so the complexity is O(log n).
To build a heap from an array of n elements, we insert arr[2] in a heap of size 1, then arr[3] in a heap of size 2, and so on till arr[n] is inserted in a heap of size n-1, and finally we get a heap of size n.
Since the elements are already in an array, there is no need to call insert(); we just increase the size of the heap and call restoreUp() for the next element of the array.
void buildHeap(int arr[],int size)
{
	int i;
	for(i=2; i<=size; i++)
		restoreUp(arr,i);
}/*End of buildHeap()*/
This method of building a heap by inserting elements one by one is the top down approach. The worst case is when the data is in ascending order, i.e. each new key has to rise up to the root, so the worst case time for this approach is O(n log n). We can build a heap in O(n) time by using a bottom up approach.
In this method the array is considered to be representing a complete binary tree. Suppose we have an array of 11 elements stored in arr[1]........arr[11] and we want to convert this array into a heap.
(Figure: the 11 elements stored in arr[1]...arr[11])
We can think of this array representing a complete binary tree as shown in figure 6.133.
Figure 6.133
Now this structure satisfies the first property. To convert it to a heap we have to make sure that the second
property is also satisfied. For this we start from the first non leaf node and call restoreDown() for each node
of the heap till the root node.
Figure 6.134
The first non leaf node is always present at index floor(n/2) of the array (if the root is stored at index 1 and the heap size is n). So the function restoreDown() is called for all the nodes with indices floor(n/2), floor(n/2)-1, floor(n/2)-2, ......, 2, 1. In our example the heap size is 11, hence the first non leaf node is 46, present at index 5. So in this case restoreDown() is called for arr[5], arr[4], arr[3], arr[2], arr[1].
First restoreDown() is called for 46. The right child of 46 is 95 which is greater than 46, so 95 is moved up and 46 is moved down. Similarly restoreDown() is called for 9, 18, 35 and 25 and all of them are placed at their proper places. Finally we get a heap of size 11. The function buildHeap() using the bottom up approach is-
void buildHeap(int arr[],int size)
{
	int i;
	for(i=size/2; i>=1; i--)
		restoreDown(arr,i,size);
}/*End of buildHeap()*/
This was the procedure of building a heap through bottom up approach; now let us understand how we are able
to get a heap by applying this procedure.
The heap is constructed by making smaller subheaps from the bottom up. Each leaf node is a heap of size 1, so we start working from the first non leaf node. Since we are working from bottom to top, whenever we analyze a node, its left and right subtrees will be heaps. Starting from the first non leaf node, each node N is considered as the root of a subtree whose left and right subtrees are heaps, and restoreDown() is called for that node so that the whole subtree rooted at node N also becomes a heap.
There are three main applications of heap-
1. Selection algorithm
2. Implementation of priority queue
3. Heap sort
where Wi denotes the weight and Pi denotes the path length of an external node.
If we create different trees that have the same weights on external nodes, then it is not necessary that they have the same external weighted path length.
A Huffman tree is built from the bottom up rather than top down, i.e. the creation of trees starts from the leaf nodes and proceeds upwards.
Suppose we have n elements with weights w1, w2, ......, wn and we want to construct a Huffman tree for this set of weights.
For each element we create a tree with a single root node. So initially we have a forest of n trees; each data item with its weight is placed in its own tree.
In each step of the algorithm, we pick up the two trees Ti and Tj with the smallest weights (wi and wj) and combine them into a new tree Tk. The two trees Ti and Tj become subtrees of this new tree Tk and the weight of this new tree is the sum of the weights of the trees Ti and Tj, i.e. wi + wj. After each step, the number of trees in our forest decreases by 1. The process is continued till we get a single tree and this is the final Huffman tree. The leaf nodes of this tree contain the elements and their weights. Let us take 7 elements with weights and create an extended binary tree by the Huffman algorithm.
Element:   A    B    C    D    E    F    G
Weight:    5   16   11    7   20   23   15
(i) Initially we have a forest of 7 single node trees.
The two trees with smallest weights are trees weighted 7 and 5(shaded in the figure).
(ii) The trees weighted 7 and 5 are combined to form a new tree weighted 12.
We have made the tree weighted 7 the left child arbitrarily; any tree can be made the left or right child. Now the two trees with smallest weights are the trees weighted 11 and 12.
(iii) The trees weighted 11 and 12 are combined to form a new tree weighted 23.
Now the two trees with smallest weights are trees weighted 16 and 15.
(iv) The trees weighted 16 and 15 are combined to form a new tree weighted 31.
Now the two smallest weights are 20 and 23, but there are two trees with weight 23. We can break this tie arbitrarily and choose any one of them for combining.
(v) The trees weighted 20 and 23 are combined to form a new tree weighted 43.
Now the two trees with smallest weights are trees weighted 31 and 23.
(vi) The trees weighted 31 and 23 are combined to form a new tree weighted 54.
Now the two trees with smallest weights are trees weighted 54 and 43.
(vii) The trees weighted 54 and 43 are combined to form a new tree weighted 97.
Now only a single tree is left and this is the final Huffman tree. The weighted path length for this tree is-
16*3 + 15*3 + 23*2 + 20*2 + 11*3 + 7*4 + 5*4 = 260
In the above procedure, we have seen that when two trees are combined, either one can be made the left or right subtree of the new tree, and if there is more than one tree with equal weights in the roots, then we can arbitrarily choose any one for combining. So the Huffman tree produced by the Huffman algorithm is not unique. There can be different Huffman trees for the same set of weights, but the weighted path length for all of them would be the same irrespective of the shape of the tree. The following figure shows two more Huffman trees for the same set of weights given in the example.
In the fixed length code, all the symbols are assigned 3-bit codes irrespective of their frequencies. The coded message occupies 396 bits in this case. In variable length codes, we have assigned a 1-bit code to y since it is the most frequent, while v and z are assigned 4-bit codes since they are less frequent. So the same message can be coded using only 257 bits if variable length codes are used.
Encoding a message using variable length codes is simple, and as in fixed length codes, here also the individual codes of the symbols are just concatenated to get the code for the message. For example if we use the variable length codes given in the table above, then the message "wyxwvyz" would be encoded as 00101100010010101.
Now let us see how we can decode the encoded message. When fixed length codes are used, decoding is simple: start reading bits from the left and just transform each 3-bit sequence to the corresponding symbol. When variable length codes are used, we don't know how many bits to pick. For example if the encoded message is 0010101011, the code for the first symbol may be 0 or 00 or 001 or 0010. From the table we can see that 0 is not a code, but 00 is the code for w and no other code starts with 00. So the first two bits can be decoded to get the symbol w. The next bit is 1, which is the code of y, and no other code starts with 1, so the next symbol decoded is y. The next bit is 0 which is not a code, 01 is also not a code, 010 is also not a code, but 0101 is the code of z. So the next decoded symbol is z.
This decoding method works because the variable length codes that we have made are prefix free, i.e. no code is a prefix of another code. For example 00 is the code for symbol w, so it is not a prefix of any other code, i.e. no code starts with 00. Similarly no code starts with 1 since it is the code of y. These types of codes are known as prefix codes. Now the question is how we can obtain these variable length codes that are prefix free.
A Huffman tree is used to generate these codes, and the resulting codes are called Huffman codes. First of all we will create a Huffman tree for the data given in the table. Each external node contains a symbol and its weight is equal to the frequency of the symbol.
Now let us see how this Huffman tree can be used to generate Huffman codes. Each branch in the Huffman
tree is assigned a bit value. The left branches are assigned 0 and right branches are assigned 1.
We can find out the code of each symbol by tracing the path that starts from the root and ends at the leaf node that contains the given symbol. All the bit values in this path are recorded, and the bit sequence so obtained is the code for the symbol. For example to get the code of v, first we move left(0), then right(1), then left(0) and then again left(0). So the code for v is 0100. Similarly we can find the codes of the other symbols, which are-
w: 00    y: 1    x: 011    v: 0100    z: 0101
Since all the symbols are in the leaf nodes, this method generates codes that are prefix free and hence decoding will be unambiguous.
The same Huffman tree is used for decoding data. The encoded data is read bit by bit from the left side. We start from the root; if there is a 0 in the coded message we go to the left child, and if there is a 1 in the coded message we go to the right child, and this procedure continues till we reach a leaf node. On reaching the leaf node, we get the symbol and then again we start from the root node. Let us take a coded message and decode it.
0001000110101001011
00 01000110101001011           w
00 0100 0110101001011          wv
00 0100 011 0101001011         wvx
00 0100 011 0101 001011        wvxz
00 0100 011 0101 00 1011       wvxzw
00 0100 011 0101 00 1 011      wvxzwy
00 0100 011 0101 00 1 011      wvxzwyx
The drawback with Huffman codes is that we have to scan the data twice: the first time for getting the frequencies and the next time for the actual encoding.
(Figure 6.136: a general tree; the same tree with arrows pointing to nodes that will become left and right children in the binary tree; and the resulting binary tree)
(Figure 6.137: another general tree, the arrows pointing to nodes that will become left and right children in the binary tree, and the resulting binary tree)
In both the examples, the first figure shows the general tree, the second figure shows the same tree with
arrows representing left and right child of a node in binary tree and the third figure shows the corresponding
binary tree. Note that in any general tree the root does not have a sibling, so the root of the corresponding binary
tree will not have any right child.
A general tree can easily be represented using the binary tree format. The structure for a node of general tree
can be taken as-
struct node
{
	int info;
	struct node *firstChild;
	struct node *nextSibling;
};
In a binary tree, the left and right pointers of a node point to the left and right child of the node respectively. In a general tree, the left pointer will point to the first child (the leftmost child) of the node, and the right pointer will point to the next sibling of the node. The linked representation of a general tree is given next-
Figure 6.138
Figure 6.139
From the above explanation we can say that m-way search trees are a generalized form of binary search trees, and a binary search tree can be considered as an m-way search tree of order 2.
6.21 B-tree
In external searching, our aim is to minimize the file accesses, and this can be done by reducing the height of the tree. The height of an m-way search tree is less because of its large branching factor, but its height can be reduced further if it is balanced. So a new tree structure was developed (by Bayer and McCreight in 1972) which was a height balanced m-way search tree and was named B-tree.
A B-tree of order m can be defined as an m-way search tree which is either empty or satisfies the following
properties-
(i) All leaf nodes are at the same level.
(ii) All non leaf nodes (except the root node) should have at least ⌈m/2⌉ children.
(iii) All nodes (except the root node) should have at least ⌈m/2⌉ - 1 keys.
(iv) If the root node is a leaf node (the only node in the tree), then it will have no children and will have at least one key. If the root node is a non leaf node, then it will have at least 2 children and at least one key.
(v) A non leaf node with n-1 key values should have n non NULL children.
From the definition we can see that any node(except root) in a B-tree is at least half full and this avoids
wastage of storage space. The B-tree is perfectly balanced so the number of nodes accessed to find a key
becomes less.
The following table shows the minimum and maximum number of children in any non root, non leaf node of B-trees of different orders.

Order of the tree    Minimum Children    Maximum Children
       3                    2                   3
       4                    2                   4
       5                    3                   5
Now let us see why the m-way search tree in the previous figure 6.139 is not a B-tree.
(i) The leaf nodes in this tree are [1,3,7], [11,19], [29], [34], [46,48], [81,83,85], [87,89] and it can be clearly seen that they are not at the same level.
(ii) The non leaf node [27, 28] has 2 keys but only one non NULL child, and the non leaf node [86,90,95,99] has 4 keys but only 2 non NULL children.
(iii) The minimum number of keys for a B-tree of order 5 is ⌈5/2⌉-1 = 2, while in the above tree there are 2 nodes, [34] and [29], which have less than 2 keys.
The tree given in figure 6.140 is a B-tree of order 5.
Figure 6.140
While explaining the various operations on B-trees we'll take B-trees of order 5. We'll denote the maximum number of permissible keys by MAX and the minimum number of permissible keys (except in the root) by MIN. So if the order is 5, MAX = 5-1 = 4, and MIN = ⌈5/2⌉-1 = 2.
There are special names given to B-trees of order 3 and 4. A B-tree of order 3 is known as 2-3 tree because
any non root non leaf node can have 2 or 3 children, and a B-tree of order 4 is known as 2-3-4 tree because any
non root non leaf node can have 2, 3 or 4 children.
If we reach a leaf node and still don't find the value, it implies that the value is not present in the tree. For example suppose we have to search for the key 35 in the tree of figure 6.140. First the key is searched in the root node, and since it lies between 30 and 70, we move to the node [40, 50]. Now 35 is less than 40, so we move to the leftmost child of node [40, 50], which is [32, 37]. The key is not present in this node either, and since we have reached a leaf node the search is unsuccessful.
(b) Insert 40
(c) Insert 30
30 will be inserted between 10 and 40 since all the keys in a node should be in ascending order.
(d) Insert 35
10 30 35 40
The maximum number of permissible keys for a node of a B-tree of order 5 is 4, so now after the insertion of
35, this node has become full.
(e) Insert 20
The node [10, 30, 35, 40] will become overfull [10, 20, 30, 35, 40], so splitting is done at the median key 30. A new node is allocated and the keys to the right of the median key, i.e. 35 and 40, are moved to the new node, while the keys to the left of the median key, i.e. 10 and 20, remain in the same node. Generally after splitting, the median key goes to the parent node, but here the node that is being split is the root node, so a new node is allocated which contains the median key, and this new node becomes the root of the tree.
Since the root node has been split, the height of the tree has increased by one.
(g) Insert 25
The node [10, 15, 20, 28] will become overfull [10, 15, 20, 25, 28], so splitting is done and the median key 20 goes to the parent node.
(i) Insert 12
The node [5, 10, 15, 19] will become overfull [5, 10, 12, 15, 19], so splitting is done and the median key 12
goes to the parent node.
(j) Insert 38
The node [35, 40, 50, 60] will become overfull [35, 38, 40, 50, 60], so splitting is done and the median key 40 goes to the parent node.
(l) Insert 48
The node [45, 50, 60, 90] will become overfull [45, 48, 50, 60, 90], so splitting is done and the median key 50 goes to the parent node. After the insertion of 50 the parent node also becomes overfull [12, 20, 30, 40, 50], so again splitting is done, and this time the root node is split, so a new root is formed and the tree becomes taller.
Figure 6.141 (left sibling, underflow node, and right sibling with keys > MIN)
6.21.3.1 Deletion from leaf node
6.21.3.1.1 If the node has more than MIN keys
In this case, deletion is very simple and the key can easily be deleted from the node by shifting the other keys of the node.
(a) Delete 7, 52 from tree in figure 6.142
Here the key 15 is to be deleted from the node [15, 19]. Since this node has only MIN keys, we will try to borrow from its left sibling [3, 7, 9, 11], which has more than MIN keys. The parent of these nodes is the node [12, 20] and the separator key is 12. So the last key of the left sibling (11) is moved to the place of the separator key, and the separator key is moved to the underflow node. The resulting tree after deletion of 15 will be-
Figure 6.144
The involvement of separator key is necessary because if we simply borrow 11 and put it at the place of 15,
then the basic definition of B-tree will be violated.
(c) Delete 19 from the tree in figure 6.144
We will borrow from the left sibling so key 9 will be moved up to the parent node and 11 will be shifted to the
underflow node. In the underflow node the key 12 will be shifted to the right to make place for 11. The resulting
tree is-
Figure 6.145
(d) Delete 45 from tree in figure 6.145
The left sibling of [45,47] is [35,38], which has only MIN keys, so we can't borrow from it. Hence we will try to borrow from the right sibling [65, 78, 80]. The first key of the right sibling (65) is moved to the parent node, and the separator key from the parent node (55) is moved to the underflow node. In the underflow node, 47 is shifted left to make room for 55. In the right sibling, 78 and 80 are moved left to fill the gap created by the removal of 65. The resulting tree is-
If both left and right siblings of underflow node have MIN keys, then we can’t borrow from any of the
siblings. In this case, the underflow node is combined with its left (or right) sibling.
(e) Delete 28 from the tree in figure 6.146.
Figure 6.146
We can see that the node [28,31] has only MIN keys, so we'll try to borrow from the left sibling [18, 22], but it also has MIN keys, so we'll look at the right sibling [39,43], which also has only MIN keys. So after deletion of 28 we'll combine the underflow node with its left sibling. For combining these two nodes, the separator key (24) from the parent node will move down into the combined node.
Figure 6.147
(f) Delete 62 from the tree in figure 6.147
Here the key is to be deleted from [58, 62], which is the leftmost child of its parent and hence has no left sibling. So here we'll look at the right sibling for borrowing a key, but the right sibling has only MIN keys, so we'll delete 62 and combine the underflow node with the right sibling. The resulting tree after deletion of 62 is-
Figure 6.148
(g) Delete 92 from the tree in figure 6.148
After combining the two nodes, the parent node [73] becomes an underflow node, so we will borrow a key from its left sibling [15,36,45]. After borrowing, the resulting tree is-
Figure 6.149
Note that before borrowing, the rightmost child of [15,36,45] was [47,53], and after borrowing this node becomes the leftmost child of the node [55,73].
(h) Delete 22, 24 from the tree in figure 6.149.
Figure 6.150
Now the parent node [36] has become an underflow node, so we will try to borrow a key from its right sibling (since it is the leftmost node and has no left sibling), but the right sibling has MIN keys, so we will combine the underflow node [36] with its right sibling [55, 73]. The separator key (45) comes down into the combined node, and since it was the only key in the root node, the root node becomes empty, the combined node becomes the new root of the tree, and the height of the tree decreases by one. The resulting tree is-
If we have to delete a key from [3,7] then we’ll have to combine it with its right sibling [12,19], and if we
have to delete a key from [35, 38], we have to borrow from its right sibling [65, 78, 80].
Figure 6.151
The successor key of 12 is 15, so we'll copy 15 at the place of 12 and now our task reduces to deletion of 15
from the leaf node. This deletion is performed by borrowing a key from the right sibling.
Figure 6.152
(k) Delete 30 from the tree in figure 6.152.
These were some examples of deletion from a non leaf node. We could have taken the predecessor key instead of the successor, as the predecessor key is the largest key in the left subtree and is always in a leaf node.
The main() function and declarations for the B-tree program are-
/*P6.9 Program for performing various operations in a B-tree*/
#include<stdio.h>
#include<stdlib.h>
#define M 5 /*order of B tree*/
#define MAX (M-1) /*Maximum number of permissible keys in a node*/
#if M%2==0
#define CEIL_Mdiv2 (M/2)
#else
#define CEIL_Mdiv2 ((M+1)/2)
#endif
#define MIN (CEIL_Mdiv2-1) /*Minimum number of permissible keys in a node except root*/
struct node
{
int count;
int key [MAX+1];
struct node *child[MAX+1];
};
struct node *Search(int skey,struct node *p, int *pn);
int search_node(int skey,struct node *p,int *pn);
void display(struct node *ptr,int blanks);
void inorder(struct node *ptr);
/*Functions used in insertion*/
struct node *Insert(int ikey,struct node *proot);
int rec_ins(int ikey,struct node *p,int *pk,struct node **pkrchild) ;
void insertByShift(int k,struct node *krchild,struct node *p,int n);
void split(int k,struct node *krchild,struct node *p,int n,int *upkey,struct node **newnode);
/*Functions used in Deletion*/
struct node *Delete(int dkey,struct node *root);
void rec_del(int dkey,struct node *p);
void delByShift(struct node *p,int n);
int copy_succkey(struct node *p,int n);
void restore(struct node *p,int n);
void borrowLeft (struct node *p,int n);
void borrowRight (struct node *p,int n);
void combine(struct node *p,int m);
int main()
{
struct node *root = NULL, *ptr;
int key,choice,n;
while(1)
{
printf("1.Search\n2.Insert\n3.Delete\n") ;
printf ("4.Display\n5.Inorder traversal\n6.Quit\n");
printf("Enter your choice : ");
scanf ("%d", &choice) ;
switch (choice)
{
case 1:
printf("Enter the key to be searched : ");
scanf ("%d", &key) ;
if( (ptr=Search(key, root, &n)) == NULL )
printf("Key not present\n") ;
else
printf("Node %p, Position %d\n",ptr,n);
break;
case 2:
printf("Enter the key to be inserted : ");
scanf ("%d", &key) ;
root = Insert(key, root);
break;
case 3:
printf("Enter the key to be deleted : ");
scanf ("%d", &key) ;
root = Delete(key, root);
break;
case 4:
printf("Btree is :\n\n");
display( root, 0 );
printf("\n\n");
break;
case 5:
inorder (root) ;
printf("\n\n");
break;
case 6:
exit(1);
default:
printf("Wrong choice\n") ;
break;
}/*End of switch*/
}/*End of while*/
}/*End of main()*/
struct node
{
	int count;
	int key[MAX+1];
	struct node *child[MAX+1];
};
Here count represents the number of keys currently present in a given node. It is incremented when a key is inserted into the node and decremented when a key is deleted from the node. We take two arrays, one of type int for the keys, and the other of type struct node* for the children. The size of both arrays is MAX+1. The maximum number of permissible children of a node is MAX+1, so all the elements of the array child will be used. The maximum number of permissible keys in a node is MAX, so we'll never use key[0]. Taking the size of the array key as MAX+1 simplifies the code; in the program, whenever we have to shift the keys and pointers of a node we can simply do it by using a for loop. Only the case of child[0] has to be handled separately.
The symbolic constant M represents the order of the B-tree. The maximum number of keys in a node is represented by MAX and is equal to (M-1). The minimum number of keys in a node (except the root) is given by MIN, which is equal to ⌈M/2⌉ - 1.
6.21.4 Searching
The functions used in searching are Search() and search_node(). Search() is a recursive function that is
used to search a key by moving down the tree using child pointers. It uses a function search_node() to search
for the key inside the current node.
struct node *Search(int skey,struct node *p,int *pn)
{
int found;
if(p == NULL) /*Base Case 1 : if key not found*/
return NULL;
	found = search_node(skey,p,pn);
if (found) /*Base Case 2 : if key found in node p*/
		return p;
	else /*Recursive case : search in node p->child[*pn]*/
		return Search(skey,p->child[*pn],pn);
}/*End of Search ()*/
Let us first see how the function search_node() works. This function searches for skey in the node p and returns 1 if the key is present in the node, otherwise it returns 0. If the key is found then *pn represents the position of the key in the node; otherwise the value *pn is used by the Search() function to move to the appropriate child.
If skey is less than the leftmost key of the node, then 0 is returned indicating that the key is not present in this node, and the value of *pn is set to 0, which instructs the function Search() to continue its search in the 0th child of node *p. If skey is greater than the leftmost key, then we start searching for the key in the node from the right side.
Search() is a tail recursive function. It returns NULL if the key is not found in the tree, otherwise it
returns the address of the node in which the key is found. The first argument skey is the key that is to be
searched in the tree, pointer p represents the root of the tree on which the search proceeds, and the last argument
is a pointer to int that will be used to give the position of key inside the node.
Recursion can stop in 2 cases: first when we reach a leaf node, i.e. p==NULL, indicating that skey is not present in the tree, and second when search_node() returns 1, indicating that skey is present in the current node p.
Figure 6.153
Search 34 in tree of figure 6.153
Search() is called recursively, moving down the tree through the child pointers. The key 34 is not present, so finally *pn is set to 2 and NULL is returned.
6.21.5 Insertion
The functions used in insertion are Insert (), rec_ins(), insertByShift ( ) and split().
struct node *Insert(int ikey,struct node *proot)
{
	int k, taller;
	struct node *krchild, *temp;
taller = rec_ins(ikey,proot,&k,&krchild) ;
if(taller) /*tree grown in height, new root is created*/
{
temp = (struct node *)malloc(sizeof (struct node) );
		temp->count = 1;
temp->child[0] = proot;
temp->key[1] = k;
temp->child[1] = krchild;
proot = temp;
}
return proot;
}/*End of Insert()*/
Here ikey is the key to be inserted and proot is pointer to the root of the tree. The function returns a pointer to
the root of the tree.
Inside Insert(), a recursive function rec_ins() is called which performs the main task of inserting the key into the tree. It takes 4 arguments; the first two are the same as in Insert(). The last two arguments are the addresses of the variables k and krchild, i.e. this function will set the values of the variables k and krchild. The return value of the rec_ins() function determines whether the tree has grown in height or not. A B-tree grows in height only when the root node is split (see steps (e) and (l), insertion of 20 and 48). When the root node is split, a new root has to be created, which needs allocation of a new node. A new node is allocated which contains the key k; the original root node is made its left child while krchild is made its right child, and finally this new node is made the new root of the tree.
For example in step (l), insertion of 48, the root node before the insertion was [12, 20, 30, 40]; the function rec_ins() returns 1, sets the value of k as 30 and makes krchild point to [40,50]. The new root node contains only one key, which is k; its right child pointer is krchild, and the old root node is its left child. Now let us discuss the working of this function rec_ins().
The function rec_ins() is a recursive function. In the winding phase we move down the tree recursively. There can be two base cases: one when we reach a NULL subtree, which means we have to insert the key, and the other when we find the key in some node, in which case the key will not be inserted. In the unwinding phase, we insert the key and perform the splitting if required. In the unwinding phase we move up the tree on the path of insertion, and we know that splitting is propagated upwards, so the insertion of the key and the splitting are done in this phase.
int rec_ins(int ikey,struct node *p,int *pk,struct node **pkrchild)
{
	int n, flag;
	if(p==NULL) /*Base case 1 : insert the key here*/
	{
		*pk = ikey;
		*pkrchild = NULL;
		return 1;
	}
	if(search_node(ikey,p,&n)) /*Base case 2 : duplicate key, not inserted*/
	{
		printf("Duplicate key\n");
		return 0;
	}
	flag = rec_ins(ikey,p->child[n],pk,pkrchild);
	if(flag)
	{
		if(p->count < MAX)
		{
			insertByShift(*pk,*pkrchild,p,n);
			return 0;
		}
		else
		{
			split(*pk,*pkrchild,p,n,pk,pkrchild);
			return 1;
		}
	}
	return 0;
}/*End of rec_ins()*/
In the unwinding phase, we insert the key whenever we get a non full node using insertByShift(); otherwise we call split().
The return value of rec_ins() can be 1 or 0. The return value 1 indicates that the insertion is not complete and we need to continue work in the unwinding phase. The return value 0 means that the insertion is finished and there is no need to do any work in the unwinding phase.
Initially, when we reach the base case (p==NULL), rec_ins() returns 1. When the function split() is called, 1 is returned, indicating that the insertion is not over and the median key is still waiting to be inserted in the tree. When insertByShift() is called, 0 is returned, indicating the insertion is finished.
When we have a duplicate key, there is nothing to be done in the unwinding phase because the key is not inserted in the tree. So in this case also we return 0.
*pk and *pkrchild represent the key to be inserted and its right child respectively. Initially, when the recursion stops, *pk is set to ikey and *pkrchild is set to NULL. Whenever split() is called, it sets the value of *pk to the median key and also changes the value of *pkrchild.
We will take some examples and see how the key is actually inserted. Let us insert the key 17 in the tree given in figure 6.154.
Figure 6.154.
For insertion of 17, splitting is required and the resulting tree is-
Now let us trace and see how this insertion happens in our program.
1) Initially Insert() calls rec_ins() with p pointing to the root node [30]. Inside rec_ins(), since p is not NULL, search_node() is called which sets n=0, and a recursive call is made to rec_ins() with the 0th child of p, i.e. [12,20].
2) In the second recursive call of rec_ins(), p points to node [12, 20] which is not NULL, so search_node() is called which sets n=1. A recursive call is made to rec_ins() with the 1st child of p i.e. [14, 15, 16, 19].
3) In the third recursive call of rec_ins(), p points to node [14, 15, 16, 19] and search_node() sets n=3. Now a recursive call is made with the 3rd child of p which is NULL.
4) In the fourth recursive call of rec_ins(), p is NULL, hence we have reached the base case and the recursion stops. *pk is set to 17 and *pkrchild is set to NULL, which means that the key to be inserted is 17 and the child to its right will be NULL. This recursive call of rec_ins() returns 1 and the unwinding phase begins.
5) Now we go back to the 3rd recursive call of rec_ins() where p points to node [14, 15, 16, 19]. The fourth recursive call had returned 1, so the value of flag is 1 which means that the insertion is not yet over, so we check the number of keys in the node. Since the keys are equal to MAX we need to call the function split(). This function splits the node, sets *pk to 16, and makes *pkrchild point to [17, 19]. This means that now the key to be inserted is 16, and its right child will be [17, 19]. The 3rd recursive call finishes and it returns 1.
Data Structures through C in Depth
6) Now we go back to the 2nd recursive call of rec_ins() where p points to node [12, 20]. The third recursive call had returned 1, so the value of flag is 1 which means that the insertion is not yet over, so we check the number of keys in the node. Since the keys are less than MAX, we call the function insertByShift(). This function inserts the key at the proper place in this node. The 2nd recursive call finishes and it returns 0.
7) Now we go back to the 1st recursive call of rec_ins() where p points to node [30]. The second recursive call had returned 0, so the value of flag is 0 which means that the insertion is over. The value 0 is returned.
8) Now we go back to Insert(). The outermost call of rec_ins() returned 0, indicating that the insertion is over.
Figure 6.155
95 can be inserted in the leaf node, so no splitting is required. The resulting tree after insertion is-
Figure 6.156
In this case, the splitting is propagated up to the root node. The root node is split, a new root is created, and the height of the tree increases.
Here the outermost (first) recursive call returns the value 1, indicating that the insertion has not completed and the key 35 still remains to be inserted. Inside Insert(), the value of taller becomes 1, so a new root is created and 35 is inserted into it.
rec_ins(), ikey=80, p→[2, 24, 35, 96]    p->count==MAX, call split(), now *pk=35, *pkrchild→[69, 96]
    n=3, p->child[n]→[39, 46, 69, 88]                                              return 1
rec_ins(), ikey=80, p→[39, 46, 69, 88]   p->count==MAX, call split(), now *pk=69, *pkrchild→[80, 88]
    n=3, p->child[n]→[71, 73, 81, 85]                                              return 1
rec_ins(), ikey=80, p→[71, 73, 81, 85]   p->count==MAX, call split(), now *pk=80, *pkrchild→[81, 85]
    n=2, p->child[n]→NULL                                                          return 1
rec_ins(), ikey=80, p→NULL               recursion stops, *pk=80, *pkrchild→NULL   return 1
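The wrapper Insert() itself is not listed in this excerpt. Its new-root step, described above (taller becomes 1 when the outermost rec_ins() returns 1), can be sketched as follows; the function name grow_root and the stand-alone definitions here are illustrative, not the book's actual code:

```c
#include <stdlib.h>

#define M 5          /* order of the B tree, assumed for illustration */
#define MAX (M-1)

struct node
{
	int count;
	int key[M];               /* keys are used from index 1 */
	struct node *child[M];
};

/* When the outermost rec_ins() returns 1, Insert() must create a new
   root: the propagated key becomes its only key, the old root its
   0th child and the propagated right child its 1st child. */
struct node *grow_root(struct node *root, int k, struct node *krchild)
{
	struct node *p = malloc(sizeof(struct node));
	p->count = 1;
	p->key[1] = k;
	p->child[0] = root;
	p->child[1] = krchild;
	return p;
}
```

In the trace above this step would be reached with k = 35, the old root (now holding [2, 24]) as the left child, and the node produced by the final split() as the right child.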
Now let us see how the functions insertByShift() and split() work.
void insertByShift(int k,struct node *krchild,struct node *p,int n)
{
	int i;
	for(i=p->count; i>n; i--)
	{
		p->key[i+1] = p->key[i];
		p->child[i+1] = p->child[i];
	}
	p->key[n+1] = k;
	p->child[n+1] = krchild;
	p->count++;
}/*End of insertByShift()*/
This function will be called only if node p contains less than MAX keys i.e. when there is no chance of overflow if a new key is inserted into it, and the key can be simply inserted by shifting some keys to the right. This function will insert the key k and the pointer krchild into the node p at the (n+1)th position. For this, initially
all the keys and pointers which are after the nth position are shifted right by one position to make room for key k and krchild; after this, the two are inserted at the (n+1)th position in the node and the count of node p is incremented.
void split(int k,struct node *krchild,struct node *p,int n,int *upkey,
           struct node **newnode)
{
	int i,j;
	int lastkey;
	struct node *lastchild;
	int d = CEIL_Mdiv2;
	if(n==MAX)	/*new key is the largest, it becomes the last key*/
	{
		lastkey = k;
		lastchild = krchild;
	}
	else
	{
		lastkey = p->key[MAX];
		lastchild = p->child[MAX];
		for(i=MAX-1; i>n; i--)
		{
			p->key[i+1] = p->key[i];
			p->child[i+1] = p->child[i];
		}
		p->key[n+1] = k;
		p->child[n+1] = krchild;
	}
	*newnode = (struct node *)malloc(sizeof(struct node));
	*upkey = p->key[d];
	for(i=1,j=d+1; j<=MAX; i++,j++)
	{
		(*newnode)->key[i] = p->key[j];
		(*newnode)->child[i] = p->child[j];
	}
	(*newnode)->child[0] = p->child[d];
	p->count = d-1;		/*Number of keys in the left split node*/
	(*newnode)->count = M-d;	/*Number of keys in the right split node*/
	(*newnode)->key[M-d] = lastkey;
	(*newnode)->child[M-d] = lastchild;
}/*End of split()*/
If the new key is to be inserted somewhere in between the node, then first the last key of the node is saved in the variable lastkey and then the new key is inserted by shifting the other keys to the right. For example if 39 is the key to be inserted, then first 48 is saved in the variable lastkey and then 40, 45 are shifted right and 39 is inserted in the node.
[38  40  45  48]   →  Insert 39  →  [38  39  40  45]   (lastkey = 48)
Now space for a new node is allocated. The variable d denotes the median position ceil(M/2). The median key, i.e. the dth key, is the key that will be moved up to the parent, so it is stored in upkey.
When a node is full it contains MAX keys, and after the arrival of a new key the total number of keys becomes (MAX+1). This number is equal to M, which is the order of the tree. Now we have to split these M keys. The first (d-1) keys, which are to the left of the dth key, remain in the same node, and the remaining M-d keys, which are to the right of the dth key, move into a new node.
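As a quick check of this arithmetic, the three counts can be computed for a few orders. The helper split_counts() below is hypothetical, written only to illustrate the paragraph above, and is not part of the book's code:

```c
/* For a B tree of order M: of the M keys being split, left = d-1 stay
   in the old node, one (the dth) moves up to the parent, and
   right = M-d go to the new node. */
void split_counts(int M, int *left, int *right)
{
	int d = (M + 1) / 2;   /* ceil(M/2) for integer M */
	*left = d - 1;
	*right = M - d;
}
```

For M = 5 this gives d = 3, with 2 keys staying on the left and 2 going to the right, which matches the splits shown in the traces above; for every M, (d-1) + 1 + (M-d) = M, so all keys are accounted for.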
6.21.6 Deletion
The functions used in deletion are - Delete(), rec_del(), delByShift(), copy_succkey(), restore(), borrowLeft(), borrowRight(), combine().
struct node *Delete(int dkey,struct node *root)
{
	struct node *temp;
	rec_del(dkey,root);
	/*If tree becomes shorter, root is changed*/
	if(root!=NULL && root->count == 0)
	{
		temp = root;
		root = root->child[0];
		free(temp);
	}
	return root;
}/*End of Delete()*/
Delete() calls another recursive function rec_del() which performs the main deletion process. We know that after deletion, if the root node becomes empty then the height of the tree decreases (see step (i), deletion of 31). So, after rec_del() has deleted the key from the tree, Delete() checks whether the root node has become empty, and if there is no key left in the root node then the 0th child of the root becomes the new root. The function rec_del() is-
void rec_del(int dkey,struct node *p)
{
	int n,flag,succkey;
	if(p==NULL)	/*reached leaf node, key does not exist*/
		printf("Value %d not found\n",dkey);
	else
	{
		flag = search_node(dkey,p,&n);
		if(flag)	/*found in current node p*/
		{
			if(p->child[n]==NULL)	/*node p is a leaf node*/
				delByShift(p,n);
			else	/*node p is a non leaf node*/
			{
				succkey = copy_succkey(p,n);
				rec_del(succkey,p->child[n]);
			}
		}
		else	/*not found in current node p*/
			rec_del(dkey,p->child[n]);
		if(p->child[n]!=NULL)
		{
			if(p->child[n]->count < MIN)	/*check underflow in p->child[n]*/
				restore(p,n);
		}
	}
}/*End of rec_del()*/
In rec_del() there are two recursive calls, and in both of them the second argument is the same i.e. p->child[n], where n is obtained by search_node(). This procedure of traversing down the tree through the nth child of p is the same as in rec_ins(). If rec_del() is called with p as NULL, then it means that the key is not present in the tree.
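The function search_node(), which both rec_ins() and rec_del() rely on, is not reproduced in this excerpt. Based on how it is used (it returns 1 when the key is found in node p and sets *n to the index of the child to descend into), it might look like the following sketch; the node layout is assumed to match the listings above:

```c
#define M 5          /* order of the B tree, assumed for illustration */

struct node
{
	int count;
	int key[M];               /* keys occupy indices 1..count */
	struct node *child[M];
};

/* Sets *n to the index of the rightmost key <= the search key (0 if
   the key is smaller than every key in p) and returns 1 if the key is
   found in p. The search then continues in p->child[*n]. */
int search_node(int key, struct node *p, int *n)
{
	if(key < p->key[1])	/*smaller than all keys in this node*/
	{
		*n = 0;
		return 0;
	}
	*n = p->count;
	while(key < p->key[*n])	/*scan from right to left*/
		(*n)--;
	return key == p->key[*n];
}
```

Checked against the trace of inserting 17 above: in [30] this sets n=0, in [12, 20] it sets n=1, and in [14, 15, 16, 19] it sets n=3, returning 0 each time since 17 is absent.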
The function search_node() is called, which searches for the key in the current node. If it is found, then we check whether the current node is a leaf node or a non leaf node. If it is a leaf node then the key is simply deleted by shifting the other keys using the function delByShift(), and if it is a non leaf node then the successor key is copied at the place of dkey using copy_succkey(). This function returns the successor key, which is stored in the variable succkey, and now our task is to delete this successor key, so rec_del() is called with the first argument as succkey.
When we call delByShift(), underflow might occur in the node. We check for this underflow when we return from the recursive calls i.e. in the unwinding phase. If there is underflow we call restore(), which performs all the work of borrowing keys or combining nodes. This function needs the parent of the underflow node. This is why in the code we check the count of p->child[n] and not that of p. If p->child[n] underflows, we send its parent to the restore() function. If p is a leaf node then underflow is not checked, because in this case p->child[n] will be NULL.
This process will check underflow in all nodes that come in the unwinding phase except the root node. The root node is different from other nodes since the minimum number of keys for the root node is 1. If the root node underflows, i.e. it becomes empty, then the Delete() function handles this case.
Consider the tree given in figure 6.157 and delete 19 from it.
Figure 6.157
Deletion of 19 is simple since the node [15, 17, 19] contains more than MIN keys. The tree after deletion is-
When rec_del() is called with p→[15, 17, 19], search_node() returns 1 as 19 is present in this node, and so the value of flag becomes 1. Now since node p is a leaf node, delByShift() is called and 19 is deleted from the node. This is the base case of recursion, so the recursion terminates and the unwinding phase starts. In the unwinding phase, we check whether underflow has occurred in any node.
Consider the tree given in figure 6.158 and delete 69 from it.
Figure 6.158
Here key 69 is present in a non leaf node, so successor key 71 will be copied at its place and then 71 will be
deleted from the leaf node. The resulting tree after deletion is-
rec_del(), dkey 69, p→[35]
p->child[n]→[50, 96], so restore() not called
(iii) Consider the tree given next and delete 45 from it.
Figure 6.159
After deletion of 45, the height of the tree decreases. The resulting tree after deletion is-
[12  20  30  55]

Delete(), dkey 45, root→[30]      Root node has become empty, so Delete() will make a new root.
45 is deleted, so now p→[47]
p->child[n]→NULL, not checked for underflow

The function copy_succkey() is-
int copy_succkey(struct node *p,int n)
{
	struct node *ptr;
	ptr = p->child[n];	/*right subtree of p->key[n]*/
	while(ptr->child[0] != NULL)	/*move down to the leftmost leaf*/
		ptr = ptr->child[0];
	p->key[n] = ptr->key[1];
	return ptr->key[1];
}/*End of copy_succkey()*/
This function is required when a deletion is to be performed in a non leaf node. Its task is to replace the nth key of node p, i.e. p->key[n], by its successor key and return the value of that successor key. We know that the successor key is the leftmost key in the right subtree. We take a pointer ptr and use it to move down the right subtree, and since we want to reach the leftmost key, we move only leftwards using the leftmost child child[0]. We stop when we reach a leaf node; the leftmost key of this node is the successor key of p->key[n].
The function delByShift() is-
void delByShift(struct node *p,int n)
{
	int i;
	for(i=n+1; i<=p->count; i++)
	{
		p->key[i-1] = p->key[i];
		p->child[i-1] = p->child[i];
	}
	p->count--;
}/*End of delByShift()*/
This function will be called only if node p contains more than MIN keys i.e. when there is no chance of underflow if a key is deleted from the node, and the key can be simply deleted by shifting some keys to the left. This function deletes the nth key and its right child pointer from the node p, i.e. key[n] and child[n] are removed from the node. For this, all the keys and pointers which are to the right of the nth position are shifted one position left and the count of the node p is decremented.
The function restore() is-
void restore(struct node *p,int n)
{
	if(n!=0 && p->child[n-1]->count > MIN)
		borrowLeft(p,n);
	else if(n!=p->count && p->child[n+1]->count > MIN)
		borrowRight(p,n);
	else
	{
		if(n==0)	/*if underflow node is leftmost node*/
			combine(p,n+1);	/*combine with right sibling*/
		else
			combine(p,n);	/*combine with left sibling*/
	}
}/*End of restore()*/
The function restore() is called when a node underflows. The underflow node is the nth child of the node p. Let us recall how we proceed in the case of an underflow: first we try to borrow from the left sibling and then from the right sibling. If borrowing is not possible then we combine the node with its left sibling, and if the left sibling doesn’t exist then we combine it with the right sibling.
Since the underflow node is the nth child of the node p, the (n-1)th child of p is the left sibling and the (n+1)th child of p is the right sibling of the underflow node. The left sibling will not exist if the underflow node is the leftmost child of its parent (n==0), and the right sibling will not exist if the underflow node is the rightmost child of its parent (n==p->count). So before borrowing we have to check for these two conditions also. If the underflow node is not the leftmost child and the left sibling contains more than MIN keys then we can borrow from the left; otherwise, if the underflow node is not the rightmost child and the right sibling contains more than MIN keys then we can borrow from the right. If both these conditions fail we combine the node with the left sibling, but if the node is leftmost we combine it with the right sibling.
When we combine with the right sibling the second argument is n+1, and with the left sibling it is n. While combining with the left sibling, the second argument is not n-1; we will come to know the reason for this after studying the function combine().
The function borrowLeft () is-
void borrowLeft(struct node *p,int n)
{
	int i;
	struct node *u;	/*underflow node*/
	struct node *ls;	/*left sibling of node u*/
	u = p->child[n];
	ls = p->child[n-1];
	/*Shift all the keys and pointers in underflow node u one position right*/
	for(i=u->count; i>0; i--)
	{
		u->key[i+1] = u->key[i];
		u->child[i+1] = u->child[i];
	}
	u->child[1] = u->child[0];
	/*Move the separator key from parent node p to underflow node u*/
	u->key[1] = p->key[n];
	u->count++;
	/*Move the rightmost key of node ls to the parent node p*/
	p->key[n] = ls->key[ls->count];
	/*Rightmost child of ls becomes leftmost child of node u*/
	u->child[0] = ls->child[ls->count];
	ls->count--;
}/*End of borrowLeft()*/
The node u represents the underflow node, ls is its left sibling and p is their parent node. Initially all the keys and pointers in node u are shifted one position right to make room for the new key (in figure 6.160, key 20 is shifted right). After this, the separator key (18) from the parent is moved to the underflow node. The rightmost key of the left sibling (14) is moved to the parent node. The rightmost child of ls (node [15, 16]) becomes the leftmost child of node u.
Figure 6.160
All keys and children are not shown in this figure.
The function borrowRight() is -
void borrowRight(struct node *p,int n)
{
	int i;
	struct node *u;	/*underflow node*/
	struct node *rs;	/*right sibling of node u*/
	u = p->child[n];
	rs = p->child[n+1];
	/*Move the separator key from the parent node p to the underflow node u*/
	u->count++;
	u->key[u->count] = p->key[n+1];
	/*Leftmost child of node rs becomes the rightmost child of node u*/
	u->child[u->count] = rs->child[0];
	/*Move the leftmost key from node rs to parent node p*/
	p->key[n+1] = rs->key[1];
	rs->count--;
	/*Shift all the keys and pointers of node rs one position left*/
	rs->child[0] = rs->child[1];
	for(i=1; i<=rs->count; i++)
	{
		rs->key[i] = rs->key[i+1];
		rs->child[i] = rs->child[i+1];
	}
}/*End of borrowRight()*/
The node u represents the underflow node, rs is its right sibling and p is the parent node. First the separator key (in figure 6.161, key 6) is moved from the parent node p to the underflow node u. Note that here shifting of keys in the underflow node is not needed. The leftmost child of rs (node [7, 9]) becomes the rightmost child of u. The leftmost key of node rs (10) is moved to the parent node p. At last, all the remaining keys and pointers in node rs (11, 12, 13) are shifted one position left to fill the gap created by the removal of key 10.
rs
Figure 6.161
All keys and children are not shown in this figure.
The function combine() is-
void combine(struct node *p,int m)
{
	int i;
	struct node *x;
	struct node *y;
	x = p->child[m];
	y = p->child[m-1];
	/*Move the key from the parent node p to node y*/
	y->count++;
	y->key[y->count] = p->key[m];
	/*Shift the keys and pointers in p one position left to fill the gap*/
	for(i=m; i<p->count; i++)
	{
		p->key[i] = p->key[i+1];
		p->child[i] = p->child[i+1];
	}
	p->count--;
	/*Leftmost child of x becomes rightmost child of y*/
	y->child[y->count] = x->child[0];
	/*Insert all the keys and pointers of node x at the end of node y*/
	for(i=1; i<=x->count; i++)
	{
		y->count++;
		y->key[y->count] = x->key[i];
		y->child[y->count] = x->child[i];
	}
	free(x);
}/*End of combine()*/
This function combines the two nodes x and y. The node x is the mth child and node y is the (m-1)th child of node p. Initially the separator key (key e in figure 6.162) is moved from the parent node to node y, and all the keys that were on the right side of key e in node p are shifted left to fill the gap created by the removal of e. Now the leftmost child of x (node [f, g]) becomes the rightmost child of y. At last, all the keys and pointers of node x are inserted at the end of node y and the memory occupied by node x is released using free().
Figure 6.162
In the function restore() we have used combine() like this-
combine(p, n)   → m=n, x is the nth child (the underflow node), y is the (n-1)th child (its left sibling).
combine(p, n+1) → m=n+1, x is the (n+1)th child (the right sibling), y is the nth child (the underflow node).
In both cases the nodes are merged into y, the left node of the pair, which is why the second argument is never n-1.
6.22 B+ tree
A disadvantage of a B tree is inefficient sequential access. If we want to display the data in ascending order of keys, we can do an inorder traversal, but it is time consuming; let us see why. While doing an inorder traversal, we have to go up and down the tree several times, i.e. the nodes have to be accessed several times. Whenever an internal node is accessed, only one element from it is displayed before we have to go to another node. We know that each node of a B tree represents a disk block, so moving from one node to another means moving from one disk block to another, which is time consuming. So for efficient sequential access, the number of node accesses should be as few as possible.
B+ tree, which is a variation of B tree, is well suited for sequential access. The two main differences between a B tree and a B+ tree are-
(i) In a B tree, all the nodes contain keys, their corresponding data items (records or pointers to records), and child pointers, but in a B+ tree the structures of leaf nodes and internal nodes are different. The internal nodes store only keys and child pointers while the leaf nodes store keys and their corresponding data items. So the data items are present only in the leaf nodes. The keys in the leaf nodes may also be present in the internal nodes.
(ii) In B+ tree, each leaf node has a pointer that points to the next leaf node i.e. all leaf nodes form a linked list.
The figure 6.163 shows a B tree and the figure 6.164 shows a B+ tree containing the same data.
Figure 6.164
The letter ‘D’ with a subscript shown under a key value represents the data item. While discussing B trees we did not show these data items in the figures so that the figures remain small.
In a B+ tree, the internal nodes contain only keys and pointers to child nodes, while the leaf nodes contain keys, data items and a pointer to the next leaf node. The internal nodes are used as an index for searching data and so they are also called index nodes. The leaf nodes contain the data items and they are also called data nodes. The index nodes form an index set while the data nodes form a sequence set. All the leaf nodes form a linked list, and this feature of the B+ tree helps in sequential access i.e. we can search for a key and then access all the keys following it in a sequential manner. Traversing all the leaves from left to right gives us all the data in ascending order. So both random and sequential access are simultaneously possible in a B+ tree.
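To make the two node layouts concrete, here is one possible way to declare them in C. This is only an illustrative sketch; the names, the order M, and the use of a plain int data field are assumptions, not the book's code:

```c
#include <stddef.h>

#define M 5   /* order, assumed for illustration */

/* Index (internal) node: keys and child pointers only */
struct index_node
{
	int count;
	int key[M];
	void *child[M];   /* points to index nodes or leaf nodes */
};

/* Data (leaf) node: keys with data items, plus the sequence-set link */
struct leaf_node
{
	int count;
	int key[M];
	int data[M];                  /* or pointers to records */
	struct leaf_node *next;       /* next leaf in the linked list */
};

/* Sequential access: walk the leaf chain from the leftmost leaf,
   copying every key into out[] in ascending order; returns the count.
   No index node is touched once the leftmost leaf has been reached. */
int collect_ascending(struct leaf_node *leftmost, int *out)
{
	int cnt = 0, i;
	struct leaf_node *p;
	for(p = leftmost; p != NULL; p = p->next)
		for(i = 0; i < p->count; i++)
			out[cnt++] = p->key[i];
	return cnt;
}
```

This is why sequential access is cheap in a B+ tree: each leaf (disk block) is read exactly once and the index set is not revisited.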
6.22.1 Searching
In a B tree our search terminated when we found the key, but this will not be the case in a B+ tree. In a B+ tree, it is possible that a key is present in an internal node but not in any leaf node. This happens because when a data item is deleted from a leaf node, the corresponding key is not deleted from the internal node. So the presence of a key in an internal node does not indicate that the corresponding data item will be present in a leaf node. Hence the searching process does not stop when we find a key in an internal node; it continues till we find the key in a leaf node. The data items are stored only in the leaf nodes, so we have to go to the leaf nodes to access the data.
Suppose we want to search for the key 20 in the B+ tree of figure 6.164. Searching starts from the root node, so first we look at the node [30], and since 20 < 30, we move to the left child which is [12, 20]. The key value is equal to 20 so we move to the right child, which is a leaf node, and there we find the key 20 with its data item.
B+ tree supports efficient range queries i.e. we can access all data in a given range. For this we need to search for the starting key of the range and then sequentially traverse the leaf nodes till we get to the end key of the range.
6.22.2 Insertion
First a search is performed, and if the key is not present in the leaf node then we can have two cases depending on whether the leaf node has the maximum number of keys or not.
If the leaf node has less than the maximum number of keys, then the key and data are simply inserted in the leaf node in an ordered manner and the index set is not changed.
If the leaf node has the maximum number of keys, then we will have to split the leaf node. The splitting of a leaf node is slightly different from the splitting of a node in a B tree. A new leaf node is allocated and inserted in the sequence set (the linked list of leaf nodes) after the old node. All the keys smaller than the median key remain in the old leaf node, all the keys greater than or equal to the median key are moved to the new node, and the corresponding data items are also moved. The median key becomes the first key of the new node and this key (without its data item) is copied (not moved) to the parent node, which is an internal node. So now the median key is present both in the leaf node and in the internal node which is the parent of the leaf node.
Splitting of a leaf node
keys < median remain in the old leaf node
keys >= median go to the new leaf node
The median key is copied to the parent node.
If after the splitting of a leaf node the parent becomes full, then again a split has to be done. The splitting of an internal node is similar to the splitting of a node in a B tree. When an internal node is split, the median key is moved to the parent node.
Splitting of an internal node
keys < median remain in the old node
keys > median go to the new node
The median key is moved to the parent node.
This splitting continues till we get a non full parent node. If root node is split then a new root node has to be
allocated.
Suppose we have to insert data with keys 42 and 24 in B+ tree of figure 6.164.
The key 42 can be simply inserted in the leaf node [40, 45, 48]. After inserting 24 in the tree, we get an overflow leaf node [20, 24, 25, 27, 28] which needs to be split. A new leaf node is allocated and the keys 25, 27, 28 with their data items are moved to this node. The median key 25 is copied to the parent node and is present in the leaf node also.
Figure 6.165
6.22.3 Deletion
First a search is performed, and if the key is present in the leaf, then we can have two cases depending on whether the leaf node has the minimum number of keys or more than that.
If the leaf node has more than the minimum number of keys then we can simply delete the key and its data item, and move the other elements of the leaf node if required. In this case, the index set is not changed, i.e. if the key is present in any internal node also, it is not deleted from there. This is because the key still serves as a separator key between its left and right children.
If the key is present in a leaf which has the minimum number of keys then we have two cases-
(A) If any one of the siblings has more than the minimum number of keys then a key is borrowed from it and the separator key in the parent node is updated accordingly.
If we borrow from the left sibling, then the rightmost key (with its data item) of the left sibling is moved to the underflow node. Now this new leftmost key in the underflow node becomes the new separator key.
If we borrow from the right sibling, then the leftmost key (with its data item) of the right sibling is moved to the underflow node. Now the key which is leftmost in the right sibling becomes the new separator key.
(B) If both siblings have the minimum number of keys then we need to merge the underflow leaf node with a sibling. This is done by moving the keys (with data) of the underflow leaf node to the sibling node and deleting the underflow leaf node. The separator key between the underflow node and its sibling is deleted from the parent node, and the corresponding child pointer in the parent node is also removed.
The merging of leaf nodes may result in an underflow in the parent node, which is an internal node. For internal nodes, borrowing and merging are performed in the same manner as in a B tree.
(i) Delete data with keys 12, 38, 40, 50 from B+ tree of figure 6.164.
Figure 6.166
(ii) Delete 15 from B+ tree of figure 6.166(borrow from right sibling).
Figure 6.167
Figure 6.169
Let us insert some keys into an initially empty digital search tree.
Insert k1 = 0110101   Insert k2 = 0010001   Insert k3 = 1010011   Insert k4 = 1100101
Insert k5 = 0011011   Insert k6 = 0001011   Insert k7 = 1001010   Insert k8 = 0011001
The first key k1 is to be inserted in an empty tree, so it becomes the root of the tree. The next key to be inserted is k2 = 0010001; since this key is not equal to the key in the root, we move down the tree. The first bit from the left is 0 so we have to move left, and since the left pointer is NULL, we allocate a new node which becomes the left child of the root node and insert the key k2 in that node.
The next key to be inserted is k3 = 1010011; since this key is not equal to the key in the root, we move down the tree. The first bit from the left is 1 so we have to move right, and since the right pointer is NULL, we allocate a new node which becomes the right child of the root node and insert the key k3 in that node.
The next key to be inserted is k4 = 1100101; since this key is not equal to the key in the root, we move down the tree. The first bit from the left is 1, so we move to the right child. The key in the right child is not equal to the given key, so we examine the second bit of the key k4, which is 1. Now again we have to move to the right, and in this case the right pointer is NULL, so we allocate a new node and insert the key there. Similarly the other keys are also inserted.
For searching a key, we proceed down the tree in a similar manner. If at any point the search key is equal to the key in the current node, the search is successful. Reaching a NULL pointer implies that the key is not present in the tree.
Deletion in a DST is much simpler than in a BST. If the key to be deleted is in a leaf node, then we can simply remove the leaf node by replacing it with a NULL pointer. If the key to be deleted is in a non leaf node, then that key can be replaced with a key from any leaf node in any of its subtrees, and after that the particular leaf node may be deleted. For example, suppose we want to delete the key k5 from the last figure. This key can be replaced by either of the keys k6 or k8, and then that leaf node may be deleted.
All the above operations are performed in O(h) time where h is the height of the tree. The maximum height of a digital search tree can be p+1, where p is the number of bits in the key, and so the tree remains balanced.
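The insertion procedure described above can be sketched in C as follows. This is an illustrative implementation, not the book's code; keys are taken as 7-bit unsigned values (matching the example) and bits are examined from the most significant end:

```c
#include <stdlib.h>

#define BITS 7   /* number of bits p in each key */

struct dnode
{
	unsigned key;
	struct dnode *left, *right;
};

/* Insert key into a digital search tree: at each level we branch on
   the next bit from the left; an equal key means a duplicate, a NULL
   pointer marks the place to insert. Returns the (possibly new) root. */
struct dnode *dst_insert(struct dnode *root, unsigned key)
{
	struct dnode **pp = &root;
	int i;
	for(i = BITS - 1; i >= -1; i--)
	{
		if(*pp == NULL)	/*found the insertion point*/
		{
			struct dnode *n = malloc(sizeof(struct dnode));
			n->key = key;
			n->left = n->right = NULL;
			*pp = n;
			break;
		}
		if((*pp)->key == key)	/*duplicate key, nothing to do*/
			break;
		if((key >> i) & 1)	/*bit 1: move right*/
			pp = &(*pp)->right;
		else			/*bit 0: move left*/
			pp = &(*pp)->left;
	}
	return root;
}
```

Inserting the first four keys of the example (k1 = 53, k2 = 17, k3 = 83, k4 = 101 in decimal) reproduces the shape described above: 17 becomes the left child of the root and 101 the right child of 83.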
Exercise
1. Draw all possible non similar binary trees having (i) 3 nodes (ii) 4 nodes.
2. Draw all possible binary trees of 3 nodes having preorder XYZ.
3. Draw all possible binary search trees of 3 nodes having key values 1, 2, 3.
4. Construct a BST by inserting the following data sequentially:
45 32 70 67 21 85 92 40
5. If we construct a binary search tree by inserting the following data sequentially, then what is the height of the tree formed?
82 45 91 38 70 40 61
If the binary search tree is constructed by inserting this data in sorted order, then what will be the height of that tree?
6. The preorder traversal of a binary search tree T is 23 12 11 9 6 45 32 67 56. What are the inorder and postorder traversals of the tree T?
7. Show a binary tree for which preorder and inorder traversals are same.
8. Show a binary tree for which postorder and inorder traversals are same.
9. Construct binary trees from inorder and preorder traversals.
(i) Inorder: 12 31 10 45 66    Preorder: 12 31 10 45 66
(ii) Inorder: 35 26 93 21 68    Preorder: 68 21 93 26 35
(iii) Inorder: 16 22 31 15 46 77 19    Preorder: 15 22 16 31 77 46 19
(iv) Inorder: 11 12 23 24 25 32 43 46 54 65    Preorder: 32 23 11 12 24 25 43 54 46 65
10. Construct binary trees from inorder and postorder traversals.
(i) Inorder: 12 31 10 45 66    Postorder: 12 31 10 45 66
(ii) Inorder: 35 26 93 21 68    Postorder: 68 21 93 26 35
(iii) Inorder: 20 19 24 8 11 13 6    Postorder: 20 24 19 11 6 13 8
(iv) Inorder: 4 5 6 11 19 23 43 50 54 98    Postorder: 4 6 5 19 11 50 98 54 43 23
11. For the following binary search trees, show the possible sequences in which the data was entered in these
trees.
(i) (ii) (iii)
12. Suppose a binary search tree is constructed by inserting the keys 1, 2, 3, 4, ..., n in some order.
(a) If there are x nodes in the right subtree of the root, which key was inserted first?
(b) If there are y nodes in the left subtree of the root, which key was inserted first?
13. We know that preorder and postorder traversals can’t uniquely define a binary tree. Show examples of binary trees that have the same preorder and postorder traversals.
14. Construct a binary search tree whose preorder traversal is-
67 34 12 45 38 60 80 78 95 90
15. Construct a binary search tree whose postorder traversal is-
10 11 40 48 44 32 65 73 88 77 72 56
16. Construct a full binary tree whose preorder is -
FBGICKL
17. Write a function that returns the size of a binary tree i.e. the total number of nodes in the tree.
18. Write a function that returns the total number of leaf nodes in a binary tree and displays the info part of each
leaf node.
19. Write a function to find the length of the shortest path from the root to a leaf node; this length is also known as the minimum height of the binary tree. For example, the minimum height of this tree is 3.
28. Write a function to find whether two binary trees are mirror images of each other or not.
29. Write a function to check whether a binary tree is binary search tree or not.
30. Write a recursive function that inputs a level number of a binary tree and returns the number of nodes at that
level.
31. Width of a binary tree is the number of nodes on the level that has maximum nodes. For example the width of the following binary tree is 4. Write a function that returns the width of a binary tree.
32. Write a function that inputs a level and displays nodes on that level from right to left.
33. Write a function to traverse a tree in spiral or zigzag order. The spiral traversal of tree in exercise 31 is
ACBDEFGJIH.
34. Draw an expression tree for the following algebraic expression and write the prefix and postfix forms of the expression by traversing the expression tree in preorder and postorder.
(a + b/c) - (d + e*f)
35. The level order traversal of a max heap is 50 40 30 25 16 23 20. What will be the level order traversal after inserting the elements 28, 43, 11?
36. Generate Huffman code for the letters a, b, c, d, e, f having frequencies 16, 8, 4, 2, 1, 1.
37. The following function tries to find out whether a BST is an AVL tree or not. Trace it and find out whether it gives correct output or not. Write the modified function if required.
A graph G = (V, E) is a collection of sets V and E where V is the collection of vertices and E is the collection of
edges. An edge is a line or arc connecting two vertices and it is denoted by a pair (i, j) where i, j belong to the
set of vertices V. A graph can be of two types - Undirected graph or Directed graph.
Acyclic graph - A graph that has no cycle is called an acyclic graph. Graphs G2, G4, G5, G6, G8 and G10 are
examples of acyclic graphs.
DAG - A directed acyclic graph is commonly referred to by its acronym, DAG. Graph G5 is an example of a DAG.
Figure 7.3
Source - A vertex, which has no incoming edges, but has outgoing edges, is called a source. The indegree of a source is zero. In graph G8, vertices v0 and v5 are sources.
Sink - A vertex, which has no outgoing edges but has incoming edges, is called a sink. The outdegree of a sink
is zero. In graph G8, vertex v3 is a sink.
Pendant vertex - A vertex in a digraph is said to be pendant if its indegree is equal to 1 and outdegree is equal to 0.
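These degree-based definitions translate directly into code. The sketch below assumes a digraph stored in an adjacency matrix (a representation introduced later in this chapter); the function names are our own, not the book's.

```c
#include <stdio.h>
#define MAX 100

/* indegree of v = number of edges coming into v = sum of column v */
int indegree(int adj[][MAX], int n, int v)
{
    int i, sum = 0;
    for (i = 0; i < n; i++)
        sum += adj[i][v];
    return sum;
}

/* outdegree of v = number of edges going out of v = sum of row v */
int outdegree(int adj[][MAX], int n, int v)
{
    int j, sum = 0;
    for (j = 0; j < n; j++)
        sum += adj[v][j];
    return sum;
}

/* Print the category of each vertex of a digraph */
void classify(int adj[][MAX], int n)
{
    int v;
    for (v = 0; v < n; v++) {
        int in = indegree(adj, n, v), out = outdegree(adj, n, v);
        if (in == 0 && out > 0)
            printf("%d is a source\n", v);
        if (out == 0 && in > 0)
            printf("%d is a sink\n", v);
        if (in == 1 && out == 0)
            printf("%d is a pendant vertex\n", v);
        if (in == 0 && out == 0)
            printf("%d is an isolated vertex\n", v);
    }
}
```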
Isolated vertex - If the degree of a vertex is 0, then it is called an isolated vertex. In graph G2, vertex v1 is an isolated vertex.
Successor and predecessor - In a digraph, if a vertex v is adjacent to vertex u, then v is said to be the successor of u, and u is said to be the predecessor of v. In graph G8, v0 is the predecessor of v1, while v1 is the successor of v0.
Maximum edges in a graph - If n is the total number of vertices in a graph, then an undirected graph can have maximum n(n-1)/2 edges and a digraph can have maximum n(n-1) edges. For example an undirected graph with 3 vertices can have maximum 3 edges, and an undirected graph with 4 vertices can have maximum 6 edges. A digraph with 3 vertices can have maximum 6 edges and a digraph with 4 vertices can have maximum 12 edges.
Complete graph - A graph is complete if every vertex in the graph is adjacent to all the other vertices of the graph, or we can say that there is an edge between every pair of vertices in the graph. A complete graph contains the maximum number of edges, so an undirected complete graph with n vertices will have n(n-1)/2 edges and a directed complete graph with n vertices will have n(n-1) edges. Graph G1 is a complete undirected graph and graph G11 is a complete directed graph.
Multiple edges - If there is more than one edge between a pair of vertices then the edges are known as multiple edges or parallel edges. In graph G3, there are multiple edges between vertices v0 and v1.
Loop - An edge is called a loop or self edge if it starts and ends on the same vertex. Graph G4 has a loop at vertex v1.
Multigraph - A graph which contains loops or multiple edges is known as a multigraph. Graphs G3 and G4 are multigraphs.
Simple graph - A graph which does not have loop or multiple edges is known as simple graph.
Regular graph - A graph is regular if every vertex is adjacent to the same number of vertices. Graph G1 is regular since every vertex is adjacent to 3 vertices.
Planar graph - A graph is called planar if it can be drawn in a plane without any two edges intersecting. Graph G1 is not a planar graph, while graphs G2, G3, G4 are planar graphs.
Null graph - A graph which has only isolated vertices is called null graph.
7.4.3 Bridge
If on removing an edge from a connected graph, the graph becomes disconnected, then that edge is called a bridge. Consider the following graph; we will remove the edges one by one from it and see if the graph becomes disconnected.
Connected Graph G
Remove edge (v2, v3) from G: becomes disconnected. Remove edge (v2, v4) from G: remains connected. Remove edge (v4, v5) from G: becomes disconnected.
Figure 7.7
From the figure 7.7, we find that removal of edge (v2, v3) and removal of edge (v4, v5) makes the graph disconnected, so these edges are bridges.
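The edge-by-edge test of figure 7.7 can be written as a brute-force C sketch: delete each edge, check connectivity with a depth first search, and restore the edge. This assumes an undirected graph kept in a global adjacency matrix (a representation introduced later in this chapter); the function names are illustrative, and faster bridge-finding algorithms based on DFS numbering exist but are not covered here.

```c
#include <stdio.h>
#include <string.h>
#define MAX 100

int adj[MAX][MAX];   /* adjacency matrix of an undirected graph */
int n;               /* number of vertices                      */

static void dfs(int v, int visited[])
{
    int w;
    visited[v] = 1;
    for (w = 0; w < n; w++)
        if (adj[v][w] && !visited[w])
            dfs(w, visited);
}

/* Returns 1 if every vertex is reachable from vertex 0 */
int is_connected(void)
{
    int v, visited[MAX];
    memset(visited, 0, sizeof(visited));
    dfs(0, visited);
    for (v = 0; v < n; v++)
        if (!visited[v])
            return 0;
    return 1;
}

/* An edge is a bridge if its removal disconnects the graph */
void find_bridges(void)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (adj[i][j]) {
                adj[i][j] = adj[j][i] = 0;   /* remove edge (i, j) */
                if (!is_connected())
                    printf("(%d, %d) is a bridge\n", i, j);
                adj[i][j] = adj[j][i] = 1;   /* put the edge back  */
            }
}
```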
If on removing a vertex from a connected graph, the graph becomes disconnected, then that vertex is called an articulation point or a cut vertex. Consider the graph in figure 7.8; we will remove the vertices one by one from it and see if the graph becomes disconnected.
From the figure 7.8, we find that removal of vertex v2 and removal of vertex v4 makes the graph disconnected, so these are the articulation points of this graph.
Connected Graph G
Figure 7.8
Figure 7.9
Figure 7.10
Figure 7.11
maximal strongly connected subgraph. The following figure shows some graphs with their strongly connected
components.
Weakly connected - A digraph is weakly connected if the underlying undirected graph, obtained by ignoring the directions of the edges, is connected. In the following figure, the first graph is weakly connected while the second one is not.
7.6 Tree
An undirected connected graph T is called a tree if there are no cycles in it. There is exactly one simple path between any two vertices u and v of T. If there were more than one path between a pair of vertices, it would mean that there is a cycle in the graph, and if there were no path between some pair of vertices, it would mean that the graph is not connected. So according to the definition of a tree, there will be exactly one simple path between any pair of vertices of the tree. A tree with n vertices will have exactly n-1 edges. The following are some examples of trees.
Figure 7.16
The following examples are graphs which are not trees.
Figure 7.17
In figure 7.17, the first graph is not a tree as it is not connected, the next two graphs are not trees as they are
cyclic, and the last graph is not a tree as it is not connected and is cyclic.
If any edge is removed from a tree, then the graph will not remain connected, i.e. all edges in a tree are
bridges. If any edge is added to the tree then a simple cycle is formed.
7.7 Forest
A forest is a disjoint union of trees. In a forest there is at most one path between any two vertices; this means that there is either no path or a single path between any two vertices.
Figure 7.18
The first forest in the figure 7.18 consists of 3 trees, and the second forest consists of 5 trees.
(b) A Spanning forest of Graph shown in (a)
Figure 7.20
Hence, all the entries of this matrix are either 1 or 0. Let us take a directed graph and write the adjacency matrix for it.
[A directed graph with 4 vertices and its adjacency matrix]

       v0  v1  v2  v3
  v0    0   1   0   1
  v1    1   0   1   1
  v2    0   0   0   1
  v3    1   0   1   0
Here the matrix entry A(0,1) = 1, which means that there is an edge in the graph from vertex v0 to vertex v1. Similarly A(2,0) = 0, which means that there is no edge from vertex v2 to vertex v0. In the adjacency matrix of a directed graph, the rowsum of a vertex represents its outdegree and the columnsum represents its indegree. For example, from the above matrix we can see that the rowsum of vertex v1 is 3, which is its outdegree, and its columnsum is 1, which is its indegree.
Let us take an undirected graph and write the adjacency matrix for it.
[An undirected graph with 4 vertices and its adjacency matrix]

       v0  v1  v2  v3
  v0    0   1   1   1
  v1    1   0   0   1
  v2    1   0   0   0
  v3    1   1   0   0
Figure 7.22
In an undirected graph if there is an edge from i to j, then there will also be an edge from j to i, i.e. A(i,j) = A(j,i) for every i and j. Hence the adjacency matrix for an undirected graph will be a symmetric matrix. In an undirected graph, the rowsum and columnsum for a vertex are equal and represent the degree of that vertex.
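These two observations give a quick way to validate a matrix that is supposed to represent an undirected graph, and to read off degrees. A small sketch; the function names are our own:

```c
#define MAX 100

/* An undirected graph must have a symmetric adjacency matrix */
int is_symmetric(int a[][MAX], int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < i; j++)
            if (a[i][j] != a[j][i])
                return 0;
    return 1;
}

/* Degree of vertex v = its rowsum (equal to its columnsum) */
int degree(int a[][MAX], int n, int v)
{
    int j, sum = 0;
    for (j = 0; j < n; j++)
        sum += a[v][j];
    return sum;
}
```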
If a graph has some weights on its edges then the elements of the adjacency matrix can be defined as A(i,j) = w(i,j) if there is an edge of weight w(i,j) from vertex i to vertex j, and A(i,j) = 0 otherwise.
Let us take a directed weighted graph and write the weighted adjacency matrix for it.
(a) Weighted Directed Graph (b) Weighted Adjacency Matrix for graph (a)
Figure 7.23
Here all the non-zero elements of matrix represent the weight on the corresponding edge.
We know that in C, we can represent a matrix by a two dimensional array, where the first subscript represents the row and the second subscript represents the column of that matrix.
Suppose we have n vertices in a graph, and these vertices are represented by integers from 0 to n-1. The adjacency matrix of this graph can be maintained with a 2-dimensional integer array adj[n][n].
(a) A Directed Graph (b) Adjacency Matrix (c) Adjacency Matrix maintained in a 2-d array
Figure 7.24
This adjacency matrix is maintained in the array adj[4][4]. The following program shows how to create and display the adjacency matrix of a graph.
/*P7.1 Program to create and display the adjacency matrix of a graph*/
#include<stdio.h>
#define MAX 100

int adj[MAX][MAX];
int n;

main()
{
	int max_edges,i,j,origin,destin;
	int graph_type;
	printf("Enter 1 for undirected graph or 2 for directed graph : ");
	scanf("%d",&graph_type);
	printf("Enter number of vertices : ");
	scanf("%d",&n);
	if(graph_type==1)
		max_edges = n*(n-1)/2;
	else
		max_edges = n*(n-1);
	for(i=1; i<=max_edges; i++)
	{
		printf("Enter edge %d(-1 -1 to quit) : ",i);
		scanf("%d %d",&origin,&destin);
		if((origin==-1) && (destin==-1))
			break;
		if(origin>=n || destin>=n || origin<0 || destin<0)
		{
			printf("Invalid edge!\n");
			i--;
		}
		else
		{
			adj[origin][destin] = 1;
			if(graph_type==1) /*In an undirected graph, edge (i,j) implies edge (j,i)*/
				adj[destin][origin] = 1;
		}
	}/*End of for*/
	printf("The adjacency matrix is :\n");
	for(i=0; i<=n-1; i++)
	{
		for(j=0; j<=n-1; j++)
			printf("%4d",adj[i][j]);
		printf("\n");
	}
}/*End of main()*/
Insertion of an edge (i, j) requires changing the value of adj[i][j] from 0 to 1.
     0 1 2 3                             0 1 2 3
 0   0 1 0 1                         0   0 1 0 1
 1   1 0 1 1   Add new edge (3, 1)   1   1 0 1 1
 2   0 0 0 1   ------------------>   2   0 0 0 1
 3   1 0 1 0                         3   1 1 1 0
Initially there was no edge from vertex 3 to vertex 1, so there was 0 in the 4th row 2nd column. After insertion
of edge (3, 1), this 0 changes to 1.
Deletion of an edge (i, j) requires changing the value of adj[i][j] from 1 to 0.
Initially there exists an edge from vertex 1 to vertex 2, so there is 1 in the 2nd row 3rd column. After deletion of
this edge, this 1 changes to 0.
/*P7.2 Program for addition and deletion of edges in a directed graph using adjacency matrix*/
#include<stdio.h>
			printf("Invalid edge!\n");
			i--;
		}
		else
			adj[origin][destin] = 1;
	}/*End of for*/
}/*End of create_graph()*/
void insert_edge(int origin,int destin)
{
	if(origin<0 || origin>=n)
	{
		printf("Origin vertex does not exist\n");
		return;
	}
	if(destin<0 || destin>=n)
	{
		printf("Destination vertex does not exist\n");
		return;
	}
	adj[origin][destin] = 1;
}/*End of insert_edge()*/
void del_edge(int origin,int destin)
{
	if(origin<0 || origin>=n || destin<0 || destin>=n || adj[origin][destin]==0)
	{
		printf("This edge does not exist\n");
		return;
	}
	adj[origin][destin] = 0;
}/*End of del_edge()*/

void display()
{
	int i,j;
	for(i=0; i<n; i++)
	{
		for(j=0; j<n; j++)
			printf("%4d",adj[i][j]);
		printf("\n");
	}
}/*End of display()*/
In figure 7.25(b), on the left we have a linked list (vertically drawn) of the vertices of the graph. In the graph we have four vertices so there are four nodes in the linked list of vertices. For each vertex we have a separate list which stores pointers to adjacent vertices. For example the vertices 0, 2, 3 are adjacent to vertex 1, i.e. there are edges from vertex 1 to these vertices. So in the linked list of vertex 1, we have three nodes containing pointers to vertices 0, 2, 3. There is only one vertex adjacent to vertex 0, so in the list of vertex 0 there is only one node and it contains a pointer to vertex 1. In the figure, p_0, p_1, p_2, p_3 represent pointers to nodes 0, 1, 2, 3 respectively.
We can have a similar adjacency list for undirected graphs also. In the case of an undirected graph, the space
requirement doubles, since each edge appears in two lists. The following figure shows an undirected graph and
its adjacency list structure.
Figure 7.27
7.10.2.2 Edge insertion
Insertion of an edge requires insertion operation in the list of the starting vertex of edge. Suppose we want to
add an edge (2, 0) in the graph of figure 7.25. For this we have to add a node in the edge list of vertex 2, and
this new node will contain a pointer to vertex 0.
Figure 7.28
This was the procedure of edge insertion in a directed graph. In an undirected graph, the insertion operation has to be done in the lists of both the start and end vertices of the edge.
Figure 7.29
In an undirected graph the deletion operation has to be performed in lists of both start and end vertices of the
edge.
Figure 7.30
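The program P7.3 below manipulates two self-referential structures whose declarations do not appear in this excerpt. Judging from how the members are used, they would look something like this (a reconstruction, not the book's exact listing):

```c
#include <stdlib.h>

/* One node per vertex in the vertical list of vertices */
struct Vertex {
    int info;                   /* label of the vertex             */
    struct Vertex *nextVertex;  /* next vertex in the vertex list  */
    struct Edge *firstEdge;     /* head of this vertex's edge list */
};

/* One node per outgoing edge in a vertex's edge list */
struct Edge {
    struct Vertex *destVertex;  /* pointer to the destination vertex */
    struct Edge *nextEdge;      /* next edge in the edge list        */
};

struct Vertex *start = NULL;    /* head of the list of vertices */
```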
/*P7.3 Program for insertion and deletion of vertices and edges in a directed graph*/
exit (1);
default:
printf ("Wrong choice\n") ;
break;
}/*End of switch*/
}/*End of while*/
}/*End of main() */
void insertVertex(int u)
{
	struct Vertex *tmp,*ptr;
	tmp = malloc(sizeof(struct Vertex));
	tmp->info = u;
	tmp->nextVertex = NULL;
	tmp->firstEdge = NULL;
	if(start==NULL)
	{
		start = tmp;
		return;
	}
	ptr = start;
	while(ptr->nextVertex!=NULL)
		ptr = ptr->nextVertex;
	ptr->nextVertex = tmp;
}/*End of insertVertex()*/
void deleteVertex(int u)
{
	struct Vertex *tmp,*q;
	struct Edge *p,*temporary;
	if(start==NULL)
	{
		printf("No vertices to be deleted\n");
		return;
	}
	if(start->info==u) /*Vertex to be deleted is first vertex of list*/
	{
		tmp = start;
		start = start->nextVertex;
	}
	else /*Vertex to be deleted is in between or at last*/
	{
		q = start;
		while(q->nextVertex!=NULL)
		{
			if(q->nextVertex->info==u)
				break;
			q = q->nextVertex;
		}
		if(q->nextVertex==NULL)
		{
			printf("Vertex not found\n");
			return;
		}
		else
		{
			tmp = q->nextVertex;
			q->nextVertex = tmp->nextVertex;
		}
	}
	/*Before freeing the node tmp, free all edges going from this vertex*/
	p = tmp->firstEdge;
	while(p!=NULL)
	{
		temporary = p;
		p = p->nextEdge;
		free(temporary);
	}
	free(tmp);
}/*End of deleteVertex()*/
void deleteIncomingEdges(int u)
{
	struct Vertex *ptr;
	struct Edge *q,*tmp;
	ptr = start;
	while(ptr!=NULL)
	{
		if(ptr->firstEdge==NULL) /*Edge list for vertex ptr is empty*/
		{
			ptr = ptr->nextVertex;
			continue; /*continue searching in other edge lists*/
		}
		if(ptr->firstEdge->destVertex->info==u)
		{
			tmp = ptr->firstEdge;
			ptr->firstEdge = ptr->firstEdge->nextEdge;
			free(tmp);
			continue; /*re-examine the new first edge of this list*/
		}
		q = ptr->firstEdge;
		while(q->nextEdge!=NULL)
		{
			if(q->nextEdge->destVertex->info==u)
			{
				tmp = q->nextEdge;
				q->nextEdge = tmp->nextEdge;
				free(tmp);
				continue;
			}
			q = q->nextEdge;
		}
		ptr = ptr->nextVertex;
	}/*End of while*/
}/*End of deleteIncomingEdges()*/
struct Vertex *findVertex(int u)
{
	struct Vertex *ptr,*loc;
	ptr = start;
	while(ptr!=NULL)
	{
		if(ptr->info==u)
		{
			loc = ptr;
			return loc;
		}
		else
			ptr = ptr->nextVertex;
	}
	loc = NULL;
	return loc;
}/*End of findVertex()*/
void insertEdge(int u,int v)
{
	struct Vertex *locu,*locv;
	struct Edge *ptr,*tmp;
	locu = findVertex(u);
	locv = findVertex(v);
	if(locu==NULL)
	{
		printf("%d ->",ptr->info);
		q = ptr->firstEdge;
		while(q!=NULL)
		{
			printf(" %d",q->destVertex->info);
			q = q->nextEdge;
		}
		printf("\n");
		ptr = ptr->nextVertex;
	}
}/*End of display()*/
Thus the path matrix of a graph G is actually the adjacency matrix of its transitive closure G’. The path
matrix of a graph is also known as the transitive closure matrix of the graph.
We will study two methods to compute the path matrix. The first method is by using powers of adjacency
matrix and the second one is Warshall’s algorithm. Here are some inferences that we can draw by looking at the
path matrix-
(i) If the element P[i][j] is equal to 1, there is a path from vertex i to vertex j.
(ii) If any main diagonal element, i.e. any element P[i][i], of the path matrix is 1, the graph contains a cycle.
Figure 7.31
Now we compute the matrix AM2 by multiplying the adjacency matrix A with itself.

AM2 = A*A = A^2 =
        0  1  2  3
   0    2  0  2  1
   1    1  1  1  2
   2    1  0  1  0
   3    0  1  0  2
In this matrix, the value of AM2[i][j] represents the number of paths of path length 2 from vertex i to vertex j. For example vertex 0 has two paths of path length 2 to vertex 2, and vertex 3 has two paths of path length 2 to itself; vertex 2 has one path of path length 2 to vertex 0, and there is no path of path length 2 from vertex 2 to vertex
1. These paths may not be simple paths, i.e. all vertices in these paths need not be distinct. Now we compute the matrix AM3 by multiplying the adjacency matrix A with AM2.

AM3 = AM2*A = A^3 =
        0  1  2  3
   0    1  2  1  4
   1    3  1  3  3
   2    0  1  0  2
   3    3  0  3  1
Here AM3[i][j] represents the number of paths of path length 3 from vertex i to vertex j. For example, vertex 0 has 4 paths of path length 3 to vertex 3 (paths 0-3-2-3, 0-3-0-3, 0-1-2-3, 0-1-0-3) and vertex 3 has no path of path length 3 to vertex 1. Similarly, we can find out the matrix AM4.
AM4 = AM3*A = A^4 =
        0  1  2  3
   0    6  1  6  4
   1    4  3  4  7
   2    3  0  3  1
   3    1  3  1  6
In general we can say that if AMk is equal to A^k, then any element AMk[i][j] represents the number of paths of path length k from vertex i to vertex j.
Let us define a matrix X where
X = AM1 + AM2 + .............. + AMn
X[i][j] denotes the number of paths, of path length n or less than n, from vertex i to vertex j. Here n is the total number of vertices in the graph.
For the graph in figure 7.31, the value of X will be-
        0  1  2  3
   0    9  4  9 10
   1    9  5  9 13
   2    4  1  4  4
   3    5  4  5  9
From the definition of path matrix we know that P[i][j] = 1 if there is a path from i to j, and this path can have length n or less than n. Now in the matrix X, if we replace all nonzero entries by 1 then we will get the path matrix or reachability matrix.
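That final step - replacing each nonzero entry of X by 1 - is a short double loop. A sketch (the function name is ours):

```c
#define MAX 100

/* Derive the path (reachability) matrix P from the matrix X of path counts */
void to_path_matrix(int X[][MAX], int P[][MAX], int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            P[i][j] = (X[i][j] != 0) ? 1 : 0;
}
```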
P =
        0  1  2  3
   0    1  1  1  1
   1    1  1  1  1
   2    1  1  1  1
   3    1  1  1  1

This graph is strongly connected since all the entries are equal to 1.
/*P7.4 Program to find out the path matrix by powers of adjacency matrix*/
#include<stdio.h>
#define MAX 100
void display(int matrix[MAX] [MAX]);
void pow_matrix(int p,int adjp[MAX] [MAX]);
void multiply(int mat1 [MAX] [MAX], int mat2[MAX] [MAX],int mat3 [MAX] [MAX]);
void create_graph() ;
int adj [MAX] [MAX];
int n;
main()
{
/*This function computes the pth power of matrix adj and stores the result in adjp*/
void pow_matrix(int p,int adjp[MAX][MAX])
{
	int i,j,k,tmp[MAX][MAX];
	/*Initially adjp is equal to adj*/
	for(i=0; i<n; i++)
		for(j=0; j<n; j++)
			adjp[i][j] = adj[i][j];
	for(k=1; k<p; k++)
	{
		/*Multiply adjp with adj and store the result in tmp*/
		multiply(adjp,adj,tmp);
		for(i=0; i<n; i++)
			for(j=0; j<n; j++)
				adjp[i][j] = tmp[i][j]; /*New adjp is equal to tmp*/
	}
}/*End of pow_matrix()*/
/*This function multiplies mat1 and mat2 and stores the result in mat3*/
void multiply(int mat1[MAX][MAX],int mat2[MAX][MAX],int mat3[MAX][MAX])
{
	int i,j,k;
	for(i=0; i<n; i++)
		for(j=0; j<n; j++)
		{
			mat3[i][j] = 0;
			for(k=0; k<n; k++)
				mat3[i][j] = mat3[i][j] + mat1[i][k] * mat2[k][j];
		}
}/*End of multiply()*/
void display(int matrix[MAX][MAX])
{
	int i,j;
	for(i=0; i<n; i++)
	{
		for(j=0; j<n; j++)
			printf("%4d",matrix[i][j]);
		printf("\n");
	}
}/*End of display()*/
Pk[i][j] = 1   If there is a simple path from vertex i to vertex j which does not use any intermediate vertex numbered greater than k, i.e. all intermediate vertices belong to the set {0, 1, ......, k}
Pk[i][j] = 0   Otherwise

P-1[i][j] = 1  If there is a simple path from vertex i to vertex j which does not use any intermediate vertex.
P0[i][j] = 1   If there is a simple path from vertex i to vertex j which does not use any other intermediate vertex except possibly vertex 0.
P1[i][j] = 1   If there is a simple path from vertex i to vertex j which does not use any other intermediate vertices except possibly 0, 1.
P2[i][j] = 1   If there is a simple path from vertex i to vertex j which does not use any other intermediate vertices except possibly 0, 1, 2.
............
Pk[i][j] = 1   If there is a simple path from vertex i to vertex j which does not use any other intermediate vertices except possibly 0, 1, ......, k.
............
Pn-1[i][j] = 1 If there is a simple path from vertex i to vertex j which does not use any other intermediate vertices except possibly 0, 1, ......, n-1.
Here P-1 represents the adjacency matrix and Pn-1 represents the path matrix; let us see why.
P-1[i][j] = 1 if there is a simple path from vertex i to vertex j which does not use any intermediate vertex. The only way to go from i to j without using any intermediate vertex is to go directly from i to j. Hence P-1[i][j] = 1 means that there is an edge from i to j. So P-1 will be the adjacency matrix.
Pn-1[i][j] = 1 if there is a simple path from i to j which does not use any intermediate vertices except 0, 1, ......, n-1. There are n vertices in total, which means this path can use all n vertices; hence from the definition of path matrix we observe that Pn-1 is the path matrix.
We know that P-1 is equal to the adjacency matrix, which we can easily find out from the graph. We have to find the matrices P0, P1, ........., Pn-1. If we know how to find matrix Pk from matrix Pk-1 then we can find all these matrices. So now let us see how to find the value of Pk[i][j] by looking at the matrix Pk-1.
If Pk-1[i][j] is 1, Pk[i][j] will also be 1; let us see why.
Pk-1[i][j] = 1 implies that there is a simple path (say P1) from i to j which does not use any intermediate vertices except possibly 0, 1, ......, k-1, or we can say that this path does not use any vertices numbered higher than k-1. So it is obvious that this path does not use any vertices numbered higher than k either, i.e. it does not use any intermediate vertices except possibly 0, 1, ......, k. So Pk[i][j] will be equal to 1.
[Figure: on the left, no path from i to j uses intermediate vertices only from the set {0,1,......,k-1}; on the right, a path P2 from i to j with intermediate vertices from the set {0,1,......,k} passes through vertex k and is broken into two paths P2a (from i to k) and P2b (from k to j), each with intermediate vertices from the set {0,1,......,k-1}, so that Pk[i][j] = 1 while Pk-1[i][k] = 1 and Pk-1[k][j] = 1.]
For the existence of path P2, the paths P2a and P2b should exist, i.e. for Pk[i][j] to be 1 when Pk-1[i][j] is zero, the values of Pk-1[i][k] and Pk-1[k][j] should be 1. We can conclude that if Pk-1[i][j] = 0 then Pk[i][j] can be equal to 1 only if Pk-1[i][k] = 1 and Pk-1[k][j] = 1.
So we have two situations when Pk[i][j] can be 1:
1. Pk-1[i][j] = 1, or
2. Pk-1[i][k] = 1 and Pk-1[k][j] = 1
It is clear that if an entry is 1 in matrix P-1, then it will also be 1 in P0. So we can just copy all the 1's and see if the zero entries of P-1 can be changed to 1 in P0. Changing a zero entry to 1 implies that we get a path if we use 0 as the intermediate vertex.
P0 =
        0  1  2  3
   0    0  1  0  1
   1    1  1  1  1
   2    0  0  0  1
   3    1  1  1  1

P1 =
        0  1  2  3
   0    1  1  1  1
   1    1  1  1  1
   2    0  0  0  1
   3    1  1  1  1

* Find P2[2][1]
P1[2][1] = 0, so look at P1[2][2] and P1[2][1]; both are zero, hence P2[2][1] = 0.

P2 =
        0  1  2  3
   0    1  1  1  1
   1    1  1  1  1
   2    0  0  0  1
   3    1  1  1  1

P3 =
        0  1  2  3
   0    1  1  1  1
   1    1  1  1  1
   2    1  1  1  1
   3    1  1  1  1
Here P-1 is the adjacency matrix and P3 is the path matrix of the graph. In the program all the calculation can be done in place using a single two-dimensional array P.
/*P7.5 Program to find path matrix by Warshall’s algorithm*/
#include<stdio.h>
#define MAX 100
void display(int matrix[MAX] [MAX], int n);
int adj[MAX][MAX];
int n;
void create_graph(); ?
main()
{
	int i,j,k;
	int P[MAX][MAX];
	create_graph();
	printf("The adjacency matrix is :\n");
	display(adj,n);
	for(i=0; i<n; i++)
		for(j=0; j<n; j++)
			P[i][j] = adj[i][j];
	for(k=0; k<n; k++)
	{
		for(i=0; i<n; i++)
			for(j=0; j<n; j++)
				P[i][j] = (P[i][j] || (P[i][k] && P[k][j]));
		printf("P%d is :\n",k);
		display(P,n);
	}
	printf("P%d is the path matrix of the given graph\n",k-1);
}/*End of main()*/
void display(int matrix[MAX][MAX],int n)
{
	int i,j;
	for(i=0; i<n; i++)
	{
		for(j=0; j<n; j++)
			printf("%3d",matrix[i][j]);
		printf("\n");
	}
}/*End of display()*/
void create_graph()
{
			printf("Invalid edge!\n");
			i--;
		}
		else
			adj[origin][destin] = 1;
	}/*End of for*/
}/*End of create_graph()*/
7.12 Traversal
Traversal in a graph is different from traversal in a tree or list because of the following reasons-
(a) There is no first vertex or root vertex in a graph, hence the traversal can start from any vertex. We can choose any arbitrary vertex as the starting vertex. A traversal algorithm will produce different sequences for different starting vertices.
(b) In a tree or list, when we start traversing from the first vertex, all the elements are visited, but in a graph only those vertices will be visited which are reachable from the starting vertex. So if we want to visit all the vertices of the graph, we have to select another starting vertex from the remaining vertices in order to visit all the vertices left.
(c) In a tree or list while traversing, we never encounter a vertex more than once, while in a graph we may reach a vertex more than once. This is because a graph may have cycles and there may be more than one path to reach a vertex. So to ensure that each vertex is visited only once, we have to keep the status of each vertex, whether it has been visited or not.
(d) In a tree or list we have unique traversals. For example if we are traversing a binary tree in inorder there can be only one sequence in which vertices are visited. But in a graph, for the same technique of traversal there can be different sequences in which vertices can be visited. This is because there is no natural order among the successors of a vertex, and thus the successors may be visited in different orders producing different sequences. The order in which successors are visited may depend on the implementation.
Like binary trees, there can be many methods by which a graph can be traversed, but two of them are standard and are known as breadth first search and depth first search.
Figure 7.32
Let us take vertex 0 as the starting vertex. First we will visit the vertex 0. Then we will visit all vertices adjacent to vertex 0, i.e. 1, 3, 4. Here we can visit these three vertices in any order. Suppose we visit the vertices in order 1, 3, 4. Now the traversal is-
0 1 3 4
Now first we visit all the vertices adjacent to 1, then all the vertices adjacent to 3 and then all the vertices
adjacent to 4. So first we will visit 2, then 6 and then 5, 7. Note that vertex 4 is adjacent to vertices 1 and 3, but
it has already been visited so we’ve ignored it. Now the traversal is -
0 1 3 4 2 6 5 7
Now we will visit one by one all the vertices adjacent to vertices 2,6,5,7. We can see that vertex 5 is adjacent
to vertex 2, but it has already been visited so we will just ignore it and proceed further. Now vertices adjacent to
. vertex 6 are vertices 4 and 7 which have already been visited so ignore them also. Vertex 5 has no adjacent
vertices. Vertex 7 has vertices 5 and 8 adjacent to it out of which vertex 8 has not been visited, so visit vertex 8.
Now the traversal is-
0 1 3 4 2 6 5 7 8
Now we have to visit vertices adjacent to vertex 8 but there is no vertex adjacent to vertex 8 so our procedure
stops.
This was the traversal when we take vertex 0 as the starting vertex. Suppose we take vertex 1 as the starting
vertex. Then applying the above technique, we will get the following traversal-
1 2 4 5 7 8
Here are different traversals when we take different starting vertices.
Start Vertex    Traversal
0               0 1 3 4 2 6 5 7 8
1               1 2 4 5 7 8
2               2 5
3               3 4 6 5 7 8
4               4 5 7 8
5               5
6               6 4 7 5 8
7               7 5 8
8               8
Note that these traversals are not unique; there can be different traversals depending on the order in which we visit the successors.
We can see that all the vertices are not visited in some cases. The vertices which are visited are those vertices
which are reachable from starting vertex. So to make sure that all the vertices are visited we need to repeat the
same procedure for each unvisited vertex in the graph. Breadth first search is implemented through queue.
Let us take vertex 0 as the starting vertex for traversal in the graph of figure 7.32. In each step we will show the traversal and the contents of the queue. In the figure, the different states of the vertices are shown by different colors. White color indicates initial state, grey indicates waiting state and black indicates visited state.
Here we can/see why we have taken the concept of waiting state. Vertex 4 is in waiting state i.e. it is already
present in the queue so it is not inserted into the queue. The concept of waiting state helps us avoid insertion of
duplicate vertices in the queue.
Here we can see why we have taken the concept of visited state. Vertex 4 is in visited state, i.e. it has already been included in the traversal, so there is no need to insert it into the queue. The concept of visited state helps us avoid visiting a vertex more than once.
Now the queue is empty so we will stop our process. This way we get the breadth first traversal when vertex 0 is taken as the starting vertex.
/*P7.6 Program for traversing a directed graph through BFS, visiting only those vertices that
are reachable from start vertex*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100
#define initial 1
#define waiting 2
#define visited 3
int n; /*Number of vertices in the graph*/
int adj [MAX] [MAX]; /*Adjacency Matrix*/
int state[MAX]; /*can be initial, waiting or visited*/
void create_graph();
void BF_Traversal();
void BFS(int v);
int queue[MAX],front = -1,rear = -1;
		}
	}
	printf("\n");
}/*End of BFS()*/
void insert_queue(int vertex)
{
	if(rear==MAX-1)
		printf("Queue Overflow\n");
	else
	{
		if(front==-1) /*If queue is initially empty*/
			front = 0;
		rear = rear+1;
		queue[rear] = vertex;
	}
}/*End of insert_queue()*/
int isEmpty_queue()
{
	if(front==-1 || front>rear)
		return 1;
	else
		return 0;
}/*End of isEmpty_queue()*/
int delete_queue()
{
	int del_item;
	if(front==-1 || front>rear)
	{
		printf("Queue Underflow\n");
		exit(1);
	}
	del_item = queue[front];
	front = front+1;
	return del_item;
}/*End of delete_queue()*/
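The body of BFS() itself is not shown in this excerpt. Using the same state values and an array queue, it can be sketched as a self-contained reconstruction, consistent with the walkthrough above (the simplified queue routines here stand in for those of P7.6):

```c
#include <stdio.h>
#define MAX 100
#define initial 1
#define waiting 2
#define visited 3

int adj[MAX][MAX], n;            /* adjacency matrix and vertex count */
int state[MAX];                  /* initial, waiting or visited       */
int queue[MAX], front = -1, rear = -1;

void insert_queue(int v) { if (front == -1) front = 0; queue[++rear] = v; }
int  isEmpty_queue(void) { return front == -1 || front > rear; }
int  delete_queue(void)  { return queue[front++]; }

void BFS(int v)
{
    int w;
    insert_queue(v);
    state[v] = waiting;
    while (!isEmpty_queue()) {
        v = delete_queue();
        printf("%d ", v);
        state[v] = visited;
        for (w = 0; w < n; w++)           /* examine all successors of v */
            if (adj[v][w] == 1 && state[w] == initial) {
                insert_queue(w);          /* not seen before: enqueue it */
                state[w] = waiting;
            }
    }
    printf("\n");
}
```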
Now suppose that while performing breadth first search, we assign two values to each vertex in the graph - a predecessor and a distance value. Whenever we insert a vertex in the queue we set its predecessor and distance values.
The predecessor of starting vertex is taken as NIL(-1). The distance value of starting vertex is taken as 0 and
the distance value of any other vertex is one more than the distance value of its predecessor. If we take the case
of example described in section 7.12.1.1, then the predecessor and distance values set in different steps would
be-
(a) Graph with predecessor values
(b) Graph with distance values
Figure 7.33
The distance value of a vertex u gives us the shortest distance (number of edges) of u from the starting
vertex. For example the shortest distance from vertex 0 to vertex 8 is 3. This shortest path can be obtained by
following the predecessor values till we get the start vertex as a predecessor. Let us see how we can get this
path in the case of 8. Predecessor of 8 is 7, predecessor of 7 is 4, predecessor of 4 is 0, so the shortest path is 0 4
7 8.
There may be other paths from vertex 0 to vertex 8, but the length of none of them would be less than 3. Hence breadth first search can be used to find the shortest distances to all vertices reachable from the start vertex in unweighted graphs. The length of the shortest path is given by the distance value, and the path itself can be obtained by following the predecessor values.
/*P7.6b Program for traversing a directed graph through BFS, and finding shortest distance and
shortest path of any vertex from start vertex*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100
#define infinity 9999
#define NIL -1
#define initial 1
#define waiting 2
#define visited 3
int n;    /*Number of vertices in the graph*/
int adj[MAX][MAX];    /*Adjacency Matrix*/
int state[MAX];    /*can be initial, waiting or visited*/
int distance[MAX];
int predecessor[MAX];
void create_graph() ;
void BF_Traversal();
void BFS(int v);
int queue[MAX],front = -1,rear = -1;
void insert_queue(int vertex) ;
int delete_queue();
int isEmpty_queue() ;
main()
{
    int i,v,count,path[MAX];
    create_graph();
    BF_Traversal();
    while(1)
    {
        printf("Enter destination vertex(-1 to quit) : ");
        scanf("%d",&v);
        if(v==-1)
            break;
        if(v<0 || v>n-1)
        {
            printf("This vertex does not exist\n");
            continue;
        }
        if(distance[v]==infinity)
        {
            printf("This vertex is not reachable from start vertex\n");
            continue;
        }
        /*Store the path by following the predecessor values*/
        count = 0;
        while(v!=NIL)
        {
            path[count++] = v;
            v = predecessor[v];
        }
        printf("Shortest distance is : %d\n",distance[path[0]]);
        printf("Shortest path is : ");
        for(i=count-1; i>=0; i--)
            printf("%d ",path[i]);
        printf("\n");
    }
}/*End of main()*/
In the program, we initialize the distance values of all vertices to infinity (a very large number), and the predecessor values are initialized to NIL(-1). Since our vertices start from 0, we can take the value of NIL equal to -1. In the function BF_Traversal(), we don't write the for loop, as we don't have to find out shortest distances and paths to vertices that are not reachable.
Now let’s look at the graph with predecessor values. If we draw only those edges which join a vertex to its
predecessor then we get the predecessor subgraph which is a spanning tree of the graph. This spanning tree is
called the breadth first search spanning tree.
Figure 7.34
The figure 7.34(a) shows the breadth first spanning tree. This figure is redrawn on the right so that it looks
like a tree with vertex 0 as the root.
If all vertices are not reachable from the start vertex, then we get a BFS spanning forest having more than one spanning tree. For example, if in the above graph we start traversing from vertex 4, then we get two spanning trees. Starting at vertex 4 we can visit only vertices 5, 7, 8, as these are the only vertices reachable from 4. After this we select vertex 0 as the next start vertex, and then the rest of the vertices are visited. So we get a BFS spanning forest consisting of two spanning trees.
void BFS(int v)
{
    int i;
    insert_queue(v);
    state[v] = waiting;
    while(!isEmpty_queue())
    {
        v = delete_queue();
        printf("Vertex %d visited\n",v);
        state[v] = visited;
        for(i=0; i<n; i++)
        {
            if(adj[v][i]==1 && state[i]==initial)
            {
                insert_queue(i);
                state[i] = waiting;
                printf("Tree edge - (%d,%d)\n",v,i);
            }
        }
    }
}/*End of BFS()*/
Breadth first search in an undirected graph is performed in the same manner as in a directed graph. Consider the
undirected graph given below.
As stated before, this traversal is not unique; there may be other traversals depending on the order of visiting of successors.
If the undirected graph is connected, then we can reach all the vertices taking any vertex as the start vertex.
If the graph is not connected, then we can visit only those vertices which are in the same connected component
as the start vertex. So we can pick another unvisited vertex and start traversing from there and continue this
procedure till all vertices are visited. The following figure shows a disconnected undirected graph and its BFS
spanning forest. There is a spanning tree corresponding to each connected component of the graph.
void BF_Traversal()
{
    int v;
    int connected = 1;
for (v=0; v<n; v++)
state[v]=initial;
BFS(0); /*start BFS from vertex 0*/
for(v=0; v<n; v++)
{
if (state[v]==initial)
{
connected = 0;
break;
        }
}
if (connected)
printf("Graph is connected\n") ;
else
printf("Graph is not connected\n");
}/*End of BF_Traversal()*/
We can find all the connected components using breadth first search. To find the connected components, all vertices in the graph are given a label such that vertices in the same component get the same label. For example, in the graph of figure 7.37(a) we have three connected components; all vertices in the first component have label 1, vertices in the second component have label 2, and vertices in the third component have label 3.
Figure 7.38
/*P7.6f Program to find connected components in an undirected graph*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100
#define initial 1
#define waiting 2
#define visited 3
int n;    /*Number of vertices in the graph*/
int adj[MAX][MAX];    /*Adjacency Matrix*/
int state[MAX];    /*can be initial, waiting or visited*/
int label[MAX];    /*Denotes the Component Number*/
void create_graph();
void BF_Traversal();
void BFS(int v, int component_Num);
int queue[MAX],front = -1,rear = -1;
void insert_queue(int vertex);
int delete_queue();
int isEmpty_queue();
main ()
{
create_graph() ;
BF_Traversal();
}/*End of main()*/
void BF_Traversal()
{
    int v,components = 0;
    for(v=0; v<n; v++)
        state[v] = initial;
    for(v=0; v<n; v++)
    {
        if(state[v]==initial)
        {
            components++;
            BFS(v,components);
        }
    }
    printf("Number of connected components = %d\n",components);
}/*End of BF_Traversal()*/
void BFS(int v, int component_Num)
{
    int i;
    insert_queue(v);
    state[v] = waiting;
    label[v] = component_Num;
    while(!isEmpty_queue())
    {
        v = delete_queue();
        state[v] = visited;
        for(i=0; i<n; i++)
        {
            if(adj[v][i]==1 && state[i]==initial)
            {
                insert_queue(i);
                state[i] = waiting;
                label[i] = component_Num;
            }
        }
    }
}/*End of BFS()*/
Figure 7.39
First, we will visit vertex 0. Vertices adjacent to vertex 0 are 1 and 3. Suppose we visit vertex 1. Now we look at the adjacent vertices of 1; from the two adjacent vertices 2 and 4 we choose to visit 2. Till now the traversal is -
0 1 2
There is no vertex adjacent to vertex 2, which means we have reached the end of the path, or a dead end from where we can't go forward. So we will move backward. We reach vertex 1 and see if there is any vertex adjacent to it that is not visited yet. Vertex 4 is such a vertex, and therefore we visit it. Now vertices 5 and 7 are adjacent to 4 and unvisited, and from these we choose to visit vertex 5. Till now the traversal is -
0 1 2 4 5
There is no vertex adjacent to vertex 5, so we will backtrack. We reach vertex 4, and its unvisited adjacent vertex is 7, so we visit it. Now vertex 8 is the only unvisited vertex adjacent to 7, so we visit it. Till now the traversal is -
0 1 2 4 5 7 8
Vertex 8 has no unvisited adjacent vertex, so we backtrack and reach vertex 7. Now vertex 7 also has no unvisited adjacent vertex, so we backtrack and reach vertex 4. Vertex 4 also has no unvisited adjacent vertex, so we backtrack and reach vertex 1. Vertex 1 also has no unvisited adjacent vertex, so we backtrack and reach vertex 0. Vertex 3 is adjacent to vertex 0 and is unvisited, so we visit vertex 3. Vertex 6 is adjacent to vertex 3 and is unvisited, so we visit vertex 6. Till now the traversal is -
0 1 2 4 5 7 8 3 6
Now vertex 6 has no unvisited adjacent vertex, so we backtrack and reach vertex 3. Vertex 3 also has no unvisited adjacent vertex, so we backtrack and reach vertex 0. Vertex 0 also has no unvisited adjacent vertex left, and it is the start vertex, so now we can't backtrack and hence our traversal finishes. The traversal is -
0 1 2 4 5 7 8 3 6
Depth first search can be implemented through a stack or recursively. In the stack based implementation, a vertex may be pushed on the stack even if it is already present there. This is different from BFS, where we never inserted a vertex in the queue that was in waiting state. In DFS, we don't have the concept of waiting state, and so there may be multiple copies of a vertex in the stack.
If we don't insert a vertex already present in the stack, then we will not be able to visit the vertices in depth first search order. For example, if we traverse the graph of figure 7.40 in this manner then we get the traversal as 0 1 2 4 7 8 9 5 6 3, which is clearly not in depth first order.
/*P7.7 Program for traversing a directed graph through DFS, visiting only vertices reachable
from start vertex*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100
#define initial 1
#define visited 2
int n;    /*Number of nodes in the graph*/
int adj[MAX][MAX];    /*Adjacency Matrix*/
int state[MAX];    /*Can be initial or visited*/
void DF_Traversal();
void DFS(int v);
void create_graph();
int stack[MAX];
int top = -1;
void push(int v);
int pop();
int isEmpty_stack();
main()
{
    create_graph();
    DF_Traversal();
}/*End of main()*/
void DF_Traversal()
{
    int v;
    for(v=0; v<n; v++)
        state[v] = initial;
    printf("Enter starting node for Depth First Search : ");
    scanf("%d",&v);
    DFS(v);
}/*End of DF_Traversal()*/
void DFS(int v)
{
    int i;
    push(v);
    while(!isEmpty_stack())
    {
        v = pop();
        if(state[v]==initial)
        {
            printf("%d ",v);
            state[v] = visited;
        }
        for(i=n-1; i>=0; i--)
        {
            if(adj[v][i]==1 && state[i]==initial)
                push(i);
        }
    }
}/*End of DFS()*/
void push(int v)
{
    if(top==(MAX-1))
    {
        printf("Stack Overflow\n");
        return;
    }
    top = top+1;
    stack[top] = v;
}/*End of push()*/
int pop()
{
    int v;
    if(top==-1)
    {
        printf("Stack Underflow\n");
        exit(1);
    }
    else
    {
        v = stack[top];
        top = top-1;
        return v;
    }
}/*End of pop()*/
int isEmpty_stack()
{
    if(top==-1)
        return 1;
    else
        return 0;
}/*End of isEmpty_stack()*/
void create_graph()
{
    int i,max_edges,origin,destin;
    printf("Enter number of nodes : ");
    scanf("%d",&n);
    max_edges = n*(n-1);
If all vertices are not reachable from the start vertex, then we need to repeat the procedure taking some other start vertex. This is similar to the process we followed in breadth first search. In the function DF_Traversal(), we will add a loop which will check the state of all vertices after the first DFS.
As in BFS, here also we can assign a predecessor to each vertex and get the predecessor subgraph, which would be a spanning tree or spanning forest of the given graph depending on the reachability of all vertices from the start vertex.
Figure 7.41 Graph with predecessor of each vertex, and the corresponding spanning tree
void DF_Traversal()
{
    int v;
    for(v=0; v<n; v++)
    {
        state[v] = initial;
        predecessor[v] = NIL;
    }
    for(v=0; v<n; v++)
    {
        if(state[v]==initial)
            DFS(v);
    }
}/*End of DF_Traversal()*/
For an undirected graph, the depth first search will proceed in the same manner. Like breadth first search, we
can use depth first search also to find whether a graph is connected or not and for finding all the connected
components.
The figure 7.42 shows the depth first search for the graph of figure 7.39; vertices which are visited are in grey color and vertices which are finished are in black color. We have also given two numbers to each vertex, discovery time and finishing time. The discovery time of a vertex is assigned when the vertex is first discovered and visited, i.e. when it becomes grey. The finishing time of a vertex is assigned when we backtrack from it, and it becomes black. As earlier, we have chosen to visit the successors in ascending order of their number, i.e. if a vertex has 1 and 3 as successors then first we will visit 1 and then 3.
Figure 7.42 Steps of depth first search with discovery and finishing times: visit 0, 1, 2; 2 finished, back to 1; visit 4, 5; 5 finished, back to 4; visit 7, 8; 8 finished, back to 7; 7 finished, back to 4; 4 finished, back to 1; 1 finished, back to 0; visit 3, 6; 6 finished, back to 3
If we number all these steps then we can get the discovery time and finishing time of all the vertices.
1. Call DFS(0), visit 0
2. 1 is adjacent to 0 and initial : Call DFS(1), visit 1
3. 2 is adjacent to 1 and initial : Call DFS(2), visit 2
4. Vertex 2 finished
5. 4 is adjacent to 1 and initial : Call DFS(4), visit 4
6. 5 is adjacent to 4 and initial : Call DFS(5), visit 5
7. Vertex 5 finished
8. 7 is adjacent to 4 and initial : Call DFS(7), visit 7
9. 8 is adjacent to 7 and initial : Call DFS(8), visit 8
10. Vertex 8 finished
11. Vertex 7 finished
12. Vertex 4 finished
13. Vertex 1 finished
14. 3 is adjacent to 0 and initial : Call DFS(3), visit 3
15. 6 is adjacent to 3 and initial : Call DFS(6), visit 6
16. Vertex 6 finished
17. Vertex 3 finished
18. Vertex 0 finished
We can make simple additions in our recursive algorithm for recording the discovery time and finishing time
of each vertex. We will take a variable time and initialize it to 0, and increment it whenever we visit a vertex or
finish a vertex. The discovery times and finishing times are stored in arrays d and f respectively.
void DFS(int v)
{
    int i;
    time++;
    d[v] = time;    /*discovery time*/
    state[v] = visited;
    printf("%d ",v);
    for(i=0; i<n; i++)
    {
        if(adj[v][i]==1)
        {
            if(state[i]==initial)
                DFS(i);
        }
    }
    state[v] = finished;
    f[v] = ++time;    /*finishing time*/
}/*End of DFS()*/
Let us take a directed graph and apply DFS on it taking 0 as the start vertex.
Figure 7.43
Vertices 1, 2, 3 are reachable from 0, so these are visited, and then we select 4 as the next start vertex. The vertices 5, 6, 7, 8, 9 are reachable from 4, so these are visited, and then we select 10 as the next start vertex. All the remaining vertices are reachable from 10. The traversal would be -
0, 1, 2, 3, 4, 5, 6, 9, 7, 8, 10, 11, 12, 15, 14, 13
The DFS spanning forest for the graph of figure 7.43 and the different types of edges are shown in figure 7.44.
The following function DFS() classifies all the edges in a directed graph.
void DFS(int v)
{
    int i;
    time++;
    d[v] = time;
    state[v] = visited;
    for(i=0; i<n; i++)
    {
        if(adj[v][i]==1)
        {
            if(state[i]==initial)
            {
                printf("(%d,%d) - Tree edge\n",v,i);
                DFS(i);
            }
            else if(state[i]==visited)
            {
                printf("(%d,%d) - Back edge\n",v,i);
            }
            else if(d[v]<d[i])
                printf("(%d,%d) - Forward edge\n",v,i);
            else
                printf("(%d,%d) - Cross edge\n",v,i);
        }
    }
    state[v] = finished;
    f[v] = ++time;
}/*End of DFS()*/
Tree edges : (0,1), (1,2), (0,3), (4,5), (5,6), (6,9), (4,7), (7,8), (10,11), (11,12), (12,15), (11,14), (14,13)
Back edges : (9,5), (8,4), (13,10)
Forward edges : (0,2), (4,6), (11,15), (10,14)
Cross edges : (3,2), (4,1), (6,3), (8,5), ...
Figure 7.44 Spanning forest and classified edges
Now let us take an undirected graph and perform depth first search on it.
Figure 7.45
Figure 7.46
In an undirected graph, there is no difference between back edges and forward edges because all edges are bidirectional. So in an undirected graph, any edge between a vertex and its non-son descendant is a back edge. Cross edges are also not possible in the depth first search of an undirected graph. Therefore in the depth first search of an undirected graph, every edge is classified as either a tree edge or a back edge. The following figure shows the tree edges and back edges for the undirected graph given above.
Figure 7.47
From this figure it is clear that in an undirected graph there can be no distinction between a forward edge and a back edge, so all edges from a vertex to its non-son descendants are back edges. Now let us see why cross edges are not possible. Suppose there was an edge in the graph from 12 to 14; looking at this spanning forest, it seems to be a candidate for a cross edge. But had the edge (12, 14) existed in the graph, the last spanning tree would have been different. In that case, 14 would be visited after 12 and so 14 would be the son of 12. Hence we can't have cross edges in a spanning tree of the forest. Similarly we can't have cross edges between different spanning trees. The following function DFS() shows how to classify the tree edges and back edges in an undirected graph.
void DFS(int v)
{
    int i;
    state[v] = visited;
    for(i=0; i<n; i++)
    {
        if(adj[v][i]==1 && predecessor[v]!=i)
        {
            if(state[i]==initial)
            {
                predecessor[i] = v;
                printf("(%d,%d) - Tree edge\n",v,i);
                DFS(i);
            }
            else if(state[i]==visited)
            {
                printf("(%d,%d) - Back edge\n",v,i);
            }
        }
    }
    state[v] = finished;
}/*End of DFS()*/
In an undirected graph, an edge between two vertices v1 and v2 is considered twice during DFS, once from v1 with v2 as its adjacent vertex, and once from v2 with v1 as its adjacent vertex. This can cause confusion in classifying the edges. To avoid this confusion, an edge between v1 and v2 is classified according to whether (v1,v2) or (v2,v1) is encountered first during the traversal. For example, suppose there is an edge between vertices 5 and 8 of a graph. If vertex 5 is visited first, then we will call the edge (5,8) a tree edge, and vertex 5 will be made the predecessor of vertex 8. When vertex 8 is visited, we will not consider the edge (8,5), because we know that this edge has already been considered, since 5 is the predecessor of 8.
Depth first search can be used to find out whether a graph is cyclic or not. In both directed and undirected graphs, if we get a back edge during depth first traversal then the graph is cyclic.
3. Perform depth first search on the reverse graph, always picking the vertex with the highest finishing time for a new DFS.
The depth first trees in the depth first forest thus formed are the strongly connected components of G.
While performing DFS on the reverse graph, we start from the vertex that has the highest finishing time; all vertices reachable from this vertex will be in the same SCC. Now from the remaining vertices we pick the vertex that has the highest finishing time and start DFS from there; all vertices reachable from this vertex will be in the same SCC. This process continues till all the vertices are visited. Let us take a graph and find strongly connected components for it.
(a) Graph G with start and finishing times in DFS, taking v0, v4, v5, v9 as start vertices
DFS : v0, v1, v3, v2, v4, v8, v5, v6, v7, v9, v10, v11
Figure 7.48
In figure 7.48(a), the graph is shown with the start and finishing time of each vertex, when DFS is done with v0, v4, v5, v9 as start vertices. In figure 7.48(b), the graph G is reversed, and the finishing time from (a) is shown for each vertex.
To find strongly connected components, we will start DFS of the reverse graph from vertex v9, since it had the highest finishing time in the previous DFS. The vertices v9, v10, v11 are visited, so they form one SCC. Now from the remaining vertices, v5 has the highest finishing time, so we start DFS from there. The vertices v5, v7, v6 are visited, so they form another SCC. Now from the remaining vertices, v0 has the highest finishing time, so we start DFS from there. The vertices v0, v2, v3, v1 are visited, so they form another SCC. Next v4 has the highest finishing time, so DFS starts from there; only v4 is visited and so it forms one SCC. The remaining vertex v8 also forms an SCC. The strongly connected components are -
{v9, v10, v11}, {v5, v7, v6}, {v0, v2, v3, v1}, {v4}, {v8}
Figure 7.49
We will study three algorithms for finding out the shortest paths in a weighted graph.
(i) Dijkstra's Algorithm - Single source, non-negative weights.
(ii) Bellman Ford Algorithm - Single source, general weights.
(iii) Floyd's or Modified Warshall's algorithm - All pairs shortest paths.
The first two algorithms are single source shortest paths algorithms (this source is different from the 0 indegree source defined in section 7.3). In these shortest path problems, a vertex is identified as the source vertex, and the shortest path from this vertex to all other vertices is found out. This source vertex can also be called the start vertex. Dijkstra's algorithm works only for non-negative weights, while the Bellman Ford algorithm can be used for negative weights also.
There is no known algorithm that finds the shortest path from a source to a single destination and is faster than the single source shortest paths algorithms. So finding the shortest path to a single destination is essentially as hard as finding the shortest paths from the source to all vertices.
Floyd's algorithm solves the all pairs shortest paths problem. Here we get the shortest paths between each pair of vertices of the graph. It is also called the Modified Warshall's algorithm because it is based on Warshall's algorithm that we studied earlier in section 7.11.2.
The shortest paths can prove useful in various situations. For example, suppose our weighted graph represents a transport system, where each vertex is a city and the weights on the edges represent the distance of one city from another. We would be interested in taking the shortest route to reach our destination. Similarly, the graph may represent an airline network or a railway track system, and the weights on the edges may represent the distance, time or cost. In all these cases, the knowledge of shortest paths helps us select the best option. Electric supply systems and water distribution systems also follow this approach. In computer networks, shortest paths are very useful for routing.
Figure 7.50
Now we will see how we can obtain these shortest paths using Dijkstra's algorithm. Each vertex is given a status, which can be permanent or temporary. If a vertex is temporary, then it means that the shortest path to it has not been found, and if a vertex is permanent then it means the shortest path to it has been found. Initially all the vertices are temporary, and at each step of the algorithm a temporary vertex is made permanent.
We label each vertex with pathLength and predecessor. At any point of the algorithm, the pathLength of a vertex v will denote the length of the shortest path - known till now - from the source vertex to v. The predecessor of v will denote the vertex which precedes v in this path.
Initially the pathLengths of all vertices are initialized to a very large number, which denotes that at the start of the algorithm we don't know of any path from the source to any vertex. We will call this number infinity. The predecessor value of all vertices is initialized to NIL. In the program we can take NIL to be any number that does not represent a valid vertex. We will take it as -1 in our program, since our vertices start from 0.
At the end of the algorithm, pathLength of a vertex will represent the shortest distance of that vertex from
the source vertex and predecessor will represent the vertex which precedes the given vertex in the shortest path
from source.
As the algorithm proceeds, the values of pathLength and predecessor of a vertex may be updated many times
provided the vertex is temporary. Once a vertex is made permanent the values of pathLength and predecessor
for it become fixed and are not changed thereafter. It means that temporary vertices can be relabeled if required,
but permanent vertices can’t be relabeled.
When a temporary vertex is made permanent, it means that the shortest distance for it has been finalised. So
pathLength of a permanent vertex represents the length of shortest path from source to this vertex, i.e. no other
path shorter than this is possible.
At any point of the algorithm, the pathLength of a temporary vertex represents the length of the best known path - from the source to this vertex - till now; it is possible that there may be some other better path which is shorter than this one. So whenever we find a shorter path, we will update the pathLength and predecessor values of this temporary vertex. We will try to find shorter paths by examining the edges incident from the vertex which was made permanent most recently.
We have stated that at each step a vertex will be made permanent; now the question is which vertex should be chosen to become permanent. For this we use the greedy approach. In greedy algorithms we generally perform the action which appears best at the moment. Greedy algorithms do not always give optimal results in general, but in this case we get the correct result, and we will see the proof at the end. Applying the greedy approach, in each step the temporary vertex that has the smallest value of pathLength is made permanent.
Now let us look at the whole algorithm stepwise and see how the pathLength and predecessor values are
updated to get shorter paths and how we finally get the shortest paths for all the vertices. The procedure is-
(A) Initialize the pathLength of all vertices to infinity and the predecessor of all vertices to NIL. Make the status of all vertices temporary.
(B) Make the pathLength of the source vertex equal to 0.
(C) From all the temporary vertices in the graph, find out the vertex that has the minimum value of pathLength, make it permanent, and now this is our current vertex. (If there are many with the same value of pathLength, then any one can be selected.)
(D) Examine all the temporary vertices adjacent to the current vertex. The value of pathLength is recalculated for all these temporary successors of current, and relabelling is done if required. Let us see how this is done.
Suppose s is the source vertex, current is the current vertex and v is a temporary vertex adjacent to current.
Figure 7.51
Initially no paths are known, so the pathLength values of all the vertices are set to a very large number (which is larger than the length of the longest possible path); we will represent this by ∞ in the figure. The predecessor of all vertices is NIL in the beginning. Source vertex 0 is assigned a pathLength of 0. In the figures, all vertices will be joined to their predecessors. A permanent vertex will be joined to its predecessor by a bold arrow.
From all the temporary vertices, vertex 0 has the smallest pathLength, so make it permanent. Its predecessor will remain NIL. Now vertex 0 is the current vertex. Vertices 1, 2, 3 are temporary vertices adjacent to vertex 0.
pathLength(0) + weight(0,1) < pathLength(1), 0+8 < ∞
Relabel 1 : pathLength[1] = 8, predecessor[1] = 0
pathLength(0) + weight(0,2) < pathLength(2), 0+2 < ∞
Relabel 2 : pathLength[2] = 2, predecessor[2] = 0
pathLength(0) + weight(0,3) < pathLength(3), 0+7 < ∞
Relabel 3 : pathLength[3] = 7, predecessor[3] = 0
From all the temporary vertices, vertex 1 has the smallest pathLength, so make it permanent. Now vertex 1 is the current vertex. Vertex 5 is the temporary vertex adjacent to 1.
pathLength(1) + weight(1,5) < pathLength(5), 8+16 < ∞
Relabel 5 : pathLength[5] = 24, predecessor[5] = 1
At the end we get a shortest path tree which includes all the vertices reachable from the source vertex. The source vertex is the root of this tree, and the shortest paths from the source to all vertices are given by the branches. Each vertex has a predecessor, so we can easily establish the path after completing the whole process. To find the shortest path from the source to any destination vertex, we look at the last table, start from the destination vertex, and keep on following successive predecessors till we reach the source vertex.
If the destination vertex is 3 -
predecessor of 3 is 2, predecessor of 2 is 0
Shortest path is 0 - 2 - 3
If the destination vertex is 5 -
predecessor of 5 is 4, predecessor of 4 is 6, predecessor of 6 is 2, predecessor of 2 is 0
Shortest path is 0 - 2 - 6 - 4 - 5
If we want to find the shortest distance between the source and a single destination only, then we can stop our algorithm as soon as the destination vertex is made permanent.
In the example that we have taken, all the vertices of the graph were reachable from the source vertex, so all the vertices were permanent in the last table. If there are one or more vertices in the graph that are not reachable from the source vertex, then the shortest paths to these vertices cannot be found, so they will never become permanent. For example, in the graph given below the vertices 1 and 7 are not reachable from source vertex 0.
Figure 7.52
Following the same procedure, vertices 0, 2, 6, 3, 4, 5 will be made permanent and then the pathLength and
predecessor values of all vertices will be-
Figure 7.53
Now all the temporary vertices left have pathLength equal to infinity. While stating the procedure of Dijkstra's algorithm, we had mentioned that the procedure will stop when either no temporary vertices are left or all temporary vertices left have pathLength equal to infinity. So in this example we will stop after making vertex 5 permanent, and the vertices 1 and 7 will never be made permanent. These vertices are not reachable from the source vertex.
main()
{
    int s,v;
    create_graph();
    printf("Enter source vertex : ");
    scanf("%d",&s);
    Dijkstra(s);
    while(1)
    {
        printf("Enter destination vertex(-1 to quit) : ");
        scanf("%d",&v);
        if(v==-1)
            break;
        if(v<0 || v>=n)
            printf("This vertex does not exist\n");
        else if(v==s)
            printf("Source and destination vertices are same\n");
        else if(pathLength[v]==infinity)
            printf("There is no path from source to destination vertex\n");
        else
            findPath(s,v);
    }
}/*End of main()*/
void Dijkstra(int s)
{
    int i,current;
    /*Make all vertices temporary*/
    for(i=0; i<n; i++)
    {
        predecessor[i] = NIL;
        pathLength[i] = infinity;
        status[i] = TEMP;
    }
    /*Make pathLength of source vertex equal to 0*/
    pathLength[s] = 0;
    while(1)
    {
        /*Search for the temporary vertex with minimum pathLength*/
        current = min_temp();
        if(current==NIL)    /*No temporary vertex left, or all left have pathLength infinity*/
            return;
        status[current] = PERM;
        /*Examine all temporary vertices adjacent to current*/
        for(i=0; i<n; i++)
        {
            if(adj[current][i]!=0 && status[i]==TEMP)
            {
                if(pathLength[current]+adj[current][i] < pathLength[i])
                {
                    pathLength[i] = pathLength[current]+adj[current][i];
                    predecessor[i] = current;
                }
            }
        }
    }
}/*End of Dijkstra()*/
/*Returns the temporary vertex with minimum value of pathLength. Returns NIL if no temporary
vertex is left, or all temporary vertices left have pathLength infinity*/
int min_temp()
{
    int i;
    int min = infinity;
    int k = NIL;
    for(i=0; i<n; i++)
    {
        if(status[i]==TEMP && pathLength[i]<min)
        {
            min = pathLength[i];
            k = i;
        }
    }
    return k;
}/*End of min_temp()*/
void findPath(int s,int v)
{
    int i,u;
    int path[MAX];    /*stores the shortest path*/
    int shortdist = 0;    /*length of shortest path*/
    int count = 0;    /*number of vertices in the shortest path*/
    /*Store the full path in the array path*/
    while(v!=s)
    {
        count++;
        path[count] = v;
        u = predecessor[v];
        shortdist += adj[u][v];
        v = u;
    }
    count++;
    path[count] = s;
    printf("Shortest Path is : ");
    for(i=count; i>=1; i--)
        printf("%d ",path[i]);
    printf("\nShortest distance is : %d\n",shortdist);
}/*End of findPath()*/
void create_graph()
{
    int i,max_edges,origin,destin,wt;
    printf("Enter number of vertices : ");
    scanf("%d",&n);
    max_edges = n*(n-1);
    for(i=1; i<=max_edges; i++)
    {
        printf("Enter edge %d(-1 -1 to quit) : ",i);
        scanf("%d %d",&origin,&destin);
        if((origin==-1) && (destin==-1))
            break;
        printf("Enter weight for this edge : ");
        scanf("%d",&wt);
        if(origin>=n || destin>=n || origin<0 || destin<0)
        {
            printf("Invalid edge!\n");
            i--;
        }
        else
            adj[origin][destin] = wt;
    }
}/*End of create_graph()*/
The function create_graph() is the same as in program P7.2. Now after studying the whole procedure, let us see why this algorithm works and gives the optimal result.
Figure 7.54
Suppose we have 4 temporary vertices x, y, z, v. Vertices inside the circle are permanent vertices and vertices outside the circle are temporary vertices. We take the vertex with the smallest pathLength, declare that the shortest path to it has been finalized, and make it permanent. In this case v has the smallest value of pathLength, so it is declared permanent. Let us see why this is the shortest path for v.
If this is not the shortest path for v, then suppose there exists a hypothetical shorter path going through z. This path goes from s to z and then from z to v. The length of this path is equal to the sum of pathLength(z) and weight(z,v). We are claiming that this path is shorter, so the value pathLength(z) + weight(z,v) should be smaller than pathLength(v). Since the weights are non-negative, pathLength(z) will also be smaller than pathLength(v). Here comes the contradiction - if pathLength(z) is smaller than pathLength(v), then our greedy approach would have chosen z to be made permanent instead of v. So we have proved by contradiction that pathLength(v) is the shortest distance from s to v.
Figure 7.55
The shortest distances from the source vertex 0 to all other vertices, as computed by Dijkstra's algorithm, are shown in figure 7.56; for example, the path found for destination 5 is 0-2-3-5, for destination 6 it is 0-2-3-5-6, and for destination 7 it is 0-2-3-5-6-7 of length 10.
Figure 7.56
Using Dijkstra's algorithm, the shortest path from 0 to 1 is the path 0-1 of length 8. But if we observe the graph carefully, we see that there exists a shorter path 0-2-1 of length 5. Similarly, for destination vertex 5 there exists a path 0-2-1-5 of length 14 which Dijkstra's algorithm was unable to identify.
So we can see that Dijkstra's algorithm fails if the graph contains negative weights. This is so because in Dijkstra's algorithm, once a vertex is made permanent we don't relabel it, i.e. the shortest path to it is finalized. It is possible that after making a vertex permanent we find an edge of negative weight that can be used to reach the vertex, and hence we get a shorter path. But we have already finalized the shortest path of the vertex by making it permanent, so we can't record this shorter path.
In the graph of figure 7.55, if we apply Dijkstra's algorithm, then vertices 0, 1 and 2 are made permanent in that order, followed by the remaining vertices. After making vertex 2 permanent, the edge 2-1 of length -4 is considered and we find a shorter path 0-2-1, but it is not recorded, as vertex 1 is already permanent.
In Dijkstra's algorithm we make a vertex permanent at each step, i.e. the shortest distance to a vertex is
finalized at each step, but in the Bellman-Ford algorithm the shortest distances are not finalized till the end of the
algorithm. Thus in the Bellman-Ford algorithm, we drop the concept of making vertices permanent. This is why
Dijkstra's algorithm is known as a label setting algorithm and the Bellman-Ford algorithm is known as a label
correcting algorithm.
Each vertex is labeled with a pathLength and a predecessor value as in Dijkstra’s algorithm. The procedure for
Bellman Ford algorithm is-
(A) Initialize the pathLength of all vertices to infinity and predecessor of all vertices to NIL.
(B) Make the pathLength of source vertex equal to 0 and insert it into the queue.
(C) Delete a vertex from the front of the queue and make it the current vertex.
(D) Examine all the vertices adjacent to the current vertex. Check the condition of minimum weight for
these vertices and do the relabeling if required, as in Dijkstra’s algorithm.
(E) Each vertex that is relabeled is inserted into the queue provided it is not already present in the queue.
(F) Repeat the steps (C), (D), and (E) till the queue becomes empty.
The whole procedure for the graph of figure 7.55 is shown in the following table.
[Table: trace of the Bellman-Ford algorithm on the graph of figure 7.55. For each vertex deleted from the queue, its adjacent vertices are examined and relabeled where a shorter path is found. The recoverable relabelings include pathLength(1) = 5 with pred(1) = 2, pathLength(3) = 12 with pred(3) = 2, pathLength(6) = 9 with pred(6) = 5, pathLength(7) = 22 with pred(7) = 5, then pathLength(7) = 11 with pred(7) = 6, and finally pathLength(6) = 6 with pred(6) = 5.]
Figure 7.57
The shortest paths computed in this way by the Bellman-Ford algorithm for source vertex 0 are given below-
Figure 7.58
This algorithm will not work properly if the graph contains a negative cycle reachable from the source vertex, i.e.
a cycle consisting of edges whose weights add up to a negative number. For example, consider the following
graph, which contains a negative cycle 0-1-3-0 of length -3.
390 : Data Structures through C in Depth
Figure 7.59
Now suppose we have to find the shortest path from vertex 0 to vertex 3. One path from vertex 0 to vertex 3
is 0-1-3 of length -7. But there is a shorter path, 0-1-3-0-1-3, of length -10. Each time we
visit the negative cycle the length of the path decreases by 3, and hence the length of the shortest path from
vertex 0 to vertex 3 is -infinity. So if the graph contains a negative cycle
then the number of edges in the shortest path is not finite and hence shortest paths are not defined for such
graphs.
In the Bellman-Ford algorithm each vertex can be inserted into the queue at most n times. If any vertex is
inserted more than n times, it indicates that a negative cycle is present in the graph. In this case we will be
stuck in an infinite loop. To come out of it we can count the number of insertions of any vertex, and if it is greater
than n, we will come out of the loop stating that the graph has a negative cycle. In the program we will count the
number of insertions of the source vertex.
/*P7.10 Program to find shortest paths-using Bellman-Ford algorithm*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100
#define infinity 9999
#define NIL -1
#define TRUE 1
#define FALSE 0
int n; /*Number of vertices in the graph*/
int adj [MAX] [MAX]; /*Adjacency Matrix*/
int predecessor [MAX];
int pathLength [MAX];
int isPresent_in_queue [MAX];
int front,rear;
int queue [MAX];
void initialize_queue();
void insert_queue(int u);
int delete_queue();
int isEmpty_queue();
void create_graph();
void findPath(int s, int v);
int BellmanFord(int s);
main()
{
	int flag,s,v;
	create_graph();
	printf("Enter source vertex : ");
	scanf("%d",&s);
	flag = BellmanFord(s);
	if(flag==-1)
	{
		printf("Error : negative cycle in Graph\n");
		exit(1);
	}
	while(1)
	{
		printf("Enter destination vertex(-1 to quit): ");
		scanf("%d",&v);
		if(v==-1)
			break;
		if(v<0 || v>=n)
			printf("This vertex does not exist\n");
		else if(v==s)
			printf("Source and destination vertices are same\n");
		else if(pathLength[v]==infinity)
			printf("There is no path from source to destination vertex\n");
		else
			findPath(s,v);
	}
}/*End of main()*/
void findPath(int s, int v)
{
	int i, u;
	int path[MAX]; /*stores the shortest path*/
	int shortdist = 0; /*length of shortest path*/
	int count = 0; /*number of vertices in the shortest path*/
	/*Store the full path in the array path*/
	while(v!=s)
	{
		count++;
		path[count] = v;
		u = predecessor[v];
		shortdist += adj[u][v];
		v = u;
	}
	count++;
	path[count] = s;
	printf("Shortest Path is : ");
	for(i=count; i>=1; i--)
		printf("%d ",path[i]);
	printf("\nShortest distance is : %d\n", shortdist);
}/*End of findPath()*/
int BellmanFord(int s)
{
	int k=0,i,current;
for(i=0; i<n; i++)
{
predecessor[i] = NIL;
pathLength[i] = infinity;
isPresent_in_queue[i] = FALSE;
}
initialize_queue() ;
pathLength[s] = 0; /*Make pathLength of source vertex 0*/
insert_queue(s); /*Insert the source vertex in the queue*/
isPresent_in_queue[s] = TRUE;
	while(!isEmpty_queue())
{
current = delete_queue();
		isPresent_in_queue[current] = FALSE;
if (s==current)
k++;
if (k>n)
return -1;/*Negative cycle reachable from source vertex*/
for(i=0; i<n; i++)
{
			if(adj[current][i] != 0)
				if(pathLength[i] > pathLength[current]+adj[current][i])
				{
					pathLength[i] = pathLength[current]+adj[current][i];
predecessor[i] = current;
					if(!isPresent_in_queue[i])
{
insert_queue (i);
isPresent_in_queue[i]=TRUE;
					}
				}
		}
	}
	return 1;
}/*End of BellmanFord()*/
void initialize_queue()
{
	int i;
	for(i=0; i<MAX; i++)
		queue[i] = 0;
	front = rear = -1;
}/*End of initialize_queue()*/
int isEmpty_queue()
{
if (front==-1 || front>rear)
return 1;
else
return 0;
}/*End of isEmpty_queue()*/
void insert_queue(int added_item)
{
	if(rear==MAX-1)
	{
		printf("Queue Overflow\n");
		exit(1);
	}
	else
	{
		rear = rear+1;
		queue[rear] = added_item;
	}
}/*End of insert_queue()*/
We have seen Warshall's algorithm (section 7.11.2) that computes the path matrix of a graph and tells us
whether there is a path between any two vertices i and j. Now our problem is to find the shortest path between
any two vertices i and j. We will take Warshall's algorithm as the base and modify it to find the shortest
path matrix D, such that D[i][j] represents the length of the shortest path from vertex i to vertex j. The resulting
algorithm is known as modified Warshall's algorithm, Floyd's algorithm or the Floyd-Warshall algorithm, as its
basic structure was given by Warshall and it was implemented by Robert W. Floyd. Any element Dk[i][j] of the
shortest path matrix can be defined as-

Dk[i][j] = length of the shortest path from vertex i to vertex j,
           using only vertices 0, 1, 2, ......, k as intermediate vertices
         = infinity (if there is no path from vertex i to vertex j using vertices 0, 1, 2, ......, k)

We will also find predecessor matrices Pred-1, Pred0, ......, Predn-1, where any element Predk[i][j] can be
defined as-

Predk[i][j] = predecessor of j on the shortest path from vertex i to vertex j,
              using only vertices 0, 1, 2, ......, k as intermediate vertices
            = -1 (if there is no path from vertex i to vertex j using vertices 0, 1, 2, ......, k)

Dn-1[i][j] = length of the shortest path from i to j using only vertices 0, 1, 2, ......, n-1
We can find matrix D-1 from the weighted adjacency matrix by replacing all zero entries by infinity.
D-1[i][j] = length of the edge from vertex i to vertex j (infinity if there is no edge)
If there are n vertices in the graph then matrix Dn-1 will represent the shortest path matrix D. Now our
purpose is to find the matrices D0, D1, D2, ........., Dn-1. We have already found matrix D-1 from the weighted
adjacency matrix. Now if we know how to find matrix Dk from matrix Dk-1, then we can easily find the
matrices D0, D1, D2, ......... Dn-1 also. So let us see how we can find matrix Dk from matrix Dk-1.
We have seen in Warshall's algorithm that Pk[i][j] = 1 if either of these two conditions is true.
1. Pk-1[i][j] = 1
2. Pk-1[i][k] = 1 and Pk-1[k][j] = 1
This means there can be a path from vertex i to j using vertices 0, 1, 2, ........, k in two conditions-
1. There is a path from vertex i to vertex j using only vertices 0, 1, ......, k-1 (path P1)
2. There is a path from vertex i to vertex k using only vertices 0, 1, ......, k-1 and there is a path
   from vertex k to vertex j using only vertices 0, 1, ......, k-1 (path P2)
Length of the first path P1 will be Dk-1[i][j]
Length of the second path P2 will be Dk-1[i][k] + Dk-1[k][j]
Now we will compare the lengths of these two paths.
(i) If (Dk-1[i][k] + Dk-1[k][j]) < Dk-1[i][j]
This means that the path from i to j will be shorter if we use k as an intermediate vertex.
Thus Dk[i][j] = Dk-1[i][k] + Dk-1[k][j]
Predk[i][j] = Predk-1[k][j]
(ii) If (Dk-1[i][k] + Dk-1[k][j]) >= Dk-1[i][j]
This means that the path from i to j is not improved if we use k as an intermediate vertex.
Thus Dk[i][j] = Dk-1[i][j]
Predk[i][j] = Predk-1[i][j]
We select the smaller one of the two paths P1 and P2.
Hence the value of Dk[i][j] = Minimum( Dk-1[i][j], Dk-1[i][k] + Dk-1[k][j] )
Now let us take a graph and find out the shortest path matrix for it.
Figure 7.60
[Matrices D-1 and Pred-1 for the graph of figure 7.60, obtained from the weighted adjacency matrix; for example D-1[0][1] = 2, D-1[1][0] = 3, D-1[3][0] = 14 and D-1[3][2] = 4.]
Now let us see how we can find matrix D0 from matrix D-1. If we go through vertex 0 and find a shorter path,
we replace the older path with this shorter one. The calculation of some entries of the matrix is shown
below.
Find D0[0][0]
D-1[0][0] + D-1[0][0] > D-1[0][0] => No change
( 9999 + 9999 > 9999 )
Find D0[1][0]
D-1[1][0] + D-1[0][0] > D-1[1][0] => No change
( 3 + 9999 > 3 )
Find D0[1][1]
D-1[1][0] + D-1[0][1] < D-1[1][1]
( 3 + 2 < 9999 )
D0[1][1] = D-1[1][0] + D-1[0][1] = 5
Pred0[1][1] = Pred-1[0][1] = 0
Find D0[3][1]
D-1[3][0] + D-1[0][1] < D-1[3][1]
( 14 + 2 < 9999 )
D0[3][1] = D-1[3][0] + D-1[0][1] = 16
Pred0[3][1] = Pred-1[0][1] = 0
The changed values are shown in bold in the matrix.
[Matrices D0 and Pred0; the newly computed entries include D0[1][1] = 5 and D0[3][1] = 16.]
Now we have to find the matrices D1 and Pred1; the calculation of some entries is shown.
Find D1[1][3]
D0[1][1] + D0[1][3] > D0[1][3] => No change
( 5 + 7 > 7 )
Find D1[2][0]
D0[2][1] + D0[1][0] < D0[2][0]
( 6 + 3 < 9999 )
D1[2][0] = D0[2][1] + D0[1][0] = 9
Pred1[2][0] = Pred0[1][0] = 1
Find D1[2][2]
D0[2][1] + D0[1][2] < D0[2][2]
( 6 + 4 < 9999 )
D1[2][2] = D0[2][1] + D0[1][2] = 10
Pred1[2][2] = Pred0[1][2] = 1
Find D1[3][0]
D0[3][1] + D0[1][0] > D0[3][0] => No change
( 16 + 3 > 14 )
[Matrices D1 and Pred1; the newly computed entries include D1[2][0] = 9 and D1[2][2] = 10.]
Similarly we can find the matrices D2, Pred2, D3 and Pred3.
[Matrices D2, Pred2, D3 and Pred3; D3 is the final shortest path matrix, with for example D3[3][0] = 13.]
The matrix D3 is the shortest path matrix D and the matrix Pred3 is the predecessor matrix.
Suppose we have to find the shortest path from vertex 3 to vertex 0. The value of D[3][0] is 13 and it is the
length of this shortest path. We can construct the path from the matrix Pred.
Pred[3][0] is 1 => predecessor of vertex 0 on the shortest path from 3 to 0 is vertex 1
Pred[3][1] is 2 => predecessor of vertex 1 on the shortest path from 3 to 1 is vertex 2
Pred[3][2] is 3 => predecessor of vertex 2 on the shortest path from 3 to 2 is vertex 3
So the shortest path is 3-2-1-0
If any value D[i][j] is infinity, it means that there is no path from vertex i to vertex j. In the above example
we don't have any infinity in the shortest path matrix.
This algorithm can also be used for cycle detection. If there is no cycle in the graph then all diagonal
elements will be infinity in the last matrix; otherwise there will be finite values along the diagonal
corresponding to the vertices which are in a cycle. For example, if D[i][i] is a finite value, then it denotes that
vertex i is a part of a cycle. If this finite value is negative, then it denotes the presence of a negative cycle in the
graph and in this case shortest paths are not defined.
/*P7.11 Program to find shortest path matrix by Modified Warshall’s algorithm*/
#include<stdio.h>
#include<stdlib.h>
#define infinity 9999
#define MAX 100
int n; /*Number of vertices in the graph*/
int adj [MAX] [MAX]; /*Weighted Adjacency matrix*/
int D[MAX] [MAX]; /*Shortest Path Matrix*/
int Pred[MAX] [MAX]; /*Predecessor Matrix*/
void create_graph() ;
void FloydWarshalls() ;
void findPath(int s,int d);
void display(int matrix[MAX] [MAX],int n);
main()
{
	int s,d;
create_graph();
FloydWarshalls() ;
while(1)
{
printf("Enter source vertex(-1 to exit) : ");
scanf("%d",&s);
		if(s==-1)
break;
printf("Enter destination vertex : ");
scanf ("%da", &d) ;
		if(s<0 || s>n-1 || d<0 || d>n-1)
		{
			printf("Enter valid vertices\n\n");
			continue;
		}
		printf("Shortest path is : ");
		findPath(s,d);
		printf("Length of this path is %d\n",D[s][d]);
	}
}/*End of main()*/
void FloydWarshalls()
{
	int i,j,k;
	for(i=0; i<n; i++)
		for(j=0; j<n; j++)
		{
			if(adj[i][j]==0)
			{
				D[i][j] = infinity;
				Pred[i][j] = -1;
			}
			else
			{
				D[i][j] = adj[i][j];
				Pred[i][j] = i;
			}
		}
	for(k=0; k<n; k++)
	{
		for(i=0; i<n; i++)
			for(j=0; j<n; j++)
				if(D[i][k]+D[k][j] < D[i][j])
				{
					D[i][j] = D[i][k]+D[k][j];
					Pred[i][j] = Pred[k][j];
				}
	}
}/*End of FloydWarshalls()*/
void findPath(int s,int d)
{
	int i,path[MAX],count;
if (D[s) (d]==infinity)
{
printf("No path \n");
return;
}
	count = -1;
do
{
		path[++count] = d;
		d = Pred[s][d];
}while(d!=s);
path[++count] = s;
	for(i=count; i>=0; i--)
		printf("%d ",path[i]);
	printf("\n");
}/*End of findPath()*/
[(a) Graph G and its spanning trees with their weights.]
Figure 7.61
Here the tree with weight 9 is the minimum spanning tree. It is not necessary that a graph have a unique minimum
spanning tree. If there are duplicate weights in the graph then more than one minimum spanning tree may be possible, but if
all the weights are unique then there will be only one minimum spanning tree.
Minimum spanning tree gives us the most economical way of connecting all the vertices in a graph. For
example, in a network of computers we can connect all the computers with the least cost if we construct a
minimum spanning tree for the graph where the vertices are computers. Similarly in a telephone communication
network we can connect all the cities in the network with the least possible cost.
There are many ways for creating minimum spanning tree but the most famous methods are Prim’s and
Kruskal’s algorithm. Both these methods use the greedy approach.
We label each vertex with length and predecessor. The label length represents the weight of the shortest edge
connecting the vertex to a permanent vertex and predecessor represents that permanent vertex. Once a vertex is
made permanent, it is not relabeled. Only temporary vertices will be relabeled if required.
Applying the greedy approach, the temporary vertex that has the minimum value of length is made
permanent. In other words we can say that the temporary vertex which is adjacent to a permanent vertex by an
edge of least weight is added to the tree.
The steps for making a minimum spanning tree by Prim's algorithm are as-
(A) Initialize the length of all vertices to infinity and the predecessors of all vertices to NIL. Make the status of
all vertices temporary.
(B) Select any arbitrary vertex as the root vertex and make its length label equal to 0.
(C) From all the temporary vertices in the graph, find the vertex that has the smallest value of length, make
it permanent and now this is our current vertex. (If there are many with the same value of length then any one
can be selected)
(D) Examine all the temporary vertices adjacent to the current vertex. Suppose current is the current vertex
and v is a temporary vertex adjacent to current.
(i) If weight(current,v) < length(v)
Relabel the vertex v
Now length(v) = weight(current,v)
predecessor(v) = current
(ii) If weight(current,v) >= length(v)
Vertex v is not relabelled
(E) Repeat steps (C) and (D) till there are no temporary vertices left, or all the temporary vertices left have
length equal to infinity. If the graph is connected, then the procedure will stop when all n vertices have been
made permanent and n-1 edges have been added to the spanning tree. If the graph is not connected, then those vertices
that are not reachable from the root vertex will remain temporary with length infinity. In this case no spanning
tree is possible.
Let us take an undirected connected graph and construct the minimum spanning tree
4
Figure 7.62
Initially the length values for all the vertices are set to a very large number (larger than the weight of any edge).
Suppose infinity is such a number. We take the predecessor of all vertices as NIL(-1) in the beginning.
Initially all the vertices are temporary. We select vertex 0 as the root vertex and make its length label
equal to zero.
Vertices 1, 2, 3, 4 are temporary vertices adjacent to 0, so they are relabeled-
length(1) = 6, length(2) = 2, length(3) = 3, length(4) = 10, and the predecessor of each of them becomes 0.
From all the temporary vertices, vertex 2 has the smallest length
so make it permanent i.e. include it in the tree. Its predecessor is
0, so the edge that is added to the tree is (0,2). Now vertex 2 is
the current vertex. Vertices 3, 4 are temporary vertices adjacent
to vertex 2.
weight(2,3) > length(3)    14 > 3    Don't Relabel 3
weight(2,4) < length(4)    8 < 10    Relabel 4
predecessor[4] = 2, length[4] = 8
From all temporary vertices, vertex 3 has the smallest value of
length so make it permanent. Its predecessor is 0, so the edge
(0,3) is included in the tree. Now vertex 3 is the current working
vertex.
Vertices 1, 4, 5 are temporary vertices adjacent to vertex 3. Vertex 5 is relabeled with length 5 and
predecessor 3. Next, vertex 5 is made permanent, adding the edge (3,5) to the tree, and vertex 4 is relabeled
with length 4 and predecessor 5.
From all temporary vertices, vertex 4 now has the smallest length so
make it permanent. Its predecessor is 5, so include edge (5,4) in
the tree. Now vertex 4 is the current vertex. There are no
temporary vertices adjacent to 4. Finally vertex 1, the only remaining temporary vertex, is made permanent,
adding the edge (0,1) to the tree.
Now we have a complete minimum spanning tree. The edges that belong to the minimum spanning tree are-
(0,1), (0,2), (0,3), (5,4), (3,5)
Weight of the minimum spanning tree will be-
6 + 2 + 3 + 4 + 5 = 20
Now let us take a graph that is not connected.
Figure 7.63
After making vertices 0, 2, 4, 3, 1 permanent, the situation would be-
Figure 7.64
Vertices 5, 6, 7 are temporary with length infinity, so we stop the procedure and state that the graph is not
connected and hence no spanning tree is possible.
/*P7.12 Program for creating minimum spanning tree using Prim's algorithm*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 10
#define TEMP 0
#define PERM 1
#define infinity 9999
#define NIL -1
struct edge
{
	int u;
	int v;
};
int n; /*Number of vertices in the graph*/
int adj[MAX][MAX];
int predecessor[MAX];
int status[MAX];
int length[MAX];
void create_graph();
void maketree(int r, struct edge tree[MAX]);
int min_temp();
main()
{
	int wt_tree = 0;
	int i,root;
	struct edge tree[MAX];
	create_graph();
	printf("Enter root vertex : ");
	scanf("%d",&root);
	maketree(root, tree);
	printf("Edges to be included in spanning tree are : \n");
	for(i=1; i<=n-1; i++)
	{
		printf("%d->",tree[i].u);
		printf("%d\n",tree[i].v);
		wt_tree += adj[tree[i].u][tree[i].v];
	}
	printf("Weight of spanning tree is : %d\n", wt_tree);
}/*End of main()*/
void maketree(int r, struct edge tree[MAX])
{
	int current,i;
	int count = 0; /*number of vertices in the tree*/
	for(i=0; i<n; i++) /*Initialize all vertices*/
	{
		predecessor[i] = NIL;
		length[i] = infinity;
		status[i] = TEMP;
	}
length[r] = 0; /*Make length of root vertex 0*/
while(1)
{
/*Search for temporary vertex with minimum length
and make it current vertex*/
current = min_temp();
if (current==NIL)
{
if (count==n-1) /*No temporary vertex left*/
return;
else /*Temporary vertices left with length infinity*/
{
printf("Graph is not connected, No spanning tree possible\n");
exit(1);
}
}
status[current] = PERM; /*Make the current vertex permanent*/
/*Insert the edge (predecessor[current], current) into the tree,
except when the current vertex is root*/
if (current !=r)
{
count++;
			tree[count].u = predecessor[current];
tree[count].v = current;
}
		for(i=0; i<n; i++)
			if(adj[current][i]>0 && status[i]==TEMP)
				if(adj[current][i] < length[i])
				{
					predecessor[i] = current;
					length[i] = adj[current][i];
				}
	}
}/*End of maketree()*/
/*Returns the temporary vertex with minimum value of length, Returns NIL if no temporary
vertex left or all temporary vertices left have pathLength infinity*/
int min_temp()
{
	int i;
	int min = infinity;
	int k = NIL;
for(i=0; i<n; i++)
{
if (status [i] ==TEMP && length[i]<min)
{
mG Yee length[i];
ke= a;
	}
}
return k;
}/*End of min_temp()*/
void create_graph()
Figure 7.65
Initially we take a forest of 9 trees, with each tree consisting of a single vertex. All the edges are examined in
increasing order of their weight.
Edge 0-4, wt = 2	Inserted, see figure 7.66(b)
Edge 3-4, wt = 3	Inserted, see figure 7.66(c)
Edge 0-3, wt = 4	Not inserted, forms cycle 0-3-4-0 in figure 7.66(c)
Edge 2-5, wt = 5	Inserted, see figure 7.66(d)
Edge 4-5, wt = 6	Inserted, see figure 7.66(e)
Edge 2-4, wt = 7	Not inserted, forms cycle 2-4-5-2 in figure 7.66(e)
Edge 1-4, wt = 8	Inserted, see figure 7.66(f)
Edge 0-1, wt = 9	Not inserted, forms cycle 0-1-4-0 in figure 7.66(f)
Edge 1-2, wt = 10	Not inserted, forms cycle 1-2-5-4-1 in figure 7.66(f)
Edge 4-6, wt = 11	Inserted, see figure 7.66(g)
Edge 4-7, wt = 12	Inserted, see figure 7.66(h)
Edge 6-7, wt = 14	Not inserted, forms cycle 4-6-7-4 in figure 7.66(h)
Edge 4-8, wt = 15	Inserted, see figure 7.66(i)
(a) Initially 9 trees in the forest.
(b) Trees {0} and {4} joined by edge (0,4) to form tree {0,4}. Now 8 trees left in the forest.
(c) Trees {0,4} and {3} joined by edge (3,4) to form tree {0,4,3}. Now 7 trees left in the forest.
(d) Trees {2} and {5} joined by edge (2,5) to form tree {2,5}. Now 6 trees left in the forest.
(e) Trees {0,4,3} and {2,5} joined by edge (4,5) to form tree {0,4,3,2,5}. Now 5 trees left in the forest.
(f) Trees {0,4,3,2,5} and {1} joined by edge (1,4) to form tree {0,4,3,2,5,1}. Now 4 trees left in the forest.
(g) Trees {0,4,3,2,5,1} and {6} joined by edge (4,6) to form tree {0,4,3,2,5,1,6}. Now 3 trees left in the forest.
(h) Trees {0,4,3,2,5,1,6} and {7} joined by edge (4,7) to form tree {0,4,3,2,5,1,6,7}. Now 2 trees left in the forest.
(i) Trees {0,4,3,2,5,1,6,7} and {8} joined by edge (4,8) to form tree {0,4,3,2,5,1,6,7,8}. Now only 1 tree is left and it is the MST.
Figure 7.66
The resulting minimum spanning tree is-
Figure 7.67
Let us see how we can implement this algorithm. We examine all the edges one by one starting from the
smallest edge. To decide whether the selected edge should be included in the spanning tree or not, we will
examine the two vertices connected by the edge. If the two vertices belong to the same tree, it means that they are
already connected and adding this edge would result in a cycle. So we will insert an edge in the spanning tree
only if its vertices are in different trees.
Now the question is how to decide whether two vertices are in the same tree or not. We will keep a record of
the father of every vertex. Since each component is a tree, every vertex will have only one distinct father. We will
recognize a tree by its root vertex, and a vertex will be a root if its father is NIL(-1). Initially we have only
single-vertex trees; each vertex is a root vertex, so we will take the father of all vertices as NIL. To find out which
tree a vertex belongs to, we will find the root of that tree. So we will traverse all the ancestors of the vertex till we
reach a vertex whose father is NIL. This will be the root of the tree to which the vertex belongs.
Once we know the roots of both vertices of an edge, if the roots are the same then both vertices are in the same tree
and are already connected, so this edge is rejected. If the roots are different, then we will insert this edge into the
spanning tree and we will join the two trees containing these two vertices. For joining the two trees, we
will make the root of one tree the father of the root of the other tree.
After joining two trees, all the vertices of both trees will be connected and have the same root. Initially the father
of every vertex is NIL(-1), and hence every vertex is a root vertex.
[Table: contents of the father array after each edge is examined. Initially father[i] = NIL for every vertex 0 to 8; the recoverable entries include father[5] = 2, father[2] = 3, father[7] = 1 and father[8] = 1 after the corresponding edges are inserted.]
The minimum spanning tree should contain n-1 edges, where n is the number of vertices in the graph. This graph
contains 9 vertices, so after including 8 edges in the spanning tree we will not examine other edges of the graph
and will stop our process.
Edges included in this spanning tree are (0,4), (3,4), (2,5), (4,5), (1,4), (4,6), (4,7), (4,8)
Weight of this spanning tree is 2 + 3 + 5 + 6 + 8 + 11 + 12 + 15 = 62
We will take an array named father, and if the index of the array is considered as the vertex, then the
element present at that index will represent the father of that vertex.
To obtain the edges in ascending order we can insert them in a priority queue. We will take a linked priority
queue of edges in increasing order of their weights.
In Prim's algorithm we have a single tree at all stages of the algorithm, while in Kruskal's algorithm we
have a tree only at the end. Kruskal's algorithm is faster than Prim's because in the latter we may have to consider
an edge several times, but in the former an edge is considered only once.
/* P7.13 Program for creating a minimum spanning tree by Kruskal’s algorithm*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100 - :
#define NIL -1
struct edge
{
	int u;
	int v;
	int weight;
	struct edge *link;
}*front = NULL;
void make_tree(struct edge tree[]);
void insert_pque(int i,int j,int wt);
struct edge *del_pque();
int isEmpty_pque() ;
void create_graph();
int n; /*Number of vertices in the graph*/
main()
{
	int i;
struct edge tree[MAX]; /*Will contain the edges of spanning tree*/
int wt_tree = 0; /*Weight of the spanning tree*/
create_graph();
make_tree(tree) ;
printf("Edges to be included in minimum spanning tree are :\n");-
for(i=l; i<=n-1; i++)
{
		printf("%d->",tree[i].u);
		printf("%d\n",tree[i].v);
wt_tree += tree[i].weight;
}
	printf("Weight of this minimum spanning tree is : %d\n",wt_tree);
}/*End of main()*/
void make_tree(struct edge tree[])
{
	struct edge *tmp;
	int v1,v2,root_v1,root_v2;
	int father[MAX]; /*Holds father of each vertex*/
	int i,count = 0; /*Denotes number of edges included in the tree*/
	for(i=0; i<n; i++)
		father[i] = NIL;
/*Loop till queue becomes empty or till n-1 edges have been inserted in the tree*/
	while(!isEmpty_pque() && count<n-1)
{
tmp = del_pque();
		v1 = tmp->u;
		v2 = tmp->v;
		while(v1!=NIL)
		{
			root_v1 = v1;
			v1 = father[v1];
		}
		while(v2!=NIL)
		{
			root_v2 = v2;
			v2 = father[v2];
		}
if (root_v1!=root_v2)/*Insert the edge (v1, v2) */
{ ‘
count++;
tree[count].u = tmp->u;
tree[count] .v = tmp->v;
tree[count].weight = tmp->weight;
			father[root_v2] = root_v1;
}
	}
	if(count<n-1)
	{
		printf("Graph is not connected, no spanning tree possible\n");
		exit(1);
	}
}/*End of make_tree()*/
/*Inserting edges in the linked priority queue*/
void insert_pque(int i,int j,int wt)
{
	struct edge *tmp,*q;
	tmp = (struct edge *)malloc(sizeof(struct edge));
	tmp->u = i;
	tmp->v = j;
	tmp->weight = wt;
	/*Queue empty or edge to be inserted before the first edge*/
	if(front==NULL || tmp->weight < front->weight)
	{
		tmp->link = front;
		front = tmp;
	}
	else
	{
		q = front;
		while(q->link!=NULL && q->link->weight<=tmp->weight)
			q = q->link;
		tmp->link = q->link; /*NULL if the edge is added at the end*/
		q->link = tmp;
	}
}/*End of insert_pque()*/
/*Deleting an edge from the linked priority queue*/
struct edge *del_pque()
{
	struct edge *tmp;
	tmp = front;
	front = front->link;
	return tmp;
}/*End of del_pque()*/
int isEmpty_pque()
{
	if(front==NULL)
		return 1;
	else
		return 0;
}/*End of isEmpty_pque()*/
void create_graph()
{
	int i,wt,max_edges,origin,destin;
	printf("Enter number of vertices : ");
	scanf("%d",&n);
	max_edges = n*(n-1)/2;
	for(i=1; i<=max_edges; i++)
	{
		printf("Enter edge %d(-1 -1 to quit): ",i);
		scanf("%d %d",&origin,&destin);
		if(origin==-1 && destin==-1)
			break;
		printf("Enter weight for this edge : ");
		scanf("%d",&wt);
		if(origin>=n || destin>=n || origin<0 || destin<0)
		{
			printf("Invalid edge!\n");
			i--;
		}
		else
			insert_pque(origin,destin,wt);
	}
}/*End of create_graph()*/
Figure 7.68
;
We can see that the process of topological sorting linearizes the graph, i.e. we can write all the vertices in a
horizontal line such that all the directed edges go from left to right only. This phenomenon is entirely different
from the usual sorting techniques.
Topological sorting is possible only in acyclic graphs, i.e. if the graph contains a cycle then no topological
order is possible. This is because for any two vertices u and v in the cycle, u precedes v and v precedes u.
There may be more than one topological sequence for a given directed acyclic graph. For example, another
topological ordering for the graph of figure 7.68 is shown in figure 7.69.
Figure 7.69
Before studying the algorithm for topological sorting, let us first see where it can be used. There are many
applications where execution of one task is necessary before starting another task. For example understanding
of ‘C’ language and programming is necessary before starting ‘Data Structure through C’. Similarly in this book
also we can go to heap sort or binary tree sort only after understanding tree.
To model these types of problems, where tasks depend on one another, we can draw a directed graph in which
vertices represent tasks, and if task x has to be completed before task y then there is a directed edge from x to y.
Suppose a student needs to pass some courses to acquire a degree. The curriculum includes 7 courses named
A, B, C, D, E, F, G. Some courses have to be taken before others; for example course B can be studied
only after studying courses A, C, D. The prerequisite courses for each course are given in the table below.
Figure 7.70
Now from this directed graph we can find the topological order, which is the required sequence in which
students should take the courses. We have seen earlier that there may be more than one topological sequence
possible; one such sequence for the above graph is A-C-G-D-B-E-F.
Figure 7.71
Now we will study the algorithm for topological sorting. It finds the solution using the greedy approach.
The procedure is -
Select a vertex with no predecessors (vertex with zero indegree).
Delete this vertex and all edges going out from it.
Repeat this procedure till all the vertices are deleted from the graph.
If in the middle of this procedure we arrive at a situation when no vertex can be deleted, i.e. there is no
vertex left with zero indegree and all the vertices in the graph have predecessors, then it means that the graph has a
cycle. In this case no solution is possible.
Now let us see how we can implement this algorithm. To keep track of all the vertices with zero indegree we
can use either a stack or a queue. Here we will use a queue, and it will temporarily store the vertices with zero
indegree. We will take a one-dimensional array topo_order which will be used to represent the topological order of the
vertices. The vertices will be stored in this array in the sequence of their deletion from the queue.
Initially the indegree of each vertex is computed, and the vertices that have zero indegree are inserted
into the initially empty queue. A vertex from the queue is deleted and listed in the topo_order array. The
edges going from this vertex are deleted and the indegrees of its successors are decremented by 1. A vertex is
inserted into the queue as soon as its indegree becomes 0. This process continues till all the vertices in the graph
are deleted.
Let us take a graph and apply the topological sorting-
Figure 7.72

Step 3 - Delete the vertex 6 and edges going from vertex 6.
Queue : 3        topo_order : 0, 2, 6
Updated indegree of vertices : In(1)=1, In(4)=2, In(5)=2
Step 4 - Delete the vertex 3 and edges going from vertex 3.
Queue : Empty    topo_order : 0, 2, 6, 3
Updated indegree of vertices : In(1)=0, In(4)=1, In(5)=2
Insert vertex 1 into the queue
Queue : 1
Now we have no more vertices in the graph. So the topological sorting of graph will be -
0, 2, 6, 3, 1, 4, 5
If there are vertices remaining in the graph and the queue becomes empty at the end of any step, it implies that
the graph contains a cycle, so we will stop our procedure. In the program we will take an array indeg, which will
store the indegrees of the vertices.
/*P7.14 Program for topological sorting*/
#include<stdio.h>
#include<stdlib.h>
#define MAX 100

int n; /*Number of vertices in the graph*/
int adj[MAX][MAX]; /*Adjacency Matrix*/
void create_graph();

int queue[MAX],front = -1,rear = -1;
void insert_queue(int v);
int delete_queue();
int isEmpty_queue();
int indegree(int v);

main()
{
    int i,v,count,topo_order[MAX],indeg[MAX];
    create_graph();
    /*Find the indegree of each vertex and insert zero indegree vertices into the queue*/
    for(i=0; i<n; i++)
    {
        indeg[i] = indegree(i);
        if(indeg[i]==0)
            insert_queue(i);
    }
    count = 0;
    while(!isEmpty_queue() && count<n)
    {
        v = delete_queue();
        topo_order[++count] = v; /*Add vertex v to topo_order array*/
        /*Delete the edges going from vertex v*/
        for(i=0; i<n; i++)
        {
            if(adj[v][i]==1)
            {
                adj[v][i] = 0;
                indeg[i] = indeg[i]-1;
                if(indeg[i]==0)
                    insert_queue(i);
            }
        }
    }
    if(count<n)
    {
        printf("No topological ordering possible, graph contains cycle\n");
        exit(1);
    }
    printf("Vertices in topological order are :\n");
    for(i=1; i<=count; i++)
        printf("%d ",topo_order[i]);
    printf("\n");
}/*End of main()*/
void insert_queue(int vertex)
{
    if(rear==MAX-1)
        printf("Queue Overflow\n");
    else
    {
        if(front==-1) /*If queue is initially empty*/
            front = 0;
        rear = rear+1;
        queue[rear] = vertex;
    }
}/*End of insert_queue()*/
int isEmpty_queue()
{
    if(front==-1 || front>rear)
        return 1;
    else
        return 0;
}/*End of isEmpty_queue()*/
int delete_queue()
{
    int del_item;
    if(front==-1 || front>rear)
    {
        printf("Queue Underflow\n");
        exit(1);
    }
    else
    {
        del_item = queue[front];
        front = front+1;
    }
    return del_item;
}/*End of delete_queue()*/
int indegree(int v)
{
    int i,in_deg = 0;
    for(i=0; i<n; i++)
        if(adj[i][v]==1)
            in_deg++;
    return in_deg;
}/*End of indegree()*/
Exercise
What is the relation between the number of edges, sum of indegrees and sum of outdegrees of all the vertices?
5. How many edges are there in a regular graph of n vertices having degree d?
(i) A regular graph of degree 2 has 5 vertices. How many edges are there in the graph?
(ii) A regular graph of degree 3 has 4 vertices. How many edges are there in the graph?
(iii) A regular graph of degree 3 has 5 vertices. How many edges are there in the graph?
7. Find whether the following graphs are strongly connected or not.
9. For the following graph compute the shortest paths from vertex 0 to all other vertices using Dijkstra's
algorithm.
Sorting means arranging the data according to their values in some specified order, where order can be either
ascending or descending. For example if we have a list of numbers {6, 2, 8, 1, 4}, then after sorting them in
ascending order we get {1, 2, 4, 6, 8} and after sorting them in descending order we get {8, 6, 4, 2, 1}. Here the
data that we sorted consists only of numbers, but it may be anything like strings or records. Generally we have
to sort a list of records where each record contains several information fields. Sorting is done with respect to a
key where key is a part of the record. Sorting these records means rearranging the records so that the key values
are in order. The key on which sorting is performed is also known as the sort key.
Suppose we have several records of employees where each record contains three fields viz. name, age and
salary. We can sort the records taking any one of these fields as the sort key. The table below shows the
unsorted list of records and the sorted lists having name, age and salary as the sort keys one by one.
(Records with fields Name, Age and Salary : the unsorted list L, list L sorted by name in ascending order, list L sorted by age in ascending order, and list L sorted by salary in descending order)
Figure 8.1 Sorting on different keys
We can see that sorting the data according to different keys arranges the data in different orders.
In our algorithms we will perform sorting on a listofinteger values only, so that we can focus on the logic of
the algorithm. The extension of these algorithms to sort a list of records is simple. After the discussion of all
sort methods, there are two programs in which we can see how to sort records.
Now let us see what is the requirement of sorting and why is it important to keep our data in sorted order. In
our daily life, we can see many places where data is kept in sorted order like dictionary, telephone directory,
index of books, bank accounts, merit list, roll numbers etc. Imagine the time taken to search for a word in a
dictionary if the words were not arranged alphabetically or consider the case when you have to search for a
name in a telephone directory and the names are not sorted. Suppose you want to know where the topic "Queue"
is given in this book, then obviously you will go to the index of this book to find the page number; you directly
go to the words starting with 'Q' and in an instant you find your word. This was possible because words in the
index were sorted. If the index was not sorted then you had only one option for searching a particular word i.e.
one by one. So we see that it is easier and faster to search for an item in data that is sorted. Similarly in
computer applications, sorting helps in faster information retrieval and hence the data processing operations
become more efficient if data is arranged in some specific order. So practically there is no data processing
application that does not perform sorting.
(Records with fields Name and Age : the unsorted list, two sorted lists obtained by unstable sorts, and a sorted list obtained by a stable sort)
Figure 8.2 Stable and Unstable Sort
Any sorting algorithm would place (Amit,37) in first position, (Kiran,18) in fifth position, (Shriya,45) in
sixth position and (Vineet,25) in seventh position. There are three records with identical keys(names), which are
(Deepa,67), (Deepa,20) and (Deepa,56), and any sorting algorithm would place them in adjacent locations i.e.
second, third and fourth locations but not necessarily in the same relative order.
A sorting algorithm is said to be stable if it maintains the relative order of the duplicate keys in the sorted
output i.e. if keys are equal then their relative order in the sorted output is same. For example if records Ri and
Rj have equal keys and if record Ri precedes record Rj in the input data then Ri should precede Rj in the sorted
output data also if the sort is stable. If the sort is not stable then Ri and Rj may be in any order in the sorted
output. So in an unstable sort the duplicate keys may occur in any order in the sorted output.
In our example the first two sorted lists did not maintain the relative order of the duplicate keys while the
third one did. So we can say that the first two sorted lists were obtained by unstable sorting algorithms while the
last one was obtained by a stable sorting algorithm.
Sometimes we need to sort the records according to different keys at different times, i.e. records which are
sorted on one key are again sorted on another key. In these types of situations an unstable sort is not desirable.
Let us take an example and see why it is so.
Suppose we have a list of all students of a school consisting of their names and classes, and the list is
alphabetically sorted on name, i.e. all the names are in alphabetical order. Now suppose we want to sort this list
with respect to class. Any sorting algorithm will place the names of all classmates in adjacent locations, but
only a stable sort will place the names of students in a particular class alphabetically.
(List L with names in sorted order; list L sorted on class by an unstable sort; list L sorted on class by a stable sort)
Figure 8.3
We can see that in a stable sort we got the names of students of each class in alphabetical order while the unstable
sort disturbed the initial order of students who were in the same class.
(Records stored at memory addresses 2020, 2040, 2060, 2080 and 3000, shown (a) before sorting and (b) after sorting)
Here we can see that the records are moved from one place to another in the memory, for example the record
(Vineet, 25, 4000) was initially stored at address 2020 but after sorting it is at address 3000.
If records to be sorted are very large then this process of moving records can be an expensive task. In this
case we can take an array of pointers, which contains addresses of the records in memory. Now instead of
rearranging the records, we rearrange the addresses inside the pointer array. In figure 8.5, we have performed
sorting on the same records but this time by adjusting pointers in the pointer array.
420 Data Structures through C in
The process of sorting requires traversing the given list many times. These traversals may be on the whole list
or a part of it depending on the algorithm. This procedure of sequentially traversing the list or a part of it is
called a pass. Each pass can be considered as a step in sorting and after the last pass we get the sorted list.
Sort Efficiency
Sorting is an important and frequent operation in many applications and so the aim is not only to get the sorted
data but to get it in the most efficient manner. Therefore many algorithms have been developed for sorting and to
decide which one to use we need to compare them using some parameters. The choice is made using these three
parameters -
1. Coding time
2. Space requirement
3. Run time or execution time
If data is in small quantity and sorting is needed only on a few occasions then any simple sorting technique
would be adequate. This is because in these cases a simple or less efficient technique would behave at par with
the complex techniques developed to minimize run time and space requirements. So there is no point in
spending a lot of time in searching for the best sorting algorithm or implementing a complicated technique.
We have already discussed about the space requirement of a sort, and we've seen that if data to be sorted is in
large quantity then it is better to use an in place sort.
The most important parameter is the running time of the algorithm. If the amount of data to be sorted is in
large quantity, then it is crucial to minimize run time by choosing an efficient sorting technique.
The two basic operations in any sorting algorithm are comparisons and record movements. The record
movements or any other operations are generally a constant factor of the number of comparisons, and moreover the
record moves can be considerably reduced, so the run time is calculated by measuring the number of
comparisons. Calculating the exact number of comparisons may not be always possible so an approximation is
given by big-O notation. Thus the run time efficiency of different algorithms is expressed in terms of O
notation. The efficiency of most of the sorting algorithms is in between O(n log n) and O(n²).
In some sorting algorithms, the time taken to sort depends on the order in which elements appear in the
original data i.e. these algorithms behave differently when the data is already sorted or when it is in reverse
order. For example if the data to be sorted is {4, 6, 8, 9, 10}, then an intelligent algorithm will immediately find
out that the data is already sorted and it will not waste time in doing anything. Some sorting algorithms always
take the same time to sort, irrespective of the order of data.
The run time of a data sensitive algorithm may be different for different orders of data; hence we need to
analyze the sorting algorithms in three different cases which are -
(i) Input data is in sorted order(ascending), e.g. { 1, 2, 3, 4, 5, 6, 7, 8 }
(ii) Input data is in random order, i.e. all the elements are dispersed in the data and there is no specific order
of these elements e.g. { 4, 8, 1, 6, 5, 2, 3, 7 }. In this case it is assumed that all n! permutations of data are
equally likely where n is the size of data.
(iii) Input data is in reverse sorted order(descending) e.g. { 8, 7, 6, 5, 4, 3, 2, 1 }.
There are numerous sorting algorithms but none of them can be termed best or most efficient, each algorithm
has its advantages and disadvantages. The choice of a sorting algorithm depends on the specific situation. For
example if we know in advance that our data is almost sorted then it would be useful to use an algorithm which
can identify this order. The size of data is also considered while deciding which algorithm to choose. The
amount of space available also determines our choice of algorithm. If we have no extra space then we have to
use in place algorithms. So we can see that the choice of a particular sorting technique depends not
only on the order of that technique but also on the situation and type of data. Now after this introduction of
sorting we are ready to study various sorting algorithms and their analysis.
Selection Sort
Suppose that you are given some numbers and asked to arrange them in ascending order. The most intuitive
way to do this would be to find the smallest number and put it in the first place, then find the second smallest
number and put it in the second place and so on. This is the simple technique on which selection sort is based. It
is named so because in each pass it selects the smallest element and keeps it in its exact place.
Suppose we have n elements stored in an array arr. First we will search the smallest element from
arr[0].........arr[n-1] and exchange it with arr[0]. This will place the smallest element of the list at the 0th
position of the array. Now we will search the smallest element from the remaining elements arr[1].........arr[n-1] and
exchange it with arr[1]. This will place the second smallest element of the list at the 1st position of the array. This
process continues till the whole array is sorted. The whole process is as -
Pass 1 :
1. Search the smallest element from arr[0]........arr[n-1].
2. Exchange this element with arr[0].
Result : arr[0] is sorted.
Pass 2 :
1. Search the smallest element from arr[1]..........arr[n-1].
2. Exchange this element with arr[1].
Result : arr[0], arr[1] are sorted.
Pass n-1 :
1. Search the smallest element from arr[n-2] and arr[n-1].
2. Exchange this element with arr[n-2].
Result : arr[0], arr[1], .............. arr[n-2] are sorted.
Now all the elements except the last one have been put in their proper place. The remaining last element
arr[n-1] will definitely be the largest of all and so it is automatically at its proper place. So we need only n-1
passes to sort the array. Let us take a list of elements in unsorted order and sort it by applying selection sort.
Figure 8.6 Selection sort
In the first pass, 8 is the smallest element among arr[0].........arr[8], so it is exchanged with arr[0]
i.e. 82. In the second pass, 25 is the smallest among arr[1].......arr[8] so it is exchanged with arr[1] i.e.
42. Similarly other passes also proceed. The shaded portion shows the elements that have been put in their final
place.
/*P8.1 Program of sorting using selection sort*/
#include <stdio.h>
#define MAX 100
main()
{
    int arr[MAX],i,j,n,temp,min;
    printf("Enter the number of elements : ");
    scanf("%d",&n);
    for(i=0; i<n; i++)
    {
        printf("Enter element %d : ",i+1);
        scanf("%d",&arr[i]);
    }
    /*Selection sort*/
    for(i=0; i<n-1; i++)
    {
        /*Find the index of smallest element*/
        min = i;
        for(j=i+1; j<n; j++)
        {
            if(arr[j] < arr[min])
                min = j;
        }
        if(i!=min)
        {
            temp = arr[i];
            arr[i] = arr[min];
            arr[min] = temp;
        }
    }
    printf("Sorted list is :\n");
    for(i=0; i<n; i++)
        printf("%d ",arr[i]);
    printf("\n");
}/*End of main()*/
Each iteration of outer for loop corresponds to a single pass. In each iteration of outer for loop we have to
exchange arr[i] with the smallest element among arr[i]...arr[n-1]. The inner for loop is used to find the
index of the smallest element and it is stored in min. Initially variable min is initialized with i. After this,
arr[min] is compared with each of the elements arr[i+1], arr[i+2]........arr[n-1] and whenever we get a
smaller element, its index is assigned to min.
After finding the smallest element, it is exchanged with arr[i]. We have preceded this swap operation with
a condition to avoid swapping of an element with itself. This situation arises when an element is already in its
proper place. In pass 6 of figure 8.6, arr[5] has to be swapped with arr[5] which is obviously redundant.
Compare arr[n-2] and arr[n-1], If arr[n-2] > arr[n-1] then exchange them.
Result : Largest element is placed at (n-1)th position.
arr[n-1] is sorted.
Pass 2 :
Compare arr[0] and arr[1], If arr[0] > arr[1] then exchange them.
Compare arr[1] and arr[2], If arr[1] > arr[2] then exchange them.
Compare arr[2] and arr[3], If arr[2] > arr[3] then exchange them.
Compare arr[n-3] and arr[n-2], If arr[n-3] > arr[n-2] then exchange them.
Result : Second largest element is placed at (n-2)th position.
arr[n-2], arr[n-1] are sorted.
Pass n-2 :
Compare arr[0] and arr[1], If arr[0] > arr[1] then exchange them.
Compare arr[1] and arr[2], If arr[1] > arr[2] then exchange them.
Result : arr[2], .............., arr[n-2], arr[n-1] are sorted.
Pass n-1 :
Compare arr[0] and arr[1], If arr[0] > arr[1] then exchange them.
(Passes of bubble sort on the list 40 20 50 60 30 10, with the values of indices i and j shown for each comparison)
Figure 8.7 Bubble sort
Let us see what happens in the first pass. First arr[0] is compared with arr[1], since 40>20 they are
swapped. Now arr[1] is compared with arr[2], since 40<50 they are not swapped. Now arr[2] is
compared with arr[3], since 50<60 they are not swapped. Now arr[3] is compared with arr[4], since
60>30 they are swapped. Now arr[4] is compared with arr[5], since 60>10 they are swapped. At the end of
this pass the largest element 60 is placed at the last position. The elements which are being compared are shown
in italics and the elements which have been placed in proper place are shaded.
Sometimes it is possible that a list of n elements becomes sorted in less than n-1 passes. For example
consider this list -       40 20 10 30 60 50
After first pass the list is -   20 10 30 40 50 60
After second pass the list is -  10 20 30 40 50 60
The list of 6 elements becomes sorted in only 2 passes. Hence other passes are unnecessary and there is no
need to proceed further. Now the question is how we will be able to know that the list has become sorted. If no
swaps occur in a pass it means that the list is sorted. For example in the above case there will be no swaps in the
third pass. We can take a variable that keeps record of the number of swaps in a pass and if no swaps occur then
we can terminate our procedure.
/*P8.2 Program of sorting using bubble sort*/
#include <stdio.h>
#define MAX 100
main()
{
    int arr[MAX],i,j,temp,n,xchanges;
    printf("Enter the number of elements : ");
    scanf("%d",&n);
    for(i=0; i<n; i++)
    {
        printf("Enter element %d : ",i+1);
        scanf("%d",&arr[i]);
    }
    /*Bubble sort*/
    for(i=0; i<n-1; i++)
    {
        xchanges = 0;
        for(j=0; j<n-1-i; j++)
        {
            if(arr[j] > arr[j+1])
            {
                temp = arr[j];
                arr[j] = arr[j+1];
                arr[j+1] = temp;
                xchanges++;
            }
        }
        if(xchanges==0) /*If list is sorted*/
            break;
    }
    printf("Sorted list is :\n");
    for(i=0; i<n; i++)
        printf("%d ",arr[i]);
    printf("\n");
}/*End of main()*/
The figure 8.8 shows the procedure of bubble sort along with the values of i and j. Each iteration of outer for
loop corresponds to a single pass.
If all the elements are sorted then only one pass is required and so there will be only one iteration of outer for
loop. The number of comparisons will be (n-1) and all elements are in their proper place so there will be no
swaps. Hence the time complexity in this case is O(n).
Pass n-1 :
Sorted part : arr[0], arr[1], ....... arr[n-2]
Unsorted part : arr[n-1]
arr[n-1] is inserted at its proper place among arr[0], arr[1], ..............arr[n-2].
Result : arr[0], arr[1], arr[2], ................ arr[n-1] are sorted.
To insert an element in the sorted part we need a vacant position, and this space is created by moving all the
larger elements one position to the right. Now let us take a list of elements in unsorted order and sort them by
applying insertion sort.
82 42 49 8 25 52 36 93 59

(Passes 1 to 7 of insertion sort on the above list: in each pass the first element of the unsorted part is inserted at its proper place in the sorted part, and after the last pass the list becomes 8 25 36 42 49 52 59 82 93)
In each iteration of for loop, the first element of the unsorted part is inserted into the sorted part. The element
to be inserted (arr [i]) is stored in the variable k. In the inner for loop, the sorted part is scanned to find the
exact location for the insertion of the element arr[i]. The search starts from the end of the sorted part so
variable j is initialized to i-1. The search stops when we either reach the beginning of the sorted part or we get
an element less than k. Inside the inner for loop, the elements are moved right one position, and obviously these
are elements which are greater than k. At the end k is inserted at its proper place.
= n(n-1)/4 + n - log₂n
= O(n²)
Average number of moves in the iᵗʰ iteration of outer for loop
Original list
Pass 1 : Partition into 5 sublists, sort each sublist and combine the 5 sorted sublists.
Pass 2 : Partition into 3 sublists, sort each sublist and combine the 3 sorted sublists.
Pass 3 : Only one sublist is left, which is sorted to get the final sorted list.
    /*Shell sort*/
    while(incr>=1)
    {
        for(i=incr; i<n; i++)
        {
            k = arr[i];
            for(j=i-incr; j>=0 && k<arr[j]; j=j-incr)
                arr[j+incr] = arr[j];
            arr[j+incr] = k;
        }
        incr = incr-2; /*Decrease the increment*/
    }/*End of while*/
    printf("Sorted list is :\n");
    for(i=0; i<n; i++)
        printf("%d ",arr[i]);
    printf("\n");
}/*End of main()*/
This program is similar to the program of insertion sort, except for a few changes.
(Step by step merging of the two sorted lists 5 8 9 28 34 and 4 22 25 30 33 40 42 into the merged list 4 5 8 9 22 25 28 30 33 34 40 42; index i moves through the first list, j through the second and k through the merged list)
In the function merge() given next, we send the array arr and the lower and upper bounds of the lists to be
merged. The array temp is also sent which will store the merged array.
merge(int arr[], int temp[], int low1, int up1, int low2, int up2)
{
    int i = low1, j = low2, k = low1;
    while(i<=up1 && j<=up2)
    {
        if(arr[i] <= arr[j])
            temp[k++] = arr[i++];
        else
            temp[k++] = arr[j++];
    }
    while(i<=up1)
        temp[k++] = arr[i++];
    while(j<=up2)
        temp[k++] = arr[j++];
}
If the total number of elements in arr is n, then the performance of the above merging algorithm is O(n).
}
}/*End of merge_sort*/
/*Merges arr[low1:up1] and arr[low2:up2] to temp[low1:up2]*/
void merge(int arr[], int temp[], int low1, int up1, int low2, int up2)
{
    int i = low1;
    int j = low2;
    int k = low1;
    while((i<=up1) && (j<=up2))
    {
        if(arr[i] <= arr[j])
            temp[k++] = arr[i++];
        else
            temp[k++] = arr[j++];
    }
    while(i<=up1)
        temp[k++] = arr[i++];
    while(j<=up2)
        temp[k++] = arr[j++];
}/*End of merge()*/
void copy(int arr[], int temp[], int low, int up)
{
    int i;
    for(i=low; i<=up; i++)
        arr[i] = temp[i];
}/*End of copy()*/
The process is recursive so the record of the lower and upper bounds of the sublists is implicitly maintained.
When merge_sort() is called for the first time, low and up are set to 0 and n-1. The value of mid is
calculated, which is the index of the middle element. Now merge_sort() is called for the left sublist which is
arr[0:mid]. The merge_sort() is recursively called for left sublists till it is called for a one element sublist. In
this call, the value of low will not be less than up, it will be equal to up so the recursion will terminate. Then
merge_sort() is called for the right sublist of the previous recursive call. After the return of this call, the two
sublists are merged. This continues till the first recursive call returns, after which we get the sorted result. Before
returning from merge_sort(), the merged list which is in temp is copied back to arr. The following figure
shows the values of variables low and up in the recursive calls of the function merge_sort().
Figure 8.13
Quick sort also uses the divide and conquer approach but there the partition step is difficult while combining is trivial,
whereas in merge sort partition is simple but combining is difficult.
This process continues until only one sublist of size n is left.
4 5 8 30 42 64 89 92
If there is a sublist left in the last which cannot be merged with any sublist, it is just copied to the result. In
the example given in figure 8.14, this case occurs in pass 1 where sublist [3] is left alone and in pass 3 where
sublist [3, 21, 56] is left alone.
/*P8.7 Program of sorting using merge sort without recursion*/
#include<stdio.h>
#define MAX 100
void merge_sort(int arr[], int n);
void merge_pass(int arr[], int temp[], int size, int n);
void merge(int arr[], int temp[], int low1, int up1, int low2, int up2);
void copy(int arr[], int temp[], int n);
main()
{
    int arr[MAX],i,n;
    printf("Enter the number of elements : ");
    scanf("%d",&n);
    for(i=0; i<n; i++)
    {
        printf("Enter element %d : ",i+1);
        scanf("%d",&arr[i]);
    }
    merge_sort(arr,n);
    printf("Sorted list is :\n");
    for(i=0; i<n; i++)
        printf("%d ",arr[i]);
}/*End of main()*/
void merge_sort(int arr[], int n)
{
    int temp[MAX];
    int size = 1;
    while(size<n)
    {
        merge_pass(arr,temp,size,n);
        size = size*2;
    }
}/*End of merge_sort()*/
void merge_pass(int arr[], int temp[], int size, int n)
{
    int i,low1,up1,low2,up2;
    low1 = 0;
    while(low1+size < n)
    {
        up1 = low1 + size-1;
        low2 = low1 + size;
        up2 = low2 + size-1;
        if(up2 >= n) /*If length of last sublist is less than size*/
            up2 = n-1;
        merge(arr,temp,low1,up1,low2,up2);
        low1 = up2+1; /*Take next two sublists for merging*/
    }
    for(i=low1; i<=n-1; i++)
        temp[i] = arr[i]; /*If any sublist is left alone*/
    copy(arr,temp,n);
}/*End of merge_pass()*/
void merge(int arr[], int temp[], int low1, int up1, int low2, int up2)
{
    int i = low1;
    int j = low2;
    int k = low1;
    while(i<=up1 && j<=up2)
    {
        if(arr[i] <= arr[j])
            temp[k++] = arr[i++];
        else
            temp[k++] = arr[j++];
    }
    while(i<=up1)
        temp[k++] = arr[i++];
    while(j<=up2)
        temp[k++] = arr[j++];
}/*End of merge()*/
void copy(int arr[], int temp[], int n)
{
    int i;
    for(i=0; i<n; i++)
        arr[i] = temp[i];
}/*End of copy()*/
We can avoid the copying from arr to temp by merging alternately from arr to temp and from temp to
arr. The function merge_sort() would be like this -
merge_sort(int arr[], int n)
{
    int temp[MAX];
    int size = 1;
    while(size < n)
    {
        merge_pass(arr,temp,size,n); /*Merge from arr to temp*/
        size = size*2;
        merge_pass(temp,arr,size,n); /*Merge from temp to arr*/
        size = size*2;
    }
}
Now we can delete the last statement from the function merge_pass() which is used for copying temp to
arr. The total number of passes required is log₂n and in each pass n elements are merged, so the complexity is
O(nlog₂n).
struct node
{
    int info;
    struct node *link;
};
    else
        return start;
}
struct node *divide(struct node *p)
{
struct node *q, *start_second;
q = p->link->link;
while (q!=NULL)
{
        p = p->link;
q = q->link;
if (q!=NULL)
q = q->link;
}
start_second = p->link;
p->link = NULL;
return start_second;
}
struct node *merge(struct node *p1, struct node *p2)
{
    struct node *start_merged, *q;
    if(p1->info <= p2->info)
    {
        start_merged = p1;
        p1 = p1->link;
    }
    else
    {
        start_merged = p2;
        p2 = p2->link;
    }
    q = start_merged;
    while(p1!=NULL && p2!=NULL)
    {
        if(p1->info <= p2->info)
        {
            q->link = p1;
            q = q->link;
            p1 = p1->link;
        }
        else
        {
            q->link = p2;
            q = q->link;
            p2 = p2->link;
        }
    }
    if(p1!=NULL)
        q->link = p1;
    else
        q->link = p2;
    return start_merged;
}/*End of merge()*/
The function divide () takes a pointer to the original list and returns a pointer to the start of the second
sublist. We have taken two pointers p and q, where pointer p points to the first node and pointer q points to the
third node. In a loop we move pointer p once and pointer q twice so that when the pointer q is NULL, pointer p
points to the middle node. The node next to the middle node will be the start of second sublist. The middle node
will be the last node of first sublist so its link is assigned NULL.
The function merge () takes pointers to the two sorted lists, merges them into a single sorted list and returns
a pointer to the merged list. There is no requirement of any temporary storage for merging of linked lists.
then after placing 4 at its proper place the list becomes [1, 3, 2, 4, 6, 8, 9, 7]. Now we can partition this list into
two sublists based on this pivot and these sublists are [1, 3, 2] and [6, 8, 9, 7]. We can apply the same procedure
to these two sublists separately i.e. we will select one pivot for each sublist and both the sublists are divided into
2 sublists each so now we get 4 sublists. This process is repeated for all the sublists that contain two or more
elements and in the end we get our sorted list.
Now let us outline the process for sorting the elements through quick sort. Suppose we have an array arr
with low and up as the lower and upper bounds.
1. Take the first element of list as pivot.
2. The list is partitioned in such a way that pivot comes at its proper place. After this partition, all elements to
the left of pivot are less than or equal to the pivot and all elements to the right of pivot are greater than or equal to
the pivot. So one element of the list i.e. pivot is at its proper place. Let the index of pivot be pivotloc.
3. Create two sublists on the left and right side of pivot, the left sublist is arr[low].....arr[pivotloc-1] and the
right sublist is arr[pivotloc+1]....arr[up].
4. The left sublist is sorted using quick sort recursively.
5. The right sublist is sorted using quick sort recursively.
6. The terminating condition for recursion is - when the sublist formed contains only one element or no element.
The sublists are kept in the same array, and there is no need of combining the sorted sublists at the end. Let
us take a list of elements and sort them through quick sort.
Initial list :            48 44 19 59 72 80 42 65 82 8 95 68
After placing pivot 48 :  42 44 19 8 48 80 72 65 82 59 95 68
Finally sorted list :     8 19 42 44 48 59 65 68 72 80 82 95
Figure 8.15 Quick Sort
Here we are focusing only on the recursion procedure; the logic of placing the pivot at proper place is
discussed later. The values of low and up indicate the lower and upper bounds of the sublists.
Initially the list is [48, 44, 19, 59, 72, 80, 42, 65, 82, 8, 95, 68]. We will take the first element(48) as the
pivot and place it at its proper place, and so the list becomes [42, 44, 19, 8, 48, 80, 72, 65, 82, 59, 95, 68]. Now all
the elements to the left of 48 are less than it and all elements to the right of 48 are greater than it. We will take
two sublists left and right of 48, which are [42, 44, 19, 8] and [80, 72, 65, 82, 59, 95, 68], and sort them
separately using the same procedure. Note that the order of the elements in the left sublist or in the right sublist
is not the same as it appears in the original list. It depends on the partition process which is used to place the pivot
at its proper place. For now you just need to understand that all elements to the left of the pivot are less than or equal to
the pivot and all elements to the right of the pivot are greater than or equal to the pivot.
The left sublist is [42, 44, 19, 8] and its pivot is taken as 42; after placing the pivot the list becomes [19, 8,
42, 44]. Now 42 is at its proper place, we again divide this list into two sublists which are [19, 8] and [44]. The
list [44] has only one element so we will stop. The list [19, 8] is taken and here the pivot will be 19 and after
placing the pivot at proper place the list becomes [8, 19]. The left sublist is [8] and it contains only one element
so we will not process it. There are no elements to the right of 19, so right sublist is not formed or we can say
that right sublist contains zero elements so we will stop.
The right sublist is [80, 72, 65, 82, 59, 95, 68] and 80 is taken as the pivot. After placing the pivot the list
becomes [59, 72, 65, 68, 80, 95, 82]. Now the two sublists formed are [59, 72, 65, 68] and [95, 82]. In the first
sublist, 59 is taken as the pivot and after placing it at the proper place the sublist becomes [59, 72, 65, 68].
There are no elements to the left of 59 so only one sublist is formed which is [72, 65, 68]. Here 72 is taken as
the pivot and after placing it the list becomes [68, 65, 72]. There are no elements to the right of 72 so only one
sublist is formed which is [68, 65]. In this sublist 68 is taken as the pivot and after placing it at proper place the
sublist becomes [65, 68]. There are no elements to the right of 68 and the left sublist has only one element so we
will stop.
After this we will take the sublist [95, 82]. Here 95 is taken as the pivot and after placing it at proper place
the sublist is [82, 95]. There are no elements to the right of 95 and the left sublist has only one element so we
will stop. ‘
We can write a recursive function for this procedure. There would be two terminating conditions, when the
sublist formed has only 1 element or when no sublist is formed. If the value of low is equal to up then there will
be only one element in the sublist and if the value of low exceeds up then no sublist is formed. So the
terminating condition can be written as-
if (low>=up)
return;
The recursive function quick () can be written as-
void quick(int arr[],int low,int up)
{
	int pivloc;
	if(low>=up)
		return;
	pivloc = partition(arr,low,up);
	quick(arr,low,pivloc-1);	/*Process left sublist*/
	quick(arr,pivloc+1,up);	/*Process right sublist*/
}/*End of quick()*/
The function partition () will place the pivot at proper place and then return the location of pivot so that
we can form two sublists left and right of pivot.
Since the process is recursive, the record of lower and upper bounds of sublists is implicitly maintained.
Now the second main task is to partition a list into two sublists by placing the pivot at the proper place. Let us
see how we can do this. Suppose we have an array arr[low:up]; the element arr[low] will be taken as the
pivot. We will take two index variables i and j, where i is initialized to low+1 and j is initialized to up. The
following process will put the pivot at its proper place.
(a) Compare the pivot with arr[i], and increment i if arr[i] is less than the pivot element. So the index
variable i moves from left to right and stops when we get an element greater than or equal to the pivot.
(b) Now compare the pivot with arr[j], and decrement j if arr[j] is greater than the pivot element. So the
index variable j moves from right to left, and stops when we get an element less than or equal to the pivot.
(c) If i is less than j
	Exchange the values of arr[i] and arr[j], increment i and decrement j;
else
	No exchange, increment i.
(d) Repeat the steps (a), (b), (c) till the value of i is less than or equal to j. We will stop when i exceeds j.
(e) When the value of i becomes more than j, we have found the proper place for the pivot, which is given by j; hence
now the pivot is to be placed at position j. The pivot was at location low, so we can place it at j by exchanging the
values of arr[low] and arr[j]. Now the pivot is at position j, which is its final position.
Now let us take a list and see how pivot will be placed at proper place through this partition process.
low=0, up=11
i=1 and j=11, Pivot = arr[low] = 48       [48, 44, 19, 59, 72, 80, 42, 65, 82, 8, 95, 68]
44 < 48, Increment i (i=2)
19 < 48, Increment i (i=3)
59 > 48, Stop i at 3
68 > 48, Decrement j (j=10)
95 > 48, Decrement j (j=9)
8 < 48, Stop j at 9
Now both i and j have stopped, i=3, j=9
Since i < j, Exchange arr[3] and arr[9],
Increment i and decrement j (i=4, j=8)    [48, 44, 19, 8, 72, 80, 42, 65, 82, 59, 95, 68]
72 > 48, Stop i at 4
82 > 48, Decrement j (j=7)
65 > 48, Decrement j (j=6)
42 < 48, Stop j at 6
Since i < j, Exchange arr[4] and arr[6],
Increment i and decrement j (i=5, j=5)    [48, 44, 19, 8, 42, 80, 72, 65, 82, 59, 95, 68]
80 > 48, Stop i at 5
80 > 48, Decrement j (j=4), 42 < 48, Stop j at 4
Now i exceeds j, so Exchange arr[low] and arr[j]
Pivot placed at proper place              [42, 44, 19, 8, 48, 80, 72, 65, 82, 59, 95, 68]
<---- Left sublist ---->  Pivot  <---- Right sublist ---->
The location for pivot is the value of j, so the function partition will return value of j. Now for left sublist
low=0 and up=3 and for right sublist low=5 and up=11.
int partition(int arr[],int low,int up)
{
	int tmp,i,j,pivot;
	i = low+1;
	j = up;
	pivot = arr[low];
	while(i <= j)
	{
		while((arr[i] < pivot) && (i < up))
			i++;
		while(arr[j] > pivot)
			j--;
		if(i < j)	/*Exchange arr[i] and arr[j]*/
		{
			tmp = arr[i];
			arr[i] = arr[j];
			arr[j] = tmp;
			i++;
			j--;
		}
		else
			i++;
	}
	arr[low] = arr[j];	/*Place pivot at position j*/
	arr[j] = pivot;
	return j;
}/*End of partition()*/
The variable i is moving right, so we have to prevent it from moving past the array bound. For example, if the
pivot is the largest element in the array then there is no element in the array that can stop i, and it just moves
on. So we have to check the condition (i<up) before incrementing i. The other way out may be to put a
sentinel at the end having the largest possible value. The variable j is moving left but it will never cross the bound
of the array because the pivot is there at the leftmost position.
The program for quick sort is given below; the functions quick() and partition() are the same as
described earlier.
/*P8.9 Program of sorting using quick sort*/
#include<stdio.h>
#define MAX 100
void quick(int arr[],int low,int up);
int partition(int arr[],int low,int up);
main()
{
	int array[MAX],n,i;
The partition process is not stable so quick sort is not a stable sort.
[Figure 8.16: worst case of quick sort, one of the two sublists is empty at each step]
If we have a list of n elements, then first we get 2 sublists of sizes 0 and n-1. Now the sublist of size n-1 is
divided into two sublists of sizes 0 and n-2. The total number of lists that are sorted is n-1 and these are of sizes
n, n-1, n-2, ....., 2. The total number of comparisons will be-
n-1 + n-2 + n-3 + ..... + 1
= n(n-1)/2
= O(n²)
So in the worst case the performance of quick sort is O(n²).
The average case performance is closer to the best case than to the worst case and is found to be
O(nlog₂n). It is not a stable sort. Space complexity for this sort is O(log n).
In the process of finding the median we have placed the largest of the three at the end, so the pivot cannot be greater
than the last element and hence now there is no need of the condition i<up before incrementing i.
Such choices of pivot are not good because they tend to maximize the difference in the sizes of the sublists, which reduces the efficiency
of quick sort.
To understand which of the last two options is better, let us consider the case when all the elements in the list
are equal. If both i and j stop then there will be many unnecessary exchanges between equal elements but the
good thing is that i and j will meet somewhere in the middle of the list, thus creating two almost equal sublists. If
both i and j move then no unnecessary swaps would be there but the sublists would be unbalanced. If all the
elements are equal then j will always stop at the leftmost position and so the pivot will always be placed at the
leftmost position and hence one sublist will always be empty. This is the same situation that we studied in the
worst case and the running time is O(n²).
So it is best to stop i and j when any element equal to the pivot is encountered. It might seem that
considering the case of all equal elements is not a good idea as it would rarely occur. A list of all equal elements
can be rare but we may get a sublist consisting of all equal elements; for example, suppose we have a list of
500,000 elements out of which 10,000 are equal. Since quick sort is recursive, at some point there will be a
recursive call for these 10,000 identical elements, and if sorting these elements takes quadratic time then the
overall efficiency will be affected.
par->lchild=tmp;
else
par->rchild=tmp;
return root;
}/*End of insert( )*/
void inorder(struct node *root, int arr[])
{
	struct node *ptr=root;
	int i=0;
	if(ptr==NULL)
	{
		printf("Tree is empty\n");
		return;
	}
	while(1)
	{
		while(ptr->lchild!=NULL)
		{
			push_stack(ptr);
			ptr = ptr->lchild;
		}
		while(ptr->rchild==NULL)
		{
			arr[i++] = ptr->info;
			if(stack_empty())
				return;
			ptr = pop_stack();
		}
		arr[i++] = ptr->info;
		ptr = ptr->rchild;
	}
}/*End of inorder( )*/
/*Delete all nodes of the tree*/
struct node *Destroy(struct node *ptr)
{
	if(ptr!=NULL)
{
Destroy (ptr->lchild) ;
Destroy (ptr->rchild) ;
free(ptr) ;
}
return NULL;
}/*End of Destroy()*/
void push_stack(struct node *item)
{
	if(top==(MAX-1))
	{
		printf("Stack Overflow\n");
		return;
	}
	stack[++top] = item;
}/*End of push_stack()*/
struct node *pop_stack()
{
	struct node *item;
	item = stack[top--];
	return item;
}/*End of pop_stack()*/
int stack_empty()
{
	if(top==-1)
return 1;
else
return 0;
} /*End of stack_empty*/
If the data is in sorted order or reverse sorted order then the total number of comparisons is given by-
1 + 2 + 3 + 4 + ..... + n = n(n+1)/2 => O(n²)
The main drawback of the binary tree sort is that if the data is already in sorted order or in reverse order then the
performance of binary tree sort is not good.
If data is in random order and suppose we get a balanced tree whose height is approximately log n, then the
number of comparisons can be given by-
1*2^0 + 2*2^1 + 3*2^2 + ..... + h*2^(h-1)
This is because there can be a maximum of 2^L nodes at any level L, and L+1 comparisons are required to insert
any node at level L, as we have seen in case (c). The root node is at level 0 and the last level is h-1. The
efficiency in this case is O(n log n).
_ So if the tree that we obtain is of height log n then the performance of binary tree sort is O(n log n).
Generally with random data the chances of getting a balanced tree are good so we can say that average run time
of binary tree sort is O(n log n).
After insertion of the elements, some time is needed for traversal also. If we use a threaded tree then we can
reduce this time by avoiding the use of a stack. Binary tree sort is not an in-place sort since it requires additional
O(n) space for the construction of the tree. It is a stable sort.
First this array is converted to a heap. The procedure of building a heap is described in the chapter on trees. The
heap obtained from this array is shown below-
[Figure: the max heap built from the input array]
Now the root is repeatedly deleted from the heap. The procedure of deletion of root from the heap is described
in chapter on trees.
[Figure: the root is deleted repeatedly and each deleted element is placed at the end of the array, which finally becomes sorted]
This way finally we see that the array arr becomes sorted. If we don't want to change the array that represents
the heap, we can take a separate array s_arr[] for the sorted output. The elements deleted from the root can be
stored in s_arr[n], s_arr[n-1] and so on till s_arr[1]. This time we have to copy the element in position 1
also.
/*P8.11 Program of sorting through heapsort*/
#include <stdio.h>
#define MAX 100
void heap_sort(int arr[],int size);
void buildHeap(int arr[],int size);
int del_root(int arr[],int *size);
void restoreDown(int arr[],int i,int size);
void display(int arr[],int n);
main ()
{
	int i,n,arr[MAX];
printf("Enter number of elements : ");
scanf ("%d", &n) ;
for(i=1; i<=n; i++)
{
printf("Enter element %d : ",i);
scanf("%d",&arr[i]);
	}
printf("Entered list is :\n");
display (arr,n);
heap_sort(arr,n);
	printf("Sorted list is :\n");
	display(arr,n);
}/*End of main()*/
In LSD radix sort, sorting is performed digit by digit, starting from the least significant digit and ending at the
most significant digit. In the first pass, elements are sorted according to their units (least significant) digits, in the
second pass elements are sorted according to their tens digit, in the third pass elements are sorted according to their
hundreds digit and so on till the last pass, where elements are sorted according to their most significant digits.
For implementing this method we will take ten separate queues, one for each digit from 0 to 9. In the first pass,
numbers are inserted into appropriate queues depending on their least significant digits(units digit); for example
283 will be inserted in queue 3 and 540 will be inserted in queue 0. After this all the queues are combined,
starting from the digit 0 queue to the digit 9 queue, and a single list is formed. Now in the second pass the numbers
from this new list are inserted in queues based on the tens digit; for example 283 will be inserted in queue 8 and
540 will be inserted in queue 4. These queues are combined to form a single list which is used in the third pass.
This process continues till numbers are inserted into queues based on the most significant digit. The single list
that we will get after combining these queues will be the sorted list. We can see that the total number of passes
will be equal to the number of digits in the largest number. Let us take some numbers in unsorted order and sort
them by applying radix sort.
Original List : 62, 234, 456, 750, 789, 3, 21, 345, 983, 99, 153, 65, 23, 5, 98, 10, 6, 372
It is important that in each pass the sorting on digits should be stable i.e. numbers which have the same ith
digit(from right) should remain sorted on the (i-1)th digit(from right). After any pass when we get a single list,
we should enter the numbers in the queues in the same order as they are in the list. This will ensure that the digit
sorts are stable.
We take the initial input in a linked list to simplify the process. If the input is in an array then we can convert it to
a linked list. In the program we just traverse the linked list and add the number to the appropriate queue. After
this we can combine the queues into one single list by joining the end of a queue to the start of another queue.
We do not know in advance how many numbers will be inserted in a particular queue in any pass. It may be
possible that in a particular pass, the digit is same for all the numbers, and all numbers may have to be inserted
in the same queue. If we use arrays for implementing queues then each array should be of size n, and we will
need space equal to 10*n. So it is better to take linked allocation of queues instead of sequential allocation.
/*P8.12 Sorting using radix sort*/
#include<stdio.h>
#include<stdlib.h>
struct node
{
	int info;
struct node *link;
}*start = NULL;
void radix_sort()j;
int large_dig();
int digit(int number,int k);
main ()
{
struct node * tmp, *q;
	int n,i,item;
printf("Enter the number of elements in the list : ");
	scanf("%d",&n);
	for(i=0; i<n; i++)
{
printf("Enter element %d : ",i+1);
scanf("%d", &item) ;
/*Inserting elements in the linked list*/
tmp = malloc(sizeof(struct node) );
tmp->info = item;
tmp->link = NULL;
		if(start==NULL)	/*Inserting first element*/
			start = tmp;
		else
		{
			q = start;
			while(q->link!=NULL)
				q = q->link;
			q->link = tmp;
		}
	}/*End of for*/
	radix_sort();
	printf("Sorted list is :\n");
	q = start;
while (q!=NULL)
{
printf("%d ",q->info) ;
q = q->link;
}
	printf("\n");
}/*End of main() */
void radix_sort()
{
	int i,k,dig,least_sig,most_sig;
	struct node *p,*rear[10],*front[10];
least_sig = 1;
most_sig = large_dig(start);
for (k=least_sig; k<=most_sig; k++)
{
/*Make all the queues empty at the beginning of each pass*/
for(i=0; i<=9; i++)
{
			rear[i] = NULL;
			front[i] = NULL;
}
for(p=start; p!=NULL; p=p->link)
		{
dig = digit (p->info,k); /*Find kth digit in the number*/
/*Add the number to. queue of dig*/
			if(front[dig] == NULL)
				front[dig] = p;
			else
				rear[dig]->link = p;
			rear[dig] = p;
}
		/*Join all the queues to form the new linked list*/
		i = 0;
		while(front[i] == NULL)	/*Find the first non empty queue*/
			i++;
		start = front[i];
		ndig++;
		large = large/10;
	}
	return(ndig);
}/*End of large_dig()*/
/*This function returns the kth digit of a number*/
int digit(int number,int k)
{
	int i,dig;
	for(i=1; i<=k; i++)
	{
		dig = number%10;
		number = number/10;
	}
	return(dig);
}/*End of digit()*/
Let us take a non decreasing hash function f(x) which gives the first digit of the number x.
Now all the elements will be placed in different sets according to the corresponding values of f(x). The value
of function f(x) can be 0, 1, 2....., 9 so there will be 10 sets into which these elements are inserted.
0	098
1	165, 194
2	232, 254, 276, 289
3	345
4	415, 432, 476
5	532, 566
6	654
9	965
We can see that the value of f(x) for the elements 289, 232, 276, 254 is the same, so they are inserted in the same
set but in sorted order i.e. 232, 254, 276, 289. Similarly other elements are also inserted in their particular sets in
sorted order. All the elements in a set are less than the elements of the next set and the elements in a set are in sorted
order. So if we concatenate all the sets then we will get the sorted list.
98, 165, 194, 232, 254, 276, 289, 345, 415, 432, 476, 532, 566, 654, 965
Let us sort the same list by taking another non decreasing hash function-
f(x) = (x/k) * 5	(where k is the largest of all the numbers)
[Figure: the 6 sets obtained with this hash function]
The sorted list that is obtained by concatenating these 6 sets is-
98, 165, 194, 232, 254, 276, 289, 345, 415, 432, 476, 532, 566, 654, 965
For implementing this sorting technique, we can represent each set by a linked list. We know that in each set
the elements have to be inserted in sorted order so we will take sorted linked lists. The starting address of each
linked list can be maintained in an array of pointers.
In the case of function f(x) which gives us the first digit of an element, there will be 10 linked lists, each
corresponding to one set. We can take an array of pointers head[10], and each of its elements will be a pointer
pointing to the first element of these lists. For example head[0] will be a pointer that will point to the first
element of that linked list in which all elements starting with digit 0 are inserted. Similarly head[1], head[2],
....., head[9] will be pointers pointing to the first elements of the other lists.
head[0] -> 098
head[1] -> 165 -> 194
head[2] -> 232 -> 254 -> 276 -> 289
head[3] -> 345
head[4] -> 415 -> 432 -> 476
head[5] -> 532 -> 566
head[6] -> 654
head[7] is NULL
head[8] is NULL
head[9] -> 965
These linked lists can be easily concatenated to get the final sorted list. In the case of function f(x) discussed
before, there will be only 6 linked lists.
head[0] -> 98 -> 165
head[1] -> 194 -> 232 -> 254 -> 276 -> 289 -> 345
head[2] -> 415 -> 432 -> 476 -> 532 -> 566
head[3] -> 654
head[4] is NULL
head[5] -> 965
p = head[k];
while(p!=NULL)
{
arrf{it++] = p->info;
p = p->link;
}
}
}/*End of addr_sort()*/
		tmp->link = head[addr];
		head[addr] = tmp;
		return;
	}
	else
	{
		q = head[addr];
		while(q->link != NULL && q->link->info < num)
			q = q->link;
		tmp->link = q->link;
		q->link = tmp;
	}
}/*End of insert()*/
int hash_fn(int number)
{
int addr;
float tmp;
tmp = (float)number/large;
addr = tmp *5;
return (addr) ;
}/*End of hash_fn()*/
char name[20];
int age;
int salary;
main ()
		if(xchanges == 0)
			break;
	}
printf("List of Records Sorted on age \n");
for(i=0; i<n; i++) S /
{
printf ("%s\t\t",arr[i] .name) ;
		printf("%d\t\t",arr[i].age);
printf ("%d\n",arr[i].salary) ;
}
	printf("\n");
}/*End of main()*/
_/*P8.15 Program to sort the records on different keys using bubble sort*/
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define MAX 100
struct date
{
int day;
int month;
int year;
};
struct employee
{
char name[20];
struct date dob;
struct date doj;
int salary;
temp = emp[j];
emp({j] = emp[j+1];
emp[j+1] = temp;
				xchanges++;
}
}
if(xchanges ==.0)
break; .
}
}/*End of sort_name()*/
void sort_dob(struct employee emp[],int n)
{
struct employee temp;
	int i,j,xchanges;
for(i=0; i<n-1; i++)
{
xchanges = 0;
for(j=0; j<n-1-i; j++)
{
if (datecmp(emp[j].dob,emp[j+1].dob) > 0)
{
				temp = emp[j];
				emp[j] = emp[j+1];
				emp[j+1] = temp;
				xchanges++;
			}
		}
. if (xchanges == 0)
break;
}
}/*End of sort_dob()*/
void sort_doj(struct employee emp[],int n)
{
struct employee temp;
int i,j,xchanges;
for(i=0; i<n-1; i++)
{
xchanges = 0;
for(j=0; j<n-1-i; j++)
{
if (datecmp(emp[j].doj,emp[j+1].doj ) > 0)
{
				temp = emp[j];
				emp[j] = emp[j+1];
emp[j+1] = temp;
xchanges++;
}
}
if(xchanges == 0)
break;
}/*End of sort_doj()*/
void sort_salary(struct employee emp[],int n)
{
	struct employee temp;
int i,j,xchanges;
for(i=0; i<n-1; i++)
	{
		xchanges = 0;
for(j=0; j<n-1-i; j++)
{
if(emp[({j].salary > emp[j+1].salary)
			{
				temp = emp[j];
				emp[j] = emp[j+1];
emp[j+1] = temp;
xchanges++;
}
}
if(xchanges == 0)
break;
	}
xchanges = 0;
		for(j=0; j<n-1-i; j++)
		{
			if(ptr[j]->age > ptr[j+1]->age)
			{
				temp = ptr[j];
				ptr[j] = ptr[j+1];
				ptr[j+1] = temp;
				xchanges++;
			}
		}
if(xchanges == 0)
break;
}
printf("List of Records Sorted on age \n");
	for(i=0; i<n; i++)
	{
		printf("%s\t\t",ptr[i]->name);
		printf("%d\t\t",ptr[i]->age);
		printf("%d\n",ptr[i]->salary);
	}
	printf("\n");
}/*End of main() */
Exercise
(ii) 3 10 12 89 54 15 43 78 28 45
Ci) S245 89 8541543678" 28-10
Identify the sorting technique used in each case.
3. The insertion sort algorithm in the chapter uses linear search to find the position for insertion. Modify
this algorithm and use binary search instead of linear search.
4. Suppose we are sorting an array of size 10 using quick sort. The elements of the array after finishing the
first partitioning are-
Deomeeomer
12 16-11 9 20
Which element (or elements) could be the pivot?
5. Show the contents of the following array after placing the pivot 47 at its proper place.
Bet Beat 25 SOM P2S8 79192 35a T 136 92 212 i)
6. Show how this input is sorted using heap sort.
12 VAS ~2he 76 2838 97% 82 654
7. Write a program to sort a list of strings using bubble sort.
8. Show the elements of the following array after the first pass of shell sort with increment 5.
S40 DINsG5t0 S9ombls 545 BBy QIeqFG> Ohad TreoS]is F291 yi24 oil
Searching and Hashing
if (i<n)
return i;
else
return -1;
}
[Figure: searching 56 in an array of 20 elements with the sentinel 56 placed just after the last element]
The function for sequential search using sentinel is-
int LinearSearch(int arr[],int n,int item)
{
	int i=0;
	arr[n] = item;	/*Place sentinel at the end*/
	while(item!=arr[i])
		i++;
	if(i<n)
		return i;
	else
		return -1;
}
The number of comparisons in an unsuccessful search can be reduced if the array is sorted. In this case we
need not search for the item till the end of the list; we can terminate our search as soon as we find an element
that is greater than or equal to the search item(in an ascending array). If the element is equal to the search item the
search is successful otherwise it is unsuccessful. The function for sequential search in an ascending order array
is-
int LinearSearch(int arr[],int n,int item)
{
	int i=0;
	while(i<n && arr[i]<item)
		i++;
	if(i<n && arr[i]==item)
		return i;
	else
		return -1;
}
middle element. If the item to be searched is less than the middle element, it is searched in left half otherwise it
is searched in the right half.
Now the search proceeds in the smaller portion of the array(subarray) and the item is compared with its middle
element. If the item is same as the middle element, the search finishes otherwise again the subarray is divided into
two halves and the search is performed in one of these halves. This process of comparing the item with the
middle element and dividing the array continues till we find the required item or get a portion which does not
have any element.
To implement this procedure we will take 3 variables viz. low, up and mid that will keep track of the status
of lower limit, upper limit and middle value of that portion of the array, in which we will search the element. If
the array contains an even number of elements, there will be two middle elements; we'll take the first one as the
middle element. The value of mid can be calculated as-
mid = (low+up)/2
The middle element of the array would be arr [mid], the left half of the array would be arr [low]......
arr [mid-1] and the right half of the array would be arr [mid+1].... arr[up].
The item is compared with the mid element, if it is not found then the value of low or up is updated for
selecting the left or right half. When low becomes greater than up, the search is unsuccessful as there is no
portion left in which to search.
If item > arr[mid]
	Search will resume in the right half which is arr[mid+1] ....... arr[up]
	So low = mid+1, up remains same
If item < arr[mid]
	Search will resume in the left half which is arr[low] ......... arr[mid-1]
	So up = mid-1, low remains same
If item == arr[mid], search is successful
	Item found at mid position
If low > up, search is unsuccessful
	Item not present in array
‘Let us take a sorted array of 20 elements and search for elements 41, 62, and 63 in this array one by one. The
portion of array in which the element is searched is shown with a bold boundary in the figure.
[Figure: step-by-step traces of binary search for 41 (successful), 62 (successful) and 63 (unsuccessful) in the sorted array of 20 elements]
Now low=12 and up=11; the value of low has exceeded the value of up, so the search is unsuccessful. The
program for binary search is-
/*P9.4 Binary search in an array*/
#include <stdio.h>
#define MAX 50
main()
	while(low<=up)
	{
		mid = (low+up)/2;
		if(item > arr[mid])
			low = mid+1;	/*Search in right half*/
		else if(item < arr[mid])
			up = mid-1;	/*Search in left half*/
		else
			return mid;
	}
	return -1;
}
The function BinarySearch() returns the location of the item if found, or -1 otherwise.
The best case of binary search is when the item to be searched is present in the middle of the array, and in
this case the loop is executed only once. The worst case is when the item is not present in the array. In each
iteration, the array is divided into half, so if the size of array is n, there will be maximum log n such divisions.
Thus there will be log n comparisons in the worst case. The run time complexity of binary search is O(log n)
and so it is more efficient than the linear search. For an array of about 1000000 elements, the maximum number
of comparisons required to find any element would be only 20.
Binary search is preferred only where the data is static i.e. very few insertions and deletions are done. This is
because whenever an insertion or deletion is to be done, many elements have to be moved to keep the data in
sorted order. Binary search is not suitable for linked list because it requires direct access to the middle element.
The recursive function for binary search is-
int RBinarySearch(int arr[],int low,int up,int item)
{
	int mid;
	if(low>up)
		return -1;
	mid = (low+up)/2;
	if(item > arr[mid])	/*Search in right half*/
		return RBinarySearch(arr,mid+1,up,item);
	else if(item < arr[mid])	/*Search in left half*/
		return RBinarySearch(arr,low,mid-1,item);
	else
		return mid;
}
9.3 Hashing
We have seen different searching techniques where search time is dependent on the number of elements.
Sequential search, binary search and all the search trees are totally dependent on number of elements and many
key comparisons are involved. Now we’ll see another approach where less key comparisons are involved and
searching can be performed in constant time i.e. search time is independent of the number of elements.
Suppose we have keys which are in the range 0 to n-1, and all of them are unique. We can take an array of
size n, and store the records in that array based on the condition that key and array index are same. For example,
Suppose we have to store the records of 15 students and each record consists of roll number and name of the
student. The roll numbers are in the range 0 to 14 and they are taken as keys. We can take an array of size 15
and store these records in it as-
[Figure 9.2: records of 15 students stored in an array, the record with roll number k stored at index k (0 to 14)]
The record with key(roll number) 0 is stored at array index 0, the record with key 1 is stored at array index 1 and
so on. Now whenever we have to search any record with key k, we can directly go to index k of the array,
because random access is possible in an array. Thus we can access any record in constant time and no key
comparisons are involved. This method is known as direct addressing or key-indexed search but it is useful only
when the set of possible key values is small.
Consider the case when we have to store the records of 500 employees of a company, and their 6 digit employee
id is taken as the key. The employee id can be anything from 000000 to 999999, so here the set of possible key
values is much more than the number of keys. If we adopt the direct addressing method, then we will need an
array of size 10^6 and only 500 locations of this array would be used. In practice, the number of possible key
values will be more than the number of keys actually stored, so direct addressing is rarely used.
Now let us see how we can modify the direct addressing approach so that there is no wastage of space and
still we can use the value of the key to find out its address. For this we will need some procedure through which we
can convert the key into an integer within a range, and this converted value can be used as the index of the array.
Instead of taking the key equal to the array index, we can compute the array index from the key. This process of
converting a key to an address(index of array) is called hashing or key to address transformation and it is done
through a hash function. A hash function is used to generate an address from a key, or we can say that each key is
mapped on a particular array index through the hash function. The hash function takes a key as input and returns the
hash value of that key, which is used as the address for storing the key in the array. Keys may be of any type like
integers, strings etc. but the hash value will always be an integer.
[45]	8183330487	"C in Depth"
[46]
[47]
[48]	8175586440	"C++ FAQs"
[49]	8178082195	"Effective C++"
[50]
[51]
[52]
[53]	8176567418	"DS thru C in depth"
[54]
[55]
Figure 9.3
Now suppose we have to store a book with ISBN 8173380843. The sum of the digits is 45, i.e. the address
given by our hash function is 45, but this address is already occupied. This situation is called collision.
Collision occurs when the hash function generates the same address for different keys. The keys which are
mapped to the same address are called synonyms. For resolving collision we have different collision resolution
techniques, which we will study in detail later in this chapter. Ideally a hash function should give unique
addresses for all keys but this is practically not possible. So we should try to select a hash function that
minimizes collision. While making algorithms that use hashing we have to mainly focus on these two things-
1. Selecting a hash function that converts keys to addresses.
2. Resolving the collision.
Suppose the table size is 100 and we decide to take the 2 rightmost digits for getting the hash table address, so the addresses of the above keys will be 61, 65, 71 and 28 respectively.
This method is easy to compute but the chances of collision are high, because the last two digits can be the same in many keys.
Collisions can be minimized if the table size is taken to be a prime number. Let us take a table of size 97 and see how the following keys will be inserted in it.
82394561, 87139465, 83567271
82394561 % 97 = 45
87139465 % 97 = 0
83567271 % 97 = 25
So here the hash addresses of the above keys will be 45, 0 and 25.
We can combine other hashing methods with the division method, and this will ensure that the addresses are in the range of the hash table.
If a function h returns address 5 for a key and the table size is 11, then we will search for empty locations in this sequence - 5, 6, 7, 8, 9, 10, 0, 1, 2, 3, 4. If location 5 is empty, there will be no collision and we can insert the key there, otherwise we will linearly search these locations and insert the key when we find an empty one. Note that we have taken our array to be circular, so after the last location (10th) we probe the first location of the array (0th).
Let us take a table of size 11, and insert some keys in it taking a hash function-
h(key) = key % 11
h(29) = 29 % 11 = 7
h(46) = 46 % 11 = 2
h(18) = 18 % 11 = 7
h(36) = 36 % 11 = 3
h(43) = 43 % 11 = 10
h(21) = 21 % 11 = 10
h(24) = 24 % 11 = 2
h(54) = 54 % 11 = 10
H(54, 0) = (10 + 0) % 11 = 10 (not empty)
H(54, 1) = (10 + 1) % 11 = 0 (not empty)
H(54, 2) = (10 + 2) % 11 = 1 (empty, insert the key)
This is the way insertion is done in case of linear probing. For searching a key, first we check the hash address position in the table; if the key is not available at that position, then we sequentially search the locations after that hash address position. The search terminates if we get the key, or reach an empty location, or reach the position where we had started. In the last two cases the search is unsuccessful.
The main disadvantage of the linear probing technique is primary clustering. When about half of the table is full, there is a tendency of cluster formation, i.e. groups of records stored next to each other are created. In the above example a cluster of indices 10, 0, 1, 2, 3, 4 is formed. If a key is mapped to any of these indices then it will be stored at index 5 and the cluster will become bigger. Suppose a key is mapped to index 10; then it will be stored at 5, far away from its home address, and the number of probes for inserting or searching this key will be 7. Clustering increases the number of probes to search or insert a key, and hence the search and insertion times of the records also increase.
Let us take a table of size 11 and hash function h(key) = key % 11, and apply this technique for inserting the following keys.
h(46) = 46 % 11 = 2
h(28) = 28 % 11 = 6
h(21) = 21 % 11 = 10
h(35) = 35 % 11 = 2
h(57) = 57 % 11 = 2
h(39) = 39 % 11 = 6
h(19) = 19 % 11 = 8
h(50) = 50 % 11 = 6
Let us see how some keys can be inserted in the table taking these two functions-
h(key) = key % 11
h'(key) = 7 - (key % 7)
(i) Insertion of 46, 28, 21 - No collision, all are inserted at their home addresses.
(ii) Insertion of 35 - Collision at 2.
Next probe is done at - (2 + 1(7 - 35%7)) % 11 = 9, location 9 is empty, so 35 is inserted there
(iii) Insertion of 57 - Collision at 2
Next probe is done at - (2 + 1(7 - 57%7)) % 11 = 8, location 8 is empty, so 57 is inserted there
(iv) Insertion of 39 - Collision at 6
Next probe is done at - (6 + 1(7 - 39%7)) % 11 = 9, location 9 is not empty
Next probe is done at - (6 + 2(7 - 39%7)) % 11 = 1, location 1 is empty, so 39 is inserted there
(v) Insertion of 19 - Collision at 8
Next probe is done at - (8 + 1(7 - 19%7)) % 11 = 10, location 10 is not empty
Next probe is done at - (8 + 2(7 - 19%7)) % 11 = 1, location 1 is not empty
Next probe is done at - (8 + 3(7 - 19%7)) % 11 = 3, location 3 is empty, so 19 is inserted there
(vi) Insertion of 50 - Collision at 6
Next probe is done at - (6 + 1(7 - 50%7)) % 11 = 1, location 1 is not empty
Next probe is done at - (6 + 2(7 - 50%7)) % 11 = 7, location 7 is empty, so 50 is inserted there
The problem of secondary clustering is removed in double hashing because keys that have the same hash address probe different sequences of locations. For example 46, 35, 57 all hash to the same address 2, but their probe sequences are different.
46 - 2, 5, 8, 0, 3, 6, 9, 1, 4, 7, 10
35 - 2, 9, 5, 1, 8, 4, 0, 7, 3, 10, 6
57 - 2, 8, 3, 9, 4, 10, 5, 0, 6, 1, 7
While selecting the secondary hash function we should consider these two points - first, it should never give a value of 0, and second, the value given by the secondary hash function should be relatively prime to the table size.
Double hashing is more complex and slower than linear and quadratic probing because it requires the computation of two hash functions.
The load factor of a hash table is generally denoted by λ and is calculated as-
λ = n/m
Here n is the number of records, and m is the number of positions in the hash table.
In open addressing the load factor is always less than 1, as the number of records cannot exceed the size of the table. To improve efficiency, it is desirable that there should always be some empty locations inside the table, i.e. the size of the table should be more than the number of actual records to be stored. If a table becomes dense, i.e. the load factor is close to 1, then there will be more chances of collision, hence deteriorating the search efficiency. Leaving some locations empty is wastage of space but improves search time, so it is a space vs. time tradeoff.
A disadvantage of open addressing is that key values are far from their home addresses, which increases the number of probes. Another problem is that hash table overflow may occur.
Here the variable status is of type enum type_of_record and it can have one of these values - EMPTY, DELETED, OCCUPIED.
After this we will declare an array of type struct Record, and this is our hash table.
struct Record table[MAX];
First of all we need to initialize the locations of the array to indicate that they are empty. This is done by setting the status field of all the array records to EMPTY.
If table[i] contains a record then the value of table[i].status will be OCCUPIED; if table[i] is empty because of deletion of a record, i.e. it is empty now but was previously occupied, then the value of table[i].status will be DELETED; otherwise, if table[i] is empty (was never occupied) then the value of table[i].status will be EMPTY.
The function search() returns -1 if the key is not found, otherwise it returns the index of the array where the key was found. Whenever we search for a key, our search will terminate only when we reach a location table[i] whose status is EMPTY. While inserting, we can stop and insert a new record whenever we reach a location table[i] whose status is EMPTY or DELETED. Deletion of a record stored at location table[i] is done by setting its status to DELETED.
The search will terminate when an EMPTY location is encountered or the key is found. Thus it is important that at least one location in the table is left EMPTY so that the search can terminate if the key is not present.
struct employee
{
    int empid;
    char name[20];
    int age;
};
struct Record
{
    struct employee info;
    enum type_of_record status;
};
main()
{
    int i,key,choice;
    struct Record table[MAX];
    struct employee emprec;
    for(i=0; i<=MAX-1; i++)
        table[i].status = EMPTY;
while(1)
{
        printf("1.Insert a record\n");
        printf("2.Search a record\n");
        printf("3.Delete a record\n");
        printf("4.Display table\n");
        printf("5.Exit\n");
        printf("Enter your choice\n");
        scanf("%d",&choice);
switch(choice)
{ }
        case 1:
printf("Enter empid,name,age : ");
scanf ("%d%s%d", &emprec.empid, emprec.name, &emprec.age) ;
insert (emprec, table) ;
break;
        case 2:
printf("Enter a key to be searched : ");
scanf("%d", &key) ;
i = search(key, table) ;
            if(i==-1)
printf("Key not found\n");
else \
printf("Key found at index %d\n",i);
break;
case 3:
            printf("Enter a key to be deleted\n");
scanf ("%d", &key) ;
del (key, table) ;
            break;
case 4:
display (table) ;
break;
case 5:
exit (1);
        }/*End of switch*/
}/*End of while*/
}/*End of main()*/
int search(int key,struct Record table[])
{
    int i,h,location;
    h = hash(key);
    location = h;
    for(i=0; i<MAX; i++)
    {
        if(table[location].status==EMPTY)
            return -1;    /*reached a location that was never occupied*/
        if(table[location].status==OCCUPIED && table[location].info.empid==key)
            return location;    /*key found*/
        location = (h+i+1)%MAX;    /*linear probing*/
    }
    return -1;    /*searched the whole table*/
}/*End of search()*/
Suppose the table size is 7 and the hash function is H(key) = key % 7. The hash addresses of the keys are-
H(4895) = 4895 % 7 = 2
H(6559) = 6559 % 7 = 0
H(5912) = 5912 % 7 = 4
H(4047) = 4047 % 7 = 1
H(6766) = 6766 % 7 = 4
H(4390) = 4390 % 7 = 1
H(4640) = 4640 % 7 = 6
H(4900) = 4900 % 7 = 0
H(4411) = 4411 % 7 = 1
Here the keys 4411, 4390, 4047 all hash to the same address, i.e. array index 1, so they are all stored in a separate linked list whose starting address is stored in location 1 of the array. Similarly other keys are also stored in their respective lists depending on their hash addresses.
If the linked lists are short, performance is good, but if the lists become long then it takes time to search a given key in any list. To improve the retrieval performance we can maintain the lists in sorted order.
For inserting a key, first we will get the hash value through the hash function, then that key will be inserted at the beginning of the corresponding linked list. Searching a key is similar: first we get the hash value through the hash function, and then we search the key in the corresponding linked list. For deleting a key, first that key will be searched and then the node holding that key will be deleted from its linked list.
In open addressing, accessing any record involves comparisons with keys that may have different hash values, which increases the number of probes. In chaining, comparisons are done only with keys that have the same hash value.
In open addressing, all records are stored inside the hash table itself so there can be problem of hash table
overflow and to avoid this, enough space has to be allocated at the compilation time. In separate chaining, there
will be no problem of hash table overflow because linked lists are dynamically allocated so there is no
limitation on the number of records that can be inserted. It is not necessary that the size of table be more than
the number of records. Separate chaining is best suited for applications where the number of records is not
known in advance.
In open addressing, it is best if some locations are always empty. If records are large then this results in
wastage of space. In chaining there is no wastage of space because the space for records is allocated when they
arrive.
Implementation of insertion and deletion is simple in separate chaining. The main disadvantage of separate
chaining is that it needs extra space for pointers. If there are n records and the table size is m, then we need extra
space for n+m pointers. If the records are very small then this extra space can prove to be expensive.
In separate chaining the load factor denotes the average number of elements in each list, and it can be greater than 1.
/*P9.9 Separate chaining*/
#include <stdio.h>
#include<stdlib.h>
#define MAX 11
struct employee
{
    int empid;
    char name[20];
    int age;
};
struct Record
{
    struct employee info;
    struct Record *link;
};
void insert (struct employee emprec, struct Record *table[]);
int search(int key, struct Record *table[]);
void del(int key, struct Record *table[]);
void display(struct Record *table[]);
int hash(int key);
main()
{
    struct Record *table[MAX];
    struct employee emprec;
    int i,key,choice;
    for(i=0; i<=MAX-1; i++)
        table[i] = NULL;
while(1)
{
        printf("1.Insert a record\n");
        printf("2.Search a record\n");
        printf("3.Delete a record\n");
        printf("4.Display table\n");
        printf("5.Exit\n");
        printf("Enter your choice\n");
scanf("%d", &choice) ;
switch (choice) y
{
        case 1:
printf ("Enter the record\n");
printf("Enter empid, name, age : ");
            scanf("%d%s%d",&emprec.empid,emprec.name,&emprec.age);
insert (emprec, table) ;
break;
        case 2:
printf("Enter a key to be searched : ");
scanf ("%d", &key) ;
i = search(key, table) ;
            if(i==-1)
printf("Key not found\n") ;
else
printf ("Key found in chain %d\n",i);
break;
case 3:
printf("Enter a key to be deleted\n");
scanf ("%d", &key) ;
del (key, table) ;
break;
case 4:
display(table) ;
break;
        case 5:
exit(1);
}
    }/*End of while*/
}/*End of main()*/
void insert(struct employee emprec,struct Record *table[])
{
    int h,key;
    struct Record *tmp;
    key = emprec.empid;
    h = hash(key);
    tmp = (struct Record *)malloc(sizeof(struct Record));
    tmp->info = emprec;
    tmp->link = table[h];    /*insert at the beginning of the list*/
    table[h] = tmp;
}/*End of insert()*/
int search(int key,struct Record *table[])
{
    int h;
    struct Record *ptr;
    h = hash(key);
    ptr = table[h];
    while(ptr!=NULL)
    {
        if(ptr->info.empid == key)
            return h;
        ptr = ptr->link;
    }
    return -1;
}/*End of search()*/
void del(int key,struct Record *table[])
{
    int h;
    struct Record *tmp,*ptr;
h = hash(key) ;
    if(table[h]==NULL)
    {
        printf("Key %d not found\n",key);
        return;
    }
if (table[h]->info.empid == key)
{
tmp=table[h];
table[h]=table[h] ->link;
free (tmp) ;
return;
}
ptr = table[h];
while (ptr->link!=NULL)
{
if (ptr->link->info.empid == key)
{
tmp=ptr->link;
ptr->link=tmp->link;
free (tmp) ;
return;
}
ptr=ptr->link;
}
    printf("Key %d not found\n",key);
}/*End of del()*/
int hash(int key)
{
return (key%MAX) ;
}/*End of hash()*/
Consider a hash table with 7 buckets, where each bucket can hold more than one key, and the hash function H(key) = key % 7.
H(4895) = 4895 % 7 = 2
H(6559) = 6559 % 7 = 0
H(5912) = 5912 % 7 = 4
H(4047) = 4047 % 7 = 1
H(6766) = 6766 % 7 = 4
H(4390) = 4390 % 7 = 1
H(4640) = 4640 % 7 = 6
H(4900) = 4900 % 7 = 0
H(4411) = 4411 % 7 = 1
A disadvantage of this technique is wastage of space, since many buckets will not be occupied or will be only partially occupied. Another problem is that this method does not prevent collisions but only defers them, and when collisions occur they have to be resolved using a collision resolution technique like open addressing.
Exercise
We have seen data structures which are used to implement different concepts in a programming language. In each one of them we specified the need of storage space for implementing these data structures. For example in the case of a linked list, we allocate the space for a node at the time of insertion and free the storage space at the time of deletion. This allocation and freeing of space can be done at run time because the C language supports dynamic memory allocation. Generally it is the work of the operating system to provide the specified memory to the user and manage the allocation and release process. Now we will see different techniques and data structures for storage management.
Dynamic memory allocation requires allocation and release of different sizes of memory at different times. Many applications might be running on a system and they can request different sizes of memory. The memory management system of the operating system is responsible for managing the memory and fulfilling the memory requirements of the users.
Suppose we have 512K of available memory and the programs P1, P2, P3, P4 request for memory blocks of
sizes 110K, 90K, 120K, 110K respectively. The memory is allocated sequentially to all these programs.
Figure 10.1
After some time programs P1 and P3 release their memory.
Figure 10.2
Now there are two blocks of memory that are being used, and three blocks that are free; the free blocks can
be used to satisfy memory requests. The operating system needs to keep record of all the free memory blocks so
that they can be allocated whenever required. This is done by maintaining a linked list of all the free blocks,
known as free list. Each free block contains a link field that contains the address of next free block. The blocks
can be of varying sizes, so a size field is also present in each free block. These fields can be present in some
fixed location of the block, so that they can be accessed if the starting address of the block is known. Generally
these fields are stored at the start of the block. A pointer named freeblock or avail is used to point to the
first free block of this linked list. The figure 10.3 shows how the free blocks are linked together to form a free
list.
Figure 10.3
Whenever a request for a block of memory comes, memory is allocated from one of the blocks in the free list, depending upon the size of the memory requested and the sequential fit method used. If the size of a block is equal to the memory requested then the whole block is allocated, otherwise a portion of the block is allocated and the remaining portion becomes a free block and remains in the free list. When a block of memory becomes free, it is joined to this free list.
Figure 10.4
Figure 10.5
Since the link part in each block is stored at the start of the block, it is better to allocate the second portion of the block for the memory requested. By doing this, there is no need to change the link part of the previous free block. The link part of the split free block becomes the link part of the new free block, and so there is no need to change it.
Figure 10.6
The advantage of the best fit method is that the bigger blocks are not broken. A disadvantage is that many small sized blocks are left (here 15K) which are practically unusable.
Figure 10.7
The advantage of this method is that after allocation of memory, the blocks that are left are of reasonable size and can be used later. For example here the block left after allocation is of size 85.
The order in which blocks are stored on the free list can improve the searching time. For example the search time in best fit can be reduced if the blocks are arranged in order of increasing size. If the blocks are arranged in order of decreasing size then the largest block will always be the first one, and so there will be no need of searching in the case of worst fit. For first fit, the blocks can be arranged in order of increasing memory address.
Generally the first fit method proves to be the most efficient one and is preferred over the others.
10.2 Fragmentation
After several allocations and deallocations, we can end up with memory broken into many small parts which are
practically unusable, thus wasting memory. This wastage of memory is known as fragmentation. There are two
types of fragmentation problems -
(i) External Fragmentation - This occurs when there are many non contiguous free memory blocks. This results
in wastage of memory outside allocated blocks.
Figure 10.8
In figure 10.8, we have four free memory blocks and the total free memory is 262. Suppose a program requests memory of size 150. Although the free memory is much more than the memory requested, this requirement cannot be fulfilled because the free memory is divided into non contiguous blocks and we don't have 150 contiguous locations of free memory. External fragmentation can be removed by compaction.
(ii) Internal Fragmentation - This occurs when the memory allocated is more than the memory requested, thus wasting some memory inside the allocated blocks. For example suppose the memory is to be allocated only in sizes of powers of two, i.e. the blocks that can be allocated can be of sizes 1, 2, 4, 8, 16, 32, 64, 128 and so on. Suppose a request for memory of size 100 comes; then a memory block of size 128 would be allocated for it, thereby wasting 28 memory locations inside the allocated block.
Figure 10.9
Suppose now P5 is freed; we can remove the block at address 180 from the free list and combine these two blocks to form one block, and put this larger block of size 190 on the free list.
Figure 10.10
If the free list is arranged in order of increasing memory address, then we need not search the whole list. For
example suppose we free a block B. We search the list for a block B1 where the address of B1 is greater than
address of B.
If the free block B1 is contiguous to B, then B1 is removed from the list and is combined with B. The new
combined block is placed in the list at the place of B1. If the block B1 is not contiguous to block B then block B
is inserted in the list just before B1.
Suppose block B2 is the free block immediately preceding B1. If B2 is contiguous to block B, then B2 is
removed from the list and is combined with B and the new combined block is placed in the list at the place of
B2.
Figure 10.11
Let us assume that a size field is stored at a positive offset from the start address, so that we can find the size by the expression size(p), where p is the start address. Now suppose a block B at address k is freed; its size would be size(k) and the starting address of its right neighbor would be k+size(k). To find the start address of the left neighbor we need the size of the left neighbor, but to access its size we need to know its start address. We only know the end address of the left neighbor, which is (k-1), so there is no way in which we can find the size of the left neighbor. The solution to this problem is to store a duplicate size field in each block at a negative offset from the end of the block; let us call it bsize. If the end address is q, then the size of the block can be accessed by the expression bsize(q). So now the start address of the left neighbor of B would be k-bsize(k-1).
When a block is freed we need to know whether its left and right neighbors are free or not. A status field is
stored in each block, if the block is free then this field is 0 otherwise it is 1. This status field is stored both at the
start and end of the block. If start address of a block is p, then we can access its free status field by
status (p), and if q is the end address then we can get its free status field by bstatus (q). So if a block B at
start address k is being freed, we can access status field of right neighbor by status (k+size(k) ) and that of
left neighbor by bstatus (k-1).
Each block needs the size and status fields at both the ends i.e. at both the boundaries and hence the name of
this method is boundary tag method.
In the boundary tag method, the free list is stored as a doubly linked list. This is because we don't reach a block by traversal and hence we don't know its predecessor, which is required for the removal of a block from the list. So each free block will have next and prev pointers pointing to the next and previous free blocks, and these will be stored at a positive offset from the start of the block.
The structure of a block in boundary tag method is-
Figure 10.12
Now we’ll free memory blocks one by one and see how the free blocks are combined in the free list.
(1) Free P3 in figure 10.12.
P3 has no free neighbor, so it is just added to the beginning of the free list.
Figure 10.13
Figure 10.14
Figure 10.15
Figure 10.16
remove a block from it and break it into two buddy blocks, where each buddy is a p-block, i.e. each buddy is of size 2^p. Now one of these buddies will be allocated and the other will be put on the p-list. If both the p-list and the (p+1)-list are empty then we will look at the (p+2)-list, take a block from it and split it into two (p+1)-blocks. One of these blocks is put into the (p+1)-list and the other is again split into two p-blocks. One of these p-blocks is put into the p-list and the other is allocated. This procedure of splitting will continue till we get a block of size 2^p.
When a block of size 2^i is deallocated, we compute the address of its buddy block. If the buddy block is not free then the deallocated block is added to the i-list. If the buddy block is free then we combine the two buddies to form a block of size 2^(i+1). After combining we find the address of the buddy of the combined block of size 2^(i+1). If the buddy of this block is not free then this block is added to the (i+1)-list, otherwise it is combined with its buddy to form a block of size 2^(i+2). This process continues till we get a block whose buddy is not free, or till we get a block of size 2^m, the size of the whole memory.
Suppose we have 2^9 = 512 K of memory, and the smallest size of block that can be allocated is taken as 2^3 = 8 K. So we will have to maintain 7 free lists, viz. the 3-list, 4-list, 5-list, 6-list, 7-list, 8-list and 9-list.
(i) Initially all the free lists are empty except the 9-list, which contains a block that starts at address 0.
Free lists : 3→N, 4→N, 5→N, 6→N, 7→N, 8→N, 9→0
Free lists : 3→N, 4→N, 5→N, 6→N, 7→N, 8→N, 9→N
When a free block is returned, we need to compute the address of its buddy. A block can have either a left buddy or a right buddy. If the size of the block is 2^i and it is situated at address k * 2^i, then the block has a right buddy if k is even, while it has a left buddy if k is odd. If the block has a left buddy then the address of the left buddy would be (k-1) * 2^i, and if the block has a right buddy then the address of the right buddy would be (k+1) * 2^i. For example a block of size 2^7 situated at address 256 (2 * 2^7) will have a right buddy situated at 384 (3 * 2^7). A block of size 2^5 situated at address 32 (1 * 2^5) will have a left buddy situated at address 0 (0 * 2^5).
The address of the left or right buddy of a block of size 2^i can also be determined by complementing the ith bit in the address of the block (if we count the least significant bit as the 0th bit). If the ith bit in the address of a block of size 2^i is 1, then the block has a left buddy, otherwise it has a right buddy.
For example suppose we have a block of size 2^5 at address 384,
110 000 000
The 5th bit is 0, so it will have a right buddy whose address can be obtained by complementing the 5th bit - 110 100 000 (416).
If we have a block of size 2^7 at address 384,
110 000 000
The 7th bit is 1, so it will have a left buddy whose address can be obtained by complementing the 7th bit - 100 000 000 (256).
Thus the address of the buddy of a block of size 2^i at address k can be obtained by complementing the ith bit of k, i.e. by the expression k XOR 2^i.
(x) Free P1
The 5-block at 0 is now free; its buddy at 32 is not free. So the 5-block at 0 is put into the 5-list.
(xi) Free P2
The 7-block at 128 is now free. It has an adjacent free 7-block at 256, but it can't be combined with it since the two are not buddies. The free 7-block at 128 was formed by splitting an 8-block at 0, while the free 7-block at 256 was formed by splitting an 8-block at address 256. So the free 7-block at 128 is put into the 7-list.
Here note that although we have adjacent free space of 256 K, the largest memory request that can be fulfilled is 128 K.
(xii) Free P6
The 5-block at 32 is now free; its buddy is at 0, which is also free, so we combine both these buddies to form a 6-block at 0. Now the buddy of this 6-block at 0 is the 6-block at 64, which is not free, so the free 6-block at 0 is put into the 6-list.
(xiii) Free P5
The 7-block at 384 is now free and its buddy at 256 is also free, so both of them are combined to form an 8-block at 256. The buddy of this 8-block at 256 is the 8-block at 0, but it is currently not available as it has been split into smaller blocks. So the free 8-block at 256 is put into the 8-list.
(xiv) Free P4
The 6-block at 64 is now free; its buddy is at 0 and it is also free, so both of them are combined to form a free 7-block at 0. The buddy of this 7-block at 0 is the 7-block at 128, which is also free, so both of these are combined to form a free 8-block at 0. Now the buddy of the 8-block at 0 is the 8-block at 256, which is free, so both are combined to form a 9-block at 0. This 9-block is put into the 9-list.
Now let us see what information needs to be stored in each block for the implementation of a buddy system. The four fields which are included in all blocks are-
(i) a status flag, which denotes whether the block is free or not
(ii) size, which denotes the size of the block. Instead of storing the exact size we can store the power of 2, e.g. for a block of size 128 we can store 7.
(iii) a left link, which points to the previous block on the free list.
(iv) a right link, which points to the next block on the free list.
The last two fields are used only when the block is free.
The last two fields are used only when the block is free.
To find whether the buddy of a block B is free or not, first we calculate the address of the buddy. After that we check the size of the block located at that address; if the size is not equal to the size of block B, it means that the buddy of B has been split into smaller blocks. If the size of the block at the calculated address is the same as that of B, then we can check the status flag of the buddy to find whether it is free or not.
For example in step (xi), a 7-block located at address 128 is freed. It can be calculated that this block has a left buddy situated at address 0. Now the size field of the block at address 0 is checked. It is a 5-block, so this means that the buddy of the block that we have just freed is not available and has been split into smaller blocks.
In step (ix) a 7-block at address 256 is freed. It can be calculated that this block has a right buddy situated at address 384. Now the size field of the block at address 384 is checked and it is found to be a 7-block; this means the buddy of the block is not split. Now the status flag of the buddy is checked, which shows that the buddy is not free and so the two buddies cannot be combined.
In step (xiii) a 7-block at 384 is freed. It can be calculated that this block has a left buddy situated at address 256. Now the size field of the block at address 256 is checked and it is found to be a 7-block; this means the buddy of the block is not split. Now the status flag of the buddy is checked, which shows that the buddy is free and so the two buddies are combined.
For deletion of the buddy from the free list we need the address of its predecessor. We do not reach the buddy by traversing the free list; we get its address by calculation, i.e. we directly reach a position in the list. So we need to maintain the free lists as doubly linked lists, so that we can get the address of the predecessor of a block while removing it from the list.
In the binary buddy system, a lot of space is wasted inside the allocated blocks. Only memory requests which are powers of 2 result in no waste, while any other size of memory requested will result in wastage of space. A lot of memory is wasted if the size of memory requested is just a bit larger than the smaller block and much less than the larger block. For example if we need memory of size 130, we will have to allocate a block of size 256, thus wasting 126 memory locations. Hence internal fragmentation is a main disadvantage of the binary buddy system. External fragmentation is also present in buddy systems, as there can be many adjacent free blocks which can't be combined because they are not buddies.
The numbers in the Fibonacci sequence are 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ... Each element is the sum of the previous two elements. The Fibonacci buddy system uses these Fibonacci numbers as the permitted block sizes. Here an i-block is a block whose size is equal to the ith Fibonacci number. For example a 7-block is a block of size 21. For the implementation of the Fibonacci buddy system, an array of Fibonacci numbers needs to be stored, so that the ith Fibonacci number can be found easily.
In the binary buddy system, when an i-block splits we get two (i-1)-blocks of the same size. In the Fibonacci system we will get buddies of different sizes. When an i-block splits, we get one (i-1)-block and one (i-2)-block. For example, if we split a 10-block (size 89), we get a 9-block (size 55) and an 8-block (size 34). In the binary buddy system we could easily compute the address of the buddy of a block B from the address and size of block B. Here this computation is not simple, as the buddies are of different sizes. We need to store some additional information in each block for obtaining the address of its buddy. In each block a left buddy count (LBC) field is introduced. Initially the left buddy count of the whole block is 0. This count is changed when a block is split or two buddies are combined.
Splitting:
LBC of left buddy = LBC of parent block + 1
LBC of right buddy = 0
Combining:
LBC of combined block = LBC of left buddy - 1
If the LBC of a block is 0, this means that it is a right buddy. For example, if a block of size 34 has LBC 0, this means that it is a right buddy formed by splitting a block of size 89, and the left buddy is of size 55. If the LBC is 1 or more, this means that it is a left buddy.
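The LBC rules above can be sketched as small helper functions (the names are illustrative, not from the book):

```c
/* LBC bookkeeping for the Fibonacci buddy system.
   When an i-block splits, the left buddy gets the parent's LBC plus 1
   and the right buddy gets 0; when two buddies combine, the combined
   block gets the left buddy's LBC minus 1. */

int lbc_left_after_split(int parent_lbc)
{
    return parent_lbc + 1;
}

int lbc_right_after_split(void)
{
    return 0;
}

int lbc_after_combine(int left_lbc)
{
    return left_lbc - 1;
}

/* LBC == 0 marks a right buddy; LBC >= 1 marks a left buddy. */
int is_right_buddy(int lbc)
{
    return lbc == 0;
}
```

Note that splitting the whole block (LBC 0) gives a left buddy with LBC 1, and combining that pair restores LBC 0, so the two operations are inverses as required.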
Suppose we have 144 K of memory, and the smallest size of block that can be allocated is taken as fib(5) = 8 K. So we will have to maintain 7 free lists viz. 5-list, 6-list, 7-list, 8-list, 9-list, 10-list, 11-list.
(i) Initially all the free lists are empty except the 11-list, which contains a block that starts at address 0. The LBC of this block is 0.
Free lists : 5->N, 6->N, 7->N, 8->N, 9->N, 10->N, 11->0
Free lists : 5->N, 6->110, 7->N, 8->0, 9->N, 10->N, 11->N
(vii) Allocate 20K for P7
fib(6) < 20 <= fib(7), so we need to allocate a 7-block. The 8-block at address 0 is split.
(viii) Free P7
The LBC of block P7 is 0, which means that it is a right buddy. Its left buddy would be a 7-block situated at address 0 (21 - 21). There is a 7-block at address 0 but it is not free, so no combination takes place.
Now the newly combined block is an 8-block at address 0. Its LBC is 3, indicating that it is a left buddy. Its right buddy would be a 7-block at address 34 (0 + 34). The right buddy is free, so again a combination takes place.
The LBC of block P2 is not 0, which means that it is a left buddy. Its right buddy would be a 6-block at address 110 (89 + 21). The right buddy exists but is not free.
Free P1
The LBC is zero, indicating that it is a right buddy. Its left buddy would be an 8-block at address 89 (123 - 34). The left buddy is not currently available.
Free P6
The LBC is zero, indicating that it is a right buddy. Its left buddy would be a 7-block at address 89 (110 - 21). The left buddy is free, so they are combined.
The LBC of the combined block is 1, hence it is a left buddy. Its right buddy is a 7-block at address 123 (89 + 34), and it is free, so they are combined.
The LBC of the combined block is 1, hence it is a left buddy. Its right buddy is a 9-block at address 89 (0 + 89), and it is free, so they are combined.
Free lists : 5->N, 6->N, 7->N, 8->N, 9->N, 10->N, 11->0
In the Fibonacci buddy system, internal fragmentation is reduced since there is a larger variety in the sizes of free blocks.
10.6 Compaction
After repeated allocation and deallocation of blocks, the memory becomes fragmented. Compaction is a technique that joins the non-contiguous free memory blocks to form one large block so that the total free memory becomes contiguous. All the memory blocks that are in use are moved towards the beginning of the memory, i.e. these blocks are copied into sequential locations in the lower portion of the memory. For example, in the following figure, blocks allocated for programs P1, P3, P5, P7 are copied into locations starting from 0 and a single large block of free memory is created.
When compaction is performed, all the user programs come to a halt. A problem can arise if any of the used blocks that are copied contains a pointer value. For example, suppose inside block P5 the location 350 contains the address 310. After compaction the block P5 is moved from location 290 to location 120, so now the pointer value 310 stored inside P5 should change to 140. So after compaction the pointer values inside blocks should be identified and changed accordingly.
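The adjustment described above can be sketched as follows; the block size of 60 used in the example below is an assumption, since the figure does not show P5's size:

```c
/* Pointer adjustment during compaction: if an address points inside a
   block that moved, shift it by the same amount the block moved. */
long adjust_pointer(long ptr, long old_base, long new_base, long size)
{
    if (ptr >= old_base && ptr < old_base + size)
        return ptr - old_base + new_base;   /* keep the same offset */
    return ptr;                             /* points elsewhere: unchanged */
}
```

With P5 moved from 290 to 120, the stored address 310 (offset 20 inside P5) becomes 140, as in the example above.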
Exercise
1. Consider the situation in the given figure and allocate 50K for a program using-
(i) First Fit method
(ii) Best Fit method
(iii) Worst Fit method
2. (figure: linked list with header node)
3. Suppose we have 512K of free memory and the smallest block of memory that can be allocated is of size 8K.
Use binary buddy system to allocate 100K for P1, 25K for P2, 50K for P3, 60K for P4, 125K for P5.
4. The following memory blocks were allocated using binary buddy system. Deallocate the block P4.
Chapter 2
Arrays, Pointers and Structures
1. Error, the size of an array should be a constant expression. 4. sum = 60
5. a = 4, b = 6
arr1[0] = 6, arr1[4] =
arr2[0] = 1, arr2[4] = 5
6. 30 30 35 35 40 40 45 45
7. Error, since arr is a constant pointer and it can't be changed.
8. 26 31 36 41 46
By (*p)++ we increment the value pointed to by p, and by p++ we increment the pointer.
9. 30 35 40 55  10. 35 55 65  11. 90 85 70 65 60 55 40 35 30 25
12. 90 85 70 65 60 55 40 35 30 25  13. 20 20 10  14. a=2 b=2  15. 4 5 6 7
16. Marks = 80 Grade = A Marks = 80 Grade = A
17. Error - The structure definition is written inside main( ), so it can't be accessed by function func( ). It should be written before main( ) so that it is accessible to all functions.
18. 12  19. 13
Chapter 3
Linked Lists
5. Traverse the list L, and keep on inserting the nodes one by one at the beginning of the new list.
6. We did this in a single pass of bubble sort; here we will do the swap unconditionally. After swapping two nodes we will save them and move to the next node.
7. Unlike single linked list, here we will not need a pointer to the previous node.
8. While swapping through info, we will need a pointer to the last node. While swapping through links, we will need a pointer to the second last node.
9. Compare adjacent elements and if the first one is greater than the second, swap them. This is the technique we used in a single pass of bubble sort.
10. Compare the first element with all other elements starting from the second one, and if any element is smaller than the first element, swap them. This is the technique we used in a single pass of selection sort.
13. Here we need a pointer to the last node of the list.
16. For deleting a node we need a pointer to the previous node, so that we can change the link part of the previous node and make it point to the node which is after the node to be deleted. Here we are given only a pointer to the node to be deleted, and since the list is a single linked list we can't get the pointer to the previous node. We don't have the start pointer, so we can't traverse the list and get the pointer to the previous node.
We copy the data from the next node into the node to be deleted and then delete the next node.
The last node cannot be deleted if only a pointer to it is given, because in that case we need the previous node so that its link can be set to NULL.
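The copy-the-next-node trick can be sketched like this (the node structure follows the book's info/link convention; the function name is illustrative):

```c
#include <stdlib.h>

struct node { int info; struct node *link; };

/* Delete the node pointed to by p when only p is known: copy the next
   node's data into *p and unlink the next node. Returns 0 when p is
   the last node, which cannot be deleted this way. */
int delete_here(struct node *p)
{
    struct node *next = p->link;
    if (next == NULL)
        return 0;
    p->info = next->info;   /* p now holds the next node's data */
    p->link = next->link;   /* bypass the next node */
    free(next);
    return 1;
}
```

After the call, the list contains the same values minus the one that was in *p, even though a different physical node was freed.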
Sl0me ee Data Structures Through C in Depth
17. Inserting a node after a node pointed to by p is simple. For inserting a node before a node pointed to by p we need the pointer to the previous node also. But we have no other information except the pointer p. So we can use the trick of copying info as we did in exercise 16.
18. After freeing the pointer p, the expression p->link is meaningless. So we need to store p->link in some variable before freeing the pointer p. Finally, the pointer start should be assigned NULL.
19. Since the list is sorted, the duplicate elements will be adjacent. We can start traversing from the beginning and compare
the data of adjacent nodes. If the data is same we can delete the next node and continue. Before deleting any node we need
to store the address of its next node.
20. In an unsorted list, the duplicate elements need not be adjacent. So here, either we can sort the list and then delete duplicates as in E19, or we can use two nested loops to compare the elements of the list.
25.
26. Take two pointers and initially point them to the first node. Move the first pointer n times, then move both the pointers simultaneously, and when the first pointer becomes NULL, the second pointer will be pointing to the nth last node. For example, to find the 3rd last node in the list given below, first move pointer p1 three times, then move both p1 and p2 till p1 becomes NULL. The pointer p2 will then point to the 3rd last node.
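The two-pointer technique above can be sketched as a function (names are illustrative):

```c
#include <stddef.h>

struct node { int info; struct node *link; };

/* Return the nth node from the end, or NULL if the list is shorter
   than n nodes. p1 is moved n nodes ahead, then both pointers move
   together until p1 falls off the end. */
struct node *nth_from_end(struct node *start, int n)
{
    struct node *p1 = start, *p2 = start;
    int i;
    for (i = 0; i < n; i++) {
        if (p1 == NULL)
            return NULL;        /* list has fewer than n nodes */
        p1 = p1->link;
    }
    while (p1 != NULL) {        /* keep the gap of n nodes */
        p1 = p1->link;
        p2 = p2->link;
    }
    return p2;
}
```

The gap between the two pointers stays exactly n nodes, so when p1 reaches NULL, p2 is n nodes from the end.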
27. A simple solution to find a cycle in a linked list is to have a visited flag in every node of the linked list. The list is traversed from the beginning and as each node is visited, its visited flag is set to true. If we reach a node whose visited flag is true, then there is a cycle in the list.
Another solution, which is efficient and in which we need not make any changes in the nodes of the list, is based on the tortoise and hare algorithm. Take two pointers and initially point them to the first node. Move the first pointer one node at a time and move the second pointer two nodes at a time. If the list is not NULL terminated, i.e. it contains a cycle, then after some time both the pointers will enter the cycle and will definitely meet at some node. The fast and slow pointers can meet only if there is a cycle in the list or the whole list is circular. This algorithm is called the tortoise and hare algorithm because we have used a slow pointer (tortoise) and a fast pointer (hare). It is also known as Floyd's cycle detection algorithm.
To count the number of nodes in the cycle, fix the pointer p1 at node 18 and move the pointer p2 one node at a time. Stop the pointer p2 when it meets pointer p1. The pointer p2 visits a full cycle, and the number of times it moves is equal to the number of nodes in the cycle (5 nodes).
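The tortoise and hare detection above can be sketched as follows (function name illustrative):

```c
#include <stddef.h>

struct node { int info; struct node *link; };

/* Floyd's cycle detection: slow moves one node at a time, fast moves
   two. Returns 1 if the pointers meet (cycle), 0 if fast hits NULL. */
int has_cycle(struct node *start)
{
    struct node *slow = start, *fast = start;
    while (fast != NULL && fast->link != NULL) {
        slow = slow->link;
        fast = fast->link->link;
        if (slow == fast)
            return 1;           /* met inside the cycle */
    }
    return 0;                   /* reached NULL: list is cycle-free */
}
```

If the list is NULL terminated, the fast pointer reaches the end first and the loop exits; otherwise both pointers end up circling inside the cycle and must coincide.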
Solutions
[5G
E nea Sil
9OD
To count the remaining nodes in the list, make the pointer p1 point to the first node. The pointer p2 points to node 18.
Now move both pointers till they meet. :
32.
33.
Find the length of both lists. Suppose d is the difference of the lengths of the two lists. Take two pointers pointing to the beginning of the lists. Move the pointer which points to the longer list d times. Now move both pointers simultaneously and compare them. If at any point both the pointers are equal, then we have found the merge node. If we reach the end of either list without finding the merge node, we can conclude that the lists don't intersect at any point. In the example, the merge point is the node containing data 52.
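The solution above can be sketched as follows (names are illustrative):

```c
#include <stddef.h>

struct node { int info; struct node *link; };

static int list_length(struct node *p)
{
    int n = 0;
    while (p != NULL) { n++; p = p->link; }
    return n;
}

/* Advance along the longer list by the difference of lengths, then
   walk both lists together until the pointers coincide. Returns the
   merge node, or NULL if the lists never intersect. */
struct node *merge_point(struct node *l1, struct node *l2)
{
    int d = list_length(l1) - list_length(l2);
    while (d > 0) { l1 = l1->link; d--; }
    while (d < 0) { l2 = l2->link; d++; }
    while (l1 != l2) {
        l1 = l1->link;
        l2 = l2->link;
    }
    return l1;   /* common node, or NULL if both lists ended */
}
```

After the alignment step both pointers are the same distance from the end, so if the lists share a tail, the pointers must meet at its first node; if not, they meet at NULL.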
36. First find the middle of the list as in E28. Split the list into two halves, reverse the second half of the list and compare the two halves. After comparing, reverse the second half again and join the two halves to get the original list. There will be different cases for odd and even numbers of elements.
Chapter 4
Stack and Queue
4. The first stack will start from the 0th position and the second stack will start from the last position of the array. Overflow will occur when the tops of the two stacks cross each other.
(i) Push 4, 8 on stack A; Push 3, 6, 9 on stack B  (ii) Push 1, 2, 5, 9 on stack A; Push 4, 6, 1, 2, 3, 8, 5 on stack B
5. The first queue will start from the 0th position and the second queue will start from the last position of the array.
6. Enqueue - Push the item on instack.
Dequeue - If outstack is empty, then move all the items from instack to outstack. Pop from outstack.
This way we can get the FIFO behavior using two stacks.
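The scheme in E6 can be sketched with fixed-size array stacks (no overflow checks, names illustrative):

```c
/* A queue built from two stacks: enqueue pushes on instack; dequeue
   pops from outstack, refilling it from instack only when empty. */
#define MAX 100

static int instack[MAX],  in_top  = -1;
static int outstack[MAX], out_top = -1;

void enqueue(int x)
{
    instack[++in_top] = x;
}

int dequeue(void)
{
    if (out_top == -1)                   /* outstack empty: transfer */
        while (in_top != -1)
            outstack[++out_top] = instack[in_top--];
    return outstack[out_top--];          /* oldest element is on top */
}
```

Transferring reverses the order of the items, so the element pushed earliest ends up on top of outstack, giving FIFO behavior.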
7. Push - Enqueue in Q1
Pop - Delete all items from Q1 except the last and insert them in Q2. The last item left is the item to be popped. Now insert all items back into Q1.
8. Push - Enqueue the item
Pop - Delete all the items one by one and insert them at the end one by one, except the last one, which is the item popped.
(i) Push 1,2,3,4,5
Queue - 1,2,3,4,5
(ii) Pop
Queue - 2,3,4,5,1
Queue - 3,4,5,1,2
Queue - 4,5,1,2,3
Queue - 5,1,2,3,4
Now 5 is deleted, which is the item popped.
Queue - 1,2,3,4
10.
13. Delete the elements from Q1 one by one and insert them both in Q1 and Q2.
Q1- 1,2,3,4,5  Q1- 2,3,4,5,1  Q1- 3,4,5,1,2  Q1- 4,5,1,2,3  Q1- 5,1,2,3,4  Q1- 1,2,3,4,5
Q2-            Q2- 1          Q2- 1,2        Q2- 1,2,3      Q2- 1,2,3,4    Q2- 1,2,3,4,5
14. Take two stacks, one to store all the elements and the other to store the minimum.
Push - Push the item on the main stack. Push on the minimum stack only if the item to be pushed is less than or equal to the value at the top of the minimum stack.
Pop - Pop from the main stack. Pop from the minimum stack only if the value popped from the main stack is equal to that on the top of the minimum stack.
Push 4 Push 2 Push 3 Push 5 Pop Pop Pop Pop Push 6
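The minimum stack of E14 can be sketched with fixed-size arrays (no overflow checks, names illustrative):

```c
/* A stack that reports its minimum in O(1) using an auxiliary stack. */
#define MAX 100

static int main_st[MAX], main_top = -1;
static int min_st[MAX],  min_top  = -1;

void push(int x)
{
    main_st[++main_top] = x;
    if (min_top == -1 || x <= min_st[min_top])
        min_st[++min_top] = x;           /* new minimum (or equal) */
}

int pop(void)
{
    int x = main_st[main_top--];
    if (x == min_st[min_top])
        min_top--;                       /* the minimum leaves the stack */
    return x;
}

int minimum(void)
{
    return min_st[min_top];
}
```

Pushing equal values onto the minimum stack as well is what makes popping duplicates of the minimum safe.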
15. Keep on dividing the number by 2 and pushing the remainders on a stack till the number is reduced to 0. Now pop all the numbers from the stack and display them.
101
101/2, q=50, r=1  stack=1
50/2, q=25, r=0  stack=1,0
25/2, q=12, r=1  stack=1,0,1
12/2, q=6, r=0  stack=1,0,1,0
6/2, q=3, r=0  stack=1,0,1,0,0
3/2, q=1, r=1  stack=1,0,1,0,0,1
1/2, q=0, r=1  stack=1,0,1,0,0,1,1
Binary = 1100101
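The E15 procedure can be sketched as follows; packing the binary digits into a decimal-looking long is a choice made here for easy checking, not something from the book:

```c
/* Convert n to binary by stacking remainders of division by 2, then
   popping them most-significant-digit first. Returns the digits packed
   as a base-10 number, e.g. 101 -> 1100101. */
long to_binary(int n)
{
    int stack[32], top = -1;
    long result = 0;
    while (n > 0) {
        stack[++top] = n % 2;            /* push the remainder */
        n /= 2;
    }
    while (top != -1)                    /* pop in reverse order */
        result = result * 10 + stack[top--];
    return result;
}
```

For 101 the remainders are pushed as 1,0,1,0,0,1,1 and popped as 1100101, matching the trace above.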
16. Find the prime factors iteratively and push them in a stack. Pop elements from the stack and display them.
17. Scan the infix expression from right to left. Whenever an operator comes, pop the operators which have priority greater than (not equal to) the priority of the scanned operator. The rest of the procedure is the same as in postfix conversion.
18. Postfix (i) AB+CD+* (ii) ACD-%BE*+ (iii) ABC+D*E/- (iv) HJK+41*5%
Prefix (i) *+AB+CD (ii) +%A-CD*BE (iii) -A/*+BCDE (iv) %**H+JKIS
19. (i) -/+5342 Value: 0 (ii) +4+/63*3%221 Value: 15 (iii) +-*/+8223/213 Value: 16
20. Scan from left to right; when you get an operator, place it before the two operands that precede it. For example-
(i) ABC*-DE-F*G/H/+
A[*BC]-DE-F*G/H/+
[-A*BC]DE-F*G/H/+
[-A*BC][-DE]F*G/H/+
[-A*BC][*-DEF]G/H/+
[-A*BC][/*-DEFG]H/+
[-A*BC][//*-DEFGH]+
+-A*BC//*-DEFGH
Chapter 5
Recursion
1. 33 33. Both functions return the sum of all numbers from a to b.
2. 1033, Function returns the sum of all numbers from a to b added to 1000.
3. 40, func( ) returns the sum of all numbers from 6 to 10. In func1( ), we have infinite recursion as the terminating condition is never true.
4. func(4,8) returns 30, but in func(3,8) we have infinite recursion as the terminating condition is never true.
SUS Ma NG MS eb Ts aD IL kG)
pO eld 213214155 16 118
NNO WGA NE AMS. Ge Fe ts
We a AG SS eS NB) ee Se ANG:
7. 24 0 0, func( ) returns the product of a and b.  8. 5, count( ) returns the number of digits in n.
9. 23, func( ) returns the sum of digits in n.  10. 3, count( ) returns the number of times digit d occurs in number n.
11. 4, func( ) returns the total number of even elements in array arr.
12. 28, func( ) returns the sum of all elements of array arr.
13. func( ) returns the number of times character c occurs in string str.
14. funcl( ) prints - func2( ) prints-
Se ee *
ok ok aH
KK 28KHK
% sk ok 2k
Chapter 6
Trees
1.
If there are n nodes then the number of possible non-similar binary trees is (2n)Cn * 1/(n+1).
For 3 nodes: [ 6! / (3! * 3!) ] * [1/4] = 5
For 4 nodes: [ 8! / (4! * 4!) ] * [1/5] = 14
If we have to find the total number of different binary trees with n different keys, then we can multiply the above value by n!.
For example, suppose we have to find the number of possible binary trees of 3 nodes having key values 1, 2, 3. We have 5 different structures of possible binary trees as shown in the figure, and in each structure the values 1, 2, 3 can be arranged in 3! = 6 ways. So the total number of different binary trees will be 30.
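The count above is the Catalan number, which can be computed as a sketch (function name illustrative):

```c
/* Number of structurally different binary trees with n nodes:
   the Catalan number C(2n, n) / (n + 1), built incrementally so each
   intermediate value C(2n, i+1) stays an exact integer. */
long num_binary_trees(int n)
{
    long c = 1;
    int i;
    for (i = 0; i < n; i++)
        c = c * (2L * n - i) / (i + 1);   /* c becomes C(2n, i+1) */
    return c / (n + 1);
}
```

This gives 5 for 3 nodes and 14 for 4 nodes, matching the worked values above; multiplying by 3! = 6 gives the 30 labeled trees.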
10.
LEA OGSS 2a). (ii) (9, 6, 5, 7,8) (9, 6, 7,8, 5) (9, 6, 7, 5,8)
(iii) (4, 6, 5, 8) (4, 6, 8, 5) (iv) (1, 2, 5, 7)
12. (a) (n-x) (b) y+1
13. For all these trees Preorder = ABC, Postorder = CBA
14. 67 is the root node; all values less than 67 will be in the left subtree and all values more than 67 in the right subtree. Applying this logic to the subtrees also, we can create our BST.
Alternatively, we can construct the tree as in section 6.9.3. The inorder traversal of a BST can be found by putting the data
in sorted order.
16. In a full binary tree, the left and right subtrees of a node have the same number of nodes. F is the root node, and of the remaining 6 nodes, three will be in the left subtree and three in the right subtree.
Chapter 7
Graphs
1.
(i) In(O)=1, In(1)=3, In(2)=0, In(3)=2, In(4)=1, Out(0)=2 ,Out(1)=0 , Out(2)=3, Out(3)=2, Out(4)=0
(ii) The path matrix is :
(iii) Since not all elements of the path matrix are 1, the graph is not strongly connected.
(iv) Since not all the diagonal elements of the path matrix are 0, the graph is not acyclic.
2. (i) Sum of degrees = 3+3+3+3 = 12  (ii) Sum of degrees = 1+0+1+2 = 4
(iii) Sum of degrees = 3+2+3 = 8  (iv) Sum of degrees = 1+2+1+4+1+2+1 = 12
The sum of degrees of all the vertices is twice the number of edges. This is because each edge contributes 2 to the sum of degrees. This result is called the Handshaking Lemma. If in a meeting people are represented by vertices and a handshake between two people by an edge, then the total number of hands shaken is twice the total number of handshakes.
3. Sum of degrees = Sum of degrees of even vertices + Sum of degrees of odd vertices.
The sum of degrees of all vertices is even by handshaking lemma, and the sum of degrees of even vertices will
definitely be even. Difference of two even numbers is even, so the sum of degrees of odd vertices will be even. Sum
of degrees of odd vertices is a sum of only odd terms, so the sum can be even only if the number of these odd terms
is even. So the number of odd vertices in a graph is even.
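The lemma can be checked on an adjacency matrix; the 4-vertex graph below is a made-up example, not one from the exercises:

```c
/* Handshaking Lemma check: summing all entries of the adjacency
   matrix of an undirected graph counts each edge from both ends,
   so the total equals twice the number of edges. */
#define N 4

int degree_sum(int adj[N][N])
{
    int i, j, sum = 0;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            sum += adj[i][j];
    return sum;
}
```

For a graph with edges 0-1, 0-2, 1-2, 2-3 (4 edges), the degree sum is 8.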
4. (i) Sum of indegrees = 2+1+2+0+1 = 6, Sum of outdegrees = 1+1+0+3+1 = 6
(ii) Sum of indegrees = 0+2+1 = 3, Sum of outdegrees = 2+0+1 = 3
(iii) Sum of indegrees = 1+2+1 = 4, Sum of outdegrees = 2+1+1 = 4
(iv) Sum of indegrees = 2+1+2+3 = 8, Sum of outdegrees = 2+3+1+2 = 8
Sum of indegrees of all vertices = Sum of outdegrees of all vertices = Number of edges.
This is the handshaking lemma for directed graphs. Each edge contributes 1 to the sum of indegrees and 1 to the sum of outdegrees.
5. The sum of degrees of all vertices of a graph is twice the number of edges. In a regular graph of n vertices having degree d, the sum of degrees of all vertices is n*d.
2e = n*d  =>  e = (n*d)/2
(i) 5  (ii) 6
(iii) (3*5)/2 is not a whole number, so this is not possible. In E3 we have proved that the number of odd vertices should be even, but here it is not so.
Gass, 2 Gi) 152,35 -dib3eies
7. (i) All vertices are visited by performing a DFS from 0. Now reverse the graph.
In the reverse graph, all vertices are not visited by performing a DFS from 0. So the graph is not strongly connected.
(ii) All vertices are visited by performing a DFS from 0. Now reverse the graph.
In the reverse graph, all vertices are visited by performing a DFS from 0. So the graph is strongly connected.
8. DFS taking 0 as the start vertex - 01237456
Discovery time and finishing times of vertices are —
Vertex 0 - (1, 10), Vertex 1 - (2, 9), Vertex 2 - (3, 8), Vertex 3 - (4, 7),
Vertex 4 - (11, 16), Vertex 5 - (12, 15), Vertex 6 - (13, 14), Vertex 7 - (5, 6)
Now reverse the graph.
Start DFS from 4 as it has highest finishing time.
Strongly connected components are - (4, 5, 6) (0,1) (2,3,7)
9. (0, 3, 4, 5, 2) = 22, (0, 3) = 5, (0, 3, 4) = 14, (0, 3, 4, 5) = 16
10. (i) Edges in MST: (0,1), (0,2), (0,3), ..., (0,n-1)
Weight of MST = 1+2+3+...+(n-1) = n(n-1)/2
(ii) Edges in MST: (0,1), (1,2), (2,3), ..., (n-2,n-1)
Weight of MST = 1+1+1+...+1 = n-1
(iii) Edges in MST: (0,1), (1,2), (2,3), ..., (n-2,n-1)
Weight of MST = 5+5+5+...+5 = 5(n-1)
Chapter 8
Sorting
1. (12345) (54321) (23154)
For all these sets of data the number of comparisons will be 10.
mero {2 21 23 32° 10 [34%¢45--67 89
Me
Pe 23> 47). 93, GG hse 230
ft, 42723, «12 Oe
fee 42 23, 12° FS. Fe
ee 1d 23, 9 a 3F
9 12 23, 42 beae23,
Pe 1 P35335~ 03, 24 136
fae oot? 23, 03,1342
~. On sorting the data given in E8 by bubble sort we get 6 9 12 23, 23, 23¢ 42
eee eels 19) 20 21 34> 89 38
12. (i) Bubble (ii) Selection (iii) Insertion
4. 5 or 8 could be the pivot.
Peele 2123 12512936 19935 47-87 72756
16. The contents of the array after each pass of heap sort are-
83 76 82 54 45 21 12 97
82 76 21 54 45 12 83 97
76 54 21 12 45 82 83 97
54 45 21 12 76 82 83 97
45 12 21 54 76 82 83 97
21 12 45 54 76 82 83 97
12 21 45 54 76 82 83 97
Chapter 9
Searching and Hashing
1. Search 27: (0, 8, 17) (0, 3, 7) - 27 found at position 3
Search 32: (0, 8, 17) (0, 3, 7) (4, 5, 7) (4, 4, 4) (5, 4, 4) - 32 not found in the array
Search 61: (0, 8, 17) (9, 13, 17) (9, 10, 12) (9, 9, 9) - 61 found at position 9
Search 97: (0, 8, 17) (9, 13, 17) (14, 15, 17) (16, 16, 17) (17, 17, 17) - 97 found at position 17
2. Squaring the keys and taking the 2nd, 4th and 6th digits-
116964 045369 186624 293764 017424 582169 088804
194    439    864    974    144    819    884
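The midsquare extraction can be sketched as follows; the keys 342, 213, 432 below are inferred from the squares shown (342 squared is 116964, and so on), and a six-digit square (with leading zeros) is assumed:

```c
/* Midsquare hashing sketch: square the key and extract the 2nd, 4th
   and 6th digits, counting from the left of the 6-digit square. */
int midsquare(long key)
{
    long sq = key * key;
    int d[6], i;
    for (i = 5; i >= 0; i--) {   /* peel off digits right to left */
        d[i] = sq % 10;
        sq /= 10;
    }
    return d[1] * 100 + d[3] * 10 + d[5];
}
```

For 342 the square is 116964 and the extracted address is 194, matching the first column above.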
3. 321 + 982 + 432 = 1735  213 + 432 + 183 = 828
343 + 541 + 652 = 1536  542 + 313 + 753 = 1608
The addresses are 735, 828, 536, 608.
4. 321 + 289 + 432 = 1042  213 + 234 + 183 = 630
343 + 145 + 652 = 1140  542 + 313 + 753 = 1608
The addresses are 42, 630, 140, 608.
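The two folding variants can be sketched as follows; reversing only the middle part in fold boundary is inferred from the sums shown in E4 (289 is 982 reversed, 234 is 432 reversed, 145 is 541 reversed):

```c
/* Folding sketch for 9-digit keys split into three 3-digit parts.
   Fold shifting adds the parts directly; fold boundary reverses the
   middle part first. The address keeps the last 3 digits of the sum. */
static long reverse3(long p)
{
    return (p % 10) * 100 + ((p / 10) % 10) * 10 + p / 100;
}

long fold_shift(long key)
{
    return (key / 1000000 + (key / 1000) % 1000 + key % 1000) % 1000;
}

long fold_boundary(long key)
{
    return (key / 1000000 + reverse3((key / 1000) % 1000) + key % 1000) % 1000;
}
```

For the key 321982432 this reproduces the addresses 735 (shifting) and 42 (boundary) computed above.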
5. H(9893)=37, H(2341)=37, H(4312)=24, H(7893)=21, H(4531)=51, H(8731)=27, H(3184)=48, H(5421)=45, H(4955)=27, H(1496)=24
H(9893)=44, H(2341)=63, H(4312)=24, H(7893)=54, H(4531)=42, H(8731)=21, H(3184)=35, H(5421)=61, H(4955)=64, H(1496)=22
6. Linear Probe 7. Quadratic Probe 8. Double hashing
9. The length of the longest chain is 3, and the keys in it are 1457, 8255, 1061.
Chapter 10
Storage Management
1. In first fit and worst fit, memory is allocated from the 120K free block, and in best fit, memory is allocated from the 72K free block.
Index

Hashing, 476
Hash functions, 478
    Division Method (Modulo-Division), 479
    Folding Method, 479
    Midsquare Method, 479
    Truncation (or Extraction), 478
Heap, 207
    Building a heap, 284
    Deletion, 282
    Insertion in Heap, 279
Heap Sort, 455
Height of Binary tree, 200
Huffman Codes, 289
Huffman Tree, 287
In place Sort, 420
Incidence, 327
Indegree, 328
Indirect Recursion, 169
Indirect Sort, 419
Inorder Traversal, 190
Input restricted deque, 129
Insertion Sort, 427
Isolated vertex, 329
Kruskal's Algorithm, 405
Natural Merge Sort, 444
Nested Structures, 36
Non linear data structures, 3
Null graph, 329
O notation, 6
One Dimensional Array, 10
Open Addressing (Closed Hashing), 480
Outdegree, 328
Output restricted deque, 129
Path, 327
Path Matrix, 346
Pendant vertex, 328
Planar graph, 329
Pointer to Pointer, 20
Pointers, 18
Pointers and Functions, 23
Pointers and One Dimensional Arrays, 21
Pointers to Structures, 38
Pointers within Structures, 39
Polish Notation, 137
    Converting infix expression to postfix expression using stack, 139
    Evaluation of postfix expression using stack, 141
Polynomial arithmetic with linked list, 98
    Addition of 2 polynomials, 101
    Creation of polynomial linked list, 101