DS_Unit-3_Notes
BCA SEM-II
Unit – III
❖ One of the drawbacks of an array is that ‘it is a static data structure’. Hence, the maximum
capacity of an array should be known and defined before the compilation process. Practically
speaking, accurate predictions about data structure sizes are very difficult. And defining a large
static size results in memory wastage
❖ Another drawback of an array is that ‘its elements are stored fixed distance apart, and the
insertion and deletion of elements in between require a lot of data movement’
❖ The linked list overcomes the above drawbacks. A linked list uses ‘dynamic
memory management’, i.e. it allocates memory when needed and de-allocates it (releases
it or frees it) when it is no longer needed
❖ A linked list is a very effective and efficient dynamic data structure for linear lists. Items may be
added or deleted from it at any position more easily when compared to arrays. With linked lists,
there is no restriction on the maximum size of the linear list
❖ Linked List is a linear data structure which consists of a group of nodes in a sequence and these
nodes are not stored at contiguous locations (i.e. the nodes can be stored anywhere in the
memory in a scattered manner)
❖ Each node in a linked list consists of 2 parts: data and link. The ‘data’ part holds the actual
element value and the ‘link’ part holds the address (i.e. pointer) of its successor node (i.e.
next node). The successor in turn points to its own successor, and so on, forming a chain. It is
called a linked list because each node is linked with its successor
❖ Linked List is implemented by using pointers
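The node layout described above can be sketched in C++ as follows. This is only a minimal illustration, assuming the data part is an int; the names Node, data and link follow these notes, while makeNode and linkNodes are hypothetical helpers introduced here for the example.

```cpp
#include <cassert>
#include <cstddef>

// One node of a linked list: a 'data' part and a 'link' part.
struct Node
{
    int data;    // the actual element value (int is an assumption)
    Node *link;  // address of the successor node, NULL if there is none
};

// Hypothetical helper: allocate a node dynamically, with no successor yet.
Node *makeNode(int value)
{
    Node *n = new Node;
    n->data = value;
    n->link = NULL;
    return n;
}

// Hypothetical helper: make 'second' the successor of 'first'.
void linkNodes(Node *first, Node *second)
{
    assert(first != NULL);  // a missing node has no link part to set
    first->link = second;
}
```

The nodes come from separate calls to new, so they need not be contiguous in memory; only the link parts hold the chain together.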
1) Linked List is a dynamic data structure which allocates and de-allocates memory as per need.
Hence, memory utilization is efficient
2) Insertion and Deletion operations can be easily implemented
3) Stacks and Queues can be easily implemented
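4) Linked List reduces the time needed for insertion and deletion, as no data movement is required
(note that accessing an element by its position still requires traversing the list from the head)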
Data Structures Unit-3 Notes (for BCA) – By Ganesh sir Page 3
❖ Applications of Linked Lists: Linked Lists are used to easily implement stacks, queues, trees,
graphs etc
❖ Linked List ADT: Following are the terms used when working with a Linked List ADT:
• Head: It is a variable (or handle) which acts as a pointer to the first node (i.e. head part)
of the linked list. A linked list must always have at least one pointer pointing to the first
node (head) of the list. This pointer is necessary because it is the only way to access the
nodes in the list. Sometimes, this pointer may also be termed as Front
• Tail: It is a variable (or handle) which acts as a pointer to the last node of a linked list.
Sometimes, this pointer may also be termed as Rear
• Header node: Sometimes, a linked list may contain a special node that is attached at the
beginning of the linked list to which the Head pointer is pointing. Such a node is called
the Header node. It may contain special information (metadata) about the linked list such
as total number of data nodes in the list, DOC (i.e. date of creation of the list), type
of data stored etc. Below is the diagrammatic representation of linked list with Header
node:
• Data node: It is the node that stores the actual data and link(s) to its successor and/or
predecessor
• Singly linked list contains the nodes which have a ‘data’ part as well as an ‘address’
part i.e. next, which points to the next node in the sequence of nodes
• The operations that we can perform on singly linked lists are insertion, deletion and
traversal
• Following is the diagrammatic representation of a singly linked list
• In a doubly linked list, each node contains two links: the ‘prev’ link points to the previous
node and the ‘next’ link points to the next node in the sequence
• The operations that we can perform on doubly linked lists are insertion, deletion and
traversal
• Following is the diagrammatic representation of a doubly linked list
• In the circular linked list, the last node of the list contains the address of the first node
and forms a circular chain
• The operations that we can perform on circular linked lists are insertion, deletion and
traversal
• Following is the diagrammatic representation of a circular linked list
3.1.1 Singly Linked List
❖ A singly linked list is a linked list that contains a sequence of nodes. Each node has a ‘data’
part as well as an ‘address’ part i.e. next, which points to the next node in the sequence
❖ In a singly linked list, the address of the first node is always stored in a variable known as
head or front. Below is an Example:
❖ Operations in Singly Linked List: The basic operations that can be performed on a singly
linked list are:
• Creating the linked list
• Insertion of a node in the linked list
• Deletion of a node from the linked list
• Traversing a linked list
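Node *Newnode;
Newnode = new Node;            // allocate memory for the new node
cin >> Newnode -> data;        // read the data part
Newnode -> link = NULL;        // the new node has no successor yet
if (Head == NULL)              // empty list: the new node is both first and last
{
    Head = Newnode;
    Tail = Newnode;
}
else                           // non-empty list: attach after the current last node
{
    Tail -> link = Newnode;
    Tail = Newnode;
}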
• When a node is inserted at the first position, it has no previous node
• The link manipulations needed to add a node at the first location are shown below
using dotted lines
• After the insertion of the NewNode, the linked list is changed to below:
• Assume that a node is to be inserted at some position other than the first and last
positions. Let Prev refer to the node after which NewNode is to be inserted. The link
manipulations required to accomplish this are shown below:
• We need the following 2 steps to insert a NewNode after the Prev node:
1) NewNode -> link = Prev -> link;
2) Prev -> link = NewNode;
• After insertion of a NewNode after the Prev node, the linked list is changed to below:
• As the node is to be inserted after the last node, assume that Prev is the last node. Let
the node to be inserted be NewNode as shown below:
• We need the following 2 steps to insert a NewNode after the Prev node:
1) NewNode -> link = NULL;
2) Prev -> link = NewNode;
• After the insertion of a NewNode after the Prev node (i.e. at the end of the list), the
linked list is changed to below:
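The three insertion cases above (first position, middle and end) can be sketched as two small C++ functions. This is a minimal sketch assuming an int data part; insertFirst and insertAfter are hypothetical function names, while Head, Prev, NewNode and link follow these notes.

```cpp
#include <cassert>
#include <cstddef>

struct Node
{
    int data;
    Node *link;
};

// Insert NewNode at the first position.
// Head is passed by reference because it must point to NewNode afterwards.
void insertFirst(Node *&Head, Node *NewNode)
{
    assert(NewNode != NULL);
    NewNode->link = Head;  // NewNode now points to the old first node (or NULL)
    Head = NewNode;        // Head now points to NewNode
}

// Insert NewNode after Prev. Works for the middle and for the end of the list:
// if Prev is the last node, Prev->link is NULL and NewNode becomes the last node.
void insertAfter(Node *Prev, Node *NewNode)
{
    assert(Prev != NULL && NewNode != NULL);
    NewNode->link = Prev->link;  // step 1: NewNode points to Prev's old successor
    Prev->link = NewNode;        // step 2: Prev points to NewNode
}
```

The order of the two steps in insertAfter matters: if Prev->link were overwritten first, the address of the rest of the list would be lost. If a Tail pointer is also maintained, it would have to be updated whenever the insertion happens after the last node.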
• There may be nodes that are to be deleted from a list. We need the address of the node to
be deleted (i.e. Curr) as well as the address of its predecessor (i.e. Prev) to modify the
links such that the node is deleted. Let us assume that the node to be deleted contains
data x. Let x = 13, let it be pointed to by the pointer ‘Curr’, and let its previous node
be pointed to by the pointer ‘Prev’
• To delete a node, the required link manipulations are shown in the below diagram with
the dotted lines:
• If the node at the first position is to be deleted, then we need to modify the pointer
pointing to the first node (i.e. Head). So, set a temporary pointer (say, Curr) to point to
the first node before modifying Head, and then set Head to point to the second node
(which becomes the first node after deletion). This can be accomplished by the following
statements and the link manipulations needed are shown in the below diagram
Curr = Head;
Head = Head -> link;
delete Curr;
• Let Curr point to the node to be deleted, and Prev be the predecessor of Curr. Then,
the following statements will delete the node Curr. These two statements work well
for deleting a middle node (or) the last node as well
Prev -> link = Curr -> link;
delete Curr;
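The two deletion cases above can be wrapped into C++ functions as a minimal sketch. The function names deleteFirst and deleteNode are hypothetical; Head, Prev, Curr and link follow these notes, and the data part is assumed to be an int.

```cpp
#include <cassert>
#include <cstddef>

struct Node
{
    int data;
    Node *link;
};

// Delete the node at the first position (does nothing on an empty list).
void deleteFirst(Node *&Head)
{
    if (Head == NULL)
        return;
    Node *Curr = Head;   // remember the first node before Head is changed
    Head = Head->link;   // Head now points to the second node (or NULL)
    delete Curr;         // release the old first node
}

// Delete Curr, where Prev is the predecessor of Curr.
// Works for a middle node and for the last node.
void deleteNode(Node *Prev, Node *Curr)
{
    assert(Prev != NULL && Curr != NULL && Prev->link == Curr);
    Prev->link = Curr->link;  // Prev skips over Curr
    delete Curr;              // release the deleted node
}
```

When Curr is the last node, Curr->link is NULL, so Prev automatically becomes the new last node. A Tail pointer, if one is kept, would additionally have to be set to Prev in that case.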
• List Traversal is the basic operation where all the elements in the list are processed
sequentially, one by one
• Processing could involve the tasks like retrieving, searching (i.e. comparing),
updating, printing, sorting, computing the length and so on
• To traverse the linked list, we have to start from the first node. We can access the first
node through the Head pointer. Once we access the first node, through its link field, we
can access the second node; through the second node’s link field, we can access the third
node, and so on, as every node points to its successor till the last node
1) Get the address of the first node, call it Curr (i.e. Curr = Head;)
2) If Curr is null, it means that the list is either empty or the list is ended. Hence, go
to step 6
3) If Curr is not null, process the data field of the current node (i.e. node pointed by
Curr) as per the processing requirement of the application
4) Move Curr to the next node (i.e. Curr = Curr -> link;)
5) Go to the step 2
6) Stop
• Example: Let us see the list traversal for L = {21, 22, 23} with the help of a
diagrammatic representation as given below
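The traversal steps can be sketched in C++ as follows. This is a minimal sketch in which "processing" is assumed to mean printing the data field and counting the nodes; traverse and demoTraversal are hypothetical names, and the demo builds the example list L = {21, 22, 23} from local variables only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct Node
{
    int data;
    Node *link;
};

// Visit every node once, starting from Head; returns the number of nodes visited.
int traverse(const Node *Head)
{
    int count = 0;
    const Node *Curr = Head;             // step 1: start at the first node
    while (Curr != NULL)                 // step 2: stop when the list is empty or ended
    {
        std::cout << Curr->data << " ";  // step 3: process the current node
        count++;
        Curr = Curr->link;               // step 4: move to the successor
    }                                    // step 5: repeat from step 2
    return count;                        // step 6: stop
}

// Builds L = {21, 22, 23} and traverses it.
int demoTraversal()
{
    Node third = {23, NULL};
    Node second = {22, &third};
    Node first = {21, &second};
    int n = traverse(&first);            // visits 21, then 22, then 23
    assert(n == 3);
    return n;
}
```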
In a DLL (Doubly Linked List), each node has 1 data field and 2 link fields, called next
and prev (for previous), which hold the addresses of the next node and the previous node
respectively. Hence, each node has knowledge of its successor and also its predecessor
In DLL, from every node, the list can be traversed in both the directions
1) Doubly linked lists are more convenient than singly linked lists since we maintain links for
bi-directional traversing
2) We can traverse in both the directions and display the contents of the whole list
3) The previous link of the first node and the next link of the last node point to NULL
• Creation of a DLL
• Insertion of a node in DLL
• Deletion of a node from DLL
• Traversal of a DLL
A. Creation of a DLL
• Creation of a DLL follows a procedure similar to that of a SLL (Singly Linked List); the
only difference is that each node must be linked to both its predecessor and successor
• Assuming that a DLL needs to be created with the data element ‘value’, the following
statements are involved in creation process:
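temp = new Node;          // allocate the new node
temp -> data = value;     // store the data element
temp -> next = NULL;      // the new node has no successor yet
if (Head == NULL)         // empty list: the new node becomes the first node
{
    temp -> prev = NULL;
    Head = temp;
}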
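A fuller sketch of DLL creation, covering both the empty and the non-empty case, might look like the following. It assumes an int data part and a Tail pointer to the last node; the function name append is hypothetical, while Head, Tail, temp, prev and next follow these notes.

```cpp
#include <cassert>
#include <cstddef>

struct Node
{
    int data;
    Node *prev;  // address of the predecessor, NULL for the first node
    Node *next;  // address of the successor, NULL for the last node
};

// Create a node holding 'value' and attach it at the end of the list.
void append(Node *&Head, Node *&Tail, int value)
{
    Node *temp = new Node;
    temp->data = value;
    temp->next = NULL;         // the new node is always the last node
    if (Head == NULL)
    {
        temp->prev = NULL;     // first node: no predecessor
        Head = temp;
    }
    else
    {
        assert(Tail != NULL);  // a non-empty list must have a last node
        temp->prev = Tail;     // link the new node back to the old last node
        Tail->next = temp;     // link the old last node forward to the new node
    }
    Tail = temp;
}
```

Calling append repeatedly with Head and Tail initially set to NULL builds the list one node at a time, with every node linked to both its predecessor and its successor.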
B. Insertion of a node in DLL
• There are 3 situations for inserting a node in the list (i.e. insertion at the front of the list,
insertion at the end of the list and insertion somewhere at the middle of the list)
• To insert a new node, we have to modify 4 links as each node points to its predecessor as
well as its successor. Let us assume that the node Current is to be inserted in between
the 2 nodes, say node1 and node2. Then we have to modify the following 4 links:
node1 -> Next, node2 -> Prev, Current -> Prev, Current -> Next
• To insert a node, say NewNode in between the 2 nodes say PrevNode and
CurrentNode, we have to modify the following 4 links:
PrevNode -> Next, CurrentNode -> Prev, NewNode -> Prev, NewNode -> Next
• To insert a node at the first position with the above naming conventions, we have
to modify 4 links with the help of following statements:
• To insert a node at the last position, we have to only modify the 3 links as shown
by the below statements:
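The three insertion situations for a DLL can be sketched in C++ as below. This is only one possible arrangement, assuming an int data part and that both Head and Tail pointers are maintained; the function names insertFirst, insertBetween and insertLast are hypothetical, while PrevNode, CurrentNode, NewNode, prev and next follow these notes.

```cpp
#include <cassert>
#include <cstddef>

struct Node
{
    int data;
    Node *prev;
    Node *next;
};

// Insert NewNode at the front of the list.
void insertFirst(Node *&Head, Node *&Tail, Node *NewNode)
{
    NewNode->prev = NULL;
    NewNode->next = Head;
    if (Head != NULL)
        Head->prev = NewNode;  // old first node now has a predecessor
    else
        Tail = NewNode;        // list was empty: NewNode is also the last node
    Head = NewNode;
}

// Insert NewNode in between PrevNode and CurrentNode (4 links are modified).
void insertBetween(Node *PrevNode, Node *CurrentNode, Node *NewNode)
{
    assert(PrevNode != NULL && CurrentNode != NULL);
    assert(PrevNode->next == CurrentNode);  // the two nodes must be adjacent
    NewNode->prev = PrevNode;
    NewNode->next = CurrentNode;
    PrevNode->next = NewNode;
    CurrentNode->prev = NewNode;
}

// Insert NewNode at the end of the list.
void insertLast(Node *&Head, Node *&Tail, Node *NewNode)
{
    NewNode->next = NULL;
    NewNode->prev = Tail;
    if (Tail != NULL)
        Tail->next = NewNode;  // old last node now has a successor
    else
        Head = NewNode;        // list was empty: NewNode is also the first node
    Tail = NewNode;
}
```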
C. Deletion of a node from DLL
• We need the address of the node to be deleted (i.e. CurrentNode), the address of its
predecessor (i.e. PrevNode) and the address of its successor (i.e. NextNode)
to modify the links such that the node is deleted
• To delete a node from a DLL, we need to do the following:
1) Let CurrentNode, PrevNode and NextNode all be set to Head
2) Traverse the list and search for the node (i.e. data element) to be deleted
3) Let CurrentNode point to the node to be deleted, PrevNode to its previous node
and NextNode to its next node
4) Modify the next link field of PrevNode and the previous link field of NextNode so
that they skip CurrentNode and point to each other, as follows:
PrevNode -> next = CurrentNode -> next;
NextNode -> prev = CurrentNode -> prev;
5) Free the memory allocated for the CurrentNode using the below statement:
delete CurrentNode;
6) Stop
❖ Release the memory allocated for the node that we wanted to delete (i.e.
CurrentNode). This must be done exactly once for each deleted node:
delete CurrentNode;
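The deletion steps can be combined into one C++ function that also handles the first and last nodes, where PrevNode or NextNode does not exist. This is a minimal sketch assuming an int data part and that Head and Tail are both maintained; the function name deleteNode is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Node
{
    int data;
    Node *prev;
    Node *next;
};

// Unlink CurrentNode from the list and release its memory.
void deleteNode(Node *&Head, Node *&Tail, Node *CurrentNode)
{
    assert(CurrentNode != NULL);
    Node *PrevNode = CurrentNode->prev;
    Node *NextNode = CurrentNode->next;

    if (PrevNode != NULL)
        PrevNode->next = NextNode;  // predecessor skips CurrentNode
    else
        Head = NextNode;            // first node deleted: Head moves forward

    if (NextNode != NULL)
        NextNode->prev = PrevNode;  // successor skips CurrentNode
    else
        Tail = PrevNode;            // last node deleted: Tail moves backward

    delete CurrentNode;             // released exactly once
}
```

Because each node stores its own prev link, the caller does not have to track the predecessor during the search, unlike in a singly linked list.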
D. Traversal of a DLL
❖ Tail is a special pointer in Doubly Linked List which points to the last node of the
list. It is used for backward traversal of the list
❖ A DLL can be traversed in both the directions – forward or backward (i.e. starting
from the Head and traversing towards right by using next link of the Current node
or starting from the Tail pointer and traversing towards left by using prev link of
the Current node)
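Forward and backward traversal of a DLL can be sketched as follows. To keep the example easy to check, the visited values are collected into a std::vector instead of being printed, which is a choice made for this sketch only; the function names traverseForward and traverseBackward are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node
{
    int data;
    Node *prev;
    Node *next;
};

// Start from Head and move towards the right using the next link.
std::vector<int> traverseForward(const Node *Head)
{
    std::vector<int> visited;
    for (const Node *Current = Head; Current != NULL; Current = Current->next)
    {
        // In a well-formed DLL, the successor points back to the current node.
        assert(Current->next == NULL || Current->next->prev == Current);
        visited.push_back(Current->data);
    }
    return visited;
}

// Start from Tail and move towards the left using the prev link.
std::vector<int> traverseBackward(const Node *Tail)
{
    std::vector<int> visited;
    for (const Node *Current = Tail; Current != NULL; Current = Current->prev)
        visited.push_back(Current->data);
    return visited;
}
```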
The disadvantage with Linear Linked List (i.e. Singly Linked List) is that we cannot reach any
of the nodes that precede the node to which CurrentNode is pointing (i.e. we cannot traverse
back in the list OR we cannot start back from the first node in the list). To overcome this
disadvantage, doubly linked list and circular linked list are the solutions
In a circular linked list, the last node of the list contains the address of the first node and
forms a circular chain
Depending on the application requirement, Circular Linked List can be implemented as 2
variants: (1) Circular Singly Linked List and (2) Circular Doubly Linked List
In a circular singly linked list, the last node of the list contains the address of the first node
In a circular doubly linked list, the next link of last node points to the first node in the list
and the previous link of first node points to the last node in the list
The operations that we can perform on circular linked lists are creation, insertion, deletion and
traversal
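A circular singly linked list has no NULL link to stop at, so traversal has to stop when it comes back to the first node. The sketch below illustrates creation and traversal under that idea, assuming int data; the names node, info, next, front and rear follow the program fragment in these notes, while append and display are written here only for illustration and are not claimed to match the original program's functions.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct node
{
    int info;
    node *next;
};

// Add a node at the end and keep the list circular (rear->next == front).
void append(node *&front, node *&rear, int value)
{
    node *temp = new node;
    temp->info = value;
    if (front == NULL)
    {
        front = temp;          // first node of the list
    }
    else
    {
        assert(rear != NULL);  // a non-empty list must have a last node
        rear->next = temp;
    }
    rear = temp;
    rear->next = front;        // last node points back to the first node
}

// Print every node exactly once; returns the number of nodes printed.
int display(const node *front)
{
    if (front == NULL)
        return 0;
    int count = 0;
    const node *temp = front;
    do
    {
        std::cout << temp->info << " ";
        count++;
        temp = temp->next;
    } while (temp != front);   // stop after coming back to the first node
    return count;
}
```

A do-while loop is used so that the first node is processed before the "back at front" test is made; a plain while (temp != front) loop would stop immediately.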
Following is the diagrammatic representation of a simple circular linked list:
#include<iostream>
using namespace std;
#include<conio.h>
struct node
{
int info;
struct node *next;
};
if (front == NULL)
{
    cout << "\nList is empty, hence cannot search for an element";
}
else
{
    // Visit every node exactly once, from front up to and including rear
    int position = 0;
    temp = front;
    do
    {
        if (temp->info == data)
        {
            cout << "\n" << temp->info << " found at location " << position+1;
            return;
        }
        temp = temp->next;
        position++;
    } while (temp != front);   // back at front: the whole circle was checked
    cout << "\n" << data << " is not found in the list";
}
while(i)
{
cout << "\nEnter the info for the node " << k <<": ";
clist.create();
i--;
k++;
}
cout << "\nThe list is: ";
clist.display();
clist.del();
cout << "\nThe list after deletion of first element is: ";
clist.display();
cout << "\nEnter the node that you want to search: ";
cin >> value;
clist.search(value);
getch();
return 0;
}
❖ In all searching techniques like linear search, binary search and search trees, the time required to
search an element depends on the total number of elements in that data structure. In all these
search techniques, as the number of elements increases, the time required to search an element
also increases
❖ Hashing is an approach in which the time required to search an element ideally does not depend
on the number of elements. Using a hashing data structure, an element can be searched with constant
time complexity on average (i.e. any element in the data structure can be searched in roughly the
same time), provided that collisions are few. Hashing is an effective way to reduce the number of
comparisons needed to search an element in a data structure
❖ Static Hashing (or simply hashing) is the process of indexing and retrieving an element (i.e.
data) in a data structure to provide a faster way of finding the element using the hash key. Here,
the hash key is a value which provides the index value (i.e. address) where the actual data is likely
to be stored in the data structure
❖ In this data structure, we use a concept called Hash Table to store the data. All the data values
are inserted into the hash table based on the hash key. Hash key is used to map the data with
index (i.e. address) in the hash table. And initially the hash key is generated for every data
using a hash function. That means every entry in the hash table is based on the key value
generated using a hash function
❖ A hash table is an array which stores 'pointer to the data (i.e. actual memory address of the
data)' mapped to a 'given hash key' (i.e. a key/important data value) such that insertion,
deletion and search operations can be performed very quickly with constant time complexity
❖ In Hashing, hash table uses hash function to compute index (i.e. address) of array where a
record will be inserted or searched. Basically, there are 2 main components of hash table:
Hash function and Array (i.e. array of actual data)
❖ A slot (i.e. address) in the hash table is null if no entry (i.e. actual data) has a hash value
equal to the index of that slot
• The phenomenon where the hash function generates the same hash value (i.e. address) for two
different keys is called a collision. A good hash function should keep collisions to a
minimum
• Load Factor
▪ The load factor is a measure of how full (occupied) the hash table is and is
defined as:
Load factor = (number of elements stored in the hash table) / (total number of slots in the hash table)
❖ A hash function is a function which takes a piece of data (i.e. key) as input and gives an integer
value (i.e. hash value or index value) as the output, which identifies where the actual data is
stored in the memory
❖ Basic concept of hashing and hash table is shown in the following diagram:
• A hash function may give the same value (i.e. index) for two or more keys. The situation where
a newly inserted key maps to an already occupied slot in the hash table is called collision or
overflow, and it must be handled using some collision handling technique. This handling is
also called overflow hashing
1) Separate Chaining
▪ The idea of separate chaining is to make each cell of the hash table point to a linked list of
records that have the same hash function value
▪ Example: Let us consider a simple hash function as "key mod 7" and sequence of keys
as 50, 700, 76, 85, 92, 73 and 101. Below is the diagrammatic representation of data
stored by using separate chaining method
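Separate chaining for the example above can be sketched in C++ as follows. This is a minimal sketch assuming non-negative integer keys and that each new key is added at the end of its chain; TABLE_SIZE, hashKey, insertKey and searchKey are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstddef>

const int TABLE_SIZE = 7;

struct Node
{
    int key;
    Node *link;
};

// The hash function from the example: key mod 7.
int hashKey(int key)
{
    assert(key >= 0);  // keys are assumed non-negative
    return key % TABLE_SIZE;
}

// Add 'key' at the end of the chain of its slot.
void insertKey(Node *table[], int key)
{
    Node *n = new Node;
    n->key = key;
    n->link = NULL;
    int index = hashKey(key);
    if (table[index] == NULL)
    {
        table[index] = n;      // first key in this slot
        return;
    }
    Node *curr = table[index];
    while (curr->link != NULL) // walk to the last node of the chain
        curr = curr->link;
    curr->link = n;
}

// Only the chain of the key's own slot has to be searched.
bool searchKey(Node *table[], int key)
{
    for (Node *curr = table[hashKey(key)]; curr != NULL; curr = curr->link)
        if (curr->key == key)
            return true;
    return false;
}
```

With the keys 50, 700, 76, 85, 92, 73 and 101, the hash values are 1, 0, 6, 1, 1, 3 and 3 respectively. So slot 0 holds 700, slot 1 holds the chain 50, 85, 92, slot 3 holds the chain 73, 101, slot 6 holds 76, and the remaining slots stay null.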
2) Open Addressing
▪ In open addressing, all elements are stored in the hash table itself. So, at any point, size of
table must be greater than or equal to total number of keys (NOTE: We can increase the
table size by copying the old data if needed)
▪ Insert(k): Keep probing (i.e. moving) until an empty slot is found. Once an empty slot is
found, insert k
▪ Search(k): Keep probing until slot's key becomes equal to k or an empty slot is reached
a) Linear Probing: Let hash(x) be the slot index (i.e. slot address) computed using the hash
function and S be the table size (i.e. total number of slots). Then:
o If "hash(x) mod S" is full (i.e. already have an element stored), then we try for
"(hash(x)+1) mod S"
o If "(hash(x)+1) mod S" is full, then we try for "(hash(x)+2) mod S"
o If "(hash(x)+2) mod S" is full, then we try for "(hash(x)+3) mod S"
......
Example: Let us consider a simple hash function as "key mod 7" and sequence of
keys as 50, 700, 76, 85, 92, 73 and 101. Then below is the diagrammatic
representation of the key insertion process
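The Insert(k) and Search(k) operations with linear probing can be sketched in C++ as below. This is a minimal sketch assuming non-negative integer keys, a table of S = 7 slots and the value -1 as the marker for an empty slot; EMPTY, hashKey, insertKey and searchKey are hypothetical names.

```cpp
#include <cassert>

const int S = 7;        // table size (number of slots)
const int EMPTY = -1;   // marker for an unused slot (an assumption of this sketch)

int hashKey(int key)
{
    assert(key >= 0);   // keys are assumed non-negative
    return key % S;
}

// Probe hash(x), hash(x)+1, hash(x)+2, ... (mod S) until an empty slot is found.
// Returns the slot used, or -1 if the table is full.
int insertKey(int table[], int key)
{
    for (int i = 0; i < S; i++)
    {
        int slot = (hashKey(key) + i) % S;
        if (table[slot] == EMPTY)
        {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

// Probe the same sequence until the key or an empty slot is reached.
// Returns the slot of the key, or -1 if the key is not in the table.
int searchKey(const int table[], int key)
{
    for (int i = 0; i < S; i++)
    {
        int slot = (hashKey(key) + i) % S;
        if (table[slot] == EMPTY)
            return -1;
        if (table[slot] == key)
            return slot;
    }
    return -1;
}
```

For the keys 50, 700, 76, 85, 92, 73 and 101 inserted in that order, 50, 700 and 76 go to their own slots 1, 0 and 6. Then 85 collides at slot 1 and moves to slot 2, 92 collides at slots 1 and 2 and moves to slot 3, 73 collides at slot 3 and moves to slot 4, and 101 collides at slots 3 and 4 and moves to slot 5.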
b) Quadratic Probing: If hash(x) is the slot index computed using the hash function and S
is the table size (i.e. total number of slots), then below is the process followed in
quadratic probing
o If "hash(x) mod S" is full, then we try for "(hash(x) + 1*1) mod S"
o If "(hash(x) + 1*1) mod S" is full, then we try for "(hash(x) + 2*2) mod S"
o If "(hash(x) + 2*2) mod S" is full, then we try for "(hash(x) + 3*3) mod S"
......
c) Double Hashing: If hash(x) is the slot index computed using the hash function, hash2(x) is
the value of a second hash function and S is the table size (i.e. total number of slots), then
below is the process followed in Double hashing:
o If "hash(x) mod S" is full, then we try for "(hash(x) + 1*hash2(x)) mod S"
o If "(hash(x) + 1*hash2(x)) mod S" is full, then we try for "(hash(x) +
2*hash2(x)) mod S"
o If "(hash(x) + 2*hash2(x)) mod S" is full, then we try for "(hash(x) +
3*hash2(x)) mod S"
......
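The quadratic probing and double hashing sequences can be compared with a small C++ sketch. It assumes non-negative integer keys, S = 7 slots, -1 as the empty marker, and a second hash function hash2(x) = 5 - (x mod 5); this hash2 is a hypothetical choice for illustration only and is not taken from these notes. The names quadraticSlot, doubleHashSlot and insertKey are also hypothetical.

```cpp
#include <cassert>

const int S = 7;        // table size (number of slots)
const int EMPTY = -1;   // marker for an unused slot (an assumption of this sketch)

int hashKey(int key)
{
    assert(key >= 0);   // keys are assumed non-negative
    return key % S;
}

// Hypothetical second hash function; it never returns 0, so the probe always moves.
int hash2(int key)
{
    return 5 - (key % 5);
}

// i-th slot tried by quadratic probing: (hash(x) + i*i) mod S
int quadraticSlot(int key, int i)
{
    return (hashKey(key) + i * i) % S;
}

// i-th slot tried by double hashing: (hash(x) + i*hash2(x)) mod S
int doubleHashSlot(int key, int i)
{
    return (hashKey(key) + i * hash2(key)) % S;
}

// Try at most S probes using the given probe rule.
// Returns the slot used, or -1 if no empty slot was found within S probes.
int insertKey(int table[], int key, int (*slotFor)(int, int))
{
    for (int i = 0; i < S; i++)
    {
        int slot = slotFor(key, i);
        if (table[slot] == EMPTY)
        {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Passing quadraticSlot or doubleHashSlot as the last argument of insertKey selects the probing method. Quadratic probing does not necessarily visit every slot, so an insertion can fail even when the table still has empty slots. With double hashing, a prime table size such as 7 together with a non-zero hash2 value lets the probe sequence reach every slot.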