Ads M Tech Mid 2
A Hash table is defined as a data structure used to insert, look up, and remove key-value pairs
quickly. It operates on the hashing concept, where each key is translated by a hash function into
an index in an array. The index serves as the storage location for the matching value. In
simple words, it maps keys to values.
A hash table’s load factor is the ratio of the number of elements stored in the table to the
table’s size (α = n/m). If the load factor is high, the table becomes crowded, causing more
collisions and longer search times. An ideal load factor can be maintained with a good hash
function and proper table resizing.
A function that translates keys to array indices is known as a hash function. A good hash
function should distribute the keys evenly across the array to reduce collisions and ensure
quick lookup speeds.
Integer universe assumption: The keys are assumed to be integers within a certain
range according to the integer universe assumption. This enables the use of basic hashing
operations like division or multiplication hashing.
Hashing by division: This straightforward technique uses the remainder of the key divided
by the array’s size as the index, i.e. h(k) = k mod m. It performs well when the array size m
is a prime number and the keys are evenly distributed.
Hashing by multiplication: This technique multiplies the key by a constant A between 0 and
1 and takes the fractional part of the result. The index is then obtained by multiplying that
fractional part by the array’s size: h(k) = floor(m · (k·A mod 1)). It also works effectively
when the keys are evenly distributed.
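Both schemes can be sketched in a few lines of Python (the constant A = (√5 − 1)/2 below is a
commonly suggested choice, an assumption rather than something fixed by the notes above):

import math

def division_hash(key, m):
    # Hashing by division: index is the remainder of key / table size.
    return key % m

def multiplication_hash(key, m, A=(math.sqrt(5) - 1) / 2):
    # Hashing by multiplication: take the fractional part of key * A,
    # then scale it by the table size.
    frac = (key * A) % 1
    return int(m * frac)

print(division_hash(58, 11))        # 58 mod 11 = 3
print(multiplication_hash(58, 11))  # an index in the range 0..10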
Selecting a good hash function depends on the properties of the keys and the intended use of
the hash table. It is crucial to use a function that distributes the keys evenly and minimizes
collisions.
To ensure that the number of collisions is kept to a minimum, a good hash function
should distribute the keys throughout the hash table in a uniform manner. This implies
that for any pair of keys, the likelihood of the two keys hashing to the same position in the
table should be roughly the same.
To enable speedy hashing and key retrieval, the hash function should be computationally
efficient.
It ought to be difficult to deduce the key from its hash value, so that attempts to
guess the key from the hash value are unlikely to succeed.
A hash function should be flexible enough to adjust as the data being hashed changes. For
instance, the hash function needs to continue to perform properly if the keys being hashed
change in size or format.
Collisions happen when two or more keys hash to the same array index. Separate chaining and
open addressing (including linear probing, quadratic probing, and double hashing) are common
techniques for resolving collisions.
Open addressing: Collisions are handled by looking for the next empty slot in the
table. If the first slot is already taken, subsequent slots are probed
until an empty one is found. There are various ways to implement this approach, including
linear probing, quadratic probing, and double hashing.
Separate Chaining: In separate chaining, each slot in the hash table holds a linked list of
the entries that hash to that slot. If two keys hash to the same slot, both are stored in that
slot’s linked list. This method is rather simple to implement and can handle many collisions.
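A minimal sketch of separate chaining in Python, assuming integer keys and hashing by division
(the class and method names are illustrative):

class ChainedHashTable:
    def __init__(self, m=11):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one list (chain) per slot

    def _index(self, key):
        return key % self.m  # hashing by division

    def insert(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # update an existing key
                return
        chain.append((key, value))       # otherwise append to the chain

    def lookup(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.insert(36, "a")
t.insert(58, "b")    # 58 mod 11 = 3, same slot as 36: both live in one chain
print(t.lookup(58))  # "b"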
Robin Hood hashing: To reduce the length of probe sequences, collisions in Robin Hood
hashing are resolved by swapping keys. If a new key hashes to an already-occupied slot, the
algorithm compares how far each of the two keys currently is from its ideal slot. If the
existing key is closer to its ideal slot than the new key is to its own, the two are swapped:
the new key takes the slot, and the existing key is reinserted further along. This method
tends to cut down on the average and worst-case probe length.
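A compact sketch of Robin Hood insertion under the same assumptions as above, with integer keys
and h(k) = k mod m (illustrative, not an optimized implementation):

def robin_hood_insert(table, key):
    m = len(table)

    def dist(k, slot):
        # How far slot is from k's ideal slot, scanning forward circularly.
        return (slot - (k % m)) % m

    j = key % m
    d = 0  # current key's distance from its ideal slot
    while table[j] is not None:
        if dist(table[j], j) < d:
            # The resident is closer to home ("richer"): the poorer incoming
            # key steals the slot, and we keep probing to reinsert the resident.
            table[j], key = key, table[j]
            d = dist(key, j)
        j = (j + 1) % m
        d += 1
    table[j] = key

T = [None] * 5
for k in (0, 5, 10, 1):
    robin_hood_insert(T, k)
print(T)  # [0, 5, 10, 1, None]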
Dynamic resizing:
This feature enables the hash table to grow or shrink as the number of elements it contains
changes, keeping the load factor near its ideal value and lookup times fast.
Python, Java, C++, and Ruby are just a few of the programming languages that support hash
tables. Hash tables are frequently provided in the standard library, and they can also be
implemented as a custom data structure.
Three techniques are commonly used to compute the probe sequence required for open
addressing:
1. Linear Probing.
2. Quadratic Probing.
3. Double Hashing.
1. Linear Probing:
Suppose a new record R with key k is to be added to the table T, but the location with the
hash address h = H(k) is already filled.
The natural way to resolve the collision is to assign R to the first available location
following T[h]. We assume that the table T with m locations is circular, so that T[1] comes
after T[m].
Linear probing is simple to implement, but it suffers from an issue known as primary clustering.
Long runs of occupied slots build up, increasing the average search time. Clusters arise because
an empty slot preceded by i full slots gets filled next with probability (i + 1)/m, so long runs
of occupied slots tend to get even longer, and the average search time increases.
Given an ordinary hash function h': U → {0, 1, ..., m-1}, the method of linear probing uses the
hash function
h(k, i) = (h'(k) + i) mod m, for i = 0, 1, ..., m-1.
Given key k, the first slot probed is T[h'(k)]. We next probe slot T[h'(k) + 1], and so on up to
slot T[m-1]. Then we wrap around to slots T[0], T[1], ..., until finally slot T[h'(k) - 1]. Since
the initial probe position determines the entire probe sequence, only m distinct probe sequences
are used with linear probing.
Example: Consider inserting the keys 24, 36, 58, 65, 62, 86 into a hash table of size m = 11
using linear probing, where the auxiliary hash function is h'(k) = k mod m.
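Working through the probe sequence h(k, i) = (h'(k) + i) mod 11:
Insert 24: h'(24) = 24 mod 11 = 2. T[2] is free, so insert key 24 at T[2].
Insert 36: h'(36) = 36 mod 11 = 3. T[3] is free, so insert key 36 at T[3].
Insert 58: h'(58) = 58 mod 11 = 3. T[3] is occupied, so probe T[4]; it is free, so insert key 58 at T[4].
Insert 65: h'(65) = 65 mod 11 = 10. T[10] is free, so insert key 65 at T[10].
Insert 62: h'(62) = 62 mod 11 = 7. T[7] is free, so insert key 62 at T[7].
Insert 86: h'(86) = 86 mod 11 = 9. T[9] is free, so insert key 86 at T[9].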
2. Quadratic Probing:
Suppose a record R with key k has the hash address H(k) = h. Then instead of searching the
locations with addresses h, h+1, h+2, ..., we search the locations with addresses
h(k, i) = (h'(k) + c1·i + c2·i²) mod m
where (as in linear probing) h' is an auxiliary hash function, c1 and c2 ≠ 0 are auxiliary
constants, and i = 0, 1, ..., m-1. The initial position probed is T[h'(k)]; later positions
probed are offset by amounts that depend in a quadratic manner on the probe number i.
Example: Consider inserting the keys 74, 28, 36,58,21,64 into a hash table of size m =11 using
quadratic probing with c1=1 and c2=3. Further consider that the primary hash function is h' (k) =
k mod m.
Insert 74.
h'(74) = 74 mod 11 = 8
h(74, 0) = (8 + 1 x 0 + 3 x 0²) mod 11 = 8 mod 11 = 8
T[8] is free, so insert key 74 at this place.
Insert 28.
h'(28) = 28 mod 11 = 6
h(28, 0) = (6 + 1 x 0 + 3 x 0²) mod 11 = 6 mod 11 = 6
T[6] is free, so insert key 28 at this place.
Insert 36.
h'(36) = 36 mod 11 = 3
h(36, 0) = (3 + 1 x 0 + 3 x 0²) mod 11 = 3 mod 11 = 3
T[3] is free, so insert key 36 at this place.
Insert 58.
h'(58) = 58 mod 11 = 3
h(58, 0) = (3 + 1 x 0 + 3 x 0²) mod 11 = 3 mod 11 = 3
T[3] is not free, so the next probe is
h(58, 1) = (3 + 1 x 1 + 3 x 1²) mod 11 = 7 mod 11 = 7
T[7] is free, so insert key 58 at this place.
Insert 21.
h'(21) = 21 mod 11 = 10
h(21, 0) = (10 + 1 x 0 + 3 x 0²) mod 11 = 10 mod 11 = 10
T[10] is free, so insert key 21 at this place.
Insert 64.
h'(64) = 64 mod 11 = 9
h(64, 0) = (9 + 1 x 0 + 3 x 0²) mod 11 = 9 mod 11 = 9
T[9] is free, so insert key 64 at this place.
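The same probe computation as a short Python sketch (the function name is illustrative):

def quadratic_probe(key, i, m=11, c1=1, c2=3):
    # h(k, i) = (h'(k) + c1*i + c2*i^2) mod m, with h'(k) = k mod m
    return (key % m + c1 * i + c2 * i * i) % m

print([quadratic_probe(58, i) for i in (0, 1)])  # [3, 7] -- T[3] is taken, T[7] is free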
3. Double Hashing:
Double hashing is one of the best techniques available for open addressing because the
permutations produced have many of the characteristics of randomly chosen permutations. It uses
a hash function of the form
h(k, i) = (h1(k) + i·h2(k)) mod m
where h1 and h2 are auxiliary hash functions, for example h1(k) = k mod m and h2(k) = k mod m'.
Here m' is slightly less than m (say m-1 or m-2).
Example: Consider inserting the keys 76, 26, 37,59,21,65 into a hash table of size m = 11 using
double hashing. Consider that the auxiliary hash functions are h1 (k)=k mod 11 and h2(k) = k
mod 9.
1. Insert 76.
h1(76) = 76 mod 11 = 10
h2(76) = 76 mod 9 = 4
h (76, 0) = (10 + 0 x 4) mod 11
= 10 mod 11 = 10
T [10] is free, so insert key 76 at this place.
2. Insert 26.
h1(26) = 26 mod 11 = 4
h2(26) = 26 mod 9 = 8
h (26, 0) = (4 + 0 x 8) mod 11
= 4 mod 11 = 4
T [4] is free, so insert key 26 at this place.
3. Insert 37.
h1(37) = 37 mod 11 = 4
h2(37) = 37 mod 9 = 1
h (37, 0) = (4 + 0 x 1) mod 11 = 4 mod 11 = 4
T [4] is not free, so the next probe is
h (37, 1) = (4 + 1 x 1) mod 11 = 5 mod 11 = 5
T [5] is free, so insert key 37 at this place.
4. Insert 59.
h1(59) = 59 mod 11 = 4
h2(59) = 59 mod 9 = 5
h (59, 0) = (4 + 0 x 5) mod 11 = 4 mod 11 = 4
Since T [4] is not free, the next probe is
h (59, 1) = (4 + 1 x 5) mod 11 = 9 mod 11 = 9
T [9] is free, so insert key 59 at this place.
5. Insert 21.
h1(21) = 21 mod 11 = 10
h2(21) = 21 mod 9 = 3
h (21, 0) = (10 + 0 x 3) mod 11 = 10 mod 11 = 10
T [10] is not free, so the next probe is
h (21, 1) = (10 + 1 x 3) mod 11 = 13 mod 11 = 2
T [2] is free, so insert key 21 at this place.
6. Insert 65.
h1(65) = 65 mod 11 = 10
h2(65) = 65 mod 9 = 2
h (65, 0) = (10 + 0 x 2) mod 11 = 10 mod 11 = 10
T [10] is not free, so the next probe is
h (65, 1) = (10 + 1 x 2) mod 11 = 12 mod 11 = 1
T [1] is free, so insert key 65 at this place.
Thus, after insertion of all the keys, the final hash table is:
Index: 0   1    2    3   4    5    6   7   8   9    10
Key:   -   65   21   -   26   37   -   -   -   59   76
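The same double-hashing insertion as a runnable Python sketch (function names are illustrative;
note that h2(k) must be nonzero for the probe sequence to advance):

def h1(k, m=11):
    return k % m

def h2(k, m_prime=9):
    return k % m_prime  # caution: must be nonzero for the keys used

def double_hash_insert(table, key):
    m = len(table)
    for i in range(m):
        # Probe sequence: h(k, i) = (h1(k) + i * h2(k)) mod m
        j = (h1(key, m) + i * h2(key)) % m
        if table[j] is None:
            table[j] = key
            return j
    raise RuntimeError("hash table is full")

T = [None] * 11
for key in (76, 26, 37, 59, 21, 65):
    double_hash_insert(T, key)
print(T)  # [None, 65, 21, None, 26, 37, None, None, None, 59, 76]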
A Priority Queue is a data structure that allows you to insert elements with a priority, and
retrieve the element with the highest priority.
You can implement a priority queue using either an array or a heap; both implementations have
their own advantages and disadvantages. Arrays are generally easier to implement, but they can
be slower because removing the highest-priority element requires scanning, and keeping the
array sorted requires shifting elements. Heaps are more efficient, but they can be more complex
to implement.
The heap-based implementation involves creating a binary heap and maintaining the heap property
as elements are inserted and removed. In a binary heap, the element with the highest priority is
always the root of the heap. To insert an element, you add it to the end of the heap and then
perform the necessary heap operations (such as swapping the element with its parent) to restore
the heap property. To retrieve the highest-priority element, you simply return the root of the
heap.
To implement a priority queue using a heap, we can use the following steps:
Create a heap data structure (either a max heap or a min-heap)
To insert an element into the priority queue, add the element to the heap using the heap’s
insert function. The heap will automatically rearrange the elements to maintain the heap
property.
To remove the highest priority element (in a max heap) or the lowest priority element (in
a min-heap), use the heap’s remove function. This will remove the root of the tree and
rearrange the remaining elements to maintain the heap property.
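A minimal sketch in Python using the standard-library heapq module, which maintains a binary
min-heap; priorities are negated here so the largest priority sits at the root (the class and
method names are illustrative):

import heapq

class PriorityQueue:
    def __init__(self):
        self._heap = []

    def insert(self, item, priority):
        # heapq keeps a min-heap; negating the priority puts the
        # highest-priority element at the root.
        heapq.heappush(self._heap, (-priority, item))

    def remove_max(self):
        # Pop the root; heapq rearranges the rest to restore the heap property.
        neg_priority, item = heapq.heappop(self._heap)
        return item, -neg_priority

pq = PriorityQueue()
pq.insert("low", 1)
pq.insert("high", 10)
pq.insert("mid", 5)
print(pq.remove_max())  # ('high', 10)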
To implement a priority queue using arrays, we can use the following steps:
To insert an element, append it to the end of the array.
To remove the highest priority element, scan the array for the element with the highest
priority, remove it, and shift the remaining elements up.
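For comparison, a minimal unsorted-array sketch (names illustrative; insertion is O(1), removal
is O(n)):

class ArrayPriorityQueue:
    def __init__(self):
        self._items = []  # unsorted list of (priority, item) pairs

    def insert(self, item, priority):
        self._items.append((priority, item))  # O(1) append

    def remove_max(self):
        # O(n) scan for the highest priority, then removal shifts the rest.
        i = max(range(len(self._items)), key=lambda j: self._items[j][0])
        priority, item = self._items.pop(i)
        return item, priority

pq = ArrayPriorityQueue()
pq.insert("a", 2)
pq.insert("b", 7)
print(pq.remove_max())  # ('b', 7)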
Both arrays and heaps can be used to implement priority queues, but heaps are generally
more efficient because they offer faster insertion and retrieval times. The choice of data
structure will depend on the specific requirements of your application. It is important to
consider the trade-offs between the ease of implementation and the performance of the
data structure when deciding which one to use.
A Binary Search Tree (or BST) is a data structure used in computer science for organizing and
storing data in a sorted manner. Each node in a Binary Search Tree has at most two children, a
left child and a right child, with the left child containing values less than the parent node and the
right child containing values greater than the parent node. This hierarchical structure allows for
efficient searching, insertion, and deletion operations on the data stored in the tree.
Introduction to BST
Applications of BST
Insertion in BST
Searching in BST
Deletion in BST
BST Traversals
Minimum in BST
Maximum in BST
Floor in BST
Ceil in BST
Searching in a BST can be done with the following recursive algorithm:
1. Search (root, item)
2. Step 1 - if (item = root → data) or (root = NULL)
3. return root
4. else if (item < root → data)
5. return Search (root → left, item)
6. else
7. return Search (root → right, item)
8. END if
9. Step 2 - END
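The same algorithm as a runnable Python sketch (the Node class is illustrative and is reused by
the insertion/deletion sketch later in these notes):

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def search(root, item):
    # Found the item, or fell off the tree: return the node (or None).
    if root is None or root.data == item:
        return root
    if item < root.data:
        return search(root.left, item)   # item is smaller: go left
    return search(root.right, item)      # item is larger: go right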
In a binary search tree, we must delete a node in a way that keeps the BST property intact.
When deleting a node from a BST, three possible cases occur:
Case 1: The node to be deleted is a leaf node. This is the simplest case: we replace the leaf
node with NULL and simply free the allocated space. For example, suppose we have to delete node
90. As the node to be deleted is a leaf node, it is replaced with NULL and the allocated space
is freed.
Case 2: The node to be deleted has only one child. In this case, we replace the target node
with its child, and then delete the child node. After replacing the target node with its child
node, the child node contains the value to be deleted, so we simply replace that child node
with NULL and free the allocated space. For example, suppose we have to delete node 79, which
has only one child: it is replaced with its child 55, and the replaced node 79 then becomes a
leaf node that can be easily deleted.
Case 3: The node to be deleted has two children. Here, the node is replaced with its inorder
successor. The inorder successor is required when the right child of the node is not empty; we
obtain it by finding the minimum element in the right subtree of the node. For example, suppose
we have to delete node 45, the root node. As the node to be deleted has two children, it is
replaced with its inorder successor; node 45 then ends up at a leaf of the tree, where it can
be deleted easily.
A new key in a BST is always inserted at a leaf. To insert an element, we start searching from
the root node; if the key to be inserted is less than the root node, we search for an empty
location in the left subtree; else, we search for an empty location in the right subtree and
insert the data there. Insertion in a BST is similar to searching, as we always maintain the
rule that the left subtree is smaller than the root and the right subtree is larger than the
root.
Now, let's see the process of inserting and deleting nodes in a BST with a short code sketch.
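A runnable Python sketch of insertion and deletion, reusing the Node class from the search
sketch above (illustrative, not a production implementation):

def insert(root, key):
    # Walk down to a leaf position, keeping the BST rule:
    # smaller keys go left, larger keys go right.
    if root is None:
        return Node(key)
    if key < root.data:
        root.left = insert(root.left, key)
    elif key > root.data:
        root.right = insert(root.right, key)
    return root

def min_node(root):
    while root.left is not None:
        root = root.left
    return root

def delete(root, key):
    if root is None:
        return None
    if key < root.data:
        root.left = delete(root.left, key)
    elif key > root.data:
        root.right = delete(root.right, key)
    else:
        # Cases 1 and 2: zero or one child.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children - copy in the inorder successor
        # (the minimum of the right subtree), then delete that node.
        succ = min_node(root.right)
        root.data = succ.data
        root.right = delete(root.right, succ.data)
    return root

root = None
for k in (45, 30, 79, 55, 90):
    root = insert(root, k)
root = delete(root, 45)  # two-children case: 45 is replaced by successor 55
print(root.data)         # 55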
The complexity of the Binary Search tree
Let's see the time and space complexity of the Binary search tree. We will see the time
complexity for insertion, deletion, and searching operations in best case, average case, and worst
case.
1. Time Complexity
Operations   Best Case   Average Case   Worst Case
Insertion    O(log n)    O(log n)       O(n)
Deletion     O(log n)    O(log n)       O(n)
Search       O(log n)    O(log n)       O(n)
The worst case occurs when the tree is skewed (each node has only one child), so the tree
degenerates into a linked list.
2. Space Complexity
The space complexity of all operations is O(n), where n is the number of nodes in the tree.
An AVL tree is defined as a self-balancing Binary Search Tree (BST) where the difference
between the heights of the left and right subtrees of any node cannot be more than one.
The difference between the heights of the left subtree and the right subtree of a node is known
as the balance factor of the node: BalanceFactor(node) = height(left subtree) − height(right subtree).
The AVL tree is named after its inventors, Georgy Adelson-Velsky and Evgenii Landis, who
published it in their 1962 paper “An algorithm for the organization of information”.
(Figure: an example AVL tree.)
The tree in the figure is an AVL tree because the difference between the heights of the left
and right subtrees for every node is less than or equal to 1.
Insertion
Deletion
An AVL tree may rotate in one of the following four ways to keep itself balanced:
Left Rotation:
When a node is added into the right subtree of the right subtree, if the tree gets out of balance, we
do a single left rotation.
Right Rotation:
If a node is added to the left subtree of the left subtree and the AVL tree gets out of
balance, we do a single right rotation.
Left-Right Rotation:
A left-right rotation is a combination in which first left rotation takes place after that right
rotation executes.
Right-Left Rotation:
A right-left rotation is a combination in which first right rotation takes place after that left
rotation executes.
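A minimal runnable Python sketch of how these four cases map to rotations (the AVLNode class
and helper names are illustrative; each node stores the height of its subtree):

class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height of the subtree rooted here

def height(node):
    return node.height if node else 0

def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))

def right_rotate(y):
    # Left child x moves up; its right subtree becomes y's left subtree.
    x = y.left
    y.left = x.right
    x.right = y
    update_height(y)
    update_height(x)
    return x

def left_rotate(x):
    # Right child y moves up; its left subtree becomes x's right subtree.
    y = x.right
    x.right = y.left
    y.left = x
    update_height(x)
    update_height(y)
    return y

def rebalance(node):
    bf = height(node.left) - height(node.right)  # balance factor
    if bf > 1:                                   # left-heavy
        if height(node.left.left) < height(node.left.right):
            node.left = left_rotate(node.left)   # Left-Right case
        return right_rotate(node)                # Left-Left case
    if bf < -1:                                  # right-heavy
        if height(node.right.right) < height(node.right.left):
            node.right = right_rotate(node.right)  # Right-Left case
        return left_rotate(node)                 # Right-Right case
    return node

# Example: 30 -> 20 -> 10 is left-heavy; rebalance does a right rotation.
root = AVLNode(30)
root.left = AVLNode(20)
root.left.left = AVLNode(10)
update_height(root.left)
update_height(root)
root = rebalance(root)
print(root.key, root.left.key, root.right.key)  # 20 10 30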
Advantages of AVL trees:
1. AVL trees can self-balance, and therefore provide O(log n) time complexity for search,
insert, and delete.
2. Since the balancing rules are stricter than those of Red-Black Trees, AVL trees in general
have relatively less height, and hence searching is faster.
3. AVL trees are relatively less complex to understand and implement compared to Red-Black
Trees.
Disadvantages of AVL trees:
1. AVL trees are difficult to implement compared to a normal BST, though easier than Red-Black
Trees.
2. They are less used compared to Red-Black Trees. Due to their rather strict balance, AVL
trees have more complicated insertion and removal operations, as more rotations are performed.
Applications of AVL trees:
1. The AVL tree is used as a first example of a self-balancing BST when teaching DSA, as it is
easier to understand and implement compared to a Red-Black Tree.
2. AVL trees suit applications where insertions and deletions are less common but data lookups
are frequent, along with other BST operations like sorted traversal, floor, ceil, min, and max.
3. (By contrast, the Red-Black Tree is more commonly implemented in language libraries, such as
map and set in C++ and TreeMap and TreeSet in Java.)
4. AVL trees can be used in real-time environments where predictable and consistent
performance is required.
Binary search trees are a fundamental data structure, but their performance can suffer if the tree
becomes unbalanced. Red Black Trees are a type of balanced binary search tree that use a set
of rules to maintain balance, ensuring logarithmic time complexity for operations like insertion,
deletion, and searching, regardless of the initial shape of the tree. Red Black Trees are self-
balancing, using a simple color-coding scheme to adjust the tree after each modification.
Red-Black Tree
6. Basic Operations on Red-Black Trees and Rotations
A Red-Black Tree satisfies the following properties:
1. Node Color: Every node is either red or black.
2. Root Property: The root node is always black.
3. Red Property: Red nodes cannot have red children (no two consecutive red nodes on any path).
4. Black Property: Every path from a node to its descendant null nodes (leaves) has the same
number of black nodes.
These properties ensure that the longest path from the root to any leaf is no more than twice as
long as the shortest path, maintaining the tree’s balance and efficient performance.
1. Insertion
2. Search
3. Deletion
4. Rotation
1. Insertion
Inserting a new node in a Red-Black Tree involves a two-step process: performing a standard
binary search tree (BST) insertion, followed by fixing any violations of Red-Black properties.
Insertion Steps
1. BST Insert: Insert the new node as in a standard BST, and color it red.
2. Fix Violations:
o If the parent is red, the tree might violate the Red Property, requiring fixes.
Fixing Violations During Insertion
After inserting the new node as a red node, we might encounter several cases depending on the
colors of the node’s parent and uncle (the sibling of the parent):
Case 1: Uncle is Red: Recolor the parent and uncle to black, and the grandparent to red. Then
move up the tree to check for further violations.
Case 2: Uncle is Black:
o Sub-case 2.1: Node is a right child: Perform a left rotation on the parent.
o Sub-case 2.2: Node is a left child: Perform a right rotation on the grandparent and
recolor appropriately.
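A sketch of the insertion fix-up in the spirit of the standard CLRS algorithm, assuming nodes
carry parent pointers and that the rotations maintain them (all names are illustrative, and the
plain BST insertion that places the red node is omitted):

class RBNode:
    def __init__(self, key):
        self.key = key
        self.color = "red"  # new nodes are inserted red
        self.left = self.right = self.parent = None

def is_red(n):
    return n is not None and n.color == "red"

def left_rotate(root, x):
    # x's right child y moves up; y's left subtree becomes x's right subtree.
    y = x.right
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    return root

def right_rotate(root, x):
    # Mirror image of left_rotate.
    y = x.left
    x.left = y.right
    if y.right is not None:
        y.right.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.right:
        x.parent.right = y
    else:
        x.parent.left = y
    y.right = x
    x.parent = y
    return root

def fix_insert(root, z):
    # Restore the Red-Black properties after inserting the red node z.
    while is_red(z.parent):
        parent, grand = z.parent, z.parent.parent
        if parent is grand.left:
            uncle = grand.right
            if is_red(uncle):                 # Case 1: uncle is red
                parent.color = uncle.color = "black"
                grand.color = "red"
                z = grand                     # move up and re-check
            else:
                if z is parent.right:         # Sub-case 2.1: rotate parent
                    z = parent
                    root = left_rotate(root, z)
                z.parent.color = "black"      # Sub-case 2.2: rotate grandparent
                grand.color = "red"
                root = right_rotate(root, grand)
        else:                                 # mirror: parent is a right child
            uncle = grand.left
            if is_red(uncle):
                parent.color = uncle.color = "black"
                grand.color = "red"
                z = grand
            else:
                if z is parent.left:
                    z = parent
                    root = right_rotate(root, z)
                z.parent.color = "black"
                grand.color = "red"
                root = left_rotate(root, grand)
    root.color = "black"                      # the root is always black
    return root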
2. Searching
Searching for a node in a Red-Black Tree is similar to searching in a standard Binary Search
Tree (BST). The search operation follows a straightforward path from the root to a leaf,
comparing the target value with the current node’s value and moving left or right accordingly.
Search Steps
1. Start at the Root: Begin the search at the root node.
2. Compare:
o If the target value is equal to the current node’s value, the node is found.
o If the target value is less than the current node’s value, move to the left child.
o If the target value is greater than the current node’s value, move to the right child.
3. Repeat: Continue this process until the target value is found or a NIL node is reached (indicating
the value is not present in the tree).
3. Deletion
Deleting a node from a Red-Black Tree also involves a two-step process: performing the BST
deletion, followed by fixing any violations that arise.
Deletion Steps
1. BST Delete: Remove the node as in a standard BST.
2. Fix Double Black:
o If a black node is deleted, a “double black” condition might arise, which requires specific
fixes.
When a black node is deleted, we handle the double black issue based on the sibling’s color and
the colors of its children:
Case 1: Sibling is Red: Rotate the parent and recolor the sibling and parent.
Case 2: Sibling is Black:
o Sub-case 2.1: Sibling’s children are black: Recolor the sibling and propagate the double
black upwards.
o Sub-case 2.2: At least one of the sibling’s children is red:
If the sibling’s far child is red: Perform a rotation on the parent and sibling, and
recolor appropriately.
If the sibling’s near child is red: Rotate the sibling and its child, then handle as
above.
4. Rotation
Rotations are fundamental operations in maintaining the balanced structure of a Red-Black Tree
(RBT). They help to preserve the properties of the tree, ensuring that the longest path from the
root to any leaf is no more than twice the length of the shortest path. Rotations come in two
types: left rotations and right rotations.
1. Left Rotation
A left rotation at node x moves x down to the left and its right child y up to take x’s place.
Before Rotation:
    x
     \
      y
     / \
    a   b
After Rotation:
      y
     / \
    x   b
     \
      a
2. Right Rotation
A right rotation at node x moves x down to the right and its left child y up to take x’s place.
Before Rotation:
      x
     /
    y
   / \
  a   b
After Rotation:
    y
   / \
  a   x
     /
    b
Right Rotation Steps:
1. Set y = x.left.
2. Move y’s right subtree (b) to become x’s left subtree.
3. Make x the right child of y.
4. Update the parent (or the tree root) to point to y instead of x.
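A minimal runnable sketch of both rotations in Python, using the x, y, a, b labels from the
diagrams above (a simple Node class is assumed; parent pointers and color handling are omitted
for brevity):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def left_rotate(x):
    # y moves up; y's left subtree (a) becomes x's right subtree.
    y = x.right
    x.right = y.left   # a
    y.left = x
    return y           # y is the new subtree root

def right_rotate(x):
    # y moves up; y's right subtree (b) becomes x's left subtree.
    y = x.left
    x.left = y.right   # b
    y.right = x
    return y           # y is the new subtree root

# Build the "before" tree from the right-rotation diagram, then rotate:
x = Node("x", left=Node("y", left=Node("a"), right=Node("b")))
root = right_rotate(x)
print(root.key, root.left.key, root.right.key, root.right.left.key)  # y a x b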