DSA Lab 11 Hashing

Uploaded by

amjadrimsha851
Data Structures Lab

Session 11
Course: Data Structures (CL2001) Semester: Fall 2024
Instructor: Alishba Subhani T.A:

• Maintain discipline during the lab.
• Raise your hand if you have any problem.
• Get your lab checked at the end of the session.

HASHING
Hashing is the process of generating a fixed-size output from an input of variable size
using mathematical formulas known as hash functions. The output determines the
index, or location, at which an item is stored in a data structure.

Components Of Hashing:
There are three major components of hashing:
1. Key: A key can be any value, such as a string or an integer, that is fed as input to the
hash function to determine the index or location at which an item is stored in a data
structure.
2. Hash Function: The hash function receives the input key and returns the index of an
element in an array called a hash table. This index is known as the hash index.
3. Hash Table: A hash table is a data structure that maps keys to values using a
hash function. It stores the data associatively in an array, where each value is placed
at the index computed from its key. Two different keys can produce the same index;
this is called a collision.
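
To make the three components concrete, here is a minimal Python sketch. The table size of 10, the function name hash_index, and the sample keys are illustrative assumptions, not part of the lab handout. String keys are reduced to an integer by summing their character codes, which is one simple (and not very uniform) choice.

```python
TABLE_SIZE = 10  # assumed size, chosen only for illustration


def hash_index(key, table_size=TABLE_SIZE):
    """Hash function: map an integer or string key to a hash index."""
    if isinstance(key, str):
        # Reduce a string to an integer by summing its character codes.
        key = sum(ord(ch) for ch in key)
    return key % table_size


# Hash table: an array of slots, all initially empty (None).
table = [None] * TABLE_SIZE

# Keys: one integer and one string, each stored at its hash index.
for key in (42, "dog"):
    table[hash_index(key)] = key

print(hash_index(42))     # 2
print(hash_index("dog"))  # 4  (100 + 111 + 103 = 314, and 314 % 10 = 4)
```

This sketch does not handle collisions; it simply overwrites whatever is already in the slot.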

Inserting In A Hash Table:

1. Choose a Hash Function: The first step is selecting or designing a hash function suitable
for the data and the hash table size. The function should map input keys to indices
within the range of the hash table size and distribute them as uniformly as possible.
H(key) = key % sizeOfHashTable
The hash index must lie within the range of the hash table size, so the key is usually taken
modulo the table size to produce a valid index.

2. Calculate Hash Code: For a given key, apply the hash function to generate the index
at which that key will be inserted in the hash table.

3. Insert Data:
i) Use the index calculated by the hash function.
ii) Check whether the slot at that index in the hash table is empty.
o If it’s empty, place the key at that index.
o If it’s occupied (a collision occurs), resolve the collision using a chosen
method:
▪ Separate Chaining: Add the key-value pair to a linked list at that index.
▪ Open Addressing: Probe for the next available slot based on the probing
technique (e.g., linear, quadratic, or double hashing).
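
The insertion steps above can be sketched as follows. This is a minimal Python illustration that only detects a collision and leaves the resolution to the caller; the function name insert, the table size of 7, and the sample keys are assumptions made for the example.

```python
def insert(table, key):
    """Try to place key at H(key) = key % len(table).

    Returns (index, placed). placed is False when the slot is already
    occupied, i.e. a collision that still has to be resolved.
    """
    index = key % len(table)      # step i: compute the hash index
    if table[index] is None:      # step ii: is the slot empty?
        table[index] = key
        return index, True
    return index, False           # collision: resolution is left to the caller


table = [None] * 7                # assumed table size
print(insert(table, 15))          # (1, True)
print(insert(table, 22))          # (1, False), because 22 % 7 == 15 % 7 == 1
```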

COLLISION RESOLUTION

1. SEPARATE CHAINING: This method turns each slot into a linked list. When a collision
happens at a slot, the new key is added to that slot’s list.
o Time complexity: Searching and deletion take O(n) in the worst case, when every
key lands in the same chain.
o The hash table never fills up, so more elements can always be added to a chain.
o It requires extra space for the links between elements.

Code For Separate Chaining Using Vectors of Vectors:
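
A minimal sketch of separate chaining, written in Python with a list of lists standing in for the vector of vectors named in the heading. The class name ChainedHashTable, its method names, and the sample keys are illustrative assumptions; integer keys and H(key) = key % size are assumed throughout.

```python
class ChainedHashTable:
    """Separate chaining: every slot holds a list (chain) of keys."""

    def __init__(self, size=7):
        self.size = size
        self.buckets = [[] for _ in range(size)]  # one empty chain per slot

    def _index(self, key):
        return key % self.size

    def insert(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:          # ignore duplicate keys
            bucket.append(key)

    def search(self, key):
        return key in self.buckets[self._index(key)]

    def delete(self, key):
        bucket = self.buckets[self._index(key)]
        if key in bucket:
            bucket.remove(key)
            return True
        return False


t = ChainedHashTable(size=5)
for k in (10, 15, 22, 7):
    t.insert(k)
print(t.buckets)      # [[10, 15], [], [22, 7], [], []]
print(t.search(22))   # True
print(t.search(3))    # False
```

Keys 10 and 15 share index 0 and keys 22 and 7 share index 2, so each pair ends up in the same chain.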

2. OPEN ADDRESSING: Open addressing is a collision-resolution technique in which no key is
kept anywhere other than the hash table itself. As a result, the size of the hash table must
always be greater than or equal to the number of keys. (Note that we can increase the table
size by copying the old data into a larger table if needed.) It is also known as closed hashing.

a) Linear probing: In linear probing, if a collision occurs at an index i, the
algorithm checks the next slot (i + 1) % table_size, then (i + 2) % table_size,
and so on until an empty slot is found. This technique often leads to clustering,
where contiguous blocks of filled slots form, which slows down insertions and
searches.
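
The probe sequence described above can be sketched in Python as follows. The function names, the table size of 7, and the sample keys are assumptions for illustration. Deletion is left out on purpose: in open addressing it needs extra care (for example a special "deleted" marker) so that later searches do not stop too early.

```python
def linear_probe_insert(table, key):
    """Insert key, checking (i + 0), (i + 1), (i + 2), ... modulo the table size."""
    size = len(table)
    start = key % size
    for step in range(size):
        slot = (start + step) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def linear_probe_search(table, key):
    """Return the slot holding key, or -1 if key is not in the table."""
    size = len(table)
    start = key % size
    for step in range(size):
        slot = (start + step) % size
        if table[slot] is None:    # an empty slot ends the probe sequence
            return -1
        if table[slot] == key:
            return slot
    return -1


table = [None] * 7
print([linear_probe_insert(table, k) for k in (10, 17, 24)])  # [3, 4, 5]
print(linear_probe_search(table, 24))                          # 5
print(linear_probe_search(table, 31))                          # -1
```

All three keys hash to index 3, so they fill slots 3, 4 and 5 in a row, which is exactly the clustering effect.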

b) Quadratic probing: When a collision occurs at index i, the next slots checked
are
(i + 1^2) % table_size,
(i + 2^2) % table_size,
(i + 3^2) % table_size, and so on.
This spreads out the probe positions and reduces clustering, but the probe
sequence does not necessarily visit every slot. A common safeguard is a prime
table size kept at most half full, which guarantees that an empty slot is found.
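
A minimal Python sketch of this probe sequence. The function name, the table size of 7, and the sample keys are assumptions for illustration; the function gives up and returns -1 if no empty slot turns up within table_size probes.

```python
def quadratic_probe_insert(table, key):
    """Insert key, checking (i + j^2) % size for j = 0, 1, 2, ..."""
    size = len(table)
    home = key % size
    for j in range(size):
        slot = (home + j * j) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    return -1  # no empty slot reached within `size` probes


table = [None] * 7
print([quadratic_probe_insert(table, k) for k in (10, 17, 24, 31)])  # [3, 4, 0, 5]
```

All four keys have home index 3. The probes land on 3, then 3 + 1 = 4, then (3 + 4) % 7 = 0, then (3 + 9) % 7 = 5, so the keys are spread out instead of forming one contiguous block.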

c) Double hashing: Double hashing uses a second hash function to calculate the
step size for probing. When a collision occurs at index i, the j-th probe checks
the slot (i + j * hash2(key)) % table_size, where hash2 is a secondary hash
function and j = 1, 2, 3, ... increments with each probe. hash2 must never
return 0, otherwise the probe would stay at index i. Double hashing generally
provides a good spread across the table and reduces clustering.
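
A minimal Python sketch of double hashing. The secondary hash hash2(key) = 5 - (key % 5), the table size of 7, and the sample keys are assumptions chosen for illustration; any secondary hash that never returns 0 would work with a prime table size.

```python
def hash2(key):
    # Assumed secondary hash: 5 is a prime smaller than the table size,
    # and the result always lies in 1..5, so the step is never 0.
    return 5 - (key % 5)


def double_hash_insert(table, key):
    """Insert key, checking (i + j * hash2(key)) % size for j = 0, 1, 2, ..."""
    size = len(table)
    home = key % size
    step = hash2(key)
    for j in range(size):
        slot = (home + j * step) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    return -1  # table is full


table = [None] * 7
print([double_hash_insert(table, k) for k in (10, 17, 24, 31)])  # [3, 6, 4, 0]
```

The keys all share home index 3, but their step sizes differ (3 for 17, 1 for 24, 4 for 31), so each key follows its own probe path.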

REHASHING
Rehashing is the process of resizing a hash table and reassigning all the elements to new
positions within it. This is done to reduce the load factor, minimize collisions, and improve
the performance of hash operations. In essence, rehashing involves creating a new, larger
hash table and re-inserting each key-value pair from the old table using a new hash function
or the same one adjusted to the new table size.

When is Rehashing Needed?

Rehashing is typically triggered when the load factor of the hash table reaches or exceeds a
certain threshold, usually around 0.7 to 0.75. The load factor is defined as:
Load Factor = number of elements stored / number of slots in the table

A high load factor means there are more elements relative to the number of slots, leading to a
higher probability of collisions and therefore longer search times. Rehashing alleviates this by
expanding the table size and redistributing elements.
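
As a small worked example in Python, assuming the usual definition of the load factor (number of elements stored divided by the number of slots) and a threshold of 0.75. The function names are illustrative assumptions.

```python
def load_factor(num_elements, num_slots):
    """Elements stored divided by the number of slots."""
    return num_elements / num_slots


def needs_rehash(num_elements, num_slots, threshold=0.75):
    """True once the load factor reaches or exceeds the threshold."""
    return load_factor(num_elements, num_slots) >= threshold


print(load_factor(5, 10))    # 0.5
print(needs_rehash(5, 10))   # False
print(needs_rehash(8, 10))   # True, since 8 / 10 = 0.8
```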

Steps for Rehashing

1. Calculate the New Table Size
o Generally, the new size is a prime number roughly double the current size.
A prime table size helps spread keys more evenly when the index is computed
with the modulo operation.
o For instance, if the current table has 10 slots, the new table might have 23 or 29
slots.
2. Create the New Hash Table
o Allocate a new hash table with the updated size.
3. Rehash All Existing Elements
o For each element in the old hash table:
▪ Calculate a new index using the hash function and the updated table size.
▪ Insert the element at this new index in the new table.
o This step can be time-consuming, as every element must be rehashed and
inserted into the new table.
4. Replace the Old Table with the New Table
o Once all elements have been rehashed into the new table, replace the old table
with the new one.
o Update any references so that they point to the new table; the memory of the old
table can then be freed.
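
The four steps can be sketched in Python as follows. This is a minimal illustration that assumes a separate-chaining table stored as a list of lists, integer keys, and H(key) = key % size; the helper names is_prime, next_prime, and rehash are assumptions for the example.

```python
def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def rehash(old_buckets):
    new_size = next_prime(2 * len(old_buckets))    # step 1: new prime size
    new_buckets = [[] for _ in range(new_size)]    # step 2: new table
    for bucket in old_buckets:                     # step 3: re-insert every key
        for key in bucket:
            new_buckets[key % new_size].append(key)
    return new_buckets                             # step 4: caller swaps tables


old = [[], [], [12, 22, 32], [], []]   # size 5: all three keys collide at index 2
new = rehash(old)
print(len(new))                        # 11
print(new[1], new[0], new[10])         # [12] [22] [32]
```

After rehashing into 11 slots, the three keys that shared one chain each get their own slot.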

EXERCISES:
1. Design a library catalog system where each book is assigned a unique ID. To store and retrieve
book information efficiently, the system uses a hash table. However, due to limited storage
slots, books by the same author map to the same index. Create a mechanism to handle
these colliding book IDs effectively.
Each book ID is a 3-digit number: the first two digits represent the book’s author and the last
digit identifies the book for that author.
a. Create a hash table of size 10 and insert 9 records (3 for author A, 2 for author B, 4 for
author C).
b. Search for 2 of the inserted books and 1 book that is not in the table.
c. Delete the 2 books found in part b.
Display the hash table after each operation.

2. A fitness club stores its member IDs in a fixed-size table for quick access. Each unique member
ID is mapped to a position in the table using a hash function. Due to limited storage, slots in the
table cannot be left unused for long, so if a position is occupied, the system must look for the
next available slot for the new member ID.
a. Create a hash table of size 7 and insert member IDs 10 - 60.
b. Search for member IDs: 30, 50, 70.
c. Delete member IDs: 20 and 40. Insert additional member IDs: 70, 80 to show how the
deleted slots are reused.
Display the hash table after each operation.

3. A university uses an academic portal which has limited storage, so it adjusts its storage capacity
dynamically when the number of student IDs exceeds a certain threshold. However, these
unique IDs need to be strategically placed to minimize search times while ensuring that all slots
are accessible.
a. Create a hash table with an initial size of 7 and a load factor threshold of 0.75. Insert
student IDs: 12, 22, 32, 42, 52, 62.
b. Search for student IDs: 22, 42, 72.
c. Insert additional IDs: 72, 82 to exceed the load factor threshold. Use a new hash
function based on the resized table size.
Display the hash table after each operation.

4. A banking system is designed to store customer account numbers in a hash table. To ensure
data security and efficiency, the system uses an additional mathematical formula to decide the
next slot when collisions occur. This method ensures that even closely related account numbers
do not lead to clusters of occupied slots.
a. Create a hash table of size 11 to store customer account numbers. Use primary_hash =
ID % table_size for the initial position and secondary_hash = 7 - (ID % 7) for the step
size. Insert the following account numbers: 101, 111, 121, 131, 141, 151.
b. Search for account numbers: 111, 141, 161.
c. Delete account numbers: 111 and 131. Insert additional account numbers: 161 and 171
to demonstrate how the secondary formula resolves collisions while avoiding clustering.
Display the hash table after each operation.