GROUP 15.Pptx Presentation

The document provides an overview of hash tables, describing their structure, characteristics, and operations such as insertion, searching, and deletion. It discusses collision resolution techniques, including chaining and open addressing, and emphasizes the importance of a good hash function for performance. Additionally, it outlines the advantages and disadvantages of hash tables, along with code implementations for handling collisions.

Uploaded by

Christine

GROUP 15: HASH TABLES

Members:
1. Alex N Moyo N02425688J
2. Simbarashe Bope N02423343F
3. Russell Mashinya N02424758M
4. Ashley Mufundisi N02420358W
5. Nothando Sithole N02428220W
6. Takunda Mushambi N02427662R
7. Needmore A Muzenda N02420804Q
8. Blessing Hoto N02423763T
9. Elbethlem Moyo N02421199Y
10. Yeukai Kubiku N02422680F
11. Tadiwanashe Maradze N02425592P
HASH TABLES
• Introduction to hash tables
• A hash table is a dynamic data structure that implements
an associative array, which is a structure that maps
keys to values.
• It is designed to store data in a way that allows for
efficient retrieval
• A hash table stores data in an associative manner. The data is held in an array, where each data value has its own unique index. Access becomes very fast if we know the index of the desired data.
• Thus, insertion and search operations are very fast on average, largely irrespective of the size of the data. A hash table uses an array as its storage medium and uses a hashing technique to generate the index at which an element is to be inserted or located.
• Hashing is a technique that converts a range of key values into a range of indexes of an array. We use the modulo operator to map each key value to an index. Consider an example hash table of size 20, in which items are stored in (key, value) format.

- Hashing by division: This straightforward technique uses the remainder left after dividing the key by the array's size as the index. It performs well when the array size is a prime number and the keys are evenly spaced out.

- Hashing by multiplication: This technique multiplies the key by a constant between 0 and 1 and takes the fractional part of the result. The index is then obtained by multiplying that fractional part by the array's size and rounding down. It also works well when the keys are evenly distributed.
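The two techniques above can be sketched in C roughly as follows. This is only an illustration: the function names and the constant 0.6180339887 (a commonly suggested value, approximately (sqrt(5) - 1) / 2) are assumptions and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Division method: the remainder of key / table_size is the index. */
size_t hash_division(unsigned long key, size_t table_size)
{
    assert(table_size > 0);
    return (size_t)(key % table_size);
}

/* Multiplication method: take the fractional part of key * A, where
   0 < A < 1, then scale that fraction by the table size. */
size_t hash_multiplication(unsigned long key, size_t table_size)
{
    assert(table_size > 0);
    const double A = 0.6180339887;  /* assumed constant, not from the slides */
    double product = (double)key * A;
    double fraction = product - (double)(unsigned long long)product;
    size_t index = (size_t)(fraction * (double)table_size);
    if (index >= table_size)        /* defensive clamp against rounding */
        index = table_size - 1;
    return index;
}
```

For a table of size 20, `hash_division(42, 20)` gives 2, and `hash_multiplication(1, 20)` gives 12 because 0.618... multiplied by 20 is about 12.36.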
Characteristics of hash tables
Key-Value Pair Storage:
• Hash tables store data as pairs, where each key is unique
and maps to a specific value.
Fast Access Time:
• Average-case time complexity for search, insert, and delete
operations is O(1), making hash tables very efficient.
Collision Handling:
• Mechanisms (like chaining or open addressing) are
implemented to manage situations where multiple keys hash
to the same index.
Dynamic Resizing:
• Hash tables can automatically resize when the load factor
exceeds a certain threshold, usually by doubling the array
size and rehashing existing entries.
Hash Function Dependency:
• The efficiency of a hash table heavily depends on the
quality of the hash function, which should distribute keys
uniformly across the array.
Load Factor:
• This is a measure of how full the hash table is. It
influences performance and helps in deciding when to
resize.
Memory Usage:
• Hash tables can have a higher memory overhead due to
unused slots, especially when the load factor is low.
Order of Elements:
• Unlike ordered data structures (like arrays or linked lists),
hash tables do not maintain the order of elements. The order
of retrieval may not match the order of insertion.
Data Types:
• Hash tables can store various data types as values, allowing
for flexible data management.
Concurrency:
• Some implementations of hash tables support concurrent access.
Choosing a hash function:
The choice of hash function depends on the properties of the keys and the intended use of the hash table. It is crucial to use a function that distributes the keys evenly and reduces collisions.
Criteria based on which a hash function is chosen:
To keep the number of collisions to a minimum, a good hash function should:

- Distribute the keys throughout the hash table in a uniform manner. This implies that, for any pair of keys, the likelihood of the two keys hashing to the same position in the table should be roughly the same.

- Be computationally efficient, to enable speedy hashing and key retrieval.

- Make it difficult to deduce the key from its hash value, so that attempts to guess the key using the hash value are less likely to succeed. This matters mainly where security is a concern.
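As one possible illustration of the first two criteria, the sketch below uses the 32-bit FNV-1a string hash, which spreads keys reasonably well and needs only one XOR and one multiply per byte. FNV-1a is a swapped-in example rather than the function used in the slides, it is not a cryptographic hash, and the name `bucket_index` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime. */
uint32_t fnv1a_hash(const char *key)
{
    assert(key != NULL);
    uint32_t hash = 2166136261u;                 /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)key; *p != '\0'; ++p) {
        hash ^= *p;
        hash *= 16777619u;                       /* FNV prime */
    }
    return hash;
}

/* Reduces the 32-bit hash to an index in the range [0, table_size). */
size_t bucket_index(const char *key, size_t table_size)
{
    assert(table_size > 0);
    return (size_t)(fnv1a_hash(key) % table_size);
}
```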
HOW THEY WORK

• A hash table consists of keys, which are the unique identifiers of data values, and indices, which are the positions of the data values in the array.
• The formula for the index is: the sum of the values in the key (for example, its character codes) modulo the size of the array.
• However, this can result in two different keys hashing to the same index, which is called a collision.
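The formula and the resulting collision can be shown with a short sketch. It assumes the keys are strings and that "sum of key values" means the sum of the character codes; the function name `sum_hash` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index = (sum of the key's character codes) mod table_size. */
size_t sum_hash(const char *key, size_t table_size)
{
    assert(key != NULL && table_size > 0);
    unsigned long sum = 0;
    for (const unsigned char *p = (const unsigned char *)key; *p != '\0'; ++p)
        sum += *p;
    return (size_t)(sum % table_size);
}
```

With a table of size 20 and ASCII character codes, the keys "ab" and "ba" both sum to 97 + 98 = 195, so both hash to index 15. Any two keys made of the same characters in a different order collide under this formula.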
• ALGORITHM
• 1. Initialization
• Array Creation: A hash table starts with an array (or list) of a
fixed size, often called the "bucket array." Each position in the
array can hold one or more entries (key-value pairs).
• 2. Hash Function
• Computing the Index: When a key is added, the hash function
takes the key as input and computes an index, which
determines where the corresponding value will be stored in
the array.
• Good Hash Function: A well-designed hash function provides
a uniform distribution of keys across the array, minimizing
collisions.
• 3. Insertion Process
• Calculate Hash: The hash function generates an index for the
given key.
• Store Value:
• If the index is empty, the key-value pair is stored there.
• If a collision occurs (another key is already stored at that
index), the hash table will use a collision resolution strategy:
• Chaining: The index points to a linked list (or another
structure) that holds all entries with colliding keys.
• Open Addressing: The algorithm looks for the next available
index according to a probing sequence (e.g., linear or
quadratic probing).
• 4. Searching for a Value
• Calculate Hash: To retrieve a value, the hash function
computes the index for the key.
• Check Index:
• If the key is found at the index, the associated value is
returned.
• If not found, in the case of chaining, the linked list at that index
is searched. In open addressing, the probing sequence is
followed to locate the key.
• 5. Deleting a Key-Value Pair
• Calculate Hash: Just like searching, the hash function
computes the index for the key.
• Remove Entry:
• If the key is found, it is removed from the array or the linked list
(if using chaining).
• In open addressing, the slot can be marked as deleted or a
special marker can be used to indicate that the slot was once
occupied.
• 6. Dynamic Resizing
• Load Factor: The load factor (number of entries divided by the
array size) is monitored. When it exceeds a certain threshold
(commonly 0.7), the hash table is resized.
• Rehashing: A new, larger array is created, and the existing key-
value pairs are rehashed and redistributed into the new array to
maintain efficient access.
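The resizing step can be sketched as below, under several assumptions: integer keys and values, linear probing, no deletion, a threshold of 0.7 and a doubling growth policy. All type and function names (`GrowTable`, `grow_insert` and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct { int key; int value; bool used; } Cell;
typedef struct { Cell *cells; size_t capacity; size_t count; } GrowTable;

bool grow_init(GrowTable *t, size_t capacity)
{
    assert(t != NULL && capacity > 0);
    t->cells = calloc(capacity, sizeof *t->cells);  /* zeroed, so used == false */
    t->capacity = capacity;
    t->count = 0;
    return t->cells != NULL;
}

/* Stores key/value by linear probing; the caller guarantees a free cell. */
static void place(Cell *cells, size_t capacity, int key, int value, size_t *count)
{
    size_t i = (unsigned int)key % capacity;
    while (cells[i].used && cells[i].key != key)
        i = (i + 1) % capacity;
    if (!cells[i].used) {
        cells[i].used = true;
        cells[i].key = key;
        ++*count;
    }
    cells[i].value = value;
}

/* Rehashing: allocate an array twice as large and re-insert every entry. */
static bool grow_resize(GrowTable *t)
{
    size_t new_capacity = t->capacity * 2;
    Cell *new_cells = calloc(new_capacity, sizeof *new_cells);
    if (new_cells == NULL)
        return false;
    size_t moved = 0;
    for (size_t i = 0; i < t->capacity; ++i)
        if (t->cells[i].used)
            place(new_cells, new_capacity, t->cells[i].key, t->cells[i].value, &moved);
    free(t->cells);
    t->cells = new_cells;
    t->capacity = new_capacity;
    return true;
}

bool grow_insert(GrowTable *t, int key, int value)
{
    /* Resize first if one more entry would push the load factor above 0.7. */
    if ((double)(t->count + 1) / (double)t->capacity > 0.7 && !grow_resize(t))
        return false;
    place(t->cells, t->capacity, key, value, &t->count);
    return true;
}

bool grow_get(const GrowTable *t, int key, int *value_out)
{
    size_t i = (unsigned int)key % t->capacity;
    while (t->cells[i].used) {          /* terminates: the table is never full */
        if (t->cells[i].key == key) {
            *value_out = t->cells[i].value;
            return true;
        }
        i = (i + 1) % t->capacity;
    }
    return false;
}
```

Starting from a capacity of 4, inserting ten distinct keys triggers two resizes (to 8, then to 16), which keeps the load factor at or below 0.7 throughout.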
• What is Load factor?
• A hash table's load factor is the ratio of the number of
elements stored to the size of the table. If the load factor
is high, the table becomes crowded, which leads to more
collisions and longer search times. An ideal load factor
can be maintained with the use of a good hash function
and proper table resizing.
• Integer universe assumption: The keys are assumed to
be integers within a certain range according to the integer
universe assumption. This enables the use of basic
hashing operations like division or multiplication hashing.
COLLISION RESOLUTION
• Collisions can be resolved in two ways: open addressing
and closed addressing.
Buckets
• One obvious option is to reserve a two-dimensional array
from the start. We can think of each column as a bucket into
which we throw all the elements for which the hash function
gives a particular result.
• The disadvantage of this approach is that it has to reserve
quite a bit more space than will eventually be required,
since it must take into account the likely maximal number
of collisions.
• Collisions become likely even while the table is still
quite empty overall.
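A minimal sketch of the bucket approach, assuming integer keys, ten columns and four reserved slots per column. These sizes and all names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_COLUMNS 10   /* one column (bucket) per hash value; assumed */
#define BUCKET_DEPTH 4   /* slots reserved in every bucket; assumed */

typedef struct {
    int keys[BUCKET_DEPTH][NUM_COLUMNS];  /* keys[row][column] */
    size_t used[NUM_COLUMNS];             /* filled rows per column */
} BucketTable;

void bucket_init(BucketTable *t)
{
    assert(t != NULL);
    for (size_t c = 0; c < NUM_COLUMNS; ++c)
        t->used[c] = 0;
}

/* Returns false when the bucket for this key has no reserved slot left. */
bool bucket_insert(BucketTable *t, int key)
{
    size_t c = (unsigned int)key % NUM_COLUMNS;
    if (t->used[c] == BUCKET_DEPTH)
        return false;
    t->keys[t->used[c]][c] = key;
    t->used[c]++;
    return true;
}

bool bucket_contains(const BucketTable *t, int key)
{
    size_t c = (unsigned int)key % NUM_COLUMNS;
    for (size_t r = 0; r < t->used[c]; ++r)
        if (t->keys[r][c] == key)
            return true;
    return false;
}
```

The table reserves 40 slots up front, yet a fifth key that lands in the same column is rejected even if the other nine columns are empty, which is the space problem described above.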

Closed addressing

Direct Chaining
• Rather than reserving entire sub-arrays (the columns
above) for keys that collide, one can instead create a
linked list for the set of entries that hash to each index.
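Direct chaining might look like the following sketch, which assumes integer keys and values, a fixed array of eight list heads, and insertion at the head of each list. The names `ChainedTable`, `chained_insert` and `chained_search` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_BUCKETS 8   /* assumed fixed size for this sketch */

typedef struct Entry {
    int key;
    int value;
    struct Entry *next;
} Entry;

typedef struct {
    Entry *buckets[NUM_BUCKETS];   /* each element is the head of a linked list */
} ChainedTable;

static size_t index_for(int key)
{
    return (size_t)((unsigned int)key % NUM_BUCKETS);
}

void chained_init(ChainedTable *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < NUM_BUCKETS; ++i)
        t->buckets[i] = NULL;
}

/* Updates the value if the key exists, otherwise prepends a new entry. */
bool chained_insert(ChainedTable *t, int key, int value)
{
    size_t i = index_for(key);
    for (Entry *e = t->buckets[i]; e != NULL; e = e->next) {
        if (e->key == key) {
            e->value = value;
            return true;
        }
    }
    Entry *e = malloc(sizeof *e);
    if (e == NULL)
        return false;
    e->key = key;
    e->value = value;
    e->next = t->buckets[i];
    t->buckets[i] = e;
    return true;
}

/* Walks only the list at the key's index. */
bool chained_search(const ChainedTable *t, int key, int *value_out)
{
    for (const Entry *e = t->buckets[index_for(key)]; e != NULL; e = e->next) {
        if (e->key == key) {
            *value_out = e->value;
            return true;
        }
    }
    return false;
}
```

Keys 1 and 9 both map to index 1 here, so they end up in the same list and are told apart by comparing the stored keys.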
Open addressing
Linear probing
• Linear probing involves trying to place the key into the
next slot whenever the calculated index (address) is occupied,
repeating until an open slot is found.
• Some formulations probe downward instead: for example, from an
occupied index 4, linear probing reduces the index by one to 3
and places the key there if that location is empty.
• However, this results in primary clustering.
• Primary clustering refers to the bunching of keys together
inside the array while large proportions of it remain
unoccupied.
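A minimal sketch of linear probing, assuming forward probing (index + 1, wrapping around), integer keys, a fixed table of eight slots, and a "deleted" marker so that removing a key does not cut a probe sequence short. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8   /* assumed fixed size for this sketch */

typedef enum { SLOT_EMPTY, SLOT_OCCUPIED, SLOT_DELETED } SlotState;
typedef struct { int key; int value; SlotState state; } Slot;
typedef struct { Slot slots[TABLE_SIZE]; } ProbedTable;

void probed_init(ProbedTable *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < TABLE_SIZE; ++i)
        t->slots[i].state = SLOT_EMPTY;
}

static size_t home_index(int key)
{
    return (size_t)((unsigned int)key % TABLE_SIZE);
}

/* Returns the slot index holding key, or -1 if the key is absent. */
int probed_find(const ProbedTable *t, int key)
{
    size_t start = home_index(key);
    for (size_t n = 0; n < TABLE_SIZE; ++n) {
        size_t i = (start + n) % TABLE_SIZE;
        if (t->slots[i].state == SLOT_EMPTY)
            return -1;                     /* a never-used slot ends the probe */
        if (t->slots[i].state == SLOT_OCCUPIED && t->slots[i].key == key)
            return (int)i;
        /* SLOT_DELETED or a different key: keep probing */
    }
    return -1;
}

/* Returns false only when the table is full. */
bool probed_insert(ProbedTable *t, int key, int value)
{
    int found = probed_find(t, key);
    if (found >= 0) {
        t->slots[found].value = value;
        return true;
    }
    size_t start = home_index(key);
    for (size_t n = 0; n < TABLE_SIZE; ++n) {
        size_t i = (start + n) % TABLE_SIZE;
        if (t->slots[i].state != SLOT_OCCUPIED) {   /* empty or deleted slot */
            t->slots[i].key = key;
            t->slots[i].value = value;
            t->slots[i].state = SLOT_OCCUPIED;
            return true;
        }
    }
    return false;
}

bool probed_delete(ProbedTable *t, int key)
{
    int i = probed_find(t, key);
    if (i < 0)
        return false;
    t->slots[i].state = SLOT_DELETED;      /* marker, so later probes continue */
    return true;
}
```

Keys 1 and 9 share home index 1, so 9 is placed in slot 2. If slot 1 were simply emptied when key 1 is deleted, a later search for 9 would stop at slot 1 and miss it, which is why the marker is used.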
+3 hash
• This involves looking at every third slot along until a free
index is found.
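The fixed-step idea can be sketched with one function that takes the step as a parameter, so a step of 1 gives linear probing and a step of 3 gives the +3 hash. The occupancy array, the function name and the table size of 11 in the usage note are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Starting at home, examines every step-th slot (wrapping around) and
   returns the first free index, or table_size if none is found within
   table_size attempts. */
size_t find_free_slot(const bool *occupied, size_t table_size,
                      size_t home, size_t step)
{
    assert(occupied != NULL && table_size > 0 && step > 0);
    size_t index = home % table_size;
    for (size_t attempt = 0; attempt < table_size; ++attempt) {
        if (!occupied[index])
            return index;
        index = (index + step) % table_size;
    }
    return table_size;
}
```

With 11 slots of which 4 and 5 are occupied, a key whose home slot is 4 lands in slot 6 with a step of 1 and in slot 7 with a step of 3. A step of 3 only reaches every slot when the table size is not a multiple of 3, which is one reason prime table sizes are convenient.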

Double hashing

This applies a second hash function to the key when a
collision occurs. The result of the second hash function
gives the step size, that is, the number of positions to
move along from the point of the original collision.
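A sketch of double hashing under stated assumptions: integer keys, a primary hash of key mod table size, and a second hash of 1 + (key mod (table size - 1)), chosen so that the step is never zero. Both hash functions and all names are illustrative choices, not taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

size_t primary_hash(unsigned int key, size_t table_size)
{
    assert(table_size > 0);
    return (size_t)(key % table_size);
}

/* Second hash: a step in the range 1 .. table_size - 1, never 0. */
size_t secondary_hash(unsigned int key, size_t table_size)
{
    assert(table_size > 1);
    return (size_t)(1 + key % (table_size - 1));
}

/* Returns the first free index on the key's probe sequence, or
   table_size if none is found within table_size attempts. */
size_t double_hash_probe(const bool *occupied, size_t table_size, unsigned int key)
{
    size_t index = primary_hash(key, table_size);
    size_t step = secondary_hash(key, table_size);
    for (size_t attempt = 0; attempt < table_size; ++attempt) {
        if (!occupied[index])
            return index;
        index = (index + step) % table_size;
    }
    return table_size;
}
```

In a table of size 11 with slot 4 occupied, keys 15 and 26 both start at index 4 but use steps of 6 and 8, so they move on to slots 10 and 1 respectively instead of following the same path. This is how double hashing reduces clustering.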
Features: Chaining vs. Open Addressing
Collision Handling:
• Chaining: Uses linked lists (or other structures) at each index to store multiple entries.
• Open Addressing: Looks for the next available slot in the array when a collision occurs.
Memory Usage:
• Chaining: May use more memory due to the overhead of storing pointers in linked lists.
• Open Addressing: Generally more memory-efficient as it uses a single array; however, resizing can lead to wasted space.
Performance:
• Chaining: Performance can degrade when many collisions occur; average-case O(1), worst-case O(n).
• Open Addressing: Average-case O(1) but can degrade to O(n) if the load factor is high and many probes are needed.
Resizing:
• Chaining: Resizing is straightforward; simply create a new larger array and rehash all entries.
• Open Addressing: Resizing is more complex; requires rehashing all entries and may require careful handling of deleted slots.
Insertion:
• Chaining: New entries can be added directly to the linked list at the computed index.
• Open Addressing: Requires probing to find the next available slot, which can take longer with higher load factors.
Searching:
• Chaining: Can quickly traverse the linked list at an index but may need to check multiple entries.
• Open Addressing: Requires probing through the array, which may take longer as the number of entries increases.
Deletion:
• Chaining: Can easily remove entries from the linked list but may have to adjust pointers.
• Open Addressing: Deletion can be more complex, as it involves marking slots as deleted and ensuring proper probing.
Implementation Complexity:
• Chaining: Relatively straightforward to implement; managing linked lists adds some complexity.
• Open Addressing: More complex due to the need for probing and handling of deleted slots.
Advantages and Disadvantages of hash tables
Advantages:
• Fast Access: Average-case O(1) time complexity for search, insert, and delete operations.
• Dynamic Resizing: Can grow in size to accommodate more data without significant performance loss.
• Flexible Data Storage: Can store various data types and structures, making them versatile.
• Constant Time Complexity: Operations can be performed in constant time under ideal conditions.
• Ease of Implementation: Many programming languages have built-in implementations, simplifying development.
• Efficient Memory Use: Can efficiently use memory when well-managed.
Disadvantages:
• Collision Handling: Requires mechanisms to manage collisions (e.g., chaining, open addressing).
• Non-Ordered: Does not maintain the order of elements; retrieval is not in insertion order.
• Memory Overhead: May use more memory, especially with chaining due to pointers.
• Load Factor Management: High load factors can lead to increased collisions and longer search times.
• Complexity of Hash Function: A poorly designed hash function can lead to many collisions, degrading performance.
• Deletion Complexity: Deleting items can be complicated, especially in open addressing schemes.
Code implementations
Creating a hash table
Code on how to handle collisions in C/C++

    #include <stdlib.h>

    // Assumption: Ht_item is not defined on the slide; a pair of
    // strings (key and value) is assumed here.
    typedef struct Ht_item {
        char* key;
        char* value;
    } Ht_item;

    // Defines the LinkedList.
    typedef struct LinkedList {
        Ht_item* item;
        struct LinkedList* next;
    } LinkedList;

    LinkedList* allocate_list()
    {
        // Allocates memory for one LinkedList node.
        LinkedList* list = (LinkedList*) malloc(sizeof(LinkedList));
        return list;
    }

    LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item)
    {
        // Inserts the item at the end of the LinkedList and returns the head.
        LinkedList* node = allocate_list();
        if (!node)
            return list;    // Allocation failed: the list is left unchanged.
        node->item = item;
        node->next = NULL;

        if (!list)
            return node;    // Empty list: the new node becomes the head.

        // The slide is cut off after the one-node case; walking to the
        // tail below is an assumed completion that covers every length.
        LinkedList* temp = list;
        while (temp->next)
            temp = temp->next;
        temp->next = node;
        return list;
    }
handling overflow using linked lists
References
• Tutorialspoint
• YouTube
• Search Poe
• https://www.digitalocean.com/community/tutorials/hash-table-in-c-plus-plus
• https://www.geeksforgeeks.org/implementation-of-hash-table-in-c-using-separate-chaining/
