
UNIT 1- Hashing

The document provides an overview of hashing, including concepts such as hash tables, hash functions, and collision resolution strategies. It discusses the advantages and disadvantages of hash tables, operations supported, and various types of hash functions. Additionally, it covers applications of hash tables and techniques for handling collisions, including open and closed hashing.

Uploaded by

sharma9103867592
Copyright
© © All Rights Reserved

“HASHING”

CLASS : SE COMPUTER - A
SUBJECT : DSA (SEM-II)

UNIT-I
SYLLABUS
Hash Table: Concepts-hash table, hash function, basic operations,
bucket, collision, probe, synonym, overflow, open hashing, closed
hashing, perfect hash function, load density, full table, load factor,
rehashing, issues in hashing
Hash Functions: properties of good hash function, division,
multiplication, extraction, mid-square, folding and universal
Collision Resolution Strategies: open addressing and chaining,
Hash table overflow- open addressing and chaining, extendible
hashing, closed addressing and separate chaining.
Skip List: representation, searching and operations- insertion,
removal
UNIT-I
HASHING
• Hashing is a searching technique that takes constant time on average: the expected time complexity of a lookup is O(1). Until now, we have studied two searching techniques, linear search and binary search.
• The worst-case time complexity is O(n) for linear search and O(log n) for binary search. In both techniques the search time depends on the number of elements, but we want a technique whose search time does not. Hashing provides this.
• In Hashing technique, the hash table and hash function are used.
Using the hash function, we can calculate the address at which the
value can be stored.
INTRODUCTION
1. Hashing is finding an address where the data is
to be stored as well as located using a key with
the help of the algorithmic function.
2. Hashing is a method of directly computing the
address of the record with the help of a key by
using a suitable mathematical function called
the hash function
3. A hash table is an array-based structure used to
store <key, information> pairs
HASHING/HASH
FUNCTION
• The main idea behind the hashing is to create the (key/value)
pairs. If the key is given, then the algorithm computes the index
at which the value would be stored. It can be written as:
• Index = hash(key)
INTRODUCTION
3. Hash Table:
• A hash table is an array-based structure used to store
<key, value> pairs.
• A Hash table is a data structure that stores some
information, and the information has basically two main
components, i.e., key and value.
• A Hash table can be used for quick insertion, searching
and retrieval of data.
• A hash function is applied to the key of the record
being stored, returning an index within the range of the
hash table.
• The resulting address is used as the basis for storing and
retrieving records and this address is called as home
address of the record
HASH TABLE
• The item is then stored in the table at that index position.
• A hash table is one of the most important data structures. It uses a special function, known as a hash function, that maps a given key to an index so that elements can be accessed faster.
• An associative array can be implemented with the help of a hash table.
• The efficiency of the mapping depends upon the efficiency of the hash function used.
HASH
TABLE

Fig 2:Hash Table


ADVANTAGES OF HASH
TABLE
Here are the benefits of using hash tables:
1. Hash tables have high performance when looking up data, inserting new values, and deleting existing values.
2. The average time complexity of these operations is constant, O(1), regardless of the number of items in the table, provided the hash function spreads the keys well.
3. They perform very well even when working with large datasets.
DISADVANTAGES OF
HASH TABLE
Here are the cons of using hash tables:

1. In some implementations, you cannot use a null value as a key.
2. Collisions cannot be avoided entirely when indexes are generated by hash functions. A collision occurs when a key hashes to an index that is already in use.
3. If the hashing function produces many collisions, performance decreases.
OPERATIONS OF
HASH TABLE
Here, are the Operations supported by Hash tables:
1. Insertion : this Operation is used to add an element to the
hash table
2. Searching : this Operation is used to search for elements
in the hash table using the key
3. Deleting : this Operation is used to delete elements
from the hash table
APPLICATIONS OF
HASH TABLE
Real-world Applications
In the real world, hash tables are used to implement:

1. Databases (indexing technique)
2. Associative arrays
3. Sets
4. Memory caches
HASH
FUNCTION
• A function that maps a key into the range [0 to Max − 1], the
result of which is used as an index (or address) to hash table for
storing and retrieving record
• The address generated by hashing function is called as home
address
• All home addresses refer to a particular area of memory, and that area is called the prime area.
PROPERTIES OF HASH
FUNCTION
1) The hash function should be simple to compute.
2) The number of collisions should be small.
3) The hash function uses all the input data.
4) The hash function "uniformly" distributes the data across
the entire set of possible hash values.
5) The hash function generates very different hash values
for similar strings.
BUCKET
• A bucket is an index position in a hash table that can store more than one record.
• A hash file stores data in bucket format.
• Bucket is considered a unit of storage.
• A bucket typically stores one complete disk block, which in turn
can store one or more records.
• When the same index is mapped with two keys, then both
the records are stored in the same bucket
BUCKET

Fig. 3: File System


COLLISION
• Two keys hashing to the same address is called a collision.
PROBE
• Each calculation of an address and test for success is known as a probe.
SYNONYMS
• Keys that hash to the same address are called synonyms.
OVERFLOW
• When more keys hash to the same address than the bucket has room for, overflow is said to have occurred.
• Collision and overflow are synonymous when the bucket is of size 1.
Perfect Hash
Function
• A perfect hash function h for a set S is a hash function that maps distinct elements in S to a set of m integers, with no collisions.
• A perfect hash function with values in a limited range can be used
for efficient lookup operations, by placing keys from S (or other
associated values) in a table indexed by the output of the function.
Perfect Hash
Function
• Advantages :
1. A perfect hash function with values in a limited range
can be used for efficient lookup operations.
2. No need to apply collision resolution techniques.
LOAD
FACTOR
• The load factor is defined as (m/n), where n is the total size of the hash table and m is the preferred number of entries that can be inserted before the size of the underlying data structure is increased.
• The load factor is simply a measure of how full (occupied) the hash table is, defined as: α = number of occupied slots / total slots
• In simple words, if we have a hash table of size 1000 and 500 slots are filled, then the load factor is α = 500/1000 = 0.5
• If the load factor α is kept below a constant, then the expected time complexity of Insert, Search, Delete = Θ(1)
Load Density
• Identifier density: the identifier density of a hash table is the ratio n/T, where n is the number of identifiers in the table and T is the total number of possible identifiers.
• Loading density: the loading density (or loading factor) of a hash table is α = n/(sb), where b is the number of buckets and s is the number of slots per bucket.
Load Density
• Example :
• Consider a hash table with b = 26 buckets and s = 2 slots per bucket. We have n = 10 distinct identifiers, each representing a C library function.
• This table has a loading factor of α = 10/(2 × 26) = 10/52 ≈ 0.19
REHASHING
• The load factor should be less than 1.
• If the load factor becomes greater than 1, we need to increase the number of buckets.
• The technique in which the table is resized is called “REHASHING”.
• The new size of the table will be N’ = the closest prime greater than or equal to 2N.
• How to REHASH?
• Increase N to N’.
• Modify the hash function to x % N’.
• Apply the modified hash function to the existing elements.
TYPES OF HASH
FUNCTION
There are four ways of calculating the hash function covered here:
1. Division method
2. Folding method
3. Mid square method
4. Multiplication Method
TYPES OF HASH
FUNCTION
1. Division Method:
This is the simplest method of generating a hash value. The hash function divides the key k by the table size m and uses the remainder obtained.
In the division method, the hash function can be defined as:
h(k) = k % m
where m is the size of the hash table.
For example, if the key is 6 and the size of the hash table is 10, then applying the hash function to key 6 gives the index:
h(6) = 6 % 10 = 6
The value is stored at index 6.
TYPES OF HASH
FUNCTION
1. Division Method:

Formula: h(k) = k mod M

Here,
k is the key value, and
M is the size of the hash table.

Example:
k = 12345
M = 95
h(12345) = 12345 mod 95 = 90

k = 1276
M = 11
h(1276) = 1276 mod 11 =0
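
The division method above can be sketched in a few lines of Python. The function name `division_hash` is only an illustrative choice; the three calls reproduce the worked examples from the slides.

```python
def division_hash(k: int, m: int) -> int:
    """Division method: the remainder of key k divided by table size m."""
    return k % m


# The worked examples from the text.
print(division_hash(6, 10))      # 6
print(division_hash(12345, 95))  # 90
print(division_hash(1276, 11))   # 0
```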
TYPES OF HASH
FUNCTION
2. Mid Square Method
It involves two steps to compute the hash value:

1. Square the value of the key k, i.e. compute k².

2. Extract the middle r digits of k² as the hash value.

Example:
Suppose the hash table has 100 memory locations. So r = 2, because two digits are required to map the key to a memory location.
k = 60
k x k = 60 x 60
= 3600
The middle two digits of 3600 are 60, so h(60) = 60.
The hash value obtained is 60
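
A minimal sketch of the mid-square method follows. The name `mid_square_hash` is hypothetical, and so is the tie-breaking rule: when the digits left over after removing the middle r cannot be split evenly, this sketch leans the window to the left, which the slides do not specify.

```python
def mid_square_hash(k: int, r: int) -> int:
    """Square the key and return the middle r digits of the square."""
    digits = str(k * k)
    # Start index of the middle window; clamp at 0 when the square
    # has fewer than r digits (assumption: use all digits then).
    start = max(0, (len(digits) - r) // 2)
    return int(digits[start:start + r])


print(mid_square_hash(60, 2))  # 60, the middle two digits of 3600
```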
TYPES OF HASH
FUNCTION
3. Digit Folding Method : This method involves two steps:
1. Divide the key value k into a number of parts, i.e. k1, k2, k3, …, kn, where each part has the same number of digits except for the last part, which can have fewer digits than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring the last carry, if any.
Formula:

k = k1, k2, k3, k4, …, kn
s = k1 + k2 + k3 + k4 + … + kn
h(K) = s

Here,
s is obtained by adding the parts of the key k.
TYPES OF HASH
FUNCTION
3. Digit Folding Method :

Example:

k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(K) = 51
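
The folding steps can be sketched as below. The name `folding_hash` and the parameter `part_len` are illustrative. "Ignoring the last carry" is interpreted here as keeping only the low-order `part_len` digits of the sum, which is an assumption about what the slides intend.

```python
def folding_hash(k: int, part_len: int) -> int:
    """Split the decimal digits of k into parts of part_len digits,
    add the parts, and drop any carry beyond part_len digits."""
    digits = str(k)
    parts = [int(digits[i:i + part_len])
             for i in range(0, len(digits), part_len)]
    s = sum(parts)
    # Assumed reading of "ignore the last carry": keep part_len digits.
    return s % (10 ** part_len)


print(folding_hash(12345, 2))  # 12 + 34 + 5 = 51
```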
TYPES OF HASH
FUNCTION
4. Multiplication method :
This method involves the following steps:
• Choose a constant value A such that 0 < A < 1.
• Multiply the key value with A.
• Extract the fractional part of kA.
• Multiply the result of the above step by the size of the hash table
i.e. M.
• The resulting hash value is obtained by taking the floor of
the result obtained in step 4.
TYPES OF HASH
FUNCTION
4. Multiplication method :
Formula: h(K) = floor(M * (kA mod 1))
Here,
M is the size of the hash table,
k is the key value, and
A is a constant value,
where "kA mod 1" means the fractional part of kA, that is, kA − ⌊kA⌋.
TYPES OF HASH
FUNCTION
4. Multiplication method :
Example:

k = 12345
A = 0.357840
M = 100

h(12345) = floor[ 100 (12345 * 0.357840 mod 1) ]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ]
= 53
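
The multiplication method can be sketched as follows, using floating-point arithmetic for the fractional part. The function name `multiplication_hash` is illustrative; the call reproduces the example values k = 12345, A = 0.357840, M = 100.

```python
import math


def multiplication_hash(k: int, a: float, m: int) -> int:
    """Multiplication method: floor(m * fractional part of k*a), 0 < a < 1."""
    fractional = (k * a) % 1.0  # kA - floor(kA)
    return math.floor(m * fractional)


print(multiplication_hash(12345, 0.357840, 100))  # 53
```

Because the result depends on floating-point rounding, keys whose product lands very close to a slot boundary could differ from a hand calculation; an integer-only variant would avoid that.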
COLLISION
When two different keys have the same hash value, both compete for the same index; this problem is known as a collision. In the division-method example above with h(k) = k % 10, the key 6 is stored at index 6.
If the key value is 26, then the index would be:

h(26) = 26 % 10 = 6
Therefore, two values are to be stored at the same index, i.e., 6, and this leads to the collision problem. To resolve these collisions, we use techniques known as collision resolution techniques.

The following are the collision resolution techniques:

Open Hashing/Separate Chaining: It is also known as closed addressing.
Closed Hashing: It is also known as open addressing.
TYPES OF
HASHING
Static Hashing
Dynamic Hashing
STATIC HASHING
• In static hashing, the resultant data bucket address always remains the same.
• Therefore, in static hashing, the number of data buckets in memory always remains constant.
• There are two main types of static hashing schemes:

1. Open Hashing/Closed Addressing
   a. Chaining/Separate Chaining/Overflow Chaining

2. Closed Hashing/Open Addressing
   a. Linear Probing
   b. Quadratic Probing
   c. Double Hashing
Static Hashing
Techniques
In Hashing, collision resolution techniques are classified as
DYNAMIC HASHING
• Dynamic hashing offers a mechanism in which data buckets are added and removed dynamically and on demand.
• In this hashing, the hash function helps you to create a large number of values.
• An issue in static hashing is bucket overflow.
• Dynamic hashing helps to overcome this issue.
• It is also called the Extendible Hashing method.
• In this method, the number of data buckets increases and decreases depending on the number of records.
OPEN HASHING/
CHAINING
• In open hashing, collisions are resolved by the chaining (or separate chaining) method.
• It maintains a linked list at every index for the elements that collide there.
• Let's take the example of an insertion sequence {k1, k2, k3, …, kn}.
• Here h(k) = k mod 10.
• The hash table T is a vector of linked lists.
• Key k is stored in the list at T[h(k)].
So the problem is “ Insert first 10 perfect squares in hash table of
OPEN
HASHING
Let's first understand how chaining resolves collisions.
Suppose we have a list of key values
A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3
In this case, we cannot directly use index = k % m, because the hash function is given as h(k) = 2k+3. Instead, the index is h(k) % m.

The index of key value 3 is:


index = h(3) = (2(3)+3)%10 = 9
The value 3 would be stored at the index 9.
The index of key value 2 is:
index = h(2) = (2(2)+3)%10 = 7
The value 2 would be stored at the index 7.
OPEN
HASHING
The index of key value 9 is:
index = h(9) = (2(9)+3)%10 = 1
The value 9 would be stored at the index 1.

The index of key value 6 is:


index = h(6) = (2(6)+3)%10 = 5
The value 6 would be stored at the index 5.

The index of key value 11 is:


index = h(11) = (2(11)+3)%10 = 5
The value 11 would be stored at the index 5.
OPEN
HASHING
The index of key value 13 is:
index = h(13) = (2(13)+3)%10 = 9
The value 13 would be stored at index 9.

The index of key value 7 is:


index = h(7) = (2(7)+3)%10 = 7
The value 7 would be stored at index 7.

The index of key value 12 is:


index = h(12) = (2(12)+3)%10 = 7
The value 12 would be stored at index 7.
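
The chaining walkthrough above can be reproduced with a short sketch. Python lists stand in for the linked lists, and the names `build_chained_table` and `h` are illustrative.

```python
def build_chained_table(keys, m, h):
    """Separate chaining: each slot holds a list of all keys mapped to it."""
    table = [[] for _ in range(m)]
    for k in keys:
        table[h(k) % m].append(k)  # colliding keys share one chain
    return table


keys = [3, 2, 9, 6, 11, 13, 7, 12]
table = build_chained_table(keys, 10, lambda k: 2 * k + 3)
for index, chain in enumerate(table):
    if chain:
        print(index, chain)
# 1 [9]
# 5 [6, 11]
# 7 [2, 7, 12]
# 9 [3, 13]
```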
CLOSED
HASHING
In closed hashing, three techniques are used to resolve collisions:
1. Linear probing
2. Quadratic probing
3. Double Hashing technique
CLOSED HASHING
Method: Description
Linear probing: Just like the name suggests, this method searches for empty slots linearly, starting from the position where the collision occurred and moving forward. If the end of the list is reached and no empty slot is found, the probing starts at the beginning of the list.
Quadratic probing: This method uses quadratic polynomial expressions to find the next available free slot.
Double hashing: This technique uses a secondary hash function to find the next free available slot.
Linear Probing

• Linear probing is one of the forms of open addressing.
• Each cell in the hash table contains a key-value pair. When a collision occurs because a new key maps to a cell already occupied by another key, linear probing searches for the closest following free location and adds the new key to that empty cell.
• The search is performed sequentially, starting from the position where the collision occurs, until an empty cell is found.
Linear Probing
• Probing Function
The probing function is defined as f(key, i) = index, where i is the number of the attempt to find an empty location.
• We use the probing function to find the first available/free slot.
• The same function is used in lookup/search.
• Linear Probing
f(key, i) = (hash(key) + i) % m
where i = 0, 1, 2, …, m-1
We search linearly from the hashed index until the end of the
Linear Probing

• Insertion
The insertion algorithm is as follows:
1. Use the hash function to find the index for a record.
2. If that spot is already in use, use the next available spot at a "higher" index (find the first empty slot of the hash table in linear fashion).
3. Treat the hash table as if it is round: if you hit the end of the hash table, go back to the front.
Linear Probing

• Searching
The searching algorithm is as follows:
1. Use the hash function to find the index where the item would have been inserted.
2. If it isn't there, search the records after that hash location (remember to treat the table as circular) until either it is found or an empty slot is found. If there is an empty slot in the table before the record is found, it means that the record is not there.
Linear Probing

• Delete/Remove
The Delete/Remove algorithm is as follows:
1. Use the hash function to find the index where the record should be.
2. If it isn't there, search the records after that hash location (remember to treat the table as circular) until either it is found or an empty slot is found. If there is an empty slot in the table before the record is found, it means that the record is not there.
3. Once the record is found, remove it. Simply emptying the slot can break later searches for keys that probed past it, so the slot is usually marked as "deleted" rather than left empty.
Linear Probing
Let's understand linear probing through an example.
Consider the same keys as in the chaining example:
A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3
index = h(k) % 10
• The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5 respectively.
• The calculated index of 11 is 5, which is already occupied by another key value, i.e., 6.
• When linear probing is applied, the nearest empty cell after index 5 is 6; therefore, the value 11 is added at index 6.
• The next key value is 13. Its index is 9 when the hash function is applied.
• The cell at index 9 is already filled.
• When linear probing is applied, the nearest empty cell after index 9 (wrapping around) is 0; therefore, the value 13 is added at index 0.
• The key 7 hashes to index 7, which is occupied by 2; it is placed in the next empty cell, index 8.
• The key 12 also hashes to index 7; cells 7, 8, 9, 0 and 1 are occupied, so it is placed at index 2.
The final table is:
index   key
0       13
1       9
2       12
3       --
4       --
5       6
6       11
7       2
8       7
9       3
Linear Probing
Let us consider a simple hash function as “key mod 7”
and a sequence of keys as 50, 700, 76, 85, 92, 73, 101.
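
Both linear probing examples above can be checked with a small sketch. The helper names `linear_probe_insert` and `build_table` are illustrative, and `None` marks an empty cell.

```python
def linear_probe_insert(table, key, h):
    """Insert key at the first free slot at or after its home index,
    wrapping around the end of the table."""
    m = len(table)
    home = h(key) % m
    for i in range(m):
        idx = (home + i) % m
        if table[idx] is None:
            table[idx] = key
            return idx
    raise OverflowError("hash table is full")


def build_table(keys, m, h):
    table = [None] * m
    for k in keys:
        linear_probe_insert(table, k, h)
    return table


# First example: h(k) = 2k + 3, m = 10.
t1 = build_table([3, 2, 9, 6, 11, 13, 7, 12], 10, lambda k: 2 * k + 3)
print(t1)  # [13, 9, 12, None, None, 6, 11, 2, 7, 3]

# Second example: h(k) = k, so the index is "key mod 7".
t2 = build_table([50, 700, 76, 85, 92, 73, 101], 7, lambda k: k)
print(t2)  # [700, 50, 85, 92, 73, 101, 76]
```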
Linear Probing
Advantages

1. Simple
We linearly iterate to find the next slot.

2. Fast
With the use of localized access (Locality of reference)
It gives constant time performance in ideal situation.
Linear Probing
Disadvantages
Linear probing has a few drawbacks when used to maintain a hash table.

In the worst case, the time complexity is O(n) due to clustering.

Clustering
Linear probing is sensitive to a phenomenon called clustering, which occurs as elements are added to a hash table. Elements tend to clump together, forming clusters, which over time significantly degrade the performance of searching and adding elements, because we approach the worst-case O(n) time complexity.
Linear Probing with
Chaining
1. Chaining without replacement

• In this collision handling method, chaining introduces an additional field stored with the data, i.e. the chain.
• A separate chain field is maintained for colliding data.
• When a collision occurs, we store the second colliding element using the linear probing method.
• The address of this colliding element is stored with the first colliding element in the chain table, without replacement.
Linear Probing with
Chaining
For example consider elements,

131, 3, 4, 21, 61, 6, 71, 8, 9


Linear Probing with
Chaining
Fig. Chaining without replacement
• From the example, you can see that a chain is maintained for the numbers that demand location 1.
• The first number, 131, is placed at index 1.
• Next comes 21, but a collision occurs, so by linear probing we place 21 at index 2, and the chain is maintained by writing 2 in the chain table at index 1.
• Similarly, when the next number 61 comes, by linear probing we place 61 at index 5, and the chain is maintained at index 2.
• Thus any element whose hash key is 1 is stored by linear probing at an empty location, but a chain is maintained so that traversing the hash table is efficient.
Linear Probing
Try out this example
QUADRATIC
PROBING
• In linear probing, searching is performed linearly. In contrast, quadratic probing is an open addressing technique that uses a quadratic polynomial to search until an empty slot is found.
• It can also be defined as allowing the insertion of key ki at the first free location from (u + i²) % m, where u is the home address of the key and i = 0 to m-1.
h′(x) = x mod m
h(x, i) = (h′(x) + i²) mod m

Other quadratic expressions with additional constants can also be used.
The value of i = 0, 1, . . ., m-1. So we start from i = 0 and increase it until we find a free slot. Initially, when i = 0, h(x, i) is the same as h′(x).
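
A minimal sketch of quadratic probing with h(x, i) = (x mod m + i²) mod m follows. The function name and the sample keys are illustrative and not taken from the slides. Note that a quadratic probe sequence does not always visit every slot, so the sketch returns `None` when no free slot turns up within m attempts.

```python
def quadratic_probe_insert(table, x):
    """Insert x at the first free slot along (x mod m + i*i) mod m."""
    m = len(table)
    home = x % m
    for i in range(m):
        idx = (home + i * i) % m
        if table[idx] is None:
            table[idx] = x
            return idx
    return None  # no free slot found on this probe sequence


table = [None] * 7
placed = [quadratic_probe_insert(table, x) for x in (50, 85, 92, 73, 101)]
print(placed)  # [1, 2, 5, 3, 4]
```

Here 50, 85 and 92 all have home address 1: 85 moves to 1 + 1² = 2, and 92 skips 2 and lands on 1 + 2² = 5.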
DOUBLE HASHING
Double hashing : It is a collision resolution technique for open-addressed hash tables. Double hashing applies a second hash function to the key when a collision occurs.

Advantages of Double hashing

1. Double hashing is one of the best forms of probing, producing a nearly uniform distribution of keys throughout the hash table.
2. This technique largely avoids the clustering seen with linear probing.
3. It is one of the effective methods for resolving collisions.
DOUBLE
HASHING
Double hashing :
Double hashing can be done using :
(hash1(key) + i * hash2(key)) % TABLE_SIZE
Here hash1() and hash2() are hash functions and TABLE_SIZE is the size of the hash table.
(We repeat by increasing i when a collision occurs.)

The first hash function is typically hash1(key) = key % TABLE_SIZE
A popular second hash function is : hash2(key) = PRIME - (key % PRIME), where PRIME is a prime smaller than TABLE_SIZE.
DOUBLE
HASHING
Double hashing : A good second hash function:
1. Must never evaluate to zero
2. Must make sure that all cells can be probed
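
The double hashing formula can be sketched as below, using hash1(key) = key % TABLE_SIZE and hash2(key) = PRIME - (key % PRIME) from the slides. The table size 7, PRIME = 5 and the sample keys are illustrative choices, not values from the source.

```python
def double_hash_insert(table, key, prime):
    """Probe (hash1 + i * hash2) % m until a free slot is found."""
    m = len(table)
    h1 = key % m
    h2 = prime - (key % prime)  # always in 1..prime, so never zero
    for i in range(m):
        idx = (h1 + i * h2) % m
        if table[idx] is None:
            table[idx] = key
            return idx
    return None  # no free slot found on this probe sequence


table = [None] * 7
placed = [double_hash_insert(table, k, 5) for k in (50, 85, 92)]
print(placed)  # [1, 6, 4]
```

All three keys share home address 1, but their step sizes differ (5 for 85, 3 for 92), so they spread out instead of forming one run. Choosing a prime table size makes every step size coprime to it, which lets each probe sequence reach all cells.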
Open Addressing
Operations in open addressing:
1. Insert Operation-

• The hash function is used to compute the hash value for the key to be inserted.
• The hash value is then used as an index to store the key in the hash table.

In case of collision,
• Probing is performed until an empty bucket is found.
• Once an empty bucket is found, the key is inserted.
• Probing is performed in accordance with the technique used for open addressing.
Open Addressing
Operations in open addressing:
2. Search Operation-

To search for any particular key,

• Its hash value is obtained using the hash function.
• Using the hash value, that bucket of the hash table is checked.
• If the required key is found there, the search succeeds.
• Otherwise, the subsequent buckets are checked until the required key or an empty bucket is found.
• An empty bucket indicates that the key is not present in the hash table.
Open Addressing
Operations in open addressing:
3. Delete Operation-

• The key is first searched and then deleted.
• After deleting the key, that particular bucket is marked as “deleted”.
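
The three operations can be sketched together to show why a deleted bucket is marked rather than emptied. This is a minimal sketch assuming linear probing and integer keys with hash(key) = key % m; the class name and the `EMPTY`/`DELETED` sentinels are illustrative.

```python
EMPTY = object()    # bucket never used
DELETED = object()  # bucket whose key was removed (tombstone)


class OpenAddressingTable:
    def __init__(self, m):
        self.slots = [EMPTY] * m

    def _probe(self, key):
        m = len(self.slots)
        home = key % m
        for i in range(m):
            yield (home + i) % m  # linear probing

    def insert(self, key):
        first_free = None
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                if first_free is None:
                    first_free = idx
                break
            if slot is DELETED:
                if first_free is None:
                    first_free = idx  # reusable, but keep scanning for key
            elif slot == key:
                return idx  # already present
        if first_free is None:
            raise OverflowError("hash table is full")
        self.slots[first_free] = key
        return first_free

    def search(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None  # an empty bucket ends the search
            if slot is not DELETED and slot == key:
                return idx
        return None

    def delete(self, key):
        idx = self.search(key)
        if idx is None:
            return False
        self.slots[idx] = DELETED  # mark, do not empty
        return True


t = OpenAddressingTable(7)
for k in (50, 85, 92):  # all three hash to index 1
    t.insert(k)
t.delete(85)
print(t.search(92))  # 3: the search walks past the "deleted" bucket
```

If index 2 were simply emptied, the search for 92 would stop there and wrongly report that the key is absent.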
Rehashing
Rehashing is a collision resolution technique.
Rehashing is a technique in which the table is resized, i.e., the size of the table is doubled by creating a new table. It is preferable if the total size of the table is a prime number. Rehashing is required in the following situations:

• When the table is completely full.
• With quadratic probing, when the table is half full.
• When insertions fail due to overflow.

In such situations, we have to transfer the entries from the old table to the new table by recomputing their positions using the hash function.
Rehashing
• As the name suggests, rehashing means hashing again.
• Basically, when the load factor increases to more than its pre-defined value (the default value of the load factor is 0.75), the complexity increases.
• To overcome this, the size of the array is increased (doubled) and all the values are hashed again and stored in the new double-sized array, to maintain a low load factor and low complexity.
Rehashing
But that comes with a price:
With the new size the hash function can change, which means that the 75 elements we had stored earlier might now yield different indexes under the new hash function. So we rehash all the stored elements with the new hash function and place them at their new indexes in the resized, bigger hash table.
Rehashing
Why rehashing?
Rehashing is done because whenever key-value pairs are inserted into the map, the load factor increases, which implies that the time complexity also increases, as explained above. This might not give the required time complexity of O(1).
Hence, a rehash must be done, increasing the size of the bucketArray so as to reduce the load factor and the time complexity.
Rehashing
How is rehashing done?
Rehashing can be done as follows:
• For each addition of a new entry to the map, check the load factor.
• If it is greater than its pre-defined value (or the default value of 0.75 if not given), then rehash.
• To rehash, make a new array of double the previous size and make it the new bucketArray.
• Then traverse each element in the old bucketArray and call insert() for each, so as to insert it into the new, larger bucketArray.
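
The rehashing steps above can be sketched with a small chained map. The class name `ChainedMap`, the initial capacity of 4 and the use of Python's built-in `hash()` are assumptions made for illustration; only the 0.75 threshold and the doubling rule come from the slides.

```python
class ChainedMap:
    def __init__(self, capacity=4, max_load=0.75):
        self.buckets = [[] for _ in range(capacity)]
        self.size = 0
        self.max_load = max_load

    def insert(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))
        self.size += 1
        # Check the load factor after every addition.
        if self.size / len(self.buckets) > self.max_load:
            self._rehash()

    def _rehash(self):
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]  # double the size
        self.size = 0
        for bucket in old:
            for k, v in bucket:
                self.insert(k, v)  # indexes are recomputed for the new size

    def get(self, key):
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return None


m = ChainedMap()
for n in range(10):
    m.insert(n, n * n)
print(len(m.buckets), m.get(9))  # 16 81
```

With a starting capacity of 4, the fourth insertion pushes the load factor to 1.0 and triggers a rehash to 8 buckets; the seventh pushes it to 0.875 and triggers a second rehash to 16.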
Rehashing & Double Hashing
How is rehashing different from double hashing?
• In double hashing, two different hash functions are used together to compute the probe sequence for a key, whereas in rehashing the table is enlarged and the hash function is applied again to every stored key to compute its position in the larger table.
Dynamic Hashing
1. Static hashing does not expand or shrink the hash table
dynamically as the size of database grows or shrinks and
bucket overflow occurs.
2. The dynamic hashing method is used to overcome the
problems of static hashing like bucket overflow.
3. In this method, data buckets grow or shrink as the records
increases or decreases.
4. This method is also known as Extendable/Extensible
hashing method.
Extensible/Extendible
Hashing
• How to search a key
1. First, calculate the hash address of the key.
2. Check how many bits are used in the directory, and these
bits are called as i.
3. Take the least significant i bits of the hash address.
This gives an index of the directory.
4. Now using the index, go to the directory and find bucket
address where the record might be.
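
Step 3 of the search, taking the least significant i bits of the hash address, can be sketched with a bit mask. The function name `directory_index` is illustrative; the hash address 10001 is the one used for key 9 in the example that follows.

```python
def directory_index(hash_address: int, i: int) -> int:
    """Return the least significant i bits of the hash address."""
    return hash_address & ((1 << i) - 1)


# Hash address 10001 (binary) with a 2-bit and a 3-bit directory.
print(format(directory_index(0b10001, 2), "02b"))  # 01
print(format(directory_index(0b10001, 3), "03b"))  # 001
```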
Extensible/Extendible
Hashing
• How to insert a new record
1. Firstly, you have to follow the same procedure
for retrieval, ending up in some bucket.
2. If there is still space in that bucket, then place the record
in it.
3. If the bucket is full, then we will split the bucket
and redistribute the records.
Extensible/Extendible
Hashing
For example :
Consider the following grouping of keys into buckets depending on the least significant bits of their hash address:
Extensible/Extendible
Hashing
The last two bits of the hash addresses of keys 2 and 4 are 00, so they go into bucket B0. The last two bits for keys 5 and 6 are 01, so they go into bucket B1. The last two bits for keys 1 and 3 are 10, so they go into bucket B2. The last two bits for key 7 are 11, so it goes into B3.
Extensible/Extendible
Hashing
Insert key 9 with hash address 10001 into the above structure:
1. Since key 9 has hash address 10001, whose last two bits are 01, it must go into bucket B1. But bucket B1 is full, so it is split.
2. The split separates 5 and 9 from 6: the last three bits of 5 and 9 are 001, so they go into bucket B1, and the last three bits of 6 are 101, so it goes into bucket B5.
3. Keys 2 and 4 are still in B0. Bucket B0 is pointed to by the 000 and 100 directory entries, because the last two bits of both entries are 00.
4. Keys 1 and 3 are still in B2. Bucket B2 is pointed to by the 010 and 110 entries, because the last two bits of both entries are 10.
5. Key 7 is still in B3. Bucket B3 is pointed to by the 111 and 011 entries, because the last two bits of both entries are 11.
Extensible/Extendible
Hashing
Example
Solve Below hashing problem using extendible hashing.
Advantages of Extensible
Hashing
1. In this method, the performance does not decrease as the data grows in the system. It simply increases the size of memory to accommodate the data.
2. In this method, memory is well utilized as it grows and shrinks with the data. There will not be any unused memory lying around.
3. This method is good for dynamic databases where data grows and shrinks frequently.

Disadvantages of Extensible
Hashing
1. In this method, if the data size increases, then the bucket size is also increased.
2. In this case, the bucket overflow situation will also occur, but it might take longer to reach this situation than with static hashing.
Linked List Benefits &
Drawbacks
• Benefits:
- Easy to insert & delete in O(1) time
- Don’t need to estimate total memory needed
• Drawbacks:
- Hard to search in less than O(n) time (binary search
doesn’t work, eg.)
- Hard to jump to the middle
• Skip Lists:
- fix these drawbacks
- good data structure for a dictionary ADT
Skip
List
1. A skip list is a probabilistic data structure.
2. Invented around 1990 by Bill Pugh.
3. Expected search time is O(log n).
4. The skip list is used to store a sorted list of elements or data with a linked list.
5. It allows the elements or data to be traversed efficiently.
6. In one single step, it skips several elements of the entire list, which is why it is known as a skip list.
7. Randomized/Probabilistic data structure:
- uses random coin flips to build the data structure
8. The skip list is an extended version of the linked list.
9. It allows the user to search, remove, and insert elements very quickly.
10. It consists of a base list that includes a set of elements and maintains a link hierarchy over the subsequent elements.
Skip
List
Skip list structure

It is built in two kinds of layers: the lowest layer and the top layers.

1. The lowest layer of the skip list is a common sorted linked list.
2. The top layers of the skip list are like an "express line" where elements are skipped.


Skip List
• Keys in sorted order.
• O(log n) levels
• Each higher level contains 1/2 the elements of the level
below it.
• Header & sentinel nodes are in every level
Searching in Skip List
• Let's take an example to understand the working of the skip list. In this example, we have 14 nodes, divided into two layers, as shown in the diagram.
• The lower layer is a common line that links all nodes, and the top layer is an express line that links only the main nodes, as you can see in the diagram.
• Suppose you want to find 47 in this example. You start the search from the first node of the express line and continue along the express line until you find a node that is equal to 47 or greater than 47.
• You can see in the example that 47 does not exist in the express line, so you go back to the last node smaller than 47, which is 40. From 40 you drop down to the normal line and search for 47 there, as shown in the diagram.
Skip List Basic Operations
The skip list supports the following operations.

1. Insertion operation: It is used to add a new node at its sorted position in the skip list.
2. Deletion operation: It is used to delete a given node from the skip list.
3. Search operation: It is used to search for a particular node in the skip list.
Skip
List
Example 1: Create a skip list by inserting the following keys into an empty skip list.
1. 6 with level 1.
2. 29 with level 1.
3. 22 with level 4.
4. 9 with level 3.
5. 17 with level 1.
6. 4 with level 2.
The keys are inserted one per step, in the order listed:
Step 1: Insert 6 with level 1
Step 2: Insert 29 with level 1
Step 3: Insert 22 with level 4
Step 4: Insert 9 with level 3
Step 5: Insert 17 with level 1
Step 6: Insert 4 with level 2
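
Example 1 can be reproduced with a small skip list sketch. To match the example, `insert` takes the level explicitly instead of choosing it by coin flips, which is what a real skip list would normally do. The class and method names are illustrative, and the sketch assumes distinct keys.

```python
class Node:
    def __init__(self, key, level):
        self.key = key
        # forward[i] is the next node on level i + 1
        self.forward = [None] * level


class SkipList:
    def __init__(self, max_level=4):
        self.max_level = max_level
        self.head = Node(None, max_level)  # header node present on every level

    def _predecessors(self, key):
        """For each level, find the last node whose key is smaller than key."""
        update = [self.head] * self.max_level
        node = self.head
        for i in reversed(range(self.max_level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, level):
        assert 1 <= level <= self.max_level
        update = self._predecessors(key)
        new = Node(key, level)
        for i in range(level):  # splice into the lowest `level` lists
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def keys(self):
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out


sl = SkipList(max_level=4)
for key, level in [(6, 1), (29, 1), (22, 4), (9, 3), (17, 1), (4, 2)]:
    sl.insert(key, level)
print(sl.keys())      # [4, 6, 9, 17, 22, 29]
print(sl.search(17))  # True
```

The search starts on the top level, moves right while the next key is smaller than the target, and drops one level whenever it cannot move further, which is the express-line idea described earlier.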
Skip
List
Example 2: Consider this example where we want to search for
key 17.
Advantages of Skip
List
1. If you want to insert a new node in the skip list, the insertion is very fast because there are no rotations in the skip list.
2. The skip list is simple to implement as compared to the hash
table and the binary search tree.
3. It is very simple to find a node in the list because it stores the
nodes in sorted form.
4. The skip list algorithm can be modified very easily in a more
specific structure, such as indexable skip lists, trees, or priority
queues.
5. The skip list is a robust and reliable list.
Disadvantages of
Skip List
1. It requires more memory than the balanced tree.
2. Reverse searching is not allowed.
3. In the worst case, searching a skip list is as slow as searching a linked list, i.e. O(n).
Applications of Skip
List
1. Skip lists are used in distributed applications. In distributed systems, the nodes of the skip list represent the computer systems and the pointers represent network connections.
2. Skip lists are used for implementing highly scalable concurrent priority queues with less lock contention (struggle for having a lock on a data item).
3. It is also used with the QMap template class (a value-based template class that provides a dictionary).
4. The indexing of the skip list is used in running median problems.
5. skipdb is an open-source database format using ordered key/value pairs.
