Ex 7 Daa

Uploaded by

wadhaniyash14

YEAR: 2024-25

MEDI-CAPS UNIVERSITY, INDORE

SEM: EVEN

Experiment : 7
Aim / Objective: To implement Huffman coding & determine its time complexity.
Theory: Huffman coding is a lossless data compression algorithm. It assigns variable-length codes to
input characters, with the length of each code determined by the frequency of the corresponding
character: more frequent characters get shorter codes.
The variable-length codes assigned to input characters are prefix codes: the codes (bit sequences) are
assigned in such a way that the code of one character is never a prefix of the code of any other
character. This is how Huffman coding ensures that there is no ambiguity when decoding the generated
bitstream.
A counterexample shows why the prefix property matters. Let there be four characters a, b, c and d, and
let their variable-length codes be 00, 01, 0 and 1. This coding is ambiguous because the code assigned
to c is a prefix of the codes assigned to a and b. If the compressed bit stream is 0001, the
decompressed output may be “cccd”, “ccb”, “acd”, “cad” or “ab”.
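The ambiguity in the counterexample can be checked mechanically. The following is a minimal sketch, not part of the original experiment; the helper name all_decodings is hypothetical, and it simply tries every codeword that matches the front of the bit string.

```python
def all_decodings(bits, codes):
    """Return every way to split `bits` into codewords from `codes`."""
    if not bits:
        return [""]  # exactly one way to decode an empty string
    results = []
    for symbol, code in codes.items():
        if bits.startswith(code):
            # Consume this codeword and decode the remainder recursively.
            for rest in all_decodings(bits[len(code):], codes):
                results.append(symbol + rest)
    return results


# The ambiguous code from the text: c's code "0" is a prefix of "00" and "01".
ambiguous = {"a": "00", "b": "01", "c": "0", "d": "1"}
# A prefix-free code over the same characters (chosen here for illustration).
prefix_free = {"a": "00", "b": "01", "c": "10", "d": "11"}

print(sorted(all_decodings("0001", ambiguous)))  # several candidate outputs
print(all_decodings("0001", prefix_free))        # exactly one output
```

With the prefix-free table the same bit stream has a single decoding, which is the property Huffman codes rely on.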
Huffman coding has two main parts:

1. Build a Huffman Tree from input characters.

2. Traverse the Huffman Tree and assign codes to characters.

Fig 7.1

NAME : YASH WADHWANI


ENROLLMENT NO : EN22CS3011112

Algorithm:

Huffman coding involves two main steps:

1. Building the Huffman Tree

2. Encoding & Decoding Using the Tree

Step 1: Build Huffman Tree

1. Count Frequencies:

*Count the frequency of each character in the input string.

2. Create a Min-Heap:

*Insert each character as a node into a priority queue (min-heap), where the frequency is
the priority.

3. Build the Huffman Tree:

*While there are at least two nodes in the heap:

    *Remove the two nodes with the smallest frequencies.

    *Merge them into a new node whose frequency is the sum of the two.

    *Insert the new node back into the heap.

*The remaining node is the root of the Huffman tree.
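Step 1 can be sketched as follows, starting from a raw input string rather than a fixed frequency list. This is an illustrative sketch under stated assumptions: the function name build_huffman_tree and the nested-tuple tree representation are hypothetical, the input is assumed to be non-empty, and a running counter is used as a tie-breaker so that the heap never has to compare two subtrees.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_tree(text):
    """Return (total_frequency, tree); a tree is a symbol or a (left, right) pair."""
    freqs = Counter(text)   # 1. count the frequency of each character
    tie = count()           # tie-breaker for nodes with equal frequency
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)     # 2. min-heap keyed on frequency
    while len(heap) > 1:    # 3. merge the two least frequent nodes
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    total, _, root = heap[0]  # the remaining node is the root
    return total, root


total, root = build_huffman_tree("abracadabra")
print(total)  # the root frequency equals the length of the input
```

The root's frequency always equals the number of characters in the input, which is a quick sanity check on the merge loop.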

Step 2: Generate Huffman Codes

1. Traverse the Tree:

*Assign '0' for left edges and '1' for right edges.

*Recursively build binary codes for each character.

2. Create a Dictionary:

*Store characters and their corresponding binary codes.
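The traversal in Step 2 can be sketched on a small hand-built tree. The nested-tuple form used here (a leaf is a character, an internal node is a (left, right) pair) and the name assign_codes are assumptions made for illustration; sample_tree is the tree that results from the frequencies 5, 9, 12, 13, 16 and 45 for a to f when it is traced by hand.

```python
def assign_codes(tree, prefix="", table=None):
    """Walk the tree, appending '0' on left edges and '1' on right edges."""
    if table is None:
        table = {}
    if isinstance(tree, tuple):
        left, right = tree
        assign_codes(left, prefix + "0", table)
        assign_codes(right, prefix + "1", table)
    else:
        # Leaf: store the accumulated code. A tree consisting of a single
        # leaf would otherwise get an empty code, so fall back to "0".
        table[tree] = prefix or "0"
    return table


# Hypothetical hand-built tree: f is directly under the root, a and b are deepest.
sample_tree = ("f", (("c", "d"), (("a", "b"), "e")))
codes = assign_codes(sample_tree)
for symbol in sorted(codes):
    print(symbol, "->", codes[symbol])
```

The most frequent character sits closest to the root and therefore receives the shortest code.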

Step 3: Encode the Input String

1. Replace Each Character:

*Convert the input string into a sequence of Huffman codes using the dictionary.
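Encoding is then a dictionary lookup per character. A minimal sketch, assuming a code table like the one a Huffman tree for the frequencies 5, 9, 12, 13, 16 and 45 would give (the table is written out by hand here, and the function name encode is hypothetical):

```python
def encode(text, codes):
    """Concatenate the code of every character; raises KeyError on unknown characters."""
    return "".join(codes[ch] for ch in text)


# Hand-written code table, assumed for illustration.
codes = {"f": "0", "c": "100", "d": "101", "a": "1100", "b": "1101", "e": "111"}
encoded = encode("face", codes)
print(encoded, len(encoded))
```

No separators are needed between codewords because the table is prefix-free.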

Step 4: Decode the Encoded String

1. Traverse the Huffman Tree:

*Read bits from the encoded string one by one.

*Move left for '0' and right for '1'.

*When a leaf node (character) is reached, append it to the output and reset to the root.
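The decoding walk can be sketched in the same style. This is an illustration rather than the experiment's own code: the nested-tuple tree and the name decode are assumptions, and the tree is assumed to contain at least two characters so that the root is an internal node.

```python
def decode(bits, tree):
    """Follow one edge per bit; emit a character at each leaf and restart at the root."""
    out = []
    node = tree
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"unexpected symbol in bit stream: {bit!r}")
        node = node[0] if bit == "0" else node[1]  # left for '0', right for '1'
        if not isinstance(node, tuple):            # reached a leaf
            out.append(node)
            node = tree                            # reset to the root
    if node is not tree:
        raise ValueError("bit stream ended in the middle of a codeword")
    return "".join(out)


# Hypothetical hand-built tree (leaf = character, internal node = (left, right)).
sample_tree = ("f", (("c", "d"), (("a", "b"), "e")))
decoded = decode("01100100111", sample_tree)
print(decoded)
```

Each bit causes exactly one step down the tree, so the running time is proportional to the number of bits read.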

Code:

import heapq


class node:
    """A Huffman tree node; leaves hold one symbol, internal nodes hold merged symbols."""

    def __init__(self, freq, symbol, left=None, right=None):
        self.freq = freq      # frequency of the symbol (sum of children for internal nodes)
        self.symbol = symbol  # character, or concatenated characters for internal nodes
        self.left = left      # left child
        self.right = right    # right child
        self.huff = ''        # edge label from the parent: '0' (left) or '1' (right)

    def __lt__(self, nxt):
        # heapq orders nodes by frequency
        return self.freq < nxt.freq


def printNodes(current, val=''):
    """Print the Huffman code of every leaf below `current`."""
    newVal = val + str(current.huff)
    if current.left:
        printNodes(current.left, newVal)
    if current.right:
        printNodes(current.right, newVal)
    if not current.left and not current.right:
        print(f"{current.symbol} -> {newVal}")


chars = ['a', 'b', 'c', 'd', 'e', 'f']
freq = [5, 9, 12, 13, 16, 45]

# Push one leaf node per character onto the min-heap.
nodes = []
for x in range(len(chars)):
    heapq.heappush(nodes, node(freq[x], chars[x]))

# Repeatedly merge the two least frequent nodes until only the root remains.
while len(nodes) > 1:
    left = heapq.heappop(nodes)
    right = heapq.heappop(nodes)
    left.huff = '0'
    right.huff = '1'
    newNode = node(left.freq + right.freq, left.symbol + right.symbol, left, right)
    heapq.heappush(nodes, newNode)

printNodes(nodes[0])

Output:

Fig 7.2

Advantages:

1. Optimal Compression: For a given set of character frequencies, Huffman coding produces a prefix
code with the minimum total encoded length among codes that assign a whole number of bits to each
character.

2. Lossless Compression: Unlike lossy compression methods (e.g., JPEG, MP3), Huffman
coding ensures that no data is lost during encoding and decoding.

3. Widely Used: Huffman coding is part of the DEFLATE algorithm, which is used by file compression
formats such as ZIP and GZIP and by the PNG image format.

4. Prefix-Free Property: Huffman codes are prefix codes, meaning no code is a prefix of
another, ensuring unique decodability.
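The size of the saving can be worked out for the sample frequencies used in this experiment. The sketch below is an illustration only: the code table is an assumption, written out by hand from a trace of the merge order for those frequencies, and the comparison baseline is a fixed-length code with just enough bits to distinguish the six characters.

```python
# Sample frequencies from the experiment's program.
freqs = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
# Code table assumed from a hand trace of the Huffman merges for these frequencies.
codes = {"f": "0", "c": "100", "d": "101", "a": "1100", "b": "1101", "e": "111"}

# Total bits with Huffman codes: sum of frequency * code length.
huffman_bits = sum(freqs[s] * len(codes[s]) for s in freqs)

# Total bits with a fixed-length code: enough bits to number every character.
fixed_width = (len(freqs) - 1).bit_length()
fixed_bits = fixed_width * sum(freqs.values())

print(huffman_bits, fixed_bits)
print(f"saving: {1 - huffman_bits / fixed_bits:.1%}")
```

For these frequencies the Huffman code needs 224 bits against 300 bits for a 3-bit fixed-length code.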

Disadvantages:

1. Requires Two Passes: One pass is needed to determine frequency counts, and another pass to
encode, which can be inefficient in some applications.

2. Not Suitable for Dynamic Data: If character frequencies change frequently, a new Huffman
tree must be built, making it inefficient for real-time streaming.

3. Not Always the Best: If all characters have nearly equal frequencies, the Huffman codes are about
as long as a fixed-length code, so the compression gain is small.
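The third disadvantage can be illustrated by computing only the code lengths. This is a sketch under assumptions: the helper code_lengths is hypothetical and, instead of building a tree, it keeps the list of characters under each heap entry and adds one bit to each of them whenever their entry is merged.

```python
import heapq


def code_lengths(freqs):
    """Return {symbol: Huffman code length} without building an explicit tree."""
    # Each heap entry: (frequency, unique id as tie-breaker, symbols under this node).
    heap = [(f, i, [sym]) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1  # every merge pushes these symbols one level deeper
        heapq.heappush(heap, (f1 + f2, next_id, syms1 + syms2))
        next_id += 1
    return lengths


uniform = code_lengths({"a": 25, "b": 25, "c": 25, "d": 25})
skewed = code_lengths({"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45})
print(uniform)  # equal frequencies: every code has the fixed-length width
print(skewed)   # skewed frequencies: lengths vary with frequency
```

With four equally frequent characters every code is 2 bits long, exactly what a fixed-length code would use, so nothing is gained.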

Time Complexity:

In the bounds below, n is the length of the input text and c is the number of unique characters
(c <= n).

1. Encoding Time Complexity:

*Building the Frequency Table: O(n) (scanning the text)

*Building the Huffman Tree: O(c log c) (c - 1 merges, each using O(log c) min-heap operations)

*Generating Huffman Codes: O(c) node visits (traversing the tree, which has 2c - 1 nodes)

*Encoding the Input: O(n) lookups (replacing characters with their Huffman codes)

*Total Time Complexity: O(n + c log c), which is at most O(n log n) since c <= n

2. Decoding Time Complexity:

*Traversing the Huffman Tree for Each Bit: O(m), where m is the number of bits in the encoded string

*Total Decoding Complexity: O(m)

Overall Time Complexity:

* Huffman Encoding: O(n + c log c), bounded by O(n log n)

* Huffman Decoding: O(m)

Space Complexity:

Frequency Table: O(c) (where c is the number of unique characters)

Priority Queue (Min-Heap): O(c) (stores all unique characters)

Huffman Tree: O(c) (stores tree nodes)

Encoding Dictionary: O(c) (maps characters to their codes)

Encoded Output: O(n) (depends on the input size and compression efficiency)

Total Space Complexity: O(c + n) (where c is typically much smaller than n)
