
Data Compression

• Suppose we have a 1,000,000,000 (1G) character data file that
  we wish to include in an email.
• Suppose the file only contains the 26 letters {a, …, z}.
• Suppose each letter c in {a, …, z} occurs with frequency fc.
• Suppose we encode each letter by a binary code.
• If we use a fixed-length code, we need 5 bits for each
  character.
• The resulting message length is 5(fa + fb + … + fz) bits.

• Can we do better?
Huffman Codes

• Most character code systems (ASCII, Unicode) use
  fixed-length encoding
• If frequency data is available and there is a wide
  variety of frequencies, variable-length encoding can
  save 20% to 90% of the space
• Which characters should we assign shorter codes, and
  which characters should get longer codes?
Data Compression: A Smaller Example
• Suppose the file only has 6 letters {a, b, c, d, e, f}
  with frequencies

  letter            a     b     c     d     e     f
  frequency        .45   .13   .12   .16   .09   .05
  fixed length     000   001   010   011   100   101
  variable length  0     101   100   111   1101  1100

• Fixed length: 3 bits/character × 1G characters = 3G = 3,000,000,000 bits
• Variable length: (.45·1 + .13·3 + .12·3 + .16·3 + .09·4 + .05·4) × 1G = 2.24G bits
How to decode?

• At first it is not obvious how decoding
  will happen, but it is possible if we
  use prefix codes
Prefix Codes
• No codeword of one character can be a prefix of the
  longer codeword of another character; for example,
  we could not encode t as 01 and x as 01101, since 01
  is a prefix of 01101
• By using a binary tree representation (letters at the
  leaves, 0 for a left branch and 1 for a right branch),
  we will generate prefix codes, provided all letters are leaves
Prefix Codes
• A message can be decoded uniquely.

• Follow the tree from the root until you reach a leaf,
  output that letter, and then repeat from the root!

• Draw a few more trees and produce the codes!


Some Properties
• Prefix codes allow easy decoding
  – Given a: 0, b: 101, c: 100, d: 111, e: 1101, f: 1100
  – Decode 001011101 going left to right:
    0|01011101 → a, then 0|1011101 → a|a, then 101|1101 → a|a|b,
    then 1101 → a|a|b|e
• An optimal code must correspond to a full binary tree (a tree
  where every internal node has two children)
• For |C| leaves there are |C| − 1 internal nodes
• The number of bits to encode a file is

  B(T) = Σc∈C f(c)·dT(c)

  where f(c) is the frequency of c and dT(c) is the tree depth of
  c, which corresponds to the code length of c
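
To make the decoding step concrete, here is a minimal Python sketch using the code table above; it relies only on the prefix property, so the first codeword that matches is the right one:

    def decode(bits, code):
        # Invert the code table: codeword -> letter.
        inv = {w: ch for ch, w in code.items()}
        out, buf = [], ""
        for b in bits:
            buf += b
            if buf in inv:          # prefix-freeness makes this match unambiguous
                out.append(inv[buf])
                buf = ""
        assert buf == "", "leftover bits: not a valid encoding"
        return "".join(out)

    code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
    print(decode("001011101", code))   # -> "aabe", matching the slide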
Optimal Prefix Coding Problem

• Input: Given a set of n letters (c1, …, cn) with
  frequencies (f1, …, fn).

• Construct a full binary tree T to define a prefix
  code that minimizes the average code length

  Average(T) = Σi=1..n fi · lengthT(ci)
Greedy Algorithms
• Many optimization problems can be solved more
quickly using a greedy approach
– The basic principle is that locally optimal decisions
  may be used to build an optimal solution
– But the greedy approach may not always lead to an
optimal solution overall for all problems
– The key is knowing which problems will work with
this approach and which will not
• We will study
– The problem of generating Huffman codes
Greedy algorithms
• A greedy algorithm always makes the choice that
looks best at the moment
– The hope: a locally optimal choice will lead to a
globally optimal solution
– For some problems, it works
• Greedy algorithms tend to be easier to code
David Huffman’s idea

• Build the tree (code) bottom-up in a greedy fashion
Building the Encoding Tree
(figures omitted: step-by-step construction of the encoding tree)
The Algorithm

• An appropriate data structure is a binary min-heap
• Rebuilding the heap takes O(lg n) time, and n − 1 merges
  (extractions) are made, so the complexity is O(n lg n)
• The encoding is NOT unique; other encodings may
  work just as well, but none will work better
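
A minimal sketch of the algorithm in Python, assuming a {letter: frequency} dict as input and using the standard-library heapq module as the binary min-heap; ties are broken by an insertion counter, so the resulting (still valid and optimal) code table may differ from the one shown on the earlier slide:

    import heapq
    from itertools import count

    def huffman_code(freq):
        # Each heap entry is (frequency, tiebreak, node); a node is either a
        # letter (leaf) or a (left, right) tuple (internal node).
        tiebreak = count()
        heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
            f2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
        _, _, root = heap[0]

        codes = {}
        def walk(node, prefix):
            if isinstance(node, tuple):          # internal node: 0 = left, 1 = right
                walk(node[0], prefix + "0")
                walk(node[1], prefix + "1")
            else:                                # leaf: record the codeword
                codes[node] = prefix or "0"
        walk(root, "")
        return codes

    freq = {"a": 0.45, "b": 0.13, "c": 0.12, "d": 0.16, "e": 0.09, "f": 0.05}
    print(huffman_code(freq))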
Correctness of Huffman’s Algorithm
Lemma A:
Let x and y be the two letters of lowest frequency. Then there is an
optimal tree in which x and y are sibling leaves of maximum depth.

Idea: take any optimal tree T and let a and b be sibling leaves of
maximum depth in T. Swap x with a (giving T'), then y with b (giving T'').
Since each swap does not increase the cost, the resulting tree T'' is
also an optimal tree.
Proof of Lemma A
• Without loss of generality, assume f[a] ≤ f[b] and
  f[x] ≤ f[y]
• The cost difference between T and T’ is
  B(T) − B(T')
    = Σc∈C f(c)·dT(c) − Σc∈C f(c)·dT'(c)
    = f[x]·dT(x) + f[a]·dT(a) − f[x]·dT'(x) − f[a]·dT'(a)
    = f[x]·dT(x) + f[a]·dT(a) − f[x]·dT(a) − f[a]·dT(x)
    = (f[a] − f[x])·(dT(a) − dT(x))
    ≥ 0
By the same argument for the second swap, B(T'') ≤ B(T'), so B(T'') ≤ B(T).
But T is optimal, so B(T) ≤ B(T''). Hence B(T'') = B(T).
Therefore T’’ is an optimal tree in which x and y
appear as sibling leaves of maximum depth
Correctness of Huffman’s Algorithm
Lemma B:
Let z be a new letter with f[z] = f[x] + f[y], and let C' = (C − {x, y}) ∪ {z}.
If T' is an optimal tree for C', then the tree T obtained from T' by replacing
the leaf z with an internal node whose children are x and y is optimal for C.

• Observation: B(T) = B(T') + f[x] + f[y], i.e. B(T') = B(T) − f[x] − f[y]
  – For each c ∈ C − {x, y}: dT(c) = dT'(c), so f[c]·dT(c) = f[c]·dT'(c)
  – dT(x) = dT(y) = dT'(z) + 1
  – f[x]·dT(x) + f[y]·dT(y) = (f[x] + f[y])·(dT'(z) + 1) = f[z]·dT'(z) + (f[x] + f[y])
  – Therefore B(T') = B(T) − f[x] − f[y]

Example (merging the leaves with frequencies 5 and 9 into z:14):
  B(T)  = 45·1 + 12·3 + 13·3 + 5·4 + 9·4 + 16·3
  B(T') = 45·1 + 12·3 + 13·3 + (5+9)·3 + 16·3
        = B(T) − 5 − 9
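
A quick numeric check of the observation in Python, assuming the depths implied by the terms above (the leaves with frequencies 5 and 9 sit at depth 4 in T, and are replaced by z:14 at depth 3 in T'):

    # Leaf frequency -> depth, read off from the two sums on the slide.
    depth_T      = {45: 1, 12: 3, 13: 3, 5: 4, 9: 4, 16: 3}   # original tree T
    depth_Tprime = {45: 1, 12: 3, 13: 3, 14: 3, 16: 3}        # T': 5 and 9 merged into z:14

    B_T      = sum(f * d for f, d in depth_T.items())         # 224
    B_Tprime = sum(f * d for f, d in depth_Tprime.items())    # 210

    assert B_Tprime == B_T - 5 - 9   # B(T') = B(T) - f[x] - f[y]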
Proof of Lemma B
• Prove by contradiction.
• Suppose that T does not represent an optimal prefix code
  for C. Then there exists a tree T'' such that B(T'') < B(T).
• Without loss of generality (by Lemma A), T'' has x and y
  as siblings. Let T''' be the tree T'' with the common
  parent of x and y replaced by a leaf z with frequency
  f[z] = f[x] + f[y]. Then
• B(T''') = B(T'') − f[x] − f[y] < B(T) − f[x] − f[y] = B(T')
  – T''' is better than T', contradicting the
    assumption that T' is an optimal prefix code for C'
