
DATA COMPRESSION

EEE 427
Introduction
• Data compression refers to reducing the number of bits that need to
be transmitted for exchanging a given volume of information.
• Data compression techniques are also used for reducing the data
storage requirements.
• Data compression is the process of encoding information using fewer
bits than the original representation. It aims to reduce the size of data
files, enhancing storage efficiency and speeding up data transmission.
• Virtually all forms of data contain redundancy, i.e. the actual information
content is less than what the data representation is capable of carrying.
• Compression can be either lossless or lossy.
Lossy Compression
• Lossy compression reduces bits by removing unnecessary or
perceptually less important information, so the original data cannot be
reconstructed exactly.
• It is commonly used for multimedia data such as images,
audio, and video.
• Techniques include:
✓ Discrete Cosine Transform (DCT): Used in JPEG for
images and MPEG for video (a short illustrative sketch follows this list)
✓ Psychoacoustic Models: Used in audio compression
formats like MP3 and AAC to remove inaudible components
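• As a rough illustration of the transform-then-quantize idea behind DCT-based
lossy compression, the Python sketch below transforms one small block,
quantizes the coefficients (the step where information is discarded), and
reconstructs an approximation. It assumes NumPy and SciPy are available; the
sample values and quantization step are illustrative, and real JPEG/MPEG
codecs operate on 8×8 blocks with standardized quantization tables followed
by entropy coding.

# Minimal sketch of DCT-based lossy compression on one 1-D sample block.
# Assumes NumPy and SciPy; sample values and the quantization step are
# illustrative, not taken from any standard.
import numpy as np
from scipy.fft import dct, idct

block = np.array([52, 55, 61, 66, 70, 61, 64, 73], dtype=float)

coeffs = dct(block, norm='ortho')       # forward DCT (frequency domain)
step = 10.0                             # larger step -> coarser, lossier
quantized = np.round(coeffs / step)     # quantization discards fine detail
reconstructed = idct(quantized * step, norm='ortho')

print("original:     ", block)
print("reconstructed:", np.round(reconstructed, 1))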
Lossless Compression
• Lossless compression algorithms exploit statistical redundancy to
represent data without losing any information, making the process
reversible.
• Common techniques include:
➢ Run-Length Encoding (RLE): Encodes sequences of the same data value
as a single data value and count (a short coded sketch follows this list).
➢ Lempel-Ziv (LZ) Methods: Use a table-based compression model where
table entries are substituted for repeated strings of data. Examples
include LZW, used in GIF images, and DEFLATE, used in PNG files.
➢ Huffman Coding: Uses variable-length codes for encoding symbols
based on their frequencies.
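• A minimal Python sketch of Run-Length Encoding is given below; the function
names (rle_encode, rle_decode) are illustrative and not part of any library.

# Run-Length Encoding: runs of identical values become (value, count) pairs.
def rle_encode(data):
    encoded = []
    i = 0
    while i < len(data):
        count = 1
        while i + count < len(data) and data[i + count] == data[i]:
            count += 1
        encoded.append((data[i], count))
        i += count
    return encoded

def rle_decode(pairs):
    return "".join(value * count for value, count in pairs)

print(rle_encode("AAAABBBCCD"))              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(rle_encode("AAAABBBCCD")))  # AAAABBBCCD (fully reversible)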
Information theory
• Typically one thinks of information as having to do with knowledge.
Gaining information signifies acquiring knowledge that was not there
earlier.
• Gaining information is equivalent to reducing uncertainty.
• If the outcome of an event is known with certainty (i.e. its probability
equals unity), its occurrence does not add to the knowledge.
• On the other hand, if the outcome is uncertain, its occurrence adds to
the knowledge.
• Therefore, the amount of information I(x) varies inversely with the
probability P(x) of occurrence of event x.
• Informally, I(x) grows as 1/P(x) grows; the logarithmic measure below
makes this precise.
Logarithmic Measure of Information
• I(x_i) = log_b [1 / P(x_i)] = −log_b P(x_i)

• Where P(x_i) is the probability of occurrence of symbol x_i
• I(x_i) is the information content of symbol x_i
• The unit of I(x_i) is the bit if b = 2 (binary unit), the Hartley or decit
if b = 10, and the nat (natural unit) if b = e.
• The standard choice is b = 2.
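• For example, a symbol that occurs with probability P(x_i) = 1/8 carries
I(x_i) = log_2 8 = 3 bits of information, whereas a symbol with probability
1/2 carries only log_2 2 = 1 bit.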
Entropy
• The average information per message, H, is termed the entropy of the source.
The entropy of a source serves a very important purpose in the binary
encoding process.
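• For a source emitting symbols x_1, …, x_n with probabilities P(x_i), the
entropy is H = Σ_i P(x_i) I(x_i) = −Σ_i P(x_i) log_2 P(x_i) bits/symbol.
• As a quick illustration, the short Python sketch below computes H for an
example source (the probability values are illustrative, not taken from
these notes):

import math

# Entropy of a discrete source: H = -sum(p * log2(p)), in bits/symbol.
# The probability values below are illustrative only.
probabilities = [0.5, 0.25, 0.125, 0.125]

H = -sum(p * math.log2(p) for p in probabilities)
print(f"H = {H} bits/symbol")   # prints H = 1.75 bits/symbol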
Encoding for Compression
• Information, entropy, and the average code length (defined at the end of
this section) provide insights into the design and evaluation of data
compression methods.
• Two classical source-coding methods are considered here:
• Shannon-Fano coding and
• Huffman coding.
• Huffman coding is optimal among symbol-by-symbol prefix codes: for a
given set of symbol probabilities it achieves the minimum possible
average code length, so it provides a benchmark against which other
data compression methods can be compared.
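• If symbol x_i is assigned a codeword of length l_i bits, the average code
length is L = Σ_i P(x_i) l_i bits/symbol. For any uniquely decodable code
L ≥ H, and the coding efficiency can be expressed as η = H / L; a good
compression code makes L as close to H as possible.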
Huffman coding
• The Huffman coding procedure is as follows:
• List the source symbols in order of decreasing probability.
• Combine the probabilities of the two symbols having the lowest probabilities,
and reorder the resulting probabilities; this step is called reduction 1. The
same procedure is repeated until only two ordered probabilities remain.
• Start the encoding with the last reduction, which consists of exactly two
ordered probabilities. Assign 0 as the first digit in the codewords for all
the source symbols associated with the first probability; assign 1 to the
second probability.
• Now assign 0 and 1 as the second digit for the two probabilities that were
combined in the previous reduction step, retaining all assignments made in
the previous step.
• Keep working backwards in this way until the first column (the original list
of source symbols) is reached. A coded sketch of this procedure follows.
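• The following is a compact Python sketch of Huffman code construction. It
uses a priority queue to repeatedly merge the two lowest probabilities,
which mirrors the reduction steps described above; the symbol probabilities
are illustrative, and the 0/1 labelling may differ from the hand procedure,
but the resulting code lengths are optimal either way.

# Minimal Huffman code construction with a priority queue (heapq).
import heapq

def huffman_code(probabilities):
    # Each heap entry: (probability, unique tie-breaker, {symbol: codeword so far}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # lowest probability
        p2, _, codes2 = heapq.heappop(heap)   # second lowest probability
        # Prepend 0 to one group and 1 to the other, then merge the groups.
        merged = {sym: "0" + code for sym, code in codes1.items()}
        merged.update({sym: "1" + code for sym, code in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}   # illustrative probabilities
codes = huffman_code(probs)
print(codes)   # e.g. {'A': '0', 'B': '10', 'D': '110', 'C': '111'}
avg_len = sum(probs[s] * len(c) for s, c in codes.items())
print("average code length:", avg_len, "bits/symbol")   # ≈ 1.9; source entropy ≈ 1.85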
