Multimedia Information Systems (MMIS)

Habtamu A. (@Habtamu Ararsie)
InSy4122
habtamuararsie@mtu.edu.et

Chapter 3

Multimedia Data Compression
The Need for Compression

⚫ Take, for example, a video signal with a resolution of
320x240 pixels, 256 colors (8 bits per pixel), and 30 frames
per second.
– Raw bit rate = 320 x 240 x 8 x 30
= 18,432,000 bits per second
= 2,304,000 bytes ≈ 2.3 MB per second
– A 90-minute movie would take 2.3 MB x 60 x 90 ≈ 12.44 GB
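The same arithmetic as a quick Python sketch (the variable names are ours, for illustration only):

width, height = 320, 240            # pixels per frame
bits_per_pixel = 8                  # 256 colors
fps = 30                            # frames per second

bits_per_second = width * height * bits_per_pixel * fps
bytes_per_second = bits_per_second // 8
movie_bytes = bytes_per_second * 60 * 90    # 90-minute movie

print(bits_per_second)              # 18432000 bits
print(bytes_per_second)             # 2304000 bytes (~2.3 MB) per second
print(movie_bytes / 10**9)          # ~12.44 GB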
Multimedia Data Compression

⚫ Data compression is about finding ways to reduce the
number of bits or bytes used to store or transmit the
content of multimedia data.
– It is the process of encoding information using fewer
bits
– For example, the ZIP file format, which provides
compression, also acts as an archive, storing many
source files in a single destination output file.



General data compression scheme

⚫ Compression helps reduce the consumption of expensive
resources, such as hard disk space or transmission
bandwidth.



Trade-offs in Data Compression

⚫ The degree of compression
– To what extent should the data be compressed?
⚫ The amount of distortion introduced
– To what extent can quality loss be tolerated?
⚫ The computational resources required to
compress and decompress the data.
– Do we have enough memory for compressing
and decompressing the data?



Types of Compression

⚫ Lossless Compression
– Lossless compression recovers the exact original data
after decompression.
– It is used mainly for compressing database records,
spreadsheets, texts, executable programs, etc., where exact
replication of the original data is essential & changing even
a single bit cannot be tolerated.
– Examples: Run Length Encoding, Lempel Ziv (LZ), Huffman
Coding.



Cont'd…

⚫ Lossless compression is possible because most real-
world data has statistical redundancy.
⚫ It packs data into a smaller file size by using a kind
of internal shorthand to signify redundant data.
⚫ This technique can typically reduce a file's size by
about half.
⚫ WinZip uses lossless compression. For this reason,
zip software is popular for compressing program &
data files.
Cont'd…

⚫ Lossless compression has advantages and disadvantages.
– The advantage: a compressed file is decompressed to an
exact duplicate of the original file, mirroring its quality.
– The disadvantage: the compression ratio is not all that
high, precisely because no data is lost.



Cont'd…

⚫ Lossy Compression
– Results in a certain loss of accuracy in exchange for a substantial
increase in compression.
– For visual & audio data, some loss of quality can be tolerated
without losing the essential nature of the data, as long as the
losses fall outside visual or aural perception.
⚫ By taking advantage of the limitations of the human sensory system, a
great deal of space can be saved while producing an output which is
nearly indistinguishable from the original.
⚫ In audio compression, for instance, non-audible (or less audible)
components of the signal are removed.

Cont'd…

⚫ Lossy compression is used for:
– Image compression in digital cameras, to increase storage
capacities with minimal degradation of picture quality
– Audio compression for Internet telephony & CD ripping,
which is decoded by audio players.
– Video compression in DVDs with MPEG format.
⚫ To get a higher compression ratio (i.e., to reduce a file
significantly beyond 50%), you must use lossy compression.
Cont'd…

⚫ Lossy Compression
– A sound file in WAV format, converted to an MP3 file,
will lose much data.
– MP3 employs lossy compression, resulting in a much
smaller file, so that several dozen MP3 files can fit on a
single storage device vs. a handful of WAV files.
– However, the sound quality of the MP3 file will be
slightly lower than that of the original WAV.
Types of Compression

⚫ Lossless: M → compress → m → uncompress → M
(the original data M is recovered exactly)
⚫ Lossy: M → compress → m → uncompress → M'
(only an approximation M' of the original is recovered)
– where M is the multimedia data and m is the compressed
form that is stored or transmitted.
How do you change “MMIS” into bits?

[ASCII code table]

[Extended ASCII code table]
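A minimal Python sketch of that mapping, using the built-in ord() to look up each character's ASCII code:

text = "MMIS"
bits = " ".join(format(ord(ch), "08b") for ch in text)   # 8-bit binary per character
print(bits)   # 01001101 01001101 01001001 01010011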
👉 Long Story Short
⚫ Differences
– Lossless compression schemes are reversible,
so that the original data can be reconstructed;
– Lossy schemes accept some loss of data in
order to achieve a higher degree of compression.



Common compression methods

⚫ Repetitive sequence suppression:
– Simple Repetition Suppression
⚫ If a series of n successive identical tokens appears in a sequence,
we can replace it with one token and a count of the number of
occurrences.
⚫ We usually need a special flag to denote when the repeated
token appears.
– Suppression of zeros in a file (Zero Length Suppression)
⚫ Example (see the sketch below):
89400000000000000000000000000000000
– we can replace with
894f32
– where f is the flag for zero.
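A minimal sketch of zero suppression in Python, assuming (as in the example above) that the flag character 'f' never occurs in the raw data:

import re

def suppress_zeros(s, flag="f"):
    # Replace each run of two or more zeros with: flag + run length.
    return re.sub(r"0{2,}", lambda m: flag + str(len(m.group())), s)

print(suppress_zeros("894" + "0" * 32))   # 894f32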
Common compression methods

⚫ Repetitive sequence suppression:
– Run Length Encoding (RLE) compression technique
⚫ In run-length encoding, large runs of consecutive identical data
values are replaced by a simple code with the data value and the
length of the run, i.e. (dataValue, LengthOfTheRun)
⚫ This encoding scheme tallies each data value (Xi)
along with its run length, i.e. (Xi, Length_of_Xi)
⚫ For example:
– Original Sequence:
111122233333311112222
– can be encoded as:
(1,4),(2,3),(3,6),(1,4),(2,4)
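A minimal run-length encoder in Python (the function name rle_encode is ours), producing the (dataValue, LengthOfTheRun) pairs described above:

from itertools import groupby

def rle_encode(seq):
    # One (value, run length) pair per maximal run of identical values.
    return [(value, len(list(run))) for value, run in groupby(seq)]

print(rle_encode("111122233333311112222"))
# [('1', 4), ('2', 3), ('3', 6), ('1', 4), ('2', 4)]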
Common compression methods

⚫ Repetitive sequence suppression:
– RLE compression technique
⚫ This method is useful on data that contains many such runs.
It is not recommended for files that don't have many runs,
as it could potentially double the file size.



Common compression methods
⚫ Statistical methods:
– Require prior information about the occurrence of
symbols
– Estimate probabilities of symbols,
⚫ Code one symbol at a time,
⚫ Shorter codes for symbols with high probabilities
– Require two passes:
⚫ One pass to compute probabilities (or frequencies) and
determine the mapping,
⚫ A second pass to encode.
– E.g., Huffman coding and Shannon-Fano coding



Common compression methods
⚫ Dictionary-based coding
– Does not require prior information to compress strings.
– Doesn't require a first pass over the data to calculate a
probability model.
– All of the adaptive methods are one-pass methods; only
one scan of the message is required.
– Rather, symbols are replaced with pointers to dictionary
entries.
– Examples: Lempel-Ziv (LZ) & Adaptive Huffman
Coding compression techniques
Compression Model
⚫ Almost all data compression methods involve the
use of a model, a prediction of the composition of
the data.
– When the data matches the prediction made by the
model, the encoder can usually transmit the content of
the data at a lower information cost, by making reference
to the model.
– In most methods the model is separate, and because both
the encoder and the decoder need to use the model, it
must be transmitted with the data.



Compression Model
⚫ In dictionary coding, the encoder and decoder are
instead equipped with identical rules about how
they will alter their models in response to the actual
content of the data
– Both start with a blank slate, meaning that no initial model
needs to be transmitted.
⚫ As the data is transmitted, both encoder and
decoder adapt their models, so that unless the
character of the data changes radically, the model
becomes better-adapted to the data it's handling and
compresses it more efficiently.
Pattern substitution

⚫ This is a simple form of statistical encoding
– Substitute frequently repeating patterns with a short code.
– For example: replace all occurrences of 'The' with the
code '&'
⚫ Typically, codes are assigned according to the frequency
of occurrence of patterns (see the sketch below):
– Count occurrences of tokens
– Sort in descending order
– Assign short codes to the highest-count tokens
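A minimal sketch of those three steps in Python, reusing the slide's 'The' → '&' example (the sample sentence is made up for illustration):

from collections import Counter

text = "The cat sat on The mat near The door"
counts = Counter(text.split())            # count occurrences of tokens
pattern, _ = counts.most_common(1)[0]     # highest-count token ("The")
print(text.replace(pattern, "&"))         # & cat sat on & mat near & door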
Huffman coding

⚫ Developed in the 1950s by David Huffman; widely
used for text compression, multimedia codecs and
message transmission
⚫ Given a set of n symbols and their weights (or
frequencies), construct a tree structure (a binary
tree for a binary code) with the objective of reducing
memory space and decoding time per symbol.
⚫ For instance, Huffman coding is constructed based
on the frequency of occurrence of letters in text
documents.
Huffman coding

⚫ Fixed length HC
⚫ Variable length HC

[Binary tree diagram: 0 labels the left branch, 1 the right branch]
Resulting codes: D1 = 000, D2 = 001, D3 = 01, D4 = 1
Huffman coding

⚫ The model could determine the raw probability of each
symbol occurring anywhere in the input stream:

pi = (# of occurrences of Si) / (total # of symbols)


Variable length Huffman coding

⚫ The output of the Huffman encoder is determined
by the model (probabilities).
– The higher the probability of occurrence of a symbol,
the shorter the code assigned to that symbol, and vice
versa.
– This makes the most frequently occurring symbols in the
data the cheapest to encode and also reduces the time
taken to decode each symbol.



How to construct Variable length Huffman coding

⚫ Step 1: Create a forest of single-node trees, one for each
symbol: t1, t2, …, tn
⚫ Step 2: Sort the forest of trees according to increasing
probabilities of symbol occurrence
⚫ Step 3: WHILE more than one tree exists DO
– Merge the two trees t1 and t2 with the least probabilities
p1 and p2
– Label their root with the sum p1 + p2
– Associate binary code: 1 with the right branch and 0 with
the left branch
How to construct Huffman coding

⚫ Step 4: Create a unique codeword for each symbol
by traversing the tree from the root to the leaf.
– Concatenate all encountered 0s and 1s together during
the traversal
⚫ The resulting tree has a probability of 1 at its root and
the symbols at its leaf nodes (a minimal sketch follows).
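A minimal Python sketch of steps 1-4, using a heap to repeatedly pick the two least-probable trees (huffman_codes is our own name; with tied frequencies the exact codes may differ from a hand-built tree, though the total cost stays optimal):

import heapq

def huffman_codes(freqs):
    # Steps 1 & 2: a forest of single-symbol trees, ordered by weight.
    # Each heap entry is (weight, tiebreaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:                       # Step 3: merge two least trees
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, code):                      # Step 4: root-to-leaf traversal
        if isinstance(tree, tuple):
            walk(tree[0], code + "0")          # 0 on the left branch
            walk(tree[1], code + "1")          # 1 on the right branch
        else:
            codes[tree] = code or "0"
    walk(heap[0][2], "")
    return codes

# Frequencies from the VWVWWXYZYZXYZYZYXZYW example on the next slide:
print(huffman_codes({"V": 2, "W": 4, "X": 3, "Y": 6, "Z": 5}))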



Example

⚫ Consider the following string and the resulting frequency
table to construct the Huffman coding:
VWVWWXYZYZXYZYZYXZYW

Symbol   Frequency
V        2
W        4
X        3
Y        6
Z        5

⚫ The Huffman encoding algorithm picks, each time, the two
symbols with the smallest frequencies to combine.
Word-level Exercise

⚫ Given the texts:
– ABRACADABRA
– MISSISSIPPI
construct the variable-length Huffman coding.



The Shannon-Fano Encoding Algorithm
1. Calculate the frequency of each of the symbols in
the list.
2. Sort the list in (decreasing) order of frequencies.
3. Divide the list into two halves, with the total
frequency counts of each half being as close as
possible to each other.
4. The right half is assigned a code of 1 and the left
half a code of 0.



The Shannon-Fano Encoding Algorithm

5. Recursively apply steps 3 and 4 to each of the
halves, until each symbol has become a
corresponding code leaf on the tree.
✓ That is, treat each split as a list and apply splitting and
code assignment until you are left with lists of single
elements.
6. Generate the codeword for each symbol (a minimal
sketch follows).
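A minimal recursive Python sketch of steps 3-6 (shannon_fano is our own name; the input is assumed already sorted by decreasing frequency, i.e. steps 1 and 2 are done):

def shannon_fano(symbols):
    # symbols: list of (symbol, frequency) pairs, sorted by decreasing frequency.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total, running = sum(f for _, f in symbols), 0
    best_split, best_diff = 1, total
    for i in range(len(symbols) - 1):          # Step 3: halves as close as possible
        running += symbols[i][1]
        diff = abs(total - 2 * running)
        if diff < best_diff:
            best_diff, best_split = diff, i + 1
    codes = {}
    for sym, code in shannon_fano(symbols[:best_split]).items():
        codes[sym] = "0" + code                # Step 4: left half gets 0
    for sym, code in shannon_fano(symbols[best_split:]).items():
        codes[sym] = "1" + code                # ... and the right half gets 1
    return codes                               # Steps 5 & 6 happen via recursion

# The A..E example on the next slide:
print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}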



Examples

⚫ Example: Given five symbols A to E with their
frequencies being 15, 7, 6, 6 & 5, encode them
using Shannon-Fano encoding.
⚫ Solution:

Symbol      A    B    C    D    E
Count       15   7    6    6    5
1st split   0    0    1    1    1
2nd split   0    1    0    1    1
3rd split                  0    1
Cont'd…

Symbol   Count   Code   Number of Bits
A        15      00     30
B        7       01     14
C        6       10     12
D        6       110    18
E        5       111    15
Total                   89



Exercise

⚫ Given the following symbols and their corresponding
frequencies of occurrence, find an optimal binary code
for compression:

Character:  a   b   c   d   e   t
Frequency:  16  5   12  17  10  25

A. Using the Shannon-Fano coding scheme
B. Using the Huffman algorithm/coding
Lempel-Ziv-Welch (LZW) compression

⚫ Does not rely on prior knowledge about the data
⚫ Rather, builds this knowledge in the course of data
transmission/data storage
⚫ The Lempel-Ziv algorithm uses a table of codewords
created during data transmission;
– Each time, it replaces strings of characters with a reference
to a previous occurrence of the string.
⚫ The multi-symbol patterns are of the form: C0C1 . . .
Cn-1 Cn.
⚫ The prefix of a pattern consists of all the pattern
symbols except the last: C0C1 . . . Cn-1
LZW compression

⚫ Output: there are three options when assigning a code
to each pattern in the list
– If a one-symbol pattern is not in the dictionary, assign
(0, symbol)
– If a multi-symbol pattern is not in the dictionary, assign
(dictionaryPrefixIndex, lastPatternSymbol)
– If the last input symbol or the last pattern is in the
dictionary, assign (dictionaryPrefixIndex)



LZW compression
⚫ Encode (i.e., compress) the string
ABBCBCABABCAABCAAB using the LZ algorithm.

⚫ The compressed message is:
(0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)
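A minimal Python sketch of this parsing (lz_encode is our own name), reproducing the codewords above:

def lz_encode(message):
    dictionary = {}                  # pattern -> index (indices start at 1)
    output, pattern = [], ""
    for ch in message:
        if pattern + ch in dictionary:
            pattern += ch            # keep extending a known pattern
        else:                        # emit (dictionaryPrefixIndex, lastPatternSymbol)
            output.append((dictionary.get(pattern, 0), ch))
            dictionary[pattern + ch] = len(dictionary) + 1
            pattern = ""
    if pattern:                      # message ended on a known pattern:
        output.append((dictionary[pattern],))   # emit (dictionaryPrefixIndex) alone
    return output

print(lz_encode("ABBCBCABABCAABCAAB"))
# [(0, 'A'), (0, 'B'), (2, 'C'), (3, 'A'), (2, 'A'), (4, 'A'), (6, 'B')]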
Example: Compute Number of bits transmitted

⚫ Consider the string ABBCBCABABCAABCAAB
given in the example (previous slide) and compute
the number of bits transmitted:
– Number of bits = total no. of characters * 8 = 18 * 8 = 144
bits
⚫ The compressed string consists of codewords and the
corresponding codeword indices as shown below:
– Codeword: (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)
– Codeword index: 1 2 3 4 5 6 7
– The actual compressed message is: 0A 0B 10C 11A 10A
100A 110B, where each character is replaced by its binary
8-bit ASCII code. (See the next slide.)
Example: Compute Number of bits transmitted

⚫ Each codeword consists of a character and an integer:
– The character is represented by 8 bits.
– The integer part (the prefix index p) is written in its
natural binary form, so the number of bits n it requires is
n = ⌊log2 p⌋ + 1 for p ≥ 1, and n = 1 for p = 0.

CW:    (0, A)  (0, B)  (2, C)  (3, A)  (2, A)  (4, A)  (6, B)
Index:   1       2       3       4       5       6       7
Bits:  (1 + 8) + (1 + 8) + (2 + 8) + (2 + 8) + (2 + 8) + (3 + 8) + (3 + 8)
     = 70 bits
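A quick check of that 70-bit total in Python (int.bit_length gives the natural binary length; a prefix of 0 still costs one bit):

codewords = [(0, "A"), (0, "B"), (2, "C"), (3, "A"), (2, "A"), (4, "A"), (6, "B")]
total = sum(max(1, prefix.bit_length()) + 8 for prefix, _ in codewords)
print(total)   # 70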

⚫ How could you further reduce the file size using your
own codeword??
Example: Decompression

⚫ Decode (i.e., decompress) the sequence (0, A) (0, B)
(2, C) (3, A) (2, A) (4, A) (6, B)

⚫ The decompressed message is:
ABBCBCABABCAABCAAB
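A minimal decoder matching the encoder sketched earlier (lz_decode is our own name): rebuild the dictionary in the same order the encoder created it.

def lz_decode(codewords):
    dictionary = [""]                      # index 0 = the empty prefix
    message = []
    for cw in codewords:
        if len(cw) == 2:                   # (prefixIndex, symbol) form
            pattern = dictionary[cw[0]] + cw[1]
        else:                              # (prefixIndex,) form: pattern already known
            pattern = dictionary[cw[0]]
        dictionary.append(pattern)
        message.append(pattern)
    return "".join(message)

print(lz_decode([(0, "A"), (0, "B"), (2, "C"), (3, "A"), (2, "A"), (4, "A"), (6, "B")]))
# ABBCBCABABCAABCAAB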
Exercise

⚫ Encode (i.e., compress) the following strings using
the Lempel-Ziv algorithm.

– YYZYZZZYZYYZYZZZYZZYZZ
– SATATASACITASA.



Announcement!!

⚫ Due date for your individual assignment
⚫ Test 1 (Chap 1 & Chap 2)
⚫ Group project
– Submission: April 4, 2015 E.C.
– Presentation: April 5, 2015 E.C.
– Send me your group arrangement via Telegram
