Abstract
Audio files are considerably larger than text files. Large files demand more storage
space and take longer to transmit. File compression is one way to address this
problem, and arithmetic coding is one algorithm that can be used to compress audio
files. The arithmetic coding algorithm replaces a sequence of input symbols with a
single floating-point number greater than or equal to 0 and less than 1. In this study,
compression and decompression are applied to several wave files. The wave file is a
standard audio file format developed by Microsoft and IBM that stores audio using
PCM (Pulse Code Modulation) coding. The wave file compression ratio obtained in
this study was 16.12 percent, with an average compression time of 45.89 seconds and
an average decompression time of 0.32 seconds.
1. INTRODUCTION
Audio (sound) is a physical phenomenon produced by the vibration of an object. It
takes the form of an analog signal whose amplitude varies continuously over time at a
given frequency [3]. Sound waves are stored as digital audio data in files on a
computer system. There are several formats for storing audio files on computer
systems, including wave files.
The wave file is a standard audio file format developed by Microsoft and IBM [10].
Wave files are stored using PCM (Pulse Code Modulation) coding. The wave file is an
uncompressed audio format, so every audio sample is stored in digital form on the
storage medium.
Audio files on computer systems tend to be large, with a size that grows with the
recording time. When data is stored and sent, large files require more storage space
and take a considerable amount of time to transmit
[9]. To overcome this, the files can be compressed. Compression is the process of
encoding information using fewer bits than the original representation [5]. There are
two types of compression: lossless compression and lossy compression [8].
Data compression through encoding seeks to eliminate redundancy in the data by
transforming it so that the result is smaller [7]. Compression encoding can be applied
to various types of data, such as text, image, video, and audio.
The arithmetic coding algorithm is a compression method that replaces a sequence of
input symbols with a single floating-point number [2]. The basic idea of arithmetic
coding is to create a probability line from 0 to 1 and assign each character of the input
an interval on that line based on its probability of occurrence. The higher the
probability of a character, the larger the interval it receives [6]. Once every character
has an interval, encoding is performed to produce an output number. Based on this
description, this study discusses the compression and decompression of audio files
using the arithmetic coding method. The result of this study is the average
compression percentage obtained for several sample wave files.
2. METHODS
2.1. Compression and Decompression
Data compression is a method for reducing the size of data so that it requires less
storage space. Its main purpose is to make storage more efficient or to shorten data
transfer time. Compression is the process of encoding information using fewer bits
than the original representation [5]. The general principle of compression is to reduce
duplication in the data so that less memory is needed to represent it than in the
original digital representation [8].
There are two types of compression: lossless compression and lossy compression [8].
In lossless compression, the original data can be reconstructed exactly from the
compressed data. In lossy compression, some information is discarded during
compression and cannot be recovered [5].
Decompression is the process of returning a compressed file to its original form. The
result of decompression depends on the kind of compression used, namely lossless
compression or lossy compression.
The output of arithmetic coding is a number greater than or equal to 0 and less than 1.
This number can be uniquely decoded to reproduce the sequence of symbols used to
produce that number. To produce the output number, each symbol that will be
After the probability of each character is known, each symbol is assigned a range
within the interval from 0 to 1 whose width equals its probability. The order in which
the segments are laid out is not prescribed; what matters is that the encoder and the
decoder use the same order. The resulting Probability Range table is shown in
Table 2.
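The assignment of ranges described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function name `ranges_from_probabilities` and the probability values are assumptions chosen for the example, and exact fractions are used instead of floating-point numbers so that the interval boundaries are not affected by rounding.

```python
from fractions import Fraction


def ranges_from_probabilities(probabilities):
    """Map each symbol to a half-open range [low, high) inside [0, 1).

    The symbols are laid out in the iteration order of the dict; the
    encoder and decoder only need to agree on that order.
    """
    ranges = {}
    low = Fraction(0)
    for symbol, probability in probabilities.items():
        high = low + probability
        ranges[symbol] = (low, high)
        low = high  # the next symbol starts where this one ends
    if low != 1:
        raise ValueError("probabilities must sum to 1")
    return ranges


# Hypothetical probabilities for four byte values (illustrative only).
probabilities = {
    "00": Fraction(4, 10),
    "1f": Fraction(3, 10),
    "9a": Fraction(1, 10),
    "3e": Fraction(2, 10),
}
ranges = ranges_from_probabilities(probabilities)
for symbol, (low, high) in ranges.items():
    print(symbol, float(low), float(high))
```

Each range has a width equal to the probability of its symbol, and together the ranges cover the interval from 0 to 1 without overlap.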
Next, the encoding process is carried out according to the following steps:
1) Set low = 0.0 (initial condition)
2) Set high = 1.0 (initial condition)
3) While (input symbols remain) do
4) Take the next input symbol.
5) CR = high - low
6) high = low + CR * high_range(symbol)
7) low = low + CR * low_range(symbol)
8) End while
9) Print low
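The encoding steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the range table `RANGES` is hypothetical and is not the Table 2 of this study, and `Fraction` is used in place of floating-point numbers so that the interval never collapses because of rounding when many symbols are encoded.

```python
from fractions import Fraction

# Hypothetical range table: symbol -> (low_range, high_range).
RANGES = {
    "00": (Fraction(0), Fraction(4, 10)),
    "1f": (Fraction(4, 10), Fraction(7, 10)),
    "9a": (Fraction(7, 10), Fraction(8, 10)),
    "3e": (Fraction(8, 10), Fraction(1)),
}


def encode(symbols, ranges=RANGES):
    """Return the low end of the final interval for the given symbols."""
    low, high = Fraction(0), Fraction(1)
    for symbol in symbols:
        cr = high - low  # current range (CR)
        low_range, high_range = ranges[symbol]
        # Both updates use the old value of low, so high is set first.
        high = low + cr * high_range
        low = low + cr * low_range
    return low


code = encode(["00", "3e", "1f"])
print(float(code))
```

With this table, the interval narrows from [0, 1) to [0, 0.4) after 00, to [0.32, 0.4) after 3e, and to [0.352, 0.376) after 1f, so the printed low value is 0.352.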
Based on the steps in the encoding process, the results of sample data encoding are
obtained as in Table 3.
Table 3. Audio Sample Encoding Results
No       Character  Low          High         CR
Initial             0.0          1.0          1.0
1        00         0.0          1.0          1.0
2        3e         0.0          0.4          0.4
3        1f         0.32         0.4          0.08
4        00         0.352        0.376        0.024
5        9a         0.352        0.3616       0.0096
6        00         0.35872      0.35968      0.00096
7        1f         0.35872      0.359104     0.000384
8        9a         0.3588736    0.3589888    0.0001152
9        00         0.35906944   0.35908096   0.00001152
10       3e         0.35906944   0.359074048  0.000004608
Based on the data in Table 3, the final low value is 0.35907332104192. This value
replaces the audio data 00 3e 1f 00 9a 00 1f 9a 00 3e 00 1f 00 3e 1f.
The decoding process is carried out through the following stages:
1) Take the encoded symbol (ES).
2) Do
3) Find the symbol whose range contains ES.
4) Print the symbol.
5) CR = high_range - low_range
6) ES = ES - low_range
7) ES = ES / CR
8) Until all symbols have been decoded
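The decoding stages above can be sketched in Python as follows. This is a sketch under stated assumptions: the range table `RANGES` is hypothetical and must match the one used by the encoder, and the number of symbols to decode is passed in as `count`, since the encoded number alone does not say when to stop.

```python
from fractions import Fraction

# Hypothetical range table shared with the encoder: symbol -> (low, high).
RANGES = {
    "00": (Fraction(0), Fraction(4, 10)),
    "1f": (Fraction(4, 10), Fraction(7, 10)),
    "9a": (Fraction(7, 10), Fraction(8, 10)),
    "3e": (Fraction(8, 10), Fraction(1)),
}


def decode(es, count, ranges=RANGES):
    """Recover `count` symbols from the encoded number `es`."""
    symbols = []
    for _ in range(count):
        # Find the symbol whose range [low_range, high_range) contains ES.
        for symbol, (low_range, high_range) in ranges.items():
            if low_range <= es < high_range:
                break
        else:
            raise ValueError("encoded value is outside [0, 1)")
        symbols.append(symbol)
        cr = high_range - low_range
        # Remove the effect of the decoded symbol and rescale to [0, 1).
        es = (es - low_range) / cr
    return symbols


print(decode(Fraction(352, 1000), 3))
```

For the value 0.352 and three symbols, the loop finds 00 (0.352 lies in [0, 0.4)), rescales to 0.88 and finds 3e, then rescales to 0.4 and finds 1f.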
Wave files allow audio to be recorded at various qualities, such as 8-bit or 16-bit
samples at rates of 11025 Hz, 22050 Hz or 44100 Hz [M. Kaur and S. Kaur].
The quality of the digital audio data in a wave file is determined by the bitrate, the
sample rate, and the number of channels [1].
The bitrate here is the number of bits in each sample, namely 8, 16, 24 or 32 bits [4].
In an 8-bit WAV file every sample takes only 1 byte, whereas a 16-bit sample takes
2 bytes.
The information obtained from the audio file data in Figure 1 is explained below:
1) The first four bytes always contain 52 49 46 46 (hex), which in ASCII are
R = 52, I = 49, F = 46, F = 46, spelling RIFF.
2) The next four bytes, 24 08 00 00, state the chunk size in little-endian byte
order: 0x00000824 = 2084 bytes. This is the file size minus the 8 bytes of
the RIFF identifier and the size field, so the file size is 2092 bytes.
3) The next four bytes, 57 41 56 45, state the file type: 57 = W, 41 = A, 56 = V,
45 = E.
4) The next four bytes, 66 6d 74 20, declare the ID "fmt ": 66 = f, 6d = m,
74 = t and 20 = space.
5) The next four bytes, 10 00 00 00, state the length of the format information:
0x00000010 = 16 bytes.
6) The next four bytes, 01 00 02 00, give the audio format 01 00 = 1 (PCM)
followed by the number of channels 02 00 = 2 (stereo).
7) The next four bytes, 22 56 00 00, state the sample rate in little-endian byte
order: 0x00005622 = 22050 Hz.
8) After the four-byte byte rate field, the two BlockAlign bytes 04 00 (= 4) state
the size in bytes of one full sample. One full sample is a sample that
represents the sample values of all channels at one point in time.
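The header fields listed above can be read with Python's standard `struct` module, as in the following sketch. The function name `parse_wav_header` is an assumption made for this example, and it assumes the canonical layout in which the "fmt " chunk directly follows the RIFF header. The byte rate (88200) and bits per sample (16) in the test header are not given in the text and are assumed values consistent with a 22050 Hz stereo file with a BlockAlign of 4.

```python
import struct


def parse_wav_header(data: bytes) -> dict:
    """Decode the first 36 bytes of a canonical PCM wave file."""
    # "<" selects little-endian; 4s = 4 raw bytes, I = uint32, H = uint16.
    riff_id, chunk_size, wave_id = struct.unpack_from("<4sI4s", data, 0)
    fmt_id, fmt_size = struct.unpack_from("<4sI", data, 12)
    (audio_format, channels, sample_rate,
     byte_rate, block_align, bits_per_sample) = struct.unpack_from(
        "<HHIIHH", data, 20)
    return {
        "riff_id": riff_id,
        "chunk_size": chunk_size,
        "wave_id": wave_id,
        "fmt_id": fmt_id,
        "fmt_size": fmt_size,
        "audio_format": audio_format,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
    }


header = bytes.fromhex(
    "52 49 46 46 "  # "RIFF"
    "24 08 00 00 "  # chunk size
    "57 41 56 45 "  # "WAVE"
    "66 6d 74 20 "  # "fmt "
    "10 00 00 00 "  # length of the format information
    "01 00 02 00 "  # audio format (PCM), number of channels
    "22 56 00 00 "  # sample rate
    "88 58 01 00 "  # byte rate (assumed value, not from the text)
    "04 00 "        # BlockAlign
    "10 00"         # bits per sample (assumed value, not from the text)
)
info = parse_wav_header(header)
print(info["chunk_size"], info["channels"], info["sample_rate"])
```

Reading the multi-byte fields as little-endian integers is what turns 24 08 00 00 into 2084 and 22 56 00 00 into 22050.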
The audio file data yields sample 1 (marked #1) through sample 23 (marked #23) in the
last block of audio sample data. For example, the right channel values of sample 1 to
be encoded are: 00 00 00 00 24 17 1e 3c 13 3c 14 16 f9 18 f9. From the data to be
encoded, a probability table can be created as in Table 4.
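The probabilities for this sample can be obtained by counting how often each byte value occurs, as in the sketch below. The function name `symbol_probabilities` is an assumption made for this example, and the output is not claimed to reproduce the layout of Table 4.

```python
from collections import Counter
from fractions import Fraction

# Right channel values of sample 1, as listed in the text.
sample = "00 00 00 00 24 17 1e 3c 13 3c 14 16 f9 18 f9".split()


def symbol_probabilities(symbols):
    """Return the relative frequency of each distinct symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return {symbol: Fraction(count, total) for symbol, count in counts.items()}


probabilities = symbol_probabilities(sample)
for symbol, probability in probabilities.items():
    print(symbol, probability)
```

The sample holds 15 values, so 00 (four occurrences) has probability 4/15, 3c and f9 (two occurrences each) have 2/15, and each remaining value has 1/15.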
From this process, the final low value is 0.07459618223090192. It replaces the
encoded audio sample, namely the right channel values 00 00 00 00 24 17 1e 3c 13 3c
14 16 f9 18 f9. Sample 1 is then changed to 0.074.
The low value for the left channel of the audio sample is calculated in the same way.
The table of probability values and audio sample symbols is stored in text format and
serves as the data source for restoring the audio samples in the decoding process. The
results of the compression and decompression tests carried out on several wave files
are presented in Table 7 and Table 8.
4. CONCLUSION
Based on the research results, it can be concluded that the average wave file
compression ratio is 16.12%, with an average compression time of 45.89 seconds and
an average decompression time of 0.32 seconds. Decompression is faster than
compression in this study because the tables of probability values and audio sample
symbols generated during compression are stored in text format and reused as the data
source for restoring the audio samples during decompression. In this study, testing
was carried out on wave files with no limit on file size. It is recommended that this
study be extended to other audio file formats such as mp3, mp4 and others.
5. REFERENCES
[1] Hazem, Kathem Qattous. (2017). Hiding Encrypted Data Into Audio File.
IJCSNS International Journal of Computer Science and Network Security,
17(6), 162-170.
[2] Howard, Paul G & Jeffrey Scott Vitter. (1992). Analysis Of Arithmetic Coding
For Data Compression. Information Processing & Management, 28(6),
749-763.
[3] Iwan, Binanto. (2010). Multimedia Basic Digital Theory & Development.
Yogyakarta: Andi Offset.
[4] Jawahir, Ahmad., & Haviluddin. (2015). An Audio Encryption Using
Transposition Method. International Journal of Advances in Intelligent
Informatics, 1(2), 94-106.
[5] K. Sayood. (1996). Introduction To Data Compression. Virginia Polytechnic
University: Morgan Kaufmann Publishers Inc.
[6] Maan, Anmol Jyot. (2013). Analysis and Comparison of Algorithms For
Lossless Data Compression. International Journal of Information and
Computation Technology, 3(3), 139-146.
[7] Said, Amir. (2004). Comparative Analysis of Arithmetic Coding Computational
Complexity. Imaging Systems Laboratory. HP Laboratories Palo Alto.
[8] Salomon, D. (2012). A Guide to Data Compression Methods. Springer.
[9] Silitonga, Parasian D.P. et al. (2018). Wave File Encryption using Huffman
Compression and Serpent Algorithm. International Journal of Computer
Trends and Technology (IJCTT), 64(1), 2231-2803.
[10] Tamimi, A.A & A. M. Abdalla. (2014). An Audio Shuffle-Encryption
Algorithm. The World Congress on Engineering and Computer Science 2014
WCECS. San Francisco.