Abstract
Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes. In 1952 Huffman constructed a code that minimizes the average code length among all prefix codes for known sources. In fact, Huffman codes minimize the average redundancy, defined as the difference between the code length and the entropy of the source. Interestingly enough, no optimal code is known for other popular optimization criteria, such as the maximal redundancy, defined as the maximum of the pointwise redundancy over all source sequences. We first prove that a generalized Shannon code minimizes the maximal redundancy among all prefix codes, and we present an efficient implementation of this optimal code. Then we compute its redundancy precisely for memoryless sources. Finally, we study universal codes for unknown source distributions. We adopt the minimax approach and search for the best code for the worst source. We establish that this minimax redundancy is the sum of the likelihood estimator and the redundancy of the generalized Shannon code computed for the maximum likelihood distribution; this replaces Shtarkov's bound by an exact formula. We also compute the maximal minimax redundancy precisely for a class of memoryless sources. The main findings of this paper are established by techniques that belong to the toolkit of the "analytic analysis of algorithms," such as the theory of distribution of sequences modulo 1 and Fourier series. These methods have already found applications in other problems of information theory, and they constitute the so-called analytic information theory.
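To make the notion of a generalized Shannon code concrete, the sketch below starts from the ordinary Shannon code lengths ⌈log₂(1/p)⌉ and greedily rounds some of them down to ⌊log₂(1/p)⌋, in increasing order of the fractional part of log₂(1/p), as long as the Kraft inequality still holds. This is a simplified illustrative heuristic, not the paper's exact optimal construction; the function names `generalized_shannon_lengths` and `maximal_redundancy` are our own.

```python
import math

def generalized_shannon_lengths(probs):
    """Greedy sketch of a generalized Shannon code (illustrative only):
    begin with lengths ceil(log2(1/p)) and floor the lengths of symbols
    with the smallest fractional part of log2(1/p) while the Kraft sum
    sum(2^-l) stays at most 1, so a prefix code still exists."""
    xs = [math.log2(1.0 / p) for p in probs]
    lengths = [math.ceil(x) for x in xs]
    kraft = sum(2.0 ** -l for l in lengths)
    # symbols ordered by fractional part of log2(1/p), smallest first
    order = sorted(range(len(probs)), key=lambda i: xs[i] - math.floor(xs[i]))
    for i in order:
        lo = math.floor(xs[i])
        if lo < lengths[i]:
            new_kraft = kraft - 2.0 ** -lengths[i] + 2.0 ** -lo
            if new_kraft <= 1.0 + 1e-12:  # Kraft inequality must survive
                kraft = new_kraft
                lengths[i] = lo
    return lengths

def maximal_redundancy(probs, lengths):
    """Maximal pointwise redundancy: max over symbols of l_i + log2(p_i)."""
    return max(l + math.log2(p) for l, p in zip(lengths, probs))

# For probs = [0.5, 0.3, 0.2] the plain Shannon lengths are [1, 2, 3],
# while the greedy variant shortens the last symbol to [1, 2, 2],
# strictly lowering the maximal pointwise redundancy.
```

Flooring the symbols whose log₂(1/p) is closest to an integer first is what helps: those are exactly the symbols where the ceiling overshoots the most, so they dominate the maximum.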
This work was supported by NSF Grant CCR-9804760 and contract 1419991431A from sponsors of CERIAS at Purdue.
References
J. Abrahams, Code and Parse Trees for Lossless Source Encoding, Proc. of Compression and Complexity of SEQUENCE’97, Positano, IEEE Press, 145–171, 1998.
A. Barron, J. Rissanen, and B. Yu, The Minimum Description Length Principle in Coding and Modeling, IEEE Trans. Information Theory, 44, 2743–2760, 1998.
T. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, New York, 1991.
L. Campbell, A Coding Theorem and Rényi’s Entropy, Information and Control, 8, 423–429, 1965.
M. Drmota and R. Tichy, Sequences, Discrepancies, and Applications, Springer Verlag, Berlin Heidelberg, 1997.
D. E. Knuth, Dynamic Huffman Coding, J. Algorithms, 6, 163–180, 1985.
E. Krätzel, Lattice Points, Kluwer, Dordrecht, 1988.
P. Nath, On a Coding Theorem Connected with Rényi’s Entropy, Information and Control, 29, 234–242, 1975.
J. van Leeuwen, On the Construction of Huffman Trees, Proc. ICALP'76, 382–410, 1976.
J. Rissanen, Complexity of Strings in the Class of Markov Sources, IEEE Trans. Information Theory, 30, 526–532, 1984.
J. Rissanen, Fisher Information and Stochastic Complexity, IEEE Trans. Information Theory, 42, 40–47, 1996.
P. Shields, Universal Redundancy Rates Do Not Exist, IEEE Trans. Information Theory, 39, 520–524, 1993.
D. S. Parker, Conditions for Optimality of the Huffman Algorithm, SIAM J. Comput., 9, 470–489, 1980.
Y. Shtarkov, Universal Sequential Coding of Single Messages, Problems of Information Transmission, 23, 175–186, 1987.
W. Szpankowski, On Asymptotics of Certain Recurrences Arising in Universal Coding, Problems of Information Transmission, 34, no. 2, 55–61, 1998.
W. Szpankowski, Asymptotic Redundancy of Huffman (and Other) Block Codes, IEEE Trans. Information Theory, 46, 2434–2443, 2000.
W. Szpankowski, Average Case Analysis of Algorithms on Sequences, Wiley, New York, 2001.
Q. Xie, A. Barron, Minimax Redundancy for the Class of Memoryless Sources, IEEE Trans. Information Theory, 43, 647–657, 1997.
Q. Xie, A. Barron, Asymptotic Minimax Regret for Data Compression, Gambling, and Prediction, IEEE Trans. Information Theory, 46, 431–445, 2000.
© 2002 Springer-Verlag Berlin Heidelberg
Drmota, M., Szpankowski, W. (2002). Generalized Shannon Code Minimizes the Maximal Redundancy. In: Rajsbaum, S. (eds) LATIN 2002: Theoretical Informatics. LATIN 2002. Lecture Notes in Computer Science, vol 2286. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45995-2_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43400-9
Online ISBN: 978-3-540-45995-8
eBook Packages: Springer Book Archive