
In information theory, the entropy of a random variable quantifies the average level of uncertainty or information associated with the variable's potential states or possible outcomes. This measures the expected amount of information needed to describe the state of the variable, considering the distribution of probabilities across all potential states. Given a discrete random variable $X$, which takes values in the set $\mathcal{X}$ and is distributed according to $p\colon \mathcal{X} \to [0, 1]$, the entropy is

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x),$$

where $\sum$ denotes the sum over the variable's possible values.[Note 1] The choice of base for $\log$, the logarithm, varies for different applications. Base 2 gives the unit of bits (or "shannons"), while base e gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys". An equivalent definition of entropy is the expected value of the self-information of a variable.[1]
Figure: Two bits of entropy. In the case of two fair coin tosses, the information entropy in bits is the base-2 logarithm of the number of possible outcomes; with two coins there are four possible outcomes, and two bits of entropy. Generally, information entropy is the average amount of information conveyed by an event, when considering all possible outcomes.

The concept of information entropy was introduced by Claude Shannon in his 1948
paper "A Mathematical Theory of Communication",[2][3] and is also referred to
as Shannon entropy. Shannon's theory defines a data communication system
composed of three elements: a source of data, a communication channel, and a
receiver. The "fundamental problem of communication" – as expressed by Shannon – is
for the receiver to be able to identify what data was generated by the source, based on
the signal it receives through the channel.[2][3] Shannon considered various ways to
encode, compress, and transmit messages from a data source, and proved in
his source coding theorem that the entropy represents an absolute mathematical limit
on how well data from the source can be losslessly compressed onto a perfectly
noiseless channel. Shannon strengthened this result considerably for noisy channels in
his noisy-channel coding theorem.
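To illustrate the source coding limit (a sketch under assumptions, not from the article; it uses a standard Huffman construction, and the helper name `huffman_code_lengths` is chosen here for illustration), the average codeword length of an optimal prefix code can be compared with the entropy of the source:

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Build a Huffman code for the given outcome probabilities and
    return the codeword length assigned to each outcome."""
    # Heap entries: (subtree probability, unique tie-breaker, outcome indices in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, ids1 = heapq.heappop(heap)
        p2, _, ids2 = heapq.heappop(heap)
        for i in ids1 + ids2:
            lengths[i] += 1  # merging two subtrees adds one bit to each of their codewords
        heapq.heappush(heap, (p1 + p2, counter, ids1 + ids2))
        counter += 1
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]
entropy = -sum(p * math.log2(p) for p in probs)
avg_length = sum(p * l for p, l in zip(probs, huffman_code_lengths(probs)))
print(f"source entropy      : {entropy:.3f} bits/symbol")     # 1.750
print(f"Huffman code length : {avg_length:.3f} bits/symbol")  # 1.750, meeting the limit here
# In general an optimal prefix code satisfies H <= average length < H + 1.
```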

Entropy in information theory is directly analogous to the entropy in statistical thermodynamics. The analogy results when the values of the random variable designate the energies of microstates, so that Gibbs's formula for the entropy is formally identical to Shannon's formula. Entropy has relevance to other areas of mathematics such as combinatorics and machine learning. The definition can be derived from a set of axioms establishing that entropy should be a measure of how informative the average outcome of a variable is. For a continuous random variable, the analogous quantity is differential entropy, whose definition generalizes the one above.

Introduction

The core idea of information theory is that the "informational value" of a communicated
message depends on the degree to which the content of the message is surprising. If a
highly likely event occurs, the message carries very little information. On the other hand,
if a highly unlikely event occurs, the message is much more informative. For instance,
the knowledge that some particular number will not be the winning number of a lottery
provides very little information, because any particular chosen number will almost
certainly not win. However, knowledge that a particular number will win a lottery has
high informational value because it communicates the occurrence of a very low
probability event.

The information content, also called the surprisal or self-information, of an event $E$ is a function which increases as the probability $p(E)$ of the event decreases. When $p(E)$ is close to 1, the surprisal of the event is low, but if $p(E)$ is close to 0, the surprisal of the event is high. This relationship is described by the function $\log\left(\frac{1}{p(E)}\right)$, where $\log$ is the logarithm, which gives 0 surprise when the probability of the event is 1.[4] In fact, log is the only function that satisfies a specific set of conditions defined in section § Characterization. Hence, we can define the information, or surprisal, of an event $E$ by $I(E) = -\log(p(E))$, or equivalently $I(E) = \log\left(\frac{1}{p(E)}\right)$. Entropy measures the expected (i.e., average) amount of information conveyed by identifying the outcome of a random trial.[5]: 67 This implies that rolling a die has higher entropy than tossing a coin because each outcome of a die roll has smaller probability ($p = 1/6$) than each outcome of a coin toss ($p = 1/2$).
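As a concrete comparison (an illustrative Python sketch, not from the article; the helper names are assumptions), the self-information of individual outcomes and the resulting entropies of a fair coin and a fair six-sided die can be computed as follows:

```python
import math

def self_information(p, base=2):
    """Surprisal I(E) = -log p(E): less probable events carry more information."""
    return -math.log(p, base)

def entropy(probs, base=2):
    """Expected self-information over all possible outcomes."""
    return sum(p * self_information(p, base) for p in probs if p > 0)

print(self_information(1.0))      # 0.0 bits: a certain event carries no surprise
print(self_information(1 / 6))    # ~2.585 bits: one specific die face is fairly surprising
print(entropy([0.5, 0.5]))        # 1.0 bit for a fair coin toss
print(entropy([1 / 6] * 6))       # ~2.585 bits for a fair die roll
```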

Consider a coin with probability p of landing on heads and probability 1 − p of landing on tails. The maximum surprise occurs when p = 1/2, for which neither outcome is expected over the other. In this case a coin flip has an entropy of one bit. (Similarly, one trit with equiprobable values contains $\log_2 3$ (about 1.58496) bits of information because it can have one of three values.) The minimum surprise occurs when p = 0 or p = 1, when the outcome is known ahead of time, and the entropy is zero bits. When the entropy is zero bits, this is sometimes referred to as unity, where there is no uncertainty at all – no freedom of choice – no information. Other values of p give entropies between zero and one bit.
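This behaviour is captured by the binary entropy function $H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$. A short Python sketch (illustrative; the helper name `binary_entropy` is an assumption) evaluates it across the range of p:

```python
import math

def binary_entropy(p):
    """Entropy in bits of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # the outcome is certain, so there is no uncertainty
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}: H = {binary_entropy(p):.4f} bits")
# Maximum of 1 bit at p = 0.5; every other p gives an entropy between 0 and 1 bit.
```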
