Honors Unit 1
Claude Shannon's foundational work in Information Theory, primarily laid out in his seminal 1948 paper "A Mathematical
Theory of Communication," established a rigorous mathematical framework for understanding the transmission and
processing of information. Here are the key concepts and contributions of Shannon’s Information Theory:
1. Information Measure:
Entropy: Shannon introduced the concept of entropy as a measure of the uncertainty or randomness of a source of
information. The entropy H(X) of a discrete random variable X with possible values {x_1, x_2, …, x_n} and
probability mass function P(X) is defined as:
H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)
Entropy quantifies the average amount of information produced by a stochastic source of data.
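As a quick illustration (a minimal Python sketch, not part of Shannon's paper), entropy can be computed directly from a list of probabilities; a fair coin yields 1 bit, a biased coin less:

import math

def entropy(probabilities, base=2):
    # Shannon entropy H(X) = -sum p*log(p); zero-probability outcomes contribute nothing.
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit
print(entropy([0.9, 0.1]))   # biased coin: about 0.469 bits
print(entropy([0.25] * 4))   # uniform over four symbols: 2.0 bits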
2. Data Compression:
Source Coding Theorem: Shannon proved that the entropy H(X) of a source is the theoretical limit on the
minimum average number of bits per symbol achievable by any lossless encoding. This means data cannot be compressed
below its entropy without losing information.
Shannon-Fano Coding: A precursor to Huffman coding, it is a method of constructing a prefix code based on a set of
symbols and their probabilities.
3. Channel Capacity:
Channel Capacity C: Shannon defined the maximum rate at which information can be reliably transmitted over a
communication channel. The channel capacity is given by:
C = \max_{P(X)} I(X;Y)
where I(X;Y) is the mutual information between the input X and the output Y of the channel.
Noisy-Channel Coding Theorem: This theorem states that reliable communication is possible at any rate less than the
channel capacity C. This involves using error-correcting codes to mitigate the effect of noise.
4. Mutual Information:
Mutual Information I(X;Y): It measures the amount of information that one random variable X contains about
another random variable Y. It is defined as:
I(X;Y) = H(X) + H(Y) - H(X,Y)
Mutual information is used to determine the efficiency and effectiveness of communication systems.
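As a concrete (illustrative) check of the identity above, the Python sketch below computes I(X;Y) from a small, made-up joint distribution:

import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution P(X, Y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
py = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

mi = H(px.values()) + H(py.values()) - H(joint.values())   # I(X;Y) = H(X) + H(Y) - H(X,Y)
print(round(mi, 4))   # about 0.2781 bits for this table

A value of zero would indicate independent variables; the closer I(X;Y) gets to min(H(X), H(Y)), the more Y reveals about X.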
5. Redundancy and Efficiency:
Shannon showed how redundancy in a message can be exploited to detect and correct errors in transmission. This is
critical for designing robust communication systems.
6. Practical Implications:
Shannon's theories laid the groundwork for modern digital communication and data compression technologies.
Techniques such as Huffman coding, the compression behind JPEG and MP3, and modern telecommunication systems all build on these principles.
Significance and Impact:
Shannon's work transformed communication theory by introducing a quantitative approach to information transfer and
processing. It provided a foundation for various fields including telecommunications, computer science, cryptography, and
data science. Shannon is often referred to as the "father of information theory" due to these groundbreaking contributions.
Shannon's theories continue to be relevant, influencing developments in wireless communication, network theory, data
storage, and more, proving the timelessness of his foundational concepts in Information Theory.
Random variables play a critical role in information theory, which is fundamental to many aspects of cybersecurity. Here are
key concepts involving random variables in information theory and how they apply to cybersecurity:
1. Entropy and Uncertainty:
Entropy (H(X)): Measures the uncertainty or unpredictability of a random variable X. In cybersecurity, entropy is
used to assess the strength of cryptographic keys and passwords. High entropy indicates a higher level of randomness,
making it harder for attackers to predict or brute-force the key or password.
H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)
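As a rough worked example (not from the original notes): a 12-character password drawn uniformly at random from the 94 printable ASCII symbols has entropy of about 12 × log2(94) ≈ 12 × 6.55 ≈ 78.7 bits, whereas a password chosen from a list of 10,000 common words has only log2(10000) ≈ 13.3 bits and is trivial to brute-force.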
2. Mutual Information:
Mutual Information (I(X;Y)): Measures the amount of information one random variable X contains about
another Y. In cybersecurity, mutual information can be used to detect dependencies and potential leaks in data
transmission. For instance, it helps in identifying how much information about the plaintext can be obtained from the
ciphertext.
I(X;Y) = H(X) + H(Y) - H(X,Y)
3. Channel Capacity:
Channel Capacity (C): The maximum rate at which information can be reliably transmitted over a communication
channel. In cybersecurity, understanding channel capacity helps in designing systems that can securely transmit data
even in the presence of noise and potential eavesdroppers.
C = \max_{P(X)} I(X;Y)
Ensuring secure communication often involves using error-correcting codes and encryption to maximize reliable and
confidential data transfer.
4. Conditional Entropy:
Conditional Entropy (H(Y|X)): Measures the amount of uncertainty remaining in Y given that X is known.
This concept is used in analyzing cryptographic systems to understand how much uncertainty about the key remains
after observing the ciphertext.
H(Y|X) = -\sum_{x \in X} P(x) \sum_{y \in Y} P(y|x) \log P(y|x)
5. Relative Entropy (Kullback-Leibler Divergence):
Kullback-Leibler Divergence (D_KL(P||Q)): Measures how one probability distribution P diverges from a
second, expected probability distribution Q. In cybersecurity, it is used for anomaly detection by comparing the
observed behavior (distribution) of a system to a baseline (expected) behavior.
D_KL(P||Q) = \sum_i P(x_i) \log \frac{P(x_i)}{Q(x_i)}
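A minimal Python sketch of this idea (the traffic distributions below are made up purely for illustration): compare an observed histogram against a baseline and flag a large divergence:

import math

def kl_divergence(p, q):
    # D_KL(P || Q) in bits; assumes q[i] > 0 wherever p[i] > 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

baseline = [0.70, 0.20, 0.10]   # expected mix of request types (e.g., GET, POST, other)
observed = [0.30, 0.25, 0.45]   # mix observed during the current window

score = kl_divergence(observed, baseline)
print(round(score, 3))          # about 0.69 bits for these numbers
if score > 0.5:                 # the threshold here is an arbitrary illustrative choice
    print("possible anomaly: observed behavior diverges from baseline")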
6. Key Distribution and Management:
Random Variables in Key Generation: Cryptographic keys are often generated using random variables to ensure their
unpredictability. The quality of the random number generator directly impacts the security of the cryptographic
system.
Secure Key Exchange: Information theory helps in designing protocols like Diffie-Hellman and Quantum Key
Distribution (QKD) that securely exchange cryptographic keys over potentially insecure channels.
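The textbook Diffie-Hellman exchange can be sketched in a few lines of Python (the tiny parameters below are for illustration only; real deployments use 2048-bit or larger groups, or elliptic curves):

import secrets

p, g = 23, 5                          # toy public parameters -- far too small to be secure

a = secrets.randbelow(p - 2) + 1      # Alice's private exponent
b = secrets.randbelow(p - 2) + 1      # Bob's private exponent

A = pow(g, a, p)                      # Alice publishes A = g^a mod p
B = pow(g, b, p)                      # Bob publishes B = g^b mod p

shared_alice = pow(B, a, p)           # (g^b)^a mod p
shared_bob = pow(A, b, p)             # (g^a)^b mod p
assert shared_alice == shared_bob     # both sides now hold the same secret

An eavesdropper sees p, g, A, and B, but recovering a or b requires solving the discrete logarithm problem, which is infeasible for properly sized parameters.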
7. Error Detection and Correction:
Error-Correcting Codes: Techniques like Hamming codes, Reed-Solomon codes, and Turbo codes are grounded in
information theory. They ensure data integrity and security by detecting and correcting errors in data transmission,
which is vital for secure communications.
Hash Functions: Used to ensure data integrity by producing a fixed-size string from input data. A good hash function
produces output that is indistinguishable from uniform randomness (high entropy), making it infeasible to recover the
original input from the hash output.
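A brief illustration using Python's standard hashlib module: the digest has a fixed size regardless of the input, and flipping even one input character changes roughly half of the output bits (the avalanche effect):

import hashlib

msg1 = b"transfer 100 to alice"
msg2 = b"transfer 100 to alicf"      # a single character changed

d1 = hashlib.sha256(msg1).digest()
d2 = hashlib.sha256(msg2).digest()

print(len(d1) * 8)                   # 256 -- fixed-size output
differing = sum(bin(x ^ y).count("1") for x, y in zip(d1, d2))
print(differing)                     # typically around 128 of the 256 bits differ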
8. Steganography and Watermarking:
Information Hiding: Random variables are used to embed information within other data (e.g., images, audio) in a way
that is statistically indistinguishable from random noise, thus hiding the existence of the hidden information.
Applications in Cybersecurity:
1. Password Security: Ensuring high entropy in passwords to prevent dictionary attacks.
2. Cryptographic Systems: Designing secure encryption algorithms and protocols.
3. Secure Communication: Using channel capacity and error-correcting codes to maintain confidentiality and integrity of
data.
4. Anomaly Detection: Utilizing relative entropy and mutual information for detecting unusual patterns that might
indicate a security breach.
5. Data Integrity: Implementing robust hash functions and error-correcting codes to maintain data integrity against
tampering and transmission errors.
In information theory and cybersecurity, understanding the probability distribution of random variables is crucial for various
applications, including cryptography, data compression, error detection and correction, and anomaly detection. Here are key
factors and concepts related to probability distributions in these contexts:
Key Factors of Probability Distributions
1. Type of Distribution:
Discrete Distributions: Used when the random variable takes on a finite or countably infinite number of values
(e.g., binomial, Poisson).
Continuous Distributions: Used when the random variable can take on any value within a range (e.g., normal,
exponential).
2. Probability Mass Function (PMF) / Probability Density Function (PDF):
PMF (for discrete variables): The function P(X = x_i) that gives the probability that a discrete random
variable is exactly equal to some value x_i.
PDF (for continuous variables): The function f(x) that describes the relative likelihood of a continuous random
variable taking a value near a given point.
3. Cumulative Distribution Function (CDF):
The function F(x) that gives the probability that a random variable X will take a value less than or equal to x:
F(x) = P(X \le x)
4. Moments:
Mean (Expectation E[X]): The average value of the random variable, giving a measure of central
tendency.
Variance (Var(X)): Measures the spread or dispersion of the random variable from the mean.
Higher-order moments: Skewness (asymmetry) and kurtosis (tailedness).
5. Entropy:
Entropy (H(X)): Measures the uncertainty or randomness of the random variable:
H(X) = -\sum_i P(x_i) \log P(x_i)   (discrete)
H(X) = -\int f(x) \log f(x) \, dx   (continuous)
High entropy indicates high unpredictability, which is desirable in cryptographic keys.
Applications in Cybersecurity
1. Cryptographic Key Generation:
Uniform Distribution: Cryptographic keys are often generated using a uniform distribution to ensure each key
is equally likely, maximizing entropy and minimizing predictability.
2. Password Strength:
Evaluating the probability distribution of password components (e.g., letters, numbers, special characters) to
ensure high entropy and resistance to guessing attacks.
3. Anomaly Detection:
Normal Distribution: Many systems assume a normal distribution for baseline behavior. Anomalies are detected
by identifying deviations from this distribution.
KL-Divergence: Measures how one probability distribution diverges from a reference distribution, useful for
spotting anomalies.
4. Data Compression:
Huffman Coding: Uses the probability distribution of characters to create an optimal prefix code, reducing the
average length of encoded messages.
Shannon-Fano Coding: Similar to Huffman coding but generally less efficient; both methods rely on knowing the
probability distribution of the source data (a minimal Huffman sketch follows below).
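A minimal Huffman construction (illustrative Python, not an optimized implementation); it builds a prefix code from symbol probabilities and compares the average code length with the entropy bound from the source coding theorem:

import heapq
import math

def huffman_codes(probs):
    # Build a Huffman prefix code for a {symbol: probability} mapping.
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)               # tie-breaker so dicts are never compared
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}    # made-up source distribution
codes = huffman_codes(probs)                             # e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
avg_len = sum(probs[s] * len(codes[s]) for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(avg_len, entropy)                                  # both 1.75 bits for this dyadic source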
5. Error Detection and Correction:
Parity Bits: Add a small amount of redundancy to transmitted data so that single-bit errors can be detected.
Error-Correcting Codes: Add structured redundancy, designed around the statistical properties of the channel
noise, to detect and correct errors in transmission.
6. Information Leakage:
Side-Channel Attacks: Understanding the probability distribution of side-channel data (e.g., power
consumption, timing) to prevent information leakage.
Mutual Information: Assessing how much information about a secret (e.g., cryptographic key) can be inferred
from observed data.
7. Random Number Generators (RNGs):
Evaluating the quality of RNGs based on the uniformity and unpredictability of their output distribution,
crucial for cryptographic applications.
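Tying points 1 and 7 together, the sketch below shows how cryptographic-quality randomness is typically obtained in Python via the standard secrets module, which draws from the operating system's CSPRNG (illustrative usage, not a complete key-management scheme):

import secrets
import string

aes_key = secrets.token_bytes(32)                  # 256 bits of OS-provided randomness
print(aes_key.hex())

alphabet = string.ascii_letters + string.digits    # 62-symbol alphabet
password = "".join(secrets.choice(alphabet) for _ in range(16))
print(password)                                    # about 16 * log2(62) ≈ 95 bits of entropy

# Note: the random module's default generator is NOT suitable for keys or passwords.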
Summary
Probability distributions are fundamental in information theory and cybersecurity, providing a mathematical framework to
analyze, design, and optimize various security mechanisms. By understanding and utilizing the properties of probability
distributions, we can enhance the effectiveness of cryptographic systems, improve anomaly detection, ensure data integrity,
and maintain the confidentiality of information.
In information theory, measures of uncertainty and entropy are fundamental for understanding the amount of
unpredictability or information contained in a random variable or a set of data. These measures are crucial in various fields,
including cybersecurity, where they are used to evaluate the strength of cryptographic systems, design secure
communication protocols, and analyze data security. Here are the key concepts and measures of uncertainty and entropy:
1. Shannon Entropy
Shannon entropy, introduced by Claude Shannon, is a measure of the average uncertainty in a random variable. It quantifies
the amount of information needed to describe the random variable.
Formula for Discrete Random Variables:
H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)
where X is a discrete random variable with possible values {x_1, x_2, …, x_n} and P(x_i) is the probability of x_i.
Formula for Continuous Random Variables:
h(X) = -\int_{-\infty}^{\infty} f(x) \log f(x) \, dx
where f(x) is the probability density function of the continuous random variable X.
2. Joint Entropy
Joint entropy measures the uncertainty associated with a pair of random variables.
Formula:
H(X,Y) = -\sum_{x \in X} \sum_{y \in Y} P(x,y) \log P(x,y)
where P(x,y) is the joint probability of X and Y.
3. Conditional Entropy
Conditional entropy measures the amount of uncertainty remaining in one random variable given that the value of another
random variable is known.
Formula:
H(Y|X) = -\sum_{x \in X} P(x) \sum_{y \in Y} P(y|x) \log P(y|x)
where P(y|x) is the conditional probability of Y given X.
4. Mutual Information
Mutual information quantifies the amount of information obtained about one random variable through another random
variable. It is a measure of the reduction in uncertainty of one variable due to the knowledge of another.
Formula:
I(X;Y) = H(X) + H(Y) - H(X,Y)
or equivalently,
I(X;Y) = \sum_{x \in X} \sum_{y \in Y} P(x,y) \log \frac{P(x,y)}{P(x)P(y)}
5. Kullback-Leibler Divergence (Relative Entropy)
KL-divergence measures the difference between two probability distributions. It is used to quantify how one probability
distribution diverges from a second, reference probability distribution.
Formula:
D_KL(P||Q) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)}
where P(x) and Q(x) are two probability distributions over the same variable X.
6. Cross-Entropy
Cross-entropy measures the average number of bits needed to identify an event from one distribution if a coding scheme is
used based on another distribution.
Formula:
H(P,Q) = -\sum_{x \in X} P(x) \log Q(x)
where P(x) is the true probability distribution and Q(x) is the approximating distribution.
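These measures are tightly related. The short numerical check below (illustrative Python with made-up distributions) verifies two standard identities: the chain rule H(X,Y) = H(X) + H(Y|X), and the decomposition H(P,Q) = H(P) + D_KL(P||Q):

import math

log2 = math.log2

# Made-up joint distribution P(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
px = {0: 0.5, 1: 0.5}                                      # marginal of X for this table

H_joint = -sum(p * log2(p) for p in joint.values())
H_x = -sum(p * log2(p) for p in px.values())
H_y_given_x = -sum(p * log2(p / px[x]) for (x, y), p in joint.items())
print(round(H_joint, 6) == round(H_x + H_y_given_x, 6))    # chain rule -> True

P = [0.7, 0.3]                                             # "true" distribution
Q = [0.5, 0.5]                                             # approximating distribution
H_P = -sum(p * log2(p) for p in P)
D_KL = sum(p * log2(p / q) for p, q in zip(P, Q))
H_PQ = -sum(p * log2(q) for p, q in zip(P, Q))
print(round(H_PQ, 6) == round(H_P + D_KL, 6))              # cross-entropy identity -> True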
Applications in Cybersecurity
DNS Leakage: Revealing browsing activities and visited domains through unsecured DNS queries.
4. Storage Leakage:
Unencrypted Storage: Storing sensitive data without encryption, making it accessible to unauthorized users if
the storage medium is compromised.
Residual Data: Data remnants left on storage devices after deletion, which can be recovered by attackers.
5. Application Leakage:
Logging Sensitive Information: Applications logging sensitive data such as passwords or personal information,
which can be accessed by unauthorized users.
Insecure API Usage: Poorly secured APIs leaking data through improper authentication, authorization, or data
handling practices.
Measures to Prevent and Mitigate Leakage
1. Data Loss Prevention (DLP):
Implementing DLP solutions to monitor, detect, and prevent unauthorized data transfers. These solutions can
be deployed at endpoints, networks, and storage systems.
2. Encryption:
Encrypting sensitive data both at rest and in transit to ensure that even if data is intercepted, it remains
unreadable without the decryption key.
3. Access Controls:
Enforcing strict access controls to limit who can view, modify, or transmit sensitive data. This includes using
role-based access control (RBAC) and multi-factor authentication (MFA).
4. Secure Coding Practices:
Following secure coding guidelines to prevent common vulnerabilities such as injection attacks, buffer
overflows, and insecure data storage.
5. Regular Audits and Monitoring:
Conducting regular security audits and continuous monitoring to detect and respond to potential leaks
promptly. This includes monitoring logs, network traffic, and system behavior.
6. Side-Channel Attack Countermeasures:
Implementing countermeasures against side-channel attacks, such as constant-time algorithms, power
analysis defenses, and shielding against electromagnetic emissions.
7. Data Sanitization:
Properly sanitizing storage devices before disposal or repurposing to ensure that no residual data can be
recovered.
8. Network Security:
Using secure communication protocols (e.g., HTTPS, VPNs) to protect data in transit. Employing network
segmentation to limit the spread of sensitive data.
9. Employee Training:
Educating employees about data protection policies, safe data handling practices, and the importance of
maintaining the confidentiality of sensitive information.
Examples of Leakage Scenarios
1. Heartbleed Vulnerability:
The Heartbleed bug in the OpenSSL cryptographic library allowed attackers to read memory contents of the
servers, potentially leaking sensitive data such as private keys and user passwords.
2. Equifax Data Breach:
A vulnerability in a web application framework led to the exposure of personal data of approximately 147
million individuals, highlighting the importance of patching and secure application development.
3. Side-Channel Attacks on RSA Keys:
Researchers have demonstrated that by analyzing power consumption or electromagnetic emissions, they can
extract RSA encryption keys, emphasizing the need for side-channel resistance in cryptographic
implementations.
Conclusion
Leakage in cybersecurity poses a significant threat to the confidentiality, integrity, and availability of sensitive information.
By understanding the various forms of leakage and implementing comprehensive security measures, organizations can
significantly reduce the risk of data breaches and ensure the protection of their critical assets. Regular security assessments,
robust encryption, strict access controls, and continuous monitoring are essential components of an effective strategy to
prevent and mitigate leakage.
Quantifying information leakage and partitioning data are crucial in evaluating and enhancing cybersecurity measures.
Here’s how these concepts can be understood and applied:
Quantifying Information Leakage
Quantifying information leakage involves measuring how much information is unintentionally exposed or can be inferred by
an attacker. Various metrics and methods are used to assess the extent of leakage.
1. Mutual Information:
Mutual Information (MI) quantifies the amount of information that one variable reveals about another. In the context of
leakage, it measures how much information an observer can infer about a secret variable by observing some public variable.
Formula:
I(X;Y) = H(X) + H(Y) - H(X,Y)
where H(X) is the entropy of the secret, H(Y) is the entropy of the observable, and H(X,Y) is their joint
entropy.
2. Conditional Entropy:
Conditional entropy measures the remaining uncertainty of a secret given the observation of another variable.
Formula:
H(X|Y) = H(X,Y) - H(Y)
This helps in understanding how much uncertainty remains about the secret X after observing Y.
3. Kullback-Leibler Divergence (KL-Divergence):
KL-Divergence measures how one probability distribution diverges from a reference distribution. It’s used to assess how
much an observed distribution (potentially affected by leakage) differs from the expected distribution.
Formula:
D_KL(P||Q) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)}
4. Differential Privacy:
Differential privacy quantifies the privacy guarantees provided by an algorithm, specifically how much the presence or
absence of a single individual in a dataset affects the output of a computation.
Definition: A randomized algorithm A is ε-differentially private if for all datasets D and D′ differing on at most
one element, and all possible outputs S:
P(A(D) \in S) \le e^{\epsilon} P(A(D') \in S)
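A minimal sketch of the Laplace mechanism, one common way to achieve ε-differential privacy for numeric queries (parameter values below are illustrative):

import random

def laplace_noise(scale):
    # Laplace(0, scale): exponential magnitude with a random sign.
    magnitude = random.expovariate(1.0 / scale)
    return magnitude if random.random() < 0.5 else -magnitude

def private_count(true_count, epsilon, sensitivity=1.0):
    # Adding Laplace(sensitivity / epsilon) noise gives epsilon-differential privacy
    # for a counting query, whose sensitivity is 1.
    return true_count + laplace_noise(sensitivity / epsilon)

print(private_count(true_count=128, epsilon=0.5))   # noisy answer, e.g. somewhere near 128

Smaller ε means more noise and stronger privacy; larger ε means more accurate answers but weaker guarantees.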
Partitioning Data
Partitioning data involves dividing data into distinct, manageable segments, often to isolate sensitive information and
control access more effectively.
1. Role-Based Access Control (RBAC):
RBAC partitions access based on user roles within an organization. Each role has predefined access permissions.
Roles: Defined based on job functions.
Permissions: Assigned to roles rather than individuals.
2. Data Segmentation:
Data segmentation involves dividing data into different categories based on sensitivity or functional requirements.
Example: Separating financial data from customer personal data.
3. Network Segmentation:
Network segmentation involves dividing a network into smaller segments or subnetworks to improve security and
manageability.
Subnets: Each subnetwork can have its security policies and access controls.
Firewalls: Used to control traffic between segments.
Combining Quantification and Partitioning
Combining quantification of leakage with data partitioning can enhance security by:
1. Assessing Partition Effectiveness:
Use information-theoretic measures to evaluate how well partitioning schemes (e.g., network segmentation)
reduce potential leakage.
Example: Calculate mutual information between segments to ensure minimal inter-segment leakage.
2. Optimizing Access Controls:
Quantify the potential leakage through access controls and adjust RBAC policies to minimize information
exposure.
Example: Use KL-Divergence to compare the probability distributions of accessed data by different roles.
3. Implementing Differential Privacy:
Ensure that partitioned datasets used in analytics provide differential privacy guarantees.
Example: Apply differentially private algorithms to segmented data to ensure privacy-preserving analytics.
Practical Applications
1. Cryptographic Systems:
Leakage-Resilient Cryptography: Design cryptographic algorithms that minimize information leakage through
side-channels. Quantify leakage using mutual information and conditional entropy.
2. Network Security:
Intrusion Detection Systems (IDS): Use information-theoretic measures to detect anomalies and potential
leaks. Partition network traffic data to isolate suspicious segments.
3. Data Privacy:
Privacy-Preserving Data Mining: Apply differential privacy techniques to partitioned datasets to protect
individual privacy while enabling data analysis.
Summary
Quantifying information leakage and partitioning data are essential strategies in cybersecurity to protect sensitive
information. By using measures such as mutual information, conditional entropy, KL-Divergence, and differential privacy,
security professionals can assess and mitigate leakage risks. Combining these quantification methods with effective data
partitioning techniques like RBAC and network segmentation enhances overall data security and ensures robust protection
against unauthorized access and information leakage.
Determining the lower bounds on key size for cryptographic systems is crucial for ensuring secrecy and preventing
unauthorized access. The key size directly impacts the security level of cryptographic algorithms, making it vital to choose
appropriately large keys to resist various types of attacks. Here’s an overview of the considerations and lower bounds for
key sizes in different cryptographic contexts.
Symmetric Key Cryptography
In symmetric key cryptography, the same key is used for both encryption and decryption. The security of symmetric
algorithms is primarily determined by the key length. The key must be large enough to prevent brute-force attacks, where
an attacker tries all possible keys until the correct one is found.
Common Symmetric Algorithms:
AES (Advanced Encryption Standard)
DES (Data Encryption Standard)
Triple DES (3DES)
Lower Bound Considerations:
1. Brute-Force Attack Resistance:
The key size should be large enough to make brute-force attacks infeasible with current and foreseeable
computing power.
2. Recommendations:
DES: Originally used a 56-bit key, now considered insecure due to the feasibility of brute-force attacks.
3DES: Uses three 56-bit keys (168-bit nominal key length, roughly 112 bits of effective security due to
meet-in-the-middle attacks); more secure than DES but less efficient than modern algorithms.
AES:
128-bit: Generally considered secure and offers a good balance between security and performance.
192-bit and 256-bit: Provide higher security margins, suitable for long-term security and higher
sensitivity applications.
Current Recommendation: A minimum key size of 128 bits for AES is recommended for general use, with 256 bits for highly
sensitive or long-term secure applications.
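For a sense of scale (a back-of-the-envelope estimate, not a figure from these notes): an attacker testing 10^12 keys per second would need about 2^128 / 10^12 ≈ 3.4 × 10^26 seconds, on the order of 10^19 years, to exhaust a 128-bit key space, which is why 128-bit keys are treated as infeasible to brute-force.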
Asymmetric Key Cryptography
In asymmetric key cryptography, a pair of keys (public and private) is used. The security of asymmetric algorithms depends
on the difficulty of mathematical problems such as integer factorization or discrete logarithms.
Common Asymmetric Algorithms:
RSA (Rivest-Shamir-Adleman)
ECC (Elliptic Curve Cryptography)
1. Security Models:
Define the capabilities of attackers and the security goals of the system (e.g., confidentiality, integrity,
authenticity).
Common Models:
IND-CPA (Indistinguishability under Chosen Plaintext Attack): Ensures that an attacker cannot
distinguish between encryptions of different plaintexts.
IND-CCA (Indistinguishability under Chosen Ciphertext Attack): Extends IND-CPA by considering
attackers who can access a decryption oracle.
2. Reductionist Security Proofs:
Show that breaking a cryptographic scheme would require solving a well-known hard problem (e.g., factoring
large integers, discrete logarithm problem).
Example: Relating the security of RSA encryption to the presumed hardness of factoring the product of two
large prime numbers.
3. Security Proof Techniques:
Game-Based Proofs: Define a game between a challenger and an adversary, where the adversary’s success in
breaking the scheme implies solving a hard problem.
Simulation-Based Proofs: Show that any attack on the cryptographic scheme can be simulated within a certain
model, proving the scheme’s security.
Summary
1. Authentication:
Ensures the identity of users and devices using methods like passwords, 2FA/MFA, PKI, and biometrics.
Protocols like Kerberos, OAuth, and TLS are used to implement secure authentication systems.
2. Secret Sharing:
Distributes a secret among multiple participants, requiring a threshold number of shares to reconstruct the
secret.
Shamir’s and Blakley’s schemes are popular methods for secret sharing, with extensions like Verifiable Secret
Sharing for added security.
3. Provable Security:
Provides mathematical guarantees about the security of cryptographic protocols.
Uses security models and reductionist proofs to demonstrate that breaking a scheme implies solving a known
hard problem.
By integrating robust authentication mechanisms, secret sharing techniques, and provable security methods, cryptographic
systems can ensure high levels of security, protecting sensitive information from unauthorized access and attacks.
Computational security, also known as computationally secure cryptography, refers to cryptographic systems designed to
be secure against adversaries with bounded computational resources. The concept hinges on the assumption that certain
mathematical problems are computationally infeasible to solve within a reasonable amount of time, given current and
foreseeable computational capabilities.
Key Concepts of Computational Security
1. Intractability Assumptions:
The security of computationally secure cryptographic systems relies on the difficulty of solving specific
mathematical problems. Examples include:
Integer Factorization Problem (for RSA): Given a large integer that is the product of two primes,
finding the prime factors is considered infeasible.
Discrete Logarithm Problem (for Diffie-Hellman and DSA): Given g^a mod p, finding a is
infeasible for large p.
Elliptic Curve Discrete Logarithm Problem (for ECC): Given a point P and another point Q on an
elliptic curve, finding an integer k such that Q = kP is infeasible.
2. Probabilistic Polynomial Time (PPT):
A cryptographic algorithm is considered computationally secure if no probabilistic polynomial-time adversary
(an algorithm that runs in polynomial time and may use randomness) can break the system with non-
negligible probability.
3. Negligible Function:
A function f(n) is negligible if, for every polynomial p(n), there exists an n_0 such that for all
n > n_0, f(n) < 1/p(n). In other words, f(n) decreases faster than the inverse of any polynomial.
In computational security, the probability of an adversary successfully breaking the system should be
negligible.
Examples of Computational Security
1. Symmetric Key Cryptography:
AES (Advanced Encryption Standard): AES with a key size of 128 bits is considered computationally secure
because the best known attack, a brute-force search, requires about 2^128 operations, which is computationally
infeasible.
2. Asymmetric Key Cryptography:
RSA: The security of RSA relies on the difficulty of factoring large composite numbers. With current
technology, a key size of at least 2048 bits is recommended to ensure computational security.
ECC (Elliptic Curve Cryptography): ECC offers equivalent security with smaller key sizes compared to RSA. For
example, a 256-bit key in ECC is considered to provide security comparable to a 3072-bit key in RSA.
Computational Security in Practice
1. Key Management:
Ensuring that key sizes are large enough to be secure against current and future computational capabilities.
Regularly updating and rotating keys to mitigate the risk of compromised keys.
2. Algorithm Selection:
Choosing cryptographic algorithms based on their proven security properties and resistance to known attacks.
Using standardized and well-reviewed algorithms to ensure robustness.
3. Security Proofs:
Providing formal proofs that relate the security of a cryptographic system to the hardness of underlying
mathematical problems.
Using reductionist proofs to show that breaking the cryptographic system implies solving a hard problem.
Quantum Computing Considerations
The advent of quantum computing poses a threat to computational security because quantum algorithms can solve certain
mathematical problems more efficiently than classical algorithms. For example:
Shor's Algorithm: Efficiently factors large integers and computes discrete logarithms, potentially breaking RSA and
ECC.
Grover's Algorithm: Provides a quadratic speedup for brute-force search, impacting symmetric key algorithms like
AES.
Post-Quantum Cryptography
To address these threats, researchers are developing quantum-resistant algorithms that remain secure even in the presence
of quantum computers. Examples include:
Lattice-Based Cryptography: Based on the hardness of lattice problems, which are believed to be resistant to
quantum attacks.
Hash-Based Cryptography: Using cryptographic hash functions to construct secure systems.
Code-Based Cryptography: Based on the difficulty of decoding random linear codes.
Multivariate Quadratic Equations: Involves solving systems of multivariate quadratic equations, considered hard for
both classical and quantum computers.
Summary
Computational security ensures that cryptographic systems are secure against adversaries with limited computational
resources. This security is based on the intractability of specific mathematical problems and the assumption that these
problems cannot be solved efficiently. As advancements in quantum computing pose new challenges, the field of post-
quantum cryptography is evolving to develop algorithms that remain secure in a quantum computing world. Regularly
updating cryptographic practices and transitioning to quantum-resistant algorithms will be crucial for maintaining
computational security in the future.
A symmetric cipher is a type of cryptographic algorithm where the same key is used for both encryption and decryption.
These ciphers are widely used due to their efficiency and effectiveness in securing data. Symmetric ciphers can be broadly
classified into two types: block ciphers and stream ciphers.
Types of Symmetric Ciphers
1. Block Ciphers:
Block ciphers encrypt data in fixed-size blocks, typically 64 or 128 bits, applying the same key to every block;
how successive blocks are combined is determined by the mode of operation (see below).
Examples of block ciphers include:
AES (Advanced Encryption Standard):
Encrypts data in 128-bit blocks.
Supports key sizes of 128, 192, and 256 bits.
Widely used for securing sensitive data and communications.
DES (Data Encryption Standard):
Encrypts data in 64-bit blocks.
Uses a 56-bit key.
Now considered insecure due to its short key length.
3DES (Triple DES):
Encrypts data using three successive DES operations.
Provides a longer nominal key length (168 bits, about 112 bits of effective security) but is less efficient than AES.
Blowfish:
Encrypts data in 64-bit blocks.
Supports variable key sizes from 32 to 448 bits.
Known for its simplicity and speed.
2. Stream Ciphers:
Stream ciphers encrypt data one bit or byte at a time, producing a continuous stream of ciphertext.
Examples of stream ciphers include:
RC4:
A widely used stream cipher that generates a pseudorandom keystream.
Known for its simplicity and speed but has vulnerabilities that make it unsuitable for new
applications.
ChaCha20:
A modern stream cipher designed for high performance and security.
Uses a 256-bit key with a 64-bit nonce (96-bit in the IETF variant).
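To illustrate the stream-cipher principle only (this toy construction is not how RC4 or ChaCha20 derive their keystreams, and it must not be used in practice), the Python sketch below generates a pseudorandom keystream from a key, a nonce, and a counter using SHA-256, then XORs it with the data byte by byte:

import hashlib

def keystream(key, nonce, length):
    # Toy keystream: SHA-256(key || nonce || counter), repeated until enough bytes exist.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_stream(data, key, nonce):
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))   # the same call encrypts and decrypts

key, nonce = b"0" * 32, b"unique-per-message"
ct = xor_stream(b"attack at dawn", key, nonce)
assert xor_stream(ct, key, nonce) == b"attack at dawn"

Reusing a (key, nonce) pair would reuse the keystream and leak the XOR of the two plaintexts, which is why stream ciphers require a unique nonce per message.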
Modes of Operation
Block ciphers can be used in various modes of operation to enhance their security and usability. Common modes include:
1. ECB (Electronic Codebook):
Each block of plaintext is encrypted independently.
Simple but vulnerable to pattern attacks since identical plaintext blocks produce identical ciphertext blocks.
2. CBC (Cipher Block Chaining):
Each plaintext block is XORed with the previous ciphertext block before encryption.
Requires an initialization vector (IV) for the first block.
Provides better security than ECB by introducing dependency between blocks.
3. CFB (Cipher Feedback):
Converts a block cipher into a stream cipher.
Encrypts the previous ciphertext block and XORs the result with the current plaintext block.
Requires an IV.
4. OFB (Output Feedback):
Converts a block cipher into a stream cipher.
Generates keystream blocks, which are XORed with the plaintext blocks.
Requires an IV and is resistant to transmission errors.
5. CTR (Counter):
Converts a block cipher into a stream cipher.
Encrypts a counter value, which is then XORed with the plaintext block.
Allows random access to encrypted data blocks and is highly parallelizable.
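The practical difference between ECB and a streaming mode such as CTR is easy to see. The sketch below assumes the third-party cryptography package is installed (pip install cryptography) and is illustrative rather than a production recipe:

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)                        # 128-bit AES key
plaintext = b"SIXTEEN BYTE BLK" * 4         # four identical 16-byte blocks

# ECB: identical plaintext blocks produce identical ciphertext blocks (pattern leak).
ecb = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ct_ecb = ecb.update(plaintext) + ecb.finalize()
print(len({ct_ecb[i:i + 16] for i in range(0, len(ct_ecb), 16)}))   # 1 distinct block

# CTR: a fresh nonce/counter hides the repetition.
ctr = Cipher(algorithms.AES(key), modes.CTR(os.urandom(16))).encryptor()
ct_ctr = ctr.update(plaintext) + ctr.finalize()
print(len({ct_ctr[i:i + 16] for i in range(0, len(ct_ctr), 16)}))   # 4 distinct blocks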
Security Considerations
1. Key Management:
The security of symmetric ciphers relies on keeping the key secret.
Keys must be securely generated, distributed, and stored.
Conclusion
Symmetric ciphers are a fundamental tool in cryptography, offering efficient and secure methods for data encryption.
Understanding the different types of symmetric ciphers, their modes of operation, and security considerations is crucial for
implementing robust cryptographic systems. While symmetric ciphers are highly effective, proper key management and
adherence to best practices are essential to maintaining their security.