
COMP1806: Information Security

Lecture 3 – Security and Privacy for Machine Learning


Building Trust in AI

Dr. Sakshyam Panda


Centre for Sustainable Cyber Security
s.panda@greenwich.ac.uk

11th October 2024



fundamentals



Introduction to Machine Learning (ML)
◉ Machine Learning (ML) allows computers to learn and make decisions
without being explicitly programmed
◉ Traditional vs. ML: Traditional programming is rule-based, while ML
involves training models using data

[Diagram: traditional programming takes rules and data as input and produces outputs/results; machine learning takes data and labels as input to training and produces an ML model, which then makes predictions on new data]

◉ Key Components: Algorithms, Data, and Models


◉ Applications: From everyday apps (like voice assistants and email filters)
to specialised use-cases (like medical diagnoses and stock market
predictions)
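
To make the contrast concrete, here is a minimal sketch of a hand-written rule versus a model trained from labelled data. It assumes scikit-learn is available; the rule, messages, labels, and function names are purely illustrative.

```python
# Minimal illustrative sketch: hand-written rules vs. a model learned from data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Traditional programming: behaviour is fixed by explicit rules.
def rule_based_spam_check(message: str) -> bool:
    return "prize" in message.lower()

# Machine learning: behaviour is learned from labelled examples.
messages = ["win a prize now", "meeting at 10am",
            "claim your free prize", "lecture notes attached"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(messages), labels)

new_message = ["free prize waiting"]
print(rule_based_spam_check(new_message[0]))                # rule-based decision
print(model.predict(vectorizer.transform(new_message))[0])  # learned decision
```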



AI vs Machine Learning
◉ AI is a broad field of computer science that's focused on creating
systems capable of performing tasks that require human intelligence
➢ AI systems can be rule-based and might not learn from data
➢ e.g., a classic expert system in AI could be based on a fixed set of rules

◉ ML is a subset of AI centred on the idea that machines can be given access to data and can learn to make accurate predictions or decisions
➢ Systems improve their performance on a task over time or with more data
❖ supervised learning, unsupervised learning, reinforcement learning, …

◉ Examples:
➢ AI: Rule-based expert systems, robotics, natural language processing tools,
and even some machine learning systems fall under AI
➢ ML: Algorithms such as neural networks, decision trees, k-means clustering,
and linear regression are examples of ML



The central role of data
◉ Heart of ML: ML models are only as good as the data they're trained on
◉ Economic Value: Data has become a valuable resource, often termed the
"new oil"
◉ Granularity: ML often requires detailed, granular data for accurate
predictions, increasing potential privacy concerns
◉ Diversity of Data: Ranges from personal preferences in shopping,
medical histories, financial records to social interactions



AI assets

“With great data comes great responsibility.”



risks and requirements



When things go wrong: Risks of neglecting security
◉ Data Breaches: Unauthorised access can expose sensitive user data
◉ Model Manipulation: Attackers can alter ML predictions, leading to incorrect
decisions
◉ Loss of Trust: Once trust is lost, it's challenging to regain; users might abandon a
product or service
◉ Financial & reputational risks: Breaches can result in financial losses, legal
consequences, and damage to a company's reputation
◉ Long-Term Impacts: Compromised ML can have cascading effects, especially if
integrated into critical systems like healthcare or finance

Benefits: convenience, personalisation, automation
Potential risks: data breaches, misuse, deception, hallucination, …



Vulnerabilities in ML: an overview
◉ Weaknesses or gaps in a security system that can be exploited by threats to gain
unauthorised access or cause harm
◉ Why ML Models are Vulnerable:
➢ High Complexity: Deep learning models, especially, can have millions of parameters
➢ Data Dependence: Reliance on data makes them susceptible to data-based attacks
➢ Transparency: Open-source models and code make it easier for attackers to understand and exploit them
◉ Data Leakage: When training data, which may include sensitive details, inadvertently exposes private information due to the model's overfitting or improper data handling
◉ Transferability: Exploiting the fact that adversarial examples can often be transferred
between different models
➢ Makes defending against adversarial attacks more challenging
➢ Suggests that attackers don't necessarily need direct access to a model to exploit it



Data Privacy: The Essence of Responsible Machine Learning

◉ Data privacy: the protection of sensitive data from unauthorised access and misuse, ensuring
that individuals' personal information is used in a fair, secure, and legitimate manner
◉ Why it is important:
➢ Regulatory Compliance: Laws such as the GDPR mandate stringent data protection measures
➢ User Trust: Ensuring privacy bolsters user confidence in adopting ML-powered solutions
➢ Ethical Responsibility: An obligation to protect individuals' privacy and prevent misuse of data
◉ Why Data Privacy is Central to ML:
➢ Training Requirement: ML models require vast amounts of data for training,
increasing the risk of privacy breaches
➢ Predictive Power: Advanced ML models can infer sensitive attributes, even if they
weren't part of the training data
➢ Real-world Implications: ML applications often impact real people – from healthcare
diagnostics to financial predictions
◉ Challenges in Ensuring Data Privacy:
➢ Data collection: Accumulating data without informed consent
➢ Storage: Ensuring that data at rest is protected from breaches
➢ Transmission: Safeguarding data in transit from eavesdropping or MITM attacks
➢ Inference attacks: Techniques where attackers can deduce sensitive information
from model outputs



Attacks



Navigating the threat landscape in ML
1. Data poisoning (during training): insert, modify, or delete examples from the training
dataset such that the model learns an incorrect or biased behaviour (incorrect outputs)
2. Adversarial attacks (during inference): adding carefully crafted perturbations to the input
data that are often imperceptible to humans but can drastically change the model's
prediction (incorrect outputs) – also called an evasion attack
3. Model inversion: deducing input data from model outputs compromising privacy
4. Membership inference attacks: Revealing if a particular individual's data was used in
training, which might be sensitive information
5. Model theft: replicating proprietary models without authorisation and without access to
the original training data
[Diagram: the five attack classes – data poisoning attacks, adversarial attacks, model inversion, membership inference attacks, and model theft]


Poisoning attacks (1/2)
◉ Target the training phase of a machine learning pipeline, aiming to corrupt,
modify, or insert malicious data such that the model learns an incorrect or
biased behaviour
◉ How it works:
➢ Malicious Injection: Attackers insert carefully crafted data points into the
training dataset
➢ Model Corruption: Over time, as the model learns from the poisoned data, it
adopts undesired behaviours or biases
➢ Subtle Effects: The changes to the dataset might be minor, making detection
challenging
❖ Yet, their effects on the trained model can be significant



Poisoning attacks (2/2)
◉ Types:
➢ Targeted: Attackers aim to achieve a specific misbehaviour in the model, like a backdoor trigger
➢ Random: Attackers aim to degrade the overall performance of the model
◉ Risks involved:
➢ Model degradation: The primary objective can be to decrease the model's overall accuracy or
effectiveness
➢ Backdoor attacks: Introducing a specific pattern or "trigger" in the model. When the model sees this
trigger in the input, it produces a predetermined (often incorrect) output.
➢ Strategic misclassification: Ensuring the model misclassifies specific inputs, beneficial to the attacker
◉ Real-world examples:
➢ Imagine a facial recognition system where an attacker, by poisoning the training data, could make the
system consistently misidentify or fail to recognise certain individuals
➢ Real case: evading detection of malware (Cylance)
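
A minimal sketch of the poisoning idea, assuming scikit-learn and a synthetic dataset: an attacker who can flip a fraction of training labels degrades the resulting model. The 20% flip rate, the models, and the data are illustrative assumptions, not a real attack recipe.

```python
# Minimal sketch of a label-flipping poisoning attack on a toy classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Attacker flips the labels of 20% of the training set.
rng = np.random.default_rng(0)
poison_idx = rng.choice(len(y_train), size=int(0.2 * len(y_train)), replace=False)
y_poisoned = y_train.copy()
y_poisoned[poison_idx] = 1 - y_poisoned[poison_idx]

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)

print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```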



Adversarial Attacks: Sketch



Evasion Attacks (1/2)
◉ Intentional perturbations to input data aiming to deceive machine learning
models, causing them to misclassify or produce incorrect outputs
◉ Adversarial attacks typically occur during the inference stage of a trained
machine learning model
◉ Targeted vs. Non-targeted: Attacks can be aimed to make the model produce a
specific erroneous output (targeted) or just any incorrect output (non-targeted)
◉ Why ML Models are Susceptible: Models tend to generalise patterns, but they might overfit or "memorise" adversarial examples if exposed to them during training
➢ Overfitting: model learns the training data too closely, including its noise
and outliers, resulting in poor generalisation to new or unseen data
◉ Example: Given a trained image classifier, an adversarial attack might add a small
amount of noise to an image of a cat such that the model now misclassifies it as
a dog, even though the perturbed image still looks like a cat to a human
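
One widely known way such perturbations are crafted is the Fast Gradient Sign Method (FGSM). The sketch below shows only the mechanics, assuming PyTorch; the untrained stand-in network, the random "image", the label, and the epsilon budget are placeholders rather than a realistic attack.

```python
# Minimal sketch of the Fast Gradient Sign Method (FGSM), one common evasion attack.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 1, 28, 28, requires_grad=True)  # clean input
y_true = torch.tensor([3])                        # its correct label
epsilon = 0.1                                     # perturbation budget

# Compute the gradient of the loss with respect to the input ...
loss = loss_fn(model(x), y_true)
loss.backward()

# ... and take a small step in the direction that increases the loss.
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```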



Evasion Attacks (2/2)
◉ Types of adversarial attacks:
➢ White-box: Attackers have complete knowledge of the model including its architecture,
trained parameters, and even the training data
❖ Easier to craft successful adversarial examples due to comprehensive model access
➢ Black-box: Attackers have little or no knowledge of the model's internal workings
❖ They can only supply inputs and observe outputs, without access to the architecture or parameters – the risk is reduced since the attacker operates with limited information
➢ The nature of the attack (white-box or black-box) greatly influences the strategies used by
both attackers and defenders
❖ Awareness of both types is essential for developing robust ML models
◉ Real world implications:
➢ Autonomous Vehicles: Misinterpreting traffic signs
➢ Face Recognition: Bypassing security measures
➢ Medical Diagnostics: Incorrectly diagnosing medical images



Evasion Attacks: Examples
◉ Distorting an image so a model misclassifies it
◉ Tweaking audio to mislead voice recognition systems
Google researchers showed in 2015 that a popular image classification model labelled a clean image as "panda" but labelled an almost identical, subtly perturbed copy as "gibbon". And oddly enough, it had more confidence in the "gibbon" prediction.
❖ The perturbed image has undergone subtle manipulations that go unnoticed by the human eye while making it a totally different sight to the digital eye of a machine learning algorithm

Researchers demonstrated that audio embedded in a video soundtrack, music stream, or radio broadcast could be modified to trigger voice commands in automatic speech-recognition systems without being detected by a human listener. Most listeners couldn't identify issues with the altered songs.



Membership Inference Attacks (1/2)
◉ They determine whether a particular data point was part of the training dataset
of a machine learning model
◉ How it Works:
➢ Model access: Attackers typically have black-box access, meaning they can query the
model and observe outputs but don't have direct access to model parameters or
architecture
➢ Confidence scores: Attackers leverage the confidence scores (e.g., softmax
probabilities) produced by the model
❖ Models often produce higher confidence scores for data they've seen (training data) than for
unfamiliar data
➢ Comparison: By comparing the confidence scores of known members and non-
members of the training set, attackers can infer if a data point was likely in the
training set.
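
A minimal sketch of the confidence-score comparison described above, assuming scikit-learn; the victim model, the member/non-member split, and the 0.9 threshold are illustrative assumptions. An overfitted model tends to be flagged far more often on its own training data than on unseen data.

```python
# Minimal sketch of a confidence-thresholding membership inference attack.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_member, X_nonmember, y_member, y_nonmember = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Victim model that tends to overfit its training ("member") data.
victim = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_member, y_member)

def attack(model, X, threshold=0.9):
    """Guess 'member' whenever the model's top confidence exceeds the threshold."""
    top_confidence = model.predict_proba(X).max(axis=1)
    return top_confidence > threshold

print("flagged as members (training data):", attack(victim, X_member).mean())
print("flagged as members (unseen data):  ", attack(victim, X_nonmember).mean())
```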



Membership Inference Attacks (2/2)
◉ Risks involved:
➢ Privacy breach: Revealing whether specific data was used for training can
violate the privacy of individuals, e.g., sensitive applications like medical or
financial data
➢ Model leakage: Potentially exposes insights about proprietary or unique
datasets
◉ Real-world Example:
➢ In medical research, knowing that a particular patient's data was used might
reveal that they have a certain condition or disease



Model Inversion Attacks (1/2)
◉ Aim to reconstruct input data or sensitive attributes given access to a trained
machine learning model and its outputs
◉ How it Works:
➢ Model access: The attacker has access to the model's predictions (either black-box or
white-box access)
➢ Reverse engineering: Based on the model's output, the attacker reverse engineers or
infers sensitive details about the original input data
❖ The attacker trains a separate machine learning model, known as an inversion model, on the outputs of the target model (i.e., the model under attack)
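
A minimal sketch of the inversion-model idea, assuming scikit-learn and its toy digits dataset; the victim model, the inversion model, and the attacker's auxiliary data are illustrative assumptions.

```python
# Minimal sketch of model inversion: learn a mapping from the victim model's
# output probabilities back to (approximations of) its input features.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = load_digits(return_X_y=True)
X_victim, X_attacker, y_victim, y_attacker = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Victim model, exposed to the attacker only through its output probabilities.
victim = LogisticRegression(max_iter=2000).fit(X_victim, y_victim)

# Attacker queries the victim on auxiliary data and fits an inversion model
# mapping output probabilities -> reconstructed inputs.
probs = victim.predict_proba(X_attacker)
inversion_model = MLPRegressor(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
inversion_model.fit(probs, X_attacker)

# Reconstruct an approximate input from nothing but the victim's output.
reconstruction = inversion_model.predict(victim.predict_proba(X_victim[:1]))
print(reconstruction.shape)  # (1, 64): an approximate 8x8 digit image
```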



Model Inversion Attacks (2/2)
◉ Risks involved:
➢ Privacy Violation: Exposing private and sensitive data, even if the original input
wasn't directly provided to the attacker
➢ Loss of Intellectual Property: In cases where the trained data contains proprietary
information
◉ Real-world Example:
➢ An attacker without direct access to a facial dataset might use a facial recognition
model's predictions to recreate approximate images of the faces in the training set



Model Extraction Attacks (1/2)
◉ The act of replicating a machine learning model without direct access to its
parameters or training data, typically by only observing its outputs
◉ How it Works:
➢ Query-Based Access: Attackers have black-box access to the model, meaning they
can provide inputs and observe the model's predictions
➢ Mimicry: Attackers train a local model using the queried outputs from the victim
model as ground truth, effectively mimicking its behaviour
➢ Model Extraction: By continuously querying and refining, attackers can approximate
and "steal" the functionality of the target model



Model Extraction Attacks (2/2)
◉ Risks involved
➢ Intellectual Property Loss: Organisations invest significantly in training custom
models and model theft can bypass this investment
➢ Economic Impact: Businesses that monetise ML models can face financial losses if
their models are replicated and used without authorisation.
➢ Misuse: Stolen models can be deployed in unauthorised environments or used in
adversarial settings
◉ Real-world Example:
➢ A company offers a premium image recognition API. An attacker could use
model theft to create a similar service without investing in data collection or
model training.



examples



Beyond the tech: The real-world fallout
◉ Financial Systems: Manipulated stock predictions, fraudulent
transactions
◉ Healthcare: Incorrect diagnoses, unauthorised access to patient data
◉ Autonomous Vehicles: Misreading traffic signs, potential crashes
◉ E-commerce: Fraudulent purchases and recommendation manipulation
◉ Social Media: Spread of misinformation and unauthorised data access



When bots go rogue: Microsoft's Tay
◉ Tay, a Twitter chatbot developed by Microsoft, was designed to
interact and learn from its conversations with users
◉ What Happened? Within 24 hours of launch, Tay began tweeting
inappropriate and offensive content
◉ Cause: Malicious users exploited the model's learning mechanism,
feeding Tay harmful data
◉ Lessons Learned: Importance of filtering and validating input data,
especially for models that learn in real-time



The Deepfake dilemma
◉ Deepfakes use deep learning to superimpose images or videos onto source
media, creating realistic-looking fake content
◉ Notable Incidents: Fake videos of politicians or celebrities saying or doing things
they never did
➢ https://www.youtube.com/watch?v=gLoI9hAX9dw
◉ Implications: Misinformation, defamation and security threats
◉ Defensive Measures: Detection algorithms, digital watermarking, legislation
➢ laws, regulations, and other formal enactments by governmental bodies to
address the creation, dissemination, and consequences of deepfakes



Road hazards: fooling Autonomous Vehicles
◉ Scenario: Research showed that subtle manipulations, like adding stickers to
stop signs, could confuse self-driving cars
◉ Danger: A misinterpreted stop sign could lead to life-threatening situations
◉ Cause: ML models in cars being overly sensitive to slight visual changes
◉ Countermeasures: Multi-modal sensing, redundancy, adversarial training
➢ Redundancy: having backup systems in place to take over if the primary system
fails or is compromised

https://arxiv.org/pdf/2307.08278.pdf



defences



Guarding the gold: protecting data in ML

◉ Differential privacy: Adding noise to data or queries to ensure individual data points aren't identifiable
➢ this can be applied during data collection and model training (a minimal sketch follows this list)
◉ Federated learning: Training models across devices without centrally storing
the data -- data privacy as raw data never leaves its origin
➢ the model is trained locally on each device
➢ model updates are shared centrally
◉ Homomorphic encryption: Processing data while it's still encrypted
➢ ML models can be trained on, or make predictions with, encrypted data
without ever needing to decrypt it, ensuring data remains confidential
◉ Data anonymisation: Removing personally identifiable information from
datasets so that individuals cannot readily be identified
➢ safer for use in ML training without compromising privacy
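
As a minimal sketch of the differential-privacy item above: the Laplace mechanism adds calibrated noise to an aggregate query before it is released. The sensitivity of 1 (for a counting query), the epsilon value, and the toy records are illustrative assumptions.

```python
# Minimal sketch of the Laplace mechanism for a differentially private count.
import numpy as np

def laplace_count(records, epsilon=1.0):
    """Release a count with Laplace noise; a counting query has sensitivity 1."""
    true_count = len(records)
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

patients_with_condition = list(range(42))  # illustrative records
print("noisy count:", laplace_count(patients_with_condition, epsilon=0.5))
```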



Defending - Poisoning Attacks
Reminder: Target the training phase of a machine learning pipeline, aiming to corrupt, modify,
or insert malicious data such that the model learns an incorrect or biased behaviour
◉ Data validation: Regularly audit and validate training data sources for anomalies
or unexpected patterns
◉ Model monitoring: Continuously monitor model predictions and retrain the
model if a drop in performance or unexpected behaviour is observed
◉ Robust learning algorithms: Employ algorithms designed to be resilient against
adversarial data or noise
◉ Anomaly detection: Use techniques to identify and remove outlier data points
that might be indicative of poisoning attempts
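
A minimal sketch of the anomaly-detection idea, assuming scikit-learn's IsolationForest; the synthetic "clean" and "poisoned" points and the 5% contamination rate are illustrative assumptions.

```python
# Minimal sketch of anomaly detection as a poisoning defence: flag and drop
# outlying training points before fitting the model.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
clean = rng.normal(0, 1, size=(950, 10))   # legitimate training points
poison = rng.normal(6, 1, size=(50, 10))   # injected outliers
X_train = np.vstack([clean, poison])

detector = IsolationForest(contamination=0.05, random_state=0).fit(X_train)
keep_mask = detector.predict(X_train) == 1  # +1 = inlier, -1 = flagged outlier

X_filtered = X_train[keep_mask]
print("kept", len(X_filtered), "of", len(X_train), "training points")
```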



Defending - Evasion Attacks
Reminder: adding carefully crafted perturbations to the input data that are often imperceptible to
humans but can drastically change the model’s prediction
◉ Pre-processing Defences:
➢ Modify the input data to neutralise adversarial perturbations
➢ Input validation: Checking for suspicious input alterations
➢ e.g., Image denoising, feature squeezing, spatial smoothing
◉ Post-processing Defences:
➢ Alter the model's outputs after inference to counteract adversarial effects
➢ e.g., Rejecting low-confidence predictions, result calibration
◉ Adversarial Training:
➢ Train the model on adversarial examples to improve its robustness
➢ Makes the model inherently resistant to adversarial perturbations
◉ Model Architecture Defences:
➢ Designing model architectures that are intrinsically robust
❖ Model regularisation: Penalising complex model behaviours that might be overfit to adversarial
samples
◉ Randomisation:
➢ Introduce randomness in the model's layers or inputs to disrupt the adversarial intent
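
Of the defences above, adversarial training is perhaps the most widely studied. Below is a minimal sketch of one training step, assuming PyTorch; the stand-in model, random batch, and epsilon are placeholders, and a real pipeline would loop over a proper data loader.

```python
# Minimal sketch of adversarial training: augment each batch with FGSM
# examples and train on both the clean and the adversarial inputs.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
epsilon = 0.1

def fgsm(x, y):
    """Craft FGSM examples against the current model."""
    x = x.clone().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

# One illustrative training step on a random batch (stand-in for a real loader).
x_batch = torch.rand(32, 1, 28, 28)
y_batch = torch.randint(0, 10, (32,))

x_adv = fgsm(x_batch, y_batch)
optimizer.zero_grad()
loss = loss_fn(model(x_batch), y_batch) + loss_fn(model(x_adv), y_batch)
loss.backward()
optimizer.step()
print("combined clean + adversarial loss:", loss.item())
```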



Defending - Model Inversion Attacks
Reminder: Aim to reconstruct input data or sensitive attributes given access to a
trained machine learning model and its outputs

◉ Differential privacy: Add noise to the model's output, making it harder to reverse engineer the original data
◉ Model regularisation: Prevent the model from memorising training data too
closely
◉ Access control: Restrict access to the model and its outputs
◉ Output masking: Limit the precision of output values to reduce the amount
of information available to an attacker
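
A minimal sketch of the output-masking idea, assuming scikit-learn; the model, data, and one-decimal rounding are illustrative assumptions. Coarser outputs carry less information for an inversion model to learn from.

```python
# Minimal sketch of output masking: round predicted probabilities before
# releasing them, limiting what an inversion attack can recover.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

def masked_predict_proba(model, X, decimals=1):
    """Return coarsened probabilities instead of full-precision scores."""
    return np.round(model.predict_proba(X), decimals=decimals)

print(model.predict_proba(X[:1]))          # full-precision internal output
print(masked_predict_proba(model, X[:1]))  # masked output exposed to clients
```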



Defending - Membership Inference Attacks
Reminder: They determine whether a particular data
point was part of the training dataset of a machine
learning model
◉ Differential Privacy: Applying noise to model
outputs to ensure that outputs do not reveal
specific information about any single training
example
◉ Limiting output granularity: Instead of giving
detailed confidence scores, give binary or coarser
outputs
◉ Generalisation: Train models to generalise well,
so that they do not exhibit markedly different
behaviours for training and non-training data
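
A minimal sketch of the "limiting output granularity" idea above, assuming scikit-learn; the model and data are illustrative. Exposing only hard labels withholds the confidence scores that a membership inference attacker would otherwise exploit.

```python
# Minimal sketch of limiting output granularity: expose only the predicted
# label through the public endpoint, never the probability scores.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def label_only_api(model, X):
    """Public endpoint: returns hard labels and withholds predict_proba entirely."""
    return model.predict(X)

print(label_only_api(model, X[:3]))
```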



Defending - Model Extraction Attacks
Reminder: The act of replicating a machine learning model without direct access to its
parameters or training data, typically by only observing its outputs
◉ Rate limiting: Restrict the frequency at which users can query the model to prevent
exhaustive extraction
◉ Adding noise: Introduce minor, random noise to predictions, making it more
challenging to replicate the model accurately
◉ Model hardening: Use techniques like knowledge distillation to create models that
are inherently harder to mimic
◉ Monitoring & detection: Monitor API or model access patterns to detect potential
theft behaviours, like repetitive or systematic querying
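
A minimal sketch combining the rate-limiting and noise ideas above, assuming scikit-learn for the model; the per-client quota, the time window, the noise scale, and the `guarded_predict` helper are illustrative assumptions rather than recommended values.

```python
# Minimal sketch of rate limiting plus output noise as extraction defences.
import time
from collections import defaultdict, deque

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

WINDOW_SECONDS = 60
MAX_QUERIES_PER_WINDOW = 100
recent_queries = defaultdict(deque)  # per-client query timestamps

def guarded_predict(model, client_id, X, noise_scale=0.01):
    """Serve predictions only within the client's quota, with small output noise."""
    now = time.time()
    timestamps = recent_queries[client_id]
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()                       # drop queries outside the window
    if len(timestamps) >= MAX_QUERIES_PER_WINDOW:
        raise RuntimeError("rate limit exceeded")  # an API might return HTTP 429 here
    timestamps.append(now)
    probs = model.predict_proba(X)
    noisy = probs + np.random.normal(0.0, noise_scale, probs.shape)
    return np.clip(noisy, 0.0, 1.0)                # slightly perturbed outputs

# Illustrative usage with a toy model.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
print(guarded_predict(model, client_id="client-42", X=X[:1]))
```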



MITRE ATLAS
◉ A globally accessible, living knowledge base of adversary tactics and techniques against AI-enabled systems

◉ Based on real-world attack observations and realistic demonstrations from AI red teams and security groups

https://atlas.mitre.org/



MITRE ATLAS Matrix



Building resilient models
◉ Accuracy: model's general performance
➢ the percentage of correct predictions a model makes
◉ Robustness: ability to perform reliably under adversarial conditions
➢ make correct predictions even when the input data is intentionally and
maliciously perturbed
◉ Need for models to maintain performance under adversarial conditions
➢ Many state-of-the-art models, while achieving high accuracy on
standard datasets, have been shown to be vulnerable to adversarial
attacks
◉ Generalisation: Ability of models to perform well on unseen data
◉ Noise tolerance: Models shouldn’t be overly sensitive to small input
changes


tools



References
◉ “Machine Learning Security Principles” by John Paul Mueller, Rod Stephens,
https://www.oreilly.com/library/view/machine-learning-security/9781804618851/
◉ Google images
◉ Github tools
➢ https://github.com/Trusted-AI/adversarial-robustness-toolbox
➢ https://github.com/topics/adversarial-machine-learning
◉ https://en.wikipedia.org/wiki/Adversarial_machine_learning



End of week 3!

