SANS - Draft - Critical AI Security Controls V1.1
SANS - Draft - Critical AI Security Controls V1.1
AI Security
Guidelines, v1.1
Access Controls
Data Protection
Deployment Strategies
Inference Security
Monitoring
Our initial goal was to identify and establish technical security considerations for
implementing and utilizing AI within an environment. However, as this space grows, our
recommendations have evolved to include governance and compliance frameworks. Topics
such as context window management, AI bill-of-materials (AIBOM), and adherence to
evolving regulatory standards like the EU AI Act are explored to guide organizations and
readers in navigating a complex AI landscape.
From applying least privilege principles to managing vector database access, this paper
highlights risks associated with unauthorized interactions and data tampering, and potential
data leakage, providing actionable recommendations to mitigate AI-centric concerns and
vulnerabilities. Specific emphasis also is placed on guarding against adversarial threats,
including model poisoning, prompt injection, and exploitation of public models.
With the rapid advancements in AI, the security landscape is in constant flux. During our
editing period, Hangzhou-based DeepSeek released its controversial chatbot application
and the U.S.-based Stargate AI infrastructure project emerged. As these events—among
many others we will witness—illustrate, securing AI is an ongoing challenge requiring
adaptive controls. Thus, this paper captures point-in-time security recommendations.
Thus, this paper captures point-in-time safety and security recommendations. At SANS,
we’ve discovered that many organizations are implementing AI slowly, using risk-based
approaches to determine where integrations will be most advantageous. It is in these
processes we offer our advice. By considering and/or implementing these comprehensive
measures, enterprises can securely leverage AI capabilities while minimizing risks, ensuring
operational efficiency and trustworthiness in an era of increasing reliance on generative and
machine learning (ML) models.
A
ccess Controls
Protect Your Model Parameters We’ve seen AI access controls range from hard-coded
It is critical to ensure traditional security controls, such as principle privileges to zero trust principles, especially as
access extends beyond individual users to connected
of least privilege and strong access controls with accountability,
devices, applications, APIs, and other systems. As you
have been implemented. Should an unauthorized individual be deploy AI capabilities throughout your organization,
able to replace or modify a deployed model, untold damage can we recommend utilizing least privilege principles
result. Although this applies to any kind of AI model developed and and integrating with your current authentication
mechanisms. However, if you plan to move toward
deployed by an enterprise—including training models, which are
stronger access controls, such as zero trust, consider
highly susceptible to poisoning and attacks—consider the example of AI in this transition.
a generative agentic system leveraging a large language model (LLM).
Imagine a bad actor has found an avenue that allows for tampering with the deployed
model or the code that drives it. What if the attacker were to tamper with the model or
prompt for a model that has the role of an auditor agent? If this auditor agent is relied
upon by the rest of the ensemble solution to decide whether responses are appropriate,
it may suddenly become possible to cause inappropriate responses to be generated, since
the auditor, which acts as a gatekeeper, has been subverted.
Extending this to other applications, if an AI model is being used to decide whether some
activity is malicious or not, allowed or disallowed, and the model can be subverted or
replaced, the model is no longer a reliable control.
1
https://owaspai.org/goto/runtimemodeltheft/
Protecting augmentation data requires more than just applying access controls. Data
stored in these databases should be treated as sensitive, especially if it influences LLM
responses. If tampered with, this data can cause models to generate mislreading or
dangerous outputs.
In addition to enforcing least-privilege access models for both read and write operations,
organizaitons should implement secure upload pipelines, logging and auditing of changes,
and validation mechanisms to detect unauthorized modifications. Encryption at rest and
in transit, along with digital signing of documents or chunks, can enhance trust in the
augmentation layer.
Data Protection
Protecting training data is critical to ensuring AI models maintain integrity and reliability.
Without proper safeguards, adversaries can manipulate training data and introduce
vulnerabilities. Figure 1 outlines the techniques for securing sensitive data, preventing
unauthorized modifications, and enforcing strict governance over data usage.
Defend Training Data Avoid Data Commingling Limit Sensitive Prompt Content
• M
odels are only as good as their • L everaging enterprise data allows • A ttackers with unauthorized access to an
training data. Adversarial access can for better grounded applications. organization’s prompts can infer sensitive
negatively impact training data, information such as internal business processes,
hiding malicious activities. • S ensitive data should be sanitized proprietary data, decision logic, or even personally
or anonymized prior to LLM identifiable information (PII).
• N
ot just in the context of LLMs; be corporation.3
concerned with data used to train a • L imit the inclusion of sensitive content in prompts
non-LLM model that will make wherever possible and avoid using prompts to pass
security or operational decisions. confidential information to LLMs and data exposure.
2
“Mitigating Security Risks in Retrieval Augmented Generation (RAG) LLM Applications,” November 2023,
https://cloudsecurityalliance.org/blog/2023/11/22/mitigating-security-risks-in-retrieval-augmented-generation-rag-llm-applications
3
“ Census Bureau Director Defends Use of Differential Privacy,” December 2022,
https://epic.org/census-bureau-director-defends-use-of-differential-privacy/
When weighing where and how to host AI solutions, be sure to think carefully about
and codify legal requirements in any contracts. For example, will your data ever be
used or retained by the provider for training or refining a model? If the provider claims
that they will not store or use your data, what steps have been taken to prevent your
data from being logged when sent to and processed by the API endpoint? How are
these logs controlled? How long are they stored? Who has access to them?
Many organizations are, rightly, trying to leverage LLMs for knowledge retrieval and
customer interaction. The approach some are taking is to fine-tune the model to
handle specific types of questions and to introduce additional information into
the model. For deployment, a great deal of effort goes into prompt engineering to
prevent the model from disclosing information that a particular user might not have
the right to access.
For example, the LLM answers questions for employees and customers. Employees
have the right to access more information than customers. Attempting to implement
these types of controls within the model is error prone and can often be easily
subverted. Instead, consider a RAG-style approach with ACLs applied in the vector
retrieval system from which responses are generated. This eliminates the need
to attempt to implement these guardrails in the LLM. This approach also has the
not-so-subtle benefit of limiting the likelihood of so-called hallucinations in the
responses from the LLM.
In addition, organizations should pay close attention to the use of function calling,
especially in agentic AI systems. If not properly scoped, function calls may allow
models to invoke external tools or actions beyond their intended purpose. Limit
access to critical functions and monitor usage.
Models also may be created by bad actors with architectural backdoors in them.
The idea of a backdoor like this would be to invoke a specific behavior in response
to a specific input. Once a backdoor is created inside of a model, it can be difficult,
if not impossible, to remove it via fine turning. This becomes an issue if a user
inadvertently triggers the backdoor or, if exposed outside the organization, a bad
actor looks for their backdoor across a swath of models.5
4
“Exploiting ML models with pickle file attacks: Part 1,” June 2024,
https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-1/
5
“ Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training,” January 2024,
www.anthropic.com/research/sleeper-agents-training-deceptive-llms-that-persist-through-safety-training
• How much effort would be required to properly sandbox this generated code?
• If you are already building agent-based solutions, have your dev teams
thought about this issue? What safeguards have been introduced?
Remember that developers are often hyper-focused on use cases and are
notoriously bad at threat modeling and abuse cases.6 Beyond malicious actors
creating models with code or backdoors, there is always the risk that models have
not been fully tested. Understand that by using public models, you are placing trust
in an unknown data scientist to create a model with no flaws and no incorrect or
unadvertised behaviors.
Luckily, some testing can be done in house. Manually red team all imported models.
Host vetted models in an internal model garden for developers to easily obtain. Red
team all solutions that are built, when these models or in-house models are used,
specifically attempting to find abuse cases.
For more reference, see OWASP’s AI Supply Chain guidance, part of their larger AI
Exchange.7
6
“Backdoors in Computational Graphs,” October 2024, https://hiddenlayer.com/innovation-hub/shadowlogic/
7
https://owaspai.org/goto/supplychainmanage/
Additionally, guardrails can be integrated with other LLMs. In this case, the user’s request and
the LLM’s response are passed to another LLM to detect any “trickery” attempts.8
Even with these guardrails, recognize that you cannot rely upon them to be infallible. Bad actors
are notoriously good at coming up with creative ways to convince an AI model to do things its
guardrails specifically prohibit. Ultimately, if there is information or actions that your AI should
never disclose or take, the wiser course is to ensure your model does not have access to that
information and is not given access that can lead to those actions.
8
“ The landscape of LLM guardrails: intervention levels and techniques,” June 2024,
www.ml6.eu/blogpost/the-landscape-of-llm-guardrails-intervention-levels-and-techniques
9
https://owasp.org/www-project-top-10-for-large-language-model-applications/
Modality
Although amazing, multimodal implementations increase the attack surface. Common
sense and research suggest safety and alignment can prove inconsistent across different
modalities. As an example, a text-only prompt that might have been considered unsafe
could be allowed if the text were instead submitted as an image.
It has been shown that multilingual and multicharacter models can introduce new
vulnerabilities and expand the attack surface. Multilingual jailbreak challenges have
been observed when utilizing prompts in a language other than the primary training
data.10 Research has shown this can result in jailbreaking or providing instructions to
deliberately attack vulnerable LLMs. The same is true for character sets, which have
been shown to increase hallucinations and comprehension errors.11 Additional research
highlights that when instructions involve Unicode characters outside the standard Latin
or variants of other languages, a reduction in guardrail efficiency is observed.
Encoding/Decoding
Many foundation models, even relatively small ones, often can handle input and output
using different encoding schemes. Encoded prompt/response data might be able to bypass
security, safety, and alignment measures. Testing by this paper’s authors has shown models
often could handle Base64, Hex, or Morse encoded data input without even being explicitly
told the formatting or asking for decoding. This even includes smaller and/or open models
like GPT-4o mini, Gemini Flash 1.5, Claude 3.5 Haiku, Llama 3.1, DeepSeek v2.5.
Compression/Decompression
Another means of input/output obfuscation available to adversaries could include
methods of compression and decompression. Support for handling various compression
and decompression schemes varies substantially across model implementations.
10
“Multilingual Jailbreak Challenges in Large Language Models,” March 2024, https://arxiv.org/abs/2310.06474
11
“Impact of Non-Standard Unicode Characters on Security and Comprehension in Large Language Models,” May 2024, https://arxiv.org/abs/2405.14490
Observing API usage for misuse is critical. Anomalous spikes in API usage can serve as
an effective detection method for abuse, while rate limiting should be considered to
restrict the number and cadence of interactions allowed. In addition to rate limiting,
organizations also should consider other forms of behavior/anomaly detection.
Although not limited
to interactions through APIs, adversaries can easily automate inputs to exposed API
endpoints, making them more susceptible to volumetric attacks. To mitigate this risk,
internal training or inference API endpoints, should not be public facing.
With the evolving regulatory landscape surrounding AI, organizations must establish
governance frameworks that align with industry standards and legal requirements. This
section discusses the importance of AI risk management, model registries, AI bill of
materials (AIBOMs), and regulatory adherence to ensure ethical and compliant AI usage.
Maintain an AI Bill-Of-Materials
LLM applications depend upon a complex underlying ecosystem for their functionality. Modeled
after software bill of materials (SBOM), creation and maintenance of an AIBOM can provide better
visibility into relevant aspects of the AI supply chain, including considerations of dataset and
model provenance. AIBOMs contain technical details that are useful to adversaries in attacking LLM
applications. Care should be taken to limit the disclosure of AIBOMs.
Model Registries
Model registries are centralized repositories that track and manage ML models through their life cycle,
from development to deployment. These can be a valuable addition to your AI deployment workflows,
providing security and governance benefits. Registries track model versions, dependencies, and
training data, ensuring full traceability and enabling rollback, if needed.
Benefits also include:
• A
ccess controls to prevent unauthorized modifications or deployments
• M
onitoring and drift detection to track performance over time, detecting adversary manipulation
• R
eproducibility and consistency, ensuring that models are deployed with correct configurations and
dependencies, preventing unauthorized changes that could introduce vulnerabilities
• Secure storage of model artifacts and associated metadata, preventing unauthorized storage
• C
I/CD integration, enabling automated checks and validation during the model deployment process
Though not mandated, tracking to and adherence with other AI/LLM security frameworks or
guidance like SANS AI Security Controls, NIST AI Risk Management Framework, MITRE ATLAS,™ or
OWASP Top 10 for LLM also can prove beneficial.
Artificial Intelligence Act European Union August 2024 Establishes a risk-based classification system for AI
applications12
ELVIS Act United States March 2024 Addresses unauthorized use of AI in replicating
voices and likenesses13
Executive Order 14110: Safe, Secure, and Trustworthy United States October 2023 Defines national policy goals for AI governance and
Development and Use of Artificial Intelligence mandates agency actions14
Framework Convention on Artificial Intelligence Council of Europe September 2024 Emphasizes human rights and democratic values in
AI development15
Interim Measures for the Management of Generative China August 2023 Ensures generative AI aligns with socialist values and
AI Services (生成式人工智能服务管理暂行办法) prevents misuse16
Israel’s Policy on Artificial Intelligence Regulation Israel December 2023 Advocates for a sector-based, risk-oriented approach
and Ethics to AI regulation17
Safe and Secure Innovation for Frontier Artificial United States September 2024 Mandates safety tests for powerful AI models to
Intelligence Models Act mitigate catastrophic risks18
Utah’s Artificial Intelligence Policy Act Utah, USA March 2024 Establishes liability and oversight for generative
AI usage19
12
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52021PC0206
13
www.capitol.tn.gov/Bills/113/Bill/SB2096.pdf
14
www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
15
www.coe.int/en/web/artificial-intelligence/the-framework-convention-on-artificial-intelligence
16
www.cac.gov.cn/2023-07/13/c_1690898327029107.htm
17
www.gov.il/BlobFolder/policy/ai_2023/en/Israels%20AI%20Policy%202023.pdf
18
https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=202320240SB1047
19
https://le.utah.gov/~2024/bills/static/SB0149.html
Governance, risk management, and compliance frameworks play a critical role in responsible
AI implementations. With continuous testing, monitoring, and adherence to evolving
regulatory requirements, organizations can maintain AI reliability and mitigate potential
security risks. Furthermore, a multilayered approach to inference security, including strict
input validation and output filtering, is necessary to prevent model exploitation.