Policies - Kubernetes

Kubernetes policies allow cluster administrators to manage security, resource allocation, and best practices. There are several types of policies including LimitRanges to constrain resource allocation, ResourceQuotas to limit resource consumption, and admission controllers to validate or modify API requests according to certain policies. Policies can also be implemented using ValidatingAdmissionPolicy, dynamic admission control via webhooks, and Kubelet configurations.

Uploaded by

Fantahun Fkadie
6/6/23, 3:59 PM Policies | Kubernetes

Policies
Manage security and best-practices with policies.

1: Limit Ranges
2: Resource Quotas
3: Process ID Limits And Reservations
4: Node Resource Managers

Kubernetes policies are configurations that manage other configurations or runtime
behaviors. Kubernetes offers various forms of policies, described below:

Apply policies using API objects


Some API objects act as policies. Here are some examples:

NetworkPolicies can be used to restrict ingress and egress traffic for a workload.
LimitRanges manage resource allocation constraints across different object kinds.
ResourceQuotas limit resource consumption for a namespace.
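
As an illustration of an API object acting as a policy, the following is a minimal sketch of a NetworkPolicy that denies all ingress traffic to Pods in one namespace. The object name and the namespace demo are hypothetical, and enforcement assumes the cluster's network plugin supports NetworkPolicy.

```yaml
# Hypothetical example: the name and namespace are illustrative only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: demo
spec:
  podSelector: {}        # empty selector: applies to every Pod in the namespace
  policyTypes:
  - Ingress              # no ingress rules are listed, so all ingress is denied
```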

Apply policies using admission controllers


An admission controller runs in the API server and can validate or mutate API requests. Some
admission controllers act to apply policies. For example, the AlwaysPullImages admission
controller modifies a new Pod to set the image pull policy to Always .

Kubernetes has several built-in admission controllers that are configurable via the API server
--enable-admission-plugins flag.

Details on admission controllers, with the complete list of available admission controllers, are
documented in a dedicated section:

Admission Controllers

Apply policies using ValidatingAdmissionPolicy


Validating admission policies allow configurable validation checks to be executed in the API
server using the Common Expression Language (CEL). For example, a
ValidatingAdmissionPolicy can be used to disallow use of the latest image tag.

A ValidatingAdmissionPolicy operates on an API request and can be used to block, audit,
and warn users about non-compliant configurations.
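
The latest-tag check mentioned above could be sketched roughly as follows. This is an illustration, not the documented example: the object names are hypothetical, the manifest assumes a cluster that serves admissionregistration.k8s.io/v1, and the expression only rejects an explicit :latest suffix on regular containers (untagged images and init containers are not covered).

```yaml
# Sketch only. Assumes a cluster that serves admissionregistration.k8s.io/v1
# (older releases expose this API as v1beta1 or v1alpha1).
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: disallow-latest-tag            # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  - expression: "object.spec.containers.all(c, !c.image.endsWith(':latest'))"
    message: "container images must not use the ':latest' tag"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: disallow-latest-tag-binding    # hypothetical name
spec:
  policyName: disallow-latest-tag
  validationActions: ["Deny"]          # "Warn" and "Audit" are the other actions
```

A policy has no effect until a binding references it, which is why the sketch includes both objects.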

Details on the ValidatingAdmissionPolicy API, with examples, are documented in a
dedicated section:

Validating Admission Policy

Apply policies using dynamic admission control

Dynamic admission controllers (or admission webhooks) run outside the API server as
separate applications that register to receive webhooks requests to perform validation or
mutation of API requests.

Dynamic admission controllers can be used to apply policies on API requests and trigger other
policy-based workflows. A dynamic admission controller can perform complex checks
including those that require retrieval of other cluster resources and external data. For
example, an image verification check can lookup data from OCI registries to validate the
container image signatures and attestations.
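
A webhook of this kind is registered with the API server through a configuration object. The following is a minimal sketch of such a registration for Pod validation; the webhook name, Service name, namespace, and path are hypothetical, and the webhook server itself is not shown.

```yaml
# Sketch only: the webhook name, Service name, namespace, and path are hypothetical.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-verification.example.com
webhooks:
- name: image-verification.example.com
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail          # reject requests if the webhook cannot be reached
  clientConfig:
    service:
      name: image-verifier
      namespace: policy-system
      path: /validate
    # caBundle: <base64-encoded CA certificate that signed the webhook's serving certificate>
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["pods"]
```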
https://kubernetes.io/docs/concepts/policy/_print/ 1/21

Details on dynamic admission control are documented in a dedicated section:

Dynamic Admission Control

Implementations

Note: This section links to third party projects that provide functionality required by
Kubernetes. The Kubernetes project authors aren't responsible for these projects, which
are listed alphabetically. To add a project to this list, read the content guide before
submitting a change. More information.

Dynamic Admission Controllers that act as flexible policy engines are being developed in the
Kubernetes ecosystem, such as:

Kubewarden
Kyverno
OPA Gatekeeper
Polaris

Apply policies using Kubelet configurations


Kubernetes allows configuring the Kubelet on each worker node. Some Kubelet configurations
act as policies:

Process ID limits and reservations are used to limit and reserve allocatable PIDs.
Node Resource Managers can manage compute, memory, and device resources for
latency-critical and high-throughput workloads.
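
As a rough sketch of a Kubelet configuration acting as a policy, the fragment below sets a per-Pod PID limit and reserves PIDs for system daemons. The numeric values are arbitrary placeholders, and the fragment is assumed to be merged into the kubelet configuration file already used on the node.

```yaml
# Sketch only: the numeric values are arbitrary placeholders.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
podPidsLimit: 1024      # maximum number of PIDs any single Pod may use
systemReserved:
  pid: "1000"           # PIDs set aside for operating system daemons
kubeReserved:
  pid: "100"            # PIDs set aside for Kubernetes system daemons
```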


1 - Limit Ranges
By default, containers run with unbounded compute resources on a Kubernetes cluster. Using
Kubernetes resource quotas, administrators (also termed cluster operators) can restrict
consumption and creation of cluster resources (such as CPU time, memory, and persistent
storage) within a specified namespace. Within a namespace, a Pod can consume as much CPU
and memory as is allowed by the ResourceQuotas that apply to that namespace. As a cluster
operator, or as a namespace-level administrator, you might also be concerned about making
sure that a single object cannot monopolize all available resources within a namespace.

A LimitRange is a policy to constrain the resource allocations (limits and requests) that you
can specify for each applicable object kind (such as Pod or PersistentVolumeClaim) in a
namespace.

A LimitRange provides constraints that can:

Enforce minimum and maximum compute resources usage per Pod or Container in a
namespace.
Enforce minimum and maximum storage request per PersistentVolumeClaim in a
namespace.
Enforce a ratio between request and limit for a resource in a namespace.
Set default request/limit for compute resources in a namespace and automatically inject
them to Containers at runtime.
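
Two of the constraints listed above, the request-to-limit ratio and the storage bounds on a PersistentVolumeClaim, might be expressed as in this sketch. The object name and all values are illustrative assumptions.

```yaml
# Sketch only: the name and values are illustrative.
apiVersion: v1
kind: LimitRange
metadata:
  name: ratio-and-storage-constraints
spec:
  limits:
  - type: Container
    maxLimitRequestRatio:
      cpu: "4"          # a container's CPU limit may be at most 4x its CPU request
  - type: PersistentVolumeClaim
    min:
      storage: 1Gi      # smallest storage request a claim may make
    max:
      storage: 10Gi     # largest storage request a claim may make
```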

A LimitRange is enforced in a particular namespace when there is a LimitRange object in that
namespace.

The name of a LimitRange object must be a valid DNS subdomain name.

Constraints on resource limits and requests


The administrator creates a LimitRange in a namespace.
Users create (or try to create) objects in that namespace, such as Pods or
PersistentVolumeClaims.
First, the LimitRange admission controller applies default request and limit values for
all Pods (and their containers) that do not set compute resource requirements.
Second, the LimitRange tracks usage to ensure it does not exceed resource minimum,
maximum and ratio defined in any LimitRange present in the namespace.
If you attempt to create or update an object (Pod or PersistentVolumeClaim) that
violates a LimitRange constraint, your request to the API server will fail with an HTTP
status code 403 Forbidden and a message explaining the constraint that has been
violated.
If you add a LimitRange in a namespace that applies to compute-related resources
such as cpu and memory , you must specify requests or limits for those values.
Otherwise, the system may reject Pod creation.
LimitRange validations occur only at Pod admission stage, not on running Pods. If you
add or modify a LimitRange, the Pods that already exist in that namespace continue
unchanged.
If two or more LimitRange objects exist in the namespace, it is not deterministic which
default value will be applied.

LimitRange and admission checks for Pods


A LimitRange does not check the consistency of the default values it applies. This means that
a default value for the limit that is set by LimitRange may be less than the request value
specified for the container in the spec that a client submits to the API server. If that happens,
the final Pod will not be schedulable.

For example, you define a LimitRange with this manifest:


concepts/policy/limit-range/problematic-limit-range.yaml

apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-resource-constraint
spec:
  limits:
  - default: # this section defines default limits
      cpu: 500m
    defaultRequest: # this section defines default requests
      cpu: 500m
    max: # max and min define the limit range
      cpu: "1"
    min:
      cpu: 100m
    type: Container

along with a Pod that declares a CPU resource request of 700m , but not a limit:

concepts/policy/limit-range/example-conflict-with-limitrange-cpu.yaml

apiVersion: v1
kind: Pod
metadata:
  name: example-conflict-with-limitrange-cpu
spec:
  containers:
  - name: demo
    image: registry.k8s.io/pause:2.0
    resources:
      requests:
        cpu: 700m

then that Pod will not be scheduled, failing with an error similar to:

Pod "example-conflict-with-limitrange-cpu" is invalid: spec.containers[0].resourc

If you set both request and limit, then that new Pod will be scheduled successfully even
with the same LimitRange in place:

concepts/policy/limit-range/example-no-conflict-with-limitrange-cpu.yaml


apiVersion: v1
kind: Pod
metadata:
  name: example-no-conflict-with-limitrange-cpu
spec:
  containers:
  - name: demo
    image: registry.k8s.io/pause:2.0
    resources:
      requests:
        cpu: 700m
      limits:
        cpu: 700m

Example resource constraints


Examples of policies that could be created using LimitRange are:

In a 2 node cluster with a capacity of 8 GiB RAM and 16 cores, constrain Pods in a
namespace to request 100m of CPU with a max limit of 500m for CPU and request
200Mi for Memory with a max limit of 600Mi for Memory.
Define default CPU limit and request to 150m and memory default request to 300Mi for
Containers started with no cpu and memory requests in their specs.
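
One way to read the two example policies above is as a single LimitRange. The sketch below is an interpretation, not the documented manifest: it treats the bounds as per-container minimums and maximums, and the object name is hypothetical.

```yaml
# Sketch only: one possible encoding of the example policies described above.
apiVersion: v1
kind: LimitRange
metadata:
  name: example-resource-constraints
spec:
  limits:
  - type: Container
    min:
      cpu: 100m
      memory: 200Mi
    max:
      cpu: 500m
      memory: 600Mi
    default:            # default limits for containers that set none
      cpu: 150m
    defaultRequest:     # default requests for containers that set none
      cpu: 150m
      memory: 300Mi
```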

In the case where the total limits of the namespace is less than the sum of the limits of the
Pods/Containers, there may be contention for resources. In this case, the Containers or Pods
will not be created.

Neither contention nor changes to a LimitRange will affect already created resources.

What's next
For examples on using limits, see:

how to configure minimum and maximum CPU constraints per namespace.
how to configure minimum and maximum Memory constraints per namespace.
how to configure default CPU Requests and Limits per namespace.
how to configure default Memory Requests and Limits per namespace.
how to configure minimum and maximum Storage consumption per namespace.
a detailed example on configuring quota per namespace.

Refer to the LimitRanger design document for context and historical information.


2 - Resource Quotas
When several users or teams share a cluster with a fixed number of nodes, there is a concern
that one team could use more than its fair share of resources.

Resource quotas are a tool for administrators to address this concern.

A resource quota, defined by a ResourceQuota object, provides constraints that limit
aggregate resource consumption per namespace. It can limit the quantity of objects that can
be created in a namespace by type, as well as the total amount of compute resources that
may be consumed by resources in that namespace.

Resource quotas work like this:

Different teams work in different namespaces. This can be enforced with RBAC.

The administrator creates one ResourceQuota for each namespace.

Users create resources (pods, services, etc.) in the namespace, and the quota system
tracks usage to ensure it does not exceed hard resource limits defined in a
ResourceQuota.

If creating or updating a resource violates a quota constraint, the request will fail with
HTTP status code 403 FORBIDDEN with a message explaining the constraint that would
have been violated.

If quota is enabled in a namespace for compute resources like cpu and memory , users
must specify requests or limits for those values; otherwise, the quota system may reject
pod creation. Hint: Use the LimitRanger admission controller to force defaults for pods
that make no compute resource requirements.

See the walkthrough for an example of how to avoid this problem.

Note:
For cpu and memory resources, ResourceQuotas enforce that every (new) pod in
that namespace sets a limit for that resource. If you enforce a resource quota in a
namespace for either cpu or memory , you, and other clients, must specify either
requests or limits for that resource, for every new Pod you submit. If you don't,
the control plane may reject admission for that Pod.
For other resources: ResourceQuota works and will ignore pods in the namespace
without setting a limit or request for that resource. It means that you can create a
new pod without limit/request ephemeral storage if the resource quota limits the
ephemeral storage of this namespace. You can use a LimitRange to automatically
set a default request for these resources.

The name of a ResourceQuota object must be a valid DNS subdomain name.

Examples of policies that could be created using namespaces and quotas are:

In a cluster with a capacity of 32 GiB RAM, and 16 cores, let team A use 20 GiB and 10
cores, let B use 10GiB and 4 cores, and hold 2GiB and 2 cores in reserve for future
allocation.
Limit the "testing" namespace to using 1 core and 1GiB RAM. Let the "production"
namespace use any amount.
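
The first example policy could be approximated with one ResourceQuota per team namespace. The sketch below covers team A and the "testing" namespace; the namespace name team-a and both object names are hypothetical, and it assumes the stated amounts are meant as caps on requests.

```yaml
# Sketch only: namespace and object names are hypothetical.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-compute
  namespace: team-a
spec:
  hard:
    requests.cpu: "10"       # 10 cores for team A
    requests.memory: 20Gi    # 20 GiB for team A
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: testing-compute
  namespace: testing
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
```

Leaving the "production" namespace without a ResourceQuota lets it use any amount, as the second example describes.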

In the case where the total capacity of the cluster is less than the sum of the quotas of the
namespaces, there may be contention for resources. This is handled on a first-come-first-
served basis.

Neither contention nor changes to quota will affect already created resources.


Enabling Resource Quota


Resource Quota support is enabled by default for many Kubernetes distributions. It is enabled
when the API server --enable-admission-plugins= flag has ResourceQuota as one of its
arguments.

A resource quota is enforced in a particular namespace when there is a ResourceQuota in
that namespace.

Compute Resource Quota


You can limit the total sum of compute resources that can be requested in a given
namespace.

The following resource types are supported:

Resource Name       Description

limits.cpu          Across all pods in a non-terminal state, the sum of CPU limits cannot exceed this value.

limits.memory       Across all pods in a non-terminal state, the sum of memory limits cannot exceed this value.

requests.cpu        Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.

requests.memory     Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.

hugepages-<size>    Across all pods in a non-terminal state, the number of huge page requests of the specified size cannot exceed this value.

cpu                 Same as requests.cpu

memory              Same as requests.memory

Resource Quota For Extended Resources


In addition to the resources mentioned above, in release 1.10, quota support for extended
resources is added.

As overcommit is not allowed for extended resources, it makes no sense to specify both
requests and limits for the same extended resource in a quota. So for extended
resources, only quota items with the prefix requests. are allowed for now.

Take the GPU resource as an example, if the resource name is nvidia.com/gpu , and you want
to limit the total number of GPUs requested in a namespace to 4, you can define a quota as
follows:

requests.nvidia.com/gpu: 4

See Viewing and Setting Quotas for more detailed information.

Storage Resource Quota


You can limit the total sum of storage resources that can be requested in a given namespace.


In addition, you can limit consumption of storage resources based on associated storage-
class.

Resource Name                                                                 Description

requests.storage                                                              Across all persistent volume claims, the sum of storage requests cannot exceed this value.

persistentvolumeclaims                                                        The total number of PersistentVolumeClaims that can exist in the namespace.

<storage-class-name>.storageclass.storage.k8s.io/requests.storage             Across all persistent volume claims associated with the <storage-class-name>, the sum of storage requests cannot exceed this value.

<storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims       Across all persistent volume claims associated with the <storage-class-name>, the total number of persistent volume claims that can exist in the namespace.

For example, if an operator wants to quota storage with gold storage class separate from
bronze storage class, the operator can define a quota as follows:

gold.storageclass.storage.k8s.io/requests.storage: 500Gi

bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
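
Placed in a complete object, the two quota items above might look like the following sketch. The object name is hypothetical, and the manifest assumes storage classes named gold and bronze exist in the cluster.

```yaml
# Sketch only: assumes storage classes named "gold" and "bronze" exist.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-by-class
spec:
  hard:
    gold.storageclass.storage.k8s.io/requests.storage: 500Gi
    bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
```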

In release 1.8, quota support for local ephemeral storage is added as an alpha feature:

Resource Name                 Description

requests.ephemeral-storage    Across all pods in the namespace, the sum of local ephemeral storage requests cannot exceed this value.

limits.ephemeral-storage      Across all pods in the namespace, the sum of local ephemeral storage limits cannot exceed this value.

ephemeral-storage             Same as requests.ephemeral-storage.

Note: When using a CRI container runtime, container logs will count against the
ephemeral storage quota. This can result in the unexpected eviction of pods that have
exhausted their storage quotas. Refer to Logging Architecture for details.

Object Count Quota


You can set quota for the total number of certain resources of all standard, namespaced
resource types using the following syntax:

count/<resource>.<group> for resources from non-core groups


count/<resource> for resources from the core group

Here is an example set of resources users may want to put under object count quota:

count/persistentvolumeclaims

count/services

count/secrets

count/configmaps

count/replicationcontrollers

count/deployments.apps

count/replicasets.apps

count/statefulsets.apps

count/jobs.batch

count/cronjobs.batch

The same syntax can be used for custom resources. For example, to create a quota on a
widgets custom resource in the example.com API group, use count/widgets.example.com .

When using count/* resource quota, an object is charged against the quota if it exists in
server storage. These types of quotas are useful to protect against exhaustion of storage
resources. For example, you may want to limit the number of Secrets in a server given their
large size. Too many Secrets in a cluster can actually prevent servers and controllers from
starting. You can set a quota for Jobs to protect against a poorly configured CronJob. CronJobs
that create too many Jobs in a namespace can lead to a denial of service.
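
A quota guarding against the situations described above could be sketched as follows, using the count/<resource> syntax for Secrets, Jobs, and the widgets custom resource. The object name and the numeric caps are arbitrary illustrations.

```yaml
# Sketch only: the name and the caps are arbitrary.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-count-guard
spec:
  hard:
    count/secrets: "50"                # core group: no group suffix
    count/jobs.batch: "20"             # non-core group: <resource>.<group>
    count/widgets.example.com: "10"    # custom resource from the example.com group
```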

It is also possible to do generic object count quota on a limited set of resources. The following
types are supported:

Resource Name             Description

configmaps                The total number of ConfigMaps that can exist in the namespace.

persistentvolumeclaims    The total number of PersistentVolumeClaims that can exist in the namespace.

pods                      The total number of Pods in a non-terminal state that can exist in the namespace. A pod is in a terminal state if .status.phase in (Failed, Succeeded) is true.

replicationcontrollers    The total number of ReplicationControllers that can exist in the namespace.

resourcequotas            The total number of ResourceQuotas that can exist in the namespace.

services                  The total number of Services that can exist in the namespace.

services.loadbalancers    The total number of Services of type LoadBalancer that can exist in the namespace.

services.nodeports        The total number of Services of type NodePort that can exist in the namespace.

secrets                   The total number of Secrets that can exist in the namespace.

For example, pods quota counts and enforces a maximum on the number of pods created
in a single namespace that are not terminal. You might want to set a pods quota on a
namespace to avoid the case where a user creates many small pods and exhausts the
cluster's supply of Pod IPs.

Quota Scopes
Each quota can have an associated set of scopes . A quota will only measure usage for a
resource if it matches the intersection of enumerated scopes.

When a scope is added to the quota, it limits the number of resources it supports to those
that pertain to the scope. Resources specified on the quota outside of the allowed set results
in a validation error.


Scope                        Description

Terminating                  Match pods where .spec.activeDeadlineSeconds >= 0

NotTerminating               Match pods where .spec.activeDeadlineSeconds is nil

BestEffort                   Match pods that have best effort quality of service.

NotBestEffort                Match pods that do not have best effort quality of service.

PriorityClass                Match pods that reference the specified priority class.

CrossNamespacePodAffinity    Match pods that have cross-namespace pod (anti)affinity terms.

The BestEffort scope restricts a quota to tracking the following resource:

pods

The Terminating, NotTerminating, NotBestEffort and PriorityClass scopes restrict a
quota to tracking the following resources:

pods

cpu

memory

requests.cpu

requests.memory

limits.cpu

limits.memory

Note that you cannot specify both the Terminating and the NotTerminating scopes in the
same quota, and you cannot specify both the BestEffort and NotBestEffort scopes in the
same quota either.

The scopeSelector supports the following values in the operator field:

In

NotIn

Exists

DoesNotExist

When using one of the following values as the scopeName when defining the scopeSelector ,
the operator must be Exists .

Terminating

NotTerminating

BestEffort

NotBestEffort

If the operator is In or NotIn , the values field must have at least one value. For example:

scopeSelector:
  matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values:
        - middle


If the operator is Exists or DoesNotExist , the values field must NOT be specified.
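
For comparison with the In example, here is a sketch of a quota that uses the Exists operator with the Terminating scope. The object name and the hard values are illustrative assumptions; the tracked resources are limited to ones this scope allows.

```yaml
# Sketch only: the name and hard values are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: terminating-pods
spec:
  hard:
    pods: "5"
    limits.cpu: "4"
  scopeSelector:
    matchExpressions:
    - scopeName: Terminating
      operator: Exists      # no values field is given with Exists
```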

Resource Quota Per PriorityClass


FEATURE STATE: Kubernetes v1.17 [stable]

Pods can be created at a specific priority. You can control a pod's consumption of system
resources based on a pod's priority, by using the scopeSelector field in the quota spec.

A quota is matched and consumed only if scopeSelector in the quota spec selects the pod.

When a quota is scoped for priority class using the scopeSelector field, the quota object is
restricted to tracking only the following resources:

pods

cpu

memory

ephemeral-storage

limits.cpu

limits.memory

limits.ephemeral-storage

requests.cpu

requests.memory

requests.ephemeral-storage

This example creates a quota object and matches it with pods at specific priorities. The
example works as follows:

Pods in the cluster have one of the three priority classes, "low", "medium", "high".
One quota object is created for each priority.

Save the following YAML to a file quota.yml .


apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: 200Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["high"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-medium
  spec:
    hard:
      cpu: "10"
      memory: 20Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["medium"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-low
  spec:
    hard:
      cpu: "5"
      memory: 10Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["low"]

Apply the YAML using kubectl create .

kubectl create -f ./quota.yml

resourcequota/pods-high created
resourcequota/pods-medium created
resourcequota/pods-low created

Verify that Used quota is 0 using kubectl describe quota .

kubectl describe quota


Name: pods-high
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 1k
memory 0 200Gi
pods 0 10

Name: pods-low
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 5
memory 0 10Gi
pods 0 10

Name: pods-medium
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 10
memory 0 20Gi
pods 0 10

Create a pod with priority "high". Save the following YAML to a file high-priority-pod.yml .

apiVersion: v1
kind: Pod
metadata:
  name: high-priority
spec:
  containers:
  - name: high-priority
    image: ubuntu
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10;done"]
    resources:
      requests:
        memory: "10Gi"
        cpu: "500m"
      limits:
        memory: "10Gi"
        cpu: "500m"
  priorityClassName: high

Apply it with kubectl create .

kubectl create -f ./high-priority-pod.yml

Verify that "Used" stats for "high" priority quota, pods-high , has changed and that the other
two quotas are unchanged.

kubectl describe quota


Name: pods-high
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 500m 1k
memory 10Gi 200Gi
pods 1 10

Name: pods-low
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 5
memory 0 10Gi
pods 0 10

Name: pods-medium
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 10
memory 0 20Gi
pods 0 10

Cross-namespace Pod Affinity Quota


FEATURE STATE: Kubernetes v1.24 [stable]

Operators can use CrossNamespacePodAffinity quota scope to limit which namespaces are
allowed to have pods with affinity terms that cross namespaces. Specifically, it controls which
pods are allowed to set namespaces or namespaceSelector fields in pod affinity terms.

Preventing users from using cross-namespace affinity terms might be desired since a pod
with anti-affinity constraints can block pods from all other namespaces from getting
scheduled in a failure domain.

Using this scope, operators can prevent certain namespaces (foo-ns in the example below)
from having pods that use cross-namespace pod affinity by creating a resource quota object
in that namespace with the CrossNamespacePodAffinity scope and a hard limit of 0:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: disable-cross-namespace-affinity
  namespace: foo-ns
spec:
  hard:
    pods: "0"
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists

If operators want to disallow using namespaces and namespaceSelector by default, and only
allow it for specific namespaces, they could configure CrossNamespacePodAffinity as a limited
resource by setting the kube-apiserver flag --admission-control-config-file to the path of the
following configuration file:

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ResourceQuotaConfiguration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: CrossNamespacePodAffinity
        operator: Exists

With the above configuration, pods can use namespaces and namespaceSelector in pod
affinity only if the namespace where they are created has a resource quota object with
the CrossNamespacePodAffinity scope and a hard limit greater than or equal to the number of
pods using those fields.

Requests compared to Limits


When allocating compute resources, each container may specify a request and a limit value
for either CPU or memory. The quota can be configured to quota either value.

If the quota has a value specified for requests.cpu or requests.memory , then it requires that
every incoming container makes an explicit request for those resources. If the quota has a
value specified for limits.cpu or limits.memory , then it requires that every incoming
container specifies an explicit limit for those resources.

Viewing and Setting Quotas


Kubectl supports creating, updating, and viewing quotas:

kubectl create namespace myspace

cat <<EOF > compute-resources.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    requests.nvidia.com/gpu: 4
EOF

kubectl create -f ./compute-resources.yaml --namespace=myspace


cat <<EOF > object-counts.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4"
    pods: "4"
    replicationcontrollers: "20"
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"
EOF

kubectl create -f ./object-counts.yaml --namespace=myspace

kubectl get quota --namespace=myspace

NAME AGE
compute-resources 30s
object-counts 32s

kubectl describe quota compute-resources --namespace=myspace

Name: compute-resources
Namespace: myspace
Resource Used Hard
-------- ---- ----
limits.cpu 0 2
limits.memory 0 2Gi
requests.cpu 0 1
requests.memory 0 1Gi
requests.nvidia.com/gpu 0 4

kubectl describe quota object-counts --namespace=myspace

Name: object-counts
Namespace: myspace
Resource Used Hard
-------- ---- ----
configmaps 0 10
persistentvolumeclaims 0 4
pods 0 4
replicationcontrollers 0 20
secrets 1 10
services 0 10
services.loadbalancers 0 2

Kubectl also supports object count quota for all standard namespaced resources using the
syntax count/<resource>.<group> :


kubectl create namespace myspace

kubectl create quota test --hard=count/deployments.apps=2,count/replicasets.apps=

kubectl create deployment nginx --image=nginx --namespace=myspace --replicas=2

kubectl describe quota --namespace=myspace

Name: test
Namespace: myspace
Resource Used Hard
-------- ---- ----
count/deployments.apps 1 2
count/pods 2 3
count/replicasets.apps 1 4
count/secrets 1 4

Quota and Cluster Capacity


ResourceQuotas are independent of the cluster capacity. They are expressed in absolute
units. So, if you add nodes to your cluster, this does not automatically give each namespace
the ability to consume more resources.

Sometimes more complex policies may be desired, such as:

Proportionally divide total cluster resources among several teams.


Allow each tenant to grow resource usage as needed, but have a generous limit to
prevent accidental resource exhaustion.
Detect demand from one namespace, add nodes, and increase quota.

Such policies could be implemented using ResourceQuotas as building blocks, by writing a
"controller" that watches the quota usage and adjusts the quota hard limits of each
namespace according to other signals.

Note that resource quota divides up aggregate cluster resources, but it creates no restrictions
around nodes: pods from several namespaces may run on the same node.

Limit Priority Class consumption by default


It may be desired that pods at a particular priority, for example "cluster-services", should be
allowed in a namespace if and only if a matching quota object exists.

With this mechanism, operators are able to restrict usage of certain high priority classes to a
limited number of namespaces and not every namespace will be able to consume these
priority classes by default.

To enforce this, use the kube-apiserver flag --admission-control-config-file to
pass the path to the following configuration file:


apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ResourceQuotaConfiguration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: PriorityClass
        operator: In
        values: ["cluster-services"]

Then, create a resource quota object in the kube-system namespace:

policy/priority-class-resourcequota.yaml

apiVersion: v1
kind: ResourceQuota
metadata:
  name: pods-cluster-services
spec:
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["cluster-services"]

kubectl apply -f https://k8s.io/examples/policy/priority-class-resourcequota.yaml -n kube-system

resourcequota/pods-cluster-services created

In this case, a pod creation will be allowed if:

1. the Pod's priorityClassName is not specified.
2. the Pod's priorityClassName is specified to a value other than cluster-services.
3. the Pod's priorityClassName is set to cluster-services, it is to be created in the
kube-system namespace, and it has passed the resource quota check.

A Pod creation request is rejected if its priorityClassName is set to cluster-services and it
is to be created in a namespace other than kube-system.
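
The three rules and the rejection case can be summarised as a small decision function. This is a sketch of the outcome for this specific example only, not the admission plugin's actual code; the function name and the quota_check_passed parameter are hypothetical.

```python
from typing import Optional

LIMITED_PRIORITY_CLASS = "cluster-services"  # value from limitedResources above
QUOTA_NAMESPACE = "kube-system"              # the only namespace with a matching quota


def pod_admitted(priority_class_name: Optional[str],
                 namespace: str,
                 quota_check_passed: bool = True) -> bool:
    """Return True if the Pod creation request would be allowed in this example."""
    if not priority_class_name:
        return True  # rule 1: priorityClassName not specified
    if priority_class_name != LIMITED_PRIORITY_CLASS:
        return True  # rule 2: some other priority class
    # rule 3: limited class, so a matching quota must exist and be satisfied
    return namespace == QUOTA_NAMESPACE and quota_check_passed


if __name__ == "__main__":
    print(pod_admitted(None, "default"))                    # True
    print(pod_admitted("cluster-services", "kube-system"))  # True
    print(pod_admitted("cluster-services", "default"))      # False
```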

What's next
See ResourceQuota design doc for more information.
See a detailed example for how to use resource quota.
Read Quota support for priority class design doc.
See LimitedResources


3 - Process ID Limits And Reservations


FEATURE STATE: Kubernetes v1.20 [stable]

Kubernetes allows you to limit the number of process IDs (PIDs) that a Pod can use. You can
also reserve a number of allocatable PIDs for each node for use by the operating system and
daemons (rather than by Pods).

Process IDs (PIDs) are a fundamental resource on nodes. It is trivial to hit the task limit
without hitting any other resource limits, which can then cause instability to a host machine.

Cluster administrators require mechanisms to ensure that Pods running in the cluster cannot
induce PID exhaustion that prevents host daemons (such as the kubelet or kube-proxy, and
potentially also the container runtime) from running. In addition, it is important to ensure that
PIDs are limited among Pods in order to ensure they have limited impact on other workloads
on the same node.

Note: On certain Linux installations, the operating system sets the PIDs limit to a low
default, such as 32768. Consider raising the value of /proc/sys/kernel/pid_max.

You can configure a kubelet to limit the number of PIDs a given Pod can consume. For
example, if your node's host OS is set to use a maximum of 262144 PIDs and you expect to host
fewer than 250 Pods, you can give each Pod a budget of 1000 PIDs to prevent using up that
node's overall number of available PIDs. If the admin wants to overcommit PIDs similar to
CPU or memory, they may do so as well with some additional risks. Either way, a single Pod
will not be able to bring the whole machine down. This kind of resource limiting helps to
prevent simple fork bombs from affecting operation of an entire cluster.
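
The arithmetic behind this example can be checked with a few lines. The helper below is a hypothetical sketch (its name and parameters are not part of any Kubernetes API); a negative result means the per-Pod budgets overcommit the node.

```python
def pid_headroom(pid_max: int, max_pods: int, pids_per_pod: int) -> int:
    """PIDs left for the host if every Pod uses its full budget.

    A negative value means the per-Pod budgets overcommit the node.
    """
    return pid_max - max_pods * pids_per_pod


if __name__ == "__main__":
    # Figures from the example above: 262144 PIDs, 250 Pods, 1000 PIDs per Pod.
    print(pid_headroom(pid_max=262144, max_pods=250, pids_per_pod=1000))  # 12144
    # With a pid_max of 32768, the same budgets would overcommit the node.
    print(pid_headroom(pid_max=32768, max_pods=250, pids_per_pod=1000))   # -217232
```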

Per-Pod PID limiting allows administrators to protect one Pod from another, but does not
ensure that all Pods scheduled onto that host are unable to impact the node overall. Per-Pod
limiting also does not protect the node agents themselves from PID exhaustion.

You can also reserve a number of PIDs for node overhead, separate from the allocation to
Pods. This is similar to how you can reserve CPU, memory, or other resources for use by the
operating system and other facilities outside of Pods and their containers.

PID limiting is an important sibling to compute resource requests and limits. However, you
specify it in a different way: rather than defining a Pod's resource limit in the .spec for a Pod,
you configure the limit as a setting on the kubelet. Pod-defined PID limits are not currently
supported.

Caution: This means that the limit that applies to a Pod may be different depending on
where the Pod is scheduled. To make things simple, it's easiest if all Nodes use the same
PID resource limits and reservations.

Node PID limits


Kubernetes allows you to reserve a number of process IDs for system use. To configure
the reservation, use the parameter pid=<number> in the --system-reserved and
--kube-reserved command line options to the kubelet. The value you specify declares that the
specified number of process IDs will be reserved for the system as a whole and for
Kubernetes system daemons respectively.

Pod PID limits


Kubernetes allows you to limit the number of processes running in a Pod. You specify this
limit at the node level, rather than configuring it as a resource limit for a particular Pod. Each
Node can have a different PID limit.

To configure the limit, you can specify the command line parameter --pod-max-pids to the
kubelet, or set PodPidsLimit in the kubelet configuration file.

PID based eviction


You can configure the kubelet to start terminating a Pod when it is misbehaving and consuming
an abnormal amount of resources. This feature is called eviction. You can Configure Out of
Resource Handling for various eviction signals. Use the pid.available eviction signal to
configure the threshold for the number of PIDs used by a Pod. You can set soft and hard eviction
policies. However, even with the hard eviction policy, if the number of PIDs is growing very fast,
the node can still get into an unstable state by hitting the node PIDs limit. The eviction signal
value is calculated periodically and does NOT enforce the limit.
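
The toy simulation below illustrates why a periodically sampled signal cannot act as a hard limit. It is not a model of the kubelet's real eviction logic: the function name, the one-second time step, the check interval, and all numbers are assumptions chosen only to show the race between PID growth and the next check.

```python
def first_event(growth_per_second: int,
                node_pid_limit: int,
                eviction_min_available: int,
                check_interval_seconds: int) -> str:
    """Return "evicted" or "limit_hit", whichever happens first in this toy model."""
    if growth_per_second <= 0:
        raise ValueError("growth_per_second must be positive")
    used = 0
    second = 0
    while True:
        second += 1
        used += growth_per_second
        # The hard limit applies at every step.
        if used >= node_pid_limit:
            return "limit_hit"
        # The eviction signal is only evaluated at each check interval.
        is_check_tick = second % check_interval_seconds == 0
        if is_check_tick and node_pid_limit - used < eviction_min_available:
            return "evicted"


if __name__ == "__main__":
    # Slow growth: a periodic check sees the low pid.available value in time.
    print(first_event(10, 1000, 200, 10))   # evicted
    # Fast growth: the limit is reached before the first check runs.
    print(first_event(300, 1000, 200, 10))  # limit_hit
```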

PID limiting, per Pod and per Node, sets the hard limit. Once the limit is hit, the workload will
start experiencing failures when trying to get a new PID. This may or may not lead to rescheduling
of a Pod, depending on how the workload reacts to these failures and how liveness and readiness
probes are configured for the Pod. However, if the limits are set correctly, you can guarantee
that other Pods' workloads and system processes will not run out of PIDs when one Pod is
misbehaving.

What's next
Refer to the PID Limiting enhancement document for more information.
For historical context, read Process ID Limiting for Stability Improvements in Kubernetes
1.14.
Read Managing Resources for Containers.
Learn how to Configure Out of Resource Handling.


4 - Node Resource Managers


In order to support latency-critical and high-throughput workloads, Kubernetes offers a suite
of Resource Managers. The managers aim to co-ordinate and optimise the alignment of a node's
resources for pods configured with specific requirements for CPUs, devices, and memory
(hugepages) resources.

The main manager, the Topology Manager, is a Kubelet component that co-ordinates the
overall resource management process through its policy.

The configuration of individual managers is elaborated in dedicated documents:

CPU Manager Policies
Device Manager
Memory Manager Policies
