This cheatsheet covers monitoring Kubernetes clusters with metrics from kube-state-metrics and Prometheus. It lists the cluster state, node resource, job, service, container, and disk/network metrics that can be monitored, along with the corresponding metric names in kube-state-metrics, Prometheus, and Datadog. It also includes example kubectl commands for viewing the same information.

Cheatsheet: Kubernetes Monitoring

Cluster state metrics
DESCRIPTION | NAME IN KUBE-STATE-METRICS | COMMAND
Running pods | kube_pod_status_phase | kubectl get pods
Number of pods desired for a Deployment | kube_deployment_spec_replicas | kubectl get deployment <DEPLOYMENT>
Number of pods desired for a DaemonSet | kube_daemonset_status_desired_number_scheduled | kubectl get daemonset <DAEMONSET>
Number of pods currently running in a Deployment | kube_deployment_status_replicas | kubectl get deployment <DEPLOYMENT>
Number of pods currently running in a DaemonSet | kube_daemonset_status_current_number_scheduled | kubectl get daemonset <DAEMONSET>
Number of pods currently available in a Deployment | kube_deployment_status_replicas_available | kubectl get deployment <DEPLOYMENT>
Number of pods currently available in a DaemonSet | kube_daemonset_status_number_available | kubectl get daemonset <DAEMONSET>
Number of pods currently not available in a Deployment | kube_deployment_status_replicas_unavailable | kubectl get deployment <DEPLOYMENT>
Number of pods currently not available in a DaemonSet | kube_daemonset_status_number_unavailable | kubectl get daemonset <DAEMONSET>

Container metrics
DESCRIPTION | NAME IN KUBE-STATE-METRICS | COMMAND
Containers running on a pod | kube_pod_container_info | kubectl describe pod <POD_NAME>
Containers restarted on a pod | kube_pod_container_status_restarts_total | kubectl describe pod <POD_NAME>
Containers terminated on a pod | kube_pod_container_status_terminated | kubectl describe pod <POD_NAME>
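
The kube-state-metrics names listed above are served in the Prometheus text exposition format, so they can be read without kubectl once the endpoint has been scraped. The following is a minimal sketch, not a full parser: it assumes the scraped text is already in a string, uses a hypothetical SAMPLE payload, and assumes the `namespace` and `deployment` labels are present on the Deployment metrics. It computes the gap between desired and available replicas per Deployment.

```python
import re

# Simplified patterns for the Prometheus text format (no timestamp handling).
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# TYPE kube_deployment_spec_replicas gauge
kube_deployment_spec_replicas{namespace="default",deployment="web"} 3
# TYPE kube_deployment_status_replicas_available gauge
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 2
"""


def parse_samples(text):
    """Return a list of (metric_name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified pattern cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def deployment_gap(samples):
    """Desired minus available replicas, keyed by (namespace, deployment)."""
    desired, available = {}, {}
    for name, labels, value in samples:
        key = (labels.get("namespace"), labels.get("deployment"))
        if name == "kube_deployment_spec_replicas":
            desired[key] = value
        elif name == "kube_deployment_status_replicas_available":
            available[key] = value
    return {key: desired[key] - available.get(key, 0.0) for key in desired}


if __name__ == "__main__":
    print(deployment_gap(parse_samples(SAMPLE)))
```

A nonzero gap that persists usually deserves a closer look with kubectl describe on the affected Deployment.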
Node resource and status metrics
DESCRIPTION | NAME IN KUBE-STATE-METRICS | COMMAND
Current health status of a node (kubelet) | kube_node_status_condition | kubectl describe node <NODE_NAME>
Total memory requests (bytes) per node | kube_pod_container_resource_requests_memory_bytes | kubectl describe node <NODE_NAME>
Total memory in use on a node | N/A | kubectl describe node <NODE_NAME>
Total CPU requests (cores) per node | kube_pod_container_resource_requests_cpu_cores | kubectl describe node <NODE_NAME>
Total CPU in use on a node | N/A | kubectl describe node <NODE_NAME>
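
The per-node totals for requests are not a single series: kube_pod_container_resource_requests_memory_bytes reports one value per container, so a per-node figure has to be summed. A minimal sketch of that aggregation follows. It assumes each series carries a `node` label and that the scraped text is already available as a string; the SCRAPE payload and the pod and node names in it are hypothetical.

```python
import re
from collections import defaultdict

METRIC = "kube_pod_container_resource_requests_memory_bytes"
LINE_RE = re.compile(r'^' + METRIC + r'\{(.*)\}\s+(\S+)$')
# Match the "node" label only, not labels whose names merely end in "node".
NODE_RE = re.compile(r'(?:^|,)node="([^"]*)"')

# Hypothetical scrape output, for illustration only.
SCRAPE = """\
kube_pod_container_resource_requests_memory_bytes{namespace="default",pod="web-1",container="app",node="node-a"} 134217728
kube_pod_container_resource_requests_memory_bytes{namespace="default",pod="web-2",container="app",node="node-a"} 268435456
kube_pod_container_resource_requests_memory_bytes{namespace="default",pod="db-1",container="db",node="node-b"} 67108864
"""


def memory_requests_per_node(text):
    """Sum container memory requests (bytes) for each node label value."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a sample of the metric we care about
        node = NODE_RE.search(match.group(1))
        if node is not None:
            totals[node.group(1)] += float(match.group(2))
    return dict(totals)


if __name__ == "__main__":
    for node, total in sorted(memory_requests_per_node(SCRAPE).items()):
        print(node, int(total))
```

The same loop works for kube_pod_container_resource_requests_cpu_cores by changing METRIC.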
Job metrics
DESCRIPTION | NAME IN KUBE-STATE-METRICS | COMMAND
Number of successful jobs | kube_job_status_succeeded | kubectl get jobs --all-namespaces | grep "succeeded"
Number of failed jobs | kube_job_status_failed | kubectl get jobs --all-namespaces | grep "failed"
Number of active jobs | kube_job_status_active | kubectl get jobs --all-namespaces
Number of CronJobs | kube_cronjob_info | kubectl get cronjobs --all-namespaces
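
Matching text in the default kubectl table output depends on how that table happens to be formatted. As an alternative, here is a minimal sketch that counts jobs from structured output instead. It assumes the JSON was produced by `kubectl get jobs --all-namespaces -o json` and relies on the Job status fields `conditions` and `active`; the JOBS_JSON payload and job names are hypothetical.

```python
import json

# Hypothetical, trimmed-down output of `kubectl get jobs --all-namespaces -o json`.
JOBS_JSON = """
{
  "kind": "List",
  "items": [
    {"metadata": {"name": "backup"},
     "status": {"succeeded": 1,
                "conditions": [{"type": "Complete", "status": "True"}]}},
    {"metadata": {"name": "import"},
     "status": {"failed": 3,
                "conditions": [{"type": "Failed", "status": "True"}]}},
    {"metadata": {"name": "report"},
     "status": {"active": 1}}
  ]
}
"""


def count_jobs(jobs_json):
    """Classify each job as succeeded, failed, or active and count them."""
    counts = {"succeeded": 0, "failed": 0, "active": 0}
    for job in json.loads(jobs_json).get("items", []):
        status = job.get("status", {})
        conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
        if conditions.get("Complete") == "True":
            counts["succeeded"] += 1
        elif conditions.get("Failed") == "True":
            counts["failed"] += 1
        elif status.get("active", 0) > 0:
            counts["active"] += 1  # pods are still running for this job
    return counts


if __name__ == "__main__":
    print(count_jobs(JOBS_JSON))
```

In practice the JSON string would come from running the kubectl command and capturing its standard output.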
Service metrics
DESCRIPTION | NAME IN KUBE-STATE-METRICS | COMMAND
Service types per cluster | kube_service_info | kubectl get services --all-namespaces
Number of pods running by service | (none listed) | kubectl get pods --selector=<SERVICE_SELECTOR> -o=name
Disk I/O & Network metrics
DESCRIPTION | PROMETHEUS METRIC NAME | COMMAND
Network in per node | container_network_receive_bytes_total | kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Network out per node | container_network_transmit_bytes_total | kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Disk writes per node | container_fs_writes_bytes_total | kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Disk reads per node | container_fs_reads_bytes_total | kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
Network errors per node | container_network_receive_errors_total, container_network_transmit_errors_total | kubectl get --raw /api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor
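
The cAdvisor metrics in this table are cumulative counters, so a single reading says little; throughput comes from the difference between two readings. The sketch below shows that arithmetic under a few assumptions: the two payloads (FIRST and SECOND, both hypothetical) are outputs of the cadvisor endpoint taken a known number of seconds apart, samples may carry a trailing timestamp, and summing every series is an acceptable approximation of a per-node total.

```python
def counter_total(text, metric):
    """Sum the values of every series of `metric` in one scrape."""
    total = 0.0
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(metric + "{"):
            rest = line.rpartition("}")[2]  # text after the label set
        elif line.startswith(metric + " "):
            rest = line[len(metric):]  # series without labels
        else:
            continue
        # First token is the value; an optional timestamp may follow it.
        total += float(rest.split()[0])
    return total


def bytes_per_second(first, second, metric, interval_seconds):
    """Average rate between two scrapes taken `interval_seconds` apart."""
    delta = counter_total(second, metric) - counter_total(first, metric)
    # A negative delta means a counter reset; this sketch reports 0 then.
    return max(delta, 0.0) / interval_seconds


METRIC = "container_network_receive_bytes_total"

# Hypothetical scrapes taken 10 seconds apart, for illustration only.
FIRST = """\
container_network_receive_bytes_total{interface="eth0",pod="web-1"} 1000 1700000000000
container_network_receive_bytes_total{interface="eth0",pod="db-1"} 500 1700000000000
"""
SECOND = """\
container_network_receive_bytes_total{interface="eth0",pod="web-1"} 4000 1700000010000
container_network_receive_bytes_total{interface="eth0",pod="db-1"} 1500 1700000010000
"""

if __name__ == "__main__":
    print(bytes_per_second(FIRST, SECOND, METRIC, 10))
```

Prometheus performs the same calculation with its rate() function once the endpoint is scraped regularly, which is generally preferable to manual sampling.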
Kubernetes events
DESCRIPTION | COMMAND
List events | kubectl get events

Cheatsheet: Kubernetes Monitoring with Datadog
1. Cluster state metrics
METRIC DESCRIPTION | DATADOG STATUS CHECK/METRIC NAME
Running pods | kubernetes.pods.running
Number of pods desired for a Deployment | kubernetes_state.deployment.replicas_desired
Number of pods desired for a DaemonSet | kubernetes_state.daemonset.desired
Number of pods currently running in a Deployment | kubernetes_state.deployment.replicas
Number of pods currently running in a DaemonSet | kubernetes_state.daemonset.scheduled
Number of pods currently available in a Deployment | kubernetes_state.deployment.replicas_available
Number of pods currently available in a DaemonSet | kubernetes_state.daemonset.ready
Number of pods currently not available in a Deployment | kubernetes_state.deployment.replicas_unavailable
Number of pods currently not available in a DaemonSet | kubernetes_state.daemonset.desired - kubernetes_state.daemonset.ready
2. Node resource and status metrics
METRIC DESCRIPTION | DATADOG METRIC NAME
Current health status of a node (kubelet) | kubernetes.kubelet.check
Total memory requests (bytes) per node | kubernetes.memory.requests
Total memory in use on a node | kubernetes.memory.usage
Total CPU requests (cores) per node | kubernetes.cpu.requests
Total CPU in use on a node | kubernetes.cpu.usage.total
3. Job metrics
METRIC DESCRIPTION | DATADOG METRIC NAME
Number of successful jobs | kubernetes_state.job.succeeded
Number of failed jobs | kubernetes_state.job.failed
Number of active jobs | kubernetes_state.job.count
Number of CronJobs | kubernetes_state.job.count (filtered by the owner_kind:cronjob tag)
4. Service metrics
METRIC DESCRIPTION | DATADOG METRIC NAME
Service types per cluster | kubernetes_state.service.count
Number of pods running by service | kubernetes.pods.running
5. Container metrics
METRIC DESCRIPTION | DATADOG METRIC NAME
Containers running on a pod | kubernetes_state.container.running
Containers restarted on a pod | kubernetes_state.container.restarts
Containers terminated on a pod | kubernetes_state.container.terminated
6. Disk I/O & Network metrics
METRIC DESCRIPTION | DATADOG METRIC NAME
Network in per node | kubernetes.network.rx_bytes
Network out per node | kubernetes.network.tx_bytes
Disk writes per node | kubernetes.io.write_bytes
Disk reads per node | kubernetes.io.read_bytes
Network errors per node | kubernetes.network.rx_errors, kubernetes.network.tx_errors
7. Events
Kubernetes events appear in the Datadog Events Explorer and in event widgets on dashboards.
