This page shows you how to analyze the CPU performance of your Google Kubernetes Engine (GKE) cluster nodes using Performance Monitoring Unit (PMU) events.
This page is intended for cluster admins who have performance-sensitive workloads and want to examine the CPU execution of their workloads on their GKE nodes during development, debugging, benchmarking, and continuous monitoring.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Requirements and limitations
When enabling PMU events, be aware of the following requirements and limitations:
- Your cluster must be Standard mode.
- If your cluster has node auto-provisioning enabled, node pools created through auto-provisioning cannot have PMU events enabled. If you enable node auto-provisioning after enabling PMU events, existing node pools are not affected.
- Cluster node pools must use machine types from the C4 or C4A machine series.
Create a GKE cluster
Create a cluster with PMU events enabled for the default node pool:
gcloud container clusters create CLUSTER_NAME \
--location=COMPUTE_LOCATION \
--performance-monitoring-unit=PMU_LEVEL \
--machine-type=MACHINE_TYPE
Replace the following:
- CLUSTER_NAME: the name of the new cluster.
- COMPUTE_LOCATION: the Compute Engine location for the new cluster.
- PMU_LEVEL: the type of PMU events to collect. For more information, see How the PMU works in the Compute Engine documentation. Supported values are as follows:
  - architectural: enables architectural PMU events related to non-last-level cache (LLC) events.
  - standard: includes architectural events and enables core PMU events, including L2 cache events.
  - enhanced: includes standard events and enables any local events outside the CPU core and LLC PMU events. This option is only available with VMs that have a specific number of vCPUs. For more information, see Limitations in the Compute Engine documentation.
- MACHINE_TYPE: the Compute Engine machine type for your nodes. For a list of supported machine types, see Limitations in the Compute Engine documentation.
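If you script cluster creation, it can help to check the PMU_LEVEL value before calling gcloud. The following is a minimal sketch, not part of the gcloud CLI: the function name is_valid_pmu_level is hypothetical, and it only accepts the three values documented on this page. It does not check the vCPU restriction that applies to the enhanced level.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): succeeds only for the PMU
# levels documented on this page.
is_valid_pmu_level() {
  case "$1" in
    architectural|standard|enhanced) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use in a wrapper script, before calling gcloud:
#   is_valid_pmu_level "$PMU_LEVEL" || { echo "unsupported PMU level" >&2; exit 1; }
```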
You can also create a new node pool for an existing cluster using the gcloud container node-pools create command.
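As a sketch of the node pool alternative, the following helper prints a node pool creation command instead of running it, so that you can review the command first. It assumes that gcloud container node-pools create accepts the same --performance-monitoring-unit and --machine-type flags as the cluster command on this page; confirm this with gcloud container node-pools create --help. The function name and argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command instead of running it.
# Assumption: node-pools create accepts --performance-monitoring-unit,
# mirroring the clusters create command shown on this page.
print_node_pool_create() {
  # $1 node pool name, $2 cluster, $3 location, $4 PMU level, $5 machine type
  if [ "$#" -ne 5 ]; then
    echo "usage: print_node_pool_create POOL CLUSTER LOCATION PMU_LEVEL MACHINE_TYPE" >&2
    return 2
  fi
  printf 'gcloud container node-pools create %s --cluster=%s --location=%s --performance-monitoring-unit=%s --machine-type=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example:
#   print_node_pool_create pmu-pool CLUSTER_NAME COMPUTE_LOCATION standard c4-standard-4
```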
Connect to the cluster
Configure kubectl to communicate with the cluster:
gcloud container clusters get-credentials CLUSTER_NAME \
--location=COMPUTE_LOCATION
Verify the PMU is enabled
Verify your cluster nodes have PMU enabled by examining the kernel messages.
Get a list of nodes in the cluster:
kubectl get nodes
The output is similar to the following:
NAME                                STATUS   ROLES    AGE     VERSION
gke-c1-default-pool-44be3e13-prr1   Ready    <none>   5d23h   v1.27.13-gke.1070000
gke-c1-default-pool-7abc4a17-9dlg   Ready    <none>   2d21h   v1.27.13-gke.1070000
gke-c1-default-pool-ed969ef6-4gzp   Ready    <none>   5d      v1.27.13-gke.1070000
Record the name of one of the nodes.
Get the Compute Engine location of the node:
gcloud compute instances list --filter=NODE_NAME
Replace NODE_NAME with the name of a node from the previous step.
The output is similar to the following:
NAME                               ZONE           MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP    STATUS
gke-c1-default-pool-44be3e13-prr1  us-central1-c  c4-standard-4  true         10.128.0.67  34.170.44.164  RUNNING
Record the name of the Compute Engine ZONE. In this example, it's us-central1-c.
Use SSH to connect to the cluster node:
gcloud compute ssh NODE_NAME \
--zone=COMPUTE_ZONE
Replace COMPUTE_ZONE with the name of the Compute Engine zone from the previous step.
Examine the kernel messages:
sudo dmesg | grep -A10 -i "Performance"
The output is similar to the following:
[    0.307634] Performance Events: generic architected perfmon, full-width counters, Intel PMU driver.
# Several lines omitted
This output indicates the PMU driver is initialized.
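If you want to repeat this check across several nodes, the kernel message test can be wrapped in a small function. This is a minimal sketch under stated assumptions: the function name pmu_driver_initialized is hypothetical, and it only looks for a "Performance Events:" line like the one in the sample output. The exact message can differ by kernel version and CPU architecture, so adjust the pattern if your nodes report something else.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and succeeds if it
# contains a "Performance Events:" line, as in the sample output on this page.
pmu_driver_initialized() {
  grep -q -i 'Performance Events:'
}

# Usage on the node:
#   sudo dmesg | pmu_driver_initialized && echo "PMU driver initialized"
```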
What's next
- Learn how to choose a minimum CPU platform.