Transcript For Presentation
Slide 1: Introduction
Hello all, today I will be presenting GKE, the Kubernetes Engine service managed by Google.
Not very long ago, the default way of deploying an application was on its own physical computer. Setting one up required physical space, power, cooling, and network connectivity, along with its own operating system, the software dependencies, and finally the application itself. When more processing power, redundancy, security, or scalability was needed, more machines had to be added. It was very common for each computer to have a single purpose, such as a database, web server, or content delivery.
This wasted resources and made deployment, maintenance, and scaling slow. It also wasn't very portable: applications were not cross-platform and were usually constrained to a specific environment.
Containers, which provide a higher-level abstraction for virtualization, can be used to tackle these issues.
Slide 2: Containers
Containers are a form of operating system virtualization. A single container might be used to run anything from a small microservice or a software process to a larger application. Inside the container are all the necessary provisions: executables, binaries, libraries, and configuration files.
Slide 3: Containers
Containers have a lot in common with virtual machines. Both of their architectures have similarities
as both are built on top of the OS kernel. However, there are some key differences between them.
VMs run in a hypervisor environment where each VM has its own guest OS inside it, along with the related binaries, libraries, and application files. This consumes a large amount of system resources and overhead, especially when multiple VMs are running on the same physical server, each with its own guest OS.
In contrast, each container shares the same host OS kernel, so its footprint is much smaller. This makes containers much quicker to boot than VMs.
Unlike server virtualization, containers do not carry full operating system images, which makes them lightweight and portable, with significantly less overhead.
Apart from removing the need for a hypervisor, this lightweight nature gives containers some unique advantages over preceding technologies, as represented in this diagram (courtesy of kubernetes.io). Features like isolation, portability, platform independence through immutable infrastructure, versioning capabilities, and ease of sharing make containers an extremely practical solution for deployment needs.
Due to their simplicity and their inherent isolation from each other, containers provide an essentially simple way to deploy and use cloud-based services. In practical scenarios, however, there may be many containers, and managing them can become taxing. Deploying, managing, connecting, and updating those containers would need separate teams, resulting in process inefficiencies. In such larger application deployments, multiple containers may be deployed as one or more container clusters, which can be managed by a container orchestration service such as Kubernetes.
Slide 4: Kubernetes
Kubernetes is a portable, extensible, open-source platform for managing containers. Originally developed by Google, it takes a whole group of computers and makes them work as a single unit. This lets multiple users run containerized applications and collaborate across different platforms.
Kubernetes determines where containers are placed, monitors the health of containers, and manages the lifecycle of the clusters. This collection of tasks is known as container orchestration.
It orchestrates the operation of multiple containers in harmony. It manages areas like the use of underlying infrastructure resources for containerized applications, such as the amount of compute, network, and storage needed, and it makes automating and scaling workloads easier for live production environments.
Scheduling: It places the containers automatically based on their resource requirements and other
constraints, while not sacrificing availability.
Life-cycle and health: It restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don't respond to user-defined health checks, and keeps them out of service until they are ready to serve.
Scaling: Applications can be scaled up and down, either when requested or automatically based on
usage conditions.
Discovery and Load Balancing: Kubernetes allocates IPv4 and IPv6 addresses to Pods and Services, gives a set of Pods a single DNS name, and can load balance across them.
Storage Volumes: It automatically mounts the storage system of choice, whether from a public cloud provider or a network storage system such as NFS.
Logging and Monitoring: Logs of health checks and system utilization can be exported for further usage analysis.
Identity and Authorization: Access is restricted following the principle of least privilege, as set by the user of the containers.
Apart from these, batch execution of workloads, automated rollouts and rollbacks, and extensibility by design are also part of the functions of Kubernetes.
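To make some of these features concrete, below is a minimal sketch of a Deployment manifest that exercises scheduling (resource requests), life-cycle and health (a liveness probe), and scaling (a replica count). The name, image, and values are hypothetical placeholders, not part of the presentation.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server              # hypothetical workload name
spec:
  replicas: 3                   # scaling: keep three identical Pods running
  selector:
    matchLabels:
      app: web-server
  template:
    metadata:
      labels:
        app: web-server
    spec:
      containers:
      - name: web
        image: nginx:1.25       # placeholder image
        ports:
        - containerPort: 80
        resources:
          requests:             # scheduling: place the Pod on a node with this much free capacity
            cpu: 250m
            memory: 128Mi
        livenessProbe:          # life-cycle and health: restart the container if this check fails
          httpGet:
            path: /
            port: 80
          periodSeconds: 10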
All of these functions and the granular approach to configuration can be intimidating to users who are not well versed in setting up the environment and controlling every aspect of the container management system.
This is where GKE comes into play.
If one wants to provision their own native Kubernetes cluster, that means tasks like choosing a cloud provider, provisioning machines, picking an OS and container runtime, configuring networking, setting up security, and starting services like DNS, logging, and monitoring. This mountain of tasks becomes even harder when a rolling update needs to be performed. GKE, as a managed service, takes care of all of this.
GKE also provides best-in-class developer tooling with consistent support for native and third-party tools, and offers container-native networking with a unique BeyondProd security approach.
Scalability is also a great point of consideration, as GKE has the unique advantage of being the most scalable Kubernetes service. Only GKE can run 15,000-node clusters, outscaling the competition by up to 15x.
In GKE and Kubernetes, the containers that applications or microservices are packaged into are collectively called workloads. Before deploying a workload on a GKE cluster, the user must first package the workload into a container.
Standard mode: This is the original mode of operation that GKE launched with. The user gets node configuration flexibility and full control over managing the clusters and node infrastructure. It is best suited for those looking for granular control over the GKE experience.
Autopilot mode: In this mode, the node and cluster infrastructure is managed entirely by Google, providing a more hands-off approach. However, it comes with some restrictions to keep in mind: configuration choices can sometimes be limited, and certain features can only be accessed via the CLI.
Availability:
With GKE, you can create a cluster tailored to the availability requirements of your workload and
your budget. The types of available clusters include: zonal (single-zone or multi-zonal) and regional.
Zonal clusters have a single control plane in a single zone. Depending on availability requirements,
choice of distribution of nodes for the zonal cluster can be made in a single zone or in multiple
zones.
A regional cluster has multiple replicas of the control plane, running in multiple zones within a given
region. Nodes in a regional cluster can run in multiple zones or a single zone depending on the
configured node locations. By default, GKE replicates each node pool across three zones of the
control plane's region.
Use regional clusters to run your production workloads, as they offer higher availability than zonal
clusters.
Network Routing:
When creating a GKE cluster, you can specify the network routing mode. In Google Kubernetes
Engine, clusters can be distinguished according to the way they route traffic from one Pod to
another Pod. A cluster that uses Alias IPs is called a VPC-native cluster. A cluster that uses Google
Cloud routes is called a routes-based cluster. For clusters created in the Standard mode, the default
network mode depends on the GKE version and the method you use to create the cluster.
Network Isolation:
By default, you can configure access from public networks to your cluster's workloads. Routes are
not created automatically. Private clusters assign internal addresses to Pods and nodes, and
workloads are completely isolated from public networks.
Kubernetes Features:
New features in Kubernetes are listed as Alpha, Beta, or Stable, depending upon their status in
development. In most cases, Kubernetes features that are listed as Beta or Stable are included with
GKE clusters. Kubernetes Alpha features are available in special GKE alpha clusters.
An alpha cluster has all Kubernetes alpha APIs (sometimes called feature gates) enabled. You can use
alpha clusters for early testing and validation of Kubernetes features. Alpha clusters are not
supported for production workloads, cannot be upgraded, and expire within 30 days.
Control Plane: The control plane runs the control plane processes, including the Kubernetes API
server, scheduler, and core resource controllers. The lifecycle of the control plane is managed by
GKE when you create or delete a cluster. This includes upgrades to the Kubernetes version running
on the control plane, which GKE performs automatically, or manually at your request if you prefer to
upgrade earlier than the automatic schedule.
The API server is a component of the Kubernetes control plane that exposes the Kubernetes API. The
API server is the front end for the Kubernetes control plane.
Scheduler:
Control plane component that watches for newly created Pods with no assigned node, and selects a
node for them to run on.
Factors taken into account for scheduling decisions include: individual and collective resource
requirements, hardware/software/policy constraints, data locality, deadlines to name a few.
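As a minimal sketch, a Pod specification can express such constraints directly; the name, image, and the node label disktype=ssd below are hypothetical examples of a hardware constraint and resource requirement.

apiVersion: v1
kind: Pod
metadata:
  name: fast-storage-task       # hypothetical name
spec:
  nodeSelector:
    disktype: ssd               # constraint: only schedule onto nodes carrying this label
  containers:
  - name: worker
    image: busybox:1.36         # placeholder image
    command: ["sh", "-c", "echo working && sleep 3600"]
    resources:
      requests:                 # resource requirements considered by the scheduler
        cpu: 500m
        memory: 256Mi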
Storage:
etcd is the consistent and highly available key-value store used as Kubernetes' backing store for all cluster data. GKE also runs periodic backups of the stored data.
Resource Controllers:
Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a
single binary and run in a single process.
The node controller is responsible for noticing and responding when nodes go down.
The job controller watches for Job objects that represent one-off tasks, then creates Pods to run those tasks to completion.
The endpoint controller populates the Endpoints object (that is, joins Services and Pods).
The service account and token controllers create default accounts and API access tokens for new namespaces.
Each node is managed from the control plane, which receives updates on each node's self-reported
status. You can exercise some manual control over node lifecycle, or you can have GKE perform
automatic repairs and automatic upgrades on your cluster's nodes.
A node runs the services necessary to support the containers that make up your cluster's workloads. These include the container runtime and the Kubernetes node agent (the kubelet), which communicates with the control plane and is responsible for starting and running containers scheduled on the node. Apart from that, a proxy agent is responsible for maintaining network rules on each node. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
In GKE, there are also a number of special containers that run as per-node agents to provide
functionality such as log collection and intra-cluster network connectivity.
Pods: Pods are single instances of a running process in a cluster. They contain at least one container and usually run a single container, but they can run multiple containers, for example when several tightly coupled processes need to run together.
Pods also use shared networking and storage across their containers. Each pod gets a unique IP address and a set of ports. Containers connect to a port; multiple containers in a pod connect to different ports and can talk to each other on localhost. This structure is designed to support running one instance of an application within the cluster as a pod.
A pod allows its containers to behave as if they are running on an isolated VM, sharing common storage, one IP address, and a set of ports. By doing this, you can deploy multiple instances of the same application, or different instances of different applications, on the same node or different nodes, without having to change their configuration.
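As an illustration of this sharing, here is a minimal sketch of a Pod running two containers that exchange data through a shared volume and can reach each other over localhost; all names and images are placeholders.

apiVersion: v1
kind: Pod
metadata:
  name: app-with-helper         # hypothetical name
spec:
  volumes:
  - name: shared-data
    emptyDir: {}                # storage shared by both containers for the life of the Pod
  containers:
  - name: web
    image: nginx:1.25           # placeholder; serves files written by the helper container
    ports:
    - containerPort: 80
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  - name: helper
    image: busybox:1.36         # placeholder; writes content into the shared volume
    command: ["sh", "-c", "echo hello from the helper > /data/index.html && sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data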
Pods treat the multiple containers as a single entity for management purposes. Pods are generally
created in groups. Replicas are copies of pods and constitute a group of pods that are managed as a
unit. Pods support autoscaling as well. Pods are considered ephemeral; that is, they are expected to
terminate. If a pod is unhealthy—for example, if it is stuck in a waiting mode or crashing
repeatedly—it is terminated. The mechanism that manages scaling and health monitoring is known
as a controller.
Deployments: Deployments are sets of identical pods. The members of the set may change as some
pods are terminated and others are started, but they are all running the same application. The pods
all run the same application because they are created using the same pod template.
A pod template is a definition of how to run a pod. The description of how to define the pod is called
a pod specification. Kubernetes uses this definition to keep a pod in the state specified in the
template.
There are times, however, when it is advantageous to have a single pod respond to all calls for a
client during a single session. StatefulSets are like deployments, but they assign unique identifiers to
pods. This enables Kubernetes to track which pod is used by which client and keep them together.
StatefulSets are used when an application needs a unique network identifier or stable persistent
storage.
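A minimal sketch of a StatefulSet is shown below; the name, image, and storage size are hypothetical, and serviceName refers to a headless Service that would give each replica a stable network identity (db-0, db-1, and so on).

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                      # hypothetical name
spec:
  serviceName: db               # assumed headless Service providing stable per-Pod DNS names
  replicas: 2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: nginx:1.25       # placeholder; a real stateful service such as a database would go here
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:         # each replica gets its own persistent volume that follows it across reschedules
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi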
Job: A job is an abstraction over a workload. Jobs create pods and run them until the application completes the workload. A job is defined in a configuration file that includes specifications of the container to use and what command to run.
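For example, a minimal Job sketch might look like the following; the name, image, and command are placeholders.

apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task            # hypothetical name
spec:
  completions: 1                # run the task to completion once
  backoffLimit: 3               # retry a failed Pod up to three times
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: busybox:1.36     # placeholder image
        command: ["sh", "-c", "echo processing batch && sleep 10"]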
Slides 14-16: Cluster Working
With Kubernetes, a control plane is introduced to make decisions about where applications are to be run. It continually monitors the state of each machine and makes adjustments to ensure that actions are performed according to the specifications. Kubernetes runs with a control plane and a number of nodes. A piece of software called the kubelet is installed on each node and reports the node's state back to the control plane.
The master node manages a cluster of Docker containers. It also runs a Kubernetes API server to
interact with the cluster and perform tasks, such as servicing API requests and scheduling containers.
Beyond this, a cluster can also include one or more nodes, each running a Docker runtime and a
kubelet agent that are needed to manage the Docker containers.
GKE users organize one or more containers into pods that represent logical groups of related
containers. If a pod of related containers becomes unavailable, access to those containers may be disrupted. Most applications in containers require redundancy to ensure that pods are always
available. GKE includes a replication controller that allows users to run the desired number of pod
duplicates at any given time.
Groups of pods can be organized into services, allowing applications to access other containers
without needing additional configurations.
It essentially runs like an autopilot: declare a state, and Kubernetes ensures that state is sustained.
A sysadmin is not needed for continuous monitoring; cloud operations are enabled, and monitoring and logging are taken care of.
Once deployments and build pipelines are enabled, the cluster is deployed automatically.
Filestore is a managed file storage service for applications that require a file-system interface and a
shared file system for data. It provides managed storage for dynamically provisioned persistent
volumes across a Google Kubernetes Engine cluster.
NFS-Client Provisioner defines a Kubernetes storage class that dynamically provisions volumes from
an NFS server. Filestore supplies the NFS mount and the storage backend used to create and host
persistent volumes.
Kubernetes pods request dynamically provisioned storage by specifying a dynamically allocated
storage class in their persistent volume claims. In this case, they specify the storage class defined by
the NFS-Client Provisioner.
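As a minimal sketch, such a persistent volume claim could look like the following; the claim name and the storage class name nfs-client are assumptions about how the NFS-Client Provisioner was configured.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data             # hypothetical claim name
spec:
  storageClassName: nfs-client  # assumed name of the storage class defined by the NFS-Client Provisioner
  accessModes:
  - ReadWriteMany               # many Pods can mount the Filestore-backed share at the same time
  resources:
    requests:
      storage: 100Gi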
To expose applications outside of a GKE cluster, GKE provides a built-in GKE Ingress controller and
GKE Service controller which deploy Google Cloud Load Balancers (GCLBs) on behalf of GKE users.
This is the same VM load balancing infrastructure, except its lifecycle is fully automated and
controlled by GKE. The GKE network controllers provide container-native Pod IP load balancing via
opinionated, higher-level interfaces that conform to the Ingress and Service API standards.
The following diagram illustrates how the GKE network controllers automate the creation of load
balancers: An infrastructure or app admin deploys a declarative manifest against their GKE cluster.
Ingress and Service controllers watch for GKE networking resources (such as Ingress or MultiClusterIngress objects) and deploy Cloud load balancers (plus IP addressing, firewall rules, etc.) based on the manifest. The controller continues managing the load balancer and backends based on environmental and traffic changes. Thus, GKE load balancing becomes dynamic and self-sustaining, with a simple and developer-oriented interface.
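A minimal sketch of such a declarative manifest is shown below: a Service in front of a set of Pods and an Ingress that the GKE Ingress controller would turn into a Google Cloud load balancer. The names, and the assumption that Pods labeled app: web-server exist, are illustrative.

apiVersion: v1
kind: Service
metadata:
  name: web-service             # hypothetical name
spec:
  type: NodePort                # a common backend type for GKE Ingress
  selector:
    app: web-server             # assumed label on the backend Pods
  ports:
  - port: 80
    targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress             # the GKE Ingress controller provisions a Cloud load balancer from this
spec:
  defaultBackend:
    service:
      name: web-service
      port:
        number: 80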
The horizontal Pod autoscaler changes the shape of the workload by automatically increasing or decreasing the number of Pods in response to the workload's CPU or memory consumption, or in response to a custom metric reported from within Kubernetes or external metrics from sources outside the cluster.
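A minimal sketch of a HorizontalPodAutoscaler follows; the target Deployment name and the thresholds are hypothetical.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-server-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-server            # assumed existing Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60  # add Pods when average CPU utilization rises above 60%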
It cannot be used for unscalable workloads like DaemonSets. DaemonSet manages groups of
replicated Pods. However, DaemonSets attempt to adhere to a one-Pod-per-node model, either
across the entire cluster or a subset of nodes. As you add nodes to a node pool, DaemonSets
automatically add Pods to the new nodes as needed.
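As an illustration, a minimal DaemonSet sketch is shown below; the name and image are placeholders standing in for a real per-node agent.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent              # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
      - name: agent
        image: busybox:1.36     # placeholder; a real log-collection or networking agent would go here
        command: ["sh", "-c", "tail -f /dev/null"]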
Vertical Pod autoscaling is an autoscaling tool that helps size Pods for the optimal CPU and memory resources they require. Instead of having to keep up-to-date CPU and memory requests and limits for the containers in the Pods, vertical Pod autoscaling can be configured to provide recommended values for CPU and memory requests and limits, or to automatically update those values. Setting the right resource requests and limits for the workloads is important for stability and cost efficiency. If the Pod resource size is smaller than the workload requires, the application will either be throttled or fail due to out-of-memory errors. Sizes that are too large result in waste and large bills.
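A minimal sketch of a VerticalPodAutoscaler object (available once vertical Pod autoscaling is enabled on the cluster) follows; the target Deployment name is hypothetical.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-server-vpa          # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-server            # assumed existing Deployment whose requests and limits are tuned
  updatePolicy:
    updateMode: "Auto"          # apply recommendations automatically; "Off" would only report them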
When demand is high, the cluster autoscaler adds nodes to the node pool. When the demand is low,
the cluster autoscaler scales back down to a minimum size that is designated. This can increase the
availability of the workloads when needed, while controlling costs. Cluster autoscaler can be
configured on a cluster.
There is no need to manually add or remove nodes or over-provision the node pools. Instead, just
specify a minimum and maximum size for the node pool, and the rest is automatic. If resources are
deleted or moved when autoscaling the cluster, the workloads might experience transient
disruption.
With Autopilot clusters, there is no need to worry about provisioning nodes or managing node pools
because node pools are automatically provisioned through node auto-provisioning and are
automatically scaled to meet the requirements of the workloads. It automatically manages a set of
node pools on the user’s behalf. Without node auto-provisioning, GKE starts new nodes only from
user-created node pools. With node auto-provisioning, new node pools are created and deleted
automatically.
GKE is secure by default, with automatic data encryption at rest and in transit.
The clusters can be accessed without any public IP address on the internet, and access can be controlled using Identity and Access Management (IAM) and role-based access models.
Additionally, with GKE, trusted networking also comes into play. Using a global VPC, you can connect to and isolate the clusters. Using global load balancing, you can deploy public services behind a global anycast IP. Use Cloud Armor for protection against Layer 7 and DDoS attacks. And use network policy to control communication between the cluster's Pods.
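As a minimal sketch, a network policy like the following restricts which Pods may talk to which; the labels app: backend and app: frontend are hypothetical.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: backend              # the policy applies to Pods carrying this label
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend         # only Pods with this label may connect
    ports:
    - protocol: TCP
      port: 8080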
GKE also comes with tools to verify, enforce, and improve infrastructure security. Binary Authorization ensures only properly signed containers are deployed to production. Vulnerability scanning of container images finds security vulnerabilities early in the CI/CD pipeline. And since the base images are managed, they are automatically patched and updated for security vulnerabilities.
Thank You.