
Infrastructure and Platform Management

The document provides practical guidance on infrastructure and platform management within the ITIL 4 framework, detailing its purpose, processes, and the roles of organizations and technology involved. It emphasizes the importance of a high-quality IT infrastructure for effective service delivery and outlines the lifecycle of infrastructure solutions, including design, delivery, and support. Additionally, it discusses the integration of agile methods, automation, and the need for reliability and maintainability in managing infrastructure and platform solutions.

Uploaded by

gmc7m9fxns
Copyright
© All Rights Reserved
February 28, 2020 46 min read


Infrastructure and platform management: ITIL 4 Practice Guide


This document provides practical guidance for the infrastructure and platform management practice.

Table of Contents

1. About this document

2. General information

3. Value Streams and processes

4. Organizations and people

5. Information and technology

6. Partners and suppliers

7. Important reminder

8. Acknowledgements

1. About this document

It is split into five main sections, covering:

general information about the practice

the practice’s processes and activities and their roles in the service value chain

the organizations and people involved in the practice

the information and technology supporting the practice

considerations for partners and suppliers for the practice.

1.1 ITIL® 4 qualification scheme

Selected content from this document is examinable as a part of the following syllabus:

ITIL Specialist High-velocity IT

Please refer to the respective syllabus document for details.

2. General information

2.1 Purpose and description

Key message

The purpose of the infrastructure and platform management practice is to oversee the infrastructure and platforms used by
an organization. When carried out properly, this practice enables the monitoring of technology solutions available to the
organization, including the technology of external service providers.

The infrastructure and platform management practice ensures that the organization has a high-quality IT infrastructure that efficiently meets
its current and anticipated needs. ‘IT infrastructure’ as a concept includes all of the hardware, software, networks, and facilities that are required
to develop, test, deliver, monitor, manage, and support IT services.

Depending on the architecture of the organization’s IT infrastructure, this practice may focus on the management of the physical environment,
physical equipment, or digital infrastructure solutions, which may be the organization’s own resources or services provided by suppliers and
partners. Often, IT infrastructure solutions are managed as services; in these cases, the infrastructure and platform management practice may
include dedicated teams acting as service providers for the application and/or product teams within the organization. If this approach is taken,
it is important to ensure that the infrastructure and platform teams are closely involved in the overall service delivery activities of the
organization and follow the ITIL guiding principles ‘focus on value’, ‘think and work holistically’, and ‘collaborate and promote visibility’. Members of these
teams should understand the wider context of the organization and its service value system (SVS).

This practice covers all stages of the infrastructure solutions lifecycle, from ideation and gathering requirements to delivery and support. At
every stage, it is used in conjunction with other practices (including the business analysis, architecture management, service design, availability
management, capacity and performance management, service continuity management, information security management, risk management
practices, and others). The importance of high-quality infrastructure and platforms for service delivery cannot be overstated; this practice is vital
for the success of the organization’s digital services and digitized business processes.

2.2 Terms and concepts

The infrastructure and platform management practice provides the structure to deliver and support stable and well-performing technology
services. Infrastructure and platform management is provided directly to the business, or supports the applications used by the business. With
a robust infrastructure and platform management practice, an organization can enable value creation with the confidence that the underlying
technology will meet the organization’s and service consumers’ needs.

Definition: IT infrastructure

All of the hardware, software, networks, and facilities that are required to develop, test, deliver, monitor, manage, and
support IT services.

A wide range of activities are used to run and manage IT infrastructure effectively. These activities range from understanding the organization’s
requirements and developing and planning infrastructure and platforms, to performing routine maintenance and overseeing infrastructure
performance.

Definition: Operation

The routine of running and managing an activity, product, service, or other configuration item.

A large portion of the operational activities can be automated. Automation tools can monitor the environment, identify changes, distribute
patches and other updates, provide asset inventory, and schedule and automate jobs.
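
As a small illustration of the kind of check an automation tool might run before distributing patches, the sketch below scans an asset inventory for hosts that are behind a required patch level. The `Host` record, the tuple-based version scheme, and the host names are assumptions made for this example, not part of the practice guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    patch_version: tuple  # hypothetical scheme, e.g. (1, 4, 2)


def hosts_needing_patch(inventory, required_version):
    """Return the sorted names of hosts whose patch version is below the required one."""
    return sorted(h.name for h in inventory if h.patch_version < required_version)


# Illustrative inventory snapshot (invented data).
inventory = [
    Host("web01", (1, 4, 2)),
    Host("db01", (1, 3, 9)),
    Host("app01", (1, 4, 0)),
]
print(hosts_needing_patch(inventory, (1, 4, 2)))  # ['app01', 'db01']
```

In practice the resulting list would feed a scheduled patch job rather than being printed.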

2.2.1 Business alignment for infrastructure and platform solutions

Infrastructure and platform solutions are designed to meet specific quality criteria defined to support the organization’s needs. The
infrastructure and platform management practice is closely connected with the architecture management practice, ensuring that all
infrastructure and platform solutions comply with the chosen architectural approach, model and standards, as well as sharing knowledge on
the innovation available and feeding possible infrastructure and platform solutions into architecture management. The infrastructure and
platform management practice must support application architecture, data architecture, and business architecture as well as align to the
organization’s overall vision and principles.

To ensure alignment to the overall architectural model, standardized infrastructure and platform solutions are defined to meet the
organization’s needs in a repeatable manner, to simplify delivery and ongoing management for these services. Standardized services allow for
efficient provisioning through repeatability and automation. Many infrastructure services are designed to enable speed and agility. Self-service
capabilities leverage automation capabilities to allow for users or other IT staff to request and receive items without manual steps behind the
scenes. This should account for the majority of the services that are utilized in the environment. Examples of standardized solutions may
include storage systems, application servers, database platforms, authentication systems, single-sign-on, and others.

In integration with the architecture management practice, the infrastructure and platform management practice should ensure the
development or outsourcing, and the cost-efficient operation, of flexible and compatible core infrastructure and platform solutions. These solutions should be
easily deployable and easily configured or merged to support the organization’s services or products, serving as building blocks for
complex solutions, products, and services. One example of implementing such an approach is the use of microservices, which are “small in
size, messaging-enabled, bounded by contexts, autonomously developed, independently deployable, decentralized and built and released with
automated processes”1.

When the standard solution does not align with the business, a tailored or customized solution must be developed. The selection of a non-
standard service delays the delivery of the solution and increases the ongoing effort and cost to the business for support for the solution. These
non-standard solutions should be deployed and managed as exceptions due to the additional overhead they require.

In cases where the technology is not currently in place, the solution must be designed together with the architecture management and
service design practices for conceptual and detailed design. During design, the infrastructure and platform management practice aligns the business
and technical requirements and determines the recommended infrastructure and platform solutions. As the solution is not
currently available within the environment, additional steps are taken to address the procurement, build, sourcing, and support of the solution.
The solution should be evaluated by infrastructure and enterprise architecture to determine whether it should be offered to additional consumers or
remain as an exception to the existing documented standards.

When an organization needs an infrastructure and platform solution, the infrastructure and platform management practice ensures that a solution
is designed and delivered to meet the organization’s requirements. There are several ways to provide a solution. For requests that can be
fulfilled using documented standard packages, the solution is provided through defined provisioning methods.
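
The distinction between standard packages and exceptions can be sketched as a simple routing step. The catalogue below, its package names, and the sizing fields are hypothetical; a real catalogue would come from the organization's documented standards.

```python
# Hypothetical catalogue of documented standard packages.
STANDARD_PACKAGES = {
    "small-vm": {"cpu": 2, "memory_gb": 4},
    "medium-vm": {"cpu": 4, "memory_gb": 16},
}


def route_request(requested):
    """Pick the smallest standard package that covers the request.

    Returns ('standard', package_name) when a package fits, otherwise
    ('exception', None) so the request goes to a tailored design.
    """
    fitting = [
        (spec["cpu"], spec["memory_gb"], name)
        for name, spec in STANDARD_PACKAGES.items()
        if spec["cpu"] >= requested["cpu"]
        and spec["memory_gb"] >= requested["memory_gb"]
    ]
    if fitting:
        return ("standard", min(fitting)[2])
    return ("exception", None)
```

A request for 2 CPUs and 8 GB of memory would be routed to `medium-vm`, while a request for 8 CPUs and 64 GB would be flagged as an exception.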

2.2.2 Infrastructure and platform solution technologies – physical and virtual

The technology used for infrastructure and platform solutions is either physical or virtual. Physical resources run directly from the hardware,
such as an operating system that is installed directly on the hardware. This operating system can either host the application or services directly
or virtual systems can run on top of it.

Virtualization allows for additional systems to be built on the physical system. Virtualization software runs on the hardware and allows for
additional operating systems that are isolated and separated to be installed, creating multiple servers residing on the physical server. All virtual
systems may run on the same or different hardware, but the virtual capabilities allow for dynamic workload placement and other capabilities; it
also allows for better utilization of the hardware. The logical structure that connects the virtual servers and the physical servers should be
accounted for in the configuration management database (CMDB). Additional capabilities that allow for dynamic moving of workloads should
also be represented in the data model.
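
One way to picture the relationship between virtual and physical servers is a small placement model that can answer which virtual servers depend on a given host, even after workloads move. This is a minimal sketch with invented class and method names; it is not the schema of any particular CMDB product.

```python
from collections import defaultdict


class PlacementModel:
    """Tracks which virtual servers currently run on which physical host."""

    def __init__(self):
        self._host_of = {}               # vm name -> host name
        self._vms_on = defaultdict(set)  # host name -> set of vm names

    def place(self, vm, host):
        """Record that a VM runs on a host, moving it if it was placed elsewhere."""
        previous = self._host_of.get(vm)
        if previous is not None:
            self._vms_on[previous].discard(vm)
        self._host_of[vm] = host
        self._vms_on[host].add(vm)

    def impacted_by(self, host):
        """Return the sorted VMs that would be affected if the host failed."""
        return sorted(self._vms_on.get(host, set()))


model = PlacementModel()
model.place("vm-a", "host-1")
model.place("vm-b", "host-1")
model.place("vm-a", "host-2")  # dynamic workload move
print(model.impacted_by("host-1"))  # ['vm-b']
```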

Infrastructure technologies, such as software-defined networking, virtual servers, and object storage, simplify the provisioning of infrastructure
services. This allows the organization to deliver services quickly through automation.

Virtualization has greatly improved provisioning, performance, capacity, and availability for solutions. A further development in this
direction is the use of infrastructure-as-code (IaC) solutions. IaC is a way of managing and provisioning IT infrastructure and platforms by
using machine-readable definition files rather than physically configuring hardware components. IaC solutions significantly speed up design
(including hypothesis testing), development, building, provisioning and changing the infrastructure and platform solutions. Such solutions also
usually make the infrastructure more reliable and fault resistant.
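
The core idea behind IaC, comparing a machine-readable definition of the desired state with what actually exists, can be sketched in a few lines. The resource names and attributes below are invented for illustration, and real IaC tools add dependency ordering, state storage, and provider APIs on top of this.

```python
def plan(desired, actual):
    """Compare desired and actual resource definitions and list the needed actions."""
    actions = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    return actions


# Hypothetical definitions: what the definition file declares vs. what is running.
desired = {"web": {"size": "medium", "count": 3}, "cache": {"size": "small", "count": 1}}
actual = {"web": {"size": "medium", "count": 2}, "legacy-db": {"size": "large", "count": 1}}
print(plan(desired, actual))
# [('create', 'cache'), ('delete', 'legacy-db'), ('update', 'web')]
```

Because the plan is computed from files rather than from manual steps, the same change can be reviewed, tested, and repeated across environments.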

2.2.3 Infrastructure and platform solution delivery models

Advancements in technical capabilities have changed how services are delivered. Service providers have embraced the ability to scale services.
As organizations move to service offerings that allow for flexibility in how the service is provided, they can choose the
model that best aligns with their strategic goals. Often, the preferred model is a combination of internally and externally provided
services. This complexity drives the need for a comprehensive management approach that ensures end-to-end delivery meets customer
expectations.

There are many models for providing infrastructure and platform solutions, ranging from in-house dedicated data centres to fully outsourced
cloud environments. Many organizations continue to provide and support infrastructure residing in their internal data centres. They can also
use solutions external to their organization. Cloud solutions provide offerings that allow systems and applications to run in internal and external
data centres. Most enterprises use public cloud providers for at least part of their infrastructure. Cloud providers offer many solutions based on
the expected needs of the business. An application may be accessed through the cloud, leaving infrastructure management activities beyond
connecting to the cloud to be done externally by the application provider. Cloud offerings can include platforms for application development
and infrastructure specific services like storage or backup as a service.

There is usually a mix of public and private cloud services in any organization. Both cloud services and outsourcing can provide infrastructure
and platform services. Cloud services provide technical capabilities whereas outsourcing performs IT functions in a similar manner to internal
teams. The contract defines the outsourcing scope and service levels. Instead of managing technology directly, internal IT teams focus on
managing the contractual obligations and interactions with internal teams in an outsourced environment.

2.2.4 Agile methods in infrastructure and platform management

Recent technology innovations have enabled changes to how infrastructure is delivered and supported. Development practices have been
adopted by teams providing infrastructure and platform solutions. Engineering and support functions rely heavily on coding and other
development capabilities for automation.

Along with a focus on development from a system perspective, many organizations have also moved into models that blend development and
infrastructure capabilities on one team to provide coverage throughout the lifecycle. DevOps and site reliability engineering (SRE) are examples
of these models.

Specifically, DevOps brings a robust landscape of tools to automate the tracking, building, and deploying of small, agile-based releases. Agile is
a development framework, but DevOps includes the infrastructure components and operational activities. DevOps focuses on the
opportunities across all technology components and drives automation to enable rapid system updates. Infrastructure can now fully benefit
from structured development practices.

By accounting for the end-to-end development and management of the solution, this approach allows for operational improvements to be
included in the development releases. Machine learning and AIOps leverage data collected on solutions to automate, address issues, or
manage requests without development. Through operational visibility and development capabilities, the overall system is managed in a more
comprehensive and consistent manner through automation.

When using DevOps for infrastructure and platform management, special attention must be paid to obsolete systems and monolithic
solutions that require manual operation and, therefore, slow down all management processes and changes. There should be a clear roadmap
for decommissioning and replacing such solutions or replacing the manual activities with automation. One way to do this is to have an SRE
team run operations.

SRE is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems with the goal
of creating ultra-scalable and highly reliable solutions. SRE tries to bridge the gap between development and operations
and to reconcile their opposing objectives: developing and releasing solutions fast, and having a stable solution to support. SRE
teams usually include software developers who must support the solutions they develop, which motivates them to automate most of the
manual support and management tasks (in the course of reducing toil: manual, repetitive, automatable, non-creative work). As a result,
infrastructure and platform solutions become more manageable, require less manual work, and gain agility in changes, delivery, and support.
Probably one of the most important gains of SRE operations is that infrastructure scale-out does not lead to a corresponding linear growth in team
size, as often happens in classical operations.

Key message

The practice is involved throughout the lifecycle of products and services. Figure 2.1, from “The Site Reliability Workbook” by
Google, illustrates how SRE teams are involved during the lifecycle. With minor variations, this illustration is applicable to
other approaches to infrastructure and platform management.

Figure 2.1 Infrastructure and platform management during product and service lifecycle

2.2.5 Reliability and maintainability

Once the solution is in production, the primary focus of the team supporting and operating the infrastructure is to ensure high-quality delivery
through managing the ongoing performance and functionality of the infrastructure and platform solutions. This team may be a dedicated
infrastructure team or a dedicated product team. The products and services rely on the solution’s availability and performance to support
them. In production, the organization has high expectations for uptime and very little tolerance for any impact of any type on service or
product. To meet these demands, the solutions must be reliable and maintainable. Beyond infrastructure and platform configurations to
support reliability and maintainability, the infrastructure and platform management practice must ensure the solution is supportable.
Supportability addresses the organization’s requirements to ensure that the solution is functional and ready to support products and services.

Reliability is designed with the system. Reliability requirements are aligned to the uptime and performance requirements, defined by the
capacity and performance management practice. These requirements ensure the solutions are built to support the organization’s
requirements. For example, this may include high availability or redundant network connectivity.
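
To see why redundancy helps meet uptime requirements, a rough calculation can be useful. The sketch below uses availability as a simple proxy for the uptime requirement and assumes that the redundant components fail independently and that any single one is enough to provide service; both are simplifying assumptions, and the 99% figures are invented.

```python
import math


def parallel_availability(component_availabilities):
    """Combined availability of redundant components where any one suffices.

    Assumes independent failures: the service is down only when every
    component is down at the same time.
    """
    unavailability = math.prod(1.0 - a for a in component_availabilities)
    return 1.0 - unavailability


# Two hypothetical network links, each available 99% of the time.
print(round(parallel_availability([0.99, 0.99]), 6))  # 0.9999
```

Under these assumptions, a second 99% link raises the combined figure to about 99.99%; correlated failures (for example, a shared power feed) would reduce the benefit.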

Definition: Reliability

The ability of a product, service, or other configuration item to perform its intended function for a specified period of time or
number of cycles.

Maintainability of a system should be addressed during the design of a new system and tested before being transitioned to production. There
could be rules agreed for an infrastructure and platform solution, ensuring maintainability based on the organization’s requirements and
industry practices. One example is the existence of a monitoring tool to identify issues, or general monitorability of the solution planned at the
design phase. Other examples could be the existence of tools used to configure, deploy, and provision the solutions. These rules could also be
used to manage partners and suppliers responsible for infrastructure and platform service components.

Definition: Maintainability

The ease with which a service or other entity can be repaired or modified.

If maintainability is not addressed during the initial design and as part of daily operations, higher support costs, extended outages, and
negative impacts to performance will affect the production environment. Maintainability is improved through appropriate monitoring
configurations, automation, and utilization of standards.

Another aspect of maintainability involves ensuring the solution is recoverable to meet availability targets. This aspect is tightly aligned with
the service continuity management practice. Maintainability ensures that infrastructure and platform solutions can be recovered to meet availability
targets. This may mean, for example, ensuring that the hardware contract supports on-site replacements within a set timeframe. It may also
cover having on-site resources performing the repair. When committing to availability targets, the parts and resources needed to restore
service need to be factored in and be in place throughout the solution lifecycle. The infrastructure and platform management practice requires
that the right pieces are in place to diagnose, repair, and recover in order to restore services on time.
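
A quick arithmetic check shows how repair arrangements interact with availability targets. The functions below are a sketch; the target, the 30-day period, and the four-hour replacement window are assumed values chosen for illustration.

```python
def allowed_downtime_hours(availability_target, period_hours):
    """Downtime budget implied by an availability target over a period."""
    return period_hours * (1.0 - availability_target)


def repair_fits_budget(availability_target, period_hours, repair_hours, expected_failures=1):
    """Check whether the expected repair time fits within the downtime budget."""
    budget = allowed_downtime_hours(availability_target, period_hours)
    return repair_hours * expected_failures <= budget


# A 99.9% target over a 30-day month (720 hours) allows roughly 0.72 hours of downtime,
# so a hardware contract with a 4-hour on-site replacement window cannot meet it alone.
print(repair_fits_budget(0.999, 720, repair_hours=4))  # False
print(repair_fits_budget(0.99, 720, repair_hours=4))   # True
```

A result like the first one suggests that either the target, the contract, or the design (for example, adding redundancy or on-site spares) needs to change before the commitment is made.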

Automation is also used to improve a system’s maintainability. Repeatable actions are excellent candidates for automation. Software
development and management tools and techniques, such as Agile and DevOps, can be applied to infrastructure and platform management
to drive frequent updates to systems and configurations. By addressing opportunities as they are identified and implementing solutions in
small releases, benefits are realized quickly.

2.3. Scope

The scope of the infrastructure and platform management practice includes:

activities used to plan, design, develop, deliver, maintain, and support infrastructure and platform technology

infrastructure and platform technology including:

    hardware (servers, desktops, routers, switches, storage, cabling, and data centre)

    software (operating systems, desktop applications, and middleware)

    management tools (monitoring, management tools, deployment, inventory)

    web hosting

    cloud infrastructure and platform

    identification systems and single sign-on (SSO).

infrastructure and platform management skills, including:

    technical architecture and engineering

    technical administration and operations

    execution and enforcement of policies and procedures connected to infrastructure and platform management (planning, decision
    making, oversight).

integration with other practices

skills required for infrastructure and platform management, including infrastructure architecture, engineering, and administration.

There are many activities and areas of responsibility that are not included in the infrastructure and platform management practice, although
they are still closely related to infrastructure and platform management. These are listed in Table 2.1, along with references to the practices in
which they can be found. It is important to remember that ITIL practices combine value chain activities through value streams to deliver value.

Table 2.1 Activities related to the infrastructure and platform management practice described in
other practice guides

Activity | Practice guide

Restoration of infrastructure and platform technology and services including major incidents | Incident management

Defining permanent resolution or workarounds for infrastructure and platform known errors | Problem management

Management of changes to the infrastructure and platforms | Change enablement

Tracking and management of infrastructure and platform assets | IT asset management

Tracking of infrastructure and platform configurations in relationship to other configuration items (CIs) | Service configuration management

Monitoring, event management, and log management for infrastructure and platform technologies | Monitoring and event management

Infrastructure and platform design | Service design

Defining requirements for infrastructure and platform solutions | Business analysis

Definition of standards and road map for infrastructure and platforms | Architecture management

2.4 Practice success factors

Definition: Practice success factor

A complex functional component of a practice that is required for the practice to fulfil its purpose.

A practice success factor (PSF) is more than a task or activity; it includes components from all four dimensions of service management. The
nature of the activities and resources of PSFs within a practice may differ, but together they ensure that the practice is effective.

The infrastructure and platform management practice includes the following PSFs:

establishing an infrastructure and platform management approach to meet evolving organizational needs

ensuring that the infrastructure and platform solutions meet the organization’s current and anticipated needs.

2.4.1 Establishing an infrastructure and platform management approach to meet evolving
organizational needs

The needs of organizations and their customers are continually changing, which drives continual transformation in the technology industry. The
changes may result from industry trends, changes within organizations, business process innovation, or changes to business volumes. The
infrastructure and platform management practice ensures that infrastructure and platform solutions are flexible and scalable so that they are
aligned with demand. Organizational infrastructure and platforms meet this demand through optimized solutions that are designed for and
used by all parts of the organization.

To properly design these solutions, teams delivering infrastructure and platform change must be aware of new technologies and techniques.
The evolution of technology can be seen in examples like email, virtual server farms, storage arrays, single sign-on, and cloud platforms. When
solutions are identified based on requirements, requests are promptly fulfilled. With virtual server technology that is used both internally and
for cloud offerings, the turnaround time for requests can be reduced to minutes. Technological progress, such as virtualization, containers,
continuous integration/continuous delivery (CI/CD), and IaC, significantly impacts the rate of change and innovation.

Organizations that deliver and support infrastructure and platform solutions have evolved through models, such as DevOps and SRE; they
eliminate the use of traditional waterfall techniques in favour of end-to-end development and management within one team. Crucially, the
organization’s structure and technology components must align with its overall strategic direction in order to ensure the consistent delivery
and support of infrastructure and platform solutions.

It is important to plan how infrastructure and platform teams will identify, design, and introduce innovation into the environment at the
solution and strategic levels. Depending on the current needs, infrastructure and platform management might need initial research and
testing so that, when the need is presented, there is a clear plan of action. If the need is pressing, the technology may be selected, purchased,
designed, and configured before any official requests are received.

The infrastructure and platform management practice should ensure that the infrastructure and platforms are built to promote
experimentation, quick technology adoption, the ability to test theories and hypotheses, change the infrastructure and platform iteratively with
feedback, fail fast, and learn from experience and errors in a safe environment. Each organization should define its innovation and risk appetite
and consider its financial constraints for innovation in the infrastructure and platforms areas.

2.4.2 Ensuring that the infrastructure and platform solutions meet the organization’s current
and anticipated needs

The main focus of the infrastructure and platform management practice should be ensuring that stakeholders receive value throughout the
infrastructure and platform solution lifecycle. Stakeholders must be engaged from the initiation of a request or project until the solution’s
retirement. Understanding stakeholder expectations, from design to the ongoing management and support of the solutions, is an essential
aspect of delivering infrastructure and platform solutions. This ongoing relationship will drive improvement opportunities and ensure value
continues to be co-created as the solution evolves.

When the organization needs a technical solution, requirements are defined in order to ensure that the solution meets the organization’s
needs. The solution design should include technical and business requirements. The infrastructure and platform management practice is
involved in analysing requirements to create a high-level design (in conjunction with the architecture management, business analysis, and
service design practices, and others).

The requirements for infrastructure and platform solutions may come from different sources, including:

architectural standards and guidelines

compliance requirements, if the organization is subject to legislation

direct requirements from customers, if a solution is a service or service component that will be directly released to customers.

Where possible, the infrastructure and platform management practice ensures that standards can be defined and utilized in order to simplify
the management of infrastructure and platform solutions. The enforcement of these standards ensures the reliability and maintainability of
solutions. Standards enable efficient and effective operations and may include the hardware and software versions, configuration settings,
management and monitoring tools, and support structures. Through standards, solutions are easier to operate, monitor, and upgrade.

Designs should be assessed against current and planned standards and validated against the current and anticipated levels of availability,
performance, capacity, information security, and so on. Management practices supporting these should have active involvement.

Standard infrastructure solution packages should be utilized wherever possible. Any portion of the solution that is not standard increases cost,
delays delivery, and requires customized support throughout the life of the solution. Exceptions to standards may result in extended downtime
or other impacts to the customer. They may also delay teams responsible for performing other activities for other infrastructure and platform
solutions.

If there are multiple exceptions to a standard, a review should be conducted to ensure that the standard still meets the organization’s needs. If
it does not, a new standard should be designed and its implementation should be planned. Retiring the standard may include planning the
removal of current systems that were installed as part of the retired offering in order to reduce technical debt and the potential risk to the
environment. The development and maintenance of the standards and standard packages are also within the scope of the infrastructure and
platform management practice.
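
The review trigger described above can be sketched as a simple count of exceptions per standard. The record layout, the standard names, and the threshold of three are assumptions for illustration; each organization would set its own threshold.

```python
from collections import Counter


def standards_needing_review(exceptions, threshold=3):
    """Return the sorted standards with at least `threshold` recorded exceptions."""
    counts = Counter(e["standard"] for e in exceptions)
    return sorted(s for s, n in counts.items() if n >= threshold)


# Hypothetical exception register entries.
exceptions = [
    {"standard": "db-platform-v2", "requester": "team-a"},
    {"standard": "db-platform-v2", "requester": "team-b"},
    {"standard": "db-platform-v2", "requester": "team-c"},
    {"standard": "app-server-v1", "requester": "team-a"},
]
print(standards_needing_review(exceptions))  # ['db-platform-v2']
```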

Part of the practice’s focus is to manage risk to the organization throughout the infrastructure and platform. As part of this effort, input from
practices such as information security, service continuity, and risk management is taken to ensure that risks are managed throughout the
lifecycle of the solution. This ongoing management includes, for example, ensuring that network devices are configured based on defined
security policies, controls are tested periodically, and risks are identified and effectively managed. Requirements are handled on an ongoing
basis to prevent adverse impacts, such as extended service downtime or a security breach of confidential information.
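
As one illustration of checking that devices are configured according to defined security policies, the sketch below compares a device's settings with a policy. The setting names and values are hypothetical; a real check would read them from the organization's configuration management tooling.

```python
def policy_violations(device_config, policy):
    """Compare one device's settings with the required policy values.

    Returns a dict of setting -> (expected, actual) for every mismatch;
    a setting missing from the device is reported with actual=None.
    """
    return {
        key: (expected, device_config.get(key))
        for key, expected in policy.items()
        if device_config.get(key) != expected
    }

# Hypothetical policy and device settings, for illustration only.
policy = {"ssh_version": 2, "telnet_enabled": False, "default_snmp_community": False}
device = {"ssh_version": 2, "telnet_enabled": True}
violations = policy_violations(device, policy)
```

Running a check like this on a schedule is one way to test controls periodically rather than only at installation.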

The overall management of infrastructure and platform solutions often includes internal and third-party solutions and components.
Understanding the overall structure of these solutions and ensuring that the overall level of service provided meets customer expectations is
critical.

Management needs visibility to validate that solutions are performing at acceptable levels and to highlight opportunities. These may include
addressing any issues and identifying areas that could be improved. The infrastructure and platform management practice should provide
stakeholders with visibility of performance and improvement plans. This practice interacts with other practices to ensure that any issues or
requests on solutions are resolved promptly. For this reason, the practice participates in agreeing targets for incident response, restoration, and
request fulfilment times to align with customer expectations. This practice may include managing and reporting on the ability of solutions to
meet targets. This visibility also provides an opportunity to improve performance in this area through automation or process refinement.

This practice also contributes to ensuring that the agreed-upon levels of service are met. The scope of this effort includes any internal or external
components used in the solution. Third-party services must align with customer expectations, or the expectations must be reset. External
providers must meet the service levels in their contracts. By managing performance levels across internal and external services, the practice is
able to report performance and other outcomes to the business.
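
A minimal sketch of reporting service level attainment across internal and external components might look like the following. The `Component` fields, the provider names, and the use of availability as the only measure are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    provider: str        # "internal" or the name of an external provider
    target_pct: float    # agreed availability target for the period
    measured_pct: float  # measured availability for the same period

def service_level_report(components):
    """Return (met, missed) lists of component names, covering all providers."""
    met = [c.name for c in components if c.measured_pct >= c.target_pct]
    missed = [c.name for c in components if c.measured_pct < c.target_pct]
    return met, missed

# Illustrative data only.
components = [
    Component("core-network", "internal", 99.9, 99.95),
    Component("cloud-storage", "external-provider-a", 99.5, 99.2),
]
met, missed = service_level_report(components)
```

Treating internal and external components in one report keeps the view aligned with what the customer experiences, regardless of who provides each part.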

The infrastructure and platform management practice ensures that solutions within its scope effectively contribute to overall financial targets.
Infrastructure and platform solutions should be benchmarked against cloud offerings and external provider solutions. From a technology
perspective, automation, consolidation, and standardization simplify the infrastructure and platforms and release resources, which can then be
used to drive value. The current and potential partnerships with external providers can also be evaluated and existing agreements optimized.

2.5 Key metrics

The effectiveness and performance of the ITIL practices should be assessed within the context of the value streams to which each practice
contributes. As with the performance of any tool, the practice’s performance can only be assessed within the context of its application.
However, tools can differ greatly in design and quality, and these differences define a tool’s potential or capability to be effective when used
according to its purpose. Further guidance on metrics, key performance indicators (KPIs), and other techniques that can help with this can be
found in the measurement and reporting practice guide.

Key metrics for infrastructure and platform management are mapped to its PSFs. They can be used as KPIs in the context of value streams to
assess the contribution of the practice to the effectiveness and efficiency of those value streams. Some examples of key metrics are given in
Table 2.3.

Table 2.3 Examples of key metrics for the practice success factors

Practice success factor: Establishing an infrastructure and platform management approach to meet evolving organizational needs
Key metrics:
Stakeholder satisfaction with the approach to management of infrastructure and platforms
Alignment of the infrastructure and platform management approach with the organization’s strategy and architecture
Number and impact of deviations from the organization’s strategy and architecture road map
Level of benefits, costs, and risks associated with the approach to management of infrastructure and platforms

Practice success factor: Ensuring that the infrastructure and platform solutions meet the organization’s current and anticipated needs
Key metrics:
Stakeholder satisfaction with infrastructure and platform solutions
Number and impact of infrastructure incidents
Number and impact of constraints imposed by infrastructure and platform solutions
Number and impact of deviations from the agreed approach

The correct aggregation of metrics into complex indicators will make them easier to use for the ongoing management of value streams and for
the periodic assessment and continual improvement of the infrastructure and platform management practice. There is no single best solution.
Metrics will be based on the overall service strategy and priorities of an organization, as well as on the goals of the value streams to which the
practice contributes.
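
One simple way to aggregate metrics into a complex indicator is a weighted average of normalized scores. The sketch below assumes each metric has already been converted to a score between 0 and 1; the metric names and weights are hypothetical and would depend on the organization's strategy and priorities.

```python
def composite_indicator(scores, weights):
    """Weighted average of normalized metric scores (each in the range 0..1)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same metrics")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[m] * weights[m] for m in scores) / total_weight

# Hypothetical normalized scores and weights, for illustration only.
scores = {"stakeholder_satisfaction": 0.8, "incident_impact": 0.5, "deviation_count": 1.0}
weights = {"stakeholder_satisfaction": 2, "incident_impact": 1, "deviation_count": 1}
indicator = composite_indicator(scores, weights)
```

A weighted average is only one possible aggregation. It can hide a poor score on a single metric, so the underlying metrics should remain visible alongside the indicator.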

[1] Nadareishvili, I., Mitra, R., McLarty, M., Amundsen, M., Microservice Architecture: Aligning Principles, Practices, and Culture, O’Reilly 2016

3. Value streams and processes

3.1 Value stream contribution

Like any other ITIL management practice, the infrastructure and platform management practice contributes to multiple value streams.
Remember, no value stream is made up of a single practice. The infrastructure and platform management practice combines with other
practices to provide high-quality services to consumers. The main value chain activities to which this practice contributes are:

deliver and support

design and transition

obtain/build

plan.

The contribution of the infrastructure and platform management practice to the service value chain is shown in Figure 3.1.

Figure 3.1 Heat map of the contribution of the infrastructure and platform management practice to value chain activities

3.2 Processes

Each practice may include one or more processes and activities that may be necessary to fulfil the purpose of that practice.

Definition: Process

A set of interrelated or interacting activities that transform inputs into outputs. A process takes one or more defined inputs
and turns them into defined outputs. Processes define the sequence of actions and their dependencies.

There are numerous models to structure activities of the infrastructure and platform management practice. These span several decades and
range from waterfall and manual, to iterative and incremental.

This practice is one of the two ITIL practices (the other is the software development and management practice) where activities do not always
form processes that could be described as sequences at the level of detail appropriate to this guide. This is because the infrastructure and
platform management activities are always performed in a context of one or another value stream, and always in conjunction with other
practices. However, activities of this practice can be categorized in three groups:

technology planning

product development

technology operations.

3.2.1 Technology planning activities

Technology planning activities ensure that the organization has a technology management approach and a roadmap for infrastructure
development and improvement. These activities ensure the organization’s financial, architectural, and resource plans are aligned. With
formalized and repeatable planning and effective integration with other practices, infrastructure and platform solutions will continually
support alignment with the strategic goals of the organization. Table 3.1 shows how the activities transform the inputs into outputs.

Table 3.1 Inputs, activities, and outputs of technology planning

Key inputs:
Organization’s principles, policies, and vision
Organizational strategy
Organizational structure
Product and service portfolio
Customer portfolio
Business analysis records and review reports
Audit reports

Activities:
Analyse the organization’s strategy and architecture
Develop and agree the infrastructure and platform management approach
Review the infrastructure and platform management approach

Key outputs:
Infrastructure and platform management approach and roadmap
Improvement initiatives and requests for changes

Figure 3.2 shows a workflow diagram of the process.

Figure 3.2 Workflow of technology planning

Table 3.2 provides an example of the technology planning activities.

Table 3.2 Example activities of technology planning

Activity: Analyse the organization’s strategy and architecture
Example: IT leaders of the organization analyse the organization’s strategy, architecture road map, and portfolios and define requirements for the infrastructure and platform management approach.

Activity: Develop and agree the infrastructure and platform management approach
Example: Business analysts, architects, product owners, and infrastructure experts agree and communicate an infrastructure and platform approach, including scope, sourcing strategy, methods and techniques, procedures, and responsibilities.

Activity: Review the infrastructure and platform management approach
Example: Based on infrastructure review reports, periodic reviews, and audit reports, product owners and infrastructure experts review the effectiveness of the infrastructure and platform management approach and provide input to the activity of analysing the organization’s strategy and architecture, and/or initiate required changes.

3.2.2 Product development activities

In many organizations these activities are performed within product development value streams in conjunction with other practices. The
infrastructure and platform management practice serves as a source of technical expertise and other resources to support product ideation,
design, development, and deployment. In other organizations, infrastructure and platform solutions are developed in a separate value stream
and provided as services to product teams and their products. The activities of the infrastructure and platform management practice are similar in these
scenarios. In many cases, infrastructure solutions are sourced from external developers; the activities of the practice are focused on ensuring
that the solutions meet the organization’s requirements and constraints.

This group includes the activities outlined in Table 3.3 and transforms the inputs into outputs.

Table 3.3 Inputs, activities, and outputs of product development

Key inputs:
Infrastructure and platform management approach
Solution requirements
Budget and other resources and constraints
Sourcing and supplier management policies
Sourcing and build policies and guidance
Operational standards
Success criteria
Project structure (schedule, assignment, methods)

Activities:
Create a basic solution design
Create a detailed solution design
Source/develop/configure the components
Source/build/configure the solution
Support validation and testing
Support deployment and release
Review solution development and implementation

Key outputs:
Basic and detailed design
Agreed service level objectives
Components and solutions
Solution documentation
Setup in management tools including monitoring, ITSM tools
Operational run books
Reports and scheduled reviews

The focus of technology delivery and engineering is on designing, building, and transitioning infrastructure and platform services. These
activities may vary, depending on how the services will be delivered and how the organization applies these steps, as is outlined in Table 3.4.

Table 3.4 Technology delivery and engineering activities

Activity: Create a basic solution design
Internal build and sourced: Based on the requirements identified by business analysts and the product owner, infrastructure specialists agree service level objectives for the infrastructure solution and create a basic solution design. The basic design is approved by the product owner.

Activity: Create a detailed solution design
Internal build and sourced: Infrastructure specialists and/or site reliability engineers create a detailed solution design, ensuring that its reliability, efficiency, scalability, and other quality characteristics required by the agreed SLOs and the organization’s infrastructure management approach are met. The resulting design includes a recommended sourcing and delivery model for the components and the solution.

Activity: Source/develop/configure the components
Internal build: Agreed components are developed and configured by infrastructure specialists according to the design.
Sourced: Agreed components are procured and configured by a supplier according to the design; their work is monitored and accepted by infrastructure specialists.

Activity: Source/build/configure the solution
Internal build: Agreed solution/system is built/configured by infrastructure specialists according to the design; their work is accepted by the product owner.
Sourced: Agreed solution/system is built/configured by a supplier according to the design; their work is monitored and accepted by infrastructure specialists and the product owner.

Activity: Support validation and testing
Internal build: Infrastructure specialists participate in the validation and testing of the components and the solution at all stages of the solution development, ensuring effective integration with the service validation and testing practice.
Sourced: Infrastructure specialists participate in the validation and testing of the components and the solution at all stages of the solution development, ensuring effective integration with the service validation and testing practice and the supplier management practice.

Activity: Support deployment and release
Internal build: Infrastructure specialists participate in the deployment and release of the solution, ensuring effective integration with the respective practices.
Sourced: Infrastructure specialists participate in the deployment and release of the solution, ensuring effective integration with the supplier management practice.

Activity: Review solution development and implementation
Internal build: Infrastructure specialists, product owners, and application developers review the infrastructure solution development activities and outcomes. The resulting report is used as an input to the technology planning activities and other improvement initiatives.
Sourced: Infrastructure specialists, product owners, application developers, and supplier representatives review the infrastructure solution development activities and outcomes. The resulting report is used as an input to the technology planning activities, supplier management improvements, and other improvement initiatives.

Product development activities ensure the delivery of a supportable solution that meets the organization’s needs and agreed SLOs. Even if an
external provider provides a solution, steps are taken to ensure it fits into the overall delivery and support model.

3.2.3 Technology operation activities

The technology operations activities are performed after the solution goes into the live environment. These activities include planned
maintenance and unplanned support activities. Maintenance focuses on the normal operations of the solution, such as administration and
monitoring. Support focuses on addressing events, incidents, alerts, and other areas that are not performing as planned. In an organization
that is not functioning well, the unplanned activities typically take most, if not all, of the available resource time. A more mature organization will focus on
planned activities that will result in less unplanned work.

This group includes the following activities, and transforms the following inputs into outputs:

Table 3.5 Inputs, activities, and outputs of the technology operation

Key inputs:
Solutions and support documentation, such as operational run books
Policies and guidelines
Monitoring data
Queries (incidents, problems, and so on)
SLAs

Activities:
Manage queues of queries and events
Perform scheduled tasks
Patch and update the system

Key outputs:
Reports
Closed tickets and events
Scheduled job completion
Backup completion
Updated solution and support documentation
Automation
Improvements

Table 3.6 provides example descriptions of the technology operation activities.

Table 3.6 Technology operation activities

Activity: Manage queues of queries and events
Example: Infrastructure management teams and tools process incoming queries and events, ensuring timely and successful resolution of detected incidents, alerts, and other events requiring a response. Logs and reports reflecting this activity are created as agreed in the infrastructure and platform management approach and solution documentation.
Examples of this work include:
rolling back a bad software push
blocking or rate-limiting unwanted traffic
bringing up additional serving capacity
using the monitoring systems (for alerting and dashboards)
solving incidents
analysing problems
conducting post-mortems.

Activity: Perform scheduled tasks
Example: Several actions are performed by infrastructure management teams or tools on a scheduled basis, such as daily backups or a data transfer between systems. Logs and reports reflecting this activity are created as agreed in the infrastructure and platform management approach and solution documentation.
Examples of this work include:
administering production jobs
describing the architecture, various components, and dependencies of the services
testing back-up restoration
training users
reviewing supplier performance
reviewing solution performance.

Activity: Patch and update the system
Example: Patches and system updates are released to the environment in a structured manner. Typically, patches are deployed to the lower environments for testing and then deployed to production. Despite this structure, there are exceptions where systems are not patched as part of this scheduled release due to an application incompatibility, business usage of the solution, or issues identified through testing. It is important to track the solutions that are not at current levels. These updates should be rolled out promptly to maintain overall supportability. Up-to-date solutions reduce the risk of downtime or security breaches.
There are also situations where system updates or patches are installed to resolve an incident and then need to be rolled out to the rest of the organization. Applying patches and updates reactively creates a non-standard environment.
The infrastructure specialist manages these exceptions and identifies a plan to address them. Understanding and addressing these deviations is a vital part of technology management.
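
Tracking the solutions that are not at current patch levels can be as simple as a filtered list with a recorded reason for each exception. The sketch below is illustrative: the `Solution` fields, the integer patch level, and the example names are assumptions, not part of this guide.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Solution:
    name: str
    installed_patch_level: int
    exception_reason: Optional[str] = None  # e.g. "application incompatibility"

def patch_exceptions(solutions, current_patch_level):
    """List solutions below the current patch level, with the recorded reason."""
    return [
        (s.name, s.exception_reason or "no reason recorded")
        for s in solutions
        if s.installed_patch_level < current_patch_level
    ]

# Illustrative data only.
solutions = [
    Solution("billing-db", 12),
    Solution("crm-app", 10, "application incompatibility"),
    Solution("legacy-print", 9),
]
behind = patch_exceptions(solutions, current_patch_level=12)
```

Entries with no recorded reason are worth particular attention, because they suggest a deviation that nobody has planned to address.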

The technology operation activities ensure that solutions are available and functioning as designed from acceptance into the live environment
through retirement. Technical experts and technical coordinators perform the activities in this process.

4. Organizations and people

4.1 Roles, competencies, and responsibilities

The practice guides do not describe the roles of practice owners or managers that should exist for all practices. They focus instead on specialist
roles specific to each practice. The structure and naming of each role may differ from organization to organization, so any roles defined in ITIL
should not be treated as mandatory, or even recommended. Remember, roles are not job titles. One person can take on multiple roles and one
role can be assigned to multiple people.

Roles are described in the context of processes and activities. Each role is characterized with a competency profile based on the model shown
in Table 4.1.

Table 4.1 Competency codes and profiles

Competency code: Competency profile (activities and skills)

L: Leader. Decision-making, delegating, overseeing other activities, providing incentives and motivation, and evaluating outcomes

A: Administrator. Assigning and prioritizing tasks, record-keeping, ongoing reporting, and initiating basic improvements

C: Coordinator/communicator. Coordinating multiple parties, maintaining communication between stakeholders, and running awareness campaigns

M: Methods and techniques expert. Designing and implementing work techniques, documenting procedures, consulting on processes, work analysis, and continual improvement

T: Technical expert. Providing technical (IT) expertise and conducting expertise-based assignments

Table 4.2 Examples of the roles involved in infrastructure and platform management activities

Technology planning

Activity: Analyse the organization’s strategy and architecture
Responsible roles: Architects, business analysts, product owners, infrastructure specialists
Competency profile: TC
Specific skills: Good knowledge of the organization and its environment, portfolios, products, resources, and customers; understanding of the current infrastructure architecture and architecture roadmap; analytical skills; good knowledge of current and available technology

Activity: Develop and agree the infrastructure and platform management approach
Responsible roles: Architects, business analysts, product owners, infrastructure specialists, consultants
Competency profile: TLMC
Specific skills: Good knowledge of the organization and its environment, portfolios, products, resources, and customers; excellent knowledge of current and available infrastructure and platform solutions; good knowledge of infrastructure and technology services suppliers and market

Activity: Review the infrastructure and platform management approach
Responsible roles: Architects, business analysts, product owners, infrastructure specialists, consultants
Competency profile: TCA
Specific skills: Good knowledge of the organization and its environment, portfolios, products, resources, and customers; understanding of the current infrastructure architecture and architecture roadmap; analytical skills; good knowledge of current and available technology

Product development

Activity: Create a basic solution design
Responsible roles: Solution architects, infrastructure specialists, site reliability engineers, product owners
Competency profile: TA
Specific skills: Understanding of the requirements; good knowledge of the infrastructure and platform management approach; expertise in the available technology

Activity: Create a detailed solution design
Responsible roles: Solution architects, infrastructure specialists, site reliability engineers, product owners
Competency profile: TA
Specific skills: Understanding of the requirements; good knowledge of the infrastructure and platform management approach; expertise in the available technology and services

Activity: Source/develop/configure the components
Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
Competency profile: TC
Specific skills: Technical expertise; communication and collaboration skills

Activity: Source/build/configure the solution
Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
Competency profile: TC
Specific skills: Technical expertise; communication and collaboration skills

Activity: Support validation and testing
Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
Competency profile: TC
Specific skills: Technical expertise; communication and collaboration skills

Activity: Support deployment and release
Responsible roles: Infrastructure specialists, site reliability engineers, product owners, suppliers
Competency profile: TC
Specific skills: Technical expertise; communication and collaboration skills

Activity: Review solution development and implementation
Responsible roles: Solution architects, infrastructure specialists, site reliability engineers, product owners
Competency profile: TCA
Specific skills: Good knowledge of the infrastructure and platform management approach; technical expertise; good knowledge of the organization and its environment, portfolios, products, resources, and customers

Technology operations

Activity: Manage queues of queries and events
Responsible roles: Infrastructure specialists, site reliability engineers
Competency profile: TA
Specific skills: Technical knowledge; understanding of business and customer context; communication and coordination skills

Activity: Perform scheduled tasks
Responsible roles: Infrastructure specialists, site reliability engineers
Competency profile: TA
Specific skills: Technical administration knowledge

Activity: Patch and update the system
Responsible roles: Infrastructure specialists, site reliability engineers
Competency profile: TA
Specific skills: Knowledge of security policies, standards, and requirements; technical knowledge

4.1.1 Infrastructure specialist

The key role for this practice is infrastructure specialist. This is a generic term for roles that can be specified either by technology
(for example, network specialist, site reliability engineer, or virtualization specialist) or by the phase in the product
lifecycle (for example, infrastructure designer/development specialist, testing specialist, or operations
administrator).

Those distinctions are defined by the organization’s size and structure, but the general set of competencies is similar and usually includes:

technology subject matter expertise

good understanding of the organization’s architecture

knowledge of the frameworks and techniques adopted by the organization

knowledge of the organization’s products and services

service mindset

good knowledge of the organization’s operating model and value streams.

Examples of other roles which can be involved in infrastructure and platform management activities are listed in Table 4.2, together with the
associated competency profiles and specific skills.

4.2 Organizational structures and teams

Infrastructure and platform management specialists often form a dedicated team (or teams). However, in some organizations they are
included in product teams and focused on infrastructure solutions supporting respective products. Regardless of the organizational solution, it
is important to maintain a shared view and responsibility across infrastructure and product teams.

Key message

“Rigid boundaries between ‘application development’ and ‘production’ (sometimes called programmers and operators) are
counterproductive. This is especially true if the segregation of responsibilities and classification of ops as a cost centre leads
to power imbalances or discrepancies in esteem or pay.
(…) Ideally, both product development and SRE teams should have a holistic view of the stack—the frontend, backend,
libraries, storage, kernels, and physical machine—and no team should jealously own single components. It turns out that you
can get a lot more done if you ‘blur the lines’ and have SREs instrument JavaScript, or product developers qualify kernels:
knowledge of how to make changes and the authority to do so are much more widespread, and incentives to jealously
guard any particular function are removed.”
This quote from “The Site Reliability Workbook” by Google refers specifically to SRE teams. However, it is valid for any other
approach to infrastructure and platform management.

The infrastructure and platform management practice needs to allow for organizational variations while ensuring some level of consistency
across infrastructure teams. The teams may be split by geography, type of technology, or business service. Having an overall structure to
manage practice changes and communication is important to keep the overall service functioning in an optimal manner. This may be done
with an overall governance group or through representation in an infrastructure committee.

5. Information and technology

5.1 Information exchange

The effectiveness of the infrastructure and platform management practice is based on the quality of the information used. This information
includes, but is not limited to:

business services and processes

customers and users

partners and suppliers, including contracts and service levels

SLAs

architecture and design documentation

portfolio and project management plans

policies, requirements, and controls

change records

incident records

request records

problem records

release records

financial information

application development and testing information

system information (versions, baselines, configurations)

monitoring and event information

IT assets and inventory information.

5.2 Automation and tooling

In most cases, the infrastructure and platform management practice can significantly benefit from automation. Where this is possible and
effective, it may involve the solutions outlined in Table 5.1.

Table 5.1 Automation solutions for infrastructure and platform management activities

Technology planning

Process activity: Analyse the organization’s strategy and architecture
Means of automation: Communication and collaboration tools; analytical systems; knowledge management tools
Key functionality: Collection, processing, and presentation of data from diverse sources
Impact on the effectiveness of the practice: High

Process activity: Develop and agree the infrastructure and platform management approach
Means of automation: Communication and collaboration tools
Key functionality: Collaboration and information sharing
Impact on the effectiveness of the practice: Medium

Process activity: Review the infrastructure and platform management approach
Means of automation: Communication and collaboration tools; analytical systems; knowledge management tools
Key functionality: Collection, processing, and presentation of data from diverse sources; reporting engines; dashboard systems
Impact on the effectiveness of the practice: High

Product development

Process activity: Create a basic solution design
Means of automation: Workflow tools including task assignment, routing, approvals, tracking, and notifications
Key functionality: Ability to assign design tasks and approval for planning activities, including status tracking, notifications, and reporting to ensure actions are on task and the design is approved
Impact on the effectiveness of the practice: High

Process activity: Create a detailed solution design
Means of automation: Workflow tools including task assignment, routing, approvals, tracking and notifications, contract management with templates, approvals, and review schedules
Key functionality: Ability to assign tasks and approval for planning activities, including status tracking, notifications, and reporting to ensure actions are on task
Impact on the effectiveness of the practice: High

Process activity: Source/develop/configure the components
Means of automation: Automated provisioning, building, and configuring tools
Key functionality: Ability to receive an approved request and to build a solution with no or limited manual intervention, ensuring consistent and timely delivery
Impact on the effectiveness of the practice: High

Process activity: Source/build/configure the solution
Means of automation: Automated provisioning, building, and configuring tools
Key functionality: Ability to receive an approved request and to build a solution with no or limited manual intervention, ensuring consistent and timely delivery
Impact on the effectiveness of the practice: High

Process activity: Support validation and testing
Means of automation: Automated testing and defect tracking
Key functionality: Automated testing, reporting, and logging into the defect management system
Impact on the effectiveness of the practice: High

Process activity: Support deployment and release
Means of automation: Deployment tools
Key functionality: Automated deployment from testing to implementation, including submission of change request
Impact on the effectiveness of the practice: High

Process activity: Review solution development and implementation
Means of automation: Workflow tools including task assignment, routing, approvals, tracking, and notifications; system health monitoring and reporting tools
Key functionality: Dashboards and reports, trend analysis
Impact on the effectiveness of the practice: Medium to high

Technology operation

Process activity: Manage queues of queries and events
Means of automation: Automated request provisioning, automated resolution, ChatOps, AIOps, workflow tools; task assignment, routing, approvals, tracking and notifications
Key functionality: Ability to close repeat tickets automatically and assign the tickets automatically to the correct group without manual triage steps
Impact on the effectiveness of the practice: High

Process activity: Perform scheduled tasks
Means of automation: Job scheduling tools and scripts for backup, batch, and other automated tasks; vulnerability tools and report and testing automation for compliance, automated solution recovery, and testing; ITSM report and dashboard automation; automated report consolidation and generation, customer feedback surveys; workflow tools including task assignment, routing, approvals, tracking, and notifications
Key functionality: Automation of scheduled tasks including notification on failures, reducing the potential for missed procedure execution; ability to automatically verify and test solutions for security hardening, recoverability, and controls
Impact on the effectiveness of the practice: High

Process activity: Patch and update the system
Means of automation: System and security patch deployment and inventory tools, software distribution and inventory tools
Key functionality: Ability to automatically deploy and report on installation status for patches and system updates
Impact on the effectiveness of the practice: High
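
The queue automation listed in Table 5.1 (closing repeat tickets and assigning tickets to the correct group without manual triage) can be sketched as follows. The ticket fields, routing rules, and group names are hypothetical; real ITSM tools expose their own data models and rule engines.

```python
def triage(ticket, open_tickets, routing_rules, default_group="service-desk"):
    """Decide what to do with an incoming ticket.

    Returns ("duplicate", id_of_existing_ticket) when an open ticket has the
    same source and event type, otherwise ("assign", group_name).
    """
    key = (ticket["source"], ticket["event_type"])
    for existing in open_tickets:
        if (existing["source"], existing["event_type"]) == key:
            return ("duplicate", existing["id"])
    return ("assign", routing_rules.get(ticket["event_type"], default_group))

# Illustrative data only.
open_tickets = [{"id": 101, "source": "db-01", "event_type": "disk_full"}]
rules = {"disk_full": "storage-team", "link_down": "network-team"}

repeat = triage({"source": "db-01", "event_type": "disk_full"}, open_tickets, rules)
routed = triage({"source": "sw-07", "event_type": "link_down"}, open_tickets, rules)
fallback = triage({"source": "app-03", "event_type": "unknown"}, open_tickets, rules)
```

Falling back to a default group keeps unrecognized events visible to a person instead of dropping them silently.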

6. Partners and suppliers

Very few services are delivered using only an organization’s own resources. Most, if not all, depend on other services, often provided by third
parties outside the organization (see section 2.4 of ITIL Foundation: ITIL 4 Edition for a model of a service relationship).

The infrastructure and platform management practice allows for many outsourcing options from both an activity perspective and a
technology perspective. Table 6.1 provides examples of areas that are candidates for outsourcing.

Table 6.1 Infrastructure and platform management sourcing considerations

Activity: Provisioning
Opportunity: Delivery of desktops, servers, computer, network, and storage services or other technology services
Applicability: Outsourcing is most effective when standards are in place. Outsourcing may be selectively used for remote locations.

Activity: Support
Opportunity: Restoration and prevention of incidents for in-scope technologies
Applicability: Support for the entire capability may be outsourced or focused on specific roles. Providers should adhere to standard service desk processes for a consistent customer experience. This works well for remote sites, especially for desktop support.

Activity: Administration
Opportunity: Performing routine tasks based on operational procedures and requests
Applicability: Administrative tasks need to be well documented and sufficient access must be provided.

Activity: Operations centre
Opportunity: Outsourcing the operations centre function reduces the need to ensure adequate coverage with internal staff, especially if it is provided at all times. This function can provide monitoring, systems management, job scheduling, or other activities.
Applicability: This reduces internal staffing requirements. This function must be well documented and have adequate access. Frequent touchpoints are recommended to understand any open issues or improvement opportunities.

Activity: Backup/restore
Opportunity: Provider configures and manages backup jobs and repositories, addresses backup failures, and restores files as needed
Applicability: Providers may leverage internal backup tools or may include backup solutions and storage as part of the agreement.

Activity: Systems management, patching, or other updates
Opportunity: Manage systems to keep up to date for versions, configurations, and patches
Applicability: Standards and configurations must be well documented, and access provided. Access to management tools is required.

Activity: Technology ownership
Opportunity: Technology can be leased through subscription services, reducing the capital required to implement and maintain technology
Applicability: With cloud offerings, this is a prominent trend in the industry. This allows for service levels and capabilities to be delivered without the overhead of building and supporting technology internally.

Because there are so many outsourcing opportunities in this space, understanding and managing outsourcing risks is an important activity to ensure that
services meet customer expectations. This should be done in close conjunction with other practices, such as the risk management and
supplier management practices.

Some examples of these risks are:

loss of flexibility due to constraints of agreement

additional unplanned costs if the scope needs to be modified or if consumption exceeds the contractual terms

contractual service levels may not align with customer expectations

security and policy adherence of providers

loss of internal talent as roles move from performing activities to overseeing those activities

lack of visibility.

Although all functions can be outsourced, it is recommended to retain oversight and architecture functions. Oversight ensures that providers
deliver to their committed levels and gives insight into potential improvements to the existing agreement. To support and continue delivering
services effectively, the internal team must understand how solutions connect across providers. As detailed knowledge of specific
technologies moves to the provider, an internal architectural role should retain an understanding of the design and operation of the
infrastructure environment.
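
As an illustration of the oversight function described above, the following minimal Python sketch compares measured service levels against the levels each provider has committed to and lists the shortfalls. It is a sketch under stated assumptions, not part of the ITIL guidance: the `ServiceLevel` structure, the `find_breaches` function, the provider names, and all figures are hypothetical, and every metric is assumed to be a percentage where higher is better.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ServiceLevel:
    """One committed service level and its measured value (hypothetical)."""
    provider: str
    metric: str
    committed: float  # contractual target, in percent (higher is better)
    measured: float   # value observed in the reporting period, in percent


def find_breaches(levels: Iterable[ServiceLevel]) -> List[ServiceLevel]:
    """Return the service levels where the measured value is below the commitment."""
    return [level for level in levels if level.measured < level.committed]


if __name__ == "__main__":
    # Illustrative figures only; not taken from any real agreement.
    report = [
        ServiceLevel("Provider A", "availability", committed=99.9, measured=99.95),
        ServiceLevel("Provider B", "backup job success", committed=99.0, measured=97.5),
    ]
    for breach in find_breaches(report):
        print(
            f"{breach.provider}: {breach.metric} measured {breach.measured}% "
            f"against a commitment of {breach.committed}%"
        )
```

Keeping all providers in one list reflects the point that the internal team needs a view across providers, not only within each agreement.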

7. Important reminder

Most of the content of the practice guides should be taken as a suggestion of areas that an organization might consider when establishing and
nurturing its own practices. The practice guides are catalogues of topics that organizations might think about, not a list of answers. When
using the content of the ITIL practice guides, organizations should always follow the ITIL guiding principles:

focus on value

start where you are

progress iteratively with feedback

collaborate and promote visibility

think and work holistically

keep it simple and practical

optimize and automate.

More information on the guiding principles and their application can be found in section 4.3 of ITIL Foundation: ITIL 4 Edition.

8. Acknowledgements

AXELOS Ltd is grateful to everyone who has contributed to the development of this guidance. These practice guides incorporate an
unprecedented level of enthusiasm and feedback from across the ITIL community. In particular, AXELOS would like to thank the following
people.

8.1 Authors

Angie Pederson.

8.2 Reviewers

Dinara Adyrbayeva, Akshay Anand, Peter Farenden, Roman Jouravlev, Vernon Lloyd.

References

1. Nadareishvili, I., Mitra, R., McLarty, M., Amundsen, M., Microservice Architecture: Aligning Principles, Practices, and Culture, O’Reilly, 2016
