Skillsoft Course Transcript
An event can also indicate that a normal activity or a routine operation is performed, such as installing
regular software updates.
An event is detected when an IT service, a configuration item, abbreviated to CI, or a monitoring tool
generates an alert or notification due to any change in the service or when a service failure occurs,
which may need further action and regular tracking. For example, a daily automatic hardware audit
on a computer network might indicate that a particular piece of hardware is not functioning. This leads to
detecting the event: the failure of the hardware to function as it is supposed to.
Note
A CI is a component that needs to be managed to deliver an IT Service. CIs typically include IT services,
hardware, software, buildings, people, and formal documentation, such as process documentation and
Service Level Agreements or SLAs.
These events are managed by the Event Management process. This process tracks all events
generated through the IT infrastructure and ensures normal functioning of the service.
Event Management also detects exception conditions that interfere with routine operations in the IT
infrastructure and escalates them to the appropriate teams or personnel for further action.
For example, when a program to scan all computers in a network fails on a few computers, an event is
generated and sent to the network administrator for further action.
Question
What is an event?
Options:
Answer
Option 1: This option is correct. An event may indicate normal or routine operations, such as
installing regular software updates, in the IT infrastructure.
Option 3: This option is correct. An event indicates a deviation from normal or expected
operations of IT services and reflects unusual conditions.
Option 4: This option is incorrect. To enable users to access an IT service, you can implement
the Request Fulfillment process. An event doesn't enable users to access an IT service.
Correct answer(s):
1. A change in the state of an activity which reflects normal operations in the IT infrastructure
3. A deviation from normal or expected operations of IT services
Event Management is the process of keeping track of all events that occur throughout an organization's
IT infrastructure. This process helps you quickly detect and respond to an event.
The tracking of events is facilitated by monitoring and control systems that are based on tools such as
Question
Answer
Option 1: This option is correct. Event Management tracks all events generated through the IT
infrastructure to help ensure normal operations.
Option 2: This option is incorrect. Event Management does not identify the cause of exception
conditions in the IT infrastructure; it detects and escalates these conditions.
Option 3: This option is incorrect. Event Management does not take action to correct
exception conditions; it escalates these conditions to the appropriate team or personnel for
further action.
Correct answer(s):
In addition to tracking events, detecting issues, and escalating exception conditions, the Event
Management process fulfills other purposes:
Suppose an organization has installed antivirus software on its server. The software generates an
event if it detects a virus on the server, and an active monitoring tool is set up to monitor these events.
The tool automatically detects the event generated by the antivirus software and sends alerts to the
team configured in the monitoring tool, which then takes appropriate action to deal with
the virus attack.
You can use Event Management tools to detect events and send not only alerts and warnings, but also
notifications indicating the status of the CIs. And based on these notifications, you can automate regular
operational activities.
For example, you can use an Event Management tool to send a notification to the system administrator
when the server backup completes every day, and configure the tool to send an alert automatically
if the backup fails. This automates the normally manual process of overseeing server backups.
You can also use Event Management tools to efficiently share services that are in demand, among
multiple devices.
Say a retail chain company has deployed an e-commerce application for customers to purchase goods
online. This application resides on two servers, and customers are connected to a server based on their
IP location. Based on the details from the Event Management tool, such as the number of customers
already connected to the servers and the available resources on the servers, the customer is allowed to
connect to any one of the servers dynamically. This enables effective use of resources, thus
improving performance of the online application.
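The server selection in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name pick_server, the dictionary keys, and the rule of choosing the server with the most free connection slots are hypothetical and not taken from any particular Event Management tool.

```python
def pick_server(servers):
    """Return the name of the server with the most free connection slots.

    Each server is a dict with hypothetical keys: name, connected, capacity.
    Returns None when every server is already full.
    """
    available = [s for s in servers if s["connected"] < s["capacity"]]
    if not available:
        return None
    best = max(available, key=lambda s: s["capacity"] - s["connected"])
    return best["name"]


# Hypothetical status details reported by the Event Management tool.
servers = [
    {"name": "server-1", "connected": 90, "capacity": 100},
    {"name": "server-2", "connected": 40, "capacity": 100},
]
chosen = pick_server(servers)  # the next customer is routed to this server
```

A real deployment would usually also weigh the customer's location and other resource details, which this sketch leaves out.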
These events or notifications detected through Event Management also serve as entry points for
executing activities, such as addressing and closing incidents logged with the IT Service Desk.
For example, if a user enters an incorrect password three times consecutively on an online banking
application, an alert is generated and based on this alert, the application is triggered to block the user
account. Also a notification is sent to the administrator of the application. Thus the alert or the event
generated is the entry point for executing the activity – blocking the user from accessing the application.
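As a rough sketch of how an alert can act as the entry point for an activity, the following Python example counts consecutive failed logins and blocks the account at the third failure. The class name LoginMonitor, its fields, and the notification text are hypothetical; only the threshold of three attempts comes from the example above.

```python
from dataclasses import dataclass, field

MAX_FAILED_ATTEMPTS = 3  # threshold taken from the example above


@dataclass
class LoginMonitor:
    """Counts consecutive failed logins and reacts when the threshold is hit."""
    failed: dict = field(default_factory=dict)          # user -> consecutive failures
    blocked: set = field(default_factory=set)           # users whose accounts are blocked
    notifications: list = field(default_factory=list)   # messages for the administrator

    def record_attempt(self, user: str, success: bool) -> None:
        if user in self.blocked:
            return  # account already blocked; nothing more to do
        if success:
            self.failed[user] = 0  # a successful login resets the streak
            return
        self.failed[user] = self.failed.get(user, 0) + 1
        if self.failed[user] >= MAX_FAILED_ATTEMPTS:
            # The alert is the entry point for the follow-up activity.
            self.blocked.add(user)
            self.notifications.append(f"Account blocked: {user}")


monitor = LoginMonitor()
for ok in (False, False, False):
    monitor.record_attempt("alice", ok)
```

After the third failed attempt, "alice" is in monitor.blocked and one message is queued for the administrator.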
Event Management also helps compare actual performance against defined standards and SLAs.
According to the SLA of a telecommunications company, the network availability should be 99.9%. To
calculate the actual network availability, you can use an Event Management tool to automatically track
the network downtime or failures. This, in turn, can be used to compare the actual performance with the
expected performance.
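The comparison against the SLA is simple arithmetic, sketched below in Python. The function names and the downtime figures are hypothetical; the 99.9% target is the one quoted in the example.

```python
SLA_TARGET_PERCENT = 99.9  # availability target from the SLA in the example


def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return the percentage of the period during which the network was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_sla(total_minutes: float, downtime_minutes: float) -> bool:
    """Compare actual availability with the SLA target."""
    return availability_percent(total_minutes, downtime_minutes) >= SLA_TARGET_PERCENT


# Hypothetical figures: a 30-day month with 60 minutes of recorded downtime.
minutes_in_month = 30 * 24 * 60  # 43,200 minutes
actual = availability_percent(minutes_in_month, 60)
```

For a 30-day month, a 99.9% target allows 0.1% of 43,200 minutes, or 43.2 minutes of downtime, so 60 minutes of tracked downtime would miss the target.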
Question
Options:
Answer
Option 1: This option is correct. Event Management helps detect events, such as an abnormal
occurrence on the network.
Option 2: This option is correct. Event Management automates the regular operational
activities of an organization. This leads to better resource and time utilization.
Option 3: This option is correct. Event Management helps compare defined standards and
SLAs against actual performance, such as the CPU capacity available on a client computer
with the CPU capacity required for an application. This facilitates effective operations in the
organization.
Option 4: This option is incorrect. Event Management does not correct exception conditions.
However, it escalates these conditions to the appropriate team to determine appropriate
control action.
Correct answer(s):
Many Service Operation processes, including Event Management, provide effective IT services. And
Event Management supports and spans across many aspects of Service Management that can be
controlled and automated.
monitoring CIs
Event Management helps monitor CIs that need to stay in a constant state and CIs whose
status needs to change frequently.
For example, your organization may have a software license for forty users. Event
Management can determine whether all forty licenses are installed and in use, and how frequently
they are used. This information can be used to ensure that no unlicensed software is installed
and to monitor usage statistics. Based on the frequency of its use, you can determine
which user groups need the software most, which in turn leads to optimal use of the
software.
detecting security breaches
Event Management can be used to monitor and detect security breaches.
For example, a monitoring tool tracks all installed applications on employees' computers. It
detects a restricted application on an employee's computer. The tool then generates an
event for this and notifies the system administrator, who can then take appropriate action.
monitoring environmental conditions, and
Event Management helps you monitor environmental conditions such as temperature.
Say the server room temperature increases due to failure of the air conditioner, which
increases the server temperature. An Event Management tool can be configured to detect
this event and send an alert to the concerned departments such as the IT and Facility
teams to take appropriate action.
tracking routine activities
Event Management can be used to monitor routine activities such as the server
performance.
For example, you can track the server performance using an Event Management tool. This
tool can be configured to generate events when the server performance falls below
acceptable levels.
In addition to its importance in Event Management, monitoring is also used in Service Management to
ensure normal operation of the IT infrastructure. However, Event Management and monitoring differ
from each other in some aspects.
Event Management works with operations that need to be monitored and generates events if there is a
deviation from the normal process. This process focuses only on operations that are monitored to
reduce adverse impact on IT infrastructure services.
For example, you can use Event Management to track the inventory of all hardware in an organization
and generate an event whenever an upgrade or new hardware is required. In this situation, if the
hardware is not upgraded or acquired, some routine operations of an organization might be impacted.
Monitoring, on the other hand, tracks operations that generate events or alerts, as well as occurrences in
the IT infrastructure that do not generate alerts.
For example, monitoring tools track the memory utilization of a file server to ensure that memory used is
within acceptable limits, even if the regular utilization status does not generate alerts. An alert would
occur only if memory utilization exceeds acceptable limits. This alert would then be managed by Event
Management.
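The difference between monitoring and Event Management can be illustrated with a short Python sketch. Every reading is recorded (monitoring), but only readings above the limit produce an alert for Event Management to handle. The 90% limit, the function name, and the sample data are assumptions made for illustration.

```python
MEMORY_LIMIT_PERCENT = 90.0  # assumed acceptable limit, for illustration only


def monitor_memory(samples):
    """Record every reading and return (readings, alerts).

    All samples are tracked, whether or not they are within limits.
    Only samples above the limit generate an alert.
    """
    readings = []
    alerts = []
    for timestamp, used_percent in samples:
        readings.append((timestamp, used_percent))
        if used_percent > MEMORY_LIMIT_PERCENT:
            alerts.append(
                f"{timestamp}: memory at {used_percent}% exceeds {MEMORY_LIMIT_PERCENT}%"
            )
    return readings, alerts


# Hypothetical readings from a file server.
samples = [("09:00", 72.0), ("10:00", 88.0), ("11:00", 93.5)]
readings, alerts = monitor_memory(samples)
```

Here all three readings are monitored, but only the 11:00 reading generates an alert.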
Question
An IT services organization has defined 90% memory utilization as the maximum acceptable
performance level for its main server. Due to extra workload in the IT infrastructure, the
server's memory utilization is close to 88%.
Options:
1. Event Management
2. Monitoring
Answer
Option 1: This option is incorrect. The Event Management process handles alerts that are
generated if the server's memory utilization exceeds the acceptable limit, which in this case is
90%.
Option 2: This option is correct. Close monitoring of the server's memory utilization helps
ensure that it does not go beyond the acceptable limit – 90% from the current 88%.
Correct answer(s):
2. Monitoring
Question
Options:
Answer
Option 1: This option is incorrect. Ensuring the memory utilization of a computer is within
acceptable limits is a function of monitoring and not Event Management. The Event
Management process tracks events that are generated when memory utilization exceeds
acceptable limits.
Option 2: This option is correct. Event Management monitors the state of the network switch
to ensure that it is always operational so that the network service is available at all times.
Option 3: This option is correct. Event Management helps detect if logon details provided on
the main file server do not match with the logon details in the system and send an alert.
Option 4: This option is incorrect. Ensuring that network traffic is within defined limits is a
function of monitoring and not Event Management. The Event Management process tracks
events generated when the network traffic exceeds defined limits.
Correct answer(s):
Suppose an Event Management tool generates an event that indicates that the time it takes to access
files on a server is longer than the expected performance levels. This event is logged as an incident and
requires further monitoring. The details of the incident are then passed on to the concerned personnel to
take appropriate actions. So early detection of the incident allows the IT Service Desk to determine the
cause of the slow server speed and take actions to return the service to normal operation.
Event Management notifies other processes of status changes or exceptions in the IT infrastructure.
These notifications allow the appropriate process teams or individuals to respond to these in a timely
manner.
This improves process performance and provides more useful and capable Service Management for an
organization. This further ensures that all operational activities of an organization are carried out without
any hindrance, and thus improves the overall business profitability.
Say a hosted web page application generates an event that indicates that a payment authorization site
is unavailable. Event Management allows for the triggered event to reach the appropriate person or
group responsible for resolving the issue. So the relevant person or group is able to resolve the issue
quickly and restore the site so that financial approvals of business transactions are not impacted, which
could lead to business losses.
Question
Options:
Answer
Option 1: This option is correct. Event Management enhances Service Management through
early detection of events. This ensures that all routine operations are performed without
interruption.
Option 2: This option is incorrect. Event Management provides notifications of status changes
and exceptions to other processes across the IT infrastructure.
Option 3: This option is incorrect. Event Management only helps you determine the corrective
action plan for an event and implement the plan; it does not correct the problems.
Correct answer(s):
Event Management provides a mechanism for monitoring automated activities by exception. You monitor
automated activities only when exceptional conditions occur in these activities. This reduces the need for
regular monitoring and saves monitoring costs.
For example, if a server's CPU utilization exceeds the acceptable utilization limit, an alert is
generated and sent to the system administrator for further action. In this situation, the system
administrator will need to monitor the CPU utilization, which is an automated activity, by exception.
Event Management also enables organizations to automate operations. Automated operations allow
organizations to use highly skilled, expensive human resources for more innovative work. Instead of
performing simple operations that can be automated, employees can design new services, improve
existing services based on customer feedback, or define new ways in which the organization can use
technology.
Consider that as an IT policy of an organization, all employees should change their network password
every 21 days. Using Event Management, this regular activity of changing the password can be
automated. The Event Management tool can be configured to check each user's password for expiry
and send an alert to the concerned employee five days before it expires. And in case any employee
does not change the password before the expiry, the employee is automatically taken to the password
change screen and required to change the password.
This automation ensures that IT personnel do not have to spend time manually helping
employees change their network passwords every time they expire or reminding employees about the
impending expiry of their passwords. Instead, the IT personnel’s time and effort can be used for more
critical tasks. This improves productivity and profitability.
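The password policy described above can be sketched as a small decision function in Python. The function name and return values are hypothetical. The sketch also assumes that reminders are sent on any day within the last five days before expiry and that the password counts as expired on the expiry date itself; the source only states the 21-day lifetime and the five-day lead time.

```python
from datetime import date, timedelta

PASSWORD_LIFETIME_DAYS = 21  # policy from the example
WARNING_WINDOW_DAYS = 5      # alert lead time from the example


def password_action(last_changed: date, today: date) -> str:
    """Return 'ok', 'send_reminder', or 'force_change' for one user."""
    expires_on = last_changed + timedelta(days=PASSWORD_LIFETIME_DAYS)
    days_left = (expires_on - today).days
    if days_left <= 0:
        return "force_change"   # take the user to the password change screen
    if days_left <= WARNING_WINDOW_DAYS:
        return "send_reminder"  # alert the employee before expiry
    return "ok"


# Hypothetical user who last changed the password on 1 January 2024.
action = password_action(date(2024, 1, 1), date(2024, 1, 17))
```

A scheduled job could run this check once a day for every user, so that no IT staff member has to track expiry dates manually.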
Question
Options:
Answer
Option 1: This option is correct. Event Management helps in early detection of events so an
organization can ensure that its routine operations are not interrupted.
Option 2: This option is correct. Event Management tracks the status of server installation and
notifies status changes and any exceptions in the installation process. This ensures that the IT
team provides an early response to such notifications.
Option 3: This option is correct. Event Management helps automate the activity of installing
software updates on the server. This ensures that the IT personnel's effort is spent on more
innovative work.
Option 4: This option is correct. Event Management provides a mechanism for monitoring
automated activities by exception, such as monitoring the CPU utilization of the server when it
exceeds the defined utilization limit.
Option 5: This option is incorrect. Tasks are assigned to the IT personnel by the IT
manager without using Event Management.
Option 6: This option is incorrect. Problems in server installation are resolved using the
Incident Management process.
Correct answer(s):
4. Summary
Event Management is a Service Operation process that includes tracking all events generated in the IT
infrastructure and determining appropriate preventive or corrective actions for these events. It also helps
automate regular operational activities, provides entry points for various service operations, and helps
you compare actual performance with defined standards or SLAs.
Additionally, Event Management helps manage various aspects of Service Management such as
monitoring CIs, monitoring software licenses, detecting security breaches, monitoring environmental
conditions, and tracking routine activities in the IT infrastructure.
A well-monitored and controlled Event Management process adds value to the business. It provides a
mechanism for early detection of events, provides notifications to identify changes, allows for
monitoring of automated activities by exception, and provides a basis for automated operations.
The Policies of Event Management
Learning Objectives
After completing this topic, you should be able to
distinguish between types of events
identify what determines event types
1. Types of events
Managing and recognizing different types of events is critical for an organization to ensure an effective
Event Management process. By managing events, you can detect any deviation from normal or
expected operations. And by categorizing events into different types, you can quickly detect them and
determine appropriate control actions for each type of event.
Regular events can be used as data for the future analysis of services. This can help in improving the
performance of various IT services provided to users and customers.
For instance, consider an antivirus program that is scheduled to run every day on all user computers in
an organization. In this case, the IT Department can deploy a monitoring tool that tracks if the program
runs as scheduled. After the program is run, the monitoring tool sends a notification to the network
administrator indicating that the operation was performed successfully at the scheduled time. This is
categorized as a regular event.
While this notification informs the Service Desk about every transaction, it can also be used to analyze
the average time taken in completing a transaction. This data can be used to improve the transaction
process in the future.
For example, consider a bank that provides online banking services to its customers. It allows customers
to transfer funds from one account to another through its web site. For security purposes, the IT
Department has stipulated a specific turnaround time in which every fund transfer should be completed.
In this case, if the transaction is taking 20% more time than specified, it would be reported as an unusual
event as it is exceeding the defined threshold.
Unusual events do not require immediate response from the IT Service Desk but they must be
monitored closely for any extreme deviation from the identified thresholds. In the case of funds transfer,
the bank's IT Service Desk can monitor if this event is occurring repeatedly.
For instance, in the case of online funds transfer, the transfer process may have slowed down because
of a large number of users accessing the web site at the same time. So once the number of users
declines, the normal operation is restored and the funds are transferred without any delay.
However, if the funds transfer process is taking a longer time for every transaction, the IT Department
needs to intervene, identify the cause, and resolve the problem.
Consider that the IT Department of an organization has deployed a monitoring tool to perform audits on
every computer. In one such routine audit, unauthorized software is detected on one of the
computers. The monitoring tool then automatically sends alerts and notifies the Service Desk about this
breach in security. The detection of unauthorized software is an example of an exceptional event that
can have an adverse impact on the data security of the organization.
For example, suppose a customer performs an online transaction to purchase products and the payment
authorization web page fails to load. This can result in disruption of customer services and also means a
loss of business to the online shopping company. If the web site is being monitored through Event
Management, this exceptional event can be quickly detected and resolved.
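The three categories of events discussed in this topic can be sketched as a simple classification rule in Python, using the funds transfer example. This is only an illustration: the function name is hypothetical, and treating a failed transfer as exceptional and a 20% overrun as the boundary for unusual are assumptions drawn from the examples, not fixed rules.

```python
from typing import Optional

UNUSUAL_OVERRUN = 0.20  # 20% over the specified turnaround time, as in the example


def classify_transfer(target_seconds: float, actual_seconds: Optional[float]) -> str:
    """Classify one funds transfer as 'regular', 'unusual', or 'exceptional'.

    actual_seconds is None when the transfer failed to complete.
    """
    if actual_seconds is None:
        return "exceptional"  # service failure: needs investigation and action
    if actual_seconds >= target_seconds * (1 + UNUSUAL_OVERRUN):
        return "unusual"      # monitor closely; no immediate response needed
    return "regular"          # completed within the expected time


# Hypothetical turnaround target of 10 seconds.
results = [classify_transfer(10, t) for t in (8, 13, None)]
```

With a 10-second target, the three sample transfers are classified as regular, unusual, and exceptional respectively.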
Question
A retail chain uses an IT-based inventory system that keeps a record of all products shipped
out of their central warehouse, including the rate at which inventory is deployed to each store.
This enables the company to forecast their ordering requirements from the manufacturers in
advance, and see the overall average of product consumption at each retail outlet. One store
is ordering three times the quantity of a particular product compared with all the other stores.
Options:
1. Regular event
2. Exceptional event
3. Unusual event
Answer
Option 1: This option is incorrect. A regular event is typically generated when an activity is
completed. These activities are routine changes such as the completion of the product sales
process. Regular events don't require immediate response from the Service Desk.
Option 2: This option is incorrect. An exceptional event requires further investigation and
action to resolve the event. Because the notification in this case requires the network
administrator to simply monitor the quantity of the product, this is not an exceptional event.
Option 3: This option is correct. Events are considered unusual when they don't meet the
threshold specified or overshoot the limit. In this case, the store has a much higher quantity of
a particular product than all the other stores. This event alerts appropriate personnel who can
more closely monitor the situation to determine why the store is exceeding expected levels in
ordering that particular inventory item.
Correct answer(s):
3. Unusual event
Question
A bank provides clients with an online secure platform for Internet banking. You are the IT
manager in this bank responsible for this particular service. Match each example of an event
with the appropriate event type.
Options:
Targets:
1. Regular event
2. Exceptional event
3. Unusual event
Answer
A client logging onto the bank's web site is a regular event because it simply confirms the
successful completion of an activity. This event doesn't require any immediate response from
the IT Service Desk.
The web page repeatedly failing to perform as it should is an example of an exceptional event
because it represents the failure of a service. This event can adversely impact the bank's
customer relationships and so requires immediate resolution from the Service Desk.
The unusual increase in the time taken to complete a transaction for bill payment can be
considered an unusual event. This event requires monitoring by the Service Desk but no
immediate action. The Service Desk can monitor this event, and if it persists for a longer
duration, an action may be required.
Correct answer(s):
Target 1 = Option A
Target 2 = Option B
Target 3 = Option C
Determining whether an event is regular, unusual, or exceptional varies from one organization to
another.
There are no definitive rules for categorizing events. Every organization has its own categorization of
events that is based on the significance of the event and the organization's policies and benchmarks.
The organizational goals and targets specified in the Service Level Agreements, customer requirements,
and industry standards may also determine what constitutes an event's category.
A regular event for one organization might be unusual for another. And an unusual event for one
organization might be exceptional for another.
Consider that the manufacturer of an application specifies a benchmark of 80% memory utilization of the
main server as optimum for the application. For an organization with a small IT infrastructure, the
application function begins to degrade at 65% memory utilization. However, in a big IT infrastructure,
80% memory utilization indicates normal function for the application. So a condition considered regular
for one organization might be unusual for another.
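One way to picture this is as a per-organization policy table, sketched below in Python. The organization labels, the function name, and the choice to treat a reading as unusual only when it is strictly above the limit are assumptions made for illustration; the 65% and 80% figures come from the example.

```python
# Hypothetical per-organization policies: the memory utilization (percent)
# above which a reading stops being treated as a regular event.
POLICIES = {
    "small_infrastructure": 65.0,
    "large_infrastructure": 80.0,
}


def categorize(org: str, memory_percent: float) -> str:
    """Return 'regular' or 'unusual' under the named organization's policy."""
    return "unusual" if memory_percent > POLICIES[org] else "regular"


# The same 70% reading falls into different categories.
small = categorize("small_infrastructure", 70.0)
large = categorize("large_infrastructure", 70.0)
```

The same 70% reading is unusual for the small infrastructure but regular for the large one, because each organization applies its own benchmark.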
Similarly, a particular software program might be restricted by one organization and allowed in another.
Therefore, installation of the program might be an exceptional event in one organization and a regular
event in another.
Customer requirements can also help a company determine if an event is regular, unusual, or
exceptional. For example, consider an investment banking company whose clients demand real-time
stock quotes. Because this affects the company's ability to do business with clients, they have more
stringent requirements for real-time information. Delays in accessing information could be far more
detrimental for their business. So this company may consider such delayed operations unusual or
exceptional, in contrast with other industries that would consider the same delayed operations regular or
a mere warning.
All events in an IT infrastructure, regardless of their type, are detected only when an alert message is
generated. These messages are referred to as event notifications and are generated to signal a status
change or an exceptional condition of a configuration item or an IT service.
To respond to event notifications, they should first be identified and captured. This can be done only if
these event notifications are properly defined. The Service Desk can identify the occurrence of an event
based on the definition and respond to the generated event notifications appropriately.
Question
You are the senior system administrator of a telecommunications company that provides high-
speed Internet services to customers. You are responsible for implementing the Event
Management process in the company's IT Service Life Cycle. To do this, you need to identify
the various event types that should be recorded. What factors will help you determine the
various event types?
Options:
Answer
Option 1: This option is correct. Every organization has its own goals and targets that help
determine the event types. These goals and targets are typically specified in the Service Level
Agreements during the Service Design stage of the IT Service Management process.
Option 2: This option is correct. Customer requirements are an important factor that determines
if an event should be categorized as regular or exceptional. These requirements help you
identify the significance of an event, which makes it easier to determine if the event is normal
or requires immediate resolution.
Option 3: This option is incorrect. There are no definitive rules that govern or influence the
categorization of events in the IT Service Life Cycle of an organization. Event Management
simply lays down the process of managing events; it doesn't specify any rules that determine
event types.
Option 4: This option is incorrect. An event is not categorized based on the type of
department from which it originates. The categorization is based on the significance of the
event.
Correct answer(s):
Events signifying regular operations are occurrences in the IT infrastructure that do not require any
action and are used to report successful completion of an activity. They simply identify a change in the
status of a device or a service. Exceptional events indicate abnormal operations that require further
investigation to prevent any adverse impact on the business operations. And unusual events indicate
situations that require monitoring so as to prevent them from turning into exceptions.
Determining the types of events is crucial for effectively managing events. There are no definitive rules
for determining event types. However, an organization's policies and benchmarks help determine which
types of events are regular, exceptional, or unusual.
Approaching Event Management
Learning Objectives
After completing this topic, you should be able to
recognize the purpose, scope, and business value of Event Management
recognize examples of different types of events
1. Exercise overview
In this exercise, you're required to identify the purpose, scope, and business value of Event
Management. Additionally, you're required to differentiate between the types of events.
Question
You're the IT manager of a bank that provides clients with an online secure platform for
Internet banking. You want to implement the Event Management process to manage all events
related to the use of the online banking application.
Options:
Answer
Option 1: This option is correct. Event Management helps you quickly detect and analyze
events that indicate a change in the state of a component or an IT service.
Option 2: This option is correct. Event Management helps you determine appropriate control
actions for incidents that require immediate response from the IT support team.
Option 3: This option is correct. Event Management enables you to automate routine
operational activities of an organization. This leads to better resource and time utilization.
Option 4: This option is correct. Event Management provides a way of comparing actual
performance and behavior against standards and service level agreements.
Option 5: This option is incorrect. Monitoring helps you detect the status of components when no
events are occurring. Event Management detects events that indicate a change in the state of a
component or an IT service.
Option 6: This option is incorrect. Event Management does not resolve exception events.
However, it assigns these events to the appropriate team to determine appropriate control
action.
Correct answer(s):
1. Detect events, such as successful completion of online funds transfers and user
authentication
2. Determine appropriate control actions for incidents, such as unauthorized access or failure
of the transaction process
3. Automate routine activities such as password updates or new password requests
4. Compare actual performance against expected performance, such as the time taken to
complete a transaction
Question
You're the senior system administrator in a retail chain that uses an IT-based inventory
application for managing its inventory. You want to implement the Event Management process
to efficiently manage all events generated in the IT infrastructure of your company.
What are the aspects of IT Service Management in your company that can be controlled by the
Event Management process?
Options:
Answer
Option 1: This option is correct. Event Management controls security issues, such as
unauthorized access to an application. This process helps you deploy monitoring tools that
trigger alerts when an unauthorized user tries to access an application.
Option 2: This option is correct. Event Management helps you monitor and detect all events
that indicate normal activity or a routine change, such as the performance of an application.
Option 3: This option is correct. Event Management ensures that all events related to upgrades
of software licenses are captured. It also helps you optimize the allocation of software to
authorized users.
Option 4: This option is correct. Event Management monitors configuration items whose
status changes frequently. For example, it can trigger alerts whenever the inventory data is
updated on the main server.
Option 5: This option is incorrect. The scope of Event Management does not include
managing the resources or identifying resource availability for resolving incidents. This process
is concerned with monitoring and detecting events that indicate a change in the state of a
device or activity.
Option 6: This option is incorrect. Monitoring checks the status of components even when no
events are generated. Event Management generates and detects notifications only when there
is a change in the status of a component or service.
Correct answer(s):
Question
How can the Event Management process help you provide more value to the business?
Options:
Answer
Option 1: This option is correct. Event Management helps you automate routine activities
such as providing reminders for password updates. This helps you use human resources for
more critical tasks that require expert knowledge, such as a network failure or a server
breakdown.
Option 2: This option is correct. Event Management enables you to monitor exceptional
events, such as an error that causes a program to change from its usual routine. In this case,
you can deploy a monitoring tool to trigger alerts when automated reminders fail to reach the
employees.
Option 3: This option is correct. Event Management helps you quickly detect incidents that
can adversely impact business operations. In this case, Event Management can help you
trigger an alert when employees do not update their passwords, which might make their
computer vulnerable to security threats.
Option 5: This option is incorrect. Information security guidelines are determined by the
organization and are based on industry standards. Event Management is not responsible for
designing security guidelines.
Correct answer(s):
1. Automates the task of sending reminders by deploying an application that prompts the user
to change the password after every 90 days
2. Deploys a monitoring tool to detect exception events such as when a user doesn't receive
automated reminders
3. Provides a mechanism of early detection of incidents, such as not updating passwords
after the specified period or using wrong passwords
Question
Employees in your organization use a timesheet management system to record their daily
activities and the time spent in performing those activities. You're the IT Service Desk agent
responsible for managing all events related to this system. To do so, you want to categorize all
related events.
Match each example of an event with the appropriate event type. You may use each event
type more than once.
Options:
Targets:
1. Regular event
2. Exceptional event
3. Unusual event
Answer
An employee logging on to the timesheet management system and filling in the timesheet are
regular events, which signify successful completion of activities. These are events that do not
require any immediate action from the IT Service Desk, but can provide useful data for the
improvement of services in the future.
A timesheet segment not responding to requests and a user ID getting blocked are considered
exceptional events because they signify the failure of a service. These events can impact
the work of employees and require immediate action to resolve them.
The system taking a longer time than normal to process the timesheet is an unusual event that
requires monitoring from the IT Service Desk. In case of this event, the Service Desk should
monitor the event and check whether it resolves by itself after some time or persists for a
longer period and requires intervention.
Correct answer(s):
Target 3 = Option C
Question
A stockbrokerage firm uses an application that provides real-time stock quotes. Availability of
real-time information is a critical factor in their ability to serve their clients satisfactorily. As the
system administrator, you detect an event notification that indicates the memory utilization of
the main server is currently at 75% and increasing. As a result, information is taking a longer
time to process. The threshold is 80% after which there may be a server failure.
What type of event does the current situation signify?
Options:
1. An unusual event
2. An exceptional event
3. A regular event
Answer
Option 1: This option is correct. The event notification indicating 75% memory utilization of the
server signifies an unusual event because it has not yet reached the threshold. This event is
intended to notify the Service Desk that the server's memory utilization should be monitored.
The Service Desk should check if the normal operation is restored or if it requires intervention
in case the situation continues for a longer duration.
Option 2: This option is incorrect. The event of 75% memory utilization of the server is not an
exception because it has not breached the threshold and has not yet resulted in the service
failure. The current situation doesn't require any immediate action except monitoring the event.
Option 3: This option is incorrect. A regular event signifies a routine change in the status of an
activity or the successful completion of an operation. The current situation indicates that the
server's memory utilization is increasing and slowly approaching the threshold. So this requires
closer monitoring from the Service Desk.
Correct answer(s):
1. An unusual event
Question
You are the IT manager of a company that provides online DVD rental services. You have
implemented the Event Management process to manage all events efficiently. For this, you are
using a monitoring tool that sends notifications for all specified events.
Options:
Answer
Option 1: This option is correct. Successful completion of a transaction is a routine activity, so
it is considered a regular event. This event doesn't require any action from the IT Service
Desk. Data related to this event can be used for improving the service in the future.
Option 2: This option is correct. The event notification that the automated mail is sent to new
users indicates a completed activity, and so is a regular event. This event doesn't require the
IT Service Desk to monitor or take any action.
Option 3: This option is incorrect. The transaction process taking 10% longer than normal
to complete signifies an unusual event because it indicates that an activity is approaching its
threshold. This event requires closer monitoring to prevent it from becoming an exceptional
event.
Option 4: This option is incorrect. The payment authorization page not getting displayed can
result in loss of business to the company and therefore is considered an exceptional event.
This event requires immediate resolution from the IT Service Desk.
Correct answer(s):
Question
Options:
1. Unusual event
2. Regular event
3. Exceptional event
Answer
Option 1: This option is incorrect. The event of unauthorized access to a network is not an
unusual event because it requires immediate response from the Service Desk. An unusual
event signifies an unusual operation, such as a service reaching a threshold, and may
require monitoring.
Option 2: This option is incorrect. Unauthorized access to a network is not a regular event
because it can make the company information vulnerable to misuse. An event is considered
regular when it reflects a routine change in the status of an activity or the successful
completion of an operation.
Correct answer(s):
3. Exceptional event
The Event Management Process
Learning Objective
After completing this topic, you should be able to
recognize how the Event Management process works
Ensuring that IT services are delivered effectively and efficiently is the prime goal of ITIL Service
Operation. A key component of Service Operation is the Event Management process.
Note
Service Operation is a stage in the IT Service Life Cycle. It contains a number of processes and functions.
The Event Management process monitors the status of IT infrastructure to ensure there are no
disruptions to the normal operations of the service, detects events that may cause disruptions to
services, and escalates those events to the appropriate process. This process monitors the performance
of a service and ensures its availability.
During the Event Management process, event notifications are generated by configuration items or CIs,
monitoring tools, or IT services in response to events.
Events are changes in the state of a CI or IT service that can affect normal functioning of the service or
the IT infrastructure on which it is deployed.
Note
A CI is any component in the IT infrastructure monitored for effective delivery of an IT service. CIs can be
hardware, software, buildings, personnel, or even formal documents.
Events can indicate normal activities, abnormal activities, or requirements for intervention during normal
activity.
Continuous monitoring ensures smooth functioning of normal operations and enables detection and
escalation of exception conditions. The Event Management process consists of ten steps.
Supplement
Selecting the link title opens the resource in a new browser window.
Job Aid
Access the job aid The Event Management process to view a diagrammatic representation of
the complete process.
The first five steps in the Event Management process are related to the generation of event notifications:
In the last five steps of the Event Management process, an appropriate response action is selected and
the event is closed:
The first step in the process of Event Management is the occurrence of an event.
Events occur all the time. However, you need to configure the system to detect and record those events
that are essential for managing the IT service or the IT infrastructure on which the service runs. At the
design stage, you consult with the design, development, management, and IT support teams to
determine which events need to be detected. The process used to decide which events are detected
and how they are handled is called instrumentation.
Say you are monitoring the performance of a server. You can configure the system to detect events such
as an increase in temperature of the server beyond a threshold, sudden increase in CPU utilization, and
power failure. These events can affect regular operations of the service and will need timely intervention.
In the second step of the Event Management process, the CIs or the IT service generate event
notifications to indicate that events have occurred.
Designers, based on previous experience, program the CI to generate a standard set of event
notifications. If required, the CIs can be customized by configuring additional event-generation
processes. In some cases, you might have to install additional agent software that monitors the CI
and generates event notifications.
polling and
In polling, a management tool constantly queries the CI for specific data. Here the CI plays
a passive role in communication.
For example, you can configure a management tool to check regularly for any security
intrusion in the IT infrastructure.
notification
You can also design CIs to actively generate notifications when certain criteria are met.
Here the CIs play an active role in communication.
On a server, by default many events are tracked and tagged with specific event IDs. You
can configure the CI to generate specific event notifications for selected IDs.
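The difference between the two communication styles can be sketched in a few lines of Python. This is only an illustration: the ConfigurationItem class, the temperature reading, the 70.0 threshold, and the server name are all hypothetical and do not come from any real management tool.

```python
from typing import Callable, List


class ConfigurationItem:
    """Hypothetical CI that holds a temperature reading and a notification hook."""

    def __init__(self, name: str, threshold: float) -> None:
        self.name = name
        self.threshold = threshold
        self.temperature = 0.0
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        # Notification style: the management tool registers its interest once.
        self._listeners.append(listener)

    def set_temperature(self, value: float) -> None:
        self.temperature = value
        # Active role: the CI itself raises a notification when the criterion is met.
        if value > self.threshold:
            for listener in self._listeners:
                listener(f"{self.name}: temperature {value} exceeds {self.threshold}")


def poll(ci: ConfigurationItem) -> bool:
    # Polling style: the management tool asks and the CI only answers (passive role).
    return ci.temperature > ci.threshold


received: List[str] = []
server = ConfigurationItem("server-01", threshold=70.0)
server.subscribe(received.append)

server.set_temperature(65.0)      # below the threshold: nothing is pushed
polled_before = poll(server)
server.set_temperature(75.0)      # above the threshold: the CI pushes a notification
polled_after = poll(server)
```

With polling, the tool learns about the condition only when it next asks. With notification, the CI reports the condition as soon as it occurs.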
Event notifications are of two types – proprietary and open standard. Proprietary event notifications can
be detected only by the management tools provided by the manufacturer of the CI. Most of the event
notifications related to the operating system are proprietary.
On the other hand, open-standard event notifications are those that do not require a proprietary
management tool for detection. Open-standard event notifications are more common when you use
customized applications on your server.
Note
Event notifications, whether proprietary or open standard, are useful only when they are clearly defined
and the response action to be taken is clear. When developing event notifications, designers must
ensure that the event notification data is clear and the roles and responsibilities are well defined.
A wide alert – an event notification without a clear indication of who should resolve it – leads to
ambiguity and effort duplication.
An event notification is detected during the event detection step – the third step in the Event
Management process.
If a CI has agent software installed, the agent detects the event notification. In the absence of agents,
event notifications are passed to the management tools, which can analyze them.
For example, you can use a proprietary network management tool to monitor the network bandwidth
usage. The tool can generate automatic event notifications when the usage threshold is
crossed.
When necessary, the next step in the Event Management process is filtering event notifications.
Although you can configure the CI to generate event notifications only for required events, additional
filtering is often required.
Sometimes, it might be sufficient to send only the first event in a series of related events to the
management tool. For example, say a network cluster fails and is being repaired. However, the system
keeps generating events because the network is down. In this case, the network will start functioning
when the cluster failure is fixed. So the network failure events can be ignored; only the first event needs
to be recorded.
In addition, a first-level correlation is also performed during this step. This involves identifying the
significance of the event and classifying it based on the significance.
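The idea of passing on only the first event in a series can be sketched as follows. This is a minimal illustration under stated assumptions: the Event fields and the rule that events with the same source and kind belong to one series are hypothetical, not taken from a specific management tool.

```python
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    source: str      # CI that raised the notification (hypothetical field)
    kind: str        # for example "network_down" (hypothetical field)
    timestamp: int   # seconds since monitoring started


def keep_first_of_each_series(events: Iterable[Event]) -> List[Event]:
    """Pass on the first event of each (source, kind) series and drop the repeats."""
    seen = set()
    kept = []
    for event in events:
        key = (event.source, event.kind)
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


raw = [
    Event("switch-07", "network_down", 10),
    Event("switch-07", "network_down", 20),
    Event("switch-07", "network_down", 30),
    Event("server-01", "disk_warning", 35),
]
filtered = keep_first_of_each_series(raw)
```

A real filter would usually forget a series once the underlying failure is fixed, so that a later failure of the same kind is reported again. That reset is left out here to keep the sketch short.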
The fifth step in the Event Management process involves classifying events based on their significance.
Different organizations follow different event categorizations. However, the broad event categories
commonly used include
informational
You use informational events to check the status of a device or service, to confirm if an
activity is completed successfully, or to collect data for statistics and investigation.
For example, a server might run a number of services. If any of the services stop, start, or
restart, event notifications are generated.
warnings, and
If a device or a service is approaching its threshold, warning events are generated
indicating that action needs to be taken and providing sufficient time to respond. Say, the
humidity in the server room is increasing. When it reaches the threshold of 48%, a warning
event is raised. If this trend continues and surpasses the acceptable limit – 51% – normal
functioning of the equipment might be impaired. This in turn impacts the business because
the OLA of the administration department is breached.
exceptions
Any abnormal activity in the device or service generates an exception event. Exception
events are generated due to any failure, functionality impairment, or poor performance and
significantly impact the business. An exception event is generated if a server suddenly
shuts down, for example.
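The three categories can be illustrated with the humidity example. The 48% and 51% values come from the example above. Treating a reading above 51% as an exception is an assumption made for this sketch, and the function and enum names are hypothetical.

```python
from enum import Enum


class Category(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


WARNING_THRESHOLD = 48.0   # warning is raised when humidity reaches this value
ACCEPTABLE_LIMIT = 51.0    # above this, normal functioning might be impaired


def classify_humidity(percent: float) -> Category:
    """Classify a server-room humidity reading by its significance."""
    if percent > ACCEPTABLE_LIMIT:
        # Assumption for this sketch: surpassing the limit is treated as an exception.
        return Category.EXCEPTION
    if percent >= WARNING_THRESHOLD:
        return Category.WARNING
    return Category.INFORMATIONAL
```

For instance, classify_humidity(49.0) returns Category.WARNING, which leaves time to respond before the acceptable limit is surpassed.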
Question
You are monitoring the performance of the server and suddenly a backup server fails. Based
on the Event Management process, sequence the first five steps in this case.
Options:
A. The first notification is passed to the management tool and the rest are
ignored
B. Network components are unable to reach the server
C. Notifications for inability to reach the server are generated repeatedly
D. An agent detects the notifications
E. Informational events, warnings, and exceptions are identified
Answer
Correct answer(s):
Question
You are monitoring the performance of a server in the IT infrastructure. There is a known issue
on the server that generates events deemed unnecessary but the event notification cannot be
turned off.
Options:
Answer
Option 1: This option is incorrect. This event is a result of a known issue and should not be
passed to the management tool for further resolution. If passed, a new issue is logged leading
to duplication of effort.
Option 2: This option is incorrect. This event is a result of a known issue and should not be
passed to the management tool for further resolution.
Option 3: This option is correct. The step – filtering event notifications – filters the event
notifications so that already-known issues are not passed to the management tool for
resolution.
Correct answer(s):
Question
A segment of the network in your organization stops responding to routine requests. This
behavior results in reduced network functionality.
Options:
1. Informational
2. Exception
3. Warning
Answer
Option 1: This option is incorrect. The event is not informational because it requires an
immediate response. The event cannot be treated just as data for investigations and statistics.
Option 2: This option is correct. This event is an exception. An abnormal change in the IT
infrastructure impacts functionality of the network.
Option 3: This option is incorrect. This event is not a warning because it requires immediate
response and intervention.
Correct answer(s):
2. Exception
The sixth step in the Event Management process involves correlating events.
Assume that two seemingly independent events occur on a server – a cluster fails and the network is down.
If the cluster fails in part because the network is down, and the two events occur in a sequence, the events
may be correlated. This correlation enables you to determine a response for the events, the significance
you attach to the events, and the order of priority in which the events should be resolved.
Event correlation, usually implemented by management tools such as correlation engines, checks the
event against the business rules of the organization.
Business rules are a set of technical conditions and rules that are checked in a specific order. The
business rules help determine the exact meaning of the event and how it can be resolved. This in turn
can help you understand the level and type of impact the event can have on the business.
Correlation engines are configured based on the organization's business rules, performance standards
stated during Service Design, and operating environment-specific inputs.
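A correlation engine that checks business rules in a specific order can be sketched as below. This is a simplified illustration, not how any particular product works: the rule functions, the event names, and the 60-second window are all assumptions.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Event = Tuple[str, int]                       # (kind, timestamp in seconds)
Rule = Callable[[Sequence[Event]], Optional[str]]

WINDOW_SECONDS = 60                           # assumed correlation window


def network_then_cluster(events: Sequence[Event]) -> Optional[str]:
    # Rule 1: a cluster failure shortly after a network outage is treated as related.
    times = {kind: timestamp for kind, timestamp in events}
    if "network_down" in times and "cluster_failed" in times:
        gap = times["cluster_failed"] - times["network_down"]
        if 0 <= gap <= WINDOW_SECONDS:
            return "resolve network_down first; cluster_failed is a likely consequence"
    return None


def cluster_alone(events: Sequence[Event]) -> Optional[str]:
    # Rule 2: a cluster failure with no related outage is handled on its own.
    if any(kind == "cluster_failed" for kind, _ in events):
        return "investigate cluster_failed"
    return None


RULES: List[Rule] = [network_then_cluster, cluster_alone]   # checked in this order


def correlate(events: Sequence[Event]) -> str:
    """Return the outcome of the first business rule that matches."""
    for rule in RULES:
        outcome = rule(events)
        if outcome is not None:
            return outcome
    return "log only"
```

Because the rules are checked in order, the more specific rule about related events takes priority over the general rule for a cluster failure.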
Depending on the business rules in an organization, a correlation engine can be configured to consider
factors such as
The seventh step in the Event Management process involves triggering a response for the events
identified. Responses are initiated by triggers after the correlation engine identifies an event.
For example, scripts can perform a specific set of actions, such as submitting batch jobs. In this case,
the trigger is the script.
Triggers are organization specific. For example, some organizations can use pagers to communicate the
occurrence of an event to an individual or a team. In this case, the pager is the trigger.
Different types of triggers are available, based on the task that each trigger initiates.
incident triggers
When an incident – such as an application stopping on a server – occurs, incident triggers
generate an incident record and initiate the Incident Management process.
change triggers, and
Say the system generates the disk-full event and the remedy requires an increase in
server capacity. Change triggers generate a Request for Change, abbreviated to RFC. The
RFC, in turn, initiates the Change Management process.
database triggers
Database triggers are commonly used to implement user-access restriction to databases.
For example, if an unauthorized user tries accessing a server, a database trigger can be
used to deny access.
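The three trigger types can be pictured as a lookup from an identified event to the task it initiates. The sketch below is hypothetical: the event names, the functions, and the returned messages are invented for illustration and stand in for calls to real Incident Management, Change Management, or database systems.

```python
from typing import Callable, Dict


def incident_trigger(event: str) -> str:
    # Stands in for creating an incident record in Incident Management.
    return f"incident record opened for {event}"


def change_trigger(event: str) -> str:
    # Stands in for generating a Request for Change (RFC).
    return f"RFC raised for {event}"


def database_trigger(event: str) -> str:
    # Stands in for a database-level access restriction.
    return f"access denied for {event}"


# Hypothetical mapping from identified events to the trigger that responds to them.
TRIGGERS: Dict[str, Callable[[str], str]] = {
    "application_stopped": incident_trigger,
    "disk_full": change_trigger,
    "unauthorized_access": database_trigger,
}


def fire_trigger(event: str) -> str:
    """Initiate the response configured for an event, if there is one."""
    trigger = TRIGGERS.get(event)
    if trigger is None:
        return f"no trigger configured for {event}"
    return trigger(event)
```

For example, fire_trigger("disk_full") returns "RFC raised for disk_full", mirroring the change trigger described above.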
The next step in the Event Management process involves selecting the response actions for the event.
Often, a particular event might have multiple response actions.
If a high CPU utilization event occurs on a server, response actions may include human intervention,
event logging for future reference, or generation of an RFC.
Organizations formulate specific rules on how to select the appropriate responses and how to execute
them.
Some events might not always have a direct impact on the IT service. However, it is still essential that
they be addressed for the smooth functioning of the IT service or IT infrastructure.
By no means can you anticipate all such special events and formulate a response action for them. For
example, if a user gains access to a secured area in the network, the response might depend on
individual circumstances.
To handle such special events, such as a user gaining access into a restricted area, you need to chart
out some broad guidelines:
log the incident using an appropriate incident model – in this case, the Security Incident Model
ensure that the event is certified by the Incident Model as an operational issue instead of a
service-related issue, and
escalate the incident to the appropriate group for proper resolution
Follow-up procedures are essential to determine if an event was handled appropriately and resolved
successfully. This occurs in the ninth step of the process – reviewing event handling. Although it can be
difficult, you need to ensure that all significant events are reviewed. While reviewing, in addition to
checking if the event was handled appropriately, you also track the event trends and collect statistics
about event types.
You can automate the review process by polling a server and running a relevant script to check if the
impaired function is restored. However, take care to avoid duplication of efforts – the Incident, Problem,
or Change Management process, if initiated, might have already taken care of the reviews. For example,
if an event is generated when a service stops, you can poll the server and find if it is restored and
functioning.
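An automated review of this kind can be sketched as follows. The function names are hypothetical, and the is_service_running callable stands in for whatever check your environment provides to poll the server.

```python
from typing import Callable


def review_event(is_service_running: Callable[[], bool],
                 handled_by_other_process: bool) -> str:
    """Review a service-stopped event without duplicating another process's review."""
    if handled_by_other_process:
        # Incident, Problem, or Change Management has already taken care of the review.
        return "skipped: reviewed by the initiated process"
    # Poll the server to find out whether the impaired function is restored.
    return "restored" if is_service_running() else "still impaired"


# Usage with stand-in checks; a real check would query the server.
outcome_ok = review_event(lambda: True, handled_by_other_process=False)
outcome_down = review_event(lambda: False, handled_by_other_process=False)
outcome_skipped = review_event(lambda: True, handled_by_other_process=True)
```

The check for another process comes first so that the server is not polled when the review has already been done elsewhere.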
The output from the review is often used to evaluate, audit, and improve the Event Management
process.
In the final step of the Event Management process, all events that are successfully resolved are closed.
However, events that are linked to other open events can be closed only when the linked events are
resolved successfully.
Some events, such as informational events, do not have any status attached. Other events, such as
events configured with an automatic response, are closed when a second event is raised.
If an event generates an incident, problem, or change, the event should be closed only after the
corresponding process initiated is formally completed and closed.
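The closure rules can be expressed as a single check. This is a minimal sketch: the EventRecord class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventRecord:
    """Hypothetical record of an event and the things its closure depends on."""
    name: str
    resolved: bool = False
    linked: List["EventRecord"] = field(default_factory=list)
    # True when no incident, problem, or change was initiated, or when the
    # initiated process has been formally completed and closed.
    initiated_process_closed: bool = True


def can_close(event: EventRecord) -> bool:
    """An event closes only when it, its linked events, and its process are done."""
    return (
        event.resolved
        and event.initiated_process_closed
        and all(other.resolved for other in event.linked)
    )


network = EventRecord("network_down")
cluster = EventRecord("cluster_failed", resolved=True, linked=[network])
closable_before = can_close(cluster)   # the linked network event is still open
network.resolved = True
closable_after = can_close(cluster)    # the linked event is now resolved
```

Here the cluster event is resolved, but it cannot be closed until the linked network event is also resolved.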
Question
A network card on a router is not performing as programmed. The system has detected and
filtered this event.
Based on the Event Management process, sequence the last five steps in this case.
Options:
Answer
Correct answer(s):
Question
The status of one of the routers in your network is showing as down and a notification is
immediately sent to the router support team.
Options:
Answer
Option 1: This option is incorrect. In this step, events are correlated with response actions.
The alert is raised in this step.
Option 2: This option is correct. After an event is recognized, a response needs to be initiated.
Specific triggers are used in this step to initiate the response action.
Option 3: This option is incorrect. In this step, all possible response actions are evaluated and
a response is implemented. The alert is sent to the IT support staff in this step.
Option 4: This option is incorrect. This step involves reviewing the action taken to resolve the
issues and does not generate the alerts.
Correct answer(s):
3. Summary
Effective deployment of an IT service relies on the presence of an efficient Event Management process.
The process involves the creation and handling of events while monitoring the status of all the
components in the IT infrastructure on which the IT service is deployed.
The Event Management process consists of ten steps. In the first step, the event occurs. Next, the CI or
the IT service generates notifications for the events that occurred, and communicates them through
polling and event notifications. Event notifications are then detected and filtered if necessary.
In the fifth step of the process, events are classified based on their significance and then correlated.
Next, the event response is initiated using triggers. In the next step, the appropriate response action or
combination of actions is selected. Event handling is reviewed to ensure that all events are resolved
successfully, and finally the events are closed.
The Components of Event Management
Learning Objectives
After completing this topic, you should be able to
identify examples of triggers
identify processes with which Event Management interfaces
The trigger type – exceptions to predefined level of CI performance – relies on events from CIs in the IT
infrastructure.
Assume that the OLA for a service states that the CPU usage of the server should be below a specific
threshold. If, for any reason, the CPU usage increases beyond the threshold, the Event Management
process is initiated. Increase in CPU usage beyond the threshold can result in the server hanging or
restarting, which in turn can lead to loss of data and network connection.
For this reason, CPU usage is monitored continuously, and if it increases beyond the threshold,
corresponding events are generated and Event Management is triggered. This ensures timely
intervention and initiation of remedial action if required.
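Continuous monitoring against a threshold can be sketched as below. The 85% value is a hypothetical OLA threshold, and raising an event only when usage crosses the threshold, rather than for every sample above it, is a design choice made for this sketch.

```python
from typing import Iterable, List

CPU_THRESHOLD = 85.0   # hypothetical threshold taken from an OLA


def threshold_events(samples: Iterable[float],
                     threshold: float = CPU_THRESHOLD) -> List[str]:
    """Generate one event each time CPU usage rises above the threshold."""
    events = []
    above = False
    for index, usage in enumerate(samples):
        if usage > threshold and not above:
            events.append(f"sample {index}: CPU usage {usage}% exceeded {threshold}%")
            above = True
        elif usage <= threshold:
            # Usage is back within limits, so the next crossing raises a new event.
            above = False
    return events


cpu_events = threshold_events([50.0, 90.0, 95.0, 60.0, 88.0])
```

In this run, events are generated for samples 1 and 4 only. Sample 2 is also above the threshold, but it belongs to the same breach as sample 1.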
Question
The network suddenly goes down in your IT infrastructure. This generates an event and in
response to the event, a service ticket is automatically raised in the IT Helpdesk tool.
What type of trigger initiates the Event Management process in this situation?
Options:
Answer
Option 1: This option is correct. In this situation, the network going down is an exception to
the predefined level of network performance. So the Event Management process is initiated by
this type of trigger.
Option 2: This option is incorrect. In this situation, the event doesn't involve an exception to an
automated procedure or process. So this is not the trigger type that initiates Event
Management in this situation.
Option 3: This option is incorrect. In this situation, the event doesn't involve completion of an
automated task. So this is not the trigger type that initiates Event Management in this situation.
Option 4: This option is incorrect. In this situation, the event doesn't involve any device,
database, or application reaching predefined threshold identification. So this is not the trigger
type that initiates Event Management in this situation.
Correct answer(s):
Exceptions to automated procedures are also common in IT infrastructure. Say OS patches run from a
centralized server aren't installed on a few servers in the network, and without the OS patches some
applications on those servers cannot function. In that case, events need to be generated to trigger Event
Management.
Similarly, an exception can occur during the automated process for checking the completion of a
hardware update. If the update is incomplete, events can be generated to trigger Event Management.
The process can then ensure that appropriate action is taken to fix the issue.
Sometimes, triggering the Event Management process is helpful in the case of interconnected
automated procedures. Say an exception occurs in one of the procedures and the other procedures
cannot be run. You can ensure that the other procedures are signaled to stop by the Event Management
process.
Say a service is not functioning as per the specifications, for example, a customer is unable to generate
reports from an application. This is a typical example of an exception in a business process. Because
this can lead to customer dissatisfaction, events are generated and the Event Management process is
triggered to resolve and prevent the future occurrence of the condition.
Similarly, assume the business process in an organization requires that employees use a management
information system tool to enter time utilization details. If a specific user cannot log in and enter details,
an event is generated and Event Management is triggered. Incomplete data can lead to incorrect data
analysis and is an exception to the business process.
Completion of automated tasks or jobs also triggers Event Management by generating events.
In some cases, the status of certain devices might need to change regularly. If the status doesn't
change, the device isn't functioning properly. For example, a proxy server needs to show data flowing in
the network. During data flow, the status changes frequently.
In both cases, whenever there is a status change, events are generated and Event Management is
triggered.
Tracking access of an application or database in an IT infrastructure can often reveal security threats. In
such cases, events are generated when special applications or restricted databases are accessed.
For example, only a few employees in an organization are allowed to access the HR database. If an
unauthorized user or an automated procedure tries to access it, events are generated and Event
Management is triggered.
Similarly, if unauthorized users or procedures access the proxy server or the management information
system server, Event Management is triggered.
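Access tracking of this kind can be sketched with a simple allow list. The user names, the resource name, and the function are hypothetical and serve only to illustrate when an event is generated.

```python
from typing import Optional, Set

# Hypothetical list of accounts allowed to access the HR database.
HR_DATABASE_USERS: Set[str] = {"hr_manager", "hr_analyst"}


def access_event(user: str, resource: str = "hr_database") -> Optional[str]:
    """Return an event description for unauthorized access, or None if access is allowed."""
    if user in HR_DATABASE_USERS:
        return None
    return f"unauthorized access attempt on {resource} by {user}"
```

For example, access_event("batch_job_17") returns an event description that can trigger Event Management, while access_event("hr_manager") returns None.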
Event Management is also triggered when an application, device, or database reaches the predefined
performance threshold.
Question
The hard disk space on a critical server on your network is reaching the limit of acceptable
performance.
Options:
Answer
Option 2: This option is incorrect. This is not the type of trigger that initiates Event
Management in this situation because the situation does not involve change in status of the
critical server or the hard disk space on it.
Option 3: This option is incorrect. This is not the type of trigger that initiates Event
Management in this situation because no automated procedures or processes are involved.
Option 4: This option is incorrect. This is not the type of trigger that initiates Event
Management in this situation because the situation does not deal with business processes or
exceptions in them.
Correct answer(s):
1. A device, a database, or an application reaching predefined threshold identification
Question
Your organization uses a proprietary tool to audit all the computers on the network. The tool
records the success or failure of the audit.
Options:
Answer
Option 1: This option is incorrect. This is not the type of trigger that initiates Event
Management in this situation because the CI performance is not monitored. On completion of
the automated audit, an event is generated and the Event Management process records the
status in the appropriate log file.
Option 2: This option is incorrect. This is not the type of trigger that initiates Event
Management in this situation because exceptions to automated procedure are not monitored.
On completion of the automated audit, an event is generated and the Event Management
process records the status in the appropriate log file.
Option 3: This option is correct. This is the type of trigger that initiates Event Management in
this situation. On completion of the automated audit, an event is generated and the Event
Management process records the status in the appropriate log file.
Option 4: This option is incorrect. This is not the type of trigger that initiates Event
Management in this situation because the tool monitors the success of the audit and not the
threshold identification of the device, database, or application. On completion of the automated
audit, an event is generated and the Event Management process records the status in the
appropriate log file.
Correct answer(s):
Event Management can be used to monitor and control such applications and manage situations when
intervention might be required. For example, an unauthorized login to a customer account may indicate
a security breach that requires intervention. In such cases, interfacing with Event Management triggers
the process and enables you to monitor and control the issue.
Problem, Change, and Incident Management processes can use events generated by Event
Management to manage, monitor, and control problems, changes, and incidents.
Capacity Management ensures optimal utilization of the resources available in the IT infrastructure for
delivering the service and meeting the business requirements of the organization. Availability
Management ensures that IT services are available at all times.
The interface between Capacity and Availability Management and Event Management is mutually
beneficial. Capacity and Availability Management help identify important events, set event
thresholds, and define the response each event should elicit.
In return, Event Management helps improve the performance and availability of services. This is ensured
by responding to events when they occur and by reporting and analyzing events and their patterns against
the targets defined in SLAs. The analysis helps identify areas of improvement in IT infrastructure design or
operation.
Question
The server in your network suddenly generates the disk full alert. This is the first time the
server is generating this alert. You plan to resolve the event later.
What is the process type that Event Management interfaces with in this situation?
Options:
1. Incident Management
2. Problem Management
3. Asset Management
4. Process Management
Answer
Option 1: This option is correct. Event Management interfaces with Incident Management to
control and manage an interruption to, or a reduction in the quality of, an IT service.
Option 2: This option is incorrect. In this scenario, the event is an incident occurring for the
first time, so Event Management does not interface with Problem Management.
Option 3: This option is incorrect. Event Management does not interface with Asset
Management in this situation because the incident does not involve managing assets.
Option 4: This option is incorrect. Event Management does not interface with Process
Management. Some of the process types Event Management interfaces with are Configuration
Management, Asset Management, and Knowledge Management.
Correct answer(s):
1. Incident Management
Configuration Management is used to maintain and manage information about the CIs used in the IT
infrastructure. Configuration Management uses events to monitor the status of CIs. These events can
then be compared with the baselines set in the Configuration Management System, abbreviated to CMS,
to identify unauthorized changes in the status of CIs.
Assume that in the mail server, the baseline size limit for individual mailboxes is 85 MB. According to the
CMS, the size limit can increase up to 95 MB. However, if the mailbox size increases any further, the
Event Management process generates events to identify how and when the threshold was exceeded
and what needs to be done.
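The mailbox example can be sketched in code. This is a minimal illustration and not part of any Event Management tool: the names check_mailbox and MailboxEvent are hypothetical, and treating sizes between the baseline and the limit as a warning is an assumption layered on the example.

```python
from dataclasses import dataclass

BASELINE_MB = 85  # baseline size limit from the example
MAX_MB = 95       # upper limit recorded in the CMS, from the example


@dataclass
class MailboxEvent:
    mailbox: str
    size_mb: float
    category: str  # "information", "warning", or "exception"


def check_mailbox(mailbox: str, size_mb: float) -> MailboxEvent:
    """Classify a mailbox size against the baseline and the CMS limit."""
    if size_mb > MAX_MB:
        category = "exception"    # limit exceeded, so action is needed
    elif size_mb > BASELINE_MB:
        category = "warning"      # assumption: above baseline, within limit
    else:
        category = "information"  # normal operation
    return MailboxEvent(mailbox, size_mb, category)
```

A mailbox of 96 MB would be reported as an exception, while one of 90 MB would only raise a warning under this assumed rule.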
For example, an organization with many teams can move assets across these teams. Asset
Management stores various information, such as the upgrades performed on a computer after it moved
to a particular team, and the configuration of the system when it was used by the previous team.
Knowledge Management helps organizations collect, store, organize, and use domain-specific
information. Event Management can be used in many aspects of Knowledge Management, such as to
analyze the performance patterns that affect business activity. The information collected using events
can then be used in future designing and strategic decisions.
For example, an IT service provider can use Knowledge Management to collect information on the
performance of specific IT services, analyze the data, check how issues were resolved, and suggest
fixes to avoid future occurrences. The information captured can help in planning and designing future
services.
Yet another management process that Event Management interfaces with is Service Level Management,
which monitors service levels. Event Management helps identify any nonconformance to Service Level
Agreements or SLAs early in the process. This helps resolve problems and deflect failures that can
severely impact the service targets in the SLAs.
For example, a service provider that has promised server uptime of 98% can use events to identify
potential threats to server functionality early on. When events are generated, Event Management
addresses the problems with the appropriate response and deflects the failure. Similarly, events can be
used to generate server availability reports for the customer on a regular basis.
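The uptime target in this example can be checked with simple arithmetic. The sketch below assumes downtime is already known in minutes; the function names are hypothetical and the 98% default comes from the example above.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return the percentage of the period in which the server was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def sla_breached(total_minutes: float, downtime_minutes: float,
                 target_percent: float = 98.0) -> bool:
    """True when measured availability falls below the promised target."""
    return availability_percent(total_minutes, downtime_minutes) < target_percent
```

Over a 30-day month (43,200 minutes), a 98% target allows at most 864 minutes of downtime, so tracking downtime events against that budget shows how close the service is to a breach.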
Question
You are part of the IT support staff. You need to find computers for the new hires scheduled to
join in a few days. A specific team just released a few systems, and you want to acquire these
systems and configure them for the new employees.
Options:
Answer
Option 1: This option is incorrect. Capacity and Availability Management help identify
important events, set event threshold identification, and define the response an event should
elicit.
Option 2: This option is correct. Event Management interfaces with Asset Management to
monitor the status of the various assets in the IT infrastructure. Event Management also helps
track the life cycle status changes of assets.
Option 3: This option is incorrect. Knowledge Management helps collect, store, organize, and
use domain-specific information.
Option 4: This option is incorrect. Service Level Management helps resolve problems and
avoid failures that can severely impact the service targets set in the SLAs.
Correct answer(s):
2. Asset Management
Question
Your organization uses a specific auditing tool to monitor the software and hardware used in
each system in the network. Any unauthorized software installed on any system is a security
issue, so Event Management is used to monitor and manage auditing.
Which type of processes does Event Management interface with in this situation?
Options:
Answer
Option 1: This option is correct. Event Management interfaces with business applications or
processes to manage an auditing tool in the IT infrastructure. Because an instance of
unauthorized software on a system is an incident, Event Management interfaces with the
Incident, Problem, and Change Management processes.
Option 2: This option is incorrect. Interfacing with Service Level Management helps resolve
problems and avoid failures that can severely impact the service targets set in the SLAs.
Option 3: This option is correct. Incident, Problem, and Change Management processes can
use events generated by Event Management to manage, monitor, and control problems,
changes, and incidents. Identification of unauthorized software is an incident. In addition, to
manage an auditing tool installed on a system, Event Management interfaces with Business
Applications or Processes.
Option 4: This option is incorrect. Interfacing with Capacity and Availability Management helps
identify important events, set event threshold identification, and define the response each
event should elicit.
Correct answer(s):
1. Business Application or Process
3. Incident, Problem, and Change Management
3. Summary
Event Management can be triggered by events generated from any kind of system, device, application,
network, or process.
Triggers of Event Management can be classified into various types. Some of these are exceptions to
predefined levels of CI performance, exceptions to an automated procedure or process, exceptions in a
business process, completion of an automated task, change in status of a device or database record,
access of an application or database, and a threshold being reached by a device, database, or
application.
Based on these triggers, Event Management interfaces mostly with processes that don't require real-
time monitoring, but do require some intervention. These processes are Business Applications or
Processes; Incident, Problem, and Change Management; Capacity and Availability Management;
Configuration Management; Asset Management; Knowledge Management; and Service Level
Management.
Event Management and Information Management
Learning Objective
After completing this topic, you should be able to
identify key information required in Event Management
With the information available, you can ensure that components in the IT infrastructure are monitored
and managed properly, and that areas of improvement and actions are identified early on.
a managed device
an SNMP agent, and
an SNMP manager, which is the network management system
In the Event Management process, SNMP messages, also called SNMP traps, are sent by services and
SNMP agents running on the monitored devices. The monitoring tools capture the messages and start
the appropriate remedial action.
Say a service on the server stops suddenly. The server – the device being managed – notifies the agent
running on it. The agent generates an SNMP message. This message is sent to the network
management system and the required remedial action is taken. In this case, the response action can be
an automatic restart of the service and creation of a problem record in the Problem Management
System.
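The flow in this example can be modeled as a small simulation. It does not use a real SNMP library or send real SNMP traffic; the Trap, Agent, and ManagementSystem classes are hypothetical stand-ins for the message, the SNMP agent, and the network management system.

```python
from dataclasses import dataclass, field


@dataclass
class Trap:
    device: str
    service: str
    status: str  # for example, "stopped"


@dataclass
class ManagementSystem:
    restarted: list = field(default_factory=list)
    problem_records: list = field(default_factory=list)

    def receive(self, trap: Trap) -> None:
        """Apply the remedial action described in the example."""
        if trap.status == "stopped":
            # Automatic restart of the service (simulated by recording it)
            self.restarted.append((trap.device, trap.service))
            # Creation of a problem record
            self.problem_records.append(
                f"{trap.service} stopped on {trap.device}"
            )


class Agent:
    """Runs on the managed device and reports to the manager."""

    def __init__(self, device: str, manager: ManagementSystem) -> None:
        self.device = device
        self.manager = manager

    def service_stopped(self, service: str) -> None:
        self.manager.receive(Trap(self.device, service, "stopped"))
```

Calling Agent("server-01", manager).service_stopped("mail") would leave one restart entry and one problem record on the manager object.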
Another important source of information in Event Management is the MIBs of the devices in the IT
infrastructure. Each device contains an MIB, which stores device-related information such as the OS
installed on the device, the version of BIOS used, and the configuration of the system parameters.
Before events are generated in the Event Management process, MIBs are queried and the details of
devices are compared against the industry or business standards. And if actual device details do not
match prescribed standards, appropriate events are generated.
Note
MIBs of devices need to be well defined for effective communication between SNMP devices and SNMP
agents. MIBs are often used to decipher SNMP messages.
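Comparing device details against a standard can be pictured as a dictionary comparison. In the sketch below, the MIB contents are represented as a plain Python dictionary rather than queried over SNMP, and the function name and key names are assumptions made for illustration.

```python
def compare_to_standard(device: str, mib: dict, standard: dict) -> list:
    """Return one event description per detail that deviates from the standard."""
    events = []
    for key, expected in standard.items():
        actual = mib.get(key)  # None when the device does not report the detail
        if actual != expected:
            events.append(
                f"{device}: {key} is {actual!r}, expected {expected!r}"
            )
    return events
```

A device whose details match the standard produces an empty list, so no event is generated for it.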
Typically, IT services offered by vendors are run on individual components in the IT infrastructure. Some
vendors provide monitoring tools for these services.
You can install the monitoring tool on the server and use the tool to centrally monitor the functioning of
the service on all the components. The agent software of these monitoring tools often provides relevant
information about the functioning of the services.
Vendor monitoring tools are highly specialized and can be customized to monitor specific details and
record them with detailed descriptions. This helps you generate very specific events and event
notifications and well-defined response actions.
Correlation engines are also vital sources of information in Event Management. They are software
components of network management systems. They enable all network, system, and service events to be
analyzed together to check for any correlation between the events and determine a better response
action.
All required conditions and criteria, often called Business Rules, for correlating the events are fed into
the correlation engine. These conditions are set in accordance with the performance standards stated
during Service Design.
For example, a service failed because a server was temporarily down. Such events examined in
isolation can lead to an incorrect resolution. When you correlate events using correlation engines, you
find that a power supply failure to the server caused the service to fail.
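A business rule of this kind can be sketched as a lookup from a symptom to its probable cause, applied within a time window. The rule table, the five-minute window, and all names below are hypothetical; real correlation engines are considerably richer.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Event(NamedTuple):
    time: datetime
    device: str
    kind: str


# Hypothetical business rule: symptom -> probable cause
RULES = {"service failed": "power supply failure"}


def correlate(symptom: Event, history: list,
              window: timedelta = timedelta(minutes=5)) -> Optional[Event]:
    """Return the most recent earlier event that explains the symptom, if any."""
    cause_kind = RULES.get(symptom.kind)
    if cause_kind is None:
        return None
    candidates = [
        e for e in history
        if e.kind == cause_kind
        and e.device == symptom.device
        and timedelta(0) <= symptom.time - e.time <= window
    ]
    return max(candidates, key=lambda e: e.time) if candidates else None
```

With a power supply failure logged two minutes before the service failure on the same server, correlate returns the power event, pointing the response at the power supply rather than the service.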
Question
Options:
1. Device details stored in the MIB can be used for response selection
2. Device details stored in the MIB can be compared with industry standards for
event generation
3. MIBs are used to understand SNMP messages
4. MIBs store details of events generated
Answer
Option 1: This option is incorrect. Details stored in the MIB are not used for response selection
but for event generation. Details are compared against the industry standards and appropriate
events are generated.
Option 2: This option is correct. Details stored in the MIB are primarily used for event generation.
Details are compared against the industry standards and appropriate events are generated.
Option 3: This option is incorrect. Although MIBs enable deciphering of SNMP messages, in
Event Management they help evaluate device performance against industry standards. This
serves as the basis for event generation.
Option 4: This option is incorrect. MIBs do not store details of the events generated. MIBs
store details about devices. Details stored in an MIB are primarily used for event generation.
Correct answer(s):
2. Device details stored in the MIB can be compared with industry standards for event
generation
After an event is detected by Event Management, it is recorded as an event record in the Event
Management tool.
The contents of the event record and its format vary depending on the Event Management tool and
component.
To be used effectively for analysis, however, an event record should contain at least the following fields:
device
Details about the device that generated the event must be stored. These details help
prioritize events and determine the appropriate response. For example, if the event is
generated by a server, the response might have a different priority than if the event is
generated by a printer.
component
An event can be generated by individual components on the device. Component details
help you analyze the event effectively and determine the appropriate response. For
example, if a problem on a server is caused by the hard-disk drive, it helps to have details
about that particular component.
type of failure
The complete description of the event is stored in the type of failure field. Various types of
events can be generated for each device and component. Without a clear event
description, it is difficult to select the appropriate response action. Established systems use
unique identification numbers and standard descriptions for events. This enhances clarity
and enables automation of the response.
date and time
Storing the date and time at which the event occurred helps you understand the
significance of the events. You can also use the details to correlate events and check for
patterns. Suppose a cluster failure event occurs in a server. From the date and time
details, you realize that a network down event occurred just before that, causing the cluster
failure. This correlation helps you choose the correct response.
parameters in an exception, and
In response to some events, you might need to reset or configure certain parameters on a
device or software. In such cases, it is essential that parameter details are stored in the
event record. For example, if a critical server reaches a performance threshold, a
parameter may need to be reset. Here it is important for the event record to store
parameter details.
value
For events in which parameters need to be reset or configured, details of the value that
needs to be configured are also stored in the event record. This is common for well-defined
events. Storing the value helps avoid ambiguity, enables automation of event response,
and enables eliciting a timely response.
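The six fields above can be pictured as a simple record type. This is a sketch only: actual formats vary by tool, as noted earlier, and the class name, field names, and helper method are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class EventRecord:
    device: str                 # device that generated the event
    component: str              # component on the device
    type_of_failure: str        # full description of the event
    date_and_time: datetime     # when the event occurred
    parameter_in_exception: Optional[str] = None  # only for some events
    value: Optional[str] = None                   # value to configure

    def supports_automated_reset(self) -> bool:
        """True when both the parameter and its target value are recorded."""
        return self.parameter_in_exception is not None and self.value is not None
```

Making the last two fields optional reflects that they apply only to events in which a parameter needs to be reset or configured.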
Question
What is the important information that event records of all types need to store?
Options:
Answer
Option 1: This option is correct. The details of the device and the components that generated
the event are essential for prioritizing the events and deciding on the appropriate response
action.
Option 2: This option is correct. A clear and detailed description of the event is essential to
ensure that events are resolved appropriately.
Option 3: This option is correct. If events involve resetting or tuning parameters, event records
should contain details of the parameters that are reported in the exception.
Option 4: This option is correct. For well-defined events that involve resetting or tuning
parameters, event records should contain the values that need to be used.
Option 5: This option is correct. Event records should always contain the date and time when
the event was received. This helps in prioritizing the events and event correlation.
Option 6: This option is incorrect. Event records do not contain details of the response action
because every event must be analyzed before the response is selected.
Option 7: This option is incorrect. Event records do not contain details of the correlated
events. However, the date and time details can be used to correlate events.
Correct answer(s):
Question
On a server that provides a variety of services, one of the services has stopped running. The
monitoring tool on the server starts generating events. To select the appropriate response
action for the event, you need to identify the exact service that stopped.
Which of the following key information in Event Management will you need?
Options:
Answer
Option 1: This option is correct. Event records store details about the device and the
component that generated the event. These details can be used to identify the service that
stopped and enable you to select an appropriate response action.
Option 2: This option is incorrect. SNMP messages are used for communicating technical
information about the status of various components in the IT infrastructure. To find the exact
details of the device and component that generated the event, you need the information
available in event records.
Option 3: This option is incorrect. MIBs store device details such as the OS installed on the
device, the version of BIOS used, and the configuration of the system parameters. To find the
exact details of the device and component that generated the event, you need the information
available in event records.
Option 4: This option is incorrect. Correlation engines store the conditions and criteria to be
used for correlating events and do not enable you to identify the service that stopped on the
server. To find the exact details of the device and the component that generated the event, you
need the information available in event records.
Correct answer(s):
2. Summary
Event Management collects relevant information from various sources to ensure effective delivery of IT
services. The sources of information that Event Management requires are SNMP messages, MIBs,
vendor's monitoring tool agent software, correlation engines, and event records.
Event records store information about the events and are used in analysis. The format and content of
the event records vary based on what is being monitored and the tool used. However, to be used for
analysis, event records should contain the device, component, type of failure, date and time, parameters
in exception, and value fields.
Analyzing the Event Management Process
Learning Objectives
After completing this topic, you should be able to
distinguish between the activities in the Event Management process
recognize components and key data required in Event Management
1. Exercise overview
In this topic, you're required to recognize how the Event Management process works and identify the
components and key information required in Event Management.
Question
The monitoring tool on the IT infrastructure regularly queries each computer to collect
information on the functioning of the new services installed. Each computer is also configured
to communicate a basic set of events about the functioning of the services.
In which step in the Event Management process does this communication happen?
Options:
Answer
Option 1: This option is incorrect. In the occurring of the event step, various events occur.
Option 2: This option is correct. Event notifications are generated in this step. In this case,
event notifications are generated when a monitoring tool queries the individual computers or
automatically by the computers.
Option 3: This option is incorrect. In the detecting event notifications step, the event
notifications are detected. Generally, an agent running on each computer detects the events
and sends them to the management tool for further interpretation.
Option 4: This option is incorrect. In the filtering event notifications step, the detected event
notifications are filtered. A first level of correlation is performed by agent software or the
server to which the computers are connected.
Option 5: This option is incorrect. In the categorizing events step, events are organized into
the broad categories – information, warning, or exception – or any organization-specific
subcategories.
Correct answer(s):
Question
After event notifications are generated by the computers in the IT framework, they need to be
analyzed and identified as information, warning, or exception.
Options:
Answer
Option 1: This option is incorrect. In the occurring of the event step, various events occur.
Option 2: This option is incorrect. In this step, event notifications are generated.
Option 3: This option is incorrect. In this step, the event notifications generated are detected.
Option 4: This option is incorrect. In this step, the detected event notifications are filtered.
Option 5: This option is correct. In this step, events are categorized into the broad categories
– information, warning, or exception – or any organization-specific subcategories.
Correct answer(s):
5. Categorizing events
Question
What happens in the next step – Correlating events with response actions – in the Event
Management process?
Options:
Answer
Option 1: This option is correct. In the correlating events with response actions step of the
Event Management process, event notifications are analyzed to determine the meaning of the
event.
Option 2: This option is correct. In the correlating events with response actions step of the
Event Management process, the events are analyzed and compared with the business rules
by the correlation engine to identify correlating events.
Option 3: This option is incorrect. Filtering of events happens in the filtering event notifications
step of the Event Management process. While filtering, based on the event, you either send
the notification to a management tool or ignore the event.
Option 4: This option is incorrect. Response actions are selected in the selecting response
actions for the event step. There can be multiple responses to a particular event notification
and the combination of responses is selected in this step.
Option 5: This option is incorrect. Event notifications are detected in the detecting event
notifications step. Generally, an agent running on each computer detects the events and sends
them to the management tool for interpretation.
Correct answer(s):
Question
You've configured the devices in the IT infrastructure so that whenever a service stops, it is
restarted automatically.
Determine what happens after the automated response for this event is generated.
Options:
Answer
Option 1: This option is correct. In the selecting response actions for the event step in the
Event Management process, appropriate actions are initiated. Successful completion of the
automatic restart is also evaluated.
Option 2: This option is incorrect. Based on their significance, events are categorized into
broad categories – information, warning, or exception – or any organization-specific
subcategories.
Option 3: This option is incorrect. Filtering of events occurs in the fourth step of the Event
Management process. While filtering, based on the event, you either send the notification to a
management tool or ignore the event.
Option 4: This option is incorrect. Event notifications are detected in the third step. Generally,
an agent running on each computer detects the events and sends them to the management tool
for interpretation.
Correct answer(s):
Question
Each new service on each computer is configured so that whenever an unauthorized user or
application attempts to access the service, an event notification is generated.
Options:
1. A device, a database, or an application reaching predefined threshold
identification
2. Access of an application or database
3. Change in status of a device or database record
4. Exceptions to an automated procedure or process
Answer
Option 1: This option is incorrect. The scenario does not involve any device, database, or
application reaching predefined threshold identification.
Option 2: This option is correct. The trigger that initiates Event Management in this scenario is
access of an application or database. Whenever an unauthorized user or application accesses
a service on a computer, an event notification is generated.
Option 3: This option is incorrect. In the given scenario, the status of the service or the
computer does not change and so this cannot initiate the Event Management process.
Option 4: This option is incorrect. In the given scenario there are no exceptions to automated
procedures occurring and so this cannot initiate the Event Management process.
Correct answer(s):
Question
The project team requires some test machines to be installed temporarily for acceptance
testing of their IT service. You need to identify the computers that can be spared for this
temporary period and configure them appropriately.
What is the process with which Event Management interfaces in this scenario?
Options:
1. Configuration Management
2. Asset Management
3. Knowledge Management
4. Capacity and Availability Management
Answer
Option 2: This option is correct. Asset Management can use events to monitor the status
change of the various assets in the IT infrastructure. To identify the computers that can be
configured temporarily for the acceptance testing, Event Management interfaces with the
Asset Management process in this scenario.
Option 3: This option is incorrect. Event Management can be used in many aspects of
Knowledge Management, such as to analyze the patterns of performance that affect business
activity.
Option 4: This option is incorrect. Capacity and Availability Management help identify events
that are important, set threshold identification for the events, and define the response each
event should elicit.
Correct answer(s):
2. Asset Management
Question
One of the services suddenly stops on a computer in the IT infrastructure. The agent software
generates a message that is sent to the network management system. What is the key
information source in this scenario?
Options:
1. SNMP messages
2. Management Information Bases or MIBs
3. Correlation engines
4. Event records
Answer
Option 1: This option is correct. In the Event Management process, SNMP messages are sent
by services and SNMP agents running on the devices. The monitoring tools capture these
messages.
Option 2: This option is incorrect. In the Event Management process, MIBs are queried for
information on the devices. These can be compared against industry standards for generating
appropriate events.
Option 3: This option is incorrect. In the Event Management process, correlation engines
enable event correlation based on business rules and criteria.
Option 4: This option is incorrect. Event records store details about the events and are used in
analysis.
Correct answer(s):
1. SNMP messages
Question
You are monitoring the services on each computer from a central server. You find a pattern in
the order of a few events and want to analyze it further. Determine the ideal source of
information you use in this case.
Options:
1. Event records
2. Correlation engines
3. SNMP messages
4. Management Information Bases (MIBs)
Answer
Option 1: This option is incorrect. Event records can be used to analyze individual events.
However, to analyze a series of events and understand the pattern behind them, the event record
details might not be sufficient.
Option 2: This option is correct. In the Event Management process, correlation engines
enable event correlation based on business rules and criteria. They identify correlation
between events by analyzing various aspects of the events.
Option 3: This option is incorrect. In the Event Management process, SNMP messages are
sent by services and SNMP agents running on the devices. The monitoring tools capture these
messages.
Option 4: This option is incorrect. In the Event Management process, MIBs are queried for
information on the devices. These can be compared against industry standards for generating
appropriate events.
Correct answer(s):
2. Correlation engines
Event Management Critical Success Factors
Learning Objectives
After completing this topic, you should be able to
explain the approach to service measurement in Event Management
outline how to build a Service Measurement Framework for Event Management
explain what service measures should be defined in Event Management
specify the metrics used to measure the Event Management process
Events are status messages generated from applications, network, and systems management platforms.
They're commonly generated in two instances: when a measurement tool senses a threshold being met
and when an error condition occurs.
Both types of event are important to track the performance of your systems and services. Every time a
daily productivity requirement of a process cycle is met, a threshold is met and an event is generated.
An event generated by an error condition is typically caused by a service or network outage.
Managing events can be challenging because a large volume of events can be generated in your system
at any given time. This can be due to the stream impact of normal process transactions, as well as
actual problem events. Locating problems that need to be resolved can be difficult because of the huge
influx of information. To fix this issue, you need to generate well-correlated Event Management metrics.
Event Management metrics enable you to improve and develop the overall IT structure by helping you
analyze events and identify their causes by comparing the metrics against critical success factors. You
can do this by using specialized Event Management Correlation software. The functions of this software
include event correlation, impact analysis, and root cause analysis. These functions help filter false-
positive messages from real events and help you focus on the events that require action.
Event Management Correlation software helps you detect and analyze events based on tried-and-tested
rules, models, and policies. First, this technology helps you analyze the event log to identify the root
cause of the event. Then it enables you to assess these causes to measure the impact of the event.
Consider events that are repeatedly logged due to the installation of a new application. In this case,
using the correlation software, you would first analyze the event log to identify the root cause –
installation of a new application. Then you would analyze the impact of this cause on your project cycle –
an unusable application due to an installation error. Finally, you would suggest ways to resolve this issue, such
as fixing the installation errors or reinstalling the application.
The Event Management technology also supports the effective functioning of cross-domain IT
infrastructure. By doing this, it supports IT and areas of business including airlines, banks, and
educational institutions. In addition, this technology supports the Continual Service Improvement, or CSI,
model by generating details on
impact of availability and
exceeded performance thresholds, in relation to capacity or utilization
Question
Options:
Answer
Option 1: This option is correct. The Event Management Correlation software performs root
cause analysis to filter false-positive messages. In addition to this, it also detects and analyzes
events based on tried-and-tested rules, models, and policies.
Option 2: This option is correct. By providing the details on availability impact and exceeded
performance thresholds, the Event Management technology supports the Continual Service
Improvement or CSI model.
Option 3: This option is incorrect. Event Management technology helps differentiate false-
positive messages from actual problem events. This is done using functions such as event
correlation, impact analysis, and root cause analysis. This enables you to focus on events that
require action.
Option 4: This option is incorrect. The Event Management software analyzes the event log to
identify the root cause of the event. Then it assesses this cause to report the impact of the
event on the overall process.
Correct answer(s):
To strengthen your IT infrastructure and support other business services, you need to measure your IT
services using metrics. To identify which metrics to use, you should begin monitoring your IT services
using basic measurements.
availability of service
reliability of service, and
performance of service
You can measure your services against any of the three basic service measurements or against a
combination of them, according to your requirements. Also, you can measure your services using other
variables such as viability, stability, and support for quick upgrades.
Often, services are assessed against a few of their components that are most prone to errors. This
doesn't indicate the actual level of service the customer experiences. So when you measure IT services,
you need to assess and report services completely – end-to-end – and not as separate components,
such as servers or applications. This helps you monitor all aspects of a service and identify most issues
reported against it.
Consider that you're measuring a web service. You're assessing all aspects of your service such as
server accessibility, application functionality, and payment gateway security. You realize that one of
these aspects has failed and the customers are unable to use the service. However, the service
providers aren't aware of this because they've monitored only selected aspects of the service, which are
functioning properly. You can fix this by assessing every service aspect.
It's possible to measure IT services against different levels of systems and components. And with
services being measured beyond their component level, organizations now view IT services beyond their
physical infrastructure. Services are assessed at all levels – physical, service, and system.
You can do this by taking all your individual service measurements and combining them to simulate a
real customer experience. Consider this example, in which the availability of a search engine is broken
down into services, systems, sub systems, and component levels with examples of each.
Graphic
In this example, a Search Engine is listed at the Service level, while Search Engine 1 and
Search Engine 2 are listed at the System level. At the Sub-system level, PageRank, Doc
Index, and Anchor text are listed and these are connected with the system – Search Engine 1.
At the component level, Links, Indexer, and URL Resolver are listed. Links is connected to
PageRank, Indexer to Doc Index, and URL Resolver to Anchor text and they are connected to
Server 1, Server 2, and Server 3, respectively.
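One way to combine the individual measurements from the graphic into a single end-to-end figure can be sketched as follows. This is a minimal illustration rather than part of the course: the availability values are made-up samples, and the sketch assumes that every part in a chain is a serial, independent dependency, so availabilities multiply.

```python
from math import prod

# Hypothetical measured availabilities (fraction of time up) for Search Engine 1.
# Each sub-system depends on one component, which runs on one server.
chains = {
    "PageRank": {"Links": 0.999, "Server 1": 0.995},
    "Doc Index": {"Indexer": 0.998, "Server 2": 0.999},
    "Anchor text": {"URL Resolver": 0.997, "Server 3": 0.990},
}

def chain_availability(parts):
    """Serial dependency (assumed): the chain is up only when every part is up."""
    return prod(parts.values())

def system_availability(all_chains):
    """Assumes the system needs all of its sub-systems at the same time."""
    return prod(chain_availability(parts) for parts in all_chains.values())

subsystem = {name: chain_availability(parts) for name, parts in chains.items()}
end_to_end = system_availability(chains)

for name, value in subsystem.items():
    print(f"{name}: {value:.4f}")
print(f"Search Engine 1 end-to-end: {end_to_end:.4f}")
```

Under these assumptions the end-to-end figure is lower than any single component figure, which is why reporting on components alone can overstate the level of service the customer experiences.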
Note
Question
You're the service provider of a web service. Over the past few days, customers have reported
regular performance hiccups while using the service. However, your web servers have been
up and running properly during the reported period. To investigate the cause, you plan a
service measurement. How would you approach measuring this service?
Options:
Answer
Option 1: This option is incorrect. When you assess selected components of your service, you
don't experience the true level of service. To understand what customers require from the
service, it should be assessed completely.
Option 2: This option is correct. Services should be assessed and reported on completely,
end-to-end. This means using service measurements on every aspect of the service.
Option 3: This option is correct. Services can be measured against different levels of systems
and components to create reports. This can be executed by taking their individual
measurements and combining them to simulate a real customer experience.
Option 4: This option is incorrect. If you want to assess a service as the customer would view
it, it should be assessed completely.
Correct answer(s):
To build a Service Measurement Framework for your organization, you need to follow certain best
practices. During the initial stages of developing the Framework, you need to follow these practices:
evaluate your business processes and identify those critical to service delivery
understand that IT goals and objectives must complement your business goals and objectives
establish a strong bond between the strategic, tactical, and operational goals and objectives
of your organization, and
realize that the framework is used to assess past events, prepare the organization for the
future, and improve the overall process
After identifying the best practices, you need to determine which aspects of your IT services need to be
monitored and assessed and identify the basic parameters to be measured. These basic parameters
include
services
components
Service Management processes
process activities, and
output
While building the Service Measurement Framework, you must remember that it should be balanced,
unbiased, and able to withstand change. To achieve this, you should select a combination of measures
that will provide an accurate and balanced perspective.
To build your Service Measurement Framework, you need to follow a few key steps.
Understanding origins
First, you need to understand the origins. This includes defining success criteria and
identifying ways to check if you've achieved it. This initial step defines your target and
enables you to define action points for your service measurement process.
Choosing what to measure
You need to choose the measurements that will be of maximum use to help you make
strategic, tactical, and operational decisions.
To set these targets, you can employ Service Level Agreements or SLAs, service level
targets, or objectives.
Defining procedures and policies
In this step, you define procedures and policies for service measurement and identify tools
to execute it. Defining procedures includes determining the roles and responsibilities of the
staff and defining policies.
This step also enables you to evaluate service measurement activities and helps you
identify the criteria of initiatives for CSI. By implementing this step, you can also determine
when to raise your current targets for better performance.
When building the Service Measurement Framework, you need to define the individual roles and
responsibilities of your IT staff who are involved in this process. For instance, you need to identify who
defines the measurements and targets for the process before process kick-off
monitors the overall process and performs the actual measuring tasks
gathers the required data
processes the data and arranges it, and
prepares and presents reports to management
Question
You want to improve the quality and accountability of your IT services by setting up a
customized Service Measurement Framework. You've already initiated the process and
identified the measures you want to use. What other key steps would you execute to complete
building the framework?
Options:
1. Understand the origins of the Service Measurement Framework
2. Integrate IT goals and objectives to business goals and objectives
3. Determine the roles and responsibilities of IT staff involved in the process
4. Define the procedures and policies for measuring services
Answer
Option 1: This option is incorrect. Understanding the origins of the Service Measurement
Framework is part of the initial stages of the process. This step is performed before identifying
service measures.
Option 2: This option is incorrect. You would've assessed and integrated IT goals with
business goals during the initial stages.
Option 3: This option is correct. Determining the roles and responsibilities of IT staff is a key
step in building the framework. In this step, you determine who defines and executes the
measures and targets.
Option 4: This option is correct. Defining procedures and policies is a critical step of Service
Measurement Framework. In this step, you define the procedures for service measurement
and identify the required tools.
Correct answer(s):
For example, your organization may want to monitor the trial version of a service before rolling out its
final version. If you want to monitor and assess events for this setup, you may only need to use basic
service measures such as performance and availability. However, if you employ additional service
measures, you may needlessly complicate the entire process.
By defining and identifying appropriate measures for service assessment, you ensure these measures
aid features of Event Management. These features include
There are seven main categories that help organizations assess their business performance.
Productivity
High productivity indicates high performance. So, during service measurement, it's
important to assess the current productivity levels of your organization.
Customer satisfaction
Customer satisfaction helps assess business performance by providing a perceived value
of the IT services provided.
Value chain
Value chain is a business strategy model that enables you to analyze the activities your
organization performs to increase its business value. This model helps indicate the impact
of IT on the functional goals of your business. By measuring this impact, your organization
can assess its overall business performance.
Comparative performance
The comparative performance category deals with comparing business measures or
infrastructure components against internal and external results. This category enables
organizations to verify whether or not their performance meets their business objectives.
Business alignment
By considering the business alignment category, organizations can check if their services,
systems, and applications portfolio align to their business strategy. The more services are
aligned to organizational strategies, the better the business performs.
Investment targeting
Investment targeting helps analyze the impact of IT investments on the cost and revenue
structures of business. It also enables you to evaluate the investment base of your
organization.
Management vision
Management vision helps you understand the senior management's view of the strategic
value of IT and can provide a direction for the future.
After assessing your business performance using specific categories, you can proceed to identify and
implement service measures.
While implementing service measures, you should ensure that the mode of IT service measurement is
integrated with your business needs. For instance, IT service reporting deals with statistical data, while
business service reporting emphasizes customer experience. You need to consider such differences and
bridge the gap between them before you define the service measures.
While defining service measures, you should ensure you choose measurements relevant to your
customers. The most common service measurements that apply to customers are the following:
number of service outages – for example, five outages reported this week
duration of service outages – for example, each of the five outages of the service lasted for 40
minutes, and
impact of service outages on business – for example, the total duration of outages this week
is 200 minutes; this impacted productivity and the revenue target wasn't met
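The arithmetic behind these three customer-facing measurements can be sketched in a few lines. This is a minimal sketch, not a prescribed tool: the outage log is a hypothetical list, and the weekly availability figure assumes a 7 x 24 service window, which your own agreements may define differently.

```python
# Hypothetical outage log for one week: duration of each outage in minutes.
outage_minutes = [40, 40, 40, 40, 40]

outage_count = len(outage_minutes)                 # number of service outages
total_downtime = sum(outage_minutes)               # total duration of outages
average_duration = total_downtime / outage_count   # duration per outage

# Assumed 7 x 24 service hours; adjust to the agreed service window.
week_minutes = 7 * 24 * 60
availability_pct = 100 * (week_minutes - total_downtime) / week_minutes

print(f"outages: {outage_count}")
print(f"total downtime: {total_downtime} minutes")
print(f"average duration: {average_duration:.0f} minutes")
print(f"availability: {availability_pct:.2f}%")
```

The count and the durations come straight from the log. The business impact, such as a missed revenue target, still has to be assessed with the business and isn't something the log alone can tell you.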
After identifying the measures relevant to your customers, you need to consider the common service
measurements that should be defined. They are
service level
The service level measurement considers a number of features such as service, system,
component availability, and transaction. It also considers component and service response
time, on time and on budget service delivery, service quality, and compliance with security
requirements.
Additionally, most Service Level Agreements or SLAs monitor and report some Incident
Management measures. These include Mean Time To Repair or MTTR and Mean Time to
Restore a Service or MTRS.
customer satisfaction
To implement the customer satisfaction measure, you need to conduct regular surveys to
measure and track customer feedback. The Service Desk and Incident Management
teams contribute to this measure by analyzing randomly selected incident tickets.
business impact, and
The business impact measure investigates causes that trigger service disruptions. This
measure also analyzes how these causes impact internal and external business operations
and processes.
supplier performance
Supplier performance is a measurement often employed by organizations that outsource
part of their services to third-party suppliers. This measure verifies that the services offered
by these suppliers are quantifiable and conform to predefined standards.
Question
Your organization wants to selectively define measurements for your business services so you
can assess why the services were disrupted. You also want to use these measures to
determine the service quality of your vendors. What service measures would you define in
Event Management?
Options:
1. Supplier performance
2. Business impact
3. Number of service outages
4. Duration of service outages
Answer
Option 1: This option is correct. The supplier performance measurement verifies whether or
not the services offered by your vendors are quantifiable and conform to predefined standards.
Option 2: This option is correct. The business impact measure identifies the causes for
service disruption and analyzes how they impact your business operations.
Option 3: This option is incorrect. The number of service outages measurement identifies the
number of service outages that occurred over a period of time.
Option 4: This option is incorrect. The duration of service outages helps determine the time for
which a service outage lasted and impacted your services.
Correct answer(s):
1. Supplier performance
2. Business impact
The specific details that Event Management metrics should include are the following:
the number of events by category – for example, the metrics should categorize and report
how many events have occurred due to network failure and how many due to application
failure
the number of events by significance – for example, the metrics should report the occurrence
of high priority events such as server outage and low priority events such as unresponsive
applications
the number and percentage of events that required human intervention and whether or not
this was performed – for example, the metrics should provide the percentage of events that
required IT staff to resolve them and those that were automatically fixed, and
the number and percentage of events that resulted in incidents or changes – for example, an
application overload that caused a network jam indicates that an event has resulted in an
incident or change
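The metrics in this list are simple counts and percentages over the event records. The following is a minimal sketch of how they might be computed, assuming a hypothetical record layout. The field names category, significance, needed_human, human_acted, and led_to are invented for illustration and don't come from any particular Event Management tool.

```python
from collections import Counter

# Hypothetical event records; the field names are assumptions for illustration.
events = [
    {"category": "network", "significance": "high",
     "needed_human": True, "human_acted": True, "led_to": "incident"},
    {"category": "network", "significance": "low",
     "needed_human": False, "human_acted": False, "led_to": None},
    {"category": "application", "significance": "low",
     "needed_human": True, "human_acted": False, "led_to": None},
    {"category": "application", "significance": "high",
     "needed_human": False, "human_acted": False, "led_to": "change"},
]

def percentage(part, whole):
    """Return part as a percentage of whole, or 0.0 when there are no events."""
    return 100 * part / whole if whole else 0.0

total = len(events)
by_category = Counter(e["category"] for e in events)
by_significance = Counter(e["significance"] for e in events)
needed_human = sum(e["needed_human"] for e in events)
human_acted = sum(e["needed_human"] and e["human_acted"] for e in events)
led_to_incident_or_change = sum(e["led_to"] in ("incident", "change") for e in events)

print("by category:", dict(by_category))
print("by significance:", dict(by_significance))
print(f"needed human intervention: {needed_human} ({percentage(needed_human, total):.0f}%)")
print(f"intervention performed: {human_acted} of {needed_human}")
print(f"resulted in incident or change: {percentage(led_to_incident_or_change, total):.0f}%")
```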
Metrics should also specify the number and percentage of events caused by existing problems or known
errors. This may result in a change in priority of the work or the service which caused the error.
Additionally, metrics should include the number and percentage of repeated or duplicated events. By
doing this, you can tune the Correlation Engine to prevent unnecessary event generation or help design
improved event generation functionality in new services.
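Counting repeated events is a useful first step before tuning the Correlation Engine. The sketch below is illustrative only: it assumes that two events are duplicates when they share the same source CI and message, which is a simplification you would adapt to your own event records.

```python
from collections import Counter

# Hypothetical events identified by (source CI, message); this key is an assumption.
events = [
    ("server-1", "disk usage above 90%"),
    ("server-1", "disk usage above 90%"),
    ("server-1", "disk usage above 90%"),
    ("app-7", "login service unresponsive"),
    ("router-2", "link down"),
    ("router-2", "link down"),
]

occurrences = Counter(events)

# Every occurrence after the first one counts as a duplicate.
duplicates = sum(count - 1 for count in occurrences.values())
duplicate_pct = 100 * duplicates / len(events)

# The most repeated event is a candidate for a new suppression or correlation rule.
noisiest, noisiest_count = occurrences.most_common(1)[0]

print(f"duplicates: {duplicates} of {len(events)} ({duplicate_pct:.0f}%)")
print(f"most repeated: {noisiest} x{noisiest_count}")
```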
There are a few other specific details that Event Management metrics should cover:
the number and percentage of events indicating performance issues, which have caused
operational difficulties
the number and percentage of events indicating potential availability issues
the number and percentage of each type of event per platform or application, and
the number and ratio of events compared with the number of incidents
Question
You want to monitor the Event Management process in your organization using specific
metrics. You also want to analyze events by their priority and identify events that cause
operational issues. What metrics would you use to do this?
Options:
Answer
Option 1: This option is correct. To identify events that caused operational issues, you should
use the metric that provides the number and percentage of events indicating performance
issues.
Option 2: This option is correct. To analyze events by their priority, you should employ the
metric that sorts events according to their significance – into high and low priority events.
Option 3: This option is incorrect. The number of events by category metric helps you classify
events by their type. It doesn't help you analyze events by their priority or identify those
causing operational issues.
Option 4: This option is incorrect. The number of events that required human intervention
metric provides the number of events resolved automatically or by the IT staff.
Correct answer(s):
5. Summary
Well-correlated Event Management data can improve the reliability, efficiency, and effectiveness of IT
infrastructure. To identify metrics for Event Management, basic service measurements such as availability,
reliability, and performance are used.
The next step in Event Management, after identifying the basic service measurements, is setting up a
Service Measurement Framework. This framework, with its best practices, key steps, and critical
elements helps facilitate value-added reporting and accurate service measurement.
Identifying the important service measures is a key step in Event Management. Some of the common
service measures include service levels, customer satisfaction, business impact, and supplier
performance.
Another key step is to identify metrics that help measure services. These metrics provide specific
information such as number of events by category and number of events by significance.
Event Management Challenges
Learning Objective
After completing this topic, you should be able to
recognize how to meet the key challenges to effective Event Management
Question
Options:
Answer
Option 1: This option is incorrect. Activities that can divert resources may develop when there
is an issue with rolling out necessary monitoring agents across the entire IT infrastructure. This
will not be the result of setting filtering levels incorrectly.
Option 2: This option is correct. You can prevent your Event Management system from being
flooded with relatively insignificant events by integrating Event Management into all feasible
Service Management processes.
Option 3: This option is correct. If your system does not detect significant events before it's
too late, you can employ the trial and error method of Event Management to identify
unclassified events.
Option 4: This option is incorrect. The purchase and installation of tools may be delayed if the
funds for the project are not acquired. This will not be caused by the incorrect setting of
filtering levels.
Correct answer(s):
Question
What are the specific challenges involved in the Event Management process?
Options:
Answer
Option 1: This option is incorrect. Monitoring events and preparing reports to enable further
analysis of events is a step in the Event Management process. It is not a challenge that may
hinder the process.
Option 2: This option is incorrect. Assessing results continually for consistent functioning of
the system is a part of the Event Management process. It is not a specific challenge that
impacts the process.
Option 3: This option is correct. You can identify sources of funding by preparing a Business
Case that includes the benefits of effective Event Management. The Business Case explains
how to outweigh the cost of setting up Event Management by returning the investments with a
considerable profit.
Option 4: This option is correct. To address the skills challenge, you should budget your
resources properly. You should also schedule adequate time for Event Management
administrators to acquire the necessary skills.
Correct answer(s):
Question
You are a systems administrator and your team is attempting to streamline the Event
Management system in your organization. The main issue with it is that it generates a large
volume of insignificant events such as status messages. These events congest the system
and prevent you from detecting critical events such as server outages. You want to apply the
correct level of filtering for your system. How would you meet this challenge?
Options:
Answer
Option 1: This option is correct. Integrating Event Management into all feasible Service
Management processes helps ensure that only the events relevant to these processes are
monitored and reported.
Option 2: This option is correct. The trial and error method ensures that filters don't fail to
report an unclassified event. Also, it helps the Event Management process to evolve into a
formal and continual process, with regular updates to evaluate the effectiveness of filters.
Option 3: This option is incorrect. Preparing a Business Case with an account of the benefits
of effective Event Management helps acquire funds for your project. It doesn't help apply the
correct level of filtering.
Option 4: This option is incorrect. You can schedule time for Event Management staff to
acquire the necessary skills when there is a shortage of skilled resources. This doesn't help
apply correct filtering level.
Correct answer(s):
1. Designing instrumentation
Event Management is an important evaluation process, which many organizations adopt to monitor the
performance and availability of their IT services.
Event Management helps you to assess the quality and productivity of your services. To do this, it
involves a number of objectives and targets. It is important that you define and determine these
objectives and targets and design the process early. Ideally, they should be defined during the
Availability and Capacity Management processes in the Service Life Cycle. The overall designing of the
Event Management process shouldn't wait until the services are deployed into Operations. It should be
designed during the Service Design phase.
Designing the Event Management process at an early stage of the project doesn't mean that it must be
designed and built by a team of separate and distinct systems developers. Event Management is an
active process that should remain dynamic even after it is designed and deployed.
Also, the day-to-day operations of your service monitoring should invoke additional events, alerts,
priorities, and other improvements. These are fed back into the appropriate phases, such as Service
Strategy and Service Design, through the Continual Improvement process.
To design your Event Management process, you need to first identify the basic design areas.
They are
instrumentation
error messaging
threshold identification, and
event detection and alert mechanisms
Question
Options:
1. Service Strategy
2. Instrumentation
3. Service Design
4. Event detection
Answer
Option 1: This option is incorrect. Service Strategy is a phase in the ITIL® Service Life Cycle.
To design the Event Management process of this Life Cycle, you need to design features such
as instrumentation, event detection, alert messages, error messaging, and threshold
identification.
Option 3: This option is incorrect. Service Design is a phase in the ITIL® Service Life Cycle.
This is performed in parallel with Event Management, which can be designed by configuring
features such as instrumentation and event detection.
Option 4: This option is correct. In addition to event detection, other Event Management
design areas include instrumentation, alert messages, error messaging, and threshold
identification.
Correct answer(s):
2. Instrumentation
4. Event detection
Among the design areas, instrumentation deals with designing how to monitor and control your IT
infrastructure and services. It helps create a basic framework that specifies the aspects of your service
that need to be monitored and how they'll be assessed.
The main function of instrumentation is to define which Configuration Items or CIs need to be monitored
in your Event Management setup. It also helps you analyze how the behavior of CIs can be affected
during the Event Management process.
Instrumentation enables you to make important decisions about your Event Management framework.
Some of these decisions include determining
In addition to enabling you to make decisions, instrumentation helps you design methods to execute
these decisions. Some of the methods for executing your Event Management decisions are the
following:
determine how events will be generated, for instance, whether by assessing the service in
parts or completely
verify whether CIs already include event generation mechanisms as an inherent feature and
which of these should be used
confirm that these mechanisms are sufficient for monitoring the service or if additional
mechanisms are required for a customized evaluation of the service, and
identify the information used to populate the Event record; this information includes the event
log, event history, priority, and frequency of the occurrence, or a combination of these details
More mechanisms that help you execute your Event Management decisions include the following:
verify whether events are detected automatically or only when CIs are polled
take the steps to ensure that events are generated according to your requirements
determine the memory in which events are logged and stored, and
define how supplementary information would be obtained for further event analysis
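The difference between events that are generated automatically and events that are found only when a CI is polled can be sketched as follows. The ConfigurationItem class and its methods are hypothetical names invented for this illustration. Real CIs expose these mechanisms through their own agents and protocols.

```python
from typing import Callable, List

class ConfigurationItem:
    """Hypothetical CI that supports both event generation mechanisms."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        """Register a receiver for automatically generated events."""
        self._listeners.append(listener)

    def fail(self) -> None:
        """Simulate a failure; the CI notifies its listeners by itself."""
        self.healthy = False
        for listener in self._listeners:
            listener(f"{self.name}: failure detected")

    def poll(self) -> str:
        """Report status only when a monitoring tool asks for it."""
        return f"{self.name}: {'ok' if self.healthy else 'failed'}"

event_log: List[str] = []          # where generated events are logged and stored
ci = ConfigurationItem("server-1")
ci.subscribe(event_log.append)

before = ci.poll()                 # polled status before the failure
ci.fail()                          # automatic notification reaches event_log at once
after = ci.poll()                  # a poll sees the failure only when it next runs

print(before, "|", event_log, "|", after)
```

With polling, the delay before a failure is noticed depends on the polling interval, which is one reason this decision belongs in instrumentation design.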
Question
You want to design a customized Event Management framework for your organization. You've
identified its key design areas and want to configure the instrumentation setup. What decisions
would you need to make to do this?
Options:
Answer
Option 1: This option is correct. Determining what needs to be monitored also helps you specify
what you don't need in your IT infrastructure and services. This will impact your overall
approach to instrumentation design.
Option 2: This option is incorrect. Specifying how events would be generated is a method for
executing decisions in instrumentation design. This would help you execute the decisions that
you've already made.
Option 4: This option is correct. By specifying when you want your Event Management setup
to generate events, you can prevent your system from becoming congested with insignificant
events.
Correct answer(s):
Question
You want to design instrumentation for the Event Management process in your company. As
part of designing this setup, you've made the required decisions and identified what needs to
be monitored. Identify the mechanisms that you would adopt to execute these decisions.
Options:
Answer
Option 1: This option is incorrect. Defining the type of information that needs to be
communicated in the event is a decision that you make in instrumentation setup.
Option 2: This option is correct. Verifying whether events are detected automatically or only
when CIs are polled is essential in instrumentation design. It helps you take the steps to
ensure that events are generated as per your requirements.
Option 3: This option is correct. Event records can be populated with details such as event
log, event history, priority, and frequency of occurrence. You need to identify which of these is
required for your Event Management setup.
Option 4: This option is incorrect. The process of determining for whom the messages are
intended is a decision you may need to take when designing instrumentation. This decision
helps establish the roles and responsibilities of employees within the team.
Correct answer(s):
It is critical that all software applications support Event Management as part of their functionality and
that their output is monitored and assessed. To facilitate this, the error messaging functionality
generates specific error messages or codes that pinpoint the exact area of failure. These codes also
suggest what might have caused the failure. This helps you identify why, for example, an
application collapsed, enabling you to resolve the issue as soon as possible.
While monitoring software applications, you should remember that a strong interface exists with every
application design. Because of this, all applications should be carefully coded to generate detailed error
messages at the exact point of failure. You can use these error message logs and codes, along with the
actual event, to investigate and analyze the cause of failure. After you find the cause, these error
messages should help in its swift diagnosis and resolution.
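An application coded to raise a detailed error message at the exact point of failure might look like the sketch below. The AppError class, the PAY-001 and PAY-002 codes, and the charge function are all hypothetical and serve only to illustrate an error that carries a code, the area of failure, and a probable cause.

```python
import logging

logging.basicConfig(format="%(levelname)s %(message)s")
log = logging.getLogger("payments")

class AppError(Exception):
    """Hypothetical application error with a code, failure area, and probable cause."""

    def __init__(self, code: str, area: str, probable_cause: str) -> None:
        super().__init__(f"[{code}] {area}: {probable_cause}")
        self.code = code
        self.area = area
        self.probable_cause = probable_cause

def charge(amount: float, gateway_up: bool = True) -> str:
    """Raise a specific error at the point of failure instead of a generic one."""
    if amount <= 0:
        raise AppError("PAY-001", "input validation", "amount must be positive")
    if not gateway_up:
        raise AppError("PAY-002", "payment gateway", "gateway did not respond")
    return "charged"

last_error = None
try:
    charge(25, gateway_up=False)
except AppError as err:
    # The logged message is what the Event Management tool would pick up.
    log.error("%s", err)
    last_error = err
```

Because the code and the area of failure travel with the error, the log entry can be matched against the actual event during diagnosis.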
Error messaging also plays an important role in the assessment of new applications. For instance,
whenever a new application needs to be monitored and assessed, its event generation functionality and
error messaging features should also be tested in parallel. Because these features are assessed early in
the development phase, they can lead to an improved Event Management setup.
Another important feature of error message design is that it can be incorporated into the application, at
its code level, using the right tools.
Consider that you want to monitor and manage a set of new applications your organization is
developing. To do this, you want to identify key design areas and design the error messaging feature.
However, you don't have the time or the adequate number of programmers to incorporate the error
messaging functionality within the application code.
To resolve this issue, you can build distributed, modular, and dynamic solutions for your applications.
This can be done using tools built with available technologies. These tools help reduce the need for
programmers to manually include error messaging within the code of applications, devices, and service-
driven networks. By facilitating this, these error messaging tools also allow a significant level of
normalization and code-independence in your applications.
Another important design area for Event Management is identifying thresholds. Thresholds are not set
separately during the Event Management process. They're designed and communicated
during the instrumentation phase. Identifying thresholds helps determine the performance level
appropriate for each CI of your service.
Consider that you want to assess the availability of your organization's Management Information System.
You also want to check if all your employees can access it simultaneously.
To do this, you need to first define the actual number of employees who can access the system at any
one point in time. This number is the threshold for the availability of the system. You can then verify if
this system is able to meet this threshold.
When identifying thresholds, you need to remember that they aren't always constant. In most cases,
thresholds are characterized by a set of variables, which vary according to the changing
requirements of your service. For example, the availability of your web service may vary depending upon
factors such as the number of users accessing it simultaneously and the available network bandwidth.
The knowledge to define thresholds for these factors is often gained through experience.
You can acquire this experience by constantly tuning and updating Correlation Engines through the
Continual Service Improvement or CSI process.
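A threshold that depends on variables rather than on a fixed number can be sketched as a small function. The rule below is an assumption made for illustration: it supposes that each user needs about 0.5 Mbps and that the system has a license cap of 500 users. Neither figure comes from the course, and in practice both would be tuned through experience.

```python
def max_concurrent_users(bandwidth_mbps: float,
                         per_user_mbps: float = 0.5,
                         license_cap: int = 500) -> int:
    """Hypothetical rule: the threshold is the smaller of the bandwidth limit and the cap."""
    return min(license_cap, int(bandwidth_mbps / per_user_mbps))

def check(users: int, bandwidth_mbps: float) -> str:
    """Compare current usage with the threshold that applies right now."""
    threshold = max_concurrent_users(bandwidth_mbps)
    return "exception" if users > threshold else "ok"

# The same number of users is acceptable or not depending on available bandwidth.
print(check(250, bandwidth_mbps=100))
print(check(250, bandwidth_mbps=1000))
```

Tuning then means adjusting per_user_mbps or license_cap as experience accumulates, rather than rewriting the check itself.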
Question
You want to design the Event Management process in your organization so that you can
assess your applications and network components. You want to specifically configure the error
messaging functionality. What functions of error messaging would help you design Event
Management specifically for your components?
Options:
Answer
Option 1: This option is correct. Error messages and codes point out the specific area of
failure in your application and suggest the possible cause for the failure. This helps you identify
the reason why the application collapsed and resolve the issue as soon as possible.
Option 2: This option is incorrect. Threshold identification enables you to determine the
performance level of each CI. It is designed during the instrumentation design phase.
Option 4: This option is correct. Error messaging tools are built using available technologies.
These tools allow a significant level of normalization and code-independence in your
applications.
Correct answer(s):
Question
Identify the functions of threshold identification that support the overall Event Management
design in your organization.
Options:
Answer
Option 1: This option is incorrect. The error messaging functionality enables you to identify
why an application collapsed and suggests ways to resolve the issue.
Option 2: This option is correct. Identifying thresholds helps you determine the
level of performance appropriate for each CI. To monitor and manage these CIs, you need to
define their thresholds.
Option 3: This option is correct. Most thresholds vary according to the changing
requirements of the service. Because of this, you need to tune and update the Correlation
Engines regularly using the CSI process.
Option 4: This option is incorrect. The error messaging functionality helps assess new
applications in the Event Management setup. Whenever a new application is tested, its event
generation functionality should also be tested in parallel.
Correct answer(s):
To configure the event detection functionality, you need to populate your Correlation Engine with specific
rules and criteria. These rules help the Correlation Engine assign priority to events. These rules also
enable the Correlation Engine to provide suggestions of how to resolve the different types of events.
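The idea of populating a Correlation Engine with rules can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the event codes, the priority scale (1 is most urgent), and the suggestion texts are hypothetical and do not come from any specific Event Management tool.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Event:
    ci: str        # configuration item that raised the event
    code: str      # error or status code carried by the event
    message: str


@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]  # criterion the event must satisfy
    priority: int                     # assumed scale: 1 = most urgent
    suggestion: str                   # suggested way to resolve the event


@dataclass
class Assessment:
    rule: str
    priority: int
    suggestion: str


class CorrelationEngine:
    """Hypothetical engine that holds rules and applies them to events."""

    def __init__(self, rules: list[Rule]) -> None:
        self.rules = rules

    def assess(self, event: Event) -> Optional[Assessment]:
        # Collect every rule whose criterion matches, then keep the most urgent.
        matching = [rule for rule in self.rules if rule.matches(event)]
        if not matching:
            return None  # no rule applies; the event is only logged
        best = min(matching, key=lambda rule: rule.priority)
        return Assessment(best.name, best.priority, best.suggestion)


# Populate the engine with two illustrative rules.
engine = CorrelationEngine([
    Rule("disk-full", lambda e: e.code == "DISK_FULL", 1,
         "Free space or extend the volume"),
    Rule("backup-done", lambda e: e.code == "BACKUP_OK", 4,
         "No action needed; log only"),
])

result = engine.assess(Event("srv-01", "DISK_FULL", "Volume /var is full"))
print(result)
```

In this sketch, an event that matches no rule returns None, which corresponds to an event that needs no escalation.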
To design event detection features and alert mechanisms, you need to acquire specific information.
Some of the details you require for configuring this design area are
data that can help you fix the problems that arise with a CI
basic knowledge of incident prioritization and categorization codes, which can be supplied
when an Incident Record needs to be created
thorough knowledge of all CIs, those dependent on the affected CI and those the affected CI
depends on, and
details of Known Error information acquired from previous experience and from third-party
vendors
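The requirement to know which CIs depend on an affected CI can be sketched as a simple dependency lookup. The following Python example is illustrative only: the CI names and the dependency map are hypothetical, and a real organization would typically read this information from its configuration records.

```python
from collections import deque

# Hypothetical dependency map: each CI lists the CIs it depends on.
DEPENDS_ON = {
    "trading-web": ["app-server"],
    "app-server": ["database", "network-switch"],
    "database": ["storage-array"],
    "reporting": ["database"],
}


def dependents_of(ci, depends_on):
    """Return every CI that depends, directly or indirectly, on `ci`."""
    # Invert the map so we can walk from a CI to the CIs that rely on it.
    reverse = {}
    for item, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(item)

    seen = set()
    queue = deque([ci])
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Which CIs are affected if the storage array fails?
affected = dependents_of("storage-array", DEPENDS_ON)
print(sorted(affected))
```

The CIs that the affected CI itself depends on can be read directly from the map, for example DEPENDS_ON["app-server"].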
Question
You want to streamline the way events are filtered, correlated, and escalated in your Event
Management system. To do this, you want to configure the event detection functionality and
alert mechanism features. What are the requirements for designing these specific features?
Options:
Answer
Option 1: This option is correct. Known Error information acquired from previous experience
helps design the alert mechanism and event detection functionality features. You can also
acquire this information from third-party vendors if you outsource your service monitoring.
Option 2: This option is incorrect. The knowledge to identify and define thresholds for changing
service features is a requirement of the threshold identification design area. Configuring this
design area helps determine the performance level appropriate for each CI of your service.
Option 3: This option is incorrect. Error messages and codes indicating the specific area of
failure are features of the error messaging functionality. You can automatically include this
feature in application codes using tools.
Correct answer(s):
Instrumentation enables you to make important decisions about your Event Management framework. It
also helps you design methods to execute these decisions.
Error messaging generates error messages that indicate the specific area of failure in applications. This
function also suggests causes for the failure. Threshold identification helps determine the level of
performance appropriate for each CI.
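As a rough illustration of how threshold identification and error messaging fit together, the following Python sketch checks a measured value against per-CI thresholds and produces a message that names the area of failure and a possible cause. The CI names, metrics, threshold values, and cause texts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    metric: str
    warning: float
    critical: float
    likely_cause: str  # hint included in the error message


# Hypothetical thresholds; real values depend on each service's requirements.
THRESHOLDS = {
    "db-server": Threshold("cpu_percent", 75.0, 90.0,
                           "long-running queries"),
    "web-server": Threshold("response_ms", 500.0, 2000.0,
                            "slow back-end calls"),
}


def check(ci: str, value: float) -> Optional[str]:
    """Return an error message if `value` breaches a threshold, else None."""
    t = THRESHOLDS[ci]
    if value >= t.critical:
        level, limit = "CRITICAL", t.critical
    elif value >= t.warning:
        level, limit = "WARNING", t.warning
    else:
        return None  # within the performance level defined for this CI
    return (f"[{level}] {ci}: {t.metric} is {value}, threshold {limit}. "
            f"Possible cause: {t.likely_cause}")


print(check("db-server", 92.5))
print(check("web-server", 120.0))
```

Because thresholds change with service requirements, keeping them in one table such as THRESHOLDS makes them easier to tune over time.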
The event detection and alert mechanism helps you streamline the way events are filtered, correlated,
and escalated. To do this, it uses tools such as the Correlation Engine.
Designing Event Management
Learning Objectives
After completing this topic, you should be able to
determine risks faced by Event Management
design Event Management
1. Exercise overview
In this exercise, you're required to analyze and determine the risks faced by an Event Management
setup and design a customized Event Management framework for your organization.
Question
Over the last few days, the users of your online commodities trading platform have been
experiencing recurring issues with the service. Either the service is reported to be unavailable
or the performance of the service is considered poor. To investigate the issue and identify its
cause, you set up a service measurement process. How would you measure this service?
Options:
Answer
Option 1: This option is correct. To find out why your customers are facing issues with your
service, you need to view and assess the service as the customer would. To do this, you need
to assess and report on the service as an end-to-end service. This includes using service
measurements for all aspects of the service.
Option 2: This option is incorrect. If your online service is measured in parts, you may only
assess the performance of the service components, which may not be faulty. Because of this,
you may not be able to find out the reason for the performance problem.
Option 3: This option is correct. Defining the procedures and policies is a critical step of the
Service Measurement Framework. In this step, you define the procedures for service
measurement and identify the tools to execute it. You then need to identify the appropriate
service measures to assess the service. For your online commodities trading service, you may
choose the service levels and customer satisfaction measures.
Option 4: This option is incorrect. When you assess the selected service components, you
wouldn't experience the actual level of service. To understand what customers require from the
service, it should be assessed as a whole. Also, you may run the risk of selecting a component
that is not faulty, missing the actual cause of the issue.
Correct answer(s):
Question
After setting up your Service Measurement Framework, you've chosen the appropriate service
measures and identified the metrics. You've almost completed the process, except for rolling
out the necessary monitoring agents. However, this is time consuming because you need to do
it for the entire IT infrastructure of your online trading platform. How would you meet the
challenge of implementing this?
Options:
Answer
Option 1: This option is correct. When you want to roll out monitoring agents for your Event
Management setup, you could find that other activities divert resources and delay the rollout.
To prevent this, extensive plans should be made before you roll out the monitoring agents
across the entire IT infrastructure.
Option 2: This option is correct. Rolling out monitoring agents requires an ongoing
commitment from your staff over a long period. To tackle this, you should treat this activity
as a real project with a defined timescale, in which you recruit and allocate adequate
resources and protect them throughout the duration of the project.
Option 3: This option is incorrect. Designing new services with special considerations for
Event Management enables you to apply the correct level of filtering for your services. However,
you cannot use this to roll out the monitoring agents necessary for your Event Management
setup. Instead, you need to set a timescale for this activity as you would for a real project.
Option 4: This option is incorrect. Preparing a Business Case with an account of the benefits
of effective Event Management helps to acquire funds for your project. However, to roll out the
necessary monitoring agents, you will not need to do this. Instead, you need to plan
extensively and recruit and allocate adequate resources for your project.
Correct answer(s):
Question
As part of your Event Management framework design, you first want to configure the
instrumentation design features of your service. As part of this, you've already made the
required Event Management decisions, such as identifying what needs to be monitored and the
type of monitoring required. However, you want to identify the methods to implement your
decisions. What instrumentation mechanisms would you adopt to implement your decisions?
Options:
Answer
Option 1: This option is correct. In addition to enabling you to make Event Management
decisions, instrumentation helps you design methods and mechanisms to execute these
decisions. One of these mechanisms is defining how to obtain supplementary information. This
enables you to do further analysis on your Event Management setup.
Option 2: This option is correct. To implement your Event Management decisions, you need to
adopt mechanisms that include identifying the information required to populate the Event
record. This includes the event log, event history, priority and frequency of the occurrence, or
all these details. Also, you need to determine where and how the events will be stored so that
you can easily retrieve them later.
Option 3: This option is incorrect. You have to determine what does or doesn't need to be
monitored in your IT infrastructure. This will impact your overall approach to instrumentation
design. Similarly, you also need to decide for whom the event messages are intended.
However, these are not instrumentation mechanisms but decisions involved in Event
Management design.
Correct answer(s):
Question
After configuring the instrumentation features for your Event Management framework design,
you realize that the events generated in your online commodities trading platform are not
filtered, correlated, and escalated properly. To fix this issue, you need to streamline the event
detection and alert mechanism design areas in your Event Management setup. To do this, you
want to configure your Correlation Engine. What would you require to complete these
configurations of Event Management design?
Options:
Answer
Option 1: This option is correct. Basic knowledge of incident prioritization and categorization
codes is essential to configure your Correlation Engine. This knowledge also helps you
configure the event detection and alert mechanism design areas, and the codes can be supplied
when an Incident Record needs to be created.
Option 2: This option is incorrect. To monitor and manage the CIs of your services, it is
important that you define and determine the thresholds for the CIs. However, the knowledge to
define thresholds for changing service features is a requirement of the threshold identification
design area. It doesn't directly aid the event detection and alert mechanism.
Option 3: This option is correct. Knowledge of normal and abnormal behavior of CIs can be
used to tune your Correlation Engine. In addition, details such as who is assigned to
support the CI, and details of recurring events on similar or different CIs and their significance,
also help you configure the event detection and alert mechanism features.
Option 4: This option is incorrect. Event records can be populated with details such as the
event log, event history, priority, and frequency of the occurrence. However, acquiring this
knowledge is a mechanism adopted while implementing the instrumentation features of the
Event Management framework design. It doesn't help tune the Correlation Engine or event
detection and alert mechanism directly.
Correct answer(s):
1. Exercise overview
2. Determine objectives and policies
3. Analyze Event Management contribution
4. Analyze an Event Management design