While trying to schedule an afternoon social I ran into
the quandary of finding a time that suited all my colleagues. One such
person, who shall remain nameless, said that the service desk they
work on is overloaded with incidents on Wednesdays, so that wouldn’t be a good
day. As far as they could tell, the cause was that the day on which the CAB
(Change Advisory Board) convenes had been changed a few months back.
It turned out that the CAB had better attendance, due to availability,
on Mondays than on Wednesdays, which was when it used to meet. This
got me thinking: what does the cycle look like from the implementation of a
change, into an incident in some unfortunate situations, and into the resolution
of those incidents through another change (or a rollback)?
Looking at this example more closely, we have a CAB on Mondays, which
allows changes to be implemented (for good or bad) as early as 24 hours later,
in this case Tuesday. When issues arise from these changes they start to
show up on Wednesday, which correlates with the influx of incidents. As a result,
depending on the circumstances, some changes are reverted to a pre-change state
while others require an emergency change to be implemented to correct these
issues.
To me the issue seems obvious. However, I often find that
the obvious is only seen by those who look at it with fresh eyes. Those who
are waist deep in the situation may not see it as clearly as someone external,
or if they can see the issue, they cannot visualize the solution as clearly. When I
asked my friend about it they said, “I think we have always had incidents as a
result of change but since the CAB date change we don’t get as many of them on
Mondays like we used to, so it seems as though it has improved.” They went on
to say that in actuality the number of incidents is now just more evenly spread
across the week, and this is why the issue isn’t pressed as hard from an IT
perspective.
Wait a second… I thought that they were putting me on,
but they weren’t. I had to ask about the fact that they were simply
moving the issues from one day to another without addressing the main issue:
the fact that changes were failing. Even if the issue isn’t seen as that “pressing”,
what matters in the end is the delivery of services. If you were to ask the
customer about the issues, they might suggest that the service has not improved
one bit over the course of time, and they would be correct.
As you can see in the picture below, from a customer
experience perspective the issues are still present no matter which day the
changes are implemented. If you were only to shuffle the days along the bottom
axis, the humps which represent changes and incidents would remain.
“Why does no one see that this is an issue?” I asked.
The response was that they report on service management metrics
(incident and change) separately and do not connect the dots between the two.
That has to change, I insisted. As a start I would look
at the following:
1. The number of changes which cause incidents and/or are rolled back.
Where change records are not updated to reflect any issues, we will see a
high rate of implementation success (and if the two teams are not working
collaboratively this will be likely), so we also need to see how many emergency
changes we are creating and why. Even in a state where change is measured in a
silo we should be able to see that a change caused an emergency change.
2. We may need to take a look at the time between when something is reviewed
in CAB and when it is implemented in production. It might be that the changes
which are not successful are also the ones which are being implemented 24 hours
after CAB. Knowing which changes are impacting the business in a negative way
will allow us as an IT organization to better assess what is not working as well
as it could be and build a strategy to improve it.
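To make the two measurements above a little more concrete, here is a minimal
sketch of how they might be computed once change and incident records are
linked. Everything in it is an assumption for illustration: the record shapes,
the field names (rolled_back, emergency_fix_for, caused_by_change) and the
24-hour threshold are hypothetical and would need to be mapped onto whatever
your ITSM tool actually exports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Change:
    # Hypothetical change record; field names are illustrative only.
    change_id: str
    cab_reviewed: datetime                   # when the CAB reviewed the change
    implemented: datetime                    # when it went into production
    rolled_back: bool = False
    emergency_fix_for: Optional[str] = None  # id of the change this one corrects


@dataclass
class Incident:
    # Hypothetical incident record linked back to a change, if one is blamed.
    incident_id: str
    caused_by_change: Optional[str] = None


def failed_change_ids(changes, incidents):
    """Ids of changes that were rolled back, caused an incident,
    or needed an emergency change to correct them."""
    failed = {c.change_id for c in changes if c.rolled_back}
    failed |= {i.caused_by_change for i in incidents if i.caused_by_change}
    failed |= {c.emergency_fix_for for c in changes if c.emergency_fix_for}
    return failed


def failure_rate_by_lag(changes, incidents, fast=timedelta(hours=24)):
    """Compare failure rates for changes implemented within `fast`
    of the CAB review against those implemented later."""
    failed = failed_change_ids(changes, incidents)
    buckets = {"fast": [0, 0], "slow": [0, 0]}  # [failed, total]
    for c in changes:
        key = "fast" if c.implemented - c.cab_reviewed <= fast else "slow"
        buckets[key][1] += 1
        if c.change_id in failed:
            buckets[key][0] += 1
    # None means there were no changes in that bucket to measure.
    return {k: (f / t if t else None) for k, (f, t) in buckets.items()}
```

Counting emergency fixes as a sign of failure matters here: even if nobody
updates the original change record, the emergency change that follows still
points back at it. A noticeably higher "fast" rate would support the idea that
changes implemented 24 hours after CAB are the ones hurting the business.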
Whatever it might be, we want to ensure that the customer
is getting a good experience and that we are reviewing the provision of
that service as a whole through our metrics, challenging the inputs and
outputs of each process as it pertains to delivering service. Getting these
teams together regularly with key stakeholders will give some visibility into
the areas which need improvement to provide a solid customer experience.
Follow me on Twitter
@ryanrogilvie or connect with me on LinkedIn
Labels: Change Management, CSI, Incident Management, ITIL, ITSM, Service Delivery