Operational work is hard to plan in a sprint because it is hard to estimate how much of it will come up in the upcoming sprint.
For the previous four sprints I helped out as a Scrum Master in another team at my company, the CDN team.
Those guys have a couple of hundred machines, which they have to manage and maintain.
And naturally every now and then a machine breaks and has to be fixed.
But since you don’t know up front how many machines will need fixing and how hard that will be, it is pretty difficult to plan such tasks in advance.
In the CDN team we distinguish between development work and operational work.
Development work is obvious: implementing new features. You can break features down into small, manageable tasks, estimate them relative to the reference task and plan them in the sprint.
But operational work is a bit more difficult.
Operational work
Let me give you some examples of what we in the CDN team consider operational work:
- Maintaining the platform
- Upgrading software on production machines
- Extending capacity on a machine
- Installing updates
- Handling incidents, when machines or services are down
- Fixing broken networks or network devices
These examples are quite different in nature: some of them you can foresee, others just happen unexpectedly.
That’s why you can classify operational work into planned and unplanned tasks.
Planned operational work
Examples of planned operational work are upgrading software on production machines, extending the capacity of a machine, or installing updates.
You know beforehand that you have to do this kind of work.
Therefore you can estimate it in a refinement session and plan it in the sprint as usual.
Unplanned operational work
On the other hand, there are unplanned tasks, which you cannot foresee at the beginning of the sprint. Unplanned tasks can be classified into two types, which I call here general unplanned tasks and exceptional unplanned tasks.
General unplanned tasks
An example of general unplanned work is, let’s say, five severe incidents coming in per sprint on average. Such incidents happen regularly, and you cannot postpone them to the next sprint; you have to fix them immediately.
Planning for general unplanned work is based on experience.
You look at your past incidents and work out how many unplanned tasks come up per sprint on average.
Then you estimate the amount of effort it takes on average to complete such tasks.
Based on that estimation you can reserve some time in the upcoming sprint, because it is very likely that such tasks will pop up again.
Therefore you can create a sprint plan that even includes tasks that do not yet exist at the time you create the plan. Based on experience you know that such tasks will very likely appear and that you will have to deal with them in the sprint.
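To make the arithmetic concrete, here is a minimal Python sketch of how such a reserve could be calculated. The function names, the use of hours as the unit, and all numbers are assumptions made up for this example; they are not data from the CDN team.

```python
from statistics import mean


def reserve_for_unplanned_work(incidents_per_sprint, hours_per_incident):
    """Estimate the hours to reserve for general unplanned work.

    incidents_per_sprint: incident counts from past sprints (hypothetical data)
    hours_per_incident: effort spent on individual past incidents, in hours
    """
    if not incidents_per_sprint or not hours_per_incident:
        raise ValueError("need at least one past sprint and one past incident")
    # Average number of unplanned tasks per sprint, taken from history.
    avg_incidents = mean(incidents_per_sprint)
    # Average effort needed to complete one such task.
    avg_effort = mean(hours_per_incident)
    return avg_incidents * avg_effort


def plannable_capacity(team_capacity_hours, reserve_hours):
    """Capacity left for planned work after setting the reserve aside."""
    return max(team_capacity_hours - reserve_hours, 0)


# Illustrative history: incident counts of the last four sprints and the
# hours spent on five past incidents.
past_incident_counts = [4, 6, 5, 5]
past_incident_effort = [2, 6, 4, 3, 5]

reserve = reserve_for_unplanned_work(past_incident_counts, past_incident_effort)
remaining = plannable_capacity(200, reserve)
print(f"Reserve {reserve:.1f} h for unplanned work, plan {remaining:.1f} h")
```

A plain average is the simplest choice. If the number of incidents varies a lot from sprint to sprint, a team might prefer a more conservative figure, such as the highest count of the last few sprints.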
Exceptional unplanned tasks
But then there are also tasks that come up every now and then and take a very large effort from the whole team to fix.
For example, a core switch breaks, or some other very important component of your system suffers an outage.
In such a case the whole team probably has to stop what they are working on and put all their attention on that incident.
You cannot plan for this type of exceptional case.
And of course this has a big impact on your whole sprint. It is very likely that you are not able to meet your sprint goal, because a big part of the sprint effort had to go into fixing this incident.
Fortunately, such events do not happen on a regular basis. If they do, you really need to work on a solution that reduces the probability of such incidents.
Anyway, if such an exceptional incident happens during the sprint and you are therefore not able to meet your sprint goal, you will have a very good explanation for the stakeholders.
I’m quite sure your stakeholders will understand that keeping the system up and running has a higher priority than implementing that new feature. They will get that feature anyway, just a week or two later.
Let me wrap it up
So you can plan for unplanned work based on historical data. You know from experience how much effort you spend on average dealing with operational incidents during the sprint.
Actually, you can include work in your sprint for tasks that do not even exist yet.
In fact, the majority of the unplanned work falls into the category of what I call here general unplanned work. Therefore most of the unplanned work can be included in the planning.
Only what I call here exceptional unplanned work is not predictable at all. Hence, you cannot plan for such work.
But as such exceptional work rarely happens, our plans will almost always be quite accurate, and for the rest we have a good explanation for stakeholders.
What type of work do you have in your sprints that you cannot plan directly during sprint planning? Let me know in a comment and tell me if you think I missed something!
That’s it for this week. Have a great day and stay tuned. HabbediEhre!
Herbi –
Thank you for the detailed article on how you approach operational work using Scrum. I have a few questions:
a. Given a choice, why would you use Scrum for operational work when you could use Kanban, especially when the underlying principles that you are trying to benefit from are the same in both? For example: visualizing workflow, maximizing team throughput, estimating work (which doesn’t exist in classical Kanban but can be added for better predictability) and shortening feedback loops.
b. If you are doing operational work, why would you want to bind yourself to iteration timelines for releases (assuming production releases happen at the end of each sprint)? Wouldn’t it be better to release fixes into production as soon as possible, especially if you have aggressive SLAs to meet?
c. Are Scrum’s ceremonies of sprint planning, sprint review, and retrospective as valuable in an operational setting as in a development setting? Do they really help improve the quality of operational work?
Any thoughts on these would be appreciated. Thank you
Hi Falcon,
thank you for reaching out. These are very good questions.
For a. and b.: In the CDN team the majority of the work was development work, which we could estimate, prioritize and plan up front. Then there was some operational work we could plan for. And finally we had some operational work that we couldn’t foresee. The latter was a significant part, but smaller than the development work. Therefore we used Scrum, as most of the work could be planned for.
I agree, if you only have operational work and most of it is not plannable, then Kanban might be a better choice.
For c.: I think the Retrospective meeting is very valuable for Kanban as well. You dedicate time to looking back at what you can improve in the process.
However, for the other Scrum ceremonies I think they are only valuable if the majority of the work of the team can be planned for.