
Managing risks in an Operations & Production Support environment

Managing risk in a production environment that is making money for customers is essential. However, because production support and operations management work is so unpredictable, the fear of the unknown increases drastically.

More often than not, for an operations analyst or a production support analyst, every day is a new day and every problem is a new problem. Hence the traditional risk management model, which suggests Identify -> Analyze -> Plan -> Track -> Control, does not fit well. That model assumes there is enough time to analyze and assess risks after you identify them. In production support or operations management, however, time is exactly what is not available, and you are expected to react quickly.

I recently took part in a workshop on risk management and how it can be done in an environment as volatile, unpredictable and unknown as production (or Live). In one of my previous experiences with selecting award winners in an organization, I observed that companies tend to reward people who are good at crisis management more often than people who are good at risk management. That often means risks are acted on only once they have been realized and have become bigger problems.

So, at the end of the discussion, it was more or less agreed that risk management in a production environment is all about behavioral change and mindset. Interesting? Read on!

If you consider the possible responses to a risk once you have identified it, they can be broadly classified as follows:

  • Terminate – eliminate the risk at its source rather than accepting it
  • Transfer – hand the risk over to the relevant stakeholders and make sure they mitigate it
  • Treat – accept the risk immediately and start controlling it
  • Tolerate – accept the risk and do nothing!

If you revisit the scenarios you have experienced in production support or operations, more often than not they demanded urgent attention. A priority 1 ticket is waiting, or some incident is threatening to grow into a bigger problem. In such situations, can you terminate the risk? Can you tolerate it, or transfer it and keep quiet? I would think not! In all such cases, you would have acted quickly, either to resolve the risk yourself or to ensure that it was resolved as early as possible.

Now, to tie the Treat response back to my earlier statement about mindset: you would agree that treating a risk in a production environment, which requires collaboration across multiple teams, calls for an ownership and risk-taking mindset. Someone needs to take ownership and drive the problem through to a solution, or mitigate the risk in full.

A few tips on mitigating production risks:

  • Keep customers informed of the bad news even more than the good news. Believe it or not, customers are prepared to hear worse news than any you could possibly give them.
  • Expose your vulnerability without slipping into a victim mindset!


Moving to IT Operations 2.0? – Using Web 2.0 for managing 24×7 IT operations

Do you manage teams that work 24×7 across shifts, especially on a model of 12-hour shifts, 4 days a week? If so, I would really like to know how you manage to hold team meetings and address the team collectively.

Getting everyone together for a briefing, team meeting, gathering or round table has been a great concern because so few people are in the office at once (due to shift working). Almost half of the team is on a weekly off on any given day, while half of the remainder work the day shift and the rest the night shift. Thus, at any point in time, I have access to only about a quarter of the team at work.
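
The arithmetic behind that estimate can be checked with a rough sketch. It assumes a simplified roster that the post does not spell out: everyone works 4 days out of 7, weekly offs are spread evenly over the week, and the people on duty are split equally between the day and night shifts. The names `WORK_DAYS_PER_WEEK`, `SHIFTS_PER_DAY` and `fraction_on_shift` are made up for illustration.

```python
from fractions import Fraction

# Assumed roster model (hypothetical, simplified from the description above):
# each person works 4 of 7 days, and those on duty on a given day are split
# evenly between the day shift and the night shift.
WORK_DAYS_PER_WEEK = 4
SHIFTS_PER_DAY = 2


def fraction_on_shift(work_days=WORK_DAYS_PER_WEEK, shifts_per_day=SHIFTS_PER_DAY):
    """Return the share of the whole team present during any single shift."""
    on_duty_today = Fraction(work_days, 7)  # share of the team not on a weekly off
    return on_duty_today / shifts_per_day   # split evenly across the shifts


share = fraction_on_shift()
print(f"Share of the team on any one shift: {share} (about {float(share):.0%})")
```

Under these assumptions the share works out to 2/7, a little under 30%, which is in line with the rough "quarter of the team" figure.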

Since most team members are unavailable at any given time, holding a team meeting is really out of the question and would not add much value. So I need to develop some method of getting updates across to everyone offline and making them equal participants in the decision process, which may last longer than the usual one- or two-hour meeting. It could actually take a week, depending on the shift schedule.

A few things I have been contemplating implementing are:

  • A discussion forum – Create an online discussion forum where each discussion topic is listed and kept open for a period of one week. Offline reminders are sent to the team to go through the forum, and comments and questions are invited and answered through further replies on the forum.
  • Offline recording of meetings – Record the meeting discussions in audio or video format and share them with the team offline. Invite comments on the recordings and engage the team in an offline question-and-answer session. Create a Question Basket where you invite questions and get them answered through email replies, audio recordings or video sessions.
  • Issue tracker & project update acknowledger – Create an online application where important decisions are stored, and require team members to acknowledge that they have read and understood them. If they have issues, they can raise queries back to the manager via the online application.
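
As a starting point for the third idea, here is a minimal in-memory sketch of what such an acknowledger might track. It is only an illustration under assumptions: the `Decision` class, its fields, the team names and the one-week window (borrowed from the discussion forum idea) are all hypothetical, and a real application would need storage, logins and reminders on top of this.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Decision:
    """One decision posted for the team to read and acknowledge."""
    title: str
    posted_on: date
    open_days: int = 7  # assumption: same one-week window as the forum topics
    acknowledged_by: set = field(default_factory=set)
    queries: list = field(default_factory=list)

    def acknowledge(self, member: str) -> None:
        """Record that a team member has read and understood the decision."""
        self.acknowledged_by.add(member)

    def raise_query(self, member: str, text: str) -> None:
        """Store a question for the manager to answer later."""
        self.queries.append((member, text))

    def pending(self, team: list) -> list:
        """Return the team members who have not acknowledged yet, sorted by name."""
        return sorted(set(team) - self.acknowledged_by)

    def is_open(self, today: date) -> bool:
        """True while the acknowledgement window has not yet elapsed."""
        return today < self.posted_on + timedelta(days=self.open_days)


# Hypothetical usage with made-up names and dates.
team = ["asha", "ben", "chen", "dina"]
decision = Decision("New escalation path for priority 1 tickets", date(2024, 1, 1))
decision.acknowledge("asha")
decision.raise_query("ben", "Does this apply to the night shift too?")
print(decision.pending(team))  # ['ben', 'chen', 'dina']
```

Keeping the list of pending acknowledgements explicit makes it easy to send reminders only to the people who have not yet seen the decision, whichever shift they return on.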

There might be a few more things I could do, but I would really like to know if you have any good suggestions that I could take on board and implement so as to work even more effectively.

Please leave a comment, feedback or a note on my blog if you can help.
