Here are some essential questions when it comes to delivering software:
• How can I tell when something has gone wrong with my software?
• How can I diagnose what went wrong?
• How can I be sure my fix works?
• How can I get my fix into the live environment?
• How can I tell how much delivery of my system costs?
Devops is the process of making sure the answer to every one of these questions is “easily, and quickly”.
So, it means monitoring, alarming and alerting. Of infrastructure, so you know if CPU is exhausted or a disk fails. But also of your app’s behavior – HTTP 5xx error codes, failed delivery of messages, exceptions with particular inputs. Something has to go red when there is a failure.
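To make "something has to go red" concrete, here is a minimal sketch of a check that looks at both infrastructure and app behavior. The `Sample` type, the `check` function and the thresholds (5% of responses being 5xx, 90% CPU) are all made up for illustration; a real setup would normally lean on a monitoring product rather than hand-rolled code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical monitoring window: recent HTTP statuses plus CPU load."""
    status_codes: list[int]
    cpu_percent: float


def error_rate(status_codes: list[int]) -> float:
    """Fraction of responses in the window that were HTTP 5xx."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def check(sample: Sample, max_error_rate: float = 0.05, max_cpu: float = 90.0) -> list[str]:
    """Return a list of alert messages; an empty list means everything is green."""
    alerts = []
    if sample.cpu_percent >= max_cpu:
        alerts.append(f"CPU at {sample.cpu_percent:.0f}% (threshold {max_cpu:.0f}%)")
    rate = error_rate(sample.status_codes)
    if rate > max_error_rate:
        alerts.append(f"HTTP 5xx rate at {rate:.0%} (threshold {max_error_rate:.0%})")
    return alerts


if __name__ == "__main__":
    window = Sample(status_codes=[200, 200, 500, 503], cpu_percent=95.0)
    for alert in check(window):
        print("RED:", alert)
```

The point is only that infrastructure signals and application signals end up in the same place, and that a failure in either produces something a human will notice.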
It means having access to infrastructure and application logs. The ELK stack or Graylog are good and popular examples. To diagnose what went wrong, I want to view the logs around the time frame of the failure, and get a good error message. Writing good error messages is a cornerstone of Devops strategy. Don’t look down on it. The Ops part of you will thank the Dev part of you one day in the future.
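As a rough illustration of what a "good error message" might look like, the sketch below uses Python's standard `logging` module. The `DeliveryError` class, the `deliver_message` and `safe_deliver` functions, and the size limit are hypothetical; the idea is simply that the message names what failed, for which input, and why.

```python
import logging

logger = logging.getLogger("delivery")


class DeliveryError(Exception):
    """Hypothetical error raised when a message cannot be delivered."""


def deliver_message(message_id: str, queue: str, payload_size: int, max_size: int = 1024) -> bool:
    """Pretend to deliver a message; reject payloads over the (made-up) limit."""
    if payload_size > max_size:
        # Say what failed, for which message, and the numbers involved.
        raise DeliveryError(
            f"Delivery of message {message_id} to queue '{queue}' failed: "
            f"payload is {payload_size} bytes, limit is {max_size} bytes"
        )
    return True


def safe_deliver(message_id: str, queue: str, payload_size: int) -> bool:
    """Deliver a message, logging the full error and stack trace on failure."""
    try:
        return deliver_message(message_id, queue, payload_size)
    except DeliveryError:
        # logger.exception records the message at ERROR level with the traceback.
        logger.exception("Could not deliver message %s", message_id)
        return False
```

Compare that with a bare "delivery failed" in the logs at three in the morning, and the Ops part of you will see the difference.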
It means having tests for your code, and it means investing the effort in making code testable. When it comes to fixing an issue, you write tests to prove the fix. This is not strictly a Devops practice, more an Agile one, but I’m struggling more and more these days to see the difference. And also less inclined to care about the difference. Devops is about caring holistically about continued delivery of service to customers, not about process names.
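Here is a small sketch of "write tests to prove the fix", using the standard `unittest` module. The `parse_quantity` function and the bug it is imagined to have had (crashing on input padded with spaces) are invented for the example.

```python
import unittest


def parse_quantity(text: str) -> int:
    """Hypothetical function; imagine it once crashed on padded input like ' 3 '."""
    cleaned = text.strip()  # the imagined fix: tolerate surrounding whitespace
    if not cleaned.isdigit():
        raise ValueError(f"Quantity must be a whole number, got {text!r}")
    return int(cleaned)


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_padded_input_is_accepted(self):
        # Reproduces the original failure; passes only once the fix is in.
        self.assertEqual(parse_quantity(" 3 "), 3)

    def test_non_numeric_input_gives_clear_error(self):
        with self.assertRaises(ValueError):
            parse_quantity("three")
```

Saved in a test file, this can be run with `python -m unittest`. The first test fails before the fix and passes after it, and it stays in the suite so the same bug cannot quietly come back.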
It means having a continuous integration and delivery process. An automated one. So when you check in your fix, it gets built, it gets tested, packaged, and deployed onto an environment that is identical to production, but maybe on a smaller scale. Thereafter the package is promoted to environments progressively more like production, until it gets to production itself.
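The promotion idea can be sketched in a few lines. This is not how any particular CI/CD tool works; the environment names and the `deploy` and `verify` callbacks are placeholders, and in practice the logic would live in your pipeline tool's configuration.

```python
from typing import Callable, List

# Hypothetical environment names, ordered from least to most production-like.
ENVIRONMENTS = ["test", "staging", "production"]


def promote(
    package: str,
    environments: List[str],
    deploy: Callable[[str, str], None],
    verify: Callable[[str, str], bool],
) -> List[str]:
    """Deploy the same package to each environment in order.

    Stops at the first environment where verification fails, so a bad
    build never reaches the environments after it. Returns the list of
    environments where the package was deployed and verified.
    """
    reached = []
    for env in environments:
        deploy(package, env)
        if not verify(package, env):
            break
        reached.append(env)
    return reached


if __name__ == "__main__":
    passed = promote(
        "myapp-1.2.3",
        ENVIRONMENTS,
        deploy=lambda pkg, env: print(f"deploying {pkg} to {env}"),
        verify=lambda pkg, env: env != "staging",  # simulate a failure in staging
    )
    print("verified in:", passed)
```

What matters is that the same package moves through every stage, and that a failure anywhere stops the promotion before production.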
It means having tools or reports that tell you how much your infrastructure costs. It used to be called Total Cost of Ownership. Whatever happened to that? Every cloud platform I have worked on so far offers this. I’m paying $5,000 a month.
What’s less quantifiable in many systems is how much money the system is making for the business. Maybe this will have its day – “FinDevOps”, where the finance team are involved in the software delivery as equal partners, and they want to see a taxi-meter style “profit for the day” gauge on Grafana. I’ve seen such a thing once.
Anyway, enough daydreaming.
Devops is also a culture. One of end-to-end caring. Having devs on front line support is a super way to bootstrap this mindset. “If you wrote it, you run it”. Suddenly it gets personal. If a dev realizes he or she is going to be called on Saturday night, suddenly unit testing makes sense! A fast build makes sense. Continuous Delivery makes sense. Monitoring makes sense. Automation makes sense. Retrospectives following an incident make sense.
I have worked in dev environments that were disconnected from delivery and support, and there has been a cynical attitude to “best practices”. Especially retrospectives. Devs can be very cynical about retrospectives. But if you’re going to get a call in the middle of the night on a Saturday to fix your stuff, you want it done fast and reliably so you can get back to bed. Or back to the pub.
And then on Monday morning you will be very motivated to say “Hey folks, we screwed up with that last delivery. How can we do better?” All of a sudden you’re very motivated to have a retrospective. And to implement the findings!
So Devops is about caring. It’s about not being siloed. It’s not “somebody else’s problem”. It’s a spirit of collective responsibility. And if you go the whole hog and involve the Finance people too – give them a taxi meter so they can get excited and join you – you might be starting a new movement.