Book Review: Release It!
A few weeks ago I heard an interview on Software Engineering Radio with Michael Nygard, author of a book I hadn’t heard of called Release It! My wife had been reading Ship It! and I had heard good things about Manage It! so I was happy to hear about this new book. Over the course of the hour or so interview, Mr. Nygard made one heck of a case both for the book and for his way of thinking about writing software.
Mr. Nygard is an operations guy. That is to say, his job is to help big companies maintain the software they use. The focus of the book is pointing out ways developers can engineer their software to work better with operations and be more maintainable. It’s an unfortunately seldom-seen topic in programming, but at least now we have a fairly thorough book to reference on the topic.
The book starts out with a pretty scary tale of a post-mortem the author did on a huge outage at a major airline. It’s a very interesting look at a huge failure that ended up being caused by a pretty small programming error that any of us could make. He also talks here about getting a thread dump of a Java process to find out where it’s having trouble, a technique I had occasion to use in real life right after I finished the book.
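For anyone who hasn’t seen a thread dump, here is a minimal sketch of taking one from inside a running JVM using the standard java.lang.management API. This is my own illustration, not code from the book; the class name ThreadDumpSketch and its dump() method are made up for the example. From outside the process, the JDK’s jstack tool produces similar output.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration only; not taken from the book.
final class ThreadDumpSketch {

    // Returns the name, state, and full stack trace of every live thread.
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock and synchronizer details to keep the dump cheap.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Calling System.out.println(ThreadDumpSketch.dump()) prints one block per thread. When a process is hung, the interesting part is usually a pile of threads all sitting in the same frame.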
The structure of the book is to introduce a topic, then do a section on Patterns and Anti-Patterns around that topic. The first section is Stability. He talks about different types of failure and defines stability in the first place, which ends up being harder than you’d anticipate. Having spent most of my professional career so far writing internal corporate applications, this was the first place where the book veered off from being specifically applicable to my life. That’s not to say we corporate developers don’t have to worry about customers or uptime, but it’s a different set of concerns. Nobody is going to switch to another billing system because the one we work on is down. But still, it’s useful stuff.
The 2nd section of the book is Capacity. Admittedly, I skimmed this section since I’m not working on anything right now that requires accounting for massive numbers of users or fine-tuning my Ajax requests. I will revisit this section for sure when I get onto something more relevant.
The 3rd section is General Design Issues, split into sections on Network, basic Security, Availability, and Administration. Section 4 is Operations. Both of these are very valuable. Just about everything is illustrated with real examples and specific recommendations, which I like to see.
I like reading about Anti-Patterns because I’m always on the lookout for not only ways to do things but ways not to do things. The Patterns are, of course, good things to keep in mind whether you’re developing a website or a corporate integration program. In fact the Patterns in this book are probably the highlight. Things like using Timeouts, Circuit Breakers, and Connection Pooling are timeless and useful all over, hallmarks of really being Patterns and not just quick fixes and bandaids.
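To give a flavor of what a Circuit Breaker does, here is a minimal sketch of the general idea as I understand it: after a run of consecutive failures the breaker "opens" and rejects calls immediately, then lets a trial call through once a cool-down period has passed. This is my own simplified illustration, not the book’s code. The class name SimpleCircuitBreaker, its parameters, and the injected clock are all assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.function.LongSupplier;

// Illustrative only; names and thresholds are hypothetical, not from the book.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;     // consecutive failures before opening
    private final long resetTimeoutMillis;  // how long to stay open before a trial call
    private final LongSupplier clock;       // injected so time can be controlled in tests
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long resetTimeoutMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMillis = resetTimeoutMillis;
        this.clock = clock;
    }

    // Reports the current state, moving OPEN to HALF_OPEN once the timeout has elapsed.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= resetTimeoutMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Runs the operation unless the breaker is open, in which case it fails fast.
    <T> T call(Callable<T> operation) throws Exception {
        if (state() == State.OPEN) {
            throw new IllegalStateException("circuit open: failing fast");
        }
        try {
            T result = operation.call();
            onSuccess();
            return result;
        } catch (Exception e) {
            onFailure();
            throw e;
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call, or too many failures in a row, (re)opens the breaker.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

In real use you would pass System::currentTimeMillis as the clock and wrap each call to a flaky remote service in call(...). A production version would also want to limit how many trial calls get through while half-open, which this sketch does not do.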
Overall, if you’re developing any kind of serious software that’s going to have to serve users and be maintained over time, this book should really be on your bookshelf. It’s the rare book that works first as a read-through and then as a reference to be returned to later. Especially if you’re not the one who has to maintain your code, the focus on Operations is a very valuable way of thinking. If you’ve read the book, I’d be interested in hearing your thoughts in the comments.