The world is getting weird.
Increasingly epic weather events, multiplying global geopolitical uncertainties, and stubborn economic stressors make for an exceptionally unpredictable business environment. To navigate this unfortunate but unavoidable “new normal”, enterprises need a strong but flexible foundation in both business continuity and operational resilience.
Aren’t these basically the same thing? Nope. It’s a common misconception that operational resilience is the more or less automatic outcome of having solid business continuity planning in place. The two are intertwined but separate: Business continuity is planning how to respond, and then recover, in the event of a given disaster (cloud service provider outage, DDoS attack, insert your own worst nightmare scenario here…). Operational resilience means proactively building in capabilities to ensure that, if said disaster does happen, it doesn’t affect you in the first place.
And both are, in their own ways, essential for an organization’s survival and success.
Business continuity and operational resilience share the same ultimate goal: survival. However, they reach it from different directions. How can you tell them apart?
Think about it in terms of playing a video game. You’re in the final boss battle (talk about mission critical!) and suddenly the game crashes. Business continuity is the equivalent of being able to go back to the last save point and pick up close to where you left off. Operational resilience is where the game experiences the same glitch but, since the software was architected for zero downtime, your gameplay was never interrupted — you never noticed there was any kind of issue.
Though different, they are also symbiotic: An organization adept in operational resilience is better positioned to create a durable business continuity plan. Conversely, a company with a strong business continuity plan has a foundation that naturally fosters operational resilience. Like peanut butter and jelly, they are better together.
Let’s look at how this plays out in the real world.
Distributed Denial-of-Service (DDoS) Attacks: A sudden massive DDoS attack can paralyze even the most elegantly architected system.
For business continuity, you have a backup system ready to keep operations running, such as switching to backup servers or serving cached, static versions of web pages until the attack is stopped.
For operational resilience, however, your systems have been architected to recognize threats and act to prevent any damage before it occurs. Your application stack includes services that, in this scenario, automatically handle traffic rerouting, load balancing, and adaptive rate limiting. Your database automatically redistributes data and queries to the nodes that remain available.
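To make "adaptive rate limiting" a little more concrete, here is a minimal sketch of one common approach: a token bucket whose refill rate is cut when the backend reports overload and raised slowly once it recovers. The class name, fields, and numbers are all hypothetical choices for illustration, not any particular product's implementation.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token-bucket limiter; every name and default is an assumption."""
    capacity: float            # largest burst allowed, in requests
    refill_rate: float         # tokens added per second
    min_rate: float = 1.0      # floor for the refill rate
    max_rate: float = 100.0    # ceiling for the refill rate
    tokens: float = 0.0        # tokens currently available
    last_refill: float = 0.0   # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def adapt(self, overloaded: bool) -> None:
        """Halve the rate under overload; otherwise add one request per second."""
        if overloaded:
            self.refill_rate = max(self.min_rate, self.refill_rate / 2)
        else:
            self.refill_rate = min(self.max_rate, self.refill_rate + 1.0)
```

In practice a limiter like this would sit in a gateway or load balancer and call `adapt` from a health signal such as error rate or latency. The caller passes in the clock value, which keeps the sketch easy to test.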
Unplanned Outages: Cloud providers have outages all the time. Occasionally these are significant enough to bring entire regions to a standstill.
A major point where operational resilience and business continuity diverge: when was the last time some government entity enquired about your company’s business continuity strategy?
This doesn’t happen, because business continuity planning is something that organizations pursue on their own, for their own survival. Outside entities don’t get involved. When it comes to operational resilience, that is changing.
Countries around the world are beginning to create legislation and regulatory standards to require operational resilience, beginning with critical sectors like financial services. One of the most significant is the European Union’s Digital Operational Resilience Act (DORA), which seeks to ensure that all financial market participants have effective strategies and capabilities in place to manage operational resilience. DORA applies to financial entities and to the third-party technology providers they depend on, including cloud service providers, regardless of whether those providers are based within or outside the EU.
We’ll say it again: the world is getting weird. It’s impossible to predict the unhappy surprises that climate change or armed conflict between countries or economic fluctuation may bestow at any moment. Add to that the growing possibility that government regulations may pop up to directly affect your business and your bottom line.
When all you can do is expect the unexpected, you have to be ready for…well, everything really.
Operational resilience is how you get ready.
This means hardwiring operational resilience into an application architecture by making every piece of your application platform-agnostic. A cloud-agnostic application architecture can allow for easier scalability and flexibility. As your application grows, different services or platforms can be added or replaced without the need for major code changes. Being cloud-agnostic also supports the interoperability of applications across different cloud service providers: besides improving resilience and availability, this also makes it easier to satisfy any operational resilience regulations that eventually arise.
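One way to picture "added or replaced without major code changes" is to have application code depend on a small interface rather than on any one provider's SDK. The sketch below is purely illustrative: `ObjectStore`, `InMemoryStore`, and `save_report` are hypothetical names, and a real adapter for a given cloud's storage service would implement the same two methods.

```python
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical provider-neutral storage interface."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend; a cloud-specific adapter would expose the same methods."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_report(store: ObjectStore, name: str, body: str) -> str:
    """Application code sees only the interface, never a provider SDK."""
    key = f"reports/{name}.txt"
    store.put(key, body.encode("utf-8"))
    return key
```

Swapping providers then means writing one new adapter class, while `save_report` and everything built on it stay untouched.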
After all, who needs business continuity when you have four nines of uptime? Just kidding, sorry, you do still need business continuity planning — but consider the guarantees and the service level agreement that come with best-in-class managed services. When things fail (as they always do; as Cockroach founder Spencer Kimball says, “Sh*t happens, and at scale sh*t is always happening”), managed services have a team dedicated to fixing it immediately. You don’t need to provision one to go respond, and ideally you won’t even notice anything happened.
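For a sense of scale, "four nines" (99.99% availability) leaves room for roughly 52.6 minutes of downtime a year. The quick calculation below shows where that number comes from; the function name is just an illustrative choice, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% uptime allows about {allowed_downtime_minutes(target):.1f} minutes of downtime per year")
```

Each extra nine cuts the budget by a factor of ten, which is why the jump from three nines to four is a much bigger engineering commitment than it sounds.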
Ultimately, the simplest way to build for your own operational resilience is to choose an architecture made up of cloud-agnostic, highly available services that have already solved this problem for you: essentially, Operational-Resilience-as-a-Service.
With that in place, business continuity planning just got a whole lot easier.