Conflicting business requirements are a common problem, and they can be found in every corner of an organization, including information technology. Resolving these conflicts is a must, but it is not always easy, although sometimes a new solution helps.
In IT management, there is a constant battle between security and operations teams. Yes, both teams ultimately want to have secure systems that are harder to break. However, security can come at the expense of availability – and vice versa. In this article, we look at the conflict between availability and security and a solution that helps to resolve that conflict.
Ops teams focus on availability… security teams lock down
Operational teams will always have stability, and therefore availability, as their top priority. Yes, ops teams will also make security a priority, but only to the extent that it affects stability or availability, never as an absolute goal.
This priority is captured in the “five nines” uptime goal, which sets an incredibly high bar: a system must be up and available to process requests 99.999% of the time. It is a commendable goal that keeps stakeholders happy. High availability architectures help by providing system- or service-level redundancy, but security goals can quickly get in the way of achieving “five nines”.
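To make the “five nines” target concrete, the following minimal sketch converts an availability percentage into a yearly downtime budget. The function name `downtime_budget_seconds` and the use of a 365-day year are illustrative assumptions, not part of any standard or product.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year, for simplicity


def downtime_budget_seconds(availability_percent: float,
                            period_seconds: int = SECONDS_PER_YEAR) -> float:
    """Return the downtime allowed in a period at a given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The unavailable share of the period is (100 - target) percent.
    return period_seconds * (100 - availability_percent) / 100


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_budget_seconds(target) / 60
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.999% the budget works out to a little over five minutes per year, so a single reboot to apply a patch can consume most or all of it.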
For security teams, the ultimate goal is to have systems as locked down as possible, minimizing the attack surface and overall risk levels. In practice, security teams may require a system to go offline right now and not in two weeks, reducing availability for immediate patching – regardless of the impact on users.
It’s easy to see that this approach would cause a huge headache for operations teams. Worse still, where high availability has really helped operational teams achieve their availability and stability goals, it can actually make things harder for security teams, who now have to take care of a much larger number of servers or services, all of which must be protected and monitored.
What best practice should you follow?
This creates a conflict between operations and security, quickly putting the two groups at odds over topics like best practices and processes. When you think about patching, a maintenance window-based patching policy causes fewer interruptions and increases availability because patching, and the downtime that comes with it, happens only once every several weeks.
But there’s a catch: maintenance windows don’t deliver patches fast enough to properly protect against emerging threats, because these threats are often actively exploited within minutes of disclosure (or even before disclosure, as with Log4j).
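The gap between disclosure and the next maintenance window can be estimated with a few lines of date arithmetic. This is a rough sketch under simple assumptions: windows recur at a fixed interval, and a patch is available and applied at the first window on or after disclosure. The function names, dates, and the 28-day interval are hypothetical.

```python
from datetime import date, timedelta


def next_window(disclosed: date, first_window: date, interval_days: int) -> date:
    """Return the first maintenance window on or after the disclosure date."""
    if disclosed <= first_window:
        return first_window
    elapsed = (disclosed - first_window).days
    # Ceiling division: number of whole intervals needed to reach or pass disclosure.
    intervals = -(-elapsed // interval_days)
    return first_window + timedelta(days=intervals * interval_days)


def exposure_days(disclosed: date, first_window: date, interval_days: int) -> int:
    """Days a system stays unpatched while waiting for the next window."""
    return (next_window(disclosed, first_window, interval_days) - disclosed).days


if __name__ == "__main__":
    # Hypothetical schedule: a window every 28 days, starting 1 January 2022.
    days = exposure_days(date(2022, 1, 2), date(2022, 1, 1), 28)
    print(f"Unpatched for {days} days")
```

In the worst case, a vulnerability disclosed the day after a window stays unpatched for almost the full interval, which is the exposure that security teams object to.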
The problem occurs across all types of workloads and it doesn’t really matter if you’re using the latest DevOps, DevSecOps, or whatever-ops approach as the flavor of the day. Ultimately, you either patch faster for secure operations at the expense of availability or performance, or you patch slower and take unacceptable risks with security.
It quickly gets very complicated
Deciding how quickly to patch is just the beginning. Sometimes patching is not easy. For example, you may be dealing with programming language-level vulnerabilities, which in turn affect applications written in that language. One example is CVE-2022-31626, a PHP vulnerability.
When this happens, another group joins the conflict between availability and security: the developers, who have to address a language-level vulnerability in two steps. The first step is updating to a fixed language version, which is the easy part.
But updating a language version doesn’t just bring security improvements; it also brings other fundamental changes. Therefore, developers must go through a second step: compensating for the language-level changes by rewriting application code.
That also means retesting and, in some cases, even recertification. Like ops teams looking to avoid reboot-related downtime, developers want to avoid extensive code edits for as long as possible, because they mean a lot of work that, yes, tightens security, but otherwise gives developers nothing to show for their time.
You can easily see why today’s patch management processes create multi-layered conflict between teams. A top-down policy can solve the problem to some extent, but it usually means no one is really happy with the outcome.
Worse, these policies can often compromise security by leaving systems unpatched for too long. Patching systems weekly or monthly on the assumption that the risk is acceptable will, given the current threat level, sooner or later lead to a sobering reality check.
There is one way to significantly reduce or even resolve the conflict between immediate patching (and disruption) and delayed patching (and vulnerabilities). The answer lies in disruption-free, frictionless patching, at every level or at least at as many levels as is practical.
Frictionless patching can resolve the conflict
Live patching is the frictionless patching tool your security team should be looking for. Live patching allows you to patch much faster than normal maintenance windows could ever achieve, and you never have to restart services to apply updates. The result is fast and secure patching with little to no downtime: a simple, effective way to resolve the conflict between availability and security.
At TuxCare, we provide comprehensive live patching for critical Linux system components, as well as patches for multiple programming languages and language versions. These patches target security vulnerabilities and don’t introduce language-level changes that would otherwise force code refactoring, so your code will continue to work as it is, only securely. Even if your business relies on unsupported applications, you don’t have to worry about vulnerabilities seeping into your systems through a programming language flaw, nor do you need to update the application code.
So to wrap up, in the conflict between availability and security, live patching is the only tool that can significantly reduce tension between operations and security teams.