Protection from Container Malware with Anthos
TL;DR: There is a fairly new attack campaign using the Kinsing malware that targets container platforms like Docker and GKE. This post shows how to protect your infrastructure with Google Cloud’s Anthos, both on-prem and in the cloud.
Last Friday, Aqua Security published research showing that the Kinsing malware has recently been used in a campaign against Docker Daemon APIs, with the goal of launching a Bitcoin miner and self-propagating. Anthos GKE offers several security features that can be used to protect your enterprise from such threats. We’ll start by analyzing the attack tree for this campaign, taken from the linked Aqua blog.
Background
The attack tree shows a few basic steps that we’ll address, as well as some other vectors that are not used in this attack.
- Bypass API Security
- Download, launch and run a script
- Add itself to cron job for persistence
- Lateral movement via SSH
- Command & Control
- Crypto mining
Protecting Container APIs
While there are specific protections that Docker customers can use to protect their APIs, our goal here is to talk about the protections Anthos GKE can offer. In the top node of the attack tree, the attacker first makes use of an open control plane. The research does not go into detail as to whether these APIs were actually authenticated, but let’s assume they were at best weakly authenticated. Anthos GKE allows admins and users to authenticate with Google Identity or G Suite in addition to the traditional means of authenticating to Kubernetes. This is strong authentication that can be configured to require MFA and can also be audited using Cloud Operations (formerly Stackdriver). GKE’s control plane can also be configured to be completely private to your internal VPC or on-prem network. This is highly recommended on top of authentication with Cloud Identity to reduce the attack surface.
Protecting the Runtime
Anthos Config Management (ACM) includes Policy Controller, which uses constraints to limit the properties of the pods that run in your GKE environment. These constraints are written in a policy-as-code language called Rego, which is highly extensible and testable and gives you full control over the privileges your pods have. The Gatekeeper policy library ships with ACM to give you a baseline to start working with. This malware specifically searched bash history, known hosts, and similar files to find hosts that the compromised container had SSH’d to. Using the host-network-ports policy, you could enforce that containers of a particular type are only accessible on certain ports, and SSH port 22 should not be one of them. The idea here is to scope each container to the least privilege it needs to function.
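To make the port constraint idea concrete, here is a minimal Python sketch of the kind of check such a policy performs. It is not how Policy Controller is implemented (real constraints are written in Rego and evaluated at admission time). The function name `host_port_violations`, its parameters, and the sample pod are all hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def host_port_violations(
    pod_spec: Dict[str, Any],
    min_port: int,
    max_port: int,
    allow_host_network: bool = False,
) -> List[str]:
    """Return human-readable violations for a pod spec (hypothetical helper)."""
    violations: List[str] = []

    # Reject pods that ask for the host's network namespace unless allowed.
    if pod_spec.get("hostNetwork", False) and not allow_host_network:
        violations.append("hostNetwork is not allowed")

    # Reject any hostPort that falls outside the permitted range.
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        for port in container.get("ports", []):
            host_port = port.get("hostPort")
            if host_port is not None and not (min_port <= host_port <= max_port):
                violations.append(
                    f"container {name} requests hostPort {host_port} "
                    f"outside {min_port}-{max_port}"
                )
    return violations


# Assumed example pod: one web container and one container exposing SSH.
pod = {
    "containers": [
        {"name": "web", "ports": [{"containerPort": 8080, "hostPort": 8080}]},
        {"name": "sshd", "ports": [{"containerPort": 22, "hostPort": 22}]},
    ]
}
for violation in host_port_violations(pod, min_port=8000, max_port=9000):
    print(violation)
```

With an allowed range of 8000 to 9000, only the container exposing port 22 is reported, which is the outcome the least-privilege scoping above is after.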
While these policies ensure host-level control, it is also recommended to scope network access to other containers using the principle of least privilege. Kubernetes provides Network Policies for precisely this purpose. On top of that, Anthos Service Mesh gives you the ability to mutually authenticate and encrypt all data in transit and to create application-level policies. This means that even if a pod is compromised and runs a script to reach other pods, the attempt would fail, since access to other Service Mesh resources is controlled at the HTTP/application level, not just the network level. At the network level, you can also configure an allowlist of IPs that workloads in your service mesh may egress to, so contacting command and control servers would not be possible. To be clear, you can use both application/HTTP-level and network-level controls within Anthos GKE, but if you must choose between the two, application-level provides more control.
While this particular malware did not specifically use cloud resources, following the principle of least privilege means we also downscope access to cloud resources as well as network resources. In Anthos GKE, Workload Identity allows each pod to use its own individual Google service account for fine-grained access to GCP resources.
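The egress allowlist idea can be sketched in a few lines of Python. This only illustrates the decision being made. In a real cluster the rule would live in a Network Policy or service mesh configuration rather than application code. The function name `egress_allowed` and the CIDR ranges are assumptions chosen for the example (203.0.113.0/24 is a documentation-only range standing in for a command and control server).

```python
import ipaddress
from typing import Iterable


def egress_allowed(destination: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True only if the destination IP falls inside an allowed CIDR."""
    addr = ipaddress.ip_address(destination)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Assumed allowlist: only internal cluster and VPC ranges.
ALLOWED = ["10.0.0.0/8", "192.168.0.0/16"]

print(egress_allowed("10.4.2.9", ALLOWED))     # internal service
print(egress_allowed("203.0.113.7", ALLOWED))  # stand-in for a C2 server
```

The design choice is default deny: anything not explicitly listed is refused, so a newly registered command and control address is blocked without anyone having to know about it in advance.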
Protecting the Supply Chain
Even if the Kubernetes Master API is secured from direct access by malicious actors launching containers, there is still the threat of a supply chain compromise. That is, if the container repository were compromised and someone pushed a malicious image that passed the policies in place, that container could still be deployed to a pod. Using Anthos GKE Binary Authorization (BinAuthz), you have the option to sign containers with Grafeas after they pass automated tests in the CI/CD pipeline and to verify those signatures before deploying them to your clusters. Combined with ACM policies that restrict the host-level abilities of the container, this control dramatically reduces risk.
In many cases, third-party containers must be used where you cannot necessarily control the patch management or have deep insight into the potential vulnerabilities. In cases like this, GKE Sandbox can be used to launch pods into an effectively jailed environment where kernel calls are limited.
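The sign-then-verify flow can be sketched as follows. This is a simplified stand-in, not the Binary Authorization API: a real deployment uses asymmetric keys and attestations, whereas this sketch uses an HMAC from the Python standard library purely to stay self-contained. The function names, the key, and the image digests are all hypothetical.

```python
import hashlib
import hmac


def sign_digest(image_digest: str, key: bytes) -> str:
    """CI/CD side: produce a signature over the image digest after tests pass."""
    return hmac.new(key, image_digest.encode("utf-8"), hashlib.sha256).hexdigest()


def admit(image_digest: str, signature: str, key: bytes) -> bool:
    """Deploy side: admit the image only if the signature matches its digest."""
    expected = sign_digest(image_digest, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)


# Assumed values for illustration only.
key = b"ci-pipeline-signing-key"
tested_image = "sha256:" + "a" * 64
tampered_image = "sha256:" + "b" * 64

signature = sign_digest(tested_image, key)
print(admit(tested_image, signature, key))    # image that passed the pipeline
print(admit(tampered_image, signature, key))  # image swapped in the registry
```

Because the signature is bound to the image digest, an image pushed directly to a compromised repository has no valid signature and is refused at deploy time.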
Ensuring Observability
We would be remiss not to mention that while these controls may protect your environment from known threats, attackers are constantly becoming more sophisticated. A robust strategy for observability within your clusters is paramount. Within Anthos GKE, data is shipped to Cloud Logging and Cloud Monitoring so that you can make informed decisions about potential novel threats in your infrastructure. With proper logging and monitoring, teams can be alerted to attempts at Bitcoin mining or other malicious activities based on anomalous CPU utilization and other indicators. The combination of these controls creates a defense-in-depth strategy that can mitigate the threat of external or even internal malicious activity.
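As a rough illustration of alerting on anomalous CPU utilization, the sketch below flags samples that sit far above a workload's historical baseline. It is a simple z-score check under assumed data, not the way Cloud Monitoring evaluates alerting policies. The function name `anomalous_samples`, the threshold, and the sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def anomalous_samples(
    baseline: Sequence[float],
    recent: Sequence[float],
    threshold: float = 3.0,
) -> List[float]:
    """Return recent samples more than `threshold` std devs above the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: treat anything above it as anomalous.
        return [x for x in recent if x > mu]
    return [x for x in recent if (x - mu) / sigma > threshold]


# Assumed CPU utilization fractions for one pod (0.0 to 1.0).
baseline = [0.20, 0.22, 0.19, 0.21, 0.20]
recent = [0.21, 0.95]

print(anomalous_samples(baseline, recent))
```

A pod that normally idles around 20% CPU and suddenly pins a core, as a miner would, stands out immediately under even this crude check.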