Escaping GKE gVisor sandboxing using metadata

December 30, 2020

Bastien Chatelard

@bchatelard


Introduction

GKE is a Google Cloud service that offers managed Kubernetes clusters. The cluster nodes run on Google Cloud VM instances, while the control plane and network are fully managed by GKE.

GKE offers a sandboxing feature (https://cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods) based on gVisor (https://gvisor.dev/docs/) that protects the host kernel from untrusted code. This sandboxing provides strong isolation and allows SaaS businesses to execute unknown code submitted by their users.

I tried to use this feature to run isolated workloads and found that the isolation was not entirely effective: under certain conditions, the metadata API could be reached from a sandboxed pod.

Network isolation using network policy

By default, all pods in a Kubernetes cluster are able to communicate with each other. GKE recommends using Network Policy to restrict the network traffic between pods (https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster#restrict_with_network_policy).

When running untrusted code, it is a good practice to isolate your clients from each other and from your own services.

With this feature, it is easy to define a policy, attach it to a group of pods, and restrict network access for these pods.
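
As a minimal sketch of such a policy, the snippet below writes a NetworkPolicy manifest that denies all ingress traffic to a group of pods. The `role: untrusted` label, the policy name, and the file name are assumptions chosen for illustration; they do not come from the GKE documentation.

```shell
#!/bin/sh
# Write a NetworkPolicy that selects pods labelled role=untrusted
# (hypothetical label) and allows no ingress traffic to them:
# listing "Ingress" in policyTypes without any ingress rule denies all ingress.
cat > deny-ingress-untrusted.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-ingress-untrusted
spec:
  podSelector:
    matchLabels:
      role: untrusted
  policyTypes:
  - Ingress
EOF

# Apply it once the cluster has network policy enforcement enabled:
#   kubectl apply -f deny-ingress-untrusted.yaml
echo "wrote deny-ingress-untrusted.yaml"
```

The policy only takes effect on clusters where network policy enforcement is enabled, and only for pods that carry the selected label.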

Sandbox metadata protection

The Google Cloud team documents how to harden workload isolation using GKE Sandbox (https://cloud.google.com/kubernetes-engine/docs/how-to/sandbox-pods#sandboxed-application) and gives some hints on how to configure and test access to the metadata.

To validate that the filtering is properly enabled, you can launch a new pod and run the following command:

curl -s "http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env" -H "Metadata-Flavor: Google"

This command fails, as described in the documentation, because a filter denies access to the metadata API.

By default the instance metadata API server is not supposed to be accessible from any sandboxed pod.
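
The check above can be wrapped in a small helper to make the result explicit. This is a sketch, not part of the GKE documentation: the function name `check_metadata` and the 3-second timeout are assumptions, and "blocked" here simply means the request did not succeed (timeout, refused connection, DNS failure, or HTTP error).

```shell
#!/bin/sh
# check_metadata URL
# Prints "reachable" if the URL answers with a success status within
# 3 seconds, and "blocked" for any failure.
check_metadata() {
  if curl -s -f -o /dev/null --max-time 3 \
       -H "Metadata-Flavor: Google" "$1"; then
    echo "reachable"
  else
    echo "blocked"
  fi
}

# Usage from a shell inside the pod:
#   check_metadata "http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env"
```

On a correctly filtered sandboxed pod, the usage line should report "blocked".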

Bug found

When testing the network isolation for untrusted pods, I enabled network policy on the cluster and applied some network filtering rules to the pods that I wanted to isolate.

After more testing, I found that I was able to query the metadata API. It appears that the network filtering the GKE team applies to gVisor sandboxed pods was entirely disabled when network policy was enabled.

Since this sandboxing feature is meant to run untrusted code, this would give an attacker access to sensitive information about the node, project and Kubernetes cluster.

The bug was reported to the VRP team and quickly fixed. I was able to mitigate it by manually filtering the 169.254.169.254 IP in the network policy applied to these pods.
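
A minimal sketch of what such a filtering rule could look like is shown below. It is an illustration, not necessarily the exact policy applied at the time: the policy name and file name are assumptions, and the `app: fedora` selector assumes the pods carry the label used by the test deployment in the reproduction steps. The rule allows egress to any address except the metadata server.

```shell
#!/bin/sh
# Write a NetworkPolicy that allows egress everywhere except to the
# metadata server address 169.254.169.254.
# Assumption: the pods to protect are labelled app=fedora.
cat > deny-metadata-egress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-metadata-egress
spec:
  podSelector:
    matchLabels:
      app: fedora
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32
EOF

# Apply with:
#   kubectl apply -f deny-metadata-egress.yaml
echo "wrote deny-metadata-egress.yaml"
```

For untrusted workloads, a stricter egress allow-list is usually preferable to this broad rule; the sketch only shows the metadata exclusion itself.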

How to reproduce

The steps below follow the GKE documentation: https://cloud.google.com/kubernetes-engine/docs/how-to/sandbox-pods

  • Create a new cluster with network policy enabled
gcloud container clusters create cluster-name --enable-network-policy
  • Create a new gVisor pool
gcloud container node-pools create gvisor \
  --cluster=cluster-name \
  --node-version=1.16.13-gke.401 \
  --machine-type=e2-standard-2 \
  --image-type=cos_containerd \
  --sandbox type=gvisor \
  --zone europe-west1-c
  • Apply the test configuration from the documentation
# sandbox-metadata-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fedora
  labels:
    app: fedora
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fedora
  template:
    metadata:
      labels:
        app: fedora
    spec:
      runtimeClassName: gvisor
      containers:
      - name: fedora
        image: fedora
        command: ["/bin/sleep","10000"]
  • Launch a shell
kubectl exec -it pod-name -- /bin/sh
  • Enjoy full access to the metadata API
curl "http://169.254.169.254/computeMetadata/v1/instance/attributes/kube-env" -H "Metadata-Flavor: Google"
...
ALLOCATE_NODE_CIDRS: "true"
API_SERVER_TEST_LOG_LEVEL: --v=3
...

Going deeper

With this metadata exposure bug, an attacker may gain access to sensitive information about the node, project and Kubernetes cluster.

Depending on the configuration, this could allow an attacker to:

  • read the project ID
  • read public SSH keys
  • get node information (name, IP, ...)
  • add their own SSH key and gain root access on the node
  • get the Kubernetes configuration and certificate
  • access the Kubernetes cluster
  • impersonate a Kubernetes node
  • retrieve a service account token
  • access / create / edit / delete project resources
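
To audit which of these items a given pod could actually read, the sketch below probes a handful of metadata paths and reports only whether each one answers, discarding the response body. The helper names (`metadata_url`, `probe`, `audit_metadata`) and the `METADATA_BASE` override are assumptions made for this example, and the path list is a small sample rather than an exhaustive inventory.

```shell
#!/bin/sh
# Base URL of the metadata API; METADATA_BASE is a hypothetical override.
BASE="${METADATA_BASE:-http://169.254.169.254/computeMetadata/v1}"

# metadata_url PATH: build the full URL for a metadata path.
metadata_url() {
  echo "$BASE/$1"
}

# probe PATH: report whether the path answers, without printing its content.
probe() {
  if curl -s -f -o /dev/null --max-time 3 \
       -H "Metadata-Flavor: Google" "$(metadata_url "$1")"; then
    echo "EXPOSED $1"
  else
    echo "blocked $1"
  fi
}

# audit_metadata: probe a sample of sensitive paths.
audit_metadata() {
  for path in \
    project/project-id \
    project/attributes/ssh-keys \
    instance/name \
    instance/attributes/kube-env \
    instance/service-accounts/default/token
  do
    probe "$path"
  done
}

# Usage from a shell inside the pod:
#   audit_metadata
```

Every line should read "blocked" on a pod where the metadata filtering is effective.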

Better isolation of untrusted code in GKE

Even when the isolation works properly, there are several ways to protect yourself against this kind of metadata exposure.

A few recommendations for running untrusted code in GKE: