
GKE Hardening: Secrets Injection and Control Plane Access with IAM + DNS

Teams consistently defer two GKE security decisions: how secrets reach pods, and how CI/CD pipelines reach the control plane. This guide walks through a configuration for each: the native Secret Manager add-on for secrets, and DNS-based access governed by IAM for the control plane.

The Problem

Most GKE clusters I audit have the same two issues:

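Secrets live in environment variables or base64-encoded Kubernetes Secrets. Developers often assume Kubernetes Secrets are encrypted. GKE encrypts data at rest at the storage layer, but the Secret values themselves are only base64-encoded, and application-layer encryption of Secrets is not enabled by default. Base64 is an encoding, not encryption. Anyone with RBAC permission to get Secrets in a namespace can read every value in it.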

Control plane access uses IP allowlists that grow stale. A developer's VPN changes, the CI/CD pipeline breaks, someone widens the IP range to "fix" it. Six months later, the allowlist includes ranges nobody recognizes and half the entries are no longer valid.

Both create audit findings. Both create real operational risk. Both are fixable in an afternoon if you plan the migration correctly.
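To make the first issue concrete, here is a minimal sketch showing that base64 reverses without any key. The value hunter2 is a made-up placeholder, not something from a real cluster.

```shell
# Hypothetical placeholder value; any string behaves the same way.
encoded=$(printf '%s' 'hunter2' | base64)
echo "$encoded"        # aHVudGVyMg==

# Decoding needs no key or credential, only the encoded text.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"        # hunter2
```

Against a live cluster, the same decode step applies to the output of kubectl get secret NAME -o jsonpath='{.data.KEY}', where NAME and KEY are placeholders for your own Secret and key.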

Prerequisites

Before starting:

  • GKE cluster running version 1.29 or later (the version this guide assumes for the native Secret Manager add-on; check the add-on documentation for the exact minimum patch release)
  • Secret Manager API enabled in your project
  • Workload Identity enabled on the cluster
  • gcloud CLI authenticated with permissions to update clusters
  • Existing secrets you want to migrate (start with one non-critical secret)

Verify your cluster version:

gcloud container clusters describe CLUSTER_NAME \
  --location=REGION \
  --format="value(currentMasterVersion)"
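If you want a script to gate on that version rather than reading it by eye, a small helper along these lines could work. It is a sketch: the function name gke_version_at_least is my own, and it only compares the major and minor numbers, not the patch or -gke suffix.

```shell
# Succeeds when a GKE version string such as "1.29.4-gke.1043002"
# (a made-up example) is at or above the given major and minor numbers.
gke_version_at_least() {
  gke_version=$1
  want_major=$2
  want_minor=$3
  gke_major=${gke_version%%.*}      # text before the first dot
  gke_rest=${gke_version#*.}        # text after the first dot
  gke_minor=${gke_rest%%.*}         # text before the next dot
  [ "$gke_major" -gt "$want_major" ] ||
    { [ "$gke_major" -eq "$want_major" ] && [ "$gke_minor" -ge "$want_minor" ]; }
}

# Usage (CLUSTER_NAME and REGION are placeholders, as above):
#   v=$(gcloud container clusters describe CLUSTER_NAME --location=REGION \
#         --format="value(currentMasterVersion)")
#   gke_version_at_least "$v" 1 29 || echo "upgrade the control plane first"
```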

Step 1: Enable the Secret Manager Add-on

Why this matters

The Secret Manager native add-on (GA since 2024) mounts secrets directly into pods as files. No External Secrets Operator, no custom sync jobs, no additional controllers to monitor. Google manages the integration.

Implementation

gcloud container clusters update CLUSTER_NAME \
  --enable-secret-manager-addon \
  --location=REGION

This installs the Secrets Store CSI driver and the GCP provider. It takes 5-10 minutes to propagate.

Verify the add-on is running:

kubectl get pods -n kube-system -l app=secrets-store-csi-driver

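You should see driver pods running on each node. If the selector returns nothing, the managed add-on may name or label its pods differently from the open-source driver; in that case, the secretManagerConfig field in the cluster description is the more reliable check.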
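A check that does not depend on pod labels is to read the add-on state from the cluster description. The sketch below assumes the describe output exposes secretManagerConfig.enabled and that gcloud prints it as True or true; the function name is hypothetical.

```shell
# Prints "enabled" or "disabled" based on the cluster's secretManagerConfig.
# Assumption: the field is secretManagerConfig.enabled and prints as a boolean.
secret_manager_addon_state() {
  cluster=$1
  location=$2
  state=$(gcloud container clusters describe "$cluster" \
    --location="$location" \
    --format="value(secretManagerConfig.enabled)" | tr '[:upper:]' '[:lower:]')
  if [ "$state" = "true" ]; then
    echo enabled
  else
    echo disabled
  fi
}

# Usage: secret_manager_addon_state CLUSTER_NAME REGION
```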

What goes wrong

The add-on requires Workload Identity. If Workload Identity isn't enabled, the command succeeds but pods fail to mount secrets with cryptic CSI driver errors. Check Workload Identity status first:

gcloud container clusters describe CLUSTER_NAME \
  --location=REGION \
  --format="value(workloadIdentityConfig.workloadPool)"

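If the output is empty, Workload Identity is not enabled. Enable it on the cluster and its node pools before continuing, and budget time for that migration: workloads that rely on the node's service account credentials will need their own identities.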

Step 2: Configure IAM for Secret Access

Why this matters

The Kubernetes Service Account your pod uses must have permission to read the specific secrets it needs. This is where least-privilege actually happens.

Implementation

Create a Secret Manager secret (or use an existing one):

gcloud secrets create db-password \
  --replication-policy="automatic"

printf '%s' "your-actual-password" | gcloud secrets versions add db-password --data-file=-

Grant the Kubernetes Service Account access:

gcloud secrets add-iam-policy-binding db-password \
  --member="serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]" \
  --role="roles/secretmanager.secretAccessor"

Replace:

  • PROJECT_ID with your GCP project
  • NAMESPACE with the Kubernetes namespace
  • KSA_NAME with the Kubernetes Service Account name

What goes wrong

Teams often grant roles/secretmanager.secretAccessor at the project level. That lets the bound identity read every secret in the project. Bind at the secret level instead. It's more work upfront, but it's the difference between "compromised pod reads one secret" and "compromised pod reads all secrets."
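To find out whether you already have this problem, you can list who holds the accessor role at the project level. This is a sketch using gcloud's flatten and filter options; the function name is my own, and it does not inspect folder or organization policies.

```shell
# Lists members holding roles/secretmanager.secretAccessor at project level.
# An empty result means no project-wide grants exist in this project's policy.
project_level_secret_accessors() {
  project=$1
  gcloud projects get-iam-policy "$project" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/secretmanager.secretAccessor" \
    --format="value(bindings.members)"
}

# Usage: project_level_secret_accessors PROJECT_ID
```

Any identity this prints is a candidate for replacement with per-secret bindings like the one in the implementation above.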

Step 3: Create the SecretProviderClass

Why this matters

The SecretProviderClass tells the CSI driver which secrets to mount and where. One class per logical group of secrets.

Implementation

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-secrets
  namespace: your-namespace
spec:
  provider: gke  # managed add-on provider; the self-installed open-source provider uses "gcp"
  parameters:
    secrets: |
      - resourceName: projects/PROJECT_ID/secrets/db-password/versions/latest
        path: db-password  # file name under the mount path
      - resourceName: projects/PROJECT_ID/secrets/api-key/versions/latest
        path: api-key

Apply it:

kubectl apply -f secretproviderclass.yaml

What goes wrong

Using versions/latest means pods get the current secret version at mount time. It does not auto-rotate. If you rotate a secret, existing pods keep the old value until they restart. Design your applications to handle this, or use a sidecar that watches for rotation events.
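One simple way to live with the restart-on-rotation behavior is to make the restart part of the rotation itself. The sketch below is an assumption about how you might wire that up, not a feature of the add-on: it adds a new version from stdin and then rolls the deployment. The function name and its arguments are hypothetical.

```shell
# Adds a new secret version from stdin, then restarts the deployment so
# new pods mount the updated value. Stops at the first failing step.
rotate_secret_and_restart() {
  secret=$1
  deployment=$2
  namespace=$3
  gcloud secrets versions add "$secret" --data-file=- &&
    kubectl rollout restart "deployment/$deployment" -n "$namespace" &&
    kubectl rollout status "deployment/$deployment" -n "$namespace"
}

# Usage (names are placeholders):
#   printf '%s' "new-password" | rotate_secret_and_restart db-password your-app your-namespace
```

A rolling restart keeps old pods serving with the old value until replacements are ready, so the backing system must accept both values during the rollout.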

Step 4: Mount Secrets in Your Deployment

Why this matters

This is where the actual secret injection happens. Secrets appear as files in the pod filesystem.

Implementation

apiVersion: apps/v1
kind: Deployment
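metadata:
  name: your-app
  namespace: your-namespace  # same namespace as the SecretProviderClass
spec:
  selector:
    matchLabels:
      app: your-app
  template:
    metadata:
      labels:
        app: your-app
    spec:
      serviceAccountName: your-ksa  # Must have Secret Manager access
      containers:
      - name: app
        image: your-image
        volumeMounts:
        - name: secrets
          mountPath: /secrets
          readOnly: true
      volumes:
      - name: secrets
        csi:
          driver: secrets-store-gke.csi.k8s.io  # managed add-on driver; the self-installed driver is secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: app-secrets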

Your application reads /secrets/db-password as a file.

What goes wrong

Applications expecting environment variables need code changes. This is the migration work teams underestimate. Start with new deployments, then retrofit existing ones.
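For workloads you can't change right away, an entrypoint wrapper can bridge the gap during the retrofit. The following is a transitional sketch only: read_secret and SECRETS_DIR are names I made up, and exporting the value back into an environment variable gives up part of the benefit of file mounts.

```shell
# Prints the secret from the mounted file when it is readable, otherwise
# prints the fallback value passed as the second argument.
# SECRETS_DIR is a hypothetical override; it defaults to /secrets.
read_secret() {
  secret_file="${SECRETS_DIR:-/secrets}/$1"
  if [ -r "$secret_file" ]; then
    cat "$secret_file"
  else
    printf '%s' "${2:-}"
  fi
}

# Usage in a container entrypoint (DB_PASSWORD is a hypothetical legacy variable):
#   DB_PASSWORD=$(read_secret db-password "${DB_PASSWORD:-}")
#   export DB_PASSWORD
#   exec "$@"
```

Once the application reads /secrets/db-password directly, the wrapper can be removed.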

Step 5: Enable DNS-Based Control Plane Access

Why this matters

DNS-based access replaces IP allowlists with IAM. No more stale IP ranges. Access is granted to identities, not network locations — which aligns with the Security by Design principle in the SCALE framework.

Implementation

gcloud container clusters update CLUSTER_NAME \
  --enable-dns-access \
  --location=REGION

After enabling, get the new DNS endpoint:

gcloud container clusters describe CLUSTER_NAME \
  --location=REGION \
  --format="value(controlPlaneEndpointsConfig.dnsEndpointConfig.endpoint)"

Update kubeconfig to use the DNS endpoint:

gcloud container clusters get-credentials CLUSTER_NAME \
  --location=REGION \
  --dns-endpoint

What goes wrong

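Existing kubeconfig files keep pointing at the IP endpoint. They continue to work while IP-based access stays enabled, and stop working the moment you disable it. Every developer workstation and CI/CD pipeline needs updated credentials before that happens. Plan this migration: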

  1. Enable DNS access (both endpoints work simultaneously)
  2. Update all CI/CD pipelines to use DNS endpoint
  3. Notify developers to refresh credentials
  4. Optionally disable IP-based access later
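Before the last step, it helps to confirm that each client has actually switched. The sketch below checks whether a kubeconfig server URL uses a DNS name or an IPv4 literal. The function name and the example hostname in the comments are hypothetical, and IPv6 literals are not handled.

```shell
# Succeeds when the URL's host is a DNS name rather than an IPv4 literal.
uses_dns_endpoint() {
  host=${1#*://}          # drop the scheme
  host=${host%%[:/]*}     # drop any port or path
  case $host in
    '') return 1 ;;             # nothing to check
    *[!0-9.]*) return 0 ;;      # contains something other than digits and dots
    *) return 1 ;;              # only digits and dots: IPv4 literal
  esac
}

# Usage on each workstation or pipeline runner:
#   server=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
#   uses_dns_endpoint "$server" || echo "still on the IP endpoint: $server"
#
# Once every client passes, IP access can be turned off. Treat the flag
# below as an assumption to confirm against your gcloud version:
#   gcloud container clusters update CLUSTER_NAME --location=REGION --no-enable-ip-access
```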

Step 6: Scope Control Plane IAM Permissions

Why this matters

DNS-based access requires the container.clusters.connect permission. Scope this to specific service accounts, not broad groups.

Implementation

For CI/CD service accounts:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:cicd-sa@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/container.clusterViewer" \
  --condition=None

For deployments, the service account also needs appropriate RBAC within the cluster. IAM gets you to the control plane; Kubernetes RBAC controls what you can do once connected.

What goes wrong

Teams grant roles/container.admin to CI/CD pipelines because it's faster. This allows the pipeline to delete clusters, modify node pools, and change security settings. Use roles/container.developer for deployment pipelines instead; it's sufficient for kubectl apply operations.
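Whichever role you settle on, it is worth confirming that it actually carries container.clusters.connect, since a role without it cannot reach the DNS endpoint. A sketch of that check, with a hypothetical function name:

```shell
# Succeeds when the given role includes the given permission.
# gcloud's value format joins list items with ";", so split before matching.
role_includes_permission() {
  role=$1
  permission=$2
  gcloud iam roles describe "$role" --format="value(includedPermissions)" |
    tr ';' '\n' | grep -xF "$permission" >/dev/null
}

# Usage:
#   role_includes_permission roles/container.clusterViewer container.clusters.connect \
#     || echo "this role cannot connect through the DNS endpoint"
```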

Common Mistakes

Deploying ESO when the native add-on works. External Secrets Operator adds operational overhead — another controller to monitor, another failure point. If you're single-cloud GCP, the native add-on is simpler. ESO makes sense for multi-cloud or complex secret routing.

Skipping the migration plan for DNS access. Enabling DNS access is easy. The hard part is updating every system that talks to the control plane. Inventory your access points before flipping the switch.

Not monitoring secret sync failures. Whether native add-on or ESO, secret mount failures should trigger alerts. A pod restart during an outage shouldn't be the first time you discover secret sync is broken.
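As a starting point for that alerting, a scheduled job could look for FailedMount events. This is a rough sketch with a hypothetical function name: FailedMount covers every volume type, not only secrets, and Kubernetes events expire quickly, so treat it as a stopgap rather than real monitoring.

```shell
# Prints FailedMount events across all namespaces and returns non-zero
# when any exist, so a scheduler or CI job can turn that into an alert.
check_failed_mounts() {
  events=$(kubectl get events --all-namespaces \
    --field-selector reason=FailedMount --no-headers 2>/dev/null)
  if [ -z "$events" ]; then
    return 0
  fi
  printf '%s\n' "$events"
  return 1
}

# Usage: check_failed_mounts || echo "investigate secret mounts"
```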


Secrets in environment variables and stale IP allowlists appear in nearly every GKE security review I conduct. Neither is difficult to fix — but both require migration planning that accounts for existing workloads and CI/CD pipelines.

Work with a GCP specialist: book a free discovery call at https://buoyantcloudtech.com

Amit Malhotra, Principal GCP Architect, Buoyant Cloud Inc