GKE Gateway API for Production Traffic Management Guide


How to Implement GKE Gateway API for Production Traffic Management

Kubernetes Ingress served its purpose for simple HTTP routing. It breaks down when you need canary deployments, multi-domain TLS with automatic certificate rotation, or cross-namespace traffic routing. At that point, you're fighting annotations — I've seen production clusters with 40+ Ingress annotations trying to approximate behaviour that Gateway API handles natively.

This guide walks through implementing GKE Gateway API from scratch, replacing Ingress patterns with the role-oriented model that GKE supports natively through its integration with the global external Application Load Balancer.

The Problem

Ingress resources conflate infrastructure concerns (load balancer configuration, TLS termination) with application concerns (routing rules, traffic splitting). This creates friction between platform teams and application teams. Platform engineers end up owning Ingress resources they shouldn't touch, or application teams request annotation changes that require infrastructure-level access.

Gateway API separates these concerns explicitly:

  • Gateway: Infrastructure team owns the load balancer configuration
  • HTTPRoute: Application teams own their routing rules
  • ReferenceGrant: Explicit cross-namespace permissions

This separation matters when you're building a multi-tenant GKE platform or preparing for SOC 2 audits where access boundaries need clear documentation.
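One way to make that ownership split enforceable is plain Kubernetes RBAC. The sketch below is an illustration rather than part of the setup in this guide: the Role and RoleBinding names, the app-team namespace, and the group name are all hypothetical, and your identity provider will dictate what the subject really looks like.

```yaml
# Hypothetical RBAC sketch: the application team can manage HTTPRoutes in its
# own namespace. Nothing here grants write access to Gateway resources.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: httproute-editor              # hypothetical name
  namespace: app-team                 # hypothetical application namespace
rules:
- apiGroups: ["gateway.networking.k8s.io"]
  resources: ["httproutes"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: httproute-editor-binding      # hypothetical name
  namespace: app-team
subjects:
- kind: Group
  name: app-team-devs@example.com     # hypothetical group
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: httproute-editor
  apiGroup: rbac.authorization.k8s.io
```

A binding like this is also the kind of artefact an auditor can read directly: the access boundary is a manifest in version control, not a convention.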

Prerequisites

Before starting, verify these requirements:

  • GKE cluster running version 1.24 or later (Autopilot clusters need 1.26 or later, where Gateway API is enabled by default)
  • gke-l7-global-external-managed GatewayClass available (enabled by default on Autopilot, requires enabling on Standard clusters)
  • Certificate Manager API enabled in your project
  • IAM permissions: roles/container.admin for Gateway resources, roles/certificatemanager.editor for TLS setup

Check GatewayClass availability:

kubectl get gatewayclass

Expected output includes:

NAME                             CONTROLLER                  ACCEPTED   AGE
gke-l7-global-external-managed   networking.gke.io/gateway   True       10d

If the GatewayClass is missing, enable the Gateway controller on your Standard cluster:

gcloud container clusters update CLUSTER_NAME \
  --gateway-api=standard \
  --location=LOCATION

Why This Matters

The gke-l7-global-external-managed class provisions a Global External Application Load Balancer as the data plane. No nginx pods, no controller VMs consuming node resources — the load balancer runs entirely in GCP infrastructure.

What Goes Wrong

Teams sometimes try to use community Gateway controllers (like Envoy Gateway) on GKE instead of the native GatewayClass. This works, but you lose the managed integration with Certificate Manager, Cloud Armor, and automatic NEG creation. Stick with the GKE-managed classes unless you have a specific requirement.

Step 1: Create a Certificate Map with Certificate Manager

Gateway API on GKE integrates with Certificate Manager for TLS. Create a DNS authorization and certificate first:

# Create DNS authorization
gcloud certificate-manager dns-authorizations create app-dns-auth \
  --domain="app.example.com" \
  --project=PROJECT_ID

# Get the CNAME record details
gcloud certificate-manager dns-authorizations describe app-dns-auth \
  --project=PROJECT_ID

Add the CNAME record to your DNS zone, then create the certificate:

# Create managed certificate
gcloud certificate-manager certificates create app-cert \
  --domains="app.example.com" \
  --dns-authorizations=app-dns-auth \
  --project=PROJECT_ID

# Create certificate map
gcloud certificate-manager maps create app-cert-map \
  --project=PROJECT_ID

# Add certificate to map
gcloud certificate-manager maps entries create app-cert-entry \
  --map=app-cert-map \
  --certificates=app-cert \
  --hostname="app.example.com" \
  --project=PROJECT_ID

What Goes Wrong

DNS authorization validation can take 10-30 minutes. Teams often assume the certificate failed when it's still propagating. Check status with:

gcloud certificate-manager certificates describe app-cert --project=PROJECT_ID

Look for state: ACTIVE before proceeding.

Step 2: Deploy the Gateway Resource

Create the Gateway in a dedicated gateway-infra namespace owned by your platform team:

apiVersion: v1
kind: Namespace
metadata:
  name: gateway-infra
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: external-gateway
  namespace: gateway-infra
  annotations:
    # GKE attaches a Certificate Manager map through this annotation (map name only)
    networking.gke.io/certmap: app-cert-map
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: https
    port: 443
    protocol: HTTPS
    # No tls block here: with a certificate map, TLS comes from Certificate Manager
    allowedRoutes:
      namespaces:
        from: All
  - name: http-redirect
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All

Apply with:

kubectl apply -f gateway.yaml

Why This Matters

The allowedRoutes.namespaces.from: All setting lets application teams in any namespace attach HTTPRoutes to this Gateway. For tighter control, use from: Selector with namespace labels.
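As a rough sketch of the tighter option, the fragment below restricts the https listener to namespaces carrying a specific label. The label key and value (gateway-access: external) are hypothetical, and the first document is only a fragment of the Gateway spec, not a complete resource you can apply on its own.

```yaml
# Fragment of the Gateway's spec.listeners: only labelled namespaces may attach.
listeners:
- name: https
  port: 443
  protocol: HTTPS
  allowedRoutes:
    namespaces:
      from: Selector
      selector:
        matchLabels:
          gateway-access: external    # hypothetical label
---
# A namespace that opts in by carrying the matching label.
apiVersion: v1
kind: Namespace
metadata:
  name: app-team
  labels:
    gateway-access: external
```

With this in place, the platform team controls who can attach routes by controlling who can label namespaces.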

What Goes Wrong

The Gateway takes 3-5 minutes to provision because GCP is creating the underlying load balancer, health checks, and backend services. Check status:

kubectl describe gateway external-gateway -n gateway-infra

Look for Programmed: True in the status conditions.
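The http-redirect listener above only opens port 80; it doesn't redirect anything by itself. A minimal sketch of the missing piece, assuming the platform team owns it in gateway-infra, is an HTTPRoute bound to that listener with a RequestRedirect filter. The route name is hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-to-https                 # hypothetical name
  namespace: gateway-infra
spec:
  parentRefs:
  - name: external-gateway
    namespace: gateway-infra
    sectionName: http-redirect        # attach to the port 80 listener only
  rules:
  - filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https                 # send every HTTP request to HTTPS
```

Note that a parentRef without sectionName attaches to every listener that allows the route. Because the port 80 listener uses from: All, application routes should set sectionName: https in their parentRefs if they must never serve plain HTTP.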

Step 3: Create HTTPRoute for Application Routing

Application teams create HTTPRoutes in their own namespaces:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
  namespace: app-team
spec:
  parentRefs:
  - name: external-gateway
    namespace: gateway-infra
  hostnames:
  - "app.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: api-service
      port: 80
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: web-service
      port: 80

What Goes Wrong

If a backendRef names a Service that doesn't exist in the route's namespace, the HTTPRoute reports ResolvedRefs: False with reason BackendNotFound in its status. That rule won't serve traffic, but it also won't block other routes, which is cleaner than Ingress behaviour, where a missing backend can break the entire resource.

Step 4: Implement Canary Traffic Splitting

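Gateway API handles traffic splitting natively, with no annotations and no third-party controllers. The route below matches the same hostname and /api prefix as app-route from Step 3, so move the weighted backends into that rule (or remove the /api rule from app-route) rather than running both; under the Gateway API precedence rules, identical matches are resolved in favour of the older route: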

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-route
  namespace: app-team
spec:
  parentRefs:
  - name: external-gateway
    namespace: gateway-infra
  hostnames:
  - "app.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: api-stable
      port: 80
      weight: 90
    - name: api-canary
      port: 80
      weight: 10

Adjust weights and reapply — changes propagate in under a minute.
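As an illustration of a later rollout stage, the fragment below shifts the split to half and half. The stage itself is hypothetical; only the backendRefs of the /api rule change, and the rest of canary-route stays as it is.

```yaml
# Fragment: backendRefs of the /api rule at a later, hypothetical rollout stage.
backendRefs:
- name: api-stable
  port: 80
  weight: 50
- name: api-canary
  port: 80
  weight: 50
```

Weights are proportional rather than percentages, so they don't need to sum to 100, and a weight of 0 sends no traffic to that backend. Keeping them summing to 100 simply makes the manifest easier to read in review.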

Why This Matters

I've seen canary deployments implemented via Ingress weight annotations break silently during nginx controller upgrades. Gateway API traffic splitting is part of the spec, not an annotation extension. The behaviour is consistent across versions.

Common Mistakes

Using the wrong GatewayClass for internal traffic: gke-l7-global-external-managed creates internet-facing load balancers. For internal traffic, use gke-l7-rilb (regional internal Application Load Balancer).
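A minimal sketch of an internal Gateway follows. The Gateway name is hypothetical, and the sketch assumes the cluster's region already has the proxy-only subnet that regional internal Application Load Balancers require.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: internal-gateway              # hypothetical name
  namespace: gateway-infra
spec:
  gatewayClassName: gke-l7-rilb       # internal class instead of the external one
  listeners:
  - name: http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Same                    # only routes in gateway-infra may attach
```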

Skipping ReferenceGrant for cross-namespace backends: If your HTTPRoute references a Service in a different namespace, you need a ReferenceGrant in the target namespace. Without it, the reference is rejected: the route reports ResolvedRefs: False with reason RefNotPermitted, and that backend receives no traffic.
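A sketch of such a grant, assuming an HTTPRoute in app-team needs to reach a Service owned by another team. The shared-services namespace, the shared-api Service, and the grant name are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-app-team-routes         # hypothetical name
  namespace: shared-services          # hypothetical: the namespace that owns the Service
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    namespace: app-team               # namespace the references come from
  to:
  - group: ""                         # core API group, where Service lives
    kind: Service
    name: shared-api                  # hypothetical; omit to allow every Service here
```

The grant lives in the namespace being referenced, so the team that owns the Service decides who may route to it. The HTTPRoute's backendRef then needs namespace: shared-services alongside the Service name and port.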

Expecting immediate propagation: Provisioning a new Gateway takes 3-5 minutes, and even route updates are not instantaneous. Build this delay into your deployment pipelines.

Mixing Ingress and Gateway on the same cluster: This works technically, but creates confusion about which resource controls which traffic path. Pick one model per cluster.

Wrapping Up

Gateway API represents the Automation and Lifecycle stages of the SCALE framework in practice — infrastructure as code that separates concerns cleanly and reduces operational friction. If you're building a new GKE platform in 2025, Gateway API is the right starting point. If you're running Ingress in production, plan the migration now, especially if canary deployments or multi-domain TLS are on your roadmap.

Ingress will get you there eventually, with enough annotations. Gateway API gets you there cleanly.


Work with a GCP specialist: book a free discovery call at https://buoyantcloudtech.com

Amit Malhotra, Principal GCP Architect, Buoyant Cloud Inc

