GKE Gateway for load balancing creates wrong health check path


I'm setting up the Gateway API in a GKE cluster by following this tutorial and the docs.

The Gateway resource is working fine. It has created a Load Balancer resource on GCP and assigned a static IP address to it.

apiVersion: v1
kind: Namespace
metadata:
  name: infra-ns
  labels:
    shared-gateway-access: "true"
---
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: external-http
  namespace: infra-ns
spec:
  gatewayClassName: gke-l7-gxlb
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            shared-gateway-access: "true"

The HTTPRoute resource also appears to be working fine. It was recognized by the Gateway.

apiVersion: v1
kind: Namespace
metadata:
  name: other-namespace
  labels:
    shared-gateway-access: "true"
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
  name: myservice
  namespace: other-namespace
spec:
  ports:
    - name: myservice
      port: 80
      protocol: TCP
      targetPort: 8080
  selector:
    app: myservice
  type: NodePort
---
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: myhttproute
  namespace: other-namespace
spec:
  parentRefs:
  - kind: Gateway
    name: external-http
    namespace: infra-ns
  hostnames:
  - "demo.example.com"
  rules:
  - backendRefs:
    - name: myservice
      port: 80

I can see that a NEG was created for my service and that the load balancer refers to it as a Backend Service. The problem is that the health check reports my service as UNHEALTHY, even though I have tested the only pod behind it and it is healthy.

In the GCP console I see the wrong path for the health check. Editing it manually doesn't help, since the change gets reverted. I haven't found any way to configure this path in the HTTPRoute. In any case, I believe it shouldn't have to be configured: it should pick up the health path from the livenessProbe/readinessProbe already present in the Deployment manifest, which is /health, not /.


Am I doing something wrong? Why does the health check path differ from the one stated in the liveness and readiness probes?

$ kubectl -n other-namespace describe deployment myservice

...
  Containers:
   myservice:
    Image:      myserviceimage:latest
    Port:       8080/TCP
    Host Port:  0/TCP
    Requests:
      cpu:        1m
      memory:     1Mi
    Liveness:     http-get http://:8080/health delay=10s timeout=5s period=10s #success=1 #failure=3
    Readiness:    http-get http://:8080/health delay=2s timeout=3s period=2s #success=1 #failure=2
    ...

There are 2 best solutions below

Answer by boredabdel (best answer)

That's because the health check defaults to the same port and path that your app serves traffic on, which is why it probes / rather than /health.

If you need to customize the health check port, path, or other parameters, you need to use a HealthCheckPolicy: https://cloud.google.com/kubernetes-engine/docs/how-to/configure-gateway-resources#configure_health_check
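
As a rough sketch of what such a policy could look like for the setup in the question: the Service name, namespace, container port and /health path are taken from the question, while the policy name myservice-healthcheck is made up for illustration. The field names are assumed to match the networking.gke.io/v1 HealthCheckPolicy described on the linked page, so check them against the version your cluster runs.

```yaml
# Hypothetical policy name; everything else mirrors the question's manifests.
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: myservice-healthcheck
  namespace: other-namespace    # same namespace as the target Service
spec:
  default:
    config:
      type: HTTP
      httpHealthCheck:
        portSpecification: USE_FIXED_PORT
        port: 8080              # the container port the probes already use
        requestPath: /health    # instead of the default /
  targetRef:
    group: ""
    kind: Service
    name: myservice
```

The explicit port is probably redundant here, since the check should already target the serving port; it is included only to show where a port override would go.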

Answer by x-yuri

I believe this is the part of the documentation that specifically addresses your case:

GKE Gateway behaves differently than Ingress, in that Gateway does not infer health check parameters. If your Service does not return 200 for requests to GET /, or you have other tuned pod readiness checks, you need to configure a HealthCheckPolicy for your service.

You presumably need a HealthCheckPolicy along the lines of:

apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: NAME
  namespace: default  # use the namespace of the target Service (other-namespace in the question)
spec:
  default:
    logConfig:
      enabled: true
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health
  targetRef:
    group: ""
    kind: Service
    name: YOUR_SERVICE_NAME

But, for better or worse, attaching the policy changes the default values of these parameters (in case this matters):

checkIntervalSec: 15 -> 5
timeoutSec: 15 -> 5
healthyThreshold: 1 -> 2

You can override these values in the policy.
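
For instance, to keep the previous values listed above, the relevant part of the policy might look like the fragment below. This is only a sketch: it assumes these three fields sit directly under spec.default, next to config, and the values are simply the "before" numbers from the list.

```yaml
# Fragment of a HealthCheckPolicy; metadata and targetRef are omitted.
spec:
  default:
    checkIntervalSec: 15   # would otherwise become 5
    timeoutSec: 15         # would otherwise become 5; must not exceed checkIntervalSec
    healthyThreshold: 1    # would otherwise become 2
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health
```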

logConfig.enabled is optional. It turns on logging of the health check probes, which you can then inspect with jq:

$ gcloud logging read 'logName:healthchecks' \
    --freshness=10m \
    --project PROJECT_ID \
    --format=json \
  | jq '.[] | {
    targetIp: .jsonPayload.healthCheckProbeResult.targetIp,
    probeResultText: .jsonPayload.healthCheckProbeResult.probeResultText,
    detailedHealthState: .jsonPayload.healthCheckProbeResult.detailedHealthState,
    timestamp}'

Or with yq, which works on the default (YAML) output of gcloud:

$ gcloud logging read 'logName:healthchecks' \
    --freshness=10m \
    --project PROJECT_ID \
  | yq '{
    "targetIp": .jsonPayload.healthCheckProbeResult.targetIp,
    "probeResultText": .jsonPayload.healthCheckProbeResult.probeResultText,
    "detailedHealthState": .jsonPayload.healthCheckProbeResult.detailedHealthState,
    "timestamp": .timestamp}'