I'm stuck setting up KEDA in my private GKE cluster with the Google Pub/Sub scaler.
I have a deployment that I would like to scale based on the number of messages in a subscription. The deployment accesses Google resources (it polls a subscription) through Workload Identity.
My current setup looks like this.
My deployment (only the relevant parts):
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: delay-executor
  name: delay-executor
  namespace: workflows
spec:
  template:
    spec:
      serviceAccountName: delay-executor-sa
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 70
The service account YAML looks like this:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: delay-executor-sa@PROJECT_ID.iam.gserviceaccount.com
  name: delay-executor-sa
  namespace: workflows
This service account has the roles roles/pubsub.publisher, roles/pubsub.subscriber and roles/monitoring.viewer.
Now comes the KEDA part.
I thought I should use a TriggerAuthentication with podIdentity set to gcp. This is my setup:
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-gcp-credentials
  namespace: workflows
spec:
  podIdentity:
    provider: gcp
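As far as I understand, with podIdentity the scaler does not use the workload's service account. Instead it authenticates as the KEDA operator's own Kubernetes service account, which would then need its own Workload Identity annotation and at least roles/monitoring.viewer on the GCP side. A minimal sketch of what that might look like, assuming KEDA is installed in the keda namespace with the default keda-operator service account, and with keda-operator-gsa as a hypothetical GCP service account name:

```yaml
# Sketch only: the namespace, KSA name and GSA name are assumptions, not taken from my cluster.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator          # assumed default name from the KEDA install
  namespace: keda              # assumed install namespace
  annotations:
    # hypothetical GCP service account that the operator would impersonate
    iam.gke.io/gcp-service-account: keda-operator-gsa@PROJECT_ID.iam.gserviceaccount.com
```

This is an unverified sketch. The GCP service account would also need a roles/iam.workloadIdentityUser binding for that Kubernetes service account.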
And my ScaledObject YAML:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: delay-executor-pubsub-scaledobject
  namespace: workflows
spec:
  scaleTargetRef:
    name: delay-executor
  triggers:
    - type: gcp-pubsub
      authenticationRef:
        name: keda-trigger-auth-gcp-credentials
      metadata:
        subscriptionSize: "5"
        subscriptionName: "my-subscription"
After applying these settings, my deployment scales down to zero even though there are many messages in the subscription. Checking the logs of keda-operator, I found errors like this:
ERROR controller Reconciler error {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "name": "delay-executor-pubsub-scaledobject", "namespace": "workflows", "error": "error getting scaler for trigger #0: error parsing PubSub metadata: GoogleApplicationCredentials not found"}
and
ERROR gcp_pub_sub_scaler error getting Active Status {"error": "unexpected end of JSON input"}
The Pod itself works as expected and can pull from the subscription.
So there are two things here:
- I have an authentication issue, and I'm curious what a working setup looks like.
- Why is the deployment scaled down to zero when authentication is broken?
Thanks