We want to dynamically scale our AKS cluster based on the number of WebSocket connections.
We use Application Gateway V2 together with the Application Gateway Ingress Controller (AGIC) on AKS as the ingress.
I configured a HorizontalPodAutoscaler to scale the deployment based on memory consumption.
When I deploy the sample app to AKS, I can connect to the WebSocket endpoints and communicate. However, whenever a scale operation happens (pods added or removed), all clients lose their connections.
- How can I keep the existing connections alive when pods are added?
- How can I gracefully drain connections when pods are removed, so that existing clients are not affected? (A sketch of what I mean by draining follows below.)
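To make the second point more concrete, this is roughly the kind of pod-side draining I am thinking of. It is only an untested sketch; the preStop sleep and terminationGracePeriodSeconds values are arbitrary:

```yaml
# Untested sketch: keep the pod alive for a while after it is scheduled for
# removal, so that clients can be disconnected or migrated before the
# container receives SIGTERM. Both durations below are arbitrary guesses.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 90
      containers:
      - name: wssamplecontainer
        lifecycle:
          preStop:
            exec:
              command: ["sh", "-c", "sleep 60"]
```

I am not sure this helps at all if the resets are triggered on the gateway side rather than by the pods themselves.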
I tried activating cookie-based affinity on the Application Gateway, but that had no effect on the issue.
Below is the deployment I use for testing. It is based on this sample, modified a bit so that it allows specifying the number of connections and regularly sends ping messages to the server.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wssample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wssample
  template:
    metadata:
      labels:
        app: wssample
    spec:
      containers:
      - name: wssamplecontainer
        image: marxx/websocketssample:10
        resources:
          requests:
            memory: "100Mi"
            cpu: "50m"
          limits:
            memory: "150Mi"
            cpu: "100m"
        ports:
        - containerPort: 80
          name: wssample
---
apiVersion: v1
kind: Service
metadata:
  name: wssample-service
spec:
  ports:
  - port: 80
  selector:
    app: wssample
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: websocket-ingress
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    appgw.ingress.kubernetes.io/cookie-based-affinity: "true"
    appgw.ingress.kubernetes.io/connection-draining: "true"
    appgw.ingress.kubernetes.io/connection-draining-timeout: "60"
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: wssample-service
            port:
              number: 80
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: websocket-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wssample
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 50
```
Update:
- I am running on a 2-node cluster with the cluster autoscaler enabled to scale up to 4 nodes.
- There is still plenty of memory available on the nodes.
- At first I thought it was an issue with browsers and JavaScript, but I got the same results when I connected to the endpoint from a .NET Core-based console application (the WebSockets went to the 'Aborted' state after the scale operation).
Update 2:
I found a pattern. The problem also occurs without the HPA and can be reproduced with the following steps:
- Scale the deployment to 3 replicas.
- Connect 20 clients.
- Manually scale the deployment to 6 replicas with the kubectl scale command (e.g. `kubectl scale deployment wssample --replicas=6`).
- (The existing connections are still fine and the clients keep communicating with the backend.)
- Connect another 20 clients.
- After a few seconds, all of the existing connections are reset.
Update 3:
- The AKS cluster is using kubenet networking.
- The same issue occurs with Azure CNI networking, though.
I made a very unpleasant discovery. The outcome of this GitHub issue basically says that the behavior is by design: Application Gateway resets all WebSocket connections whenever any backend pool rules change (which happens during scale operations).
It is possible to vote for a feature request to keep those connections alive in these situations.