I am trying to set up the AWS X-Ray daemon as a DaemonSet in an EKS Kubernetes cluster.
The issue is that the "xray-daemon" pods fail to start with "CrashLoopBackOff" status.
When I describe the DaemonSet pods (the output below is from pod events, not container logs), the following error is shown:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 20s default-scheduler Successfully assigned default/xray-daemon-wcnzt to ip-172-34-169-37.ap-south-1.compute.internal
Normal Pulling 15s (x2 over 19s) kubelet Pulling image "amazon/aws-xray-daemon:latest"
Normal Pulled 12s (x2 over 16s) kubelet Successfully pulled image "amazon/aws-xray-daemon:latest"
Normal Created 12s (x2 over 16s) kubelet Created container xray-daemon
Warning Failed 12s (x2 over 16s) kubelet Error: failed to start container "xray-daemon": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "/usr/bin/xray": stat /usr/bin/xray: no such file or directory: unknown
Steps to Reproduce:
First, I create an IAM service account with the AWSXRayDaemonWriteAccess policy attached:
eksctl create iamserviceaccount \
--name xray-daemon \
--namespace default \
--cluster eksdemo1 \
--attach-policy-arn arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess \
--approve \
--override-existing-serviceaccounts
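Before applying the manifest, it can help to confirm that eksctl actually annotated the service account with the role ARN. A small sanity-check sketch (the ARN below is the example from this post; yours will differ):

```shell
# On a live cluster, the annotation can be read with:
#   kubectl get sa xray-daemon -n default \
#     -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
# Offline sanity check of the ARN format, using the example ARN from this post:
ROLE_ARN="arn:aws:iam::12345678999:role/eksctl-eksdemo1-addon-iamserviceaccount-defa-Role1-1FIM8S2K6D404"
case "$ROLE_ARN" in
  arn:aws:iam::*:role/*) echo "role ARN format OK" ;;
  *) echo "role ARN format unexpected" ;;
esac
```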
Then I try to create the X-Ray DaemonSet with this "xray-k8s-daemonset.yml" file:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: xray-daemon
  name: xray-daemon
  namespace: default
  # Update IAM Role ARN created for X-Ray access
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::12345678999:role/eksctl-eksdemo1-addon-iamserviceaccount-defa-Role1-1FIM8S2K6D404
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: xray-daemon
  namespace: default
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: xray-daemon
  template:
    metadata:
      labels:
        app: xray-daemon
    spec:
      serviceAccountName: xray-daemon
      volumes:
      - name: config-volume
        configMap:
          name: "xray-config"
      containers:
      - name: xray-daemon
        image: amazon/aws-xray-daemon
        command: ["/usr/bin/xray", "-c", "/aws/xray/config.yaml"]
        resources:
          requests:
            cpu: 256m
            memory: 32Mi
          limits:
            cpu: 512m
            memory: 64Mi
        ports:
        - name: xray-ingest
          containerPort: 2000
          hostPort: 2000
          protocol: UDP
        - name: xray-tcp
          containerPort: 2000
          hostPort: 2000
          protocol: TCP
        volumeMounts:
        - name: config-volume
          mountPath: /aws/xray
          readOnly: true
---
# Configuration for AWS X-Ray daemon
apiVersion: v1
kind: ConfigMap
metadata:
  name: xray-config
  namespace: default
data:
  config.yaml: |-
    TotalBufferSizeMB: 24
    Socket:
      UDPAddress: "0.0.0.0:2000"
      TCPAddress: "0.0.0.0:2000"
    Version: 2
---
# k8s service definition for AWS X-Ray daemon headless service
apiVersion: v1
kind: Service
metadata:
  name: xray-service
  namespace: default
spec:
  selector:
    app: xray-daemon
  clusterIP: None
  ports:
  - name: xray-ingest
    port: 2000
    protocol: UDP
  - name: xray-tcp
    port: 2000
    protocol: TCP
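For completeness, the manifest above is applied and checked roughly like this (these commands need a working kubeconfig for the cluster, so this is a sketch rather than something verified here):

```shell
# Apply the manifest and wait for the DaemonSet to roll out
kubectl apply -f xray-k8s-daemonset.yml
kubectl rollout status daemonset/xray-daemon -n default --timeout=60s
# On failure, the events shown at the top of this post come from:
kubectl describe pod -n default -l app=xray-daemon
```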
The role ARN is correct, I'm sure of that (I checked the ARN in the AWS Console).
The issue was fixed by using the "amazon/aws-xray-daemon:3.2.0" Docker image.
It seems that the pod template in "xray-k8s-daemonset.yml" no longer works with the latest version of the "amazon/aws-xray-daemon" image.
Starting from image version 3.3.0 there are changes that break the X-Ray DaemonSet deployment in "xray-k8s-daemonset.yml"; judging by the error above ("stat /usr/bin/xray: no such file or directory"), the newer images no longer ship the binary at "/usr/bin/xray", which is the path the pod template's command points to.
So I replaced the image "amazon/aws-xray-daemon" with "amazon/aws-xray-daemon:3.2.0" in "xray-k8s-daemonset.yml".
Version 3.2.0 works correctly, so the problem is fixed.
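The change amounts to pinning the tag in the container spec of "xray-k8s-daemonset.yml" (everything else stays as above):

```yaml
containers:
- name: xray-daemon
  # Pin to 3.2.0: the implicit :latest tag now pulls 3.3.0+, where the
  # container fails with "stat /usr/bin/xray: no such file or directory"
  image: amazon/aws-xray-daemon:3.2.0
  command: ["/usr/bin/xray", "-c", "/aws/xray/config.yaml"]
```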
P.S. It is still not clear how to make the X-Ray DaemonSet work correctly with the latest version of the "amazon/aws-xray-daemon" image, so other solutions are still welcome.