AWS Xray DaemonSet Pod error: failed to start container "xray-daemon": Error response from daemon: OCI

I am trying to set up the AWS X-Ray DaemonSet in an EKS Kubernetes cluster.

The issue is that the "xray-daemon" pods fail to start and end up in "CrashLoopBackOff" status.

When I describe one of the DaemonSet pods, the following events are shown:

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  20s                default-scheduler  Successfully assigned default/xray-daemon-wcnzt to ip-172-34-169-37.ap-south-1.compute.internal
  Normal   Pulling    15s (x2 over 19s)  kubelet            Pulling image "amazon/aws-xray-daemon:latest"
  Normal   Pulled     12s (x2 over 16s)  kubelet            Successfully pulled image "amazon/aws-xray-daemon:latest"
  Normal   Created    12s (x2 over 16s)  kubelet            Created container xray-daemon
  Warning  Failed     12s (x2 over 16s)  kubelet            Error: failed to start container "xray-daemon": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "/usr/bin/xray": stat /usr/bin/xray: no such file or directory: unknown

Steps to Reproduce:

First, I create an IAM service account with the AWSXRayDaemonWriteAccess policy attached:

eksctl create iamserviceaccount \
    --name xray-daemon \
    --namespace default \
    --cluster eksdemo1 \
    --attach-policy-arn arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess \
    --approve \
    --override-existing-serviceaccounts

Then I try to create the X-Ray DaemonSet with this "xray-k8s-daemonset.yml" file:

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: xray-daemon
  name: xray-daemon
  namespace: default
  # Update IAM Role ARN created for X-Ray access
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::12345678999:role/eksctl-eksdemo1-addon-iamserviceaccount-defa-Role1-1FIM8S2K6D404
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: xray-daemon
  namespace: default
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: xray-daemon
  template:
    metadata:
      labels:
        app: xray-daemon
    spec:
      serviceAccountName: xray-daemon
      volumes:
        - name: config-volume
          configMap:
            name: "xray-config"
      containers:
        - name: xray-daemon
          image: amazon/aws-xray-daemon
          command: ["/usr/bin/xray", "-c", "/aws/xray/config.yaml"]
          resources:
            requests:
              cpu: 256m
              memory: 32Mi
            limits:
              cpu: 512m
              memory: 64Mi
          ports:
            - name: xray-ingest
              containerPort: 2000
              hostPort: 2000
              protocol: UDP
            - name: xray-tcp
              containerPort: 2000
              hostPort: 2000
              protocol: TCP
          volumeMounts:
            - name: config-volume
              mountPath: /aws/xray
              readOnly: true
---
# Configuration for AWS X-Ray daemon
apiVersion: v1
kind: ConfigMap
metadata:
  name: xray-config
  namespace: default
data:
  config.yaml: |-
    TotalBufferSizeMB: 24
    Socket:
      UDPAddress: "0.0.0.0:2000"
      TCPAddress: "0.0.0.0:2000"
    Version: 2
---
# k8s service definition for AWS X-Ray daemon headless service
apiVersion: v1
kind: Service
metadata:
  name: xray-service
  namespace: default
spec:
  selector:
    app: xray-daemon
  clusterIP: None
  ports:
    - name: xray-ingest
      port: 2000
      protocol: UDP
    - name: xray-tcp
      port: 2000
      protocol: TCP

I am sure the role ARN is correct (I checked it in the AWS Console).

There is 1 solution below

The issue was fixed by using the "amazon/aws-xray-daemon:3.2.0" Docker image.

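It seems that the pod template in "xray-k8s-daemonset.yml" no longer works with the latest version of the "amazon/aws-xray-daemon" image.

Starting with image version 3.3.0, something changed in the image that breaks the DaemonSet deployment defined in "xray-k8s-daemonset.yml". Judging by the error, the "/usr/bin/xray" binary that the "command" field points to no longer exists at that path in the newer images.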

So I replaced the image "amazon/aws-xray-daemon" with "amazon/aws-xray-daemon:3.2.0" in "xray-k8s-daemonset.yml".

Version 3.2.0 works correctly, so the problem was fixed.
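
The change amounts to a single line in the container spec of the DaemonSet. As a minimal fragment (everything else in the manifest from the question stays unchanged), it looks roughly like this:

```yaml
# Fragment of the DaemonSet pod template in xray-k8s-daemonset.yml
containers:
  - name: xray-daemon
    # Pin the tag explicitly; without a tag the image resolves to "latest"
    image: amazon/aws-xray-daemon:3.2.0
    # Unchanged: this path exists in the 3.2.0 image
    command: ["/usr/bin/xray", "-c", "/aws/xray/config.yaml"]
```

Pinning a specific tag also keeps a later change to the "latest" image from breaking the deployment again.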

P.S. It is still not clear how to make the X-Ray DaemonSet work correctly with the latest version of the "amazon/aws-xray-daemon" image, so other solutions are still welcome.
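
One avenue that might be worth trying with newer images is sketched below. It is untested and not a confirmed fix. The error only complains that "/usr/bin/xray" does not exist, so the hard-coded "command" is the part that fails. When "command" is omitted, Kubernetes runs the image's own ENTRYPOINT and passes "args" to it. The sketch assumes that the image defines an ENTRYPOINT that starts the daemon and that the daemon accepts the "-c" option in that position; neither assumption has been verified here.

```yaml
# Untested sketch: container spec without a hard-coded binary path
containers:
  - name: xray-daemon
    image: amazon/aws-xray-daemon   # no tag, so this resolves to "latest"
    # No "command:" field, so the image's own ENTRYPOINT is used.
    # Assumption: that ENTRYPOINT starts the daemon and accepts "-c <file>".
    args: ["-c", "/aws/xray/config.yaml"]
    volumeMounts:
      - name: config-volume
        mountPath: /aws/xray
        readOnly: true
```

If this works, the manifest no longer depends on where the binary lives inside the image.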