I am attempting to get a kube-thanos (https://github.com/thanos-io/kube-thanos) implementation working in an AWS EKS cluster.
I am implementing a "remote write" setup, with S3 integration, thanos-receive and thanos-store, with no sidecar for Prometheus.
Everything seems to come up fine, but the thanos-store pod keeps crashing with err="bucket store initial sync: sync block: BaseFetcher: iter bucket: Access Denied" log messages.
I am attempting to use the AWS IRSA (IAM Roles for Service Accounts) method to give the thanos-store pod access to S3.
I have a "thanos" role with the required S3 permissions and the role is properly annotated on the thanos-store service account.
The --objstore.config=$(OBJSTORE_CONFIG) flag points to a Kubernetes secret that is created from this YAML:
    type: S3
    config:
      bucket: gd9-thanos
      endpoint: s3.us-east-2.amazonaws.com
When the thanos-store pod comes up (before it crashes) it looks like it has all the environment variables needed to make the IRSA work:
    - name: AWS_STS_REGIONAL_ENDPOINTS
      value: regional
    - name: AWS_DEFAULT_REGION
      value: us-east-2
    - name: AWS_REGION
      value: us-east-2
    - name: AWS_ROLE_ARN
      value: arn:aws:iam::xxxxxxxxxxx:role/thanos
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
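One check that goes a step beyond the environment variables is to decode the projected token itself and compare its claims with the conditions in the role's trust policy. The following is only a minimal sketch: it assumes Python is available wherever the token can be read (for example a debug pod using the same service account, or a copy of the token file), and the helper name decode_jwt_claims is made up for illustration. It does not verify the signature; it only prints the claims.

```python
import base64
import json
import os

# Path taken from the AWS_WEB_IDENTITY_TOKEN_FILE value listed above.
TOKEN_PATH = "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"


def decode_jwt_claims(token: str) -> dict:
    """Return the unverified claims from the payload segment of a JWT."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


if __name__ == "__main__":
    if os.path.exists(TOKEN_PATH):
        with open(TOKEN_PATH) as fh:
            claims = decode_jwt_claims(fh.read())
        # These are the claims an IRSA trust policy usually conditions on.
        for key in ("iss", "sub", "aud", "exp"):
            print(f"{key}: {claims.get(key)}")
    else:
        print(f"token file not found: {TOKEN_PATH}")
```

For IRSA, the sub claim normally has the form system:serviceaccount:<namespace>:<service-account-name> (here presumably system:serviceaccount:monitoring:thanos-store), aud normally contains sts.amazonaws.com, and iss is the cluster's OIDC issuer URL. If any of these differ from what the trust policy on the "thanos" role expects, the role cannot be assumed even though the annotation and the environment variables look correct.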
I have tried a few of the suggestions here but nothing seems to work.
Can anyone suggest how to further troubleshoot?
    k get pods -n monitoring | grep thanos
    thanos-query-75f5bbf7c-62528             1/1   Running            0             4h10m
    thanos-receive-ingestor-default-0        1/1   Running            0             4h10m
    thanos-receive-router-76576bf5cb-ld6jh   1/1   Running            0             4h10m
    thanos-store-0                           0/1   CrashLoopBackOff   8 (44s ago)   21m
Thanks for any suggestions!
The problem might be related to IMDSv2 being enabled on your worker nodes. See https://github.com/thanos-io/thanos/issues/3143
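To see how the instance metadata service behaves from where the workload runs, a small probe can request an IMDSv2 session token. This is a rough sketch under some assumptions: Python is available in the place you run it (the thanos container itself may not have it, so a debug pod on the same node is one option), and the function names build_token_request and probe_imds are made up for illustration. The address and header are the standard IMDSv2 ones.

```python
import urllib.error
import urllib.request

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 60) -> urllib.request.Request:
    """Build the IMDSv2 session-token request, which is an HTTP PUT."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def probe_imds(timeout: float = 2.0) -> str:
    """Report whether an IMDSv2 token can be obtained from this network namespace."""
    try:
        with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
            return f"IMDSv2 token obtained (HTTP {resp.status})"
    except urllib.error.HTTPError as exc:
        # IMDS is reachable but rejected the request.
        return f"IMDS answered with HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Includes timeouts, e.g. when the PUT response never reaches the pod.
        return f"IMDS unreachable: {exc}"


if __name__ == "__main__":
    print(probe_imds())
```

If the probe times out inside a pod but succeeds on the node itself, that would be consistent with the instance's metadata hop limit being 1, which prevents pods without host networking from receiving the token response. With IRSA working as intended, the credentials should come from the web identity token rather than from IMDS, so this probe mainly helps tell whether the client is falling back to the node's metadata endpoint.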