I am working with EKS 1.24 and have created two node groups: groupA and groupB. GroupB has the taint "dedicated:druid:NoSchedule", but pods without the toleration "dedicated Equal druid NoSchedule" are also being scheduled to groupB. What could be the reason?
My expectation is that only pods with the toleration "dedicated Equal druid NoSchedule" are scheduled to groupB.
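To make the expected behavior concrete, here is a minimal Python sketch of how a hard taint is matched against a pod's tolerations. It is a simplified model and not the real scheduler code: the names `Taint`, `Toleration`, `tolerates`, and `schedulable` are hypothetical, and it assumes the taint above means key "dedicated", value "druid", effect "NoSchedule".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str      # "Equal" or "Exists"
    value: str = ""
    effect: str = ""   # an empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        # An empty key with Exists matches any taint key.
        return tol.key == "" or tol.key == taint.key
    # "Equal": both key and value must match.
    return tol.key == taint.key and tol.value == taint.value


def schedulable(node_taints, pod_tolerations) -> bool:
    """A pod fits only if every hard taint on the node is tolerated."""
    for taint in node_taints:
        if taint.effect not in ("NoSchedule", "NoExecute"):
            continue  # soft taints (PreferNoSchedule) do not block placement
        if not any(tolerates(t, taint) for t in pod_tolerations):
            return False
    return True


druid_taint = Taint("dedicated", "druid", "NoSchedule")
druid_toleration = Toleration("dedicated", "Equal", "druid", "NoSchedule")

print(schedulable([druid_taint], []))                  # pod without toleration
print(schedulable([druid_taint], [druid_toleration]))  # pod with toleration
```

Under this model, a pod without the toleration should never fit on a groupB node as long as the taint is actually present on the node at scheduling time.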
I ran into the same problem again in production. After I restarted all the pods several times, they all ended up back on the correct worker nodes.
Then I noticed something odd: every time I found pods on the wrong worker nodes, the pods and those nodes had been created at almost the same time.
So my guess is that if pods and worker nodes start at the same time, a pod may be scheduled before EKS has applied the taint to the new worker node, and so it can land on a node whose taint it does not tolerate.
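The guess above can be illustrated with a toy Python simulation. This is only a sketch of the suspected race, not a reproduction of what EKS does internally: the `Node` class, the `place` helper, and the node names are hypothetical, and the assumption that a node can register before its taint is applied is exactly the hypothesis being illustrated.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    taints: set = field(default_factory=set)


def can_schedule(node: Node, tolerations: set) -> bool:
    # Simplified rule: every taint on the node must be tolerated by the pod.
    return node.taints <= tolerations


def place(tolerations: set, nodes: list) -> list:
    """Return the names of all nodes the pod could be placed on right now."""
    return [n.name for n in nodes if can_schedule(n, tolerations)]


DRUID = ("dedicated", "druid", "NoSchedule")

node_a = Node("groupA-node")
node_b = Node("groupB-node")   # assumption: registered, but taint not applied yet
plain_pod = set()              # a pod with no tolerations

# Scheduling attempt during the window before the taint exists on the node.
before = place(plain_pod, [node_a, node_b])

# The taint arrives a moment later.
node_b.taints.add(DRUID)
after = place(plain_pod, [node_a, node_b])

print(before)  # ['groupA-node', 'groupB-node']
print(after)   # ['groupA-node']
```

A NoSchedule taint only affects new scheduling decisions and does not evict pods that are already running, so a pod placed during that window would stay on the groupB node until it is recreated. That is consistent with restarts eventually moving the pods to the correct nodes.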
I tried a few things to solve this problem, and they work in my environment:
I hope this information helps you resolve your issue.