Prevent Amazon ECS from replacing tasks ("replaced 1 tasks due to an unhealthy status")


My backend runs on AWS Fargate. It is critical that it keeps running without interruption at certain times, because I have a long-running task (1.5 hours). Twice in the past week, AWS replaced a task due to an unhealthy status, which broke the long-running process. Looking at the health monitor, I noticed that memory consumption was higher than usual: around 72% the first time and 65% the second time. The odd part is that the first time AWS replaced the task after 12 days of high consumption, and the second time after just 5 minutes.

  1. Can this behavior be prevented, and where do I find the setting? Everything I found by searching was about deployments, not runtime.
  2. Does this happen because of the memory consumption? I can't find any additional information on the health status or a more detailed reason for the replacement. Can I enable more logging somewhere, so that I at least know more if it happens again?
  3. If this behavior can't be prevented, can I specify time ranges during which the task must keep running, so that it is replaced later?

I'm a backend developer and am investigating the memory leak now. I have limited DevOps knowledge, so I would appreciate it if you could point me in the right direction or propose a workaround. I've already created alerts for high memory/CPU consumption.
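
Regarding question 2, one place that may hold more detail is the stopped task's own record, which the ECS DescribeTasks API returns with fields such as `stopCode`, `stoppedReason`, and per-container `exitCode` and `reason`. The following is only a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The function names and the cluster name passed in are hypothetical, and ECS keeps stopped tasks available for a limited time only, so this would need to run soon after a replacement.

```python
from typing import Any


def summarize_stopped_tasks(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Pick the stop-related fields out of an ECS DescribeTasks response."""
    summaries = []
    for task in response.get("tasks", []):
        summaries.append({
            "taskArn": task.get("taskArn"),
            "stopCode": task.get("stopCode"),
            "stoppedReason": task.get("stoppedReason"),
            "healthStatus": task.get("healthStatus"),
            # Per-container details: exit code and reason, if ECS recorded them.
            "containers": [
                {
                    "name": c.get("name"),
                    "exitCode": c.get("exitCode"),
                    "reason": c.get("reason"),
                    "healthStatus": c.get("healthStatus"),
                }
                for c in task.get("containers", [])
            ],
        })
    return summaries


def fetch_stopped_task_summaries(cluster: str) -> list[dict[str, Any]]:
    """Hypothetical helper: look up recently stopped tasks in one cluster."""
    import boto3  # imported lazily so the helper above works without AWS access

    ecs = boto3.client("ecs")
    # Only the first page of results is read here; that is enough for a sketch.
    arns = ecs.list_tasks(cluster=cluster, desiredStatus="STOPPED")["taskArns"]
    if not arns:
        return []
    return summarize_stopped_tasks(ecs.describe_tasks(cluster=cluster, tasks=arns))
```

Calling `fetch_stopped_task_summaries("my-cluster")` (cluster name is a placeholder) would return one dictionary per stopped task. Whether the recorded reason points to a failed health check or to memory pressure depends on what ECS stored for that task.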

