Amazon CloudWatch - repeat action when alarm is raised

I'm using the Amazon CloudWatch service to manage a group of EC2 instances. More precisely, I trigger Auto Scaling actions when an alarm changes from the OK state to the ALARM state.

Consider the following example: there is a scale-in action that terminates one instance when the SQS queue length has been less than 1 for 5 consecutive minutes. Imagine there are 5 instances running; the alarm is raised and we are left with 4 running instances. But I want CloudWatch to keep executing my action for as long as the alarm stays in the ALARM state! I want to go down to 3, then 2, then only one instance when there is nothing to process.

I tried another approach: resetting the alarm state to INSUFFICIENT_DATA right after the auto-scaling action. This way I can effectively downscale my pool to one instance, but then the whole system gets stuck in an 'infinite loop': I change the state to INSUFFICIENT_DATA, Amazon immediately raises the alarm again, I change the state again, and so on.

So I want one of two things: either Amazon repeats my alarm action while the alarm is in the ALARM state, or there is some alarm cooldown period that prevents Amazon from raising the alarm again immediately after the change of state.

Please help me find the correct approach for my problem.

There is 1 best solution below

When an alarm is triggered, Auto Scaling scales according to your auto-scaling policy. However, it also locks the auto-scaling group so that it won't accept any other scaling request while that scaling activity is in progress.

Once the resources are provisioned or de-provisioned, the auto-scaling cool-down period starts, during which the group does not act on any other CloudWatch triggers. Once the cool-down period is over, it is ready to accept new scaling requests from CloudWatch alarms.
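To make the effect of the cool-down concrete, here is a small toy simulation in Python. It is only a sketch of the behaviour described above, not how AWS implements it. It assumes the alarm stays in the ALARM state and re-sends the scale-in request every `alarm_period_s` seconds, and it treats each scaling activity as instantaneous. The function name and all parameters are made up for illustration.

```python
def simulate_scale_in(start_instances, min_instances, cooldown_s,
                      alarm_period_s=60, duration_s=1200):
    """Return (time_s, instance_count) pairs, one per scale-in that was performed.

    Toy model: the alarm is assumed to request a scale-in of one instance
    every alarm_period_s seconds; requests arriving inside the cool-down
    window, or once min_instances is reached, are ignored.
    """
    instances = start_instances
    last_scaled_at = None  # time of the most recent scale-in, if any
    events = []
    for t in range(0, duration_s + 1, alarm_period_s):
        in_cooldown = (last_scaled_at is not None
                       and t - last_scaled_at < cooldown_s)
        if in_cooldown or instances <= min_instances:
            continue  # request is dropped
        instances -= 1
        last_scaled_at = t
        events.append((t, instances))
    return events


if __name__ == "__main__":
    print("cooldown 300s:", simulate_scale_in(5, 1, cooldown_s=300))
    print("cooldown   0s:", simulate_scale_in(5, 1, cooldown_s=0))
```

In this model, a 300-second cool-down lets the group shrink by one instance roughly every five minutes, while a zero cool-down lets it shrink on every alarm request until the minimum is reached.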

This whole process is explained in detail here.

So....

What you essentially want is for auto-scaling to respond to CloudWatch alarms non-stop. In other words, you do not want a cool-down period, or you want the cool-down period to be zero. The default cool-down period is 300 seconds. You can configure your auto-scaling policy with a zero cool-down period so that it reduces the number of instances from 5 to 4 to 3 to 2 to 1, provided that the alarm remains active during that time span.

Click here for the command that configures the cool-down period.
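If you prefer to do this from code, the following is a minimal sketch using boto3's `put_scaling_policy` call for a simple scaling policy. The group name `my-asg`, the policy name and the helper function names are placeholders I made up; the sketch also assumes boto3 is installed and AWS credentials are configured. Treat it as a starting point under those assumptions, not as the one correct setup.

```python
def scale_in_policy_params(group_name, policy_name="scale-in-by-one",
                           cooldown_s=0):
    """Build the arguments for a simple scaling policy that removes one instance.

    group_name and policy_name are placeholders; cooldown_s=0 disables the
    policy-level cool-down as discussed above.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": -1,   # terminate one instance per invocation
        "Cooldown": cooldown_s,    # seconds to wait before the next scaling activity
    }


def apply_policy(params):
    """Create or update the policy and return its ARN (needs AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3
    client = boto3.client("autoscaling")
    response = client.put_scaling_policy(**params)
    return response["PolicyARN"]
```

You would call `apply_policy(scale_in_policy_params("my-asg"))` and then attach the returned policy ARN as the alarm action of your CloudWatch alarm.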

What are the implications of setting the cool-down to zero? I do not know, but technically and in theory this is what you are looking for.