aws command line interface - aws ec2 wait - Max attempts exceeded


I am working on a shell script which does the following:

  • creates snapshot of EBS Volume;
  • creates AMI image based on this snapshot.

1) I use the following command to create the snapshot:
SNAPSHOT_ID=$(aws ec2 create-snapshot "${DRYRUN}" --volume-id "${ROOT_VOLUME_ID}" --description "${SNAPSHOT_DESCRIPTION}" --query 'SnapshotId')

2) I use a waiter to wait for the completed state:
aws ec2 wait snapshot-completed --snapshot-ids "${SNAPSHOT_ID}"

When I test it with an 8 GB EBS volume, everything goes well.
When the volume is 40 GB, I get an error:
Waiter SnapshotCompleted failed: Max attempts exceeded

Probably the 40 GB snapshot takes more time than the 8 GB one, and I just need to wait longer.

The AWS docs (http://docs.aws.amazon.com/cli/latest/reference/ec2/wait/snapshot-completed.html) don't list any option for a timeout or a number of attempts.

Maybe some of you have faced the same issue?


There are 6 best solutions below

0
On BEST ANSWER

So, finally, I used the following approach to solve it:

  1. Create the snapshot
  2. Use a loop to check the exit status of the command aws ec2 wait snapshot-completed
  3. If the exit status is not 0, print the current state and progress, then run the waiter again.

# Create snapshot (--output text returns the bare ID, without JSON quotes)
SNAPSHOT_DESCRIPTION="Snapshot of Primary frontend instance $(date +%Y-%m-%d)"
SNAPSHOT_ID=$(aws ec2 create-snapshot "${DRYRUN}" --volume-id "${ROOT_VOLUME_ID}" --description "${SNAPSHOT_DESCRIPTION}" --query 'SnapshotId' --output text)

# Re-run the waiter until it exits with status 0
exit_status=1
while [ "${exit_status}" != "0" ]
do
    # Report the current state and progress before each waiter run
    SNAPSHOT_STATE="$(aws ec2 describe-snapshots --filters "Name=snapshot-id,Values=${SNAPSHOT_ID}" --query 'Snapshots[0].State' --output text)"
    SNAPSHOT_PROGRESS="$(aws ec2 describe-snapshots --filters "Name=snapshot-id,Values=${SNAPSHOT_ID}" --query 'Snapshots[0].Progress' --output text)"
    # Progress already carries its own percent sign, e.g. "45%"
    echo "### Snapshot id ${SNAPSHOT_ID} creation: state is ${SNAPSHOT_STATE}, ${SNAPSHOT_PROGRESS}..."

    aws ec2 wait snapshot-completed --snapshot-ids "${SNAPSHOT_ID}"
    exit_status="$?"

done

If you have something that can improve it, please share with us.
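One possible refinement, offered as a sketch rather than as the method used above: cap the number of times the waiter is re-run, so the script cannot loop forever if the snapshot never completes (for example, if it ends up in the error state). The function name retry_waiter and the round count of 6 are assumptions made for illustration. According to the CLI reference, the snapshot-completed waiter polls every 15 seconds and gives up after 40 failed checks, so each round lasts roughly 10 minutes.

```shell
# Hypothetical helper: run a command (for example an `aws ... wait` call)
# up to max_rounds times and stop at the first success.
# Returns 0 if some round succeeded, 1 if every round failed.
retry_waiter() {
    local max_rounds="$1"
    shift
    local round=1
    while [ "${round}" -le "${max_rounds}" ]; do
        # "$@" is the command to run, passed through unchanged
        if "$@"; then
            return 0
        fi
        echo "waiter round ${round}/${max_rounds} failed" >&2
        round=$((round + 1))
    done
    return 1
}

# Example usage (assumes SNAPSHOT_ID is already set):
# retry_waiter 6 aws ec2 wait snapshot-completed --snapshot-ids "${SNAPSHOT_ID}"
```

With 6 rounds the script would give up after about an hour instead of waiting indefinitely; the right cap depends on how large your volumes are.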

1
On

https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-retries.html

You can set an environment variable or use the config file to increase the number of retry attempts.

AWS_MAX_ATTEMPTS=100

~/.aws/config
[default]
retry_mode = standard
max_attempts = 6

0
On

aws ec2 wait snapshot-completed takes a while to time out. This snippet uses aws ec2 describe-snapshots to get the progress. When it's 100% it calls snapshot-completed.

# create snapshot
SNAPSHOTID=$(aws ec2 create-snapshot --volume-id "$VOLUMEID" --output text --query "SnapshotId")
echo "Waiting for Snapshot ID: $SNAPSHOTID"

SNAPSHOTPROGRESS=$(aws ec2 describe-snapshots --snapshot-ids "$SNAPSHOTID" --query "Snapshots[*].Progress" --output text)

# quote the variable so the test does not break if the progress is empty
while [ "$SNAPSHOTPROGRESS" != "100%" ]
do
  sleep 15
  echo "Snapshot ID: $SNAPSHOTID $SNAPSHOTPROGRESS"
  SNAPSHOTPROGRESS=$(aws ec2 describe-snapshots --snapshot-ids "$SNAPSHOTID" --query "Snapshots[*].Progress" --output text)
done

aws ec2 wait snapshot-completed --snapshot-ids "$SNAPSHOTID"

This is essentially the same thing as above, but prints out a progress message every 15 seconds. Snapshots that are completed return 100% immediately.
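A loop that only watches the progress value keeps running if the snapshot fails before reaching 100%. As a hedged sketch (not part of the answer above), the variant below also checks the snapshot state and stops after a deadline. The function name poll_snapshot, its return codes, and the default timeout of 3600 seconds are assumptions chosen for illustration; "completed" and "error" are snapshot states reported by describe-snapshots.

```shell
# Hypothetical helper: poll a snapshot until it completes, fails, or a deadline passes.
# Returns 0 on "completed", 1 on "error", 2 on timeout.
poll_snapshot() {
    local snapshot_id="$1"
    local timeout_seconds="${2:-3600}"
    local interval_seconds="${3:-15}"
    local waited=0
    local result
    while [ "${waited}" -lt "${timeout_seconds}" ]; do
        # One call returns state and progress on a single line of text output
        result=$(aws ec2 describe-snapshots --snapshot-ids "${snapshot_id}" \
            --query 'Snapshots[0].[State,Progress]' --output text)
        echo "Snapshot ${snapshot_id}: ${result}" >&2
        case "${result}" in
            completed*) return 0 ;;
            error*)     return 1 ;;
        esac
        sleep "${interval_seconds}"
        waited=$((waited + interval_seconds))
    done
    return 2
}

# Example usage (assumes SNAPSHOTID is already set):
# poll_snapshot "$SNAPSHOTID" 7200 15 || echo "snapshot did not complete"
```

The elapsed time is counted from the sleep intervals only, so the real deadline is slightly later than timeout_seconds once the API calls are included.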

0
On

You should probably use until in bash; it looks a bit cleaner and you don't have to repeat the command.

echo "waiting for snapshot $snapshot"
# keep re-running the waiter until it succeeds, printing progress between runs
until aws ec2 wait snapshot-completed --snapshot-ids "$snapshot" 2>/dev/null
do
    printf "\rsnapshot progress: %s" "$progress"
    sleep 10
    progress=$(aws ec2 describe-snapshots --snapshot-ids "$snapshot" --query "Snapshots[*].Progress" --output text)
done

0
On

ISSUE: In CI/CD we had a command that waits for an ECS service to become stable, and it failed with this error:

aws ecs wait services-stable \
    --cluster MyCluster \
    --services MyService

ERROR MSG : Waiter ServicesStable failed: Max attempts exceeded

FIX

To fix this issue we followed this doc: https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/load-balancer-healthcheck.html

aws elbv2 modify-target-group --target-group-arn <arn of target group> --healthy-threshold-count 2 --health-check-interval-seconds 5 --health-check-timeout-seconds 4 

We also followed this doc: https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/load-balancer-connection-draining.html

aws elbv2 modify-target-group-attributes --target-group-arn <arn of target group> --attributes Key=deregistration_delay.timeout_seconds,Value=10

This fixed the issue.

If you have more target groups to edit, write the target group ARNs to a file and run these commands in a loop.
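The loop mentioned above could look roughly like this. It is a minimal sketch, assuming a plain-text file with one target group ARN per line; the function name update_target_groups and the file name target-groups.txt are made up for illustration, while the two aws elbv2 commands and their values are the ones from this answer.

```shell
# Hypothetical helper: apply the health check and deregistration settings
# to every target group ARN listed in a file (one ARN per line).
update_target_groups() {
    local arn_file="$1"
    local arn
    while IFS= read -r arn; do
        # Skip blank lines
        [ -n "${arn}" ] || continue
        # </dev/null keeps the aws calls from consuming the loop's input
        aws elbv2 modify-target-group --target-group-arn "${arn}" \
            --healthy-threshold-count 2 --health-check-interval-seconds 5 \
            --health-check-timeout-seconds 4 < /dev/null
        aws elbv2 modify-target-group-attributes --target-group-arn "${arn}" \
            --attributes Key=deregistration_delay.timeout_seconds,Value=10 < /dev/null
    done < "${arn_file}"
}

# Example usage: one way to produce the file, then apply the settings
# aws elbv2 describe-target-groups --query 'TargetGroups[].TargetGroupArn' --output text | tr '\t' '\n' > target-groups.txt
# update_target_groups target-groups.txt
```

The describe-target-groups line lists every target group in the region, so you may want to trim the file by hand before running the loop.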

0
On

I faced the same error while archiving AMIs to S3.

Building on antonbormotov's answer, the following one-liners worked for me on Linux (Bash) and Windows (PowerShell) respectively:

# Bash
date && aws ec2 create-store-image-task --image-id <<AMI ID>> --bucket <<BUCKET>> --region <<REGION>> && false; while [ $? != "0" ];do aws ec2 wait store-image-task-complete --image-ids <<AMI ID>> --region <<REGION>>; done && date

# PowerShell
date; aws ec2 create-store-image-task --image-id <<AMI ID>> --bucket <<BUCKET>> --region <<REGION>>;$LASTEXITCODE=-1; while ($LASTEXITCODE -ne 0) {aws ec2 wait store-image-task-complete --image-ids <<AMI ID>> --region <<REGION>>};date