Invoke a script on EC2 termination


I have to take certain actions during an AWS Auto Scaling scale-in event. The EC2 instance should be able to save some logs and reports to an S3 bucket. This can take anywhere between 5 and 15 minutes.

I already have a script that gets called on termination:

ln -s /etc/ec2-termination /etc/rc0.d/S01ec2-termination

However, the script ends abruptly within 5 minutes. I am looking at using AWS lifecycle hooks to extend the EC2 instance's lifetime. The documentation is not clear on how to invoke a script in a way similar to a user-data script.

There are ways to use AWS Lambda or SNS to receive a notification, which could potentially be used to inform the EC2 instance.

But I would like to know if there is a simpler solution to this problem. Is there a way to register a script with lifecycle hooks so that it gets called on a scale-in event?


1
On

Here is a solution using Lifecycle Hooks, Automation and Run Command, based on this article:

Resources:
  MyTerminationHook:
    Type: AWS::AutoScaling::LifecycleHook
    Properties:
      AutoScalingGroupName: !Ref MyAutoScalingGroup
      DefaultResult: CONTINUE
      HeartbeatTimeout: 900
      LifecycleTransition: autoscaling:EC2_INSTANCE_TERMINATING

  MyTerminationDocument:
    Type: AWS::SSM::Document
    Properties:
      DocumentType: Automation
      Content:
        description: 'Run command before terminating instance'
        schemaVersion: '0.3'
        assumeRole: !GetAtt MyTerminationDocumentRole.Arn
        parameters:
          instanceId:
            type: String
        mainSteps:
          - name: RunCommand
            action: aws:runCommand
            inputs:
              DocumentName: AWS-RunShellScript
              InstanceIds:
                - '{{ instanceId }}'
              TimeoutSeconds: 60
              Parameters:
                commands: /etc/my-termination-script.sh
                executionTimeout: '900'
          - name: TerminateInstance
            action: aws:executeAwsApi
            inputs:
              Api: CompleteLifecycleAction
              AutoScalingGroupName: !Ref MyAutoScalingGroup
              InstanceId: '{{ instanceId }}'
              LifecycleActionResult: CONTINUE
              LifecycleHookName: !Ref MyTerminationHook
              Service: autoscaling

  MyTerminationRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - aws.autoscaling
        detail-type:
          - EC2 Instance-terminate Lifecycle Action
        detail:
          AutoScalingGroupName:
            - !Ref MyAutoScalingGroup
      Targets:
        - Id: my-termination-document
          Arn: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:automation-definition/${MyTerminationDocument}:$DEFAULT'
          RoleArn: !GetAtt MyTerminationRuleRole.Arn
          InputTransformer:
            InputPathsMap:
              instanceId: '$.detail.EC2InstanceId'
            InputTemplate: '{"instanceId":[<instanceId>]}'

  MyTerminationRuleRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: events.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: start-automation
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - ssm:StartAutomationExecution
                Resource: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:automation-definition/${MyTerminationDocument}:$DEFAULT'

  MyTerminationDocumentRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: ssm.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: run-command-and-complete-lifecycle
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - autoscaling:CompleteLifecycleAction
                Resource: !Sub 'arn:aws:autoscaling:${AWS::Region}:${AWS::AccountId}:autoScalingGroup:*:autoScalingGroupName/${MyAutoScalingGroup}'
              - Effect: Allow
                Action:
                  - ssm:DescribeInstanceInformation
                  - ssm:ListCommands
                  - ssm:ListCommandInvocations
                Resource: '*'
              - Effect: Allow
                Action:
                  - ssm:SendCommand
                Resource: 'arn:aws:ssm:*::document/AWS-RunShellScript'
              - Effect: Allow
                Action:
                  - ssm:SendCommand
                Resource: !Sub 'arn:aws:ec2:${AWS::Region}:${AWS::AccountId}:instance/*'
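To make the InputTransformer in MyTerminationRule more concrete, here is a rough Python sketch that simulates what it does to an incoming event. It is only an illustration: the sample event is abbreviated, the instance ID is made up, and the quoting of string values reflects my understanding of how EventBridge fills a JSON template.

```python
import json

def transform(event, input_paths, template):
    """Mimic an EventBridge InputTransformer for simple '$.a.b' paths."""
    rendered = template
    for name, path in input_paths.items():
        value = event
        # Walk the dotted path, skipping the leading '$'.
        for key in path.split(".")[1:]:
            value = value[key]
        # Values are substituted as JSON, so strings arrive quoted.
        rendered = rendered.replace("<" + name + ">", json.dumps(value))
    return rendered

# Abbreviated sample event; the instance ID is made up.
sample_event = {
    "source": "aws.autoscaling",
    "detail-type": "EC2 Instance-terminate Lifecycle Action",
    "detail": {"EC2InstanceId": "i-0123456789abcdef0"},
}

automation_input = transform(
    sample_event,
    {"instanceId": "$.detail.EC2InstanceId"},
    '{"instanceId":[<instanceId>]}',
)
print(automation_input)
```

The value is wrapped in a list because Automation parameters are passed as a map of names to lists of strings.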

The permissions required to deploy these are:

          - Sid: CreateDocument
            Effect: Allow
            Action:
              - "ssm:CreateDocument"
              - "ssm:GetDocument"
              - "ssm:DeleteDocument"
              - "ssm:ListTagsForResource"
            Resource: !Sub "arn:aws:ssm:<...>"
          - Sid: InstallLifecycleHook
            Effect: Allow
            Action:
              - "autoscaling:DeleteLifecycleHook"
              - "autoscaling:CreateLifecycleHook"
            Resource: !Sub "arn:aws:autoscaling:<...>"
          - Sid: ManageRules
            Effect: Allow
            Action:
              - "events:PutRule"
              - "events:ListRules"
              - "events:DescribeRule"
              - "events:DeleteRule"
              - "events:PutTargets"
              - "events:RemoveTargets"
            Resource: !Sub "arn:aws:events:<...>"

There might be more; these are the ones I had to add to our existing deployment policy. They may also not all be required, but I was fed up redeploying and adding them piecemeal so I added some of the Rule ones as an educated guess.

1
On

Just a correction to my earlier comment: SSM Run Command works against an instance in an Auto Scaling group only if the instance is terminated by an Auto Scaling event, not if you terminate the instance manually.

3
On

Yes, you can run a shell script on your terminating EC2 instance using AWS Systems Manager.

  1. Configure lifecycle hooks for your Auto Scaling group. You can do this from the EC2 console or the CLI:

    aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name my-lifecycle-hook \
    --auto-scaling-group-name My_AutoScalingGroup \
    --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
    --default-result CONTINUE \
    --region us-east-2

Set the heartbeat timeout (the --heartbeat-timeout option in the CLI) according to how long your script takes to run. Now, when your ASG scales in, your instances go into the Terminating:Wait state, during which your script will run.

  2. Set up a CloudWatch Event rule that is triggered when an instance changes to Terminating:Wait status and that targets a Systems Manager Run Command, which executes the shell script on your instance. You can set this up in the console.

Alternative solution: your lifecycle hook sends a message with the instance ID to SQS when an instance changes to Terminating:Wait status. On receiving the message, SQS triggers a Lambda function that sends the Run Command to Systems Manager to execute the shell script on your terminating instance.
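As a minimal sketch of the Lambda function in the alternative solution, assuming the lifecycle hook publishes directly to the SQS queue that triggers the function, something like the following might work. The script path is the one from the question; the optional ssm argument is my own addition so the parsing logic can be exercised without AWS access.

```python
import json

# Script path taken from the question; adjust to your own script.
SCRIPT = "/etc/ec2-termination"
TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"

def instance_ids_from_sqs(event):
    """Collect instance IDs from lifecycle-hook messages delivered via SQS."""
    ids = []
    for record in event.get("Records", []):
        message = json.loads(record["body"])
        # Skip anything that is not a termination notice, such as the
        # test notification sent when the hook is first configured.
        if message.get("LifecycleTransition") != TERMINATING:
            continue
        ids.append(message["EC2InstanceId"])
    return ids

def handler(event, context, ssm=None):
    """Lambda entry point: run the script on each terminating instance."""
    if ssm is None:
        import boto3  # imported lazily so the parsing can be tested offline
        ssm = boto3.client("ssm")
    ids = instance_ids_from_sqs(event)
    if ids:
        ssm.send_command(
            InstanceIds=ids,
            DocumentName="AWS-RunShellScript",
            Parameters={"commands": [SCRIPT]},
        )
    return ids
```

This sketch only starts the script. The instance then stays in Terminating:Wait until the heartbeat timeout elapses, unless something (for example the script itself) calls complete-lifecycle-action once the upload has finished.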


2
On

No.

The fact that the instance is being terminated is managed within the AWS infrastructure. Auto Scaling does not have the ability to reach "into" the EC2 instance to trigger anything.

Instead, you would need to write some code on the instance that checks whether the instance is in the termination state and then takes appropriate action.

An example might be:

  • The Lifecycle Hook sends a notification via Amazon SNS
  • Amazon SNS triggers an AWS Lambda function
  • The Lambda function could add a tag to the instance (eg Terminating = Yes)
  • A script on the EC2 instance is triggered every 15 seconds to check the tags associated with the EC2 instance (on which it is running). If it finds the tag, it triggers the shutdown process.

(Be careful that the script doesn't trigger again during the shutdown process otherwise it might try performing the shutdown process every 15 seconds!)
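The polling idea above, including the guard against re-triggering, could be sketched roughly as follows. The function names are hypothetical, the tag name and value follow the example in the list, and the instance role is assumed to allow ec2:DescribeTags.

```python
import time

def wait_for_termination_tag(get_tags, on_terminate, interval=15, sleep=time.sleep):
    """Poll until the Terminating=Yes tag appears, then run on_terminate once."""
    while True:
        if get_tags().get("Terminating") == "Yes":
            on_terminate()
            return  # stop polling so the shutdown work is never repeated
        sleep(interval)

def ec2_tags(instance_id):
    """Fetch the instance's tags as a dict (assumes ec2:DescribeTags is allowed)."""
    import boto3  # lazy import keeps the loop above testable without AWS
    response = boto3.client("ec2").describe_tags(
        Filters=[{"Name": "resource-id", "Values": [instance_id]}]
    )
    return {tag["Key"]: tag["Value"] for tag in response["Tags"]}

# Hypothetical usage on the instance, where save_logs is your own function:
#   wait_for_termination_tag(lambda: ec2_tags("i-..."), save_logs)
```

Returning after the first match is what keeps the shutdown work from running again on later polls.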

Alternatively, store the shutdown information in the Systems Manager Parameter Store or a database, but using Tags seems nicely scalable!

Updated version:

Thanks to raevilman for the idea:

  • The Lifecycle Hook sends a notification via Amazon SNS
  • Amazon SNS triggers an AWS Lambda function
  • The Lambda function calls the AWS Systems Manager Run Command to trigger code on the instance

Much simpler!

0
On

Depending on what you want to achieve, there is a much simpler approach that needs only two things:

  1. The lifecycle hook sends a notification to SQS
  2. The app reads from the SQS queue and performs the action
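A rough sketch of the app side of these two steps might look like this. It assumes a single consumer that handles messages for any instance in the group; if every instance polled the same queue, each would need to ignore messages for other instances instead of deleting them. The function names and the action callback are hypothetical.

```python
import json

TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"

def handle_lifecycle_message(body, autoscaling, action):
    """Run action for a terminating instance, then let Auto Scaling proceed."""
    message = json.loads(body)
    if message.get("LifecycleTransition") != TERMINATING:
        return False  # e.g. the test notification sent when the hook is set up
    action(message["EC2InstanceId"])
    # Tell Auto Scaling the work is done so it does not wait for the timeout.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=message["LifecycleHookName"],
        AutoScalingGroupName=message["AutoScalingGroupName"],
        LifecycleActionToken=message["LifecycleActionToken"],
        LifecycleActionResult="CONTINUE",
    )
    return True

def poll_once(queue_url, action):
    """Long-poll the queue once and process whatever arrives."""
    import boto3  # lazy import so handle_lifecycle_message is testable offline
    sqs = boto3.client("sqs")
    autoscaling = boto3.client("autoscaling")
    response = sqs.receive_message(
        QueueUrl=queue_url, WaitTimeSeconds=20, MaxNumberOfMessages=10
    )
    for msg in response.get("Messages", []):
        handle_lifecycle_message(msg["Body"], autoscaling, action)
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```

Calling complete_lifecycle_action as soon as the action finishes means the instance terminates promptly rather than sitting in Terminating:Wait for the full heartbeat timeout.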