AWS Step Function - Wait until a group of other Step Functions have finished then fire a different Step Function


I have a scenario where I need to post-process results produced by a group of discrete Step Functions. How can I orchestrate this arrangement such that, given Step Functions A, B and C, once all three have completed successfully, Step Function D is triggered?

Step Function D will take as its payload the outputs from Step Functions A, B and C. A, B and C are triggered from an external Java microservice. I have a DynamoDB table containing details of A, B and C, so I know which execution IDs belong together.

This seems to be quite a common pattern so I was hoping that there was already some sort of robust design to address it.

I have thought about using SNS to publish an event when each of Step Functions A, B and C completes, but I need to capture these events together as a group. If I had a Lambda which captured the event, I would need to somehow know which event this is and whether all prior events have been received. I could use a DynamoDB table to track each Step Function's completion status, updating its row at the end of each Step Function; then, when the Lambda receives a completion event, it can check whether every row pertaining to the group of executions is marked as completed. Would this introduce a race condition? Is this a trustworthy method?
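For what it's worth, the read-then-check variant above can race: two Lambdas may each read the table before the other's write lands, so the group could be seen as complete twice or not at all. A counter decremented with DynamoDB's atomic `ADD` avoids this, because exactly one caller observes the counter reach zero. A minimal sketch (table and attribute names are hypothetical) that builds the `update_item` request:

```python
def completion_update(group_id, table_name="sfn-group-tracker"):
    """Build UpdateItem kwargs that atomically decrement the group's
    pending-execution counter. ADD is atomic in DynamoDB, so exactly one
    caller sees the counter hit zero and can safely trigger Step Function D."""
    return {
        "TableName": table_name,
        "Key": {"group_id": {"S": group_id}},
        "UpdateExpression": "ADD pending :dec",
        "ExpressionAttributeValues": {":dec": {"N": "-1"}},
        "ReturnValues": "UPDATED_NEW",  # response carries the new counter value
    }
```

Each completion event would call `dynamodb.update_item(**completion_update(...))`, and only the caller whose response shows `pending == 0` starts D.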


There are 3 best solutions below

zaf187 On BEST ANSWER

I ended up going with my approach: I used AWS SQS to queue the events marking each Step Function's completion, a Lambda which handled the events one by one, and a DynamoDB table which ties all the Step Function executions together using a common GUID.
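A rough sketch of how that Lambda might look, assuming a `pending` counter per GUID that is atomically decremented (all names are hypothetical, and the AWS clients are injected so the logic can be exercised without AWS):

```python
import json

def make_handler(dynamodb, sfn, table_name, state_machine_d_arn):
    """Build an SQS-triggered Lambda handler. dynamodb and sfn are
    boto3-style clients, injected for testability."""
    def handler(event, context=None):
        for record in event["Records"]:
            body = json.loads(record["body"])
            group_id = body["group_id"]
            # Atomic decrement: exactly one invocation observes zero,
            # so Step Function D is started exactly once per group.
            resp = dynamodb.update_item(
                TableName=table_name,
                Key={"group_id": {"S": group_id}},
                UpdateExpression="ADD pending :dec",
                ExpressionAttributeValues={":dec": {"N": "-1"}},
                ReturnValues="UPDATED_NEW",
            )
            if int(resp["Attributes"]["pending"]["N"]) == 0:
                sfn.start_execution(
                    stateMachineArn=state_machine_d_arn,
                    input=json.dumps({"group_id": group_id}),
                )
    return handler
```

In production the handler would be wired as `lambda_handler = make_handler(boto3.client("dynamodb"), boto3.client("stepfunctions"), ...)`.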

Justin Callison On

You will want to use the Optimized Service integration with Step Functions (aka "Nesting") and the Map state.

In the example below, the first Pass state generates a list of state machines to run in Group A. These go into the Map state, which then executes each of the state machines in parallel and waits for completion using the Run a Job (.sync) integration pattern. Once all of these complete, the consolidated results are passed as an array to state machine D.

{
  "StartAt": "Generate Group A List",
  "States": {
    "Generate Group A List": {
      "Type": "Pass",
      "Result": {
        "group_a": [
          "arn:aws:states:<region>:<account_id>:stateMachine:<state_machine_A>",
          "arn:aws:states:<region>:<account_id>:stateMachine:<state_machine_B>",
          "arn:aws:states:<region>:<account_id>:stateMachine:<state_machine_C>"
        ]
      },
      "Next": "Group A"
    },
    "Group A": {
      "Type": "Map",
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "INLINE"
        },
        "StartAt": "Execute Group A State Machine",
        "States": {
          "Execute Group A State Machine": {
            "Type": "Task",
            "Resource": "arn:aws:states:::states:startExecution.sync:2",
            "Parameters": {
              "StateMachineArn.$": "$",
              "Input": {
                "StatePayload": "Input",
                "AWS_STEP_FUNCTIONS_STARTED_BY_EXECUTION_ID.$": "$$.Execution.Id"
              }
            },
            "End": true
          }
        }
      },
      "ItemsPath": "$.group_a",
      "Next": "Execute State Machine After"
    },
    "Execute State Machine After": {
      "Type": "Task",
      "Resource": "arn:aws:states:::states:startExecution.sync:2",
      "Parameters": {
        "StateMachineArn": "arn:aws:states:<region>:<account_id>:stateMachine:<state_machine_D>",
        "Input": {
          "group_a_results.$": "$",
          "AWS_STEP_FUNCTIONS_STARTED_BY_EXECUTION_ID.$": "$$.Execution.Id"
        }
      },
      "End": true
    }
  }
}


uyen.do On

I'm not sure I've understood your requirement, but how about using a Parallel state?
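A minimal sketch of that idea, written as a Python dict that could be serialized to an ASL definition: a Parallel state runs A, B and C as branches (each via the `.sync` nested-execution integration), and its output, an array of the three branch results, is forwarded to D. All ARNs and state names are placeholders.

```python
import json

def parallel_definition(arn_a, arn_b, arn_c, arn_d):
    """Sketch an ASL definition: run A, B and C as Parallel branches, then D."""
    def branch(arn):
        # Each branch starts one nested state machine and waits for it (.sync).
        return {
            "StartAt": "Run",
            "States": {
                "Run": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::states:startExecution.sync:2",
                    "Parameters": {"StateMachineArn": arn},
                    "End": True,
                }
            },
        }

    return {
        "StartAt": "Run A B C",
        "States": {
            "Run A B C": {
                "Type": "Parallel",
                "Branches": [branch(arn_a), branch(arn_b), branch(arn_c)],
                "Next": "Run D",
            },
            "Run D": {
                "Type": "Task",
                "Resource": "arn:aws:states:::states:startExecution.sync:2",
                "Parameters": {
                    "StateMachineArn": arn_d,
                    # ".$" marks a JSONPath: forward the Parallel output array.
                    "Input.$": "$",
                },
                "End": True,
            },
        },
    }
```

`json.dumps(parallel_definition(...), indent=2)` yields the definition to deploy. The trade-off versus the Map approach above is that the branches are fixed at authoring time rather than driven by a runtime list.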