Execute SQL query on Redshift using AWS fargate orchestrated by Step Function

212 Views Asked by At

I could find many examples of invoking SQL script on Redshift using Lambda inside Step Function (ex: this works well), however I could not figured out how to execute SQL script using AWS Fargate in Step Functions. I could not figured out how to pass the Redshift credentials to execute the queries. In the lambda case it was easy as below. How to passing the credentials to Fargate Task probably is the same, however what are the commands to execute the SQL scripts?

"States": {
    "loadJob": {
      "Type": "Pass",
      "Next": "loadJobETL",
      "Result": {
        "input": {
          "redshift_cluster_id": "<RS_CLUSTER>",
          "redshift_database": "<MY_DB>",
          "redshift_user": "<MY_DB>root",
          "redshift_schema": "MY_SCHEMA",
          "action": "load_customer_address",
          "sql_statement": [
            "begin transaction;",
            "MY_SQL_STATEMENT",
            "end transaction;"
          ]
        }
      }
    },
    "loadJobETL": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:XXXXXXXXXXX:function:SOME_FUNCTION",
      "TimeoutSeconds": 180,
      "HeartbeatSeconds": 60,
      "InputPath": "$",
      "ResultPath": "$",
      "Next": "checkStatus"
    }
}
1

There are 1 best solutions below

4
On BEST ANSWER

You can use the Step Functions Optimized Integration with ECS RunTask. Which might look like what's below. As to what the code will look like in your Fargate container, that would depend on your chosen language but I would expect to use the AWS SDK and look very similar to what you included in your Lambda code.

{
  "StartAt": "loadJob",
  "States": {
    "loadJob": {
      "Type": "Pass",
      "Next": "loadJobETL",
      "Result": {
        "input": {
          "redshift_cluster_id": "<RS_CLUSTER>",
          "redshift_database": "<MY_DB>",
          "redshift_user": "<MY_DB>root",
          "redshift_schema": "MY_SCHEMA",
          "action": "load_customer_address",
          "sql_statement": [
            "begin transaction;",
            "MY_SQL_STATEMENT",
            "end transaction;"
          ]
        }
      }
    },
    "loadJobETL": {
      "Type": "Task",
      "Resource": "arn:aws:states:::ecs:runTask.sync",
      "Parameters": {
        "LaunchType": "FARGATE",
        "Cluster": "arn:aws:ecs:REGION:ACCOUNT_ID:cluster/MyECSCluster",
        "TaskDefinition": "arn:aws:ecs:REGION:ACCOUNT_ID:task-definition/MyTaskDefinition:1",
        "Overrides": {
          "ContainerOverrides": [
            {
              "Name": "container-name",
              "Command.$": "$"
            }
          ]
        }
      },
      "End": true
    }
  }
}

Alternatively, depending on the size of the output, you may be able to just run this directly from Step Functions using arn:aws:states:::aws-sdk:redshiftdata:executeStatement.

enter image description here