Dependency-based ETL flow in AWS


We want to create a dynamic flow based on input data in S3. Based on the data available in S3, along with its metadata, we want to create clusters and tasks/transformation jobs dynamically, and some of the jobs depend on others. I am sharing the expected flow here and want to know how we can do this efficiently using AWS services.

I am exploring AWS SWF, Data Pipeline, and Lambda, but I'm not sure how to handle dynamic tasks and dynamic dependencies. Any thoughts on this?

The data flow is explained in the attached image (see the ETL Flow diagram).


2 Answers

Best answer:

AWS Step Functions with S3 triggers should get the job done in a cost-effective and scalable manner.
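A minimal sketch of the trigger side, assuming a Lambda function subscribed to `s3:ObjectCreated:*` events on the input bucket starts one Step Functions execution per object; the `STATE_MACHINE_ARN` environment variable is a placeholder name, not anything Step Functions requires:

```python
import json
import os

import boto3

sfn = boto3.client("stepfunctions")


def handler(event, context):
    """Start one Step Functions execution per object dropped into S3."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Pass the object location (and any metadata you look up here) as the
        # execution input, so the state machine can build its tasks from it.
        sfn.start_execution(
            stateMachineArn=os.environ["STATE_MACHINE_ARN"],
            input=json.dumps({"bucket": bucket, "key": key}),
        )
```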

All steps are defined using the Amazon States Language.

https://states-language.net/spec.html

You can run jobs in parallel and wait for them to finish before you start your next job.

Below is a sample of the kind of state machine AWS Step Functions supports.
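A minimal sketch, assuming the transformation and load jobs are Lambda functions; the function ARNs, role ARN, and state names are placeholders. Two independent transforms run in a Parallel state, and the dependent load step runs only after both branches finish:

```python
import json

import boto3

definition = {
    "Comment": "Run independent transforms in parallel, then the dependent load",
    "StartAt": "Transforms",
    "States": {
        "Transforms": {
            "Type": "Parallel",
            "Branches": [
                {
                    "StartAt": "TransformA",
                    "States": {
                        "TransformA": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:transform-a",
                            "End": True,
                        }
                    },
                },
                {
                    "StartAt": "TransformB",
                    "States": {
                        "TransformB": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:transform-b",
                            "End": True,
                        }
                    },
                },
            ],
            "Next": "Load",
        },
        "Load": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:load",
            "End": True,
        },
    },
}

# Register the state machine once; the S3-triggered Lambda above starts
# executions of it per input object.
sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="etl-flow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/etl-step-functions-role",
)
```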

Another answer:

If you use the AWS Flow Framework that is part of the official SWF client, then modeling such a dynamic flow is pretty straightforward. You define an object model, write code that instantiates it based on your pipeline definition, and execute it using the framework (the general idea is sketched below). See the Deployment Sample for an example of such a dynamic workflow implementation.
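This is not AWS Flow Framework code (that client is Java); it is only a plain-Python sketch of the underlying idea: build a task object model from a pipeline definition at run time and execute each task only after its dependencies have finished. The pipeline contents are hypothetical:

```python
from graphlib import TopologicalSorter  # Python 3.9+


def run_pipeline(pipeline):
    """pipeline maps task name -> (callable, [names of tasks it depends on])."""
    graph = {name: set(deps) for name, (_, deps) in pipeline.items()}
    # static_order() yields tasks so that every dependency comes first.
    for name in TopologicalSorter(graph).static_order():
        func, _ = pipeline[name]
        func()  # in a real workflow this would schedule an SWF activity


# Hypothetical pipeline built from metadata discovered in S3.
pipeline = {
    "extract": (lambda: print("extract"), []),
    "transform_a": (lambda: print("transform_a"), ["extract"]),
    "transform_b": (lambda: print("transform_b"), ["extract"]),
    "load": (lambda: print("load"), ["transform_a", "transform_b"]),
}

run_pipeline(pipeline)
```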