data factory with multiple outputs at different on-prem DBs

422 Views Asked by At

I would like to have a Data Factory with one pipeline and one activity which moves data from a Azure DB to separate on-prem DBs. The separate DBs are in different data centers and will need to have different data copied at different times of the day and not at equal intervals e.g. 4am EST, 5am EST, 11am EST, 6pm EST

I know I'll need a Data Management Gateway for each Data Center, but how do I schedule one activity so that it runs at different times of the day (not at equal intervals) and each run is to a different on-prem DB via a separate Data Management Gateway

1

There are 1 best solutions below

1
On BEST ANSWER

You need to have a different activity and dataset defined for each schedule. This isn't ideal. But with the current limitations of ADF its your only option. Its not like a SQL Agent where you can have multiple schedules for the same job.

You can have 1 pipeline. But you'll need 4x activities. 4x input datasets and 4x outputs datasets. Given your example in your question (e.g. 4am EST, 5am EST, 11am EST, 6pm EST).

To actually deal with the different times use the offset attribute within the dataset and pipeline.

Dataset JSON block example:

"availability": {
  "frequency": "Day",
  "interval": 1,
  "style": "StartOfInterval",
  "offset": "00.05:00:00" // <<<< 5AM start.
 },

Activity JSON block example:

    "scheduler": {
      "frequency": "Day",
      "interval": 1,
      "style": "StartOfInterval",
      "offset": "00.05:00:00" // <<<<<5AM start. Matching the dataset.
    }

The datasets can of course point to the same table in the database. But just with different names.

Hope this helps.