I have the following combination of tap/target in Meltano: tap-marketo and target-s3-parquet.
I want to extract data from tap-marketo from date A to date B in the past.
I saw that we can only define start_date and max_export_days.
I have tried to start with start_date A and stop the run once I reach B. But this does not work.
The loader only emits the state once its work is completely done, and the target is not called, so no load was done.
I also saw that the export is being done:
{'run_id': '46ba5256-7019-48c7-890a-28746bb5272a', 'state_id': '2023-02-09T152428--tap-marketo--target-s3-parquet', 'stdio': 'stderr', 'cmd_type': 'extractor', 'name': 'tap-marketo', 'event': 'INFO GET: https://XXXXXXX/bulk/v1/activities/export/6636daf1-ad1e-41e1-b8d5-cdd31de5d4e0/file.json', 'level': 'info', 'timestamp': '2023-02-09T17:35:02.098016Z'}
But where do I find this file in my container?
I want to invoke the target separately, but I need to give it a file via --input.
# meltano invoke target-s3-parquet --help
Environment 'dev' is active
Usage: target-s3-parquet [OPTIONS]

  Execute the Singer target.

Options:
  --input FILENAME          A path to read messages from instead of from
                            standard in.
  --config TEXT             Configuration file location or 'ENV' to use
                            environment variables.
  --format [json|markdown]  Specify output style for --about
  --about                   Display package metadata and settings.
  --version                 Display the package version.
  --help                    Show this message and exit.
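To invoke the tap and target separately, redirect the tap's output to a file, then pass that file to the target with --input.
This is equivalent to feeding the file's contents to the target on standard input.
In both cases, you can retry just the second step.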
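As a minimal sketch of the two approaches, assuming a hypothetical file name tap-output.jsonl (any writable path works): the commands are wrapped in shell functions so the load step can be re-run on its own. Only meltano invoke and the --input option from the help output above are used; the function names are made up for illustration.

```shell
# Hypothetical file name for the saved Singer messages; override via the environment.
MESSAGES_FILE="${MESSAGES_FILE:-tap-output.jsonl}"

# Step 1: run only the tap and capture its standard output (the Singer messages).
extract_to_file() {
  meltano invoke tap-marketo > "$MESSAGES_FILE"
}

# Step 2, variant A: run only the target, reading the messages via --input.
load_with_input() {
  meltano invoke target-s3-parquet --input "$MESSAGES_FILE"
}

# Step 2, variant B: run only the target, feeding the same file on standard input.
load_with_stdin() {
  meltano invoke target-s3-parquet < "$MESSAGES_FILE"
}
```

Run extract_to_file once, then load_with_input (or load_with_stdin) as often as needed; if the load fails, only that step has to be repeated.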
However, if you invoke both together, using
meltano run tap-marketo target-s3-parquet
or similar, the intermediate file will not be stored on disk, and you would not be able to replay just the target-side processing.

Why these files aren't stored on disk by default
The stream of messages passed from tap to target can contain secret or confidential data, and its volume can be extremely large, since it contains the records themselves as well as metadata used for coordinating between the tap and target. For this reason, the stream is not stored to disk during a normal sync operation.
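To get a rough idea of what a saved stream contains, the sketch below counts messages by Singer type (RECORD, SCHEMA, STATE). It assumes one JSON message per line and that the type field is written as "type": "RECORD", with or without a space after the colon; the function name and the file name in the usage note are hypothetical, and a JSON-aware tool would be more robust.

```shell
# Count Singer messages of a given type in a saved message file.
# Usage: count_messages FILE TYPE   (TYPE is e.g. RECORD, SCHEMA or STATE)
# Note: grep -c prints 0 and returns a non-zero exit status when nothing matches.
count_messages() {
  grep -c "\"type\": *\"$2\"" "$1"
}
```

For example, count_messages tap-output.jsonl RECORD prints the number of record lines, which gives a sense of how large the stream is before it is replayed into the target.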