running job with different file without reloading the file

587 Views Asked by At

I created a job that could be reusable for new files. The entire activities in the job, the maps and everything else will remain the same except for the file name. I already tried it once but it seems that i need to re "load" the file and remap everything again. It's inefficient. Is there any way for me to pass different file in a job without remaping, reconfiguring and reloading anything?

2

There are 2 best solutions below

1
On BEST ANSWER

You have multiple options for allowing a DataStage parallel job to use a different filename for input on each job run:

  1. When using either Sequential File stage or File Connector stage, in stead of typing the actual filename, you can input the name of a job parameter which has been defined on the Parameters tab of the job properties dialog. For example, if you define string parameter myFile, then in the filename field of input stage you would enter #myFile# and at job run-time that would be replaced by whatever is the current value of the myFile parameter. If you run job manually from Director/Designer clients, you will have job run dialog where you can specify a value for job parameters. If you start job via dsjob command, there are options to pass in job parameters on command line. You also have option to use parameterset files that you can modify prior to job run.

  2. Another option would be to use a file location and pattern instead of a specific file name. Both Sequential File stage and File Connector stage let you specify a pattern, for example: /data/my_input_files/*.txt Then, each time you run job it will input any files at that location matching the above pattern, so it can process multiple files. However, to prevent re-processing files from prior job runs, you will want to clean up any files at that location after job completes. Then when you have new files to process just put them in that directory and re-run the job.

4
On

In case if all the files contains a similar data structure, you need to implement one parallel job and if you have a similar pattern of file name for all file names Such as 1234ab.xls, 1234vd.xls, 1234gd.xls, ... you could pass the file name as 1234??.xls In the sequential job file name parameter (Use this as file name in parallel job) which contains the above parallel job to be executed.