IBM Data Stage - How to find database tables used in jobs

8.6k Views Asked by At

For a project we need to investigate an existing installation of IBM Data Stage, doing a whole lot of ETL in loads of jobs.

The job flow diagrams contain lots of tables being used a source (both in MSSQL as well as Oracle), as well as a target (mostly in Oracle).

My question is now

How can I find all database tables used by all jobs in a certain Data Stage Project ?

I looked in Tools - Advanced Find, and there I can see all "table definitions". BUT, most of the tables actually used in jobs do not show up there, as they are defined as what Data Stage calls "Parallel Jobs" which in effect are SQL queries against database tables.

I am particularly interested in locating TARGET tables which are being loaded by a job.

So to put it bluntly, I want to be able to answer the question "Which job loads table XY ?".

If that is not possible, an automated means of extracting all the SQL statements used by the jobs would be an alternative.

We have access to IBM Websphere Data Stage and Quality Stage Designer 8.1

1

There are 1 best solutions below

0
On

Exporting the jobs creates a text file that details what the job does. Open the export file in a text editor and you should be able to find SQL inserts with a simple search. Start with searching for SQL keywords like 'INTO' and 'FROM'.

Edit: Alternatively, if every table that was used was defined by importing table definitions, you should be able to find the table definition in the folder for its type. This however, will not make it apparent where and how the table was used (which job, insert or select from?), so I would recommend the first method of searching the Export files.