Oracle 12c - refreshing the data in my tables based on the data from warehouse tables


I need to update some tables in my application from warehouse tables that are refreshed weekly or biweekly, and my tables must be kept in sync with them. My tables are referenced by foreign keys from other tables, so I cannot simply truncate them and reinsert all the data every time. Instead I have to determine the delta and apply it based on a few primary key columns that never change. I need some input on how to implement this approach.

My approach:

  1. Check the last updated time of those tables/views.
  2. If it is more recent than my last run, compare each row in my table and the warehouse table based on the primary key.
  3. Update each column that is different.
  4. Do nothing if no column has changed.
  5. Insert a row if there is a new record.
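
Something like the following MERGE is what I have in mind for steps 2-5 (MY_TABLE, WH_TABLE and the column names are just placeholders for my real tables):

    MERGE INTO my_table t
    USING wh_table w
       ON (t.pk_col = w.pk_col)          -- match on the stable primary key
    WHEN MATCHED THEN
      UPDATE SET t.col1 = w.col1,
                 t.col2 = w.col2
      -- only touch rows that actually changed; DECODE treats two NULLs
      -- as equal, so nullable columns compare safely
      WHERE DECODE(t.col1, w.col1, 0, 1) = 1
         OR DECODE(t.col2, w.col2, 0, 1) = 1
    WHEN NOT MATCHED THEN
      INSERT (pk_col, col1, col2)
      VALUES (w.pk_col, w.col1, w.col2);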

My Question:
How do I implement this? Is writing PL/SQL code a good and efficient approach, given that the expected number of records is around 800K?
Please provide any sample code or links.


2 Answers

Answer 1:

I would go for PL/SQL with the BULK COLLECT and FORALL method. You can use MINUS in your cursor to reduce the data size, since it returns only the rows that differ.
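
A minimal sketch of that pattern (all table and column names below are assumptions, not taken from the question):

    DECLARE
      -- MINUS compares whole rows (and treats NULLs as equal), so the
      -- cursor returns only warehouse rows that are new or changed
      CURSOR c_delta IS
        SELECT pk_col, col1 FROM wh_table
        MINUS
        SELECT pk_col, col1 FROM my_table;

      TYPE t_pk_tab  IS TABLE OF my_table.pk_col%TYPE;
      TYPE t_col_tab IS TABLE OF my_table.col1%TYPE;
      l_pks  t_pk_tab;
      l_cols t_col_tab;
    BEGIN
      OPEN c_delta;
      LOOP
        -- BULK COLLECT fetches a batch at a time to keep memory bounded;
        -- 10000 is an arbitrary batch size to tune
        FETCH c_delta BULK COLLECT INTO l_pks, l_cols LIMIT 10000;
        EXIT WHEN l_pks.COUNT = 0;

        -- FORALL binds the whole batch into a single SQL round trip
        FORALL i IN 1 .. l_pks.COUNT
          MERGE INTO my_table t
          USING (SELECT l_pks(i) AS pk_col, l_cols(i) AS col1 FROM dual) w
             ON (t.pk_col = w.pk_col)
          WHEN MATCHED THEN
            UPDATE SET t.col1 = w.col1
          WHEN NOT MATCHED THEN
            INSERT (pk_col, col1) VALUES (w.pk_col, w.col1);

        COMMIT;
      END LOOP;
      CLOSE c_delta;
    END;
    /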

You can check this article for more information about BULK COLLECT, FORALL and the SQL and PL/SQL engines: http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52plsql-1709862.html

Answer 2:

There are many parts to your question above and I will answer as best I can:

  1. While it is possible to disable the referencing foreign keys, truncate the table, repopulate it with the updated data and then re-enable the foreign keys (the pattern is sketched just after this list), given the requirements you describe above I don't believe truncating the table each time is optimal.
  2. Yes, in principle PL/SQL is a good way to achieve what you want: this is too complex to deal with in native SQL alone, and PL/SQL is an efficient alternative.
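
For reference, the disable/truncate/reload pattern from point 1 looks roughly like this (table and constraint names are illustrative):

    ALTER TABLE child_table DISABLE CONSTRAINT fk_child_parent;
    TRUNCATE TABLE parent_table;       -- only allowed while the FK is disabled
    INSERT INTO parent_table
      SELECT * FROM wh_parent_table;
    -- re-enabling fails if the reload left any orphaned child rows
    ALTER TABLE child_table ENABLE CONSTRAINT fk_child_parent;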

Conceptually, the approach I would take is as follows:

Initial set up:

  1. Create a sequence called activity_seq.
  2. Add an "activity_id" column of type NUMBER, with a unique constraint, to your source table(s).
  3. Add a trigger to the source table(s) setting activity_id = activity_seq.nextval for each insert/update of a table row.
  4. Create some kind of master table to hold the "last processed activity id" value.
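
In SQL, that initial set-up might look something like this (src_table, etl_control and the other names are purely illustrative):

    -- 1. sequence feeding the change-tracking column
    CREATE SEQUENCE activity_seq;

    -- 2. change-tracking column plus unique constraint on the source table
    ALTER TABLE src_table ADD (activity_id NUMBER);
    ALTER TABLE src_table ADD CONSTRAINT src_activity_uk UNIQUE (activity_id);

    -- 3. trigger stamping every inserted/updated row
    CREATE OR REPLACE TRIGGER src_activity_trg
      BEFORE INSERT OR UPDATE ON src_table
      FOR EACH ROW
    BEGIN
      :NEW.activity_id := activity_seq.NEXTVAL;
    END;
    /

    -- 4. master table holding the "last processed activity id"
    CREATE TABLE etl_control (
      table_name                  VARCHAR2(128) PRIMARY KEY,
      last_processed_activity_id  NUMBER NOT NULL
    );
    INSERT INTO etl_control VALUES ('SRC_TABLE', 0);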

Then, on each weekly or biweekly run:

  1. Retrieve the value of "last processed activity id" from the master table.
  2. Select all rows in the source table(s) having an activity_id value greater than the "last processed activity id" value.
  3. Iterate through the selected source rows and update the target where a match is found on whatever your match criterion is, or insert a new row into the target where no match is found (I assume there are no deletes, as you do not mention them).
  4. On completion, update the master table's "last processed activity id" to the greatest activity_id value among the source rows processed in step 3 above.

(Please note that, depending on your environment and the number of rows processed, the above process may need to be split and repeated over a number of transactions.)
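
Putting the periodic steps together, a sketch of the run (reusing the illustrative names from the set-up above; here a single MERGE folds steps 2 and 3 into one statement instead of a row-by-row loop):

    DECLARE
      l_last_id etl_control.last_processed_activity_id%TYPE;
      l_max_id  etl_control.last_processed_activity_id%TYPE;
    BEGIN
      -- step 1: read (and lock) the high-water mark to serialise runs
      SELECT last_processed_activity_id
        INTO l_last_id
        FROM etl_control
       WHERE table_name = 'SRC_TABLE'
         FOR UPDATE;

      -- fix the ceiling first, so rows stamped while this run is in
      -- flight are simply picked up by the next execution
      SELECT NVL(MAX(activity_id), l_last_id)
        INTO l_max_id
        FROM src_table;

      -- steps 2 and 3: one MERGE updates matches and inserts new rows
      MERGE INTO target_table t
      USING (SELECT pk_col, col1
               FROM src_table
              WHERE activity_id > l_last_id
                AND activity_id <= l_max_id) s
         ON (t.pk_col = s.pk_col)
      WHEN MATCHED THEN
        UPDATE SET t.col1 = s.col1
      WHEN NOT MATCHED THEN
        INSERT (pk_col, col1) VALUES (s.pk_col, s.col1);

      -- step 4: advance the high-water mark
      UPDATE etl_control
         SET last_processed_activity_id = l_max_id
       WHERE table_name = 'SRC_TABLE';

      COMMIT;
    END;
    /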

I hope this proves helpful.