How do I access an old transaction of a dataset in Code Workbook?

In Contour you can access old transactions by clicking on the "version" button at the top.

How do I do this in Code Workbook?

Update: The method below is no longer supported under the security configurations of most Foundry environments. Instead, we'd recommend using Contour for workflows that involve referencing old transactions of datasets.

Old Answer:

You can create a template that takes the transaction_id, branch, and dataset path as parameters, like so:

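def time_machine():
    # `spark` is not defined here; it is assumed to be the SparkSession
    # available in the Code Workbook environment.
    from pyspark.sql import SQLContext
    sql_context = SQLContext(spark.sparkContext)

    # Template parameters, substituted when the template is used.
    transaction_id = '{{{transaction_id}}}'
    branch = '{{{branch}}}'
    path = '{{{path}}}'
    # The transaction ID is passed twice, presumably as both the start and
    # the end of the transaction range before the "@".
    return sql_context.sql("SELECT * FROM `%s:%s@%s`.`%s`" % (transaction_id, transaction_id, branch, path))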
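To make the string formatting easier to check without running Spark, here is a minimal sketch that only assembles the query text. The helper name `build_time_machine_query` and the sample values are hypothetical, and the identifier format is copied from the template above rather than from any documented API.

```python
def build_time_machine_query(transaction_id: str, branch: str, path: str) -> str:
    """Assemble the query string used in the template (hypothetical helper)."""
    for name, value in (
        ("transaction_id", transaction_id),
        ("branch", branch),
        ("path", path),
    ):
        # A backtick would close the quoted identifier early, so reject it,
        # along with empty values.
        if not value or "`" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    # Same format as the template: `<txn>:<txn>@<branch>`.`<path>`
    return "SELECT * FROM `%s:%s@%s`.`%s`" % (
        transaction_id,
        transaction_id,
        branch,
        path,
    )


# Placeholder values for illustration only; these are not real identifiers.
print(build_time_machine_query("my-transaction-id", "master", "/path/to/dataset"))
```

Printing the string first is a cheap way to confirm that the parameters landed in the right positions before passing it to `sql_context.sql`.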

In Code Workbook, create a new transform from this template to import the desired transaction.

Make sure you check your retention policies though! You won't be able to pull in old transactions if your retention policies have deleted them already.