Defining Month Time Partition using Teradata Data Transfer Service Custom Schema


I am looking for a way to define a Teradata Data Transfer custom schema that implements a month-based date partition. The documentation only describes how to partition at the timestamp or date level.

https://cloud.google.com/bigquery-transfer/docs/teradata-migration-options#custom_schema_file

Is there an undocumented way to define a custom schema file that handles this? Or is the alternative to migrate at the day level and then, once the data is in BigQuery, insert it into a table defined with a month partition?


As mentioned in a related answer:

In Teradata, you might find trunc() to be a simple method:

select a.id, a.name, a.number, a.date
from (select a.*,
             row_number() over (partition by trunc(date, 'MON') order by date desc) as seqnum
      from tableA a
     ) a
where seqnum = 1;

Teradata also supports qualify:

select a.id, a.name, a.number, a.date
from tableA a
qualify row_number() over (partition by trunc(date, 'MON') order by date desc) = 1;

The Teradata documentation explains partitioned primary indexes as follows:

A partitioned primary index enables Teradata Database to partition the rows of a table or uncompressed join index in such a way that row subsets can be accessed efficiently without resorting to full-table scans. If the partitioning expression is defined using an updatable current date or updatable current timestamp, the partition that contains the most recent rows can be defined to be as narrow as possible to optimize efficient access to those rows.

An additional benefit of an updatable current date or updatable current timestamp for a partitioning is that the partitioning expression can be designed in such a way that it might not need to be changed as a function of time.

You can specify the DATE, CURRENT_DATE, or CURRENT_TIMESTAMP functions in the partitioning expression of a table or uncompressed join index and then periodically update the resolution of their values. This enables rows to be repartitioned on the newly resolved values of the DATE, CURRENT_DATE, or CURRENT_TIMESTAMP functions at any time you determine that they require reconciliation. You can update the resolution of your partitioning scheme by submitting appropriate ALTER TABLE TO CURRENT statements.
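
To make the passage above concrete, here is a minimal, untested Teradata sketch of a table partitioned by month on a range anchored to CURRENT_DATE, followed by the statement that re-resolves that date. The table name, column names, and the one-year window are assumptions for illustration only; interval arithmetic on CURRENT_DATE can behave unexpectedly on dates such as February 29, so check the boundaries on your own system.

```sql
-- Hypothetical table: names and the one-year window are assumptions.
CREATE TABLE sales_history (
    sale_id   INTEGER NOT NULL,
    sale_date DATE    NOT NULL,
    amount    DECIMAL(12,2)
)
PRIMARY INDEX (sale_id)
-- One partition per month, over a range that starts at the date
-- CURRENT_DATE resolved to when the table was created.
PARTITION BY RANGE_N(
    sale_date BETWEEN CURRENT_DATE
              AND     CURRENT_DATE + INTERVAL '1' YEAR - INTERVAL '1' DAY
              EACH INTERVAL '1' MONTH
);

-- Later, re-resolve CURRENT_DATE and repartition the rows accordingly.
ALTER TABLE sales_history TO CURRENT;
```

Note that this only describes partitioning on the Teradata side; it does not change what the Data Transfer Service custom schema file accepts.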

For more information, you can refer to the documentation.
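
As for the alternative raised in the question (migrate at the day level, then load into a month-partitioned table), a minimal BigQuery sketch could look like the following. The dataset, table, and column names are hypothetical, and it assumes the migrated table has a DATE column; for a TIMESTAMP column, TIMESTAMP_TRUNC would be used instead of DATE_TRUNC.

```sql
-- Hypothetical names: mydataset.sales_daily is the table produced by the
-- transfer (day-partitioned); sale_date is assumed to be a DATE column.
CREATE TABLE mydataset.sales_monthly
PARTITION BY DATE_TRUNC(sale_date, MONTH)
AS
SELECT *
FROM mydataset.sales_daily;
```

Once the month-partitioned copy has been verified, the day-partitioned staging table can be dropped.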