ERROR: column <name> is of type timestamp without time zone but expression is of type character varying


I am reading a CSV file exported from an MS SQL database table and trying to insert the data into a PostgreSQL (Aurora Serverless) database. When I created the equivalent table in PostgreSQL, I used a column type mapping I found online, which maps datetime (MS SQL) to timestamp(3) (PostgreSQL). However, when I try to insert a record into the Postgres database I get this error:

Batch entry 0 INSERT INTO my_table VALUES ('10000002', '2020-11-22 00:00:00') RETURNING * was aborted:
ERROR: column "Date" is of type timestamp without time zone but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 37
Call getNextException to see other errors in the batch.

My MS SQL table schema:

ID: varchar(20)
Date: datetime

csv file contains:

10000002, 2020-11-22 00:00:00
10000003, 2020-11-22 00:00:00
...

PostgreSQL table schema/create statement:

CREATE table my_table
(
    "ID"                      varchar(20)  not null,
    "Date"                  timestamp(3)
);

In the error message, the hint says that I may have to cast the expression, but I am not sure how I am supposed to do that or whether it is the best resolution in my case.
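A cast written directly into the SQL would presumably look roughly like this (a sketch using the table and named parameters from my code below):

# Hypothetical: cast the string parameter to a timestamp inside the statement
sql = "INSERT INTO my_table VALUES (:ID, CAST(:Date AS timestamp))"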

I am using the following script to load the data from csv file and push it to the PostgreSQL db:

import boto3
import csv

# Connection settings for the Aurora Data API (placeholders, not real values)
db_cluster_arn = 'arn:aws:rds:...'
db_credentials_secrets_store_arn = 'arn:aws:secretsmanager:...'
database_name = 'mydb'

rds_client = boto3.client('rds-data')

# Wrap the Data API batch call, attaching the transaction id when one is given
def batch_execute_statement(sql, sql_parameter_sets, transaction_id=None):
    parameters = {
        'secretArn': db_credentials_secrets_store_arn,
        'database': database_name,
        'resourceArn': db_cluster_arn,
        'sql': sql,
        'parameterSets': sql_parameter_sets
    }
    if transaction_id is not None:
        parameters['transactionId'] = transaction_id
    response = rds_client.batch_execute_statement(**parameters)
    return response

transaction = rds_client.begin_transaction(
    secretArn=db_credentials_secrets_store_arn,
    resourceArn=db_cluster_arn,
    database=database_name)


sql = "INSERT INTO rigs_latest VALUES (:ID, :Date;"
parameter_set = []


with open('data.csv', 'r') as file:
    reader = csv.DictReader(file, delimiter=',')
    for row in reader:
        entry = [
            {'name': 'ID', 'value': {'stringValue': row['ID']}},
            {'name': 'Date', 'value': {'stringValue': row['Date']}}
        ]
        parameter_set.append(entry)

response = batch_execute_statement(
    sql, parameter_set, transaction['transactionId'])
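
Note that the Data API transaction opened by begin_transaction still has to be committed for the batched rows to persist; the matching call looks like this:

# Commit the transaction so the batched inserts are persisted
rds_client.commit_transaction(
    secretArn=db_credentials_secrets_store_arn,
    resourceArn=db_cluster_arn,
    transactionId=transaction['transactionId'])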

UPDATE:

If I use the Query Editor to insert a record, e.g. insert into rigs_latest values (10000002, '2020-11-22 00:00:00');, it works without a problem.

That suggests there is something wrong with the code itself.

BEST ANSWER

After spending some more time researching, I found that we can specify a typeHint for each entry in the parameter_set. According to the documentation, typeHint allows us to specify that the parameter is of type TIMESTAMP, like so:

...
    entry = [
        {'name': 'ID', 'value': {'stringValue': row['ID']}},
        {'name': 'Date', 'typeHint': 'TIMESTAMP', 'value': {'stringValue': row['Date']}}
    ]
    parameter_set.append(entry)
...

I guess that helps the Data API figure out that we want the Date parameter to be of type TIMESTAMP.
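
Putting it together with the INSERT statement from the question, the fixed loop looks roughly like this (a sketch reusing the batch_execute_statement helper and the placeholder connection values from above):

sql = "INSERT INTO rigs_latest VALUES (:ID, :Date)"
parameter_set = []

with open('data.csv', 'r') as file:
    reader = csv.DictReader(file, delimiter=',')
    for row in reader:
        entry = [
            {'name': 'ID', 'value': {'stringValue': row['ID']}},
            # typeHint tells the Data API to send this string value as a TIMESTAMP
            {'name': 'Date', 'typeHint': 'TIMESTAMP', 'value': {'stringValue': row['Date']}}
        ]
        parameter_set.append(entry)

response = batch_execute_statement(
    sql, parameter_set, transaction['transactionId'])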