Decode binary file in AWS environment using PySpark


Is it possible to consume a Netezza backup file in an AWS environment and load it into Redshift? The file is a compressed binary created using the query below. It can also be produced for a full database using the nzbackup utility in Netezza.

CREATE EXTERNAL TABLE 'C:\filename.bak' USING (REMOTESOURCE 'ODBC' FORMAT 'internal' COMPRESS TRUE)
AS SELECT * FROM schema.tablename;

or 

nzbackup -dir /home/user/backups -u user -pw password -db db1

I want to decode this file and load it into a DataFrame in the AWS environment using Python or PySpark (AWS Glue). The following are the steps I am planning in AWS; I need guidance on the first step: how to decode a compressed binary backup from Netezza.

  1. Decode the file to ASCII (how?)
  2. Load it into a DataFrame and write out Parquet (see the sketch after this list)
  3. COPY the Parquet files to Redshift
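
For reference, here is a minimal PySpark sketch of steps 2 and 3, assuming step 1 has already produced a decoded, pipe-delimited ASCII file on S3 (the decoding itself is the open question). The bucket paths, the delimiter, and the IAM role are hypothetical placeholders.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("netezza-to-redshift").getOrCreate()

# Step 2: read the already-decoded delimited file and write Parquet to S3.
df = spark.read.csv(
    "s3://my-bucket/decoded/tablename.txt",  # placeholder path
    sep="|",             # assuming a pipe-delimited export
    header=False,
    inferSchema=True,
)
df.write.mode("overwrite").parquet("s3://my-bucket/parquet/tablename/")

# Step 3: COPY the Parquet output into Redshift; run this statement with
# any Redshift client (e.g. redshift_connector or the Query Editor).
copy_sql = """
COPY schema.tablename
FROM 's3://my-bucket/parquet/tablename/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
FORMAT AS PARQUET;
"""

If the decoded file ends up in a different layout, only the read options in step 2 should need to change.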