How to convert date column from INT to DATE type in large CSV files?

55 Views Asked by At

I have a CSV file of over 30 GB, and I want to load and store it into a DolphinDB database using the loadTextEx function. The file contains a column named “date” which is stored as INT type, with values like “20230731”. I would like to convert the column into the DATE type (i.e., 2023.07.31). How can I do this?

1

There are 1 best solutions below

0
On BEST ANSWER

You can define a UDF where the function temporalParse can be applied to convert data types, and then pass this UDF to the transform parameter in loadTextEx. Here is an example:

// establish a database connection
db = database("dfs://snapshot_L2_TSDB")  
// specify the format for converted date and time to match the database which is partitioned by date and hash
def transType(mutable memTable)
{
   return memTable.replaceColumn!(`DataTimeStamp,temporalParse( "20221001"+string(memTable.DataTimeStamp),"yyyyMMddHH:mm:ss.SSS"))
}
// import the data
tmpTB=loadTextEx(dbHandle=db, tableName=`snapshot_L2_TSDB_t, partitionColumns=`DataTimeStamp`SecurityID, filename=dataFilePath,schema=schemaTB ,sortColumns=`SecurityID`DataTimeStamp,transform=transType);