Perhaps this is well documented, but I am confused about how to do this, since there are so many Apache tools.
When I create an SQL table, I use a statement like the following:
CREATE TABLE table_name(
column1 datatype,
column2 datatype,
column3 datatype,
.....
columnN datatype,
PRIMARY KEY( one or more columns )
);
How does one convert this existing table into Parquet? Is the resulting file written to disk? If the original data is several GB, how long does the conversion take?
Could I write the original raw data directly in Parquet format instead?
Apache Spark can be used to do this:
Example:
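Below is a minimal PySpark sketch of the Spark approach, not a definitive recipe. It assumes the table lives in a database reachable over JDBC and that the matching JDBC driver jar is on Spark's classpath. The PostgreSQL URL, the credentials, the output directory and the helper names (jdbc_options, table_to_parquet, run_example) are all hypothetical placeholders; only table_name comes from the question.

```python
def jdbc_options(url, table, user, password, driver):
    """Collect the connection settings that Spark's JDBC source reads."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": driver,
    }


def table_to_parquet(spark, options, out_dir):
    """Read one SQL table over JDBC and write it to disk as Parquet."""
    df = spark.read.format("jdbc").options(**options).load()
    # Spark writes a directory of Parquet part files, not a single file.
    df.write.mode("overwrite").parquet(out_dir)
    return out_dir


def run_example():
    # pyspark is imported here so the helpers above load without it.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-to-parquet").getOrCreate()
    options = jdbc_options(
        url="jdbc:postgresql://localhost:5432/mydb",  # assumed database URL
        table="table_name",
        user="my_user",            # placeholder credentials
        password="my_password",
        driver="org.postgresql.Driver",  # assumed driver class
    )
    table_to_parquet(spark, options, "/tmp/table_name_parquet")
    spark.stop()
```

Calling run_example() (for instance from a script launched with spark-submit) writes the output to disk at the given path as a directory of Parquet part files. How long this takes for several GB depends on the database, the network and the cluster, so it is best measured on the actual data.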