I am newbie to AWS ecosystem. I am creating an application which queries data using AWS Athena. Data is transformed from JSON into parquet using AWS Glue and stored in S3.
Now use case is to update that parquet data using SQL.
can we update underlying parquet data using AWS Athena SQL command?
No, it is not possible to use
UPDATE
in Amazon Athena.Amazon Athena is a query engine, not a database. It performs queries on data that is stored in Amazon S3. It reads those files, but it does not modify or update those files. Therefore, it cannot 'update' a table.
The closest capability is using
CREATE TABLE AS
to create a new table. You can provide aSELECT
query that uses data from other tables, so you could effectively modify information and store it in a new table, and tell it to use Parquet for that new table. In fact, this is an excellent way to convert data from other formats into Snappy-compressed Parquet files (with partitioning, if you wish).