Athena preserve order

Is there a way to preserve row order in Athena query results? Assume the data in the S3 bucket (or data lake) is partitioned and stored as Parquet files. Each time I run a query, the rows come back in a different order. I am not sure how Athena works internally, but I assume multiple workers process the query in parallel and their results are combined, which would explain why the order changes between runs. Is it possible to preserve the order of the results if all the data comes from a single Parquet file?

There is 1 best solution below

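SQL result order is only guaranteed when the query has an ORDER BY clause, so the reliable fix is to add one, even when everything comes from a single Parquet file. If the data in your original files is already sorted by time, adding an order by time_column shouldn't add much cost to the query. Conceptually, each worker sorts a small fraction of the data, and the sorted results from the workers are then combined with a merge sort. For data that's already sorted, these are inexpensive operations.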
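To make the "sort per worker, then merge" idea concrete, here is a minimal Python sketch. It is only a conceptual model, not how Athena is actually implemented: the three "workers", the sample rows, and the helper names `sort_on_worker` and `merge_worker_results` are all made up for illustration.

```python
import heapq


def sort_on_worker(rows, key):
    """Hypothetical worker step: sort only this worker's slice of the data."""
    return sorted(rows, key=key)


def merge_worker_results(sorted_chunks, key):
    """Hypothetical coordinator step: k-way merge of already-sorted chunks."""
    return list(heapq.merge(*sorted_chunks, key=key))


def by_time(row):
    """Sort key: the first field plays the role of time_column."""
    return row[0]


# Made-up rows of (time_column, value), split across three pretend workers.
chunks = [
    [(1, "a"), (4, "d"), (7, "g")],
    [(2, "b"), (5, "e"), (8, "h")],
    [(3, "c"), (6, "f"), (9, "i")],
]

sorted_chunks = [sort_on_worker(chunk, by_time) for chunk in chunks]
result = merge_worker_results(sorted_chunks, by_time)
```

Each chunk here is already in time order, so the per-worker sort has almost nothing to do (Python's built-in sort runs in linear time on sorted input), and the merge only compares the head of each chunk. Whether Athena's engine gets the same benefit on pre-sorted files is an assumption, but the overall shape of the work is the same as described above.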