I have parquet files stored in s3 location which are partitioned by date key. Using Polars, i need to read the parquet file(s) from the latest date key folder. Here is an example of my s3 structure:
Amazon S3>Buckets>Bucket name>dev/>target/>refined/>STUDENTS.parquet/
Under STUDENTS.parquet, there are several folders partitioned by DATE_KEY i.e.,
Each of these folders contain parquet file(s).
Using Polars, i need to read the parquet file from the latest date key folder (DATE_KEY=2024-03-06/ in this example) into a Polars dataframe.
Do you think doing a descending sort on Name folder would be a way to achieve this?
Can someone please help me on his as i'm after Polars dataframe and not Pandas.

I see two ways to achieve this.
Option 1. Scan entire dataset into a
pl.LazyFrameand filter on the fly.Option 2. Obtain name of latest partition folder and read only corresponding data.