I am getting None for the statistics (min/max) when reading a file from S3 using fastparquet. When calling
fp.ParquetFile(fn=path, open_with=myopen).statistics['min']
most of the values are None, and only some of them are valid.
However, when I read the same file with another framework, I get the correct min/max for all values.
How can I get all the statistics? Thanks
The full set of row groups is available as a list, and each row group has a .columns attribute. Each column in turn has meta_data, so you can dig around to see what the individual min/max values of the columns are.
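As a rough sketch of that digging, something like the following should work. It assumes the list is exposed as pf.row_groups and that each column's meta_data carries the standard Parquet path_in_schema and statistics fields (with min_value/max_value and the older min/max). The helper name collect_raw_stats is made up for illustration and is not part of fastparquet.

```python
def collect_raw_stats(pf):
    """Return one dict per (row group, column) with the raw min/max metadata.

    `pf` is assumed to be a fastparquet ParquetFile, e.g.
        pf = fp.ParquetFile(fn=path, open_with=myopen)
    but any object with the same attribute layout works.
    """
    rows = []
    for rg_index, rg in enumerate(pf.row_groups):
        for col in rg.columns:
            md = col.meta_data
            # The statistics struct is optional; some writers omit it.
            stats = getattr(md, "statistics", None)

            def pick(new_name, old_name):
                # Prefer the newer min_value/max_value fields and fall
                # back to the legacy min/max fields if they are unset.
                if stats is None:
                    return None
                value = getattr(stats, new_name, None)
                if value is None:
                    value = getattr(stats, old_name, None)
                return value

            # Path parts may be str or bytes depending on the version.
            name = ".".join(
                p.decode() if isinstance(p, bytes) else str(p)
                for p in md.path_in_schema
            )
            rows.append({
                "row_group": rg_index,
                "column": name,
                "min": pick("min_value", "min"),
                "max": pick("max_value", "max"),
            })
    return rows
```

The values returned here are whatever is stored in the file metadata, so they may be undecoded bytes rather than Python numbers or strings. A column that shows None in this output has no statistics written for that row group, whereas a column that has values here but None in .statistics would point at how the values are being decoded.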