Documentation for the Parquet-mr Java library

I need to use the Parquet-mr library to read Parquet files programmatically in Java. I need to read only a few columns and skip the rest (for example, read 3 columns out of 500). I can't find any documentation on how to do that. Can someone please point me to some, if it exists?

There is 1 solution below

Unfortunately, this is not well documented. There are some examples you can check out here. They use the ExampleParquetWriter class from Parquet, which was meant to serve as an example only. Nevertheless, it works.
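For the column selection asked about in the question, a minimal sketch with the same example object model could look like the one below. It assumes parquet-hadoop and the Hadoop client libraries are on the classpath. The class name ProjectedRead, the helper projectionConf and the columns id, name and score are made up for illustration. The projection is passed as a Parquet message schema through the ReadSupport.PARQUET_READ_SCHEMA configuration key, and each listed field has to match the type and repetition declared in the file's own schema, otherwise the read is rejected.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.parquet.hadoop.api.ReadSupport;
import org.apache.parquet.hadoop.example.GroupReadSupport;

public class ProjectedRead {

    // Hypothetical projection: only these three columns are requested.
    // Adjust names, types and repetition to match the file being read.
    static final String PROJECTION =
        "message projection {"
            + " optional int64 id;"
            + " optional binary name (UTF8);"
            + " optional double score;"
            + " }";

    // Puts the requested read schema into a Hadoop Configuration.
    static Configuration projectionConf(String projection) {
        Configuration conf = new Configuration();
        conf.set(ReadSupport.PARQUET_READ_SCHEMA, projection);
        return conf;
    }

    public static void main(String[] args) throws Exception {
        Path file = new Path(args[0]); // path to the Parquet file
        Configuration conf = projectionConf(PROJECTION);

        // GroupReadSupport belongs to the example object model.
        try (ParquetReader<Group> reader =
                 ParquetReader.builder(new GroupReadSupport(), file)
                     .withConf(conf)
                     .build()) {
            Group row;
            // read() returns null once the file is exhausted.
            while ((row = reader.read()) != null) {
                System.out.println(row);
            }
        }
    }
}
```

Because Parquet stores data column by column, the columns left out of the projection should not need to be decoded at all.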

The proper way to use Parquet would be either through one of the supported object models (like Avro, Thrift or Protobuf) or by implementing your own object model (which leads to the best performance). You can read more about object models here.
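As an illustration of the object model route, here is a hedged sketch of the same projected read through parquet-avro. It assumes parquet-avro, Avro and the Hadoop client libraries are on the classpath. The class name AvroProjectedRead, the record name Projection and the fields id, name and score are hypothetical. The requested columns are described as an Avro schema and handed over with AvroReadSupport.setRequestedProjection, and the field types again need to be compatible with what the file actually contains.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.avro.AvroReadSupport;
import org.apache.parquet.hadoop.ParquetReader;

public class AvroProjectedRead {

    // Hypothetical projection: an Avro record that lists only the wanted columns.
    static Schema projection() {
        return SchemaBuilder.record("Projection")
            .fields()
            .optionalLong("id")
            .optionalString("name")
            .optionalDouble("score")
            .endRecord();
    }

    public static void main(String[] args) throws Exception {
        Path file = new Path(args[0]); // path to the Parquet file

        Configuration conf = new Configuration();
        // Ask the Avro read support to materialize only the projected fields.
        AvroReadSupport.setRequestedProjection(conf, projection());

        try (ParquetReader<GenericRecord> reader =
                 AvroParquetReader.<GenericRecord>builder(file)
                     .withConf(conf)
                     .build()) {
            GenericRecord record;
            // read() returns null once the file is exhausted.
            while ((record = reader.read()) != null) {
                System.out.println(record.get("id") + " " + record.get("name"));
            }
        }
    }
}
```

Newer Parquet releases prefer building the reader from an InputFile instead of a Path, so the builder call may need adjusting depending on the version in use.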