BigQuery table on top of Bigtable takes too long to read in a Google Dataflow job


I have a Dataflow job that reads from a BigQuery table (created on top of a Bigtable table). The Dataflow job is built from a custom template in Java. I need to process around 500 million records from BigQuery. The problem is that reading even 1 million records is slow: the BigQuery read takes 26 minutes and the Dataflow job as a whole takes 36 minutes.
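To put those numbers in perspective, here is a rough extrapolation of the read time for the full data set. It assumes the read rate stays constant at what I measured for 1 million records, which is only an assumption (real throughput may scale differently). The class and method names are made up for illustration. At that rate, 500 million records would take about 13,000 minutes, roughly 9 days.

```java
// Rough extrapolation from the measured read rate.
// Assumption: throughput stays constant as the record count grows.
final class ReadEstimate {

    // Minutes needed to read targetRecords at the rate observed in the sample run.
    static double projectedMinutes(long sampleRecords, double sampleMinutes, long targetRecords) {
        // Scale the sample duration by the ratio of target size to sample size.
        return (double) targetRecords / sampleRecords * sampleMinutes;
    }

    public static void main(String[] args) {
        // Figures from the question: 1 million records read in 26 minutes,
        // 500 million records to process in total.
        double minutes = projectedMinutes(1_000_000L, 26.0, 500_000_000L);
        System.out.printf("Projected read time: %.0f minutes (about %.1f days)%n",
                minutes, minutes / (60 * 24));
    }
}
```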

Any suggestions on how to improve the read performance?

Does the Apache Beam programming model support reading from a source in parallel? Is there an IO connector available for parallel reads from Bigtable? Any help would be highly appreciated.
