How to insert 38000 records in BigQuery using streaming insertAll?


I am trying to insert 38,000 records into BigQuery using the streaming insertAll method.

At first I got this error:

Insert operation not performed 
com.google.cloud.bigquery.BigQueryException: Read timed out 

Then I increased the timeout:

RetrySettings retrySetting = RetrySettings.newBuilder()
  .setTotalTimeout(Duration.ofMinutes(90))
  .build();
BigQueryOptions bigqueryOptions = BigQueryOptions.newBuilder()
  .setRetrySettings(retrySetting).build();
// Use the configured options; getDefaultInstance() would ignore the retry settings.
BigQuery bigquery = bigqueryOptions.getService();

Now I'm getting a different error:

Insert operation not performed 
com.google.cloud.bigquery.BigQueryException: Connection reset

How can I insert all 38,000 records into BigQuery?

There are 2 answers below.


The best approach is to identify a field over which you can split the data into clusters (e.g. an id or a date). Once you have identified the clusters based on that field, load each cluster's data individually, provided each cluster stays below the maximum limit of 10,000 rows per request. You can also load n clusters simultaneously to keep the overall load time down.
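A minimal sketch of that idea, assuming the rows are already in memory as a List of Map<String, Object>: the field name "date", the table id, and the ClusteredInsert class are placeholders, not anything from the question, so adjust them to your schema.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ClusteredInsert {

    // Hypothetical field used to group the rows; replace with a column from your schema.
    private static final String CLUSTER_FIELD = "date";

    public static void insertByCluster(BigQuery bigquery, TableId tableId,
                                       List<Map<String, Object>> rows) {
        // Group the rows by the chosen field so that each group
        // stays well below the 10,000-rows-per-request limit.
        Map<Object, List<Map<String, Object>>> clusters = rows.stream()
                .collect(Collectors.groupingBy(r -> r.get(CLUSTER_FIELD)));

        for (Map.Entry<Object, List<Map<String, Object>>> cluster : clusters.entrySet()) {
            InsertAllRequest.Builder request = InsertAllRequest.newBuilder(tableId);
            cluster.getValue().forEach(request::addRow);

            // One insertAll call per cluster instead of one call for everything.
            InsertAllResponse response = bigquery.insertAll(request.build());
            if (response.hasErrors()) {
                response.getInsertErrors().forEach((index, errors) ->
                        System.err.println("Row " + index + " in cluster "
                                + cluster.getKey() + " failed: " + errors));
            }
        }
    }
}

If a single cluster can still exceed 10,000 rows, it has to be split further into fixed-size batches.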


Maybe you are hitting limits:

Maximum rows per request: 10,000 rows per request

A maximum of 500 rows is recommended. Batching can increase performance and throughput to a point, but at the cost of per-request latency. Too few rows per request and the overhead of each request can make ingestion inefficient. Too many rows per request and the throughput may drop.

A maximum of 500 rows per request is recommended, but experimentation with representative data (schema and data sizes) will help you determine the ideal batch size.

Try splitting the data into chunks.
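A rough sketch of that chunking, again assuming an in-memory List of Map<String, Object> and a table id you supply yourself (the ChunkedInsert class and BATCH_SIZE constant are illustrative, not from the answer): slice the list into 500-row batches and issue one insertAll call per batch.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;

import java.util.List;
import java.util.Map;

public class ChunkedInsert {

    // 500 rows per request is the documented recommendation; tune it for your schema.
    private static final int BATCH_SIZE = 500;

    public static void insertInChunks(BigQuery bigquery, TableId tableId,
                                      List<Map<String, Object>> rows) {
        for (int start = 0; start < rows.size(); start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, rows.size());

            InsertAllRequest.Builder request = InsertAllRequest.newBuilder(tableId);
            rows.subList(start, end).forEach(request::addRow);

            InsertAllResponse response = bigquery.insertAll(request.build());
            if (response.hasErrors()) {
                // Indices in the error map are relative to the current batch.
                response.getInsertErrors().forEach((index, errors) ->
                        System.err.println("Row " + (start + index) + " failed: " + errors));
            }
        }
    }
}

With 38,000 rows this works out to 76 requests of 500 rows each, right at the 500-row recommendation and far below the 10,000-row hard limit, and each request stays small enough that read timeouts and connection resets are much less likely.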