Integration tests with BigQuery's JsonStreamWriter in Java


For local development and integration testing, I am using a BigQuery emulator that exposes port 9050 (HTTP) and port 9060 (gRPC). For example, this is how I instantiate the BigQuery client in Java (this works and connects to my emulator):

BigQuery bigQuery = BigQueryOptions.newBuilder()
        .setProjectId(BQ_PROJECT)
        .setHost("http://localhost:9050")
        .build().getService();

Now I need to do the same for a JsonStreamWriter. Unfortunately, I couldn't find a way to do it, and the documentation doesn't make it clear.

I have tried instantiating the JsonStreamWriter in many different ways, but with no success:

JsonStreamWriter streamWriter = JsonStreamWriter.newBuilder(
        TableName.of(BQ_PROJECT, BQ_DATASET, BQ_TABLE).toString(),
        BigQueryWriteClient.create(
            BigQueryWriteSettings.newBuilder()
                .setEndpoint("localhost:9060")
                .build()))
    .build();

No matter what I try, it always hangs on the call to build(). Does anybody know what is going wrong, or what else I should try?


There are 2 best solutions below


Can you try this sample? (The comments include a brief description.)

public void initialize(TableName parentTable)
    throws DescriptorValidationException, IOException, InterruptedException {
  // Use the JSON stream writer to send records in JSON format. Specify the table name to write
  // to the default stream.
  // For more information about JsonStreamWriter, see:
  // https://googleapis.dev/java/google-cloud-bigquerystorage/latest/com/google/cloud/bigquery/storage/v1/JsonStreamWriter.html
  streamWriter =
      JsonStreamWriter.newBuilder(parentTable.toString(), BigQueryWriteClient.create())
          .setExecutorProvider(
              FixedExecutorProvider.create(Executors.newScheduledThreadPool(100)))
          .setChannelProvider(
              BigQueryWriteSettings.defaultGrpcTransportProviderBuilder()
                  .setKeepAliveTime(org.threeten.bp.Duration.ofMinutes(1))
                  .setKeepAliveTimeout(org.threeten.bp.Duration.ofMinutes(1))
                  .setKeepAliveWithoutCalls(true)
                  .setChannelsPerCpu(2)
                  .build())
          .build();
}

You can find the other functions of this sample in the same file on GitHub: https://github.com/googleapis/java-bigquerystorage/blob/main/samples/snippets/src/main/java/com/example/bigquerystorage/WriteToDefaultStream.java


The Google Java client still tries to connect to the gRPC endpoint over TLS. You can supply your own transport channel to switch it back to plaintext:

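// Builder call, e.g. chained on BigQueryWriteSettings.newBuilder()
.setTransportChannelProvider(
    FixedTransportChannelProvider.create(
        GrpcTransportChannel.create(
            // usePlaintext() disables TLS on the channel
            NettyChannelBuilder.forTarget("localhost:9060").usePlaintext().build())))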

This is one way to get a plaintext transport.

I had to apply this to both the BigQueryOptions and the StreamWriter in my project.
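To show where the fragment above might fit, here is a minimal sketch of a factory that wires a plaintext channel into both the BigQueryWriteClient and the JsonStreamWriter. Several things here are my assumptions, not part of the answer: the class and method names (EmulatorStreamWriterFactory, plaintextProvider, emulatorSettings, create) are hypothetical, I swapped NettyChannelBuilder for the generic io.grpc.ManagedChannelBuilder to avoid depending on a specific Netty artifact, and I added NoCredentialsProvider on the assumption that the emulator does not check credentials. Whether the writer reuses the client's transport may depend on the library version, so the sketch sets the provider in both places.

```java
import com.google.api.gax.core.NoCredentialsProvider;
import com.google.api.gax.grpc.GrpcTransportChannel;
import com.google.api.gax.rpc.FixedTransportChannelProvider;
import com.google.api.gax.rpc.TransportChannelProvider;
import com.google.cloud.bigquery.storage.v1.BigQueryWriteClient;
import com.google.cloud.bigquery.storage.v1.BigQueryWriteSettings;
import com.google.cloud.bigquery.storage.v1.JsonStreamWriter;
import com.google.cloud.bigquery.storage.v1.TableName;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.io.IOException;

// Hypothetical helper class; requires google-cloud-bigquerystorage on the classpath.
public class EmulatorStreamWriterFactory {

  // Wraps a plaintext (non-TLS) gRPC channel so that gax-based clients can use it.
  // The channel connects lazily, so this call does not need a running emulator.
  static TransportChannelProvider plaintextProvider(String target) {
    ManagedChannel channel = ManagedChannelBuilder.forTarget(target).usePlaintext().build();
    return FixedTransportChannelProvider.create(GrpcTransportChannel.create(channel));
  }

  // Client settings that use the plaintext channel and skip credential lookup
  // (assumption: the emulator does not validate credentials).
  static BigQueryWriteSettings emulatorSettings(TransportChannelProvider provider)
      throws IOException {
    return BigQueryWriteSettings.newBuilder()
        .setTransportChannelProvider(provider)
        .setCredentialsProvider(NoCredentialsProvider.create())
        .build();
  }

  // Builds a writer for the table's default stream against the emulator.
  // This call does contact the emulator, because the writer fetches the table schema.
  static JsonStreamWriter create(String project, String dataset, String table, String target)
      throws Exception {
    TransportChannelProvider provider = plaintextProvider(target);
    BigQueryWriteClient client = BigQueryWriteClient.create(emulatorSettings(provider));
    return JsonStreamWriter.newBuilder(TableName.of(project, dataset, table).toString(), client)
        // Set the same transport on the writer in case it opens its own connection.
        .setChannelProvider(provider)
        .setCredentialsProvider(NoCredentialsProvider.create())
        .build();
  }
}
```

With the question's constants, usage would look like EmulatorStreamWriterFactory.create(BQ_PROJECT, BQ_DATASET, BQ_TABLE, "localhost:9060"). Note that the client and the writer share one channel in this sketch, so closing either one may shut the channel down for the other.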