Importing a local CSV into a BigQuery Temporary Table

I am attempting to import a local CSV into a BigQuery table, but am running into the following error:

An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 4550751

This has been happening for a few days now, so I don't think it's a purely transient issue. I do not have access to Google Cloud support, so I unfortunately cannot contact them.
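For reference, the retry with back-off that the error message recommends could look roughly like the sketch below. `BackoffRetry` and its parameters are hypothetical names used only for illustration; they are not part of the BigQuery SDK. It only helps with genuinely transient failures, which does not seem to be the case here.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of the BigQuery SDK): retries a task with exponential back-off. */
final class BackoffRetry {
    private BackoffRetry() {}

    /**
     * Runs the task up to maxAttempts times. After each failed attempt it sleeps,
     * starting at initialDelayMillis and doubling the delay every time.
     * The last failure is rethrown once the attempts are used up.
     */
    static <T> T call(Callable<T> task, int maxAttempts, long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread was interrupted; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

Each retried attempt would presumably need to submit a fresh job rather than reuse the failed one.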

I am using the BigQuery Java SDK (version 2.29) and am following this strategy:

First, I run a query in a new session against a table whose schema matches the CSV that I'm trying to import, so that the result ends up in a temporary table. The code for this looks something like:

// Create temporary table
String query = "SELECT * FROM " + original + " WHERE 1=1";
QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query).setCreateSession(true).build();

String jobname = "jobId_" + datasetName;
JobId jobId = JobId.newBuilder().setJob(jobname).build();

// Submit the query job under the explicit job ID
bigqueryService.create(JobInfo.of(jobId, queryConfig));
// Fetch the job that was just submitted and block until it finishes
Job job = bigqueryService.getJob(jobId);
Job completed = job.waitFor();
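One thing I should probably rule out: the job name above depends only on datasetName, so every run reuses the same job ID, while BigQuery expects job IDs to be unique within a project. A minimal sketch of generating a distinct name per run follows; `JobNames` is a hypothetical helper of my own, not an SDK class, and I have not confirmed that this is related to the error.

```java
import java.util.UUID;

/** Hypothetical helper (not part of the BigQuery SDK): builds a job name that differs on every call. */
final class JobNames {
    private JobNames() {}

    /**
     * Replaces anything other than letters, digits, underscores and dashes in the prefix,
     * then appends a random UUID so that two calls never return the same name.
     */
    static String unique(String prefix) {
        String safePrefix = prefix.replaceAll("[^A-Za-z0-9_-]", "_");
        return safePrefix + "_" + UUID.randomUUID();
    }
}
```

The result would be passed to the same builder call as before, for example `JobId.newBuilder().setJob(JobNames.unique("jobId_" + datasetName)).build()`.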

I then get the TableId object and session ID string from the job like so:

String sessionId = completed.getStatistics().getSessionInfo().getSessionId();
TableId tble = ((QueryJobConfiguration) job.getConfiguration()).getDestinationTable(); 

Both are then passed to a WriteChannelConfiguration object, which is used to open a TableDataWriteChannel. I wrap that channel in a Java OutputStream and stream the CSV to BigQuery through it. This is taken from the docs:

ArrayList<ConnectionProperty> props = new ArrayList<>();
props.add(ConnectionProperty.of("session_id", sessionId));

// Use the TableId object from earlier to configure the writechannel
WriteChannelConfiguration config =
    WriteChannelConfiguration.newBuilder(tble)
        .setFormatOptions(
            CsvOptions.newBuilder()
                .setSkipLeadingRows(1)
                .setAllowQuotedNewLines(true)
                .build())
        .setConnectionProperties(props)
        .build();

JobId writerJob = JobId.newBuilder().build();
TableDataWriteChannel dataWriter = bigqueryService.writer(writerJob, config);

try (OutputStream stream = Channels.newOutputStream(dataWriter)) {
    Files.copy(csvPath, stream);
}

However, the job to load the CSV fails with the error message above. My questions are:

  • Is it possible to upload a local CSV to a BigQuery temporary table? And if so, are there obvious issues with my code?
  • Is there a way to debug the internal error on my end?
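On the debugging side, one check I can do locally is to confirm that every record in the CSV has the same number of fields as the header, including records with quoted newlines. Below is a rough sketch using only the standard library; `CsvShapeCheck` is a hypothetical helper, it assumes comma delimiters and double-quote quoting, and it is not a full CSV parser.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: counts the fields of each CSV record, honouring double-quoted fields. */
final class CsvShapeCheck {
    private CsvShapeCheck() {}

    /** Returns one field count per non-empty record, in file order. */
    static List<Integer> fieldCounts(Reader reader) throws IOException {
        List<Integer> counts = new ArrayList<>();
        boolean inQuotes = false;
        boolean recordHasContent = false;
        int fields = 1;
        int c;
        while ((c = reader.read()) != -1) {
            if (c == '"') {
                // An escaped quote ("") toggles twice, so the state ends up unchanged.
                inQuotes = !inQuotes;
                recordHasContent = true;
            } else if (inQuotes) {
                // Commas and newlines inside quotes are data, not separators.
                recordHasContent = true;
            } else if (c == ',') {
                fields++;
                recordHasContent = true;
            } else if (c == '\n') {
                if (recordHasContent) {
                    counts.add(fields);
                }
                fields = 1;
                recordHasContent = false;
            } else if (c != '\r') {
                recordHasContent = true;
            }
        }
        if (recordHasContent) {
            counts.add(fields); // last record without a trailing newline
        }
        return counts;
    }
}
```

Calling it with `Files.newBufferedReader(csvPath)` and comparing every count against the first one would at least show whether the file itself is malformed before it is uploaded.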

Thanks!
