I have a BATCH pipeline that needs to write to BigQuery, truncating the destination table. I'm using the STORAGE_WRITE_API method, but the table is not truncated; instead, the rows are appended.
.apply(BigQueryIO.<RunQueryResponse>write()
    .to(new TableReference()
        .setProjectId(clientProject)
        .setDatasetId(firestoreStateDataset)
        .setTableId(table))
    .withFormatFunction(new RunQueryResponseToTableRow())
    .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
    // Expected to replace the table contents, but rows are appended instead.
    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
I know WRITE_TRUNCATE doesn't work on streaming pipelines, but this is a BATCH pipeline. Does STORAGE_WRITE_API not support WRITE_TRUNCATE?
The table is not partitioned.
If I switch to the default method, FILE_LOADS, it works.
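For reference, this is the variant that truncates as expected (a minimal sketch; in a batch pipeline, FILE_LOADS is also what you get when .withMethod is omitted):

.apply(BigQueryIO.<RunQueryResponse>write()
    .to(new TableReference()
        .setProjectId(clientProject)
        .setDatasetId(firestoreStateDataset)
        .setTableId(table))
    .withFormatFunction(new RunQueryResponseToTableRow())
    // FILE_LOADS stages files and runs a BigQuery load job,
    // and load jobs honor WRITE_TRUNCATE.
    .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));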
STORAGE_WRITE_API does support both real-time streaming and batch data processing into BigQuery; however, under the hood it always writes through streams, which is why WRITE_TRUNCATE is not honored even in batch pipelines.
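One possible workaround, if you want to keep STORAGE_WRITE_API, is to truncate the table yourself before the pipeline runs and then write with WRITE_APPEND. A minimal sketch using the google-cloud-bigquery client library; the project, dataset, and table values are the question's placeholders:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;

public class TruncateBeforeWrite {
  public static void main(String[] args) throws Exception {
    // Placeholders from the question; substitute your own identifiers.
    String clientProject = "my-project";
    String firestoreStateDataset = "my_dataset";
    String table = "my_table";

    // Empty the destination table with a DML statement up front...
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    String sql = String.format(
        "TRUNCATE TABLE `%s.%s.%s`", clientProject, firestoreStateDataset, table);
    bigquery.query(QueryJobConfiguration.newBuilder(sql).build());

    // ...then run the Beam pipeline with STORAGE_WRITE_API and
    // WriteDisposition.WRITE_APPEND, since the table is already empty.
  }
}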
The Storage Write API documentation on batch loading data using the pending type provides instructions and code samples you can reference for your use case. Alternatively, you can explore configuring BigQuery load jobs for your batch data pipelines.
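For the load-job route outside of Beam, a minimal sketch with the google-cloud-bigquery Java client might look like the following; the GCS URI and table names are hypothetical:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.LoadJobConfiguration;
import com.google.cloud.bigquery.TableId;

public class LoadJobTruncate {
  public static void main(String[] args) throws Exception {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    // Hypothetical identifiers; replace with your own.
    TableId tableId = TableId.of("my-project", "my_dataset", "my_table");
    LoadJobConfiguration config =
        LoadJobConfiguration.newBuilder(tableId, "gs://my-bucket/data.json")
            .setFormatOptions(FormatOptions.json())
            // Load jobs honor WRITE_TRUNCATE: the table is replaced atomically.
            .setWriteDisposition(JobInfo.WriteDisposition.WRITE_TRUNCATE)
            .build();
    Job job = bigquery.create(JobInfo.of(config)).waitFor();
    if (job.getStatus().getError() != null) {
      throw new RuntimeException(job.getStatus().getError().toString());
    }
  }
}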