I have running few BigQuery DML tests to get a better idea of the performance of BigQuery DML capabilities. So far, here are some initial observations: 1) Slow performance when updating only a few records in a very small table (30K+ records)
UPDATE babynames.names_2014
SET name = 'Emma B'
WHERE name = 'Emma';
Output: - 2 rows affected (# of records in the table: 33176) - Query complete (4.5s elapsed, 621 KB processed)
2) Very slow performance when deleting only few records from a small table
SQL:
DELETE from babynames.names_2014_copy
where gender<>'U'
Output: -2 rows affected. -Query complete (162.6s elapsed, 1.21 MB processed) - ~3 minutes
QUESTIONS: 1) Are these known behavior? 2) Any suggestions on how to improve the performance?
I have also noticed that
Update
andDelete
operations might me very slow in BigQuery.Interestingly, overwriting the table with "create or replace table" statement usually has significantly better performance.
So instead of:
consider just using:
A similar technique also works for
Update
; you just need to write acase
statement to modify your values.