We have tables in CockroachDB which have a PK which always have 2 extra columns prefix, e.g. (u1, u2, id), now not needed anymore.
For performance, there's also the real primary key set up as a UNIQUE CONSTRAINT.
CONSTRAINT table1_pk PRIMARY KEY (u1, u2, id),
CONSTRAINT table1_ck_id UNIQUE (id)
I want to remove u1 and u2 from the table.
The issue is that the tables sometimes have billions of entries, and re-indexing will take quite some time. Cockroach DB does not support CONCURRENTLY, so AFAIK, this will block UPDATE queries for a long time.
I have identified 3 ways to do this:
A) "Promoting" a UNIQUE CONSTRAINT to PRIMARY KEY. This may not work.
ALTER TABLE tnt.xxx DROP CONSTRAINT xxx_pk;
ALTER TABLE tnt.xxx ADD CONSTRAINT xxx_pk PRIMARY KEY USING INDEX xxx_pk_id;
ALTER TABLE tnt.xxx DROP CONSTRAINT xxx_pk_id;
ALTER TABLE tnt.xxx DROP COLUMN u1, DROP COLUMN u2;
B) Changing the PK columns. See https://www.cockroachlabs.com/docs/stable/primary-key#changing-primary-key-columns
ALTER TABLE tnt.xxx ALTER PRIMARY KEY USING COLUMNS (id);
ALTER TABLE tnt.xxx DROP CONSTRAINT xxx_ck_id;
ALTER TABLE tnt.xxx DROP COLUMN u1, DROP COLUMN u2;
C) DROP and ADD the PRIMARY KEY in one transaction. See https://www.cockroachlabs.com/docs/v23.2/alter-table#drop-and-add-a-primary-key-constraint
ALTER TABLE tnt.xxx DROP CONSTRAINT xxx_pk;
ALTER TABLE tnt.xxx ADD CONSTRAINT xxx_pk PRIMARY KEY (id);
ALTER TABLE tnt.xxx DROP CONSTRAINT xxx_ck_id;
ALTER TABLE tnt.xxx DROP COLUMN u1, DROP COLUMN u2;
Is there any other way?
Which one has the best performance?
The migration happens in a transaction. So this should not leave the table without a PK for the SELECTs happening during that.
Schema changes in CRDB are always online and non-blocking, so there should be no concerns of IUD operations getting stuck behind any ALTER TABLE commands. See: https://www.cockroachlabs.com/docs/stable/online-schema-changes.
In practice your only concern running the sequence above is an impact on throughput if you have an active workload executing. The best course of action is to use implicit transactions (see the note in the link above, explicit ones are not fully supported). Then to directly execute:
The impact when doing online schema changes will be mainly to write operations, since the backfills will exhibit a phenomenon called dual-writes (i.e. each write will go to the old and new primary indexes).