Faster way to copy a large PostgreSQL table into a new table with Ruby on Rails migrations

I'm working on a Ruby on Rails project with a PostgreSQL database, and I have a large table with millions of records. I want to copy this table into a new table named new_X, and I want both tables to be identical at the end: the same constraints and default values. The old table's primary key is id, and I want new_X to have the same auto-incrementing primary key. I have two approaches:

Approach 1: Copy the table and then add the constraints and indexes

First copying the values to the new table, and creating a sequence for the id, and then adding the not null constraints and default values.

# Copy the old table to new table
execute <<-SQL.squish
  CREATE TABLE new_X AS TABLE old_X;
SQL

# Create a new serial sequence for id
execute <<-SQL.squish
  CREATE SEQUENCE new_X_id_seq;
SQL

# Update the id column to use the new sequence
execute <<-SQL.squish
  ALTER TABLE new_X
  ALTER COLUMN id SET DEFAULT nextval('new_X_id_seq');
SQL

# Set the sequence's current value to the maximum existing id value
execute <<-SQL.squish
  SELECT setval('new_X_id_seq', (SELECT max(id) FROM new_X));
SQL
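
I'm also planning to mark the sequence as owned by the id column, so that dropping the table (or the column) drops the sequence with it, which I believe matches how a serial column behaves:

# Tie the sequence's lifecycle to new_X.id, as a serial column would
execute <<-SQL.squish
  ALTER SEQUENCE new_X_id_seq OWNED BY new_X.id;
SQL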

# Set the id column as the primary key
execute <<-SQL.squish
  ALTER TABLE new_X
  ADD PRIMARY KEY (id);
SQL

After creating the table and copying the records, I'll restore the constraints, default values, and indexes, since CREATE TABLE ... AS doesn't copy them. For example, the NOT NULL constraints and defaults on the timestamp columns, and the index.

# Restore the NOT NULL constraints and defaults on created_at and updated_at
# (the columns themselves were already copied by CREATE TABLE ... AS)
execute <<-SQL.squish
  ALTER TABLE new_X
  ALTER COLUMN created_at SET NOT NULL,
  ALTER COLUMN created_at SET DEFAULT NOW(),
  ALTER COLUMN updated_at SET NOT NULL,
  ALTER COLUMN updated_at SET DEFAULT NOW();
SQL

Now add the index.

# Add index for is_Y
execute <<-SQL.squish
  CREATE INDEX CONCURRENTLY index_new_X_on_is_Y
  ON new_X (is_Y);
SQL
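
One caveat I've read about: PostgreSQL won't run CREATE INDEX CONCURRENTLY inside a transaction block, and Rails wraps each migration in one by default, so I assume the migration class needs to opt out (class name and Rails version here are just placeholders):

class CopyOldXToNewX < ActiveRecord::Migration[7.0]
  # CREATE INDEX CONCURRENTLY is not allowed inside a transaction block,
  # so run this migration without the implicit wrapping transaction
  disable_ddl_transaction!

  def up
    # ... the statements above ...
  end
end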

Approach 2: Create the table with column names, constraints, defaults, and indexes, then insert the data

Create a table named new_X up front with all the column names, constraints, defaults, and indexes, and then insert the millions of records from the old table.
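
Roughly, this sketch is what I have in mind. As far as I know, LIKE ... INCLUDING ALL copies the column definitions, defaults, NOT NULL/CHECK constraints, and indexes (including the primary key), though if id is a plain serial column I believe the copied default would still point at old_X's sequence, so it would need the same sequence treatment as in approach 1:

# Create new_X with old_X's columns, defaults, constraints, and indexes
execute <<-SQL.squish
  CREATE TABLE new_X (LIKE old_X INCLUDING ALL);
SQL

# Copy the data across in one statement
execute <<-SQL.squish
  INSERT INTO new_X SELECT * FROM old_X;
SQL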

I'm leaning toward approach 1, but I would appreciate any insights, best practices, or experiences from the community in dealing with similar scenarios. Thank you!

Since I don't have millions of records in my development environment, I haven't been able to test how either approach behaves at scale.
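
To at least sanity-check the migrations locally, I suppose I could seed synthetic rows with generate_series; the column list here is a guess based on the examples above:

# Hypothetical seed: fill old_X with 5 million synthetic rows
# (assumes is_Y is a boolean and id has a serial/identity default)
execute <<-SQL.squish
  INSERT INTO old_X (is_Y, created_at, updated_at)
  SELECT g % 2 = 0, NOW(), NOW()
  FROM generate_series(1, 5000000) AS g;
SQL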
