Best practice for running database schema migrations

1.9k Views Asked by At

Build servers are generally detached from the VPC running the instance. Be it Cloud Build on GCP, or utilising one of the many CI tools out there (CircleCI, Codeship etc), thus running DB schema updates is particularly challenging.

So, it makes me wonder.... When's the best place to run database schema migrations?

From my perspective, there are four opportunities to automatically run schema migrations or seeds within a CD pipeline:

  1. Within the build phase
  2. On instance startup
  3. Via a warm-up script (synchronously or asynchronously)
  4. Via an endpoint, either automatically or manually called post deployment

The primary issue with option 1 is security. With Google Cloud Sql/Google Cloud Build, it's been possible for me to run (with much struggle), schema migrations/seeds via a build step and a SQL proxy. To be honest, it was a total ball-ache to set up...but it works.

My latest project is utilising MongoDb, for which I've connected in migrate-mongo if I ever need to move some data around/seed some data. Unfortunately there is no such SQL proxy to securely connect MongoDb (atlas) to Cloud Build (or any other CI tools) as it doesn't run in the instance's VPC. Thus, it's a dead-end in my eyes.

I'm therefore warming (no pun intended) to the warm-up script concept.

With App Engine, the warm-up script is called prior to traffic being served, and on the host which would already have access via the VPC. The warmup script is meant to be used for opening up database connections to speed up connectivity, but assuming there are no outstanding migrations, it'd be doing exactly that - a very light-weight select statement.

Can anyone think of any issues with this approach?

Option 4 is also suitable (it's essentially the same thing). There may be a bit more protection required on these endpoints though - especially if a "down" migration script exists(!)

1

There are 1 best solutions below

2
guillaume blaquiere On

It's hard to answer you because it's an opinion based question!

Here my thoughts about your propositions

  1. It's the best solution for me. Of course you have to take care to only add field and not to delete or remove existing schema field. Like this, you can update your schema during the Build phase, then deploy. The new deployment will take the new schema and the obsolete field will no longer be used. On the next schema update, you will be able to delete these obsolete field and clean your schema.
  2. This solution will decrease your cold start performance. It's not a suitable solution
  3. Same remark as before, in addition to be sticky to App Engine infrastructure and way of working.
  4. No real advantage compare to the solution 1.

About security, Cloud Build will be able to work with worker pool soon. Still in alpha but I expect in the next month an alpha release of it.