When not to use Rails db:migrate?

Rails db:migrate is used to manage database structural changes, .e.g. adding database columns, and database content changes, e.g. populating lookup tables. In general it works very well. There are however cases when alternative database migration approaches should be used. These include:

long lasting migrations
those which lock a table for too long
destructive structural changes

Locking table for too long

Locking table for too long will result in an application downtime as requests attempting to make changes to the table will time out. Structural changes to large tables fall into this category as executing ALTER TABLE will lock the table.

The way around the issue is to implement a multi-step process:

Before the release

Create a new table with the new structure.
Gradually copy data from the original table to the new table.

During the release (this can be executed as a transaction)

Copy remaining data, i.e. original table data which are not in the new table yet.
Rename the original table.
Rename the new table to the original table name.

The process can be implemented as a set of rake tasks or database procedures.

Long lasting migrations

Long lasting migrations, even if they do not lock tables for a long time, can be problematic. They elongate the release process and typically you would want to complete a release process as quickly as possible to move to a stable infrastructure state. Data aggregation or data model de-normalization migrations fall into such category. For example you may decide to add a customer balance column to a customer table when calculating balance from transactions table is starting to take too long. This migration can be executed in three steps:

In release A add a new aggregated column.
Between release A and B run a rake task which populates the new column.
In release B deploy code which uses the new column.

Destructive structural changes

Changes such as deleting columns can result in exceptions if a code using a column and the column itself are removed as part of a single release process. Between the time database migration is run and the time code on all servers is upgraded to the latest code base, the application may attempt to use a column which is no longer there.

A simple solution is to do structural deletions over two releases:

In release A remove all code references to the database element (e.g. column) about to be removed.
In release B implement database changes.

Work management

If migration involves multiple steps spanning two releases we simply create multiple, cross-referenced work tickets.

Rake tasks naming convention

One of the benefits of db:migrate is that it automates sequence of execution for various migrations. That benefit is lost when using rake to implement migration tasks. Implementing a naming convention for the rake tasks as outline below can help:

	namespace "x.x.x.x" do

	desc "bar"
	task :foo => [:environment] do
	# foo implementation goes here
	end

	end

	namespace "y.y.y.y" do
	# release y.y.y.y rake tasks go here
	end

view raw

releases.rake

hosted with ❤ by GitHub

Tasks associated with a given release are grouped together using the release number as a namespace for the tasks. In general, releases rake file should be sparsely populated, i.e. only a small subset of releases should require migration rake tasks.

Summary

Rails db:migrate has many benefits and should be used whenever possible to implement database migration. There are a few use cases however where blindly using db:migrate would lead to issues. Simple processes utilizing rake tasks provide a viable alternative for these scenarios.

software development

mysql, programming, ruby on rails, software development

July 28, 2012

1 Comment

CTO Insights