Cron is a great tool for running scheduled tasks on Unix servers. It has some shortcomings, however, when running applications in the cloud.

Cron is essentially a single server solution. When running applications on multiple servers, scheduled tasks can be divided into two categories:

  1. Those which should run on all servers of a given role, e.g. log rotation on application servers.
  2. Those which should be executed on a single server only, e.g. MySQL data aggregation.

Handling the former is straightforward – you simply configure identical cron jobs on the servers. It is the latter which creates a challenge.

A simplistic way of handling category two tasks would be to configure these cron jobs on one server only. This approach has several problems:

  1. It requires manual work to configure a unique server.
  2. If that server goes down, another server has to be set up manually or a cron specific failover mechanism needs to be put in place.
  3. The solution lacks symmetry and elegance.

What is needed is a better solution, one that is fully automated and scalable:


  1. Cron jobs should be automatically configured on all servers upon launch.
  2. There needs to be an easy mechanism to prevent jobs which should be executed only once from running on multiple servers.
  3. The load generated by cron jobs should be distributed across multiple servers.


Our solution (we are a Ruby on Rails shop) includes three components:

  1. the whenever gem to configure cron jobs
  2. a custom database semaphore mechanism to guarantee that only one instance of a job will run
  3. the delayed_job gem for load distribution


The whenever gem provides a mechanism for defining and deploying cron jobs.

Jobs are defined in a config/schedule.rb file such as the one shown below:

# Use this file to easily define all of your cron jobs.
# It's helpful, but not entirely necessary to understand cron before proceeding.

# Example:
# every 2.hours do
#   command "/usr/bin/some_great_command"
#   runner "MyModel.some_method"
#   rake "some:great:rake:task"
# end
# every 4.days do
#   runner "AnotherModel.prune_old_records"
# end

# Learn more:

# NOTE: the interval was lost from the original listing; 1.day is assumed here.
every 1.day, :at => '1:00 am' do
  rake "data:verify"
end

To deploy jobs, a crontab file is generated from schedule.rb by running the following command:

whenever --set environment=#{ENV['RAILS_ENV']} -w

This is incorporated into the server launch process.
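
To make the schedule-to-crontab step more concrete, here is a minimal sketch of how a daily entry such as the 1:00 am job could translate into a crontab line. `daily_cron_line` is a hypothetical helper written for this illustration, not part of the whenever gem, and `/path/to/app` is a placeholder. The line whenever really writes typically carries more wrapping than this.

```ruby
# Illustrative only: a hypothetical helper, not whenever's implementation.
# Builds a crontab line for a job that runs once a day at a fixed time.
require "time"

def daily_cron_line(at, command)
  t = Time.parse(at) # parses a time of day such as "1:00 am"
  # crontab fields: minute hour day-of-month month day-of-week command
  "#{t.min} #{t.hour} * * * #{command}"
end

# /path/to/app is a placeholder, not a path from the original setup.
puts daily_cron_line("1:00 am", "cd /path/to/app && rake data:verify")
```

Because every server generates the same line from the same schedule.rb, all servers end up with identical crontabs, which is why a separate mechanism is needed for jobs that must run only once.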

The benefits of using Whenever include:

  1. Cron jobs are managed with the rest of an application using the common source code management tool and processes.
  2. Servers are automatically configured with the cron jobs upon launch.

Database Semaphore

The custom database semaphore class shown below provides a locking mechanism. (The code is also available at

class DatabaseSemaphore < ActiveRecord::Base
  validates_presence_of :name, :message => "can't be blank"

  # NOTE: the method name and two expressions were lost from the original
  # listing; open?, Time.now and semaphore.save are reconstructions.
  #
  # Only one requestor can get an open semaphore at a time. Once taken, the
  # semaphore stays locked in a closed position for lock_duration seconds.
  def self.open?(name, lock_duration = 600)
    semaphore_open = false
    now = Time.now
    # insert record if it does not exist yet
    DatabaseSemaphore.create(:name => name, :locked_at => now - lock_duration) if !DatabaseSemaphore.find_by_name(name)
    DatabaseSemaphore.transaction do
      semaphore = DatabaseSemaphore.find_by_name(name, :lock => "LOCK IN SHARE MODE")
      if semaphore and semaphore.locked_at <= now - lock_duration
        semaphore.locked_at = now
        semaphore_open = true if semaphore.save
      end
    end
    return semaphore_open
  rescue ActiveRecord::StatementInvalid => e
    # deadlock: another server holds the row, so treat the semaphore as closed
    return false
  end
end

class CreateDatabaseSemaphores < ActiveRecord::Migration
  def self.up
    create_table :database_semaphores do |t|
      t.string :name
      t.datetime :locked_at
    end

    # the unique index ensures there is a single row per semaphore name
    add_index :database_semaphores, :name, :unique => true
  end

  def self.down
    drop_table :database_semaphores
  end
end

Jobs referenced in schedule.rb are defined as rake tasks in files such as lib/tasks/data.rake, shown below:

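namespace :data do
  desc "Run data integrity check"
  task :verify => [:environment] do
    # NOTE: truncated in the original; the job object and open? are reconstructed.
    Delayed::Job.enqueue(CronJob::VerifyData.new) if DatabaseSemaphore.open?("VerifyData")
  end
end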

Every server with the cron job configured will attempt to run it. Only the first server to acquire the database semaphore succeeds; all the others promptly exit the job.
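
The "first caller wins" behaviour can be tried out without a database. The sketch below is an in-memory stand-in, not the class above: a `Mutex` plays the role of the database row lock, threads play the role of servers, and the names `InMemorySemaphore` and `open?` are assumptions made for this illustration.

```ruby
# In-memory stand-in for the database semaphore; illustrative only.
class InMemorySemaphore
  def initialize
    @locked_at = {}    # semaphore name -> Time it was last taken
    @mutex = Mutex.new # stands in for the database row lock
  end

  # Returns true if the semaphore was free (never taken, or taken at least
  # lock_duration seconds ago) and takes it; returns false otherwise.
  def open?(name, lock_duration = 600, now = Time.now)
    @mutex.synchronize do
      last = @locked_at[name]
      if last.nil? || last <= now - lock_duration
        @locked_at[name] = now
        true
      else
        false
      end
    end
  end
end

semaphore = InMemorySemaphore.new

# Five "servers" fire the same cron job at the same moment.
results = 5.times.map { Thread.new { semaphore.open?("VerifyData") } }.map(&:value)
puts "servers that enqueue the job: #{results.count(true)}"
```

Once `lock_duration` has elapsed, the next caller gets the semaphore again, so the duration should be shorter than the interval between scheduled runs.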


The delayed_job gem provides a database-backed priority queue. A job skeleton is shown below:

module CronJob
  class GenericJob
    def initialize; end
    def perform; end
  end

  class VerifyData < GenericJob
    def perform
      # job code goes here
    rescue Exception => e
      # error handling was elided in the original listing
    end
  end
end

We run delayed_job workers on all application servers. A worker picks up a job to execute as soon as one is available. If multiple jobs are scheduled to run at the same time, they will be picked up by different workers and thus distributed across multiple servers.
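
The way several workers share one queue can be sketched in plain Ruby. This is an in-memory illustration only: delayed_job keeps its jobs in a database table, while here an array guarded by a `Mutex` stands in for that table and threads stand in for workers on different servers. `InMemoryJobQueue`, `reserve`, and the job names other than VerifyData are hypothetical.

```ruby
# Illustrative in-memory sketch of workers sharing a priority queue.
Job = Struct.new(:name, :priority)

class InMemoryJobQueue
  def initialize
    @jobs = []
    @mutex = Mutex.new
  end

  def enqueue(job)
    @mutex.synchronize { @jobs << job }
  end

  # Atomically hands out the job with the lowest priority number, or nil
  # when the queue is empty, so no job is given to two workers.
  def reserve
    @mutex.synchronize do
      idx = @jobs.each_index.min_by { |i| @jobs[i].priority }
      idx && @jobs.delete_at(idx)
    end
  end
end

queue = InMemoryJobQueue.new
%w[VerifyData AggregateStats PruneRecords].each_with_index do |name, i|
  queue.enqueue(Job.new(name, i))
end

picked = Queue.new # thread-safe log of [worker, job name] pairs
workers = 3.times.map do |id|
  Thread.new do
    while (job = queue.reserve)
      picked << ["worker-#{id}", job.name]
      sleep 0.01 # simulate work so other workers get a turn
    end
  end
end
workers.each(&:join)

log = []
log << picked.pop until picked.empty?
log.each { |worker, name| puts "#{worker} ran #{name}" }
```

Which worker runs which job varies from run to run; the property that matters is that each job is handed out exactly once.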


With a small amount of effort we were able to set up a distributed, scalable, fault-tolerant cron system.