Rails 6 with Metaprogrammed Models

#ruby #programming #webdev #rails

The Task

Over the last few weeks I had to migrate several databases into a new one. Most of the work was copying data, with a bit of data enrichment on top. It was just a little too much for plain SQL, and since Rails 6 has multi-database support, I wanted to give it a try.

Here's an example of the task:

So we already have both schemas: the one in the old database and the one in the new database. The table names are the same in both databases; some tables just have additional columns in the new database, such as persons.country_of_phone_number. We fill this new column during the migration by extracting the country calling code from the phone number (e.g. "+49") and looking up which country it belongs to (e.g. via a REST service or another table).
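The lookup step described above could look roughly like the following sketch. The constant COUNTRY_BY_CALLING_CODE and the method country_for_phone_number are made-up names for illustration, and the tiny hash stands in for the REST service or lookup table a real migration would use.

```ruby
# Hypothetical lookup table; a real migration would ask a REST service
# or another database table instead.
COUNTRY_BY_CALLING_CODE = {
  '1'   => 'US', # simplified: +1 is shared by several countries
  '44'  => 'GB',
  '49'  => 'DE',
  '351' => 'PT'
}.freeze

# Returns the country for a phone number in international format
# (e.g. "+49 30 1234567"), or nil if no calling code can be matched.
def country_for_phone_number(phone_number)
  number = phone_number.to_s.strip
  return nil unless number.start_with?('+')

  digits = number.delete('^0-9') # keep digits only
  # Calling codes are one to three digits long; try the longest prefix first.
  3.downto(1) do |length|
    country = COUNTRY_BY_CALLING_CODE[digits[0, length]]
    return country if country
  end
  nil
end
```

Numbers without a leading "+" return nil here, because a national number carries no calling code to look up.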

Not The Plan

Since this migration is a one-time procedure that has nothing to do with the evolution of the Rails application, we won't "pollute" the application's code with our migration task. Instead we will create a new Rails app just for the migration.

We will also not copy/generate the model files from the existing rails application since we don't need any model functionality. We just want to copy data. But how does database access work without (ActiveRecord based) model files?

The Plan

It's possible to generate model classes on the fly, just as we need them. Here's the code:

def generate_class_if_not_exists(namespaced_model)
  check = namespaced_model.safe_constantize
  if check.nil?
    mod = namespaced_model.split(/::/)[0]
    klass = namespaced_model.split(/::/)[1]
    eval %{
module #{mod}
  class #{klass} < BaseRecord
  end
end
}
  end
end

The indentation looks a bit ugly, but I like it better than using lots of "\n" in a single line string.

So let's look at the code. The most important thing to notice is that we generate a class inside a module, and that this class inherits from the module's BaseRecord. That's the key to accessing different databases. So we need to make sure that we have one module for the old database and another one for the new database.
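For comparison, the same class can be built without eval by combining Class.new and const_set. This is only a sketch in plain Ruby: the two BaseRecord classes below are empty stand-ins for the ActiveRecord-based ones, and I have not verified how ActiveRecord behaves with classes created this way.

```ruby
# Stand-ins for the abstract base classes; in the Rails app these
# would inherit from ApplicationRecord.
module OldDb
  class BaseRecord; end
end

module NewDb
  class BaseRecord; end
end

# Defines e.g. OldDb::Person < OldDb::BaseRecord unless it already exists
# and returns the class in both cases.
def generate_class_if_not_exists(namespaced_model)
  mod_name, class_name = namespaced_model.split('::')
  mod = Object.const_get(mod_name)
  return mod.const_get(class_name, false) if mod.const_defined?(class_name, false)

  # Class.new(superclass) creates an anonymous subclass;
  # const_set gives it its name inside the module.
  mod.const_set(class_name, Class.new(mod.const_get(:BaseRecord)))
end

person = generate_class_if_not_exists('OldDb::Person')
puts person.name
puts person.superclass
```

One thing to keep in mind in the Rails app: ActiveRecord's default inflection maps a Person model to the table people, not persons, so a generated class may need an explicit self.table_name if the tables do not follow Rails naming.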

Database Configuration

Rails 6 has introduced one more level in database.yml in order to define multiple databases:

development:
  primary:
    <<: *default
    database: new_db
    username: 'newuser'
    password: 'newpassword'
  old_db:
    <<: *default
    database: old_db
    username: 'olduser'
    password: 'oldpassword'
  another_old_db:
    <<: *default
    database: another_old_db
    username: 'olduser'
    password: 'oldpassword'

As you can see, I recommend using the primary entry for the (single) new database, so that every old database gets its own specific name.

BaseRecords

Now in order to tell ActiveRecord which database to use we have to implement the BaseRecord for each database. Since the new database uses the primary db configuration we don't need to specify the database in app/models/new_db/base_record.rb:

module NewDb
  class BaseRecord < ApplicationRecord
    self.abstract_class = true
  end
end

But all old databases must be linked to their database.yml configuration, so let's take a look at app/models/old_db/base_record.rb:

module OldDb
  class BaseRecord < ApplicationRecord
    # declare the class abstract before calling connects_to
    self.abstract_class = true
    connects_to database: { writing: :old_db, reading: :old_db }
  end
end

That's it for the setup.

The Migration

Now the migration is really easy. We can write a rake task for that:

  desc 'migrate all tables'
  task :migrate => :environment do
    # %i[] already yields symbols, so the entries need no leading colon
    %i[persons foos bars].each do |table|
      migrate_table(table)
    end
  end

  def migrate_table(table)
    table = table.to_s

    # build (or look up) the model class for each side, e.g. OldDb::Person
    source_string = "OldDb::#{table.classify}"
    generate_class_if_not_exists(source_string)
    source = source_string.safe_constantize

    target_string = "NewDb::#{table.classify}"
    generate_class_if_not_exists(target_string)
    target = target_string.safe_constantize

    enrichment = "table_#{table}"
    # the second argument makes respond_to? include private methods
    if respond_to?(enrichment, true)
      # a specific method exists: call it for data enrichment
      send(enrichment, source, target)
    else
      # no specific method found means: no enrichment, just copy the data
      normal_table(source, target)
    end
  end

So we have two cases here:

tables that are 100% the same in old db and new db use normal_table to just copy the data
tables that need enrichment use a specific method, e.g. table_persons for adding the Country-of-Phone-Number property; we need to implement a specific method for each table that needs enrichment
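The two cases can be tried outside Rails with a few lines of plain Ruby. In this sketch, TableMigrator is a made-up stand-in class and the methods only return strings instead of touching a database. It checks for the table-specific method with respond_to? before calling send, so that a NoMethodError raised inside an enrichment method is not mistaken for "there is no enrichment method".

```ruby
# Plain-Ruby stand-in for the helper methods of the rake task.
class TableMigrator
  def migrate_table(table)
    handler = "table_#{table}"
    # Passing true makes respond_to? include private methods.
    if respond_to?(handler, true)
      send(handler, table)
    else
      normal_table(table)
    end
  end

  private

  # Default case: the table is identical in both databases.
  def normal_table(table)
    "copied #{table}"
  end

  # Specific case: persons needs enrichment.
  def table_persons(table)
    "copied and enriched #{table}"
  end
end

migrator = TableMigrator.new
puts migrator.migrate_table(:persons)
puts migrator.migrate_table(:foos)
```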

Here are both methods:

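  def normal_table(source, target)
    puts "migrating #{source} to #{target}..."
    begin
      # find_each loads the records in batches (it needs a primary key)
      source.find_each do |u|
        # attributes has string keys, so the key to drop is 'id', not :id;
        # keep the id instead if other tables reference it as a foreign key
        s = target.new(u.attributes.except('id'))
        s.save!
      end
    rescue => e
      # the first failing record aborts the copy of this table
      puts e
    end
    puts "#{target.count} of #{source.count} records migrated."
  end

  def table_persons(source, target)
    puts "migrating #{source} to #{target}..."
    begin
      source.find_each do |u|
        s = target.new(u.attributes.except('id'))
        # enrichment: fill the column that only exists in the new database
        s.country_of_phone_number = some_magic_function(u)
        s.save!
      end
    rescue => e
      puts e
    end
    puts "#{target.count} of #{source.count} records migrated."
  end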

And that's it. We can copy hundreds of tables from old_db to new_db with just two files under app/models.

Discussion: Metaprogramming

We should always ask ourselves: would another developer understand my codebase? There are several aspects we need to consider, e.g.:

How big is my codebase?
How readable is my code?
How complex is my code?
Do I make use of advanced techniques?

On the plus side, our migration project is a very minimalistic Rails project with no legacy code and the least amount of self-written code I can think of. But we have used advanced techniques like send(:method, :param1, :param2) in order to call methods by names we determine at runtime (dynamically). This might be called "metaprogramming". There's even a book for that:

You can buy it here

One of the claims in this book is that metaprogramming in Ruby is just programming. It's nothing special, because you don't need to access anything "forbidden". Other languages might have things like __method_name__ to access metaprogramming, where the name alone already tells you: "Caution! My double underscore prefix is reserved for language constructs. Here be dragons!"

Ruby is different. Take a look at the object model, for example: it's a real beauty, go check it out! There's no need for Ruby to draw a line between programming and metaprogramming, because Ruby's advanced concepts still rely on simple yet powerful foundations that are easy to grasp.

So I think my codebase is easy to understand mainly thanks to its brevity. You only have to know one or two advanced concepts. And IMHO that's much better than writing a lot of code explicitly, because the more code I write, the more bugs I write, and the more code another developer has to read.

What do you think?

(The cover image is "Final Train" by Jason Heeley.)