The Task
Over the last few weeks I had to migrate several databases into a new one. Most of the work was copying data, but some of it was enriching data. It was just a little too much for plain SQL, and since Rails 6 has multi-database support I wanted to give it a try.
Here's an example of the task:
So we already have both schemas, the one in the old database and the one in the new database. The table names are the same in both databases; some models just have additional fields in the new database, like persons.country_of_phone_number. We fill this new field during the migration by extracting the country code from the phone number (e.g. "+49") and looking up which country it belongs to (e.g. via a REST service or another table).
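The lookup itself can be quite small. Here is a minimal sketch, assuming a hard-coded table instead of a REST service; the constant CALLING_CODES, the method name country_of_phone_number and the few sample entries are made up for illustration:

```ruby
# Hypothetical lookup table: calling code => ISO country code.
# Only a few entries for illustration; "+1" is shared by several
# countries, so mapping it to "US" is a simplification.
CALLING_CODES = {
  '+49'  => 'DE',
  '+43'  => 'AT',
  '+44'  => 'GB',
  '+358' => 'FI',
  '+1'   => 'US'
}.freeze

# Returns the country for a phone number in international format,
# or nil if no known calling code matches.
def country_of_phone_number(phone_number)
  # Drop spaces, dashes and the like; keep digits and the plus sign.
  normalized = phone_number.to_s.gsub(/[^\d+]/, '')
  # Try longer codes first so a long code is never shadowed by a shorter one.
  CALLING_CODES.keys.sort_by { |code| -code.length }.each do |code|
    return CALLING_CODES[code] if normalized.start_with?(code)
  end
  nil
end

puts country_of_phone_number('+49 170 1234567')
```

Numbers without an international prefix return nil here, so the migration would have to decide what to store in that case.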
Not The Plan
Since this migration is a one-time procedure that has nothing to do with the evolution of the Rails application, we won't "pollute" the application's code with our migration task. Instead we will create a new Rails app just for the migration.
We will also not copy or generate the model files from the existing Rails application, since we don't need any model functionality. We just want to copy data. But how does database access work without (ActiveRecord-based) model files?
The Plan
It's possible to generate model classes on the fly, just as we need them. Here's the code:
def generate_class_if_not_exists(namespaced_model)
  check = namespaced_model.safe_constantize
  if check.nil?
    mod = namespaced_model.split(/::/)[0]
    klass = namespaced_model.split(/::/)[1]
    eval %{
      module #{mod}
        class #{klass} < BaseRecord
        end
      end
    }
  end
end
The indentation looks a bit ugly, but I like it better than using lots of "\n" in a single line string.
So let's look at the code. The most important thing to notice is that we generate the class inside a module, and that it inherits from that module's BaseRecord. That's the key to accessing different databases: we need one module for the old database and another one for the new database.
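If you would rather avoid eval, the same class can be built with Class.new and const_set. This is only a sketch of an alternative, not the code used in the migration: the method name define_model_class is made up, and the empty OldDb::BaseRecord is a stand-in so that the example runs without Rails.

```ruby
# Stand-in for the real OldDb::BaseRecord so the sketch runs without Rails.
module OldDb
  class BaseRecord; end
end

# Hypothetical eval-free variant of generate_class_if_not_exists.
def define_model_class(namespaced_model)
  mod_name, class_name = namespaced_model.split('::')
  mod = Object.const_get(mod_name)
  # Reuse the class if it was already defined directly in this module.
  return mod.const_get(class_name, false) if mod.const_defined?(class_name, false)

  # Class.new creates an anonymous subclass of the module's BaseRecord;
  # const_set assigns it to the constant and thereby gives it its name.
  mod.const_set(class_name, Class.new(mod.const_get(:BaseRecord)))
end

person = define_model_class('OldDb::Person')
puts person.name
puts person.superclass
```

Looking up BaseRecord through the module makes explicit what the eval version gets from lexical scope: each generated class inherits from the BaseRecord of its own module.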
Database Configuration
Rails 6 introduced one more level in database.yml in order to define multiple databases:
development:
  primary:
    <<: *default
    database: new_db
    username: 'newuser'
    password: 'newpassword'
  old_db:
    <<: *default
    database: old_db
    username: 'olduser'
    password: 'oldpassword'
  another_old_db:
    <<: *default
    database: another_old_db
    username: 'olduser'
    password: 'oldpassword'
As you can see, I recommend using the primary database for the (single) new database, so that all old databases have specific names.
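To see the extra level, you can load such a file with Ruby's YAML library: the environment no longer maps directly to connection settings but to named databases, each with its own settings. The sketch below uses a shortened, hypothetical configuration; the default block and the adapter value are assumptions, since the original file's defaults are not shown here.

```ruby
require 'yaml'

# Shortened, hypothetical database.yml; the adapter value is an assumption.
DATABASE_YML = <<~YAML
  default: &default
    adapter: postgresql
  development:
    primary:
      <<: *default
      database: new_db
    old_db:
      <<: *default
      database: old_db
YAML

# aliases: true is needed because the file uses &default / *default.
config = YAML.safe_load(DATABASE_YML, aliases: true)

# environment -> database name -> settings
config['development'].each do |name, settings|
  puts "#{name} -> #{settings['database']}"
end
```

The names on the second level (primary, old_db) are the ones that connects_to refers to.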
BaseRecords
Now, in order to tell ActiveRecord which database to use, we have to implement a BaseRecord for each database. Since the new database uses the primary db configuration, we don't need to specify the database in app/models/new_db/base_record.rb:
module NewDb
  class BaseRecord < ApplicationRecord
    self.abstract_class = true
  end
end
But all old databases must be linked to their database.yml configuration, so let's take a look at app/models/old_db/base_record.rb:
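module OldDb
  class BaseRecord < ApplicationRecord
    # mark the class as abstract first: newer Rails versions only accept
    # connects_to on ActiveRecord::Base or on abstract classes
    self.abstract_class = true
    connects_to database: { writing: :old_db, reading: :old_db }
  end
end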
That's it for the setup.
The Migration
Now the migration is really easy. We can write a rake task for that:
desc 'migrate all tables'
task :migrate => :environment do
  # %i[] already produces symbols, so the entries need no leading colon
  %i[persons foos bars].each do |table|
    migrate_table(table)
  end
end

def migrate_table(table)
  table = table.to_s
  source_string = 'OldDb::'
  source_string += table.classify
  generate_class_if_not_exists(source_string)
  source = source_string.safe_constantize
  target_string = 'NewDb::'
  target_string += table.classify
  generate_class_if_not_exists(target_string)
  target = target_string.safe_constantize
  handler = "table_#{table}"
  # methods defined at the top level of a rake file are private,
  # hence the second argument (include private methods)
  if respond_to?(handler, true)
    # a specific method exists: use it for data enrichment
    send(handler, source, target)
  else
    # no specific method found means: no enrichment, just copy the data
    normal_table(source, target)
  end
end
So we have two cases here:
- tables that are 100% the same in old db and new db use normal_table to just copy the data
- tables that need enrichment use a specific method, e.g. table_persons for adding the Country-of-Phone-Number property; we need to implement a specific method for each table that needs enrichment
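A note on choosing between the two cases: asking respond_to? up front can be safer than rescuing NoMethodError, because the rescue also catches a NoMethodError caused by a bug inside the specific method itself and then silently falls back to the plain copy. The following sketch shows the difference without any Rails dependencies; the class DispatchDemo and its method names are made up for illustration.

```ruby
# Hypothetical demo class; the table_/normal_table methods only record calls.
class DispatchDemo
  attr_reader :calls

  def initialize
    @calls = []
  end

  # A specific method that contains a bug: nil has no #upcase.
  def table_persons(_source, _target)
    @calls << :table_persons
    nil.upcase
  end

  def normal_table(_source, _target)
    @calls << :normal_table
  end

  # Variant 1: rescue NoMethodError. The bug above is swallowed and the
  # table is copied a second time by normal_table.
  def migrate_with_rescue(table)
    send("table_#{table}", nil, nil)
  rescue NoMethodError
    normal_table(nil, nil)
  end

  # Variant 2: check first. A missing method leads to normal_table,
  # while a bug inside an existing method still raises.
  def migrate_with_check(table)
    handler = "table_#{table}"
    if respond_to?(handler, true)
      send(handler, nil, nil)
    else
      normal_table(nil, nil)
    end
  end
end

demo = DispatchDemo.new
demo.migrate_with_rescue('persons')
p demo.calls
```

With the check, migrate_with_check('foos') still falls back to normal_table, but migrate_with_check('persons') raises, so the bug becomes visible.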
Here are both methods:
def normal_table(source, target)
  puts "migrating #{source} to #{target}..."
  begin
    source.all.each do |u|
      # attributes has string keys, so the key to exclude is 'id', not :id;
      # the target database then assigns its own ids
      s = target.new(u.attributes.except('id'))
      s.save!
    end
  rescue => e
    puts e
  end
  puts "#{target.count} of #{source.count} records migrated."
end

def table_persons(source, target)
  puts "migrating #{source} to #{target}..."
  begin
    source.all.each do |u|
      s = target.new(u.attributes.except('id'))
      # enrichment: fill the field that only exists in the new database
      s.country_of_phone_number = some_magic_function(u)
      s.save!
    end
  rescue => e
    puts e
  end
  puts "#{target.count} of #{source.count} records migrated."
end
And that's it. We can copy hundreds of tables from old_db to new_db with just two files under app/models.
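One detail is worth checking when excluding the id: the hash returned by attributes has string keys, so a symbol key does not match. The sketch below uses a plain hash as a stand-in for a record's attributes (the sample values are made up) and relies on Hash#except, which is built into Ruby 3.0 and later and provided by ActiveSupport on older versions.

```ruby
# Plain hash standing in for what a record's `attributes` returns:
# the keys are strings, not symbols.
attributes = { 'id' => 42, 'name' => 'Ada', 'phone_number' => '+49 170 1234567' }

with_symbol = attributes.except(:id)   # no key matches, nothing is removed
with_string = attributes.except('id')  # removes the id

puts with_symbol.key?('id')
puts with_string.key?('id')
```

Whether you want to drop the id at all depends on your data: if other tables reference these records by id, keeping the old ids may be the safer choice.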
Discussion: Metaprogramming
We should always ask ourselves: would another developer understand my codebase? There are several aspects we need to consider, e.g.:
- How big is my codebase?
- How readable is my code?
- How complex is my code?
- Do I make use of advanced techniques?
On the plus side, our migration project is a very minimalistic Rails project with no legacy code and the least amount of self-written code I can think of. But we have used advanced techniques like send(:method, :param1, :param2) in order to call methods by names we determine at runtime (dynamically). This might be called "metaprogramming". There's even a book about that.
One of the claims in this book is that metaprogramming in Ruby is just programming; it's nothing special, because you don't need to access anything "forbidden". Other languages might have things like __method_name__ to access metaprogramming, where the name alone already tells you: "Caution! My double underscore prefix is reserved for language constructs. Here be dragons!"
Ruby is different. Take a look at the object model, for example. It's a real beauty, go check it out! There's no need for Ruby to draw a line between programming and metaprogramming, because Ruby's advanced concepts still rely on simple yet powerful foundations that are easy to grasp.
So I think my codebase is easy to understand mainly thanks to its brevity. You only have to know one or two advanced concepts. And IMHO that's much better than writing a lot of code explicitly, because the more code I write, the more bugs I write, and the more code another developer has to read.
What do you think?