Tom de Bruijn

Posted on Feb 6, 2023 • Originally published at tomdebruijn.com

Write your own Domain Specific Language in Ruby

#ruby #dsl #metaprogramming #tutorial

Let's do some metaprogramming in Ruby! This article assumes you have some familiarity with writing Ruby and are interested in learning how to use metaprogramming.

This article is also available as a video recording of a talk.

What's a DSL?

Let's start with some definitions on what we're going to be looking at. The abbreviation DSL stands for Domain Specific Language.

A DSL is a sub-language within a Ruby app (or any other programming language) that is specific to a certain domain: like a Ruby gem or part of an app. Without that gem or app, the DSL won't work. It's specific to that domain.

In Ruby there's a gem named Rake. This gem provides a DSL to create tasks to be run from the command line. A small example looks like this:

# Rakefile
task :hello do
  puts "Hello there"
end

You can call this task from the terminal with the following command. When called, it will perform the block we've given to the task method.

$ rake hello
Hello there

This is the power of a DSL. We don't need to know how to listen to arguments given to Ruby from the command line. Rake does this for us. We only need to define the commands.
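To make the idea concrete, here is a minimal sketch of how a task method could store blocks and run them by name. MiniTasks, TASKS and run are hypothetical names made up for this illustration; this is not how Rake itself is implemented.

```ruby
# Hypothetical miniature of a task DSL, for illustration only.
module MiniTasks
  TASKS = {}

  # Store the block under the task name so it can be run later.
  def self.task(name, &block)
    TASKS[name.to_s] = block
  end

  # Look up a task by name and call its block.
  def self.run(name)
    block = TASKS.fetch(name.to_s) { raise ArgumentError, "Unknown task: #{name}" }
    block.call
  end
end

MiniTasks.task :hello do
  puts "Hello there"
end

# A command line wrapper could pass ARGV.first here instead.
MiniTasks.run("hello")
# Prints: Hello there
```

The end-user only writes the task block. Looking up and running the block is hidden behind the DSL.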

Why use a DSL?

Configuration

One of the most common reasons to create your own DSL is for configuration, for an app or part of a gem. In Rails, the Application class contains configuration for the different gems that make up Rails. It's even possible to define your own custom config options specific to your app.

# config/application.rb
module MyApp
  class Application < Rails::Application
    config.load_defaults 7.0

    config.time_zone = "UTC"

    config.x.custom_option = "some value"
  end
end

Within the Application class, the config object can be used to set config options or load defaults, like shown above.

Automate writing code

A DSL can be used to abstract away a bunch of repetitive code. A Rails app can listen to many different HTTP endpoints, and writing a Ruby app that handles all of them by hand can require a lot of code. In Rails, we can declare several different routes with one method: resources.

Rails.application.routes.draw do
  resources :posts do
    resources :comments
  end
  root "homepage#show"
end

The resources method will do several things:

create several endpoints like GET /posts, GET /posts/new, POST /posts, DELETE /posts/:id, etc.;
nest endpoints, if they are nested within another resources method call's block, like in the example above for comments;
route the request data on the endpoint to the relevant controller.

This is all defined with one method call. We don't need to go through several files and methods to make sure these steps are all done for every route. This makes what would otherwise be high-churn code more maintainable*.

*: At times, we may be swapping out a lot of code with less code, but that code is more complex because of the metaprogramming involved.
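As a rough sketch of what "one method call, several endpoints" can look like, the hypothetical MiniRoutes class below expands a single resources call into a list of routes. The class name and the string format of the routes are assumptions for this example. The Rails router does much more, such as nesting and dispatching to controllers.

```ruby
# Hypothetical route table builder, only to illustrate the idea.
class MiniRoutes
  attr_reader :routes

  def initialize
    @routes = []
  end

  # One call expands into the conventional set of endpoints.
  def resources(name)
    controller = name.to_s
    [
      ["GET",    "/#{name}",          "index"],
      ["GET",    "/#{name}/new",      "new"],
      ["POST",   "/#{name}",          "create"],
      ["GET",    "/#{name}/:id",      "show"],
      ["GET",    "/#{name}/:id/edit", "edit"],
      ["PATCH",  "/#{name}/:id",      "update"],
      ["DELETE", "/#{name}/:id",      "destroy"]
    ].each do |verb, path, action|
      @routes << "#{verb} #{path} => #{controller}##{action}"
    end
  end
end

router = MiniRoutes.new
router.resources :posts
puts router.routes
```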

Writing your own Ruby DSL

The start of a DSL

Configuration is usually the first part of a gem or app to get a DSL. It starts with a new Config class that holds some config options. (In this case, for a CLI tool that has options to print more information with verbose and to use a particular output format called documentation.)

class Config
  def initialize(options = {})
    @options = options
  end
end

config = Config.new(
  :verbose => true,
  :format => :documentation
)
# => #<Config:...
#        @options={:verbose=>true, :format=>:documentation}>

The above example is usually enough for a small component.

Let's look at how we can make this DSL feel more like some of the DSLs we've seen. In the example below we don't configure the Config class with a Hash of options, but use method calls like the verbose= attribute writer.

Ruby has helpers for defining reader and writer methods on a class for instance variables. Calling attr_accessor :verbose in a class definition will define the verbose and verbose= methods.

class Config
  attr_accessor :verbose, :format
end

config = Config.new
config.verbose = true
config.verbose # => true
config.format = :documentation

pp config
# => #<Config:...
#        @format=:documentation,
#        @verbose=true>

While this is quick to set up, it has some downsides. Like before, we want to store all config options on a @options Hash instead of on their own instance variables. To export the config options as a Hash, we would need to export them all manually. Every new config option now needs to be added in more than one place. Easily forgotten, easily broken.

class Config
  # ...

  def options
    {
      :verbose => @verbose,
      :format => @format
    }
  end
end

pp config.options
# => {:verbose=>true, :format=>:documentation}

To make the Config class work with a Hash of options, we can define our own methods for each config option. In these methods we set the values on the @options Hash, using the key of the config option we want to set. Great! We now can export our configuration with ease.

class Config
  attr_reader :options

  def initialize
    @options = {}
  end

  def verbose=(value)
    @options[:verbose] = value
  end

  def format=(value)
    @options[:format] = value
  end
end

config = Config.new
config.verbose = true
config.format = :documentation

pp config.options
# => {:verbose=>true, :format=>:documentation}

With this setup, we run into a familiar problem. Every time a new config option is added we need to add another method for that config option. As more config options are added, the Config class grows and grows, becoming more difficult to maintain.

Dynamically define methods in Ruby

Instead of defining a method per config option, we can dynamically define these methods. Dynamically defining methods allows us to quickly define several methods that share almost the same implementation.

In the example below I've created a new class method, as indicated with the def self.def_options syntax (1). This method, "define options", receives an array of option names to define methods for (2). In the def_options method it will loop through these option names (3). In each iteration of the loop it will call the define_method method (4). With this method we can define a new method, like with the def <method name> syntax, but with a dynamic method name.

class Config
  def self.def_options(*option_names) # 1
    option_names.each do |option_name| # 3
      define_method "#{option_name}=" do |value| # 4
        @options[option_name] = value # 5
      end
    end
  end

  def_options :verbose, :format # 2

  def initialize
    @options = {}
  end

  def options
    @options
  end
end

config = Config.new
config.verbose = true
config.format = :documentation

pp config.options
# => {:verbose=>true, :format=>:documentation}

The define_method method receives the name of the method, and a block (4). This block will be the implementation of the method we are defining. In this case, it will read the @options instance variable to get the Hash of configured options, and set a new value for the option name we've declared (5). The DSL syntax for the end-user remains the same.
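The same loop can define more than one method per option. The sketch below adds a reader next to each writer. The reader methods are an addition for this example and not part of the Config class shown above.

```ruby
class Config
  def self.def_options(*option_names)
    option_names.each do |option_name|
      # Writer, as before: config.verbose = true
      define_method "#{option_name}=" do |value|
        @options[option_name] = value
      end

      # Reader (an addition for this sketch): config.verbose
      define_method option_name do
        @options[option_name]
      end
    end
  end

  def_options :verbose, :format

  attr_reader :options

  def initialize
    @options = {}
  end
end

config = Config.new
config.verbose = true
config.verbose # => true
config.format # => nil, not set yet

Config.public_instance_methods(false).sort
# => [:format, :format=, :options, :verbose, :verbose=]
```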

Using Ruby blocks

In my opinion, no Ruby DSL is complete without blocks*. They look good for one, but they also provide a new scope of context to users of our DSL, in which our new DSL rules apply.

*: It's perfectly fine to have a DSL without blocks.

Config.configure do |config|
  config.verbose = true
  config.format = :documentation
end

In the next example, I've created a new class method called configure. To interact with a block given to a method, we can use the yield keyword. If a method has been given a block, yield will call that block. Any value passed to yield is received by the block as a block argument, as is shown with |config| in the example above.

class Config
  def self.configure
    instance = new # Create a new instance of the Config class
    yield instance # Call the block and give the Config instance as an argument
    instance
  end

  # ...
end
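Put together with a writer method, a complete version could look like the sketch below. The block_given? guard is an assumption added here so that configure can also be called without a block. A bare yield without a block raises a LocalJumpError.

```ruby
class Config
  attr_reader :options

  def self.configure
    instance = new
    # Only call the block if one was given.
    yield instance if block_given?
    instance
  end

  def initialize
    @options = {}
  end

  def verbose=(value)
    @options[:verbose] = value
  end
end

config = Config.configure do |c|
  c.verbose = true
end
config.options[:verbose] # => true

# Without a block we get an unconfigured instance back.
Config.configure.options.empty? # => true
```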

Using instance_eval

We can make this shorter, and in my opinion, feel more like a DSL with its own block context. We can drop the |config| block argument and avoid having to call config. in front of every config option.

Config.configure do
  verbose true
  format :documentation
end

I've omitted the equals sign (=) from the example here. More on that later.

To make this work we go back to our configure class method. Instead of yield, we'll use instance_eval. (Eval may sound scary, and it has some things to look out for, but in this small example it's quite safe.)

To call instance_eval we need to pass in the block given to the method. We can't use yield for this. We need to explicitly specify the block argument of the configure method with &block. The key part being the ampersand (&) of the argument name &block, telling Ruby it's the block argument.

class Config
  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  # ...
end

(In Ruby 3.1 and newer the &block argument can also be written as only the ampersand &, the name is optional.)

When called on another object, like our Config class instance, instance_eval will execute that block within the context of the Config instance. The context of that block is changed from where it was defined to the Config instance. When a method is called within that block, it will call it directly on the Config instance.
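The change of context can be made visible by capturing self inside the block. This is a minimal sketch that assumes def_options defines the methods without the equals sign, so that the verbose true syntax works. The seen_self variable only exists for this demonstration.

```ruby
class Config
  attr_reader :options

  def self.def_options(*option_names)
    option_names.each do |option_name|
      # No equals sign in the method name: `verbose true`
      define_method option_name do |value|
        @options[option_name] = value
      end
    end
  end

  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  def_options :verbose, :format

  def initialize
    @options = {}
  end
end

seen_self = nil
config = Config.configure do
  seen_self = self # `self` is the Config instance inside the block
  verbose true
  format :documentation
end

seen_self.equal?(config) # => true
config.options[:format] # => :documentation
```

Local variables from the place where the block was defined, like seen_self, are still available. Only self changes.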

Local variable assignment gotcha

Remember how in the example above I omitted the equals sign from the method names? When a method name ending with an equals sign is called without an explicit receiver, as happens in an instance_eval block, Ruby will interpret it as a local variable assignment instead. It will not call the method on the Config instance. That's how Ruby works, and we can't change that. I omitted the equals sign to make the DSL still work.

Config.configure do
  # This won't work
  verbose = true
  format = :documentation
end

It's possible to work around this by explicitly calling the methods on self., but this syntax becomes basically the same as using yield. Instead, I removed the equals sign from the method name.

Config.configure do
  # This works, but requires `self.` in front of every method call
  self.verbose = true
  self.format = :documentation
end
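The difference is easy to verify. In this small sketch, a Config class with only a verbose= writer is configured both ways, and the options Hash is checked afterwards.

```ruby
class Config
  attr_reader :options

  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  def initialize
    @options = {}
  end

  def verbose=(value)
    @options[:verbose] = value
  end
end

broken = Config.configure do
  verbose = true # Assigns a local variable named `verbose`
end
broken.options.key?(:verbose) # => false

working = Config.configure do
  self.verbose = true # Explicit receiver, so the writer method is called
end
working.options[:verbose] # => true
```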

When to use yield or instance_eval?

I've shown two ways of calling blocks in Ruby. Which one should be used is up to the authors of the DSL. I have my personal preference for using instance_eval, but there are definitely scenarios where using yield works better.

When something needs to be configured, using a lot of methods with an equals sign (writer methods), using yield with a block argument is quite common.

Config.configure do |config|
  config.verbose = true # Set a value
end

When defining something or calling actions, I see the instance_eval approach used a lot.

Config.configure do
  load_defaults! # Perform an action
end

That doesn't mean one excludes the other. Gems like Puma use the instance_eval method for their configuration file.

# config/puma.rb
workers 3
preload_app!

Use the approach you feel works best for your DSL.
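If you can't decide, one possible approach is to support both styles by checking the arity of the block. This technique is not covered in this article, so treat it as a sketch under that assumption. The alias from verbose= to verbose is also only there for this example.

```ruby
class Config
  attr_reader :options

  def self.configure(&block)
    instance = new
    return instance unless block

    if block.arity == 1
      # Block declares an argument: Config.configure { |config| ... }
      block.call(instance)
    else
      # No block argument: Config.configure { verbose true }
      instance.instance_eval(&block)
    end
    instance
  end

  def initialize
    @options = {}
  end

  def verbose(value)
    @options[:verbose] = value
  end
  # Support both `verbose true` and `config.verbose = true`
  alias_method :verbose=, :verbose
end

with_argument = Config.configure { |config| config.verbose = true }
without_argument = Config.configure { verbose true }
with_argument.options == without_argument.options # => true
```

The downside is that the DSL now has two syntaxes to document and maintain.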

Share DSL behavior with modules

With the above methods you can start building your DSL. This gives us a good toolbox. As the DSL grows, or more DSLs get added to the domain, it's good to share behavior between the DSL classes.

For example, this domain has two separate configuration classes: a Config and an Output class. One configures how the gem works, the other how it outputs results to the terminal.

# Multiple configuration classes
Config.configure do
  enabled true
end

Output.configure do
  verbose true
  format :documentation
end

We don't want to repeat ourselves, so we move the DSL definition behavior to a module called Configurable. When that module is included, the class method def_options is made available within the class definition.

class Config
  include Configurable

  def_options :enabled
end

class Output
  include Configurable

  def_options :verbose, :format
end

When the Configurable module is included, it adds the options method to the class. The module also extends the class it's included in, using the included callback in Ruby modules. In that callback, the class is extended with the ClassMethods module, defined inside the Configurable module. This module will add the class methods (def_options and configure) to the class. This is also how things like ActiveSupport::Concern work.

module Configurable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def def_options(*option_names)
      # ...
    end

    def configure(&block)
      # ...
    end
  end

  def options
    @options ||= {}
  end
end

With all the logic moved to the Configurable module, we now have a reusable module to create as many config classes as the domain needs.
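With the elided method bodies filled in, the module could look like the sketch below. The bodies are based on the earlier def_options and configure examples, using the syntax without the equals sign. The if block guard is an addition for this example.

```ruby
module Configurable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def def_options(*option_names)
      option_names.each do |option_name|
        define_method option_name do |value|
          # Calls the `options` instance method defined below
          options[option_name] = value
        end
      end
    end

    def configure(&block)
      instance = new
      instance.instance_eval(&block) if block
      instance
    end
  end

  def options
    @options ||= {}
  end
end

class Config
  include Configurable

  def_options :enabled
end

class Output
  include Configurable

  def_options :verbose, :format
end

config = Config.configure { enabled true }
output = Output.configure do
  verbose true
  format :documentation
end

config.options[:enabled] # => true
output.options[:format] # => :documentation
Config.method_defined?(:verbose) # => false, options are defined per class
```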

Nesting DSLs

The Output class is really a subdomain of the Config class though. It's part of the gem config: the configurations should be nested within one another.

Config.configure do
  enabled true

  output do
    verbose true
    format :documentation
  end
end

Rather than Output being a top-level config class, we store it on the Config class. In the example below, the output method creates a new Output instance the first time it's called and stores it in @output. When a block is given, it will instance_eval the block on the @output object. The method will always return the @output object so the sub-config can be accessed.

class Config
  include Configurable

  def_options :enabled

  class Output
    include Configurable

    def_options :verbose, :format
  end

  def output(&block)
    @output ||= Output.new
    @output.instance_eval(&block) if block_given?
    @output
  end
end
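To check that the nested block ends up on the Output instance, here is a runnable sketch of the whole setup. The Configurable module is repeated in a compact form, with method bodies assumed from the earlier examples, so the snippet can run on its own.

```ruby
module Configurable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def def_options(*option_names)
      option_names.each do |name|
        define_method(name) { |value| options[name] = value }
      end
    end

    def configure(&block)
      instance = new
      instance.instance_eval(&block)
      instance
    end
  end

  def options
    @options ||= {}
  end
end

class Config
  include Configurable

  def_options :enabled

  class Output
    include Configurable

    def_options :verbose, :format
  end

  def output(&block)
    @output ||= Output.new
    @output.instance_eval(&block) if block_given?
    @output
  end
end

config = Config.configure do
  enabled true

  output do
    verbose true
    format :documentation
  end
end

config.options[:enabled] # => true
config.output.options[:verbose] # => true
config.output.equal?(config.output) # => true, the same Output instance is reused
```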

Module builder pattern

The Module builder pattern is a really neat design pattern in Ruby that allows us to do away with the two step process of including a module and then calling methods included by it. This pattern is described in more detail in The Ruby Module Builder Pattern by Chris Salzberg.

In the example below, a new ConfigOption module is used to include a dynamically defined module. For the end-user, the resulting DSL remains the same.

class Config
  include ConfigOption.new(:verbose, :format)
end

When ConfigOption.new is called, the desired config option names are given to the initialize method. Like before, we iterate over this list. Using define_method the necessary methods are defined on the module. It's possible to do so in this initialize method, because every call to ConfigOption.new creates a new module, and inside initialize self is that new module.

class ConfigOption < Module
  def initialize(*option_names)
    define_method :options do
      @options ||= {}
    end

    option_names.each do |option_name|
      define_method option_name do |value|
        options[option_name] = value
      end
    end
  end
end

For a much more in-depth look, I can highly recommend reading The Ruby Module Builder Pattern by Chris Salzberg.
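As a small usage sketch, the example below includes two differently configured ConfigOption modules in two classes. The Output class, the color option and the explicit super() call are additions for this illustration.

```ruby
class ConfigOption < Module
  def initialize(*option_names)
    super() # Run Module's own initialize first, without arguments
    define_method :options do
      @options ||= {}
    end

    option_names.each do |option_name|
      define_method option_name do |value|
        options[option_name] = value
      end
    end
  end
end

class Config
  include ConfigOption.new(:verbose, :format)
end

class Output
  include ConfigOption.new(:color)
end

config = Config.new
config.verbose true
config.options[:verbose] # => true

# Every ConfigOption.new call builds its own module with its own methods
Config.method_defined?(:color) # => false
Output.method_defined?(:color) # => true
```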

Create DSL objects

Another step I recommend is to separate the behavior of the DSL itself and the actual app code that uses it, by creating DSL classes. These classes are only used when the end-user interacts with the DSL, like configuring a gem. When the configuration is done, the options are read from the DSL class and set on whatever config object the gem uses.

Like before, we have a Config class. When configured using Config.configure, it creates a ConfigDSL instance (1). This ConfigDSL instance will become the context of the block given to the configure class method.

class Config
  def self.configure(&block)
    dsl = ConfigDSL.dsl(&block) # 1
    new(dsl.options) # 2
  end

  attr_reader :options

  def initialize(options)
    @options = options
  end
end

class ConfigDSL
  include ConfigOption.new(:verbose, :format)

  def self.dsl(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end
end

After the configure block has been called, the Config class reads the options from the ConfigDSL and initializes a new instance of itself with these options (2).

With this approach the DSL that configures the gem is only used when the app starts. The app doesn't carry around all that extra weight of how the DSL works all the time. The separation of this behavior makes it easier to not accidentally call the configure DSL when it should not be available.

This approach also solves a downside of using instance_eval. When calling blocks this way, private methods are accessible inside the block. That's something we usually don't want to allow.

class Config
  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  private

  def secret_method
    "super secret method"
  end
end

Config.configure do
  # This should raise an error about calling
  # a private method, but it doesn't
  secret_method
  # => "super secret method"
end

Private methods can be accessed in blocks called using instance_eval, because the block is evaluated in the context of the instance. It's as if it's being run from a method within that object. With separate DSL classes you have to worry less about private methods that you don't want anyone to call.
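The sketch below shows that separation in action. ConfigDSL is written out by hand here with a single verbose method, instead of using the ConfigOption module, to keep the example self-contained.

```ruby
class ConfigDSL
  attr_reader :options

  def self.dsl(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  def initialize
    @options = {}
  end

  def verbose(value)
    @options[:verbose] = value
  end
end

class Config
  attr_reader :options

  def self.configure(&block)
    new(ConfigDSL.dsl(&block).options)
  end

  def initialize(options)
    @options = options
  end

  private

  def secret_method
    "super secret method"
  end
end

config = Config.configure { verbose true }
config.options[:verbose] # => true
config.respond_to?(:verbose) # => false, DSL methods live on ConfigDSL only

begin
  Config.configure { secret_method }
rescue NameError => error
  # The block runs against a ConfigDSL instance, which has no secret_method
  error.name # => :secret_method
end
```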

Conclusion

Now that our toolbox is complete we can create our own Domain Specific Language using Ruby. We have the following tools at our disposal:

Use define_method to dynamically define new methods on classes.
Use Ruby blocks to give your DSL that real DSL feeling and create a context wherein your DSL shines.
- Use yield to return a DSL object as a block argument, or;
- Use instance_eval to change how the block works, and allow end-users to directly call methods on the new context of the block.
Use Modules to:
- share behavior between many classes, and;
- use the Module builder pattern to remove any logic of how the DSL is constructed from the class that uses the DSL.
DSL objects separate the logic of how the DSL works from how the rest of the gem works, creating a better separation of responsibilities in your code.

Now go forth, and build your own DSL!
