Ayush Newatia

How to generate YAML from Ruby objects without type annotations

Serialising a Ruby object to YAML is dead simple. Ruby's standard library includes a yaml module, which uses Psych under the hood. All we need to do is require 'yaml' and we're good to go! But there's a catch.

Let's say we have Photo and Album classes, where an Album contains an array of Photos, and we want to serialise an Album to YAML. This is what such a script would look like:

require 'yaml'

class Photo
  attr_reader :file

  def initialize(file)
    @file = file
  end
end

class Album
  attr_accessor :name, :photos

  def initialize(name, photos)
    @name = name
    @photos = photos
  end
end

photos = [Photo.new("DSC_0001.jpg"), Photo.new("DSC_0002.jpg"), Photo.new("DSC_0003.jpg")]
album = Album.new("Outdoors", photos)

puts album.to_yaml

The above script will print out:

--- !ruby/object:Album
name: Outdoors
photos:
- !ruby/object:Photo
  file: DSC_0001.jpg
- !ruby/object:Photo
  file: DSC_0002.jpg
- !ruby/object:Photo
  file: DSC_0003.jpg

What are all those class annotation type thingies?! They are YAML tags that record the Ruby class each item was serialised from. When this YAML is deserialised, Ruby will try to deserialise each item into the object defined by its tag. Now if we only ever want to use this YAML in the context of our script or app, that's fine. However, if it needs to be portable, 😬: parsers in other languages have no Photo or Album class to map these Ruby-specific tags to.
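To make the effect of those tags on deserialisation a bit more concrete, here is a minimal sketch using Psych's `safe_load`. It assumes a reasonably recent Ruby (the `permitted_classes:` keyword needs Psych 3.1 or later), and it repeats only the Photo class from the script above so that it runs on its own.

```ruby
require 'yaml'

class Photo
  attr_reader :file

  def initialize(file)
    @file = file
  end
end

# Produces YAML that starts with a "!ruby/object:Photo" tag
tagged = Photo.new("DSC_0001.jpg").to_yaml

# safe_load refuses Ruby-specific tags unless the class is explicitly permitted
begin
  YAML.safe_load(tagged)
  refused = false
rescue Psych::DisallowedClass
  refused = true
end

# With the class permitted, the tag tells Psych which class to rebuild
photo = YAML.safe_load(tagged, permitted_classes: [Photo])

puts refused
puts photo.file
```

In other words, the tagged document is only useful to a consumer that both knows the class and agrees to instantiate it, which is exactly what makes it awkward to share.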

To serialise an object to YAML without class annotations, we first need to convert it to a Hash and then to YAML.

Since this example is very simple, we could just hand-write methods that convert these specific objects into Hashes. But as our app grows, this becomes unsustainable, so let's take a look at how we could write a generic module to convert any object to a Hash.
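For comparison, the hand-written approach might look like the following minimal sketch. The `to_hash` method bodies here are illustrative assumptions, written out per class; the generic module that follows removes this per-class boilerplate.

```ruby
require 'yaml'

class Photo
  attr_reader :file

  def initialize(file)
    @file = file
  end

  # Hand-written conversion; string keys avoid Ruby symbol syntax in the YAML
  def to_hash
    { "file" => @file }
  end
end

class Album
  attr_accessor :name, :photos

  def initialize(name, photos)
    @name = name
    @photos = photos
  end

  # Each photo is converted explicitly, so no class tags are emitted
  def to_hash
    { "name" => @name, "photos" => @photos.map(&:to_hash) }
  end
end

album = Album.new("Outdoors", [Photo.new("DSC_0001.jpg"), Photo.new("DSC_0002.jpg")])
yaml = album.to_hash.to_yaml
puts yaml
```

Every new attribute or class means another method to keep in sync by hand, which is the maintenance cost the generic approach tries to avoid.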

module Hashify
  # Classes that include this module can exclude certain
  # instance variables from their hash representation by overriding
  # this method to return their names, including the leading "@"
  # (e.g. ["@secret"])
  def ivars_excluded_from_hash
    []
  end

  def to_hash
    hash = {}
    excluded_ivars = ivars_excluded_from_hash

    # Iterate over all the instance variables and store their
    # names and values in a hash
    instance_variables.each do |var|
      next if excluded_ivars.include? var.to_s

      value = instance_variable_get(var)
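      # Convert array elements that can hashify themselves; leave plain
      # values such as strings and numbers untouched
      if value.is_a? Array
        value = value.map { |item| item.respond_to?(:to_hash) ? item.to_hash : item }
      end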

      hash[var.to_s.delete("@")] = value
    end

    hash
  end
end
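As a sketch of how the exclusion hook might be used: the `cache_path` attribute below is hypothetical and not part of the original example, and the module is a trimmed copy (without the array handling) included only so the snippet runs on its own.

```ruby
require 'yaml'

# Trimmed copy of the Hashify module so this snippet is self-contained
module Hashify
  def ivars_excluded_from_hash
    []
  end

  def to_hash
    instance_variables.each_with_object({}) do |var, hash|
      next if ivars_excluded_from_hash.include?(var.to_s)

      hash[var.to_s.delete("@")] = instance_variable_get(var)
    end
  end
end

# Hypothetical variant of Photo with an attribute we don't want serialised
class Photo
  include Hashify

  def initialize(file, cache_path)
    @file = file
    @cache_path = cache_path
  end

  # Names keep the leading "@" because they are compared against var.to_s
  def ivars_excluded_from_hash
    ["@cache_path"]
  end
end

photo = Photo.new("DSC_0001.jpg", "/tmp/thumbs/0001")
yaml = photo.to_hash.to_yaml
puts yaml
```

Because the override lives in the class itself, each class decides what it exposes while the conversion logic stays in one place.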

We can now include the above module in our Photo and Album classes and serialise them to YAML without the class annotations!

class Photo
  include Hashify

  ...
end

class Album
  include Hashify

  ...
end

photos = [Photo.new("DSC_0001.jpg"), Photo.new("DSC_0002.jpg"), Photo.new("DSC_0003.jpg")]
album = Album.new("Outdoors", photos)

puts album.to_hash.to_yaml

The script will now output:

---
name: Outdoors
photos:
- file: DSC_0001.jpg
- file: DSC_0002.jpg
- file: DSC_0003.jpg

This is just plain YAML and can be used in any app, written in any language that has a YAML parser!

This article was originally published on my blog.

Top comments (2)

Manuel Tancoigne

Hi! Thanks for the idea, however I wonder if your method has any benefits over a quick and dirty JSON.parse(my_obj.to_json).to_yaml?

I'm searching for something to avoid it, but including a bunch of modules in my classes seems like a lot of work, even if it seems much cleaner (as long as we don't forget to include the module everywhere).

Ayush Newatia

I think the main advantage would be that you'd have fine-grained control over which attributes are written to the YAML representation. I believe .to_json would write out all the instance variables, right? That might not always be desirable.