loading...
Amplifr.com

Postgres enums with Rails

dsalahutdinov profile image Salahutdinov Dmitry Updated on ・4 min read

Introduction

Enum - is a user-defined data type representing a set of named values. In general, the idea of enums aims to provide 2 main purposes:

  • to make codebase more readable and human-friendly
  • to add extra type safety by limiting possible assignable value to the variable of enum type

Enum is based on integer type data type, what is definitely good for performance. If we are going to deal with a lot of data.

The "Old school" way

The most common way of using "enum" is the definite state of some object. In spite of having no enums in pure Ruby, we are able to emulate them with constants or symbols. Here is the sample of old-styled Rails 3 (legacy) data-model source code:

    class Account < ActiveRecord::Base
      PAID_STATUS = 'paid'
      OVERDUE_STATUS = 'ovedue'
      BLOCKED_STATUS = 'blocked'

      STATUSES = [PAID_STATUS, OVERDUE_STATUS, BLOCKED_STATUS].freeze
      validates :status, presence: true, inclusion: {in: STATUSES}

      def paid?
        self.status == PAID_STATUS
      end
    end

    # everywhere we work with Account instance - it is much easier
    # to deal with helper predicate instead of comparing values "in place"
    if account.paid?
      # do something special for the user who pays
    end

It gives more readability to the codebase and makes it looks friendly for a new developer! But it is overloaded with boilerplates - constants definition and values checking. This checks spread all over the codebase and make maintenance more difficult.

The "Rails way"

Staring from Rails 4.1 Active Record provides easier way to use enums:

    class Conversation < ActiveRecord::Base
      enum status: [ :active, :archived ]
    end

and we've got some useful mutation methods, check value methods and scopes out of the box, g.e.:

    conversation.active! # sets the active status for object
    conversation.active? # checks if the object have active status
    Conversation.action.find_each { |c| puts c.status } # provides scope for searching

We've got source code to be more readable and maintainable.
Validation comes out of the box, and it raises the ArgumentError if a value is not defined.

    conversation.status = :unknown
    => ArgumentError: 'unknown' is not a valid channel

Default behavior supposes to use numeric as the underlying type of the column:

    create_table :conversations do |t|
      t.integer :status
    end

And Active Record will assign values for your named constants:

    [1] pry(main)> Conversation.statuses
    => {"active"=>0, "archived"=>1}

Everything is great until we will need to add value to your enumeration (and make it first):

    class Conversation < ActiveRecord::Base
      enum status: [ :initial, :active, :archived ]
    end

This change will force Active Record to re-enumerate values starting from zero. And values will mix up and crash your application logic:

    > Conversation.statuses
    => {"initial"=>0, "active"=>1, "archived"=>2}

The "Rails way" with Postgres Enums

We have to points to do it:

  • define enum values explicitly with model definition
  • introduce Postgres enum type to declare underlying database column data type

Here is the model with channel attribute having enumerated values:

    class UserNotification < ActiveRecord::Base
      # define rails enum and the underlying values to use for every enum-value
      enum channel: {
        email: 'email',
        webpush: 'webpush',
        telegram: 'telegram'
      }
    end

It means that we will have email, webpush and telegram - as values for column channel.

Write migration to create Postgres enum-type user_notification_channel:

    class CreateUserNotifications < ActiveRecord::Migration
      def up
        execute <<-DDL
          CREATE TYPE user_notification_channel AS ENUM (
            'email', 'webpush', 'telegram'
          );
        DDL

        create_table :user_notifications do |t|
          ...
          t.column :channel, :user_notification_channel
        end
      end

      def down
        drop_column :channel, :user_notification_channel
        execute "DROP type user_notification_channel;"
      end
    end

This trick lets us have 'string' value in Ruby:

    notification.channel
    => "email"

but occupies only 4 bytes in a database row.

Main benefits of this complex "Rails&Postgres" way:

  • readable and user-friendly source code
  • string representation of enum in Ruby with optimal data storage in the database
  • more type safety on database level
  • meaningful and clear database structure out of Rails-application context

How to examples

Migrating existing text column to Postgres Enum

The reasons for doing this are:

  • reduce table size (fixed 4 bytes enum vs string length)
  • increase type safety and database data consistency
  • make database values more meaningful and friendly (out of models context)
    class CreateUserNotifications < ActiveRecord::Migration
      def up
        execute <<-DDL
          ALTER TABLE user_notifications
          ALTER COLUMN channel TYPE user_notification_channel
          USING channel::user_notification_channel;
        DDL
      end

      def down
        # moves column back to have text data-type
        execute <<-DDL
          ALTER TABLE user_notifications
          ALTER COLUMN channel TYPE text
          USING channel::text;
        DDL
      end
    end

This works well without breaking any changes!

Add value to existing enum migration:

    class ExpandValueOfUserNotificationChannels < ActiveRecord::Migration
      def up
        execute <<-DDL
          ALTER TYPE user_notification_channel ADD VALUE 'post_pigeon';
        DDL
      end
    end

Drop value from existing enum

Postgres does not allow to drop value from existing enum, but here is some trick to drop value:

  • migrate the column to text
  • re-create the new enum type
  • migrate back to the new enum
    class RemoveTelegramFromUserNotificationChannel < ActiveRecord::Migration
      def up
        execute <<-DDL
          ALTER TABLE user_notifications 
          ALTER COLUMN channel TYPE text
          USING channel::text;

          DROP TYPE user_notification_channel;

          CREATE TYPE user_notification_channel AS ENUM (
            'email', 'webpush'
          );

          ALTER TABLE user_notifications
          ALTER COLUMN channel TYPE user_notification_channel
          USING channel::user_notification_channel;
        end
    end

Migrate from numeric Rails enum to Postgres Enum

Imagine we have Rails enums and the integer channel column in database:

    class UserNotification < ActiveRecord::Base
      enum channel: [:email, :webpush, :telegram]
    end

Numeric values of each Rails enum value could be gotten by the following code:

    [1] pry(main)> UserNotification.channels
    => {"email"=>0, "webpush"=>1, "telegram"=>3}

Just alter column by changing type and transform the value from numeric to the enum:

    class MigrateNotificationsChannelToPostgresEnum < ActiveRecord::Migration
      def up
        execute <<-DDL
          ALTER TABLE user_notifications
          ALTER COLUMN channel TYPE user_notification_channel
          USING CASE channel
                  WHEN 0 THEN 'email'::user_notification_channel
                  WHEN 1 THEN 'webpush'::user_notification_channel
                  WHEN 2 THEN 'telegram'::user_notification_channel
                  ELSE RAISE 'UNKNOWN VALUE'
          END
        DDL
      end
    end

This migration will transform numeric values to enums. Profit! 🙂

Conclusion

Postgres enums are great and helpful!
They are very easy to integrate with Rails!
They help to ensure data consistency for the definite set of values. The increase human-readability of the data out of the Rails application context.
They also have better performance (comparing to text values).

Thanks for reading.

Posted on Jul 11 '18 by:

dsalahutdinov profile

Salahutdinov Dmitry

@dsalahutdinov

Lead backend developer (Ruby) at Amplifr, testing fan, simplicity propagandist Family man, nature-lover, mathematician, red car driver

Amplifr.com

The easiest way to publish and analyze content on social media for media companies, e-commerce and small businesses.

Discussion

markdown guide
 

Everything is great until we will need to add value to your enumeration

You can define your enums as a hash to make the mapping explicit, then AR will not remap your values when you add something.

 

Also if storage size is an issue then you can add the limit: option to the migration, which let's you select integer sizes:

gist.github.com/stream7/1069589/e0...

So in your example something like:

t.integer :status, null: false, limit: 1

On MySQL this will become a tinyint (1 byte) and on Postgres a smallint (2 bytes). Of course this doesn't solve the problem of enum values not being meaningful outside the app context.

 

Michael, thanks for your comment!
This is great idea, which comes after the fail without having explicit values first :)

 

Thanks for sharing this knowledge!

I noticed a mistake on the migration:

              WHEN 0 THEN 'email'::user_notification_channel
              WHEN 0 THEN 'webpush'::user_notification_channel
              WHEN 0 THEN 'facebook'::user_notification_channel

Every value is 0, I think it should be 0, 1 and 2.

 

Nice post @dsalahutdinov , apart from what @antico5 says I see another detail here, is that the last one should be

WHEN 2 THEN 'telegram'::user_notification_channel

instead of

WHEN 2 THEN 'facebook'::user_notification_channel
 
 

Thanks for the article. I was already looking to Postgres Enums for a solution to my data validation and now you've given me the path to my proposed solution. Happy days.

 
 

thanx for sharing! should be remove_column instead of drop_column in migration.