Franck Pachot for YugabyteDB

Posted on Nov 4, 2022 • Updated on Nov 16, 2022

Mastodon on YugabyteDB

#mastodon #yugabytedb #postgres #distributed

Twitter is changing and lot of users are moving to Mastodon. With this increase of load, had to scale up their database:

<br>
@Gargron<br>
From 8 cores to 36 cores processing power, let's see how the new database server holds up. Hopefully everything will be much faster from now on, but it should also give me a bit more room to scale horizontally.<br> https://mastodon.social/@Gargron/109276158158046472

Wouldn't it be easier to scale out? Mastodon uses PostgreSQL, so let's try it on YugabyteDB, which is the Open Source Postgres-compatible Distributed SQL database.

My test is available here, ready to run on GitPod: https://github.com/FranckPachot/gitpod-mastodon-yb

Changes for YugabyteDB

I forked this and changed the base image from gitpod/workspace-postgresql to gitpod/workspace-yugabytedb.

As the default port for YSQL, the YugabyteDB PostgreSQL-compatible endpoint is 5433 I changed the DB parameters in .env.production.example to:

DB_HOST=127.0.0.1
DB_PORT=5433
DB_NAME=mastodon
DB_USER=yugabyte
DB_PASS=

The YSQL endpoint takes a few seconds to be available, then I wait that it is available and create a mastodon database:

      # wait until YSQL is available, create the database and utility function
      until ysqlsh <<<'
       create database mastodon;\c mastodon
       ' ; do sleep 1 ; done 2>/dev/null
      # create the database
      RAILS_ENV=production rails --trace db:setup

The creation of the database failed on two index creation:

CREATE UNIQUE INDEX index_ip_blocks_on_ip ON public.ip_blocks USING btree (ip);

NOTICE:  index method "btree" was replaced with "lsm" in YugabyteDB
ERROR:  INDEX on column of type 'INET' not yet supported

I'll not create this index. If it is required for performance reasons, there are some alternatives (use text instead of inet datatype), and a git issue can be opened to support it in Yugabyte.

CREATE UNIQUE INDEX index_unique_conversations ON public.account_conversations USING btree (account_id, conversation_id, participant_account_ids);

NOTICE:  index method "btree" was replaced with "lsm" in YugabyteDB
ERROR:  INDEX on column of type 'user_defined_type' not yet supported

The participant_account_ids is bigint[]. I'm not sure how this makes sense in a unique index because arrays with the same items in different order will be considered different.
Then, I'll create a non-volatile function that sorts the items and returns them as text:

       create or replace function array_signature(a bigint[])
       returns text as $$
       select array_agg(unnest order by unnest)::text
       from unnest(a);
       $$ immutable language sql;

and create the index as:

 CREATE UNIQUE INDEX index_unique_conversations ON public.account_conversations 
       (account_id, conversation_id, (array_signature(participant_account_ids)));

Those are quick workaround. If you want to run Mastodon with YugabyteDB in production, please contact the Yugabyte community: https://www.yugabyte.com/community/

To be able to create the database, I've removed the indexes from the schema definition:

      # remove indexes that are not supported
      sed -e '/"index_unique_conversations"/d' -i db/schema.rb
      sed -e '/"index_ip_blocks_on_ip"/d' -i db/schema.rb

then run the creation:

      # create the database
      RAILS_ENV=production rails --trace db:setup

This works:

If you are concerned by the performance of DDL, there are ways to improve it (like CREATE INDEX NONCONCURRENTLY).

Then, once created, I create the unique index with the function on the array:

      # add indexes with DDL (this may be done from schema.rb)
      ysqlsh -e -c "
       create or replace function array_signature(a bigint[])
       returns text as '
       select array_agg(unnest order by unnest)::text
       from unnest(a);
       ' immutable language sql;
      " -c '
       CREATE UNIQUE INDEX index_unique_conversations ON public.account_conversations 
       (account_id, conversation_id, (array_signature(participant_account_ids)));
       ' mastodon
      # end modifications for YugabyteDB

Testing

The logs shows the users and passwords created:

I create some conversations:

And look at the account_conversations; table to check that all is ok:

In conclusion

I love: Gitpod, Mastodon, PostgreSQL and YugabyteDB. If you have an application on PostgreSQL, you can get High Availability (resilience to failures and rolling upgrades) and Elasticity (scale out and down without application downtime) with YugabyteDB. If something is not supported, please open a git issue or ask for a workaround, there are always many possibilities. When moving from monolithic to distributed database, it is also important to check the indexes (hash and range sharding) and test performance.

DEV Community

Mastodon on YugabyteDB

Changes for YugabyteDB

Testing

In conclusion

Top comments (0)

Read next

Azure Data Lake: Essential Insights for Optimizing Your Data Strategy

Learning: Performing Backup and Recovery

Implementing CSV Data Export in React Without External Libraries

Storage server