DEV Community

dcopensource

Compatibility of GitLab on CockroachDB and YugabyteDB (I) - System Initialization

Testing Background

GitLab is a globally popular source code management tool. In earlier versions, users could choose either MySQL or PostgreSQL, but as of version 12.1.0 official support for MySQL has been dropped completely.

Many features in newer GitLab versions are built on PostgreSQL, which makes GitLab a representative example of the many products that use PostgreSQL as their underlying data store.

Imagine a large group split into divisions, where each division or even a small team maintains its own GitLab instance. Managing these repositories at the group level becomes tricky, for example:

  • Versioning issues (open source and commercial versions, high and low versions)
  • Fine-grained permission control
  • Data backups
  • Infrastructure utilization

A unified GitLab environment with good scalability and high availability would certainly be the best solution. A traditional standalone PostgreSQL database cannot meet these needs, however, so could GitLab run on a distributed database instead?

CockroachDB and YugabyteDB are relatively well-known, newer open source distributed databases that implement the PostgreSQL protocol. Their official websites describe them as follows:

CockroachDB supports the PostgreSQL wire protocol and the majority of PostgreSQL syntax. This means that existing applications built on PostgreSQL can often be migrated to CockroachDB without changing application code. (reference)

YugabyteDB is a high-performance, cloud-native distributed SQL database that aims to support all PostgreSQL features. (reference)

CockroachDB says it supports most PostgreSQL syntax, and YugabyteDB says it aims to support all PostgreSQL features. This series of articles compares how well the two databases support GitLab, which to some extent also reflects their compatibility with standard PostgreSQL.

Test Environment

  • CockroachDB
  defaultdb=# select version();
                                           version
  -----------------------------------------------------------------------------------------
   CockroachDB CCL v21.2.2 (x86_64-unknown-linux-gnu, built 2021/12/01 14:35:45, go1.16.6)
  (1 row)
  • YugabyteDB
  postgres=# select version();
                                                    version
  ------------------------------------------------------------------------------------------------------------
   PostgreSQL 11.2-YB-2.9.1.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit
  (1 row)
  • GitLab
  GitLab information
  Version:        12.1.0-ee
  Revision:       1f2e6f3f6d8
  Directory:      /home/git/gitlab
  DB Adapter:     PostgreSQL

A GitLab instance deployed on standard PostgreSQL contains the following objects in the public schema:

  gitlab_production=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
   relkind | count
  ---------+-------
   r       |   249
   i       |   903
   S       |   231
  (3 rows)
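The relkind codes in this output come from PostgreSQL's pg_class catalog: r is an ordinary table, i an index and S a sequence. As a minimal sketch, assuming the psql output is captured as text, the counts can be turned into a labeled dictionary for later comparison. The function and variable names are hypothetical and not part of GitLab or either database.

```python
# relkind codes as documented for PostgreSQL's pg_class catalog.
RELKIND_LABELS = {"r": "table", "i": "index", "S": "sequence"}


def parse_relkind_counts(psql_output):
    """Parse the 'relkind | count' rows of psql output into {label: count}."""
    counts = {}
    for line in psql_output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip the header, the separator and the "(3 rows)" footer
        counts[RELKIND_LABELS.get(parts[0], parts[0])] = int(parts[1])
    return counts


# Output of the baseline query against standard PostgreSQL, copied from above.
BASELINE_OUTPUT = """
 relkind | count
---------+-------
 r       |   249
 i       |   903
 S       |   231
(3 rows)
"""

baseline = parse_relkind_counts(BASELINE_OUTPUT)
print(baseline)  # {'table': 249, 'index': 903, 'sequence': 231}
```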

CockroachDB Startup Process

1. Database initialization

Execute the GitLab setup program to generate the required database schema.

dc@dc-virtual-machine:/home/git/gitlab$ sudo -u git -H bundle exec rake gitlab:setup RAILS_ENV=production
This will create the necessary database tables and seed the database.
You will lose any previous data stored in the database.
Do you want to continue (yes/no)? yes

Dropped database 'gitlab'
Created database 'gitlab'
-- enable_extension("pg_trgm")
rake aborted!
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR:  unimplemented: extension "pg_trgm" is not yet supported
HINT:  You have attempted to use a feature that is not yet implemented.
See: https://go.crdb.dev/issue-v/51137/v21.2
: CREATE EXTENSION IF NOT EXISTS "pg_trgm"
/home/git/gitlab/config/initializers/peek.rb:18:in `async_exec_params'
/home/git/gitlab/config/initializers/peek.rb:18:in `exec_params'
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `block (2 levels) in exec_no_cache'
....

As the output shows, GitLab initialization relies on PostgreSQL extensions (here pg_trgm), which CockroachDB does not currently support. Setup therefore fails at the very first step, before any objects have been created in the database.

gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
Empty set
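When a setup run aborts like this, the useful parts of the rake output are the PostgreSQL error class, the error message and the statement that triggered it. The following is a rough sketch of how those three pieces could be pulled out of a captured log. The regular expressions are assumptions based only on the log format shown in this article, and `summarize_rake_failure` is a hypothetical helper.

```python
import re

# Both patterns are modeled on the rake output quoted in this article.
PG_ERROR_RE = re.compile(r"(?P<error_class>PG::\w+): ERROR:\s+(?P<message>.+)")
STATEMENT_RE = re.compile(r"^:[ \t]+(?P<sql>\S.*)$", re.MULTILINE)


def summarize_rake_failure(log_text):
    """Return the error class, message and failing SQL, or None if no error."""
    error = PG_ERROR_RE.search(log_text)
    if error is None:
        return None
    # The failing statement is printed on a later line that starts with ": ".
    statement = STATEMENT_RE.search(log_text, error.end())
    return {
        "error_class": error.group("error_class"),
        "message": error.group("message").strip(),
        "statement": statement.group("sql").strip() if statement else None,
    }


SAMPLE_LOG = '''rake aborted!
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR:  unimplemented: extension "pg_trgm" is not yet supported
HINT:  You have attempted to use a feature that is not yet implemented.
See: https://go.crdb.dev/issue-v/51137/v21.2
: CREATE EXTENSION IF NOT EXISTS "pg_trgm"
'''

summary = summarize_rake_failure(SAMPLE_LOG)
print(summary["error_class"], "|", summary["statement"])
```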

2. Visit GitLab

Visiting the main GitLab page now returns a 502 error.

According to the logs, the error occurs because a SQL statement cannot find its target table.

ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR:  relation "geo_nodes" does not exist
:               SELECT a.attname, format_type(a.atttypid, a.atttypmod),
                     pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
                     c.collname, col_description(a.attrelid, a.attnum) AS comment
                FROM pg_attribute a
                LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                LEFT JOIN pg_type t ON a.atttypid = t.oid
                LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
               WHERE a.attrelid = '"geo_nodes"'::regclass
                 AND a.attnum > 0 AND NOT a.attisdropped
               ORDER BY a.attnum

3. Update database version

The CockroachDB version under test is not the latest, so a newer release might already support extensions. To check, we upgrade to the latest v22.1 release:

defaultdb=# select version();
                                      version
------------------------------------------------------------------------------------
 CockroachDB CCL v22.1.0 (x86_64-pc-linux-gnu, built 2022/05/23 16:27:47, go1.17.6)
(1 row)

Running setup again to create the database produces the same error, "ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR: unimplemented: extension "pg_trgm" is not yet supported", which shows that the new version does not support the extension either.

YugabyteDB Startup Process

1. Database initialization

Modify the GitLab configuration file to point the database connection at YugabyteDB, then initialize a new database in the same way.
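The article does not show the configuration change itself. As a hedged sketch, the production section of GitLab's config/database.yml could be rendered as follows. The ports are the default SQL ports of the two databases (5433 for YugabyteDB's YSQL API, 26257 for CockroachDB), while the host, user name and password are hypothetical placeholders, and `render_database_yml` is an illustrative helper rather than anything shipped with GitLab.

```python
import json

# Default SQL ports: YugabyteDB's YSQL API uses 5433, CockroachDB uses 26257.
DEFAULT_PORTS = {"yugabytedb": 5433, "cockroachdb": 26257}


def render_database_yml(target, host, username, password, database="gitlab"):
    """Render a production section for GitLab's config/database.yml."""
    settings = {
        "adapter": "postgresql",  # both databases speak the PostgreSQL protocol
        "encoding": "unicode",
        "database": database,
        "host": host,
        "port": DEFAULT_PORTS[target],
        "username": username,
        "password": password,
    }
    lines = ["production:"]
    for key, value in settings.items():
        # JSON string literals are valid YAML, so json.dumps handles quoting.
        rendered = json.dumps(value) if isinstance(value, str) else value
        lines.append(f"  {key}: {rendered}")
    return "\n".join(lines) + "\n"


# Hypothetical host and credentials, for illustration only.
print(render_database_yml("yugabytedb", "127.0.0.1", "yugabyte", "change-me"))
```

Once the file is in place, the setup task below runs against YugabyteDB instead of CockroachDB.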

dc@dc-virtual-machine:/home/git/gitlab$ sudo -u git -H bundle exec rake gitlab:setup RAILS_ENV=production
This will create the necessary database tables and seed the database.
You will lose any previous data stored in the database.
Do you want to continue (yes/no)? yes

Dropped database 'gitlab'
Created database 'gitlab'
-- enable_extension("pg_trgm")
   -> 2.5496s
-- enable_extension("plpgsql")
   -> 0.1143s
-- create_table("abuse_reports", {:id=>:serial, :force=>:cascade})
   -> 0.3709s
-- create_table("appearances", {:id=>:serial, :force=>:cascade})
   -> 0.3022s
...
...
-- create_table("issue_tracker_data", {:force=>:cascade})
   -> 3.7627s
-- create_table("issues", {:id=>:serial, :force=>:cascade})
rake aborted!
ActiveRecord::StatementInvalid: PG::InternalError: ERROR:  index method "ybgin" not supported yet
HINT:  See https://github.com/YugaByte/yugabyte-db/issues/1337. Click '+' on the description to raise its priority
: CREATE  INDEX  "index_issues_on_description_trigram" ON "issues" USING gin ("description" gin_trgm_ops)
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'

The output shows that setup initially runs normally and creates the extensions and tables, but after about 20 minutes it fails while creating an index. The statement requests a "gin" index, which YugabyteDB maps to its own "ybgin" index method, and this version does not support that method yet.
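Both failures seen so far come from a single kind of DDL statement: CREATE EXTENSION on CockroachDB and CREATE INDEX ... USING gin on YugabyteDB. A possible preflight step, sketched here under the assumption that the schema is available as a list of SQL statements, is to scan for those two patterns before starting a setup run that may take 20 minutes to fail. The pattern names and the helper are hypothetical, and the CREATE TABLE statement is only an illustrative filler.

```python
import re

# Statement patterns that caused the two failures described in this article.
RISKY_PATTERNS = {
    "extension": re.compile(r"CREATE\s+EXTENSION\b", re.IGNORECASE),
    "gin_index": re.compile(
        r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+.*?USING\s+gin\b", re.IGNORECASE
    ),
}


def find_risky_statements(statements):
    """Return (feature, statement) pairs for DDL that may be unsupported."""
    findings = []
    for sql in statements:
        for feature, pattern in RISKY_PATTERNS.items():
            if pattern.search(sql):
                findings.append((feature, sql.strip()))
    return findings


# The first two statements are copied from the error output in this article.
DDL = [
    'CREATE EXTENSION IF NOT EXISTS "pg_trgm"',
    'CREATE  INDEX  "index_issues_on_description_trigram" ON "issues" '
    'USING gin ("description" gin_trgm_ops)',
    'CREATE TABLE "abuse_reports" (id serial PRIMARY KEY)',  # illustrative only
]

findings = find_risky_statements(DDL)
for feature, sql in findings:
    print(feature, "->", sql)
```

A scan like this only flags candidates. Whether a flagged statement actually fails still depends on the database version being tested.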

Here are the objects created in the database up to this point:

gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 S       |   113
 i       |   391
 r       |   117
(3 rows)

This is somewhat better than CockroachDB, but still far short of the full database schema.
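To put a number on the gap, the counts reported in this article can be compared directly. The snippet below is a small illustrative calculation using only the figures shown above; the helper names are hypothetical.

```python
# Object counts per relkind, copied from the query results in this article.
BASELINE = {"r": 249, "i": 903, "S": 231}      # standard PostgreSQL
YUGABYTE_2_9 = {"S": 113, "i": 391, "r": 117}  # after the failed setup run
COCKROACH_21_2 = {}                            # query returned an empty set


def missing_objects(baseline, candidate):
    """Number of objects per relkind that the candidate still lacks."""
    return {kind: total - candidate.get(kind, 0) for kind, total in baseline.items()}


def completion_ratio(baseline, candidate):
    """Share of baseline objects that exist in the candidate database."""
    created = sum(candidate.get(kind, 0) for kind in baseline)
    return created / sum(baseline.values())


gaps = missing_objects(BASELINE, YUGABYTE_2_9)
print(gaps)  # {'r': 132, 'i': 512, 'S': 118}
print(f"{completion_ratio(BASELINE, YUGABYTE_2_9):.1%} of objects created")
```

By this measure, YugabyteDB 2.9 got a little under half of the schema in place before the index error, while CockroachDB 21.2 created nothing.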

2. Visit GitLab

At this point, the main GitLab page is still inaccessible. The logs show that the error is again caused by a missing target table.

source=rack-timeout id=7gatOugcqB8 timeout=60000ms state=ready
Started GET "/" for 10.3.74.126 at 2022-05-27 16:05:31 +0800
Processing by RootController#index as HTML
Completed 500 Internal Server Error in 78ms (ActiveRecord: 58.8ms | Elasticsearch: 0.0ms)

ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR:  relation "projects" does not exist
LINE 8:                WHERE a.attrelid = '"projects"'::regclass
                                          ^
:               SELECT a.attname, format_type(a.atttypid, a.atttypmod),
                     pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
                     c.collname, col_description(a.attrelid, a.attnum) AS comment
                FROM pg_attribute a
                LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                LEFT JOIN pg_type t ON a.atttypid = t.oid
                LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
               WHERE a.attrelid = '"projects"'::regclass
                 AND a.attnum > 0 AND NOT a.attisdropped
               ORDER BY a.attnum
):

3. Update database version

Similarly, we upgrade YugabyteDB to the latest version to see whether GIN index support has been completed:

postgres=# select version();
                                                                                         version
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.13.2.0-b0 on x86_64-pc-linux-gnu, compiled by clang version 12.0.1 (https://github.com/yugabyte/llvm-project.git bdb147e675d8c87cee72cc1f87c4b82855977d94), 64-bit
(1 row)

Running the setup program again goes smoothly: about 30 minutes later it exits normally without errors. The objects in the database now look like this:

gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 S       |   231
 i       |   903
 r       |   249
(3 rows)

The object counts match the standard PostgreSQL deployment exactly. Opening the GitLab homepage in a browser redirects to the login page, and the logs show no errors.

After filling out and submitting the user registration form, the new user is registered successfully and redirected to the GitLab main page.

At first glance, GitLab functionality is not affected by switching databases. More detailed tests will follow in the next article.

Test Conclusion

1. CockroachDB v21.2 does not support PostgreSQL extensions (pg_trgm in this test), so GitLab cannot initialize the database and fails to start. The problem persists after upgrading to the latest version, v22.1.

2. YugabyteDB v2.9 does not support GIN (Generalized Inverted Index) indexes, so setup fails after creating only part of the schema and GitLab cannot start. After upgrading to the latest version, v2.13, the problem is resolved: the GitLab page is accessible and users can register normally.

3. YugabyteDB supports PostgreSQL extensions; CockroachDB does not.
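These findings can be condensed into a simple version gate that reads the string returned by `select version()`. This is only a sketch: the minimum versions are assumptions taken from the reader comments on this post (GIN indexes in YugabyteDB as of 2.11, pg_trgm announced for CockroachDB 22.2), and the CockroachDB threshold in particular was not tested in this article. The function names are hypothetical.

```python
import re

# Assumed minimum versions, taken from the reader comments on this post.
MINIMUM_VERSIONS = {"yugabytedb": (2, 11), "cockroachdb": (22, 2)}


def parse_server_version(version_string):
    """Return (product, version tuple) from a `select version()` string."""
    yb = re.search(r"-YB-(\d+(?:\.\d+)*)", version_string)
    if yb:
        return "yugabytedb", tuple(int(part) for part in yb.group(1).split("."))
    crdb = re.search(r"CockroachDB\s+\S+\s+v(\d+(?:\.\d+)*)", version_string)
    if crdb:
        return "cockroachdb", tuple(int(part) for part in crdb.group(1).split("."))
    return "unknown", ()


def meets_minimum(version_string):
    """True if the server is at or above the assumed minimum version."""
    product, version = parse_server_version(version_string)
    minimum = MINIMUM_VERSIONS.get(product)
    return minimum is not None and version[: len(minimum)] >= minimum


# Version strings copied from the psql output in this article.
TESTED = [
    "CockroachDB CCL v21.2.2 (x86_64-unknown-linux-gnu, built 2021/12/01 14:35:45, go1.16.6)",
    "CockroachDB CCL v22.1.0 (x86_64-pc-linux-gnu, built 2022/05/23 16:27:47, go1.17.6)",
    "PostgreSQL 11.2-YB-2.9.1.0-b0 on x86_64-pc-linux-gnu",
    "PostgreSQL 11.2-YB-2.13.2.0-b0 on x86_64-pc-linux-gnu",
]

results = [meets_minimum(v) for v in TESTED]
print(results)  # [False, False, False, True]
```

The gate agrees with the test results above: only YugabyteDB 2.13 passes.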

The Next Step

Next, we will bypass GitLab's database generation step and import a standard GitLab database with data into CockroachDB and YugabyteDB. We will then select some frequently used read and write scenarios and compare how compatible the two databases are.

Top comments (2)

Andrew Werner

CockroachDB will have pg_trgm support in 22.2, due out in another month or so. There are betas up now and an RC should be out in a couple of weeks!

github.com/cockroachdb/cockroach/p...

Franck Pachot

Yes YugabyteDB has GIN indexes as of 2.11 and pg_trgm (supported from the get-go because YugabyteDB query layer is PostgreSQL compatible) can use it with gin_trgm_ops.
