
Updating data in a ClickHouse cluster by replacing partitions

Javier Vidal
Developer.
・5 min read

Using the ALTER TABLE ... UPDATE statement in ClickHouse is a heavy operation that is not designed for frequent use. But if we design our schema so that we insert or update a whole partition at a time, we can update large amounts of data easily. Doing this on a single MergeTree table is straightforward; doing it in a cluster with replicated tables is trickier. Let's see how it can be done.
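For contrast, this is the kind of statement we are trying to avoid (a hypothetical mutation against the table we will create below):

```sql
-- Hypothetical example of a per-row mutation. ClickHouse rewrites every
-- data part containing matching rows, asynchronously, which makes this
-- unsuitable for frequent updates.
ALTER TABLE test.hits_local
UPDATE pageviews = pageviews + 1
WHERE user_id = 'user 1';
```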

In this example we use a cluster named test_cluster that consists of 2 nodes. We will create a replicated table on each node and a distributed table that we can use to parallelize reads.

On each node, the test_cluster must be configured in the /etc/clickhouse-server/config.xml configuration file:

<remote_servers>
  <test_cluster>
    <shard>
      <internal_replication>true</internal_replication>
      <replica>
        <host>192.168.33.201</host>
        <port>9000</port>
      </replica>
      <replica>
        <host>192.168.33.202</host>
        <port>9000</port>
      </replica>
    </shard>
  </test_cluster>
</remote_servers>

In this example we use IP addresses, but we could use hostnames if they were correctly configured in the /etc/hosts file.
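As a quick sanity check (not part of the original setup steps), the cluster definition can be inspected from any node through the system.clusters system table; both replicas of the single shard should show up:

```sql
-- One row per replica; for this setup: one shard, two replicas.
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'test_cluster'
```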

We will also use a couple of macros in /etc/clickhouse-server/config.d/macros.xml. They will be used as parameters in ON CLUSTER DDL statements. On the first ClickHouse server (ch1) we have:

<yandex>
  <macros>
    <shard>shard_01</shard>
    <replica>replica_01</replica>
  </macros>
</yandex>

And in the second one (ch2):

<yandex>
  <macros>
    <shard>shard_01</shard>
    <replica>replica_02</replica>
  </macros>
</yandex>
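To verify that each server picked up its macros, we can query the system.macros table; on ch1 it should list shard_01 and replica_01, and on ch2 shard_01 and replica_02:

```sql
-- Shows the substitutions available for {shard} and {replica}
SELECT macro, substitution FROM system.macros
```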

Let's connect to one of the nodes and create the database test. As we use ON CLUSTER, this statement will be executed on all the nodes of the cluster:

CREATE DATABASE IF NOT EXISTS test ON CLUSTER test_cluster

Let's create the replicated table. This is the table that will store all the data. We will partition by hour and insert all the data for a given hour in a single INSERT.

CREATE TABLE IF NOT EXISTS test.hits_local ON CLUSTER test_cluster
(
  `date` Datetime,
  `user_id` String,
  `pageviews` Int32
)
ENGINE = ReplicatedMergeTree('/clickhouse/test_cluster/tables/{shard}/hits_local', '{replica}')
PARTITION BY toStartOfHour(date)
ORDER BY (date)

Then we create a table that will be used only to perform the inserts. It must be created on all the nodes of the cluster and its engine must be MergeTree, otherwise we could end up experiencing a lot of ZooKeeper errors. We will call this table the temporary table (note that it's not a real TEMPORARY table).

CREATE TABLE IF NOT EXISTS test.hits_local_tmp ON CLUSTER test_cluster
AS test.hits_local
ENGINE = MergeTree()
PARTITION BY toStartOfHour(date)
ORDER BY (date)

Finally, we create the distributed table:

CREATE TABLE IF NOT EXISTS test.hits ON CLUSTER 'test_cluster'
AS test.hits_local
ENGINE = Distributed('test_cluster', 'test', 'hits_local')

So now we have 3 tables on each node. In ch1:

ch1 :) use test
ch1 :) show tables
┌─name───────────┐
│ hits           │
│ hits_local     │
│ hits_local_tmp │
└────────────────┘

And in ch2:

ch2 :) use test
ch2 :) show tables
┌─name───────────┐
│ hits           │
│ hits_local     │
│ hits_local_tmp │
└────────────────┘

Let's insert some values in the temporary table:

ch1 :) INSERT INTO test.hits_local_tmp VALUES
('2020-10-27 10:00:00', 'user 1', 7),
('2020-10-27 10:00:00', 'user 2', 3),
('2020-10-27 10:00:00', 'user 3', 1),
('2020-10-27 10:00:00', 'user 4', 12)

We can check that now we have a new partition:

ch1 :) SELECT partition, name, active FROM system.parts WHERE table = 'hits_local_tmp'
┌─partition───────────┬─name─────────────┬─active─┐
│ 2020-10-27 10:00:00 │ 1603792800_1_1_0 │      1 │
└─────────────────────┴──────────────────┴────────┘
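The 1603792800 prefix of the part name is the partition ID. For a DateTime partitioning expression like ours it is the Unix timestamp of the start of the hour, which we can confirm (assuming the server timezone is UTC):

```sql
-- Unix timestamp of the partition's hour boundary
SELECT toUnixTimestamp(toDateTime('2020-10-27 10:00:00', 'UTC'))
-- returns 1603792800
```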

The idea is to move this partition from the temporary table to the replicated table. But we cannot do it using MOVE PARTITION:

ch1 :) ALTER TABLE test.hits_local_tmp MOVE PARTITION '2020-10-27 10:00:00' TO TABLE test.hits_local
Received exception from server (version 20.10.2):
  Code: 48. DB::Exception: Received from 127.0.0.1:9000. DB::Exception: Table test.hits_local_tmp (400fd602-745e-4904-9f61-76ccda9eaf0a) supports movePartitionToTable only for MergeTree family of table engines. Got ReplicatedMergeTree.

The good news is that we can use REPLACE PARTITION and then delete the original partition in the temporary table (REPLACE PARTITION does not delete the source partition). REPLACE PARTITION works even if there is no partition to replace in the destination table:

ch1 :) ALTER TABLE test.hits_local REPLACE PARTITION '2020-10-27 10:00:00' FROM test.hits_local_tmp
ch1 :) ALTER TABLE test.hits_local_tmp DROP PARTITION '2020-10-27 10:00:00'

Now we have the partition in the replicated table:

ch1 :) SELECT partition, name, active FROM system.parts WHERE table = 'hits_local'
┌─partition───────────┬─name─────────────┬─active─┐
│ 2020-10-27 10:00:00 │ 1603792800_1_1_0 │      1 │
└─────────────────────┴──────────────────┴────────┘

The data are there:

ch1 :) select * from test.hits_local
┌────────────────date─┬─user_id─┬─pageviews─┐
│ 2020-10-27 10:00:00 │ user 1  │         7 │
│ 2020-10-27 10:00:00 │ user 2  │         3 │
│ 2020-10-27 10:00:00 │ user 3  │         1 │
│ 2020-10-27 10:00:00 │ user 4  │        12 │
└─────────────────────┴─────────┴───────────┘

We can query them through the distributed table:

ch1 :) select * from test.hits
┌────────────────date─┬─user_id─┬─pageviews─┐
│ 2020-10-27 10:00:00 │ user 1  │         7 │
│ 2020-10-27 10:00:00 │ user 2  │         3 │
│ 2020-10-27 10:00:00 │ user 3  │         1 │
│ 2020-10-27 10:00:00 │ user 4  │        12 │
└─────────────────────┴─────────┴───────────┘

They are replicated to the other node:

ch2 :) select * from test.hits_local
┌────────────────date─┬─user_id─┬─pageviews─┐
│ 2020-10-27 10:00:00 │ user 1  │         7 │
│ 2020-10-27 10:00:00 │ user 2  │         3 │
│ 2020-10-27 10:00:00 │ user 3  │         1 │
│ 2020-10-27 10:00:00 │ user 4  │        12 │
└─────────────────────┴─────────┴───────────┘

And the partition has been created in the second node too:

ch2 :) SELECT partition, name, active FROM system.parts WHERE table = 'hits_local'
┌─partition───────────┬─name─────────────┬─active─┐
│ 2020-10-27 10:00:00 │ 1603792800_1_1_0 │      1 │
└─────────────────────┴──────────────────┴────────┘

Now let's update the whole partition. We perform the insert again in the temporary table:

ch1 :) INSERT INTO test.hits_local_tmp VALUES
('2020-10-27 10:00:00', 'user 1', 6),
('2020-10-27 10:00:00', 'user 2', 4),
('2020-10-27 10:00:00', 'user 3', 2),
('2020-10-27 10:00:00', 'user 4', 15),
('2020-10-27 10:00:00', 'user 5', 14)

A new partition is created:

ch1 :) SELECT partition, name, active FROM system.parts WHERE table = 'hits_local_tmp'
┌─partition───────────┬─name─────────────┬─active─┐
│ 2020-10-27 10:00:00 │ 1603792800_3_3_0 │      1 │
└─────────────────────┴──────────────────┴────────┘

We can use it to replace the data in the replicated table:

ch1 :) ALTER TABLE test.hits_local REPLACE PARTITION '2020-10-27 10:00:00' FROM test.hits_local_tmp
ch1 :) ALTER TABLE test.hits_local_tmp DROP PARTITION '2020-10-27 10:00:00'

We see that we still have only one partition on the first node:

ch1 :) SELECT partition, name, active FROM system.parts WHERE table = 'hits_local'
┌─partition───────────┬─name─────────────┬─active─┐
│ 2020-10-27 10:00:00 │ 1603792800_3_3_0 │      1 │
└─────────────────────┴──────────────────┴────────┘

The new data are there:

ch1 :) select * from test.hits_local
┌────────────────date─┬─user_id─┬─pageviews─┐
│ 2020-10-27 10:00:00 │ user 1  │         6 │
│ 2020-10-27 10:00:00 │ user 2  │         4 │
│ 2020-10-27 10:00:00 │ user 3  │         2 │
│ 2020-10-27 10:00:00 │ user 4  │        15 │
│ 2020-10-27 10:00:00 │ user 5  │        14 │
└─────────────────────┴─────────┴───────────┘

They are replicated to the second node:

ch2 :) select * from test.hits_local
┌────────────────date─┬─user_id─┬─pageviews─┐
│ 2020-10-27 10:00:00 │ user 1  │         6 │
│ 2020-10-27 10:00:00 │ user 2  │         4 │
│ 2020-10-27 10:00:00 │ user 3  │         2 │
│ 2020-10-27 10:00:00 │ user 4  │        15 │
│ 2020-10-27 10:00:00 │ user 5  │        14 │
└─────────────────────┴─────────┴───────────┘

And the distributed table is fine too:

ch1 :) select * from test.hits
┌────────────────date─┬─user_id─┬─pageviews─┐
│ 2020-10-27 10:00:00 │ user 1  │         6 │
│ 2020-10-27 10:00:00 │ user 2  │         4 │
│ 2020-10-27 10:00:00 │ user 3  │         2 │
│ 2020-10-27 10:00:00 │ user 4  │        15 │
│ 2020-10-27 10:00:00 │ user 5  │        14 │
└─────────────────────┴─────────┴───────────┘
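To recap, once the tables exist, each hourly load or update boils down to the same three statements (the partition literal here is just the example hour; substitute the hour being loaded):

```sql
-- 1. Load the fresh data for the hour into the MergeTree staging table
INSERT INTO test.hits_local_tmp VALUES ('2020-10-27 10:00:00', 'user 1', 6);

-- 2. Swap the partition into the replicated table
ALTER TABLE test.hits_local
  REPLACE PARTITION '2020-10-27 10:00:00' FROM test.hits_local_tmp;

-- 3. Drop the staging copy (REPLACE PARTITION does not remove the source)
ALTER TABLE test.hits_local_tmp DROP PARTITION '2020-10-27 10:00:00';
```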
