DEV Community

Michael
Michael

Posted on • Originally published at gbase.cn

GBase 8a Partitioning Best Practices: From Design to Operations

In a gbase database handling tens of billions of rows, smart partitioning is no longer optional — it's mandatory. This guide covers partition type selection, pruning validation, daily maintenance, and common pitfalls to help you build a solid partitioning strategy for your GBASE cluster.

Supported Partitioning Types

GBase 8a offers four partitioning methods:

  • RANGE – split by column value intervals (ideal for time‑series data)
  • LIST – split by discrete enumerated values (e.g., region, status)
  • HASH – distribute evenly via hash modulus
  • KEY – similar to HASH, but the database chooses the hash function

In practice, RANGE partitioning on a date column is the most common choice.

RANGE Partitioning

Monthly Partitioning Example

CREATE TABLE orders (
    order_id     BIGINT NOT NULL,
    customer_id  BIGINT NOT NULL,
    order_date   DATE NOT NULL,
    amount       DECIMAL(12,2),
    status       VARCHAR(20),
    PRIMARY KEY (order_id, order_date)   -- partition key must be part of PK
)
PARTITION BY RANGE (TO_DAYS(order_date)) (
    PARTITION p202401 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p202402 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    PARTITION p202403 VALUES LESS THAN (TO_DAYS('2024-04-01')),
    PARTITION p202404 VALUES LESS THAN (TO_DAYS('2024-05-01')),
    PARTITION p202405 VALUES LESS THAN (TO_DAYS('2024-06-01')),
    PARTITION p202406 VALUES LESS THAN (TO_DAYS('2024-07-01')),
    PARTITION p_future VALUES LESS THAN MAXVALUE  -- catch-all
);
Enter fullscreen mode Exit fullscreen mode

Key rules:

  • The partition key column must be included in the primary key or unique key, otherwise the DDL will fail.
  • For daily partitioning, keep the total partition count under 1024 to avoid metadata overhead.

LIST Partitioning Example

CREATE TABLE sales_regional (
    sale_id   BIGINT NOT NULL,
    region    VARCHAR(20) NOT NULL,
    sale_date DATE NOT NULL,
    amount    DECIMAL(12,2),
    PRIMARY KEY (sale_id, region)
)
PARTITION BY LIST COLUMNS (region) (
    PARTITION p_north   VALUES IN ('北京','天津','河北','山西','内蒙古'),
    PARTITION p_east    VALUES IN ('上海','江苏','浙江','安徽','山东'),
    PARTITION p_south   VALUES IN ('广东','广西','海南','福建'),
    PARTITION p_central VALUES IN ('湖北','湖南','河南','江西'),
    PARTITION p_west    VALUES IN ('四川','重庆','云南','贵州','西藏'),
    PARTITION p_other   VALUES IN ('其他')
);
Enter fullscreen mode Exit fullscreen mode

Partition Pruning

Pruning kicks in only when the WHERE clause references the partition key. Verify it with EXPLAIN PARTITIONS:

EXPLAIN PARTITIONS
SELECT * FROM orders
WHERE order_date BETWEEN '2024-03-01' AND '2024-03-31';
Enter fullscreen mode Exit fullscreen mode
  • ✅ If only p202403 appears in the partitions column, pruning works.
  • ❌ If all partitions are listed, pruning is disabled — check your query.

Common pruning killers:

  • Applying a function: WHERE MONTH(order_date) = 3
  • Implicit type conversion: WHERE order_date = 20240301
  • Mixing partition-key and non‑key conditions with OR ❌ — consider UNION ALL instead.

Day‑to‑Day Partition Maintenance

Add a New Partition

ALTER TABLE orders
    REORGANIZE PARTITION p_future INTO (
        PARTITION p202407 VALUES LESS THAN (TO_DAYS('2024-08-01')),
        PARTITION p_future VALUES LESS THAN MAXVALUE
    );
-- Always update statistics immediately!
ANALYZE TABLE orders;
Enter fullscreen mode Exit fullscreen mode

Drop Historical Partitions (orders of magnitude faster than DELETE)

ALTER TABLE orders DROP PARTITION p202101, p202102, p202103;
Enter fullscreen mode Exit fullscreen mode

Merge or Split Partitions

-- Merge two small partitions
ALTER TABLE orders
    REORGANIZE PARTITION p202401, p202402 INTO (
        PARTITION p2024_q1 VALUES LESS THAN (TO_DAYS('2024-03-01'))
    );
-- Split a large partition
ALTER TABLE orders
    REORGANIZE PARTITION p_future INTO (
        PARTITION p202407 VALUES LESS THAN (TO_DAYS('2024-08-01')),
        PARTITION p202408 VALUES LESS THAN (TO_DAYS('2024-09-01')),
        PARTITION p_future VALUES LESS THAN MAXVALUE
    );
Enter fullscreen mode Exit fullscreen mode

Common Mistakes

  • Too many partitions: Over 500 partitions cause metadata bloat and slower plan generation. Aim for 50–200 partitions.
  • Low‑cardinality partition key: Using a column like status (3 values) causes severe data skew.
  • No catch‑all partition: Without MAXVALUE, out‑of‑range inserts fail immediately.
  • Skipping ANALYZE TABLE after partition changes: The optimizer sees empty partitions and may choose poor plans.

Automate Future Partition Creation

A simple script added to crontab ensures you never forget to create next month's partition:

#!/bin/bash
NEXT_MONTH=$(date -d "+1 month" +%Y%m)
NEXT_MONTH_START=$(date -d "+1 month" +%Y-%m-01)
NEXT_NEXT_MONTH_START=$(date -d "+2 month" +%Y-%m-01)
PARTITION_NAME="p${NEXT_MONTH}"

mysql -h 192.168.1.10 -P 5258 -u gbase -pyour_password sales_db <<EOF
ALTER TABLE orders
    REORGANIZE PARTITION p_future INTO (
        PARTITION ${PARTITION_NAME} VALUES LESS THAN (TO_DAYS('${NEXT_NEXT_MONTH_START}')),
        PARTITION p_future VALUES LESS THAN MAXVALUE
    );
ANALYZE TABLE orders;
EOF
Enter fullscreen mode Exit fullscreen mode

Crontab: 0 1 1 * * /bin/bash /usr/local/scripts/auto_add_partition.sh

Design Decision Flow

  1. Estimate data size: below 10 million rows → no partition needed; above 10 million → consider partitioning.
  2. Date column? → RANGE. Discrete categories? → LIST. No obvious pattern? → HASH.
  3. Granularity: daily inserts > 5 million → partition by day; 1–5 million → by month; < 1 million → by quarter or year.
  4. Ensure the partition key is in the primary key, then validate pruning with EXPLAIN.

A well‑designed partitioning layout is the foundation of query performance in a gbase database. Spend the time to get it right at design time, and you'll save yourself a world of trouble later.

Top comments (0)