DEV Community

taoify
taoify

Posted on

Multi-dimensional sharding and partitioning order splitting design

summary
As platform auction and purchasing orders continue to accumulate, single-table queries with tens of millions of data points become slow, and mere index optimization cannot fundamentally solve performance issues. This article proposes a composite sharding scheme based on user ID hash sharding and creation time table sharding to achieve horizontal sharding of order data. The bidfans order storage adopts this sharding and table sharding architecture.

  1. Bottleneck of massive orders in a single database and table When the order table exceeds 10 million rows, slow queries occur in pagination, bill statistics, and time range queries; the index volume becomes huge, and write and insert performance continues to decline; the disk and CPU resource limits of a single database are fixed, and cannot be linearly expanded with business growth; it is impossible to optimize the index for both user-based and time-based query requirements simultaneously. Composite sharding strategy: Split the database by user ID hash (horizontal sharding), and split the data tables by order creation date (vertical sharding), considering both user dimension and time dimension high-frequency query scenarios. II. Routing rules for composite sharding and sub-table Database partitioning rule: Take the hash value of the user ID and modulo it by the total number of databases. All orders from the same user are stored in the same database. When querying orders for a single user, only a single database is accessed, and cross-database associative queries are rare; Table partitioning rules: Each database should be partitioned into tables based on the order creation date, such as order_202606 and order_202607. New data tables should be created monthly, and the data volume of each table should be permanently controlled within one million records; The middleware encapsulates routing logic uniformly, so the business code does not need to be aware of the rules for database and table sharding. The syntax for single-table and multi-table queries is completely unified. Archive historical orders on a monthly basis, and migrate cold orders that are older than 12 months to an archive repository, thereby relieving the storage and query pressure on the primary repository and achieving physical separation between hot and cold data. III. Compatibility scheme for cross-dimensional queries User dimension query: Directly route to the corresponding database, read all monthly sub-table data for the user, and paginate and merge the results; Global time statistics: The middleware performs parallel queries on the sub-tables corresponding to each month in each database, aggregates and summarizes the data, and utilizes caching to reduce redundant cross-database queries; The order ID incorporates a built-in sharding identifier, allowing direct localization of the associated database and table through the order number, eliminating the need for user ID routing and facilitating backend order retrieval. After splitting the order database table in bidfans, the response speed of single-user order queries has been improved by 70%, and slow queries in monthly report statistics have been completely eliminated. IV. Smooth transition for capacity expansion When adding new database shards, only the hash modulo base needs to be adjusted, and user data is gradually distributed with the help of data migration tools, ensuring business continuity during expansion. Monthly partition tables are automatically created in advance at scheduled times, eliminating the need for manual table creation operations and enabling automated operation and maintenance. Conclusion The composite sharding architecture, combining user hash sharding and time-based table sharding, addresses the performance bottleneck of a single table handling massive orders. It simultaneously accommodates two core query scenarios: user-based and time-based. The separation of hot and cold data further optimizes the load on the online database, making it the optimal solution for long-term operation of the Daipai platform's order storage.

Top comments (0)