DEV Community

Apache ShardingSphere
Apache ShardingSphere

Posted on

Apache ShardingSphere 5.0.0: Upgraded Federation Execution Engine

Federation Execution Engine is another spotlight of Apache ShardingSphere 5.0.0 GA version's kernel. We aim to support distributed query statements that don't work in the 4.1.1 GA version, such as cross-database instances related queries and subqueries. Thanks to the Federation Execution Engine, developers no longer need to worry about the usage scope of SQLs, and they can concentrate on their business function development, reducing function limitations at the business level.

The figure above shows you the processing flow of the Federation Execution Engine. In general, the process flow still follows the sequence: SQL Parser, SQL Router, SQL Rewriter, and then SQL Executor. The only different step is the additional SQL Optimizer used in Federation Execution Engine. The optimizer contains RBO (Rule-Based Optimizer) and CBO (Cost-Based Optimizer) to optimize distributed query statements and get the execution plan with minimal cost paid. In SQL Router, the router engine analyzes whether an SQL statement is given to cross-database instances and if yes, it determines to use Federation Execution Engine to execute the SQL.

We have not completed developing the Federation Execution Engine yet, and even though we develop it as fast as we can, a lot of optimizations will be required anyway. It is still an experimental feature, so it is disabled by default. If you want to use the engine, you can enable it by configuring `Federation Execution Engine is another spotlight of the 5.0.0 GA version's kernel. We aim to support distributed query statements that don't work in the 4.1.1 GA version, such as cross-database instances related queries and subqueries. Thanks to the Federation Execution Engine, developers no longer need to worry about the usage scope of SQLs, and they can concentrate on their business function development, reducing function limitations at the business level.

The figure above shows you the processing flow of the Federation Execution Engine. In general, the process flow still follows the sequence: SQL Parser, SQL Router, SQL Rewriter, and then SQL Executor. The only different step is the additional SQL Optimizer used in Federation Execution Engine. The optimizer contains RBO (Rule-Based Optimizer) and CBO (Cost-Based Optimizer) to optimize distributed query statements and get the execution plan with minimal cost paid. In SQL Router, the router engine analyzes whether an SQL statement is given to cross-database instances and if yes, it determines to use Federation Execution Engine to execute the SQL.

We have not completed developing the Federation Execution Engine yet, and even though we develop it as fast as we can, a lot of optimizations will be required anyway. It is still an experimental feature, so it is disabled by default. If you want to use the engine, you can enable it by configuringsql-federation-enabled: true.

Federation Execution Engine applies to cross-database instances related queries and subqueries, and some aggregate queries not supported by the kernel. There are specific scenario examples below that we give to help you understand the statements supported by the Federation Execution Engine.

  • Cross-DataNode Related Query: Federation Execution Engine is used when multiple tables of a relational query are distributed on different database instances.

For example, see the following data sharding configuration: t_order and t_order_item tables are table shards with multiple datanodes, and no binding table rules are configured, while t_user and t_user_role are single tables distributed on different database instances.


rules:
-!SHARDING
tables:
t_order:
actualDataNodes: ds_${0..1}.t_order_${0..1}
tableStrategy:
standard:
shardingColumn: order_id
shardingAlgorithmName: t_order_inline
t_order_item:
actualDataNodes: ds_${0..1}.t_order_item_${0..1}
tableStrategy:
standard:
shardingColumn: order_id
shardingAlgorithmName: t_order_item_inline

The following SQL commands are often used for cross-database instances, so it's suggested to use Federation Execution Engine to complete these relational queries:


SELECT * FROM t_order o INNER JOIN t_order_item i ON o.order_id = i.order_id WHERE o.order_id = 1;
SELECT * FROM t_order o INNER JOIN t_user u ON o.user_id = u.user_id WHERE o.user_id = 1;
SELECT * FROM t_order o LEFT JOIN t_user_role r ON o.user_id = r.user_id WHERE o.user_id = 1;
SELECT * FROM t_order_item i LEFT JOIN t_user u ON i.user_id = u.user_id WHERE i.user_id = 1;
SELECT * FROM t_order_item i RIGHT JOIN t_user_role r ON i.user_id = r.user_id WHERE i.user_id = 1;
SELECT * FROM t_user u RIGHT JOIN t_user_role r ON u.user_id = r.user_id WHERE u.user_id = 1;

  • Sub-Query: Simple Push Down Engine supports subqueries with consistent sharding conditions, and sub-queries routed to a single shard. When you don't specify the same shard key for the sub-query and the outer query, or you give the inconsistent value of a shard key, you will need the Federation Execution Engine.

The code block below displays some sub-queries supported by the Federation Execution Engine:


SELECT * FROM (SELECT * FROM t_order) o;
SELECT * FROM (SELECT * FROM t_order) o WHERE o.order_id = 1;
SELECT * FROM (SELECT * FROM t_order WHERE order_id = 1) o;
SELECT * FROM (SELECT * FROM t_order WHERE order_id = 1) o WHERE o.order_id = 2;

  • Aggregate Query: Currently, Apache ShardingSphere Simple Push Down Engine cannot support all aggregate queries. However, you can use Federation Execution Engine to execute aggregate queries not supported by Simple Push Down Engine.


SELECT user_id, SUM(order_id) FROM t_order GROUP BY user_id HAVING SUM(order_id) > 10;
SELECT (SELECT MAX(user_id) FROM t_order) a, order_id FROM t_order;
SELECT COUNT(DISTINCT user_id), SUM(order_id) FROM t_order;

Summary

Federation Execution Engine significantly enhances ##distributed query capabilities of Apache ShardingSphere##. In the future, we will continue to optimize Apache ShardingSphere, aiming at reducing memory usage of the Federation Execution Engine.

Open Source Project Links:

ShardingSphere Github: https://github.com/apache/shardingsphere

ShardingSphere Twitter:https://twitter.com/ShardingSphere

ShardingSphere Slack Channel:https://join.slack.com/t/apacheshardingsphere/shared_invite/zt-sbdde7ie-SjDqo9~I4rYcR18bq0SYTg

GitHub Issues: https://github.com/apache/shardingsphere/issues

Contributor Guide:https://shardingsphere.apache.org/community/cn/contribute/

Top comments (0)