DEV Community

RisingWave Labs

RisingWave Roadmap Q4 2023

Tao Wu | Product Manager

Announcement: Introducing RisingWave Tutorials, a Must-Read Handbook for Stream Processing Enthusiasts

One and a half years ago, in April 2022, we open-sourced RisingWave, a distributed SQL streaming database. A quarter ago, in July 2023, we released the first official version, RisingWave 1.0, a battle-tested system that can be used in production. More recently, RisingWave 1.3 was released.

RisingWave is an open-source streaming database released under the Apache 2.0 license. Its development team actively collects feedback from users and strives to democratize stream processing: to make it simple, affordable, and accessible.

RisingWave has been deployed in production at dozens of enterprises and fast-growing startups. How will it evolve? We plan to keep our roadmap transparent and update it periodically. Here’s what you can anticipate in future releases of RisingWave.

Note that the roadmap is not final. We will update it frequently to reflect changing priorities and better serve users.

Short-term goals (within the next 3 months)

  • Adaptive Scaling

    Implement adaptive scaling to automatically adjust materialized view parallelism based on the number of CPU cores in the cluster.

  • Improvements to the Existing External Sinks

    Optimize the performance and improve the stability of supported external sinks such as Doris, ClickHouse, and Elasticsearch. We’ll also expand the encoding formats supported by the Kafka sink to include Protobuf and Avro, and add support for Schema Registry.

  • Iceberg Sink V2

    We recently introduced a native integration with Iceberg that is no longer based on the official Java library. It has been fully rewritten in Rust for performance and stability. We plan to stabilize it in the next few months.

  • Enhanced Observability

    Expand system tables and add metrics for stateful operators to provide greater visibility into system health and performance.

  • Improved Open-source Web UI

    Enhance RisingWave's open-source web UI with additional system information and monitoring capabilities.

  • Sink into table

    Users may want to dynamically union the results of multiple views into a single table. For example, each view may correspond to a department in a company, and new departments may be added from time to time. With this feature, users can seamlessly merge data from new views as they are added.

  • CDC Connection Sharing

    RisingWave currently creates one CDC connection per table. Each connection individually consumes the replication log, which contains transactions not only on the source table but also on other tables in the same database. Multiple connections therefore lead to duplicate consumption and a heavy load on the upstream database. Sharing CDC connections reduces that load and improves CDC stability.

  • Recoverable CREATE MATERIALIZED VIEW

    Persist the progress of materialized view creation so that it can recover from failures without losing work already completed.

  • CDC Transaction Atomicity

    RisingWave currently applies CDC changes event by event, and a single event may contain only part of a transaction. With this feature, RisingWave will buffer all CDC events within a transaction until they can be applied atomically.

  • Parallel CDC Snapshot Loading

    Introduce parallelism during CDC snapshot loading to improve the user experience for large upstream tables.
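
To make the CDC Transaction Atomicity item above more concrete, here is a minimal Python sketch of the general idea of buffering change events per transaction and releasing them only on commit. It is an illustration under stated assumptions, not RisingWave's implementation: the `CdcEvent` shape, the `"begin"`/`"row"`/`"commit"` event kinds, and the `TransactionBuffer` class are all hypothetical names invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class CdcEvent:
    """Hypothetical change event; real CDC payloads carry more fields."""
    txn_id: int
    kind: str                  # assumed kinds: "begin", "row", "commit"
    row: Optional[dict] = None


class TransactionBuffer:
    """Holds row changes per transaction and releases them only on commit."""

    def __init__(self):
        self._pending = defaultdict(list)

    def feed(self, event):
        """Return all buffered rows when a commit arrives, otherwise None."""
        if event.kind == "begin":
            self._pending[event.txn_id] = []
            return None
        if event.kind == "row":
            self._pending[event.txn_id].append(event.row)
            return None
        if event.kind == "commit":
            return self._pending.pop(event.txn_id, [])
        raise ValueError(f"unknown event kind: {event.kind}")


events = [
    CdcEvent(1, "begin"),
    CdcEvent(1, "row", {"id": 1, "balance": 90}),
    CdcEvent(1, "row", {"id": 2, "balance": 110}),
    CdcEvent(1, "commit"),
]
buf = TransactionBuffer()
# Only the commit event yields a batch, so both rows are applied together.
applied = [batch for e in events if (batch := buf.feed(e)) is not None]
```

In this sketch a downstream consumer never observes the first row without the second, which is the property the roadmap item describes. A real system would also need to bound buffer memory and handle aborted transactions.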

Mid-term goals (within the next 6 months)

  • SSL/TLS Secured Connection

    Implement SSL/TLS encryption for client/server communications to enhance security.

  • Alter Materialized View

    Add the ability to modify existing materialized views.

  • Session Window

    Introduce session window functionality for advanced streaming analytics.

  • MemTable Spill

    A refresh of a small table can suddenly amplify write throughput by 1,000x. This typically happens in joins with 10 or more inputs. One way to mitigate it is to use the local disk as a buffer for the flood of writes, thus avoiding out-of-memory (OOM) errors.

  • Dedicated Computes for Materialized View Creation

    Some users have reported that materialized view creation in RisingWave is too slow, because it requires a resource-intensive ad-hoc computation. By contrast, the long-running streaming (incremental) computation requires fewer resources at any given moment. It is therefore possible to allocate dedicated resources for MV creation when needed and release them once creation finishes.

  • More External Sinks

    Redshift and Snowflake sinks are planned.

  • Recursive CTE

    Enable recursive common table expressions (CTEs) to traverse hierarchical data such as a company’s organizational tree.

  • Shared Meta Plane

    Enable RisingWave clusters to share the meta plane, including etcd (or Postgres in the future), to better utilize compute resources across clusters.
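
For readers unfamiliar with the Session Window item above, the following Python sketch shows the usual semantics of a session window: events are grouped together until the gap between consecutive events exceeds a threshold. This illustrates the general concept only and says nothing about how RisingWave will expose or implement it; the `sessionize` function and its `gap` parameter are hypothetical names chosen for this example.

```python
def sessionize(timestamps, gap):
    """Group timestamps into sessions, starting a new session whenever
    the distance to the previous event is larger than `gap`."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            # Close enough to the previous event: same session.
            sessions[-1].append(ts)
        else:
            # First event, or the gap was exceeded: open a new session.
            sessions.append([ts])
    return sessions


# Click times in seconds; a pause longer than 10 seconds ends a session.
clicks = [0, 5, 8, 40, 43, 100]
windows = sessionize(clicks, gap=10)
# windows == [[0, 5, 8], [40, 43], [100]]
```

Unlike tumbling or hopping windows, the window boundaries here depend on the data itself, which is why session windows need dedicated support in a streaming engine.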

Long-term goals

  • Optimize analytical query performance on third-party systems like Presto and Trino

  • GraphQL API

    Allow clients to retrieve results from RisingWave directly from the browser.

  • Serverless Compaction

    Automatically scale Compactor instances in and out to match workload demands in a serverless model.

Conclusion

RisingWave is an open-source streaming database that aims to democratize stream processing: to make it easy to use and cost-efficient. Its development direction is heavily influenced by user requests. We would love to hear from the community and update our agenda accordingly. If you have any questions or comments about RisingWave’s roadmap, please don’t hesitate to let us know by commenting here. Your voice will help shape the future of real-time stream processing!

About RisingWave Labs

RisingWave is an open-source distributed SQL database for stream processing. It is designed to reduce the complexity and cost of building real-time applications. RisingWave offers users a PostgreSQL-like experience specifically tailored for distributed stream processing.

Official Website: https://www.risingwave.com/

Documentation: https://docs.risingwave.com/docs/current/intro/

Tutorial: https://tutorials.risingwave.com/

Slack: https://risingwave-community.slack.com

GitHub: https://github.com/risingwavelabs/risingwave

LinkedIn: linkedin.com/company/risingwave-labs
