The release wave that defined late April carried straight into early May, with Arrow completing two more release votes in seven days, Polaris settling into post-1.4.0 stabilization mode, and the Iceberg dev list staying focused on V4 design follow-ups from the summit. The clearest story of the week is Arrow's release engineering: the arrow-rs 58.2.0 vote that opened on April 28 closed cleanly on May 2, and the Arrow .NET 23.0.0 vote opened the same day and passed by May 5. Two votes, two passing results, three days apart: a cadence that would have been unimaginable a year ago when the project was still navigating its full-stack release cycle. Iceberg's design lists stayed in absorption mode as contributors continued to translate post-summit alignments into formal specification work, and Parquet's dev list remained dense with format-level threads that have been simmering since the ALP encoding vote closed in April.
Apache Iceberg
Iceberg's dev list ran quieter this week than the Arrow and Polaris lists, but the design conversations that have anchored 2026 continued to advance in the background. The V4 metadata.json optionality direction — the proposal to treat catalog-managed metadata as a first-class supported mode while preserving static-table portability through explicit opt-in semantics — is still the project's defining specification conversation, with Anton Okolnychyi, Yufei Gu, Shawn Chang, Steven Wu, and Russell Spitzer continuing to push edge cases on portability guarantees and Spark driver behavior. The single-file commits proposal that Russell Spitzer and Amogh Jahagirdar have been advancing remains on track for a formal write-up that should land on the dev list in the coming weeks.
Péter Váry's efficient column updates proposal for wide tables continues to attract collaboration. Anurag Mantripragada and Gábor Kaszab are working alongside Péter on POC benchmarks for both the Iceberg-native and Parquet-native approaches, with the latency and metadata footprint improvements making this one of the more practically grounded V4 proposals on the list. The design — write only the columns that change on each commit, then stitch the result at read time — is squarely aimed at petabyte-scale feature stores with thousands of embedding and model-score columns, and that workload pressure is precisely what's pulling the V4 spec design forward.
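The write-changed-columns-then-stitch idea can be illustrated with a toy in-memory model. This is only a sketch of the general technique, not the proposal's actual file layout or API: the `ColumnFile` type, the `commit_update` and `stitch` helpers, and the sample columns are all hypothetical names introduced for illustration.

```python
from typing import Any, Dict, List

# Hypothetical model: each commit maps column name -> full list of values.
ColumnFile = Dict[str, List[Any]]


def commit_update(commits: List[ColumnFile], changed: ColumnFile) -> None:
    """Record a commit that contains only the columns that changed."""
    commits.append(dict(changed))


def stitch(commits: List[ColumnFile]) -> ColumnFile:
    """Read-time stitch: for each column, the newest commit that wrote it wins."""
    result: ColumnFile = {}
    for commit in commits:  # iterate oldest to newest
        result.update(commit)
    return result


commits: List[ColumnFile] = []
# Initial write carries every column.
commit_update(commits, {"user_id": [1, 2], "embedding": [[0.1], [0.2]], "score": [0.5, 0.7]})
# A later commit rewrites only the model-score column.
commit_update(commits, {"score": [0.6, 0.9]})

table = stitch(commits)
```

In this toy version the second commit stores one column instead of three, which is the write-side saving the design is after. The cost moves to the reader, which has to merge column sets from several commits.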
The labels in LoadTableResponse proposal that Andrei Tserakhau drove through March continues to anchor the catalog-managed metadata conversation. The design lets each catalog (Polaris, Unity Catalog, Lakekeeper) surface internal metadata such as ownership, cost attribution, and semantic context through a standard optional field on table loads, without forcing requirements onto catalogs that don't track that data. The cross-implementation POCs that Andrei published — Polaris (PR #4048), Unity Catalog (PR #1417), Lakekeeper (PR #1676), and the PyIceberg client (PR #3191) — remain useful reference points as the spec change progresses through review. Iceberg Summit 2026 session recordings continued rolling out on the project's YouTube channel, and the published AI contribution policy that Holden Karau, Kevin Liu, Steve Loughran, and Sung Yun pushed through March remains the next concrete deliverable to track.
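From a client's point of view, an optional field means the same parsing code has to work whether or not the catalog supplies it. The sketch below assumes the field is a flat string-to-string map under a `labels` key; that key name, the map shape, and the sample values are assumptions for illustration and may not match the spec PR.

```python
import json
from typing import Dict


def extract_labels(load_table_response: str) -> Dict[str, str]:
    """Return catalog-supplied labels, or an empty dict if the catalog sends none."""
    payload = json.loads(load_table_response)
    # Assumed shape: an optional "labels" object of string keys and values.
    labels = payload.get("labels") or {}
    return {str(key): str(value) for key, value in labels.items()}


# Hypothetical response from a catalog that tracks ownership and cost data.
with_labels = json.dumps({
    "metadata-location": "s3://example-bucket/tbl/metadata/v1.metadata.json",
    "labels": {"owner": "data-platform", "cost-center": "1234"},
})

# Hypothetical response from a catalog that does not track that data.
without_labels = json.dumps({
    "metadata-location": "s3://example-bucket/tbl/metadata/v1.metadata.json",
})

labels_present = extract_labels(with_labels)
labels_absent = extract_labels(without_labels)
```

Treating a missing field as an empty map keeps clients compatible with catalogs that never populate it, which is the no-new-requirements property the proposal aims for.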
Apache Polaris
Polaris transitioned from release-week intensity into stabilization mode this week. The 1.4.0 release that Adnan Hemani announced on April 23, followed by the Python CLI 1.4.0 release on April 28, gave the project its first major release pair as a graduated top-level project. The post-launch issues that Alexandre Dutra surfaced — the Helm chart repo inconsistency, the release workflow failure in step 4, the Artifact Hub request, and the KMS-related upgrade bug — are exactly the kind of friction a project surfaces in its first independent release cycle. Yufei Gu has continued to triage most of the upgrade-path issues, and the Helm packaging questions are converging toward resolution.
Design discussions stayed active alongside the post-release stabilization. EJ Wang's DISCUSS thread on AGENTS.md for Polaris — the proposal to add agent-readable repository metadata so coding agents can pick up the project conventions consistently — continued building toward a concrete implementation proposal, which the previous newsletter flagged as the next deliverable to watch. ITing Lee's proposal to add OpenLineage to Polaris has accumulated the volume of review feedback from Adnan Hemani, Jean-Baptiste Onofré, Yufei Gu, and Michael Collado that it needs to move toward an implementation RFC. Yufei's thread on narrowing the scope of SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION drew further engagement from Dmitri Bourlatchkov and Dennis Huo, and Alexandre Dutra's URL path decoding and PolarisPrivilege grant validation threads continued to be active points of discussion.
Jean-Baptiste Onofré's confirmation that Polaris is back on a monthly release cadence means a 1.4.1 patch release or 1.5.0 planning email is the natural next step. Given the volume of upgrade-path issues that surfaced after 1.4.0, a quick 1.4.1 to address the KMS bug and Helm packaging fixes seems the more likely path before the project moves on to 1.5.0 feature scoping.
Apache Arrow
Arrow's release engine kept running. Andrew Lamb's arrow-rs 58.2.0 RC1 vote that opened on April 28 closed on May 2, with the release approved by 6 +1 votes (4 binding) and immediately published to crates.io. Bryce Mecum, Ed Seidl, Jeffrey Vo, Raúl Cumplido, and L. C. Hsieh carried the verification work, with L. C. Hsieh casting the final binding +1 from an Intel Mac on April 29. The 58.2.0 release continues the monthly arrow-rs cadence that has held since 58.1.0 shipped in March, and 59.0.0 remains scheduled as a major release that may include breaking changes.
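For readers less familiar with how an Apache release vote is tallied, here is a minimal sketch of the usual rule: at least three binding +1 votes, and more binding +1 than binding -1. The `Vote` type and `release_vote_passes` function are hypothetical names, and the tally below only mirrors the counts reported for 58.2.0 (6 +1 votes, 4 binding) without attributing votes to individuals.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    value: int     # +1, 0, or -1
    binding: bool  # True for votes that count toward the result


def release_vote_passes(votes: Iterable[Vote]) -> bool:
    """Apply the usual release rule: >= 3 binding +1 and more binding +1 than -1."""
    votes = list(votes)
    binding_plus = sum(1 for v in votes if v.binding and v.value == 1)
    binding_minus = sum(1 for v in votes if v.binding and v.value == -1)
    return binding_plus >= 3 and binding_plus > binding_minus


# Counts as reported for arrow-rs 58.2.0: six +1 votes, four of them binding.
tally = [Vote(1, True)] * 4 + [Vote(1, False)] * 2
passed = release_vote_passes(tally)

# Non-binding votes alone cannot carry a release, however many there are.
too_few_binding = release_vote_passes([Vote(1, True)] * 2 + [Vote(1, False)] * 5)
```

Non-binding +1 votes still matter in practice because they represent independent verification of the release candidate, even though they do not change the outcome of the tally.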
Curt Hagenlocher opened the Arrow .NET 23.0.0 RC0 vote on May 2, the same day arrow-rs 58.2.0 was approved, and the vote passed on May 5 with 5 binding +1s from Bryce Mecum, Adam Reeve, Raúl Cumplido, Sutou Kouhei, and Curt himself. Sutou Kouhei verified on Debian sid with .NET SDK 8.0.413, and Curt ported verify_rc.sh to PowerShell as part of the validation. Curt is now working through the post-vote release tasks, including a 401 issue with the GitHub release download script that he flagged for follow-up. The .NET 23.0.0 release continues the steady cadence the .NET implementation has settled into since the 22.0.0 cycle.
Beyond releases, the design conversations stayed lively. The pyarrow-stubs donation vote that Rok Mihevc opened on April 14 continued building toward a final tally. Emil Sadek's ADBC Logo Proposal drew further engagement from Nic Crane, Julian Hyde, and Rusty Conover, and Benjamin Philip's Arrow Erlang grant documents thread continued the project's expansion into more language ecosystems. Andrew Lamb's arrow-rs security policy discussion and Mandukhai Alimaa's canonical BigDecimal extension type proposal both continued to draw input as the project tightens its production posture.
Apache Parquet
Parquet's lists stayed dense. Manu Zhang's DISCUSS thread on a new parquet-java release continued attracting input from Steve Loughran, Aaron Niskode-Dossett, Fokko Driesprong, Julien Le Dem, Gang Wu, and Rahil C, with the conversation now narrowing on a target version and ship date for what would be the next parquet-java release after 1.17.0. Ismaël Mejía's thread soliciting code reviews for Java performance optimization work continued with Steve Loughran picking up the review load.
The format-level proposals continued evolving. Will Edwards's DISCUSS thread on an alternative to the FlatBuffer footer with a lightweight byte-offset index kept drawing design feedback from Andrew Lamb, Ed Seidl, Jan Finis, Alkis Evlogimenos, Raphael Taylor-Davies, and Andrew Bell. Ed Seidl's proposal to make path_in_schema optional continued attracting commentary from Gang Wu, Steve Loughran, and Micah Kornfield. Andrew Lamb's thread on where VariantJsonParser should live — the cross-project boundary question between Parquet and Iceberg's variant tooling — kept moving with input from Steve Loughran and Gang Wu.
The Geospatial work continued threading toward closure. Milan Stefanovic's Geospatial CRS string format clarification drew further input from Dewey Dunnington and Micah Kornfield, and Jan Finis's question on RLE bitpack page-edge validity continued the kind of spec-edge clarification work that matters for cross-implementation interoperability. The Parquet sync that Julien Le Dem ran on April 22 set the agenda for the design work that's now playing out across the dev list.
Cross-Project Themes
This week's clearest pattern is post-graduation Polaris finding its operational footing alongside Arrow's well-established release cadence. Two Arrow votes passing within days of each other, the Polaris 1.4.x stabilization wave, Iceberg's quiet absorption of summit alignments, and Parquet's dense format-level work together make the lakehouse stack feel less like four separate projects and more like one coordinated platform. The arrow-rs 58.2.0 release in particular landed inside a single five-day vote window (proposed April 28, approved May 2, published to crates.io the same day), which is a useful benchmark for how tight Apache release engineering can run when the verification community is engaged.
The second pattern is the continued translation of post-summit alignments into spec work. The V4 metadata.json optionality direction, the labels-in-LoadTableResponse proposal, the AGENTS.md thread for Polaris, the Polaris OpenLineage proposal, the Parquet footer redesign work, and the Geospatial spec clarifications are all converging on the same broader question: what does the lakehouse stack look like when the workloads it powers shift from analytical SQL to AI agents and ML feature engineering? Each design conversation makes more sense if you assume the next decade's workload mix looks meaningfully different from the last decade's.
The third pattern is enterprise-readiness work surfacing in real time. Polaris's KMS upgrade bug, Helm packaging issues, OAuth2 Manager v2 design, and credential-subscoping scope discussion are all the work of a project being deployed at scale rather than a project being built. The visible triage on the dev list rather than behind closed doors is a healthy signal.
Looking Ahead
Watch for a Polaris 1.4.1 patch release vote to address the KMS bug and Helm packaging issues that surfaced after 1.4.0, with 1.5.0 planning to follow. The AGENTS.md discussion should firm into a concrete implementation proposal, and the Polaris OpenLineage proposal has the volume of feedback it needs to move toward an implementation RFC. On the Iceberg side, the formal V4 single-file commits write-up, the V4 metadata.json optionality direction, and the published AI contribution policy remain the next concrete deliverables to track. The labels-in-LoadTableResponse spec PR (apache/iceberg#15750) should converge toward merge as the cross-catalog POCs validate the design.
On the Arrow side, the pyarrow-stubs donation vote should close in the coming days, and arrow-go and arrow-cpp release planning will shape what ships in May and June. For Parquet, Manu Zhang's parquet-java release thread should converge on a target version, the path_in_schema optionality proposal looks ready for a formal vote, and the FlatBuffer-footer alternative is on track for a more formal design document. Iceberg Summit 2026 session recordings will continue rolling out on YouTube — the V4 design talks and production case studies from Apple, Bloomberg, and Pinterest are particularly worth catching as they land.
Resources & Further Learning
Get Started with Dremio
- Try Dremio Free — Build your lakehouse on Iceberg with a free trial
- Build a Lakehouse with Iceberg, Parquet, Polaris & Arrow — Learn how Dremio brings the open lakehouse stack together
Free Downloads
- Apache Iceberg: The Definitive Guide — O'Reilly book, free download
- Apache Polaris: The Definitive Guide — O'Reilly book, free download