Get Data Lakehouse Books:
- Apache Iceberg: The Definitive Guide
- Apache Polaris: The Definitive Guide
- Architecting an Apache Iceberg Lakehouse
- The Apache Iceberg Digest: Vol. 1
Lakehouse Community:
- Join the Data Lakehouse Community
- Data Lakehouse Blog Roll
- OSS Community Listings
- Dremio Lakehouse Developer Hub
The third week of January 2026 marks a significant milestone period for the Apache lakehouse ecosystem, featuring major releases, the inaugural Iceberg-Spark community sync, and the close of the Iceberg Summit call for papers. The community continues building momentum with both technical advances and grassroots engagement as 2026's development cycle accelerates.
Apache Iceberg
Iceberg Summit CFP Closes: The call for papers for Apache Iceberg Summit 2026 closed on January 18th, ending the submission period for the April 8-9 event in San Francisco. The selection committee, led by Russell Spitzer as the main PMC contact, will now begin reviewing proposals from across the vendor, user, and contributor community. A dedicated conference of this kind reflects Iceberg's maturation into a central open standard.
Inaugural Iceberg-Spark Community Sync: On January 20th, the first dedicated Iceberg-Spark Community Sync took place, bringing focused attention to Spark integration topics that have grown too numerous for the main community sync. Initiated by Anurag Mantripragada with support from Anton Okolnychyi and Kevin Liu, the monthly sync covers ongoing work including sort order reporting, Spark 4.1 support, and the future of Datafusion-Comet integration. This specialized forum recognizes the depth and complexity of Spark-specific development while keeping the main community sync focused on broader cross-engine discussions.
Atlanta Meetup Builds Grassroots Momentum: The Apache Iceberg Meetup ATL held its first event of 2026 on January 21st, continuing the grassroots community engagement that has grown throughout 2025. Organizers are working on a structured CFP process to encourage diverse presenters and topics throughout the year, demonstrating how local meetups complement the global summit and online syncs in building a vibrant Iceberg community.
Official Project Blog Launches: Following the successful vote in early January, the official Apache Iceberg blog at iceberg.apache.org/blogs/ is now operational. The first post promotes the Iceberg Summit 2026, demonstrating the blog's role as a central hub for community announcements, technical deep dives, and project updates. This communication channel provides a more polished complement to mailing list discussions and GitHub activity.
Apache Polaris
Graduation Path Solidifies: Polaris continues its steady march toward full Apache graduation, with regular community syncs and development sprints focusing on documentation, onboarding, and resolving open issues. The expanding Podling Project Management Committee (PPMC) reflects healthy governance maturation, with multiple contributors stepping into leadership roles across the project.
Generic Table Feature Stabilizing: Following the 1.3.0-incubating release earlier in January, the community is preparing to graduate the "Generic Table" capability from beta status in an upcoming release. This feature enables Polaris to catalog external table formats like Apache Hudi and Delta Lake in a stable, production-ready manner, significantly expanding Polaris's value as a multi-format catalog.
Cloud Integration Testing Expansion: With AWS credits now available to the project, contributors are expanding integration testing against real cloud infrastructure. The focus includes IAM AssumeRole flows and credential vending scenarios that are difficult to simulate locally, improving production-readiness validation for enterprise deployments on AWS, Azure, and Google Cloud.
Apache Arrow
Arrow 23.0.0 Released: On January 18th, Apache Arrow shipped its 23.0.0 major release, covering over three months of development with 417 commits from 71 distinct contributors. This release continues Arrow's quarterly cadence and includes improvements across the C++, Python, Java, R, and Go implementations. The release brings performance optimizations, new compute functions, and enhanced multi-language consistency that strengthens Arrow's position as the universal columnar interchange format.
Leadership Stability Reinforced: With Antoine Pitrou formally serving as PMC Chair, Arrow enters 2026 with stable leadership from a long-time core contributor. This governance continuity provides consistent technical vision while the project continues expanding its language bindings and compute capabilities.
Multi-Language Consistency Focus: The 23.0.0 release emphasizes keeping Arrow's implementations consistent across languages, ensuring that data interchange works seamlessly whether teams are using PyArrow, Arrow Java, Arrow C++, or Arrow R. This consistency is critical for the lakehouse ecosystem where different query engines and tools need to exchange data efficiently.
Apache Parquet
Parquet 1.17.0 Officially Released: Following the successful vote in early January, Apache Parquet 1.17.0 was officially released on January 13th, 2026. This release represents a significant modernization milestone as it drops Java 8 support and sets Java 11 as the new minimum runtime requirement. The change aligns Parquet with contemporary Java library standards and enables the use of modern language features in future development.
Java 11 Migration Complete: The move to Java 11 minimum requirement marks an important transition for the Parquet ecosystem, signaling that users still on legacy Java versions should plan upgrades. This modernization parallels similar discussions in the Iceberg community around moving to Java 17, reflecting a broader trend toward contemporary Java platforms across the lakehouse stack.
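For teams auditing their runtimes ahead of an upgrade, the Java 11 floor can be checked with a small helper. The following is a minimal sketch, not part of Parquet itself: the function names `java_feature_version` and `meets_minimum` and the constant `PARQUET_MIN_JAVA` are hypothetical, and the sketch assumes the version string comes from something like `java -version` output or the `java.version` system property.

```python
import re

# Minimum Java runtime for Parquet 1.17.0, as stated in this digest.
PARQUET_MIN_JAVA = 11


def java_feature_version(version: str) -> int:
    """Return the Java feature (major) version for a version string.

    Handles the legacy "1.x" scheme used through Java 8 (e.g. "1.8.0_392")
    and the scheme used since Java 9 (e.g. "11.0.21", "17").
    """
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version string: {version!r}")
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8" means Java 8, so the second component is the feature version.
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_minimum(version: str, minimum: int = PARQUET_MIN_JAVA) -> bool:
    """Report whether a Java version string satisfies the minimum feature version."""
    return java_feature_version(version) >= minimum
```

Calling `meets_minimum("1.8.0_392")` flags a Java 8 runtime as too old, while `meets_minimum("11.0.21")` and `meets_minimum("17")` pass. The same helper could be reused with `minimum=17` if a Java 17 baseline is adopted elsewhere in the stack.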
V3 Format Groundwork Continues: While no concrete proposals have emerged, informal discussions continue around what a Parquet V3 format might include. Topics under consideration include FSST (Fast Static Symbol Table) compression for string encoding, cleaner metadata layouts, and enhanced bloom filter indexing. The community remains focused on completing V2 features like optional checksums and page-level statistics before introducing any breaking format changes.
Cross-Project Themes
Java Modernization Wave: Both Iceberg and Parquet are executing major Java version transitions, with Parquet completing its move to Java 11 and Iceberg discussions pointing toward Java 17 as a future baseline. This coordinated modernization across the lakehouse stack enables cleaner dependency management, access to modern language features, and improved performance characteristics.
Community Infrastructure Investment: From the Iceberg Summit and local meetups to dedicated Spark integration syncs and expanded testing infrastructure for Polaris, all four projects demonstrate strong investment in community engagement and production-readiness validation. These investments in "soft infrastructure" complement the technical advances and help translate mailing list discussions into practical implementation guidance.
Release Cadence Maturity: Two major releases shipped in mid-January, Arrow 23.0.0 and Parquet 1.17.0, reflecting the mature release cadence across these projects. Regular, predictable releases with thorough community validation demonstrate that these projects have moved beyond experimental phases into stable, production-grade platforms.
Looking Ahead
The coming weeks will see Iceberg Summit talk selections, continued development of Spark 4.1 support in Iceberg, and further refinement of Polaris's graduation roadmap. Arrow will continue its quarterly development cycle toward the next release, while the Parquet community supports adoption of 1.17.0 and advances V3 design discussions. As these projects continue evolving in lockstep, the open lakehouse ecosystem solidifies its position as the default architecture for modern analytics workloads.
Key Dates and Events:
- Iceberg Summit 2026: April 8-9, San Francisco
- Next Iceberg-Spark Community Sync: February (monthly cadence)
- Arrow 24.0.0: Expected April 2026 (quarterly cadence)