Get Data Lakehouse Books:
- Apache Iceberg: The Definitive Guide
- Apache Polaris: The Definitive Guide
- Architecting an Apache Iceberg Lakehouse
- The Apache Iceberg Digest: Vol. 1
Lakehouse Community:
- Join the Data Lakehouse Community
- Data Lakehouse Blog Roll
- OSS Community Listings
- Dremio Lakehouse Developer Hub
The second week of January brings continued momentum across Apache Iceberg, Polaris, Arrow, and Parquet as the community transitions from holiday mode into active development. This week featured key governance discussions, community organizing, and technical proposals that will shape the lakehouse ecosystem throughout 2026.
Apache Iceberg
Iceberg-Spark Community Sync Established: Anurag Mantripragada proposed a dedicated monthly sync for Spark-Iceberg integration, separate from the main community sync. The proposal gained immediate support from Anton Okolnychyi and Kevin Liu, who scheduled the first "Iceberg-Spark Community Sync" for January 20th (10-11am PT). The sync will cover ongoing work including sort order reporting, Spark 4.1 support, and the future of the DataFusion Comet integration. The specialized forum gives Spark-specific topics the depth they need while keeping the main community sync focused on broader discussions.
Project Blog Launch Vote Passes: Following positive discussion, Kevin Liu called a formal vote to establish an official Apache Iceberg blog at iceberg.apache.org/blogs/. The vote passed with multiple binding and non-binding +1s from community members including Russell Spitzer, Steven Wu, and others. The first post will promote the Iceberg Summit 2026, demonstrating the blog's role in community announcements and project updates.
OAuth2 Manager v2 Proposal Discussion: Contributors continued refining the OAuth2 Manager v2 design document, which overhauls Iceberg's authentication manager with a phased migration approach spanning multiple minor versions. Christian Thiel provided feedback questioning whether legacy token-exchange behavior needs migration to the new manager, as the endpoint has been deprecated for over 1.5 years. Discussion is scheduled for the January 14 catalog sync, where the community will determine which legacy flows to preserve.
Summit CFP Reminder: With the January 18th deadline approaching, the community continues rallying speakers for the April 8-9 Iceberg Summit in San Francisco. Robin Moffatt inquired about the selection committee composition, with Jean-Baptiste Onofré confirming Russell Spitzer as the main PMC contact and noting that committee affiliations will be clearly listed in the final proposal.
Apache Polaris
Polaris development maintained steady progress through the holiday transition period, with the community focusing on consolidating recent 1.3.0-incubating features rather than introducing new major changes.
Graduation Momentum: Regular community syncs and development sprints continued through early January, with the expanding PPMC reflecting healthy governance maturation. The project's Generic Table capability, which enables cataloging external formats like Apache Hudi and Delta Lake, is expected to graduate from beta in the upcoming release.
Integration Testing Expansion: With AWS credits now available to the project, contributors discussed expanding integration testing against real cloud infrastructure, particularly for IAM AssumeRole flows and credential vending scenarios that are difficult to simulate locally. This infrastructure investment will improve production-readiness validation.
Apache Arrow
Arrow entered 2026 with stable leadership and continued focus on multi-language consistency.
Leadership Continuity Confirmed: Antoine Pitrou, a long-time Arrow core contributor, was formally appointed PMC Chair, reinforcing governance stability and keeping technical direction in the hands of an experienced maintainer. The appointment supports strategic continuity as Arrow expands its role as the universal columnar interchange layer.
Format Enhancements Continue: The project advanced work on format improvements including timezone support in temporal types and enhanced compute functions. These incremental updates maintain Arrow's position as the in-memory standard for analytics workloads across engines and languages.
Apache Parquet
Board Report Draft Circulated: Julien Le Dem shared the draft January board report for community review ahead of the January 14th submission deadline and January 21st board meeting. Fokko Driesprong reviewed and approved the report, which will cover recent release activity and community health metrics.
1.17.0 Release Finalized: Following the January 2nd vote passage, contributors verified signatures and conducted final release validation. The release officially drops Java 8 support in favor of Java 11 as the minimum runtime, representing a significant modernization milestone for the Parquet ecosystem.
FSST Encoding Progress: Design discussions around FSST (Fast Static Symbol Table) compression for string and byte array encoding advanced, with contributors exploring how to efficiently share compressed dictionaries across multiple column pages to reduce file size for string-heavy workloads.
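To make the idea concrete, here is a toy Python sketch of symbol-table string compression in the spirit of FSST. It is not the Parquet proposal or the real FSST algorithm: the function names (`build_symbol_table`, `encode`, `decode`) are hypothetical, and the table is built with a naive frequency heuristic rather than FSST's iterative training. It only illustrates the general shape: frequent substrings of up to 8 bytes get 1-byte codes, one code is reserved as an escape for literal bytes, and a single table can be reused across several pages.

```python
from collections import Counter

ESCAPE = 255          # code reserved to mean "the next byte is a literal"
MAX_SYMBOL_LEN = 8    # symbols are 1 to 8 bytes long


def build_symbol_table(samples, max_symbols=255):
    """Naive heuristic (an assumption, not FSST's training loop):
    keep the substrings with the highest frequency * length."""
    counts = Counter()
    for s in samples:
        for n in range(1, MAX_SYMBOL_LEN + 1):
            for i in range(len(s) - n + 1):
                counts[s[i:i + n]] += 1
    ranked = sorted(counts, key=lambda sym: (-counts[sym] * len(sym), sym))
    return ranked[:max_symbols]  # list position doubles as the 1-byte code


def encode(data, table):
    """Greedy longest-match encoding; unknown bytes are escaped."""
    index = {sym: code for code, sym in enumerate(table)}
    out = bytearray()
    i = 0
    while i < len(data):
        for n in range(min(MAX_SYMBOL_LEN, len(data) - i), 0, -1):
            code = index.get(data[i:i + n])
            if code is not None:
                out.append(code)
                i += n
                break
        else:
            # No symbol starts here: emit the escape code plus the raw byte.
            out += bytes([ESCAPE, data[i]])
            i += 1
    return bytes(out)


def decode(encoded, table):
    """Decoding is a plain table lookup per code."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        if code == ESCAPE:
            out.append(encoded[i + 1])
            i += 2
        else:
            out += table[code]
            i += 1
    return bytes(out)


# One table shared by two hypothetical "pages" of string values.
pages = [
    [b"https://iceberg.apache.org", b"https://parquet.apache.org"],
    [b"https://arrow.apache.org"],
]
table = build_symbol_table([value for page in pages for value in page])
encoded_pages = [[encode(value, table) for value in page] for page in pages]
decoded_pages = [[decode(e, table) for e in page] for page in encoded_pages]
```

Because decoding needs only the table, storing it once and referencing it from many pages amortizes its cost, which is the trade-off the design discussion is weighing.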
Cross-Project Themes
Java Modernization Wave: Both Iceberg and Parquet are raising their minimum Java versions (Parquet to Java 11, with Iceberg considering Java 17), enabling modern language features and cleaner dependency management. The parallel moves reflect ecosystem maturity and a willingness to drop legacy runtime support.
Community Infrastructure Investment: From Iceberg's specialized Spark sync and project blog to Polaris's expanded testing infrastructure, all projects are investing in community mechanisms that translate technical discussions into practical implementation guidance and improved engagement.
Format Evolution Balancing Act: While Iceberg explores V4 features and Parquet scopes V3 possibilities, both projects are balancing innovation against stability, ensuring production users have fully featured, stable platforms before breaking changes are introduced.
Looking Ahead
The week ahead brings several milestones: the Parquet board report is due on January 14th, the Iceberg Summit CFP closes on January 18th, and the first Iceberg-Spark community sync takes place on January 20th. The Atlanta Iceberg meetup on January 21st will continue the grassroots community building that grew throughout 2025.
As the lakehouse ecosystem matures, these governance, community, and technical foundations position Apache Iceberg, Polaris, Arrow, and Parquet for another year of production-grade innovation and ecosystem growth.