- Free Apache Iceberg Course
- Free Copy of “Apache Iceberg: The Definitive Guide”
- Free Copy of “Apache Polaris: The Definitive Guide”
- Purchase "Architecting an Apache Iceberg Lakehouse" (50% Off with Code MLMerced)
- 2025 Apache Iceberg Architecture Guide
- Iceberg Lakehouse Engineering Video Playlist
- Ultimate Apache Iceberg Resource Guide
Release and community updates
- Polaris 1.0.1‑incubating released – Jean‑Baptiste Onofré announced that the vote for Polaris 1.0.1‑incubating (rc0) passed (+1 binding votes from JB Onofré, Ryan Blue and Kent Yao, non‑binding +1 from Robert Stupp, Dmitri Bourlatchkov, Ed Espino). He reported that the release was complete (thread).
- Planning the 1.1.0 release – Pierre Laporte offered to serve as release manager or shadow current release managers because parts of the release process were not well documented. Yufei Gu supported the idea and suggested Pierre either become release manager or shadow Jean‑Baptiste (thread).
API and code‑level discussions
- Integration points in Polaris – Robert Stupp asked which parts of the codebase should be considered stable integration points. Dennis Huo suggested marking service‑provider interfaces (SPIs) such as `ActiveRolesProvider`, `CallContextResolver` and `RateLimiter` as public extension points while leaving other components internal. Michael Collado emphasised that dependencies should be explicit rather than hidden behind CDI injection. Yufei Gu preferred a convention‑based approach for marking experimental components. Dmitri Bourlatchkov proposed creating a separate module for SPIs and acknowledged that clarifying APIs might be turbulent but worthwhile (thread).
- Removing `CallContext.copy()` – Robert Stupp proposed eliminating the `copy()` method from `CallContext` since it was used only by tasks. Yufei Gu argued that asynchronous tasks need a fresh `RealmContext` and removing `copy()` could complicate the code. Dmitri Bourlatchkov and Yun Zou felt tasks should not serialise the entire `CallContext`; they suggested defining task parameters and using CDI instead. Dennis Huo noted that async tasks still require a serialisable context and recommended documenting a standard way to pass execution context to tasks. Robert clarified that the PR simply retrieved `RealmContext` by ID for the purge‑table task rather than redesigning the task framework (thread).
- Storage credential retrieval – Robert Stupp noted that obtaining storage credentials for tasks required numerous database round trips. He proposed decoupling storage credential retrieval from persistence, removing unnecessary `loadEntity` calls and using a `PolarisFileIOSupplier` to provide `FileIO` objects directly (thread).
- S3 credential vending without STS – Pat Patterson (Backblaze) explained that remote signing should encode the table or catalog information in the URL and use a catalog‑owned key to sign requests. Alexandre Dutra shared an initial patch enabling remote signing in Polaris and noted that it introduces a new table privilege and cannot rely on the default STS endpoint because that endpoint lacks catalog context. Yufei Gu thanked Robert and Alexandre, emphasising that remote signing is a major feature and should be discussed via a design document (thread).
- Feature flag for purging views – Dmitri Bourlatchkov proposed adding a `PURGE_VIEWS_ON_DROP` feature flag that would allow dropping views when `DROP_WITH_PURGE_ENABLED` is false (thread).
- Role‑ARN optional for S3 – Robert Stupp pointed out that requiring a `roleArn` in the S3 catalog API is redundant and problematic when using Amazon’s IRSA or S3‑compatible storage like Minio or Ceph. He referenced an issue and PR to make `roleArn` optional. Dmitri Bourlatchkov agreed that this change simplifies the REST API and improves the user experience for non‑AWS storage (thread).
- Delegation service for long‑running tasks – Responding to William Hyun’s proof‑of‑concept for a delegation service, Robert Stupp expressed concerns that the PoC described synchronous purging before `dropTable` returns, which might not work if metadata and credentials are only available after deletion. He emphasised that tasks must survive Polaris restarts and suggested improving the existing purge‑table‑files mechanism rather than relying on synchronous tasks (thread).
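To make the `CallContext.copy()` debate more concrete, here is a minimal sketch of the alternative raised in the thread: persist only small, explicit task parameters and look the realm up by ID when the task runs. This is an illustration in Python, not Polaris code (Polaris itself is written in Java). `PurgeTableTaskParams`, `REALMS`, `serialize` and `run_purge_task` are hypothetical names invented for this example, and the `RealmContext` class is only a stand-in for the real type.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RealmContext:
    """Stand-in for a realm context; only the identifier matters here."""
    realm_id: str


@dataclass(frozen=True)
class PurgeTableTaskParams:
    """Hypothetical task parameters: only what the task needs to run."""
    realm_id: str
    table_name: str


# Hypothetical registry of known realms, keyed by realm ID.
REALMS = {"realm-a": RealmContext("realm-a")}


def serialize(params: PurgeTableTaskParams) -> str:
    """Persist the small parameter object instead of a whole call context."""
    return json.dumps(asdict(params))


def run_purge_task(payload: str) -> str:
    """Rebuild the parameters and resolve the realm by ID at execution time."""
    params = PurgeTableTaskParams(**json.loads(payload))
    realm = REALMS[params.realm_id]  # looked up by ID, nothing copied
    return f"purging {params.table_name} in {realm.realm_id}"


payload = serialize(PurgeTableTaskParams("realm-a", "db.events"))
print(run_purge_task(payload))
```

The persisted payload stays small and survives a restart, because the only thing the task needs from the original request is an identifier it can resolve again later.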
Operational issues and bug reports
- Purge table OOMs – Robert Stupp reported that the purge‑table task can cause out‑of‑memory errors because manifest files are stored as base64 strings in the `TableDropData` tasks and persisted state. He summarised how manifest pruning loads the entire snapshot metadata into memory and proposed optimising this to avoid OOMs (thread).
- Schema setup in admin tools – Alexandre Dutra raised the issue of setting up database schemas via the admin tool. Dmitri Bourlatchkov summarised prior discussions and proposed removing bootstrap calls from persistence APIs, controlling auto‑bootstrap via CDI, and adding CLI options to opt in/out of schema changes. Robert Stupp supported the proposal but noted that realms are statically defined via configuration, suggesting that dynamic realm management could be a separate project (thread).
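A rough sketch of why base64‑encoded manifests inside task state get expensive: base64 turns every 3 bytes into 4 characters, so each embedded manifest is about a third larger than the file itself, and all of it sits in memory whenever the task state is loaded. The Python below is only an illustration. The payload shapes and the "store a reference instead" variant are assumptions made for this example, not the optimisation proposed in the thread, and the S3 path is made up.

```python
import base64
import json


def inline_task_payload(manifest_bytes: bytes) -> str:
    """Embed the manifest in the task state as a base64 string."""
    encoded = base64.b64encode(manifest_bytes).decode("ascii")
    return json.dumps({"manifest_b64": encoded})


def reference_task_payload(manifest_path: str) -> str:
    """Hypothetical alternative: persist only the manifest location."""
    return json.dumps({"manifest_path": manifest_path})


manifest = bytes(3_000_000)  # stand-in for a 3 MB manifest file
inline = inline_task_payload(manifest)
by_ref = reference_task_payload("s3://bucket/warehouse/tbl/metadata/m0.avro")

# Compare raw size, inlined payload size, and reference payload size.
print(len(manifest), len(inline), len(by_ref))
```

With many manifests per table, the inlined form multiplies quickly, while a reference‑style payload stays a few dozen bytes regardless of manifest size.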
Community and meetups
- Seattle meetup – Danica Fine proposed hosting the first Polaris‑specific meetup in Seattle. Community members strongly supported the idea: Yufei Gu gave +1 to the meetup and suggested adopting guidelines from Apache Iceberg; Alex Merced offered to look for speakers; Prashant Singh and Dmitri Bourlatchkov supported the meetup and the idea of publishing guidelines; Jean‑Baptiste Onofré reminded everyone that the project management committee must approve the meetup and offered to give a talk on ASF governance (thread).
Overall, the week’s discussions focused on refining Polaris’s API surface and implementation details (storage credential retrieval, task frameworks, authentication), addressing operational issues like purge‑table OOMs and schema setup, planning the next release, and strengthening community engagement through meetups and collaboration.