The post-summit translation work that has dominated 2026 turned into shipped artifacts this week. Iceberg cut a 1.11.0 release candidate on the strength of weeks of design follow-ups. Polaris published a security-focused 1.4.1 patch release alongside four coordinated CVE disclosures and announced 1.5.0 planning for next week. Arrow's Rust subproject opened three release votes in a single day. Parquet finally shipped Java 1.17.1 after a year between releases and turned its attention to the next wave of format-level proposals. The connective tissue across all four projects: production hardening at scale, AI-workload-driven format design, and the slow consolidation of governance frameworks around AI-assisted contribution.
Apache Iceberg
Iceberg's biggest news this week is the 1.11.0 release candidate that Aihua Xu opened for voting on May 9. The RC1 vote drew rapid engagement from Péter Váry, Yuya Ebihara, Steven Wu, Kevin Liu, Steve Loughran, Russell Spitzer, Amogh Jahagirdar, Talat Uyarer, Manu Zhang, Ajay Yadav, and huaxin gao, with verification work splitting between binary checks, Snowflake build tests, and Trino downstream validation. The thread surfaced enough discussion that Aihua had to address questions about Spark integration coverage, version notes, and licensing audit follow-ups across multiple replies before the vote could close cleanly. This is the first 1.x release of 2026 carrying the full weight of V3 production maturity, and the depth of the verification work reflects how seriously contributors are treating it as a stability anchor while V4 design continues in parallel.
The release candidate landed against a backdrop of unblocking work that Ryan Blue drove through the LICENSE updates thread. The discussion threaded through commentary from Russell Spitzer, Steven Wu, Aihua Xu, Jean-Baptiste Onofré, Steve, roryqi, John Zhuge, Péter Váry, Fokko Driesprong, and Kevin Liu, with the conversation focused on ensuring that 1.11 would ship with a clean LICENSE/NOTICE chain that matched what contributors had actually merged. Apache release engineering depends on these audits being thorough, and the discussion shows the community treating LICENSE correctness as a release blocker rather than a checkbox.
Beyond the release work, the V4 design conversations continued advancing on multiple fronts. Ryan Blue opened a fresh DISCUSS thread on partition tuples in V4 that quickly drew responses from Steven Wu, Anoop Johnson, Amogh Jahagirdar, Russell Spitzer, and Micah Kornfield. The thread is one of the more architecturally significant V4 conversations — partition tuples affect how metadata represents partition state when single-file commits replace manifest lists, and the decision shapes how column statistics, manifest delete vectors, and root manifests interact at scale. Amogh's multiple replies on the thread reflect the same depth of analysis that has anchored his work on the broader one-file commits proposal with Russell.
Ryan also opened a DISCUSS thread on a compact bitmap format that drew engagement from Maximilian Michels, Andrei Tserakhau, Guy Khazma, Anoop Johnson, and Amogh Jahagirdar. The proposal targets one of the practical efficiency issues with the current Roaring bitmap encoding used for delete vectors — for sparse delete sets across very large data files, the metadata overhead matters at scale, and a more compact format that preserves the operational semantics could materially affect the storage footprint of MOR tables in production. Anoop Johnson separately opened a V4 Aggregate Column Stats DISCUSS thread that pushes on the same broader theme — making metadata cheaper to scan as table sizes grow.
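To make the sparse-delete concern concrete, here is a minimal sketch of why spread-out positions are costly in a Roaring-style layout. It relies only on the general Roaring design, in which 32-bit values are grouped into containers keyed by their upper 16 bits and each container carries its own key and bookkeeping. The function name and the sample data are hypothetical, and nothing here reflects the actual compact format under discussion.

```python
def roaring_container_count(positions):
    """Count the containers a 32-bit Roaring bitmap would need.

    Roaring groups values by their upper 16 bits, so every distinct
    value of (position >> 16) requires a separate container.
    """
    return len({pos >> 16 for pos in positions})


# Hypothetical data: 1,000 deleted rows packed together at the start of a file.
dense_deletes = range(1000)

# Hypothetical data: 1,000 deleted rows spread one per 65,536 rows.
sparse_deletes = range(0, 1000 * 65_536, 65_536)

dense_containers = roaring_container_count(dense_deletes)
sparse_containers = roaring_container_count(sparse_deletes)

print(dense_containers, sparse_containers)  # prints "1 1000"
```

Both sets hold the same number of deletes, but the sparse one needs a thousand containers where the dense one needs a single container, which is the kind of per-entry overhead a more compact encoding would aim to reduce.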
The catalog-side design work also stayed active. EJ Wang's First-class Tag concept in Iceberg REST Catalog DISCUSS thread drew responses from Yufei Gu and Andrei Tserakhau, building on the broader labels-and-metadata conversation that Andrei has been driving for months. Steven Wu opened a VOTE on adding the CatalogObjectIdentifier schema that drew binding +1 votes through the week from Yufei Gu, Russell Spitzer, huaxin gao, Christian Thiel, Alexandre Dutra, Jean-Baptiste Onofré, and Steve. Yuya Ebihara's DISCUSS thread on recursive namespace listing drew responses from Ajantha Bhat and Yufei Gu — a quality-of-life REST API change that would matter most to catalogs federating across many tenants. Prashant Singh's DISCUSS thread on credential management for KMS/Vault and table-level encryption pulled in feedback from Sreesh Maheshwar, Chris Lu, Gidon Gershinsky, and Ádám Szita. Alexandre Dutra opened a DISCUSS thread on passing arbitrary information to request signers — a thread that builds on the months of work he's been leading on remote signing semantics.
Yuya Ebihara also opened a DISCUSS thread on adding HashiCorp Vault KMS support that drew engagement from Steve Loughran, Romain Manni-Bucau, and Jean-Baptiste Onofré. Vault is the de facto key management standard for self-hosted environments, and bringing first-class Vault support into the encryption layer closes one of the bigger gaps for teams running Iceberg outside the major cloud KMS providers.
The Rust subproject continued shipping. Shawn Chang opened the Iceberg Rust 0.9.1 release candidate vote, worked through two intermediate RCs before RC3 drew binding +1s from Renjie Liu, Kevin Liu, Matt Butrovich, Jean-Baptiste Onofré, Kurtis Wright, Maximilian Michels, Sung Yun, and Fokko Driesprong, and then announced the 0.9.1 release. This is the fifth Iceberg Rust release in seven months — a cadence the community would not have predicted at the start of 2025. The Rust implementation's DataFusion integration makes it a serious alternative for teams that want Iceberg without a JVM dependency, and the cadence reflects how much of that downstream traffic is actually shipping. Kurtis Wright separately opened a Curiosity thread on checksums in Iceberg libraries that drew responses from Russell Spitzer, Steve Loughran, Daniel Weeks, and Andrei Tserakhau — the kind of cross-library integrity question that matters more as Iceberg deployments expand beyond Java.
The community also welcomed Andrei Tserakhau as a new committer, with congratulations rolling in from Matt Topol, Neelesh Salian, Eduard Tudenhöfner, Amogh Jahagirdar, Micah Kornfield, Kevin Liu, huaxin gao, Steven Wu, Alex Stephen, Sung Yun, Fokko Driesprong, Renjie Liu, Maximilian Michels, Gang Wu, Péter Váry, Drew, Talat Uyarer, Russell Spitzer, Shawn Chang, and Kurtis Wright. Andrei has anchored the labels-in-LoadTableResponse work across the spring, and the committer recognition reflects how that proposal moved from idea to multi-implementation POC across Polaris, Unity Catalog, Lakekeeper, and PyIceberg.
Local meetups continue to anchor community activity. Endi Caushi confirmed the Boston Iceberg meetup for May 6, Lester Martin announced the Atlanta meetup for May 13, Viktor Kessler advertised the Iceberg Community Meetup Europe events in Barcelona and Erlangen, Germany for May plus the June London and Amsterdam meetups, and Danica Fine announced the Seattle Iceberg Community Meetup for June 25. Sung Yun shared the Iceberg Summit 2026 Selection Committee retrospective notes — Kevin Liu's reply suggests retrospectives like this are exactly the kind of community-building work that lets the summit grow without breaking the volunteer model that makes it possible.
Apache Polaris
The defining Polaris story of the week is the 1.4.1 patch release and the four coordinated CVE disclosures that accompanied it. Jean-Baptiste Onofré opened the 1.4.1 RC0 release vote on May 5, drew rapid +1s from Robert Stupp, Dmitri Bourlatchkov, Russell Spitzer, and Sung Yun, and closed the vote successfully before announcing the 1.4.1 release. The patch addresses the KMS upgrade bug, Helm packaging fixes, and the security disclosures that landed alongside it.
The CVE disclosures themselves are the more significant artifact. Jean-Baptiste posted four coordinated advisories: CVE-2026-42809 covering staged table creation credential abuse, CVE-2026-42810 covering literal wildcard handling in IAM resource patterns, CVE-2026-42811 covering scope leakage in GCS credentials, and CVE-2026-42812 covering write.metadata.path protection. Four CVEs in one release window is not a small event for a project that graduated to top-level status three months ago, but the visible coordinated disclosure — patches first, public advisories second — is exactly the security posture enterprise deployments need to see. The bugs concentrate around credential vending and resource pattern handling, which is where most cross-tenant exposure surfaces in a catalog that issues subscoped credentials on behalf of clients.
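As a generic illustration of the bug class behind wildcard handling, the sketch below shows how a literal `*` inside a storage path turns into a wildcard once the path is reused as a match pattern. It is not a reproduction of any of the Polaris CVEs: Python's `fnmatch` stands in for IAM resource matching, and the bucket name, tenant paths, and `escape_glob` helper are all hypothetical.

```python
from fnmatch import fnmatchcase


def escape_glob(literal):
    """Escape glob metacharacters so the string only matches itself.

    fnmatch treats '*', '?' and '[' as special; wrapping each one in
    brackets makes it match the literal character instead.
    """
    return "".join(f"[{ch}]" if ch in "*?[" else ch for ch in literal)


# Hypothetical table location that contains a literal '*' character.
table_location = "s3://lake/tenant-*/orders"

# A file belonging to a different (hypothetical) tenant.
other_tenant_file = "s3://lake/tenant-b/orders/part-0.parquet"

# Naive pattern: the literal '*' silently becomes a wildcard.
naive_pattern = table_location + "/*"
naive_match = fnmatchcase(other_tenant_file, naive_pattern)

# Escaped pattern: only objects under the literal location match.
safe_pattern = escape_glob(table_location) + "/*"
safe_match = fnmatchcase(other_tenant_file, safe_pattern)
own_match = fnmatchcase("s3://lake/tenant-*/orders/part-0.parquet", safe_pattern)

print(naive_match, safe_match, own_match)  # prints "True False True"
```

The design point is that user-controlled strings need escaping before they are embedded in any pattern language that scopes credentials, whatever the concrete escaping syntax of that language happens to be.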
With 1.4.1 out the door, attention turned immediately to the 1.5.0 cycle. Jean-Baptiste opened a DISCUSS thread asking whether Polaris should target 1.5.0 next week. The thread drew responses from Dmitri Bourlatchkov, Ajantha Bhat, Yufei Gu, Alexandre Dutra, and Robert Stupp, with Jean-Baptiste replying multiple times to refine the scope. Monthly release cadence has been the explicit commitment since the project graduated, and the 1.5.0 conversation reflects the community's discipline about staying on that schedule even after a security-focused patch release week.
Anand Kumar Sankaran's thread, "Uptaking 1.4.1 and turning on table metrics persistence," gives a useful window into what production adopters are actually doing with the new release. Dmitri Bourlatchkov's reply walks through the configuration mechanics. It is a small example of the community's responsiveness to deployment-side questions, which would not have had this kind of visible support a year ago, when much of that traffic still went through bilateral vendor channels.
Jean-Baptiste also drafted the Polaris May 2026 board report, which drew commentary from Francois Papon, Robert Stupp, Dmitri Bourlatchkov, Yufei Gu, and Adnan Hemani. The board report is the formal artifact the PMC submits to the ASF board, and the visible drafting on the dev list reinforces the project's open-governance posture. The community also welcomed Sung Yun to the PMC, with Robert Stupp's announcement drawing congratulations from Jean-Baptiste Onofré, Alexandre Dutra, Keith Chapman, James Rowland-Jones, Kevin Liu, Yufei Gu, Dmitri Bourlatchkov, and Michael Collado. Sung Yun has anchored the REST Catalog "Trusted Iceberg Client" terminology work on the Iceberg side and brings strong cross-project coordination credentials to the Polaris PMC.
The DISCUSS pipeline stayed dense. Bill Bejeck opened a "Diagnostics shell prototype for Polaris" thread that drew engagement from Dmitri Bourlatchkov, Jean-Baptiste Onofré, and Yufei Gu. Jean-Baptiste opened a "Polaris server custom assembly tool" thread that drew responses from Dmitri Bourlatchkov and Yufei Gu. Robert Stupp opened a DISCUSS thread on enabling advisory Copilot PR review for documentation and test omissions that drew responses from Jean-Baptiste Onofré, Dmitri Bourlatchkov, Yong Zheng, and Yufei Gu. It is exactly the kind of AI-assistance governance conversation that mirrors the Iceberg and Parquet AI contribution policy work happening in parallel. Robert also opened a "Guardrails for security-sensitive changes" thread that drew a response from Jean-Baptiste Onofré, a conversation that lands with extra weight in the same release window as the four CVE disclosures.
Tornike Gurgenidze opened a DISCUSS thread on storage credential-vending SPI changes, pushing the SPI surface that lets vendors plug in alternative credential vending strategies. Srinivas Rishindra opened a DISCUSS thread on event persistence architecture and a global sanitization pipeline. EJ Wang continued the AI-readability conversation with a DISCUSS thread on linking the AI-generated Code Wiki from project docs. Dmitri Bourlatchkov opened a DISCUSS thread on adjusting the renameTable response code to 204 that drew responses from Nándor Kollár and Yufei Gu, a small REST API conformance question that matters for client interoperability. Anand Kumar Sankaran's "feat: Configurable STS session names" thread drew a reply from Dmitri Bourlatchkov.
Apache Arrow
Arrow's Rust subproject ran its tightest release week of the year. Andrew Lamb opened Apache Arrow Rust 58.3.0 RC1, Apache Arrow Rust 57.3.1 RC1, and Apache Arrow Rust 56.2.1 RC1 within a single planned cluster of patch and minor releases. All three drew rapid verification from Ed Seidl, Bryce Mecum, Raúl Cumplido, and L. C. Hsieh, and Andrew posted the three RESULT messages confirming that every vote had passed. Running three concurrent release votes in a single week is a real engineering benchmark for arrow-rs: it reflects the maintenance load of supporting multiple active release lines (58.x as the current line, 57.x and 56.x as supported back-versions), and it shows that the verification community has reached the scale where this kind of parallel release cadence is actually sustainable. Andrew's earlier heads-up that planned patch releases were coming this week set the cadence expectations clearly.
Beyond Rust releases, the design surface stayed lively. Antoine Pitrou opened a DISCUSS thread on field/schema/custom metadata restriction to UTF8 that drew engagement from Rusty Conover, Raphael Taylor-Davies, and Dewey Dunnington. The thread sits at the intersection of cross-language compatibility and forward extensibility — Arrow metadata that is strictly UTF8 is easier to interoperate across languages, but the constraint also limits what extension types can carry through. Richie Black opened a DISCUSS thread on column default value metadata changes to FlightSql.proto — a JDBC interoperability question that matters for cross-system data engineering as Flight SQL adoption grows.
The pyarrow-stubs donation vote that has been building since Rok Mihevc opened it in April drew further engagement as Rok closed out the vote and confirmed the donation could move toward formal acceptance. The donation effectively brings type stubs for pyarrow into the official Arrow project rather than relying on a community-maintained external repository — a small but meaningful signal that the Python community is investing in pyarrow's type-checking surface as much as its runtime behavior.
The Erlang language binding moved forward. Benjamin Philip continued advancing the Arrow Erlang grant documents thread with Sutou Kouhei, working through the IP grant paperwork that the ASF requires for code donations. Erlang adoption is a niche audience for Arrow, but the grant work is the same procedural foundation that every language expansion follows, and the visibility of the process is what makes Arrow's multi-language footprint sustainable. Antoine Pitrou's announcement of the Apache Arrow / Parquet meetup in Paris drew enthusiasm from Sutou Kouhei and Marc Deveaux — the kind of cross-project meetup that reflects how interleaved the Arrow and Parquet communities have become.
Mandukhai Alimaa's DISCUSS thread on a canonical BigDecimal extension type and Andrew Lamb's arrow-rs security policy discussion continued threading toward production hardening. The security policy work in particular reflects a project that is being deployed in commercial-grade scenarios where formal vulnerability disclosure paths matter as much as the underlying code quality. The Nishant Avasthi-led DISCUSS on adding Apache Arrow support for IBM Db2 via ADBC drew a response from Ian Cook — another quietly significant adoption signal, since Db2 brings a category of enterprise mainframe and traditional database workloads into Arrow's data interchange footprint.
Apache Parquet
Parquet finally shipped its first Java patch release of the year. Gang Wu opened the parquet-java 1.17.1 RC0 vote, drew +1s from Steve Loughran, Fokko Driesprong, Russell Spitzer, Daniel Weeks, and Xinli shang, and announced the 1.17.1 release. The release closes the gap that opened after parquet-java 1.17.0 shipped in January 2026, and resolves the long-running DISCUSS thread on a new parquet-java release that Manu Zhang opened in late March. The release cadence question has been one of the more honest community conversations of the year: parquet-java has historically shipped less frequently than its sister projects, and Manu's thread surfaced the real costs (slower bug-fix delivery, encoding feature lag) that justified the patch release effort.
Russell Spitzer's DISCUSS thread on GH-3547 automated release for Parquet drew engagement from Arnav Balyan, Gang Wu, and Fokko Driesprong on the broader question of how to bring parquet-java's release infrastructure closer to the cadence of the format and Rust projects. Automated release tooling is what eventually makes monthly-class release cadences sustainable, and the Polaris release engineering work has been one of the visible reference implementations the Parquet community can point at when designing its own approach.
The community welcomed Ed Seidl as a new committer, with Micah Kornfield announcing the recognition and congratulations following from Andrew Lamb, Gang Wu, and Raúl Cumplido. Ed has driven multiple format-level threads through 2026, including the path_in_schema optionality proposal and engaged commentary on the FlatBuffer footer redesign work. The committer recognition reflects sustained spec-design work over the year.
The format-level conversation continued at high intensity. Will Edwards opened a DISCUSS thread on how readers handle Parquet files with future extensions — a forward-compatibility question that becomes more urgent as the format adds Variant, Geospatial, and now the proposed File logical type. Daniel Weeks opened a DISCUSS thread on supporting non-contiguous pages that drew engagement from Andrew Bell, Adrian Garcia Badaracco, Micah Kornfield, Will Edwards, and Andrew Lamb. Non-contiguous pages would let writers split a column's pages across the file rather than requiring them adjacent — a non-trivial format change with implications for how readers do projection pushdown and how object storage range reads can be coalesced. Andrew Bell's separate Wide Schemas thread drew responses from Andrew Lamb, Adrian Garcia Badaracco, and Steve Loughran — wide-table workloads are precisely the AI/ML use case that has been pulling format design forward all year.
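The range-read implication can be sketched with a small coalescing routine. This is a generic illustration of how a reader might merge nearby byte ranges before issuing object-storage requests, not code from any Parquet implementation. The `coalesce_ranges` helper, the page offsets, and the gap thresholds are all hypothetical.

```python
def coalesce_ranges(ranges, max_gap):
    """Merge (offset, length) byte ranges separated by at most max_gap bytes.

    Returns a sorted list of (offset, length) tuples, each of which
    could be fetched with a single ranged read.
    """
    merged = []  # list of [start, end) pairs
    for offset, length in sorted(ranges):
        end = offset + length
        if merged and offset - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Hypothetical layout: three adjacent pages of one column.
contiguous_pages = [(0, 100), (100, 100), (200, 100)]

# Hypothetical layout: the same three pages scattered across the file.
scattered_pages = [(0, 100), (5000, 100), (10000, 100)]

adjacent_reads = coalesce_ranges(contiguous_pages, max_gap=1024)
scattered_reads = coalesce_ranges(scattered_pages, max_gap=1024)
wide_gap_reads = coalesce_ranges(scattered_pages, max_gap=8192)

print(adjacent_reads)   # prints [(0, 300)]
print(scattered_reads)  # prints [(0, 100), (5000, 100), (10000, 100)]
print(wide_gap_reads)   # prints [(0, 10100)]
```

With adjacent pages one request covers the column, while scattered pages force the reader to choose between more requests and fetching bytes it does not need, which is the trade-off the thread has to weigh.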
Micah Kornfield's DISCUSS thread on remaining open spec-level questions for ALP drew engagement from Andrew Lamb on closing out the ALP encoding work that passed an earlier vote. Arnav Balyan opened a DISCUSS thread on adding AGENTS.md to parquet-java that drew responses from Aaron Niskode-Dossett, Andrew Lamb, and Micah Kornfield, and a separate DISCUSS thread on an AI tooling policy for Parquet that drew a response from Fokko Driesprong. The AGENTS.md and AI tooling policy threads are the Parquet community's version of the same conversation Iceberg and Polaris have been running in parallel — how AI-assisted contribution fits into Apache governance and what guardrails the project wants to set.
Martin Prammer's Datasets Project — Raincloud thread drew responses from Arnav Balyan on the broader question of how the Parquet community thinks about reference datasets for testing and benchmarking. Dewey Dunnington's Geography test files with statistics thread continued the geospatial spec stabilization work. Julien Le Dem ran the regular Parquet sync on May 6, with the meeting notes thread setting the agenda for the format-level work that played out across the rest of the week.
Cross-Project Themes
The clearest pattern this week is the maturation of release engineering across all four projects. Iceberg cut a major 1.11.0 RC. Polaris shipped a patch release coordinated with four CVE disclosures and immediately planned the next monthly minor. Arrow ran three Rust release votes in parallel. Parquet shipped its first patch release of the year and started seriously planning automated release tooling. Each project is operating at a release cadence that would have been hard to sustain a year ago, and the cumulative effect is a lakehouse stack where every component is shipping at predictable, professional intervals. That is the difference between a research stack and infrastructure, and the lakehouse stack has clearly crossed into the latter.
The second pattern is the consistent attention to AI-assisted contribution governance. Iceberg has the published AI contribution policy work that Holden Karau, Kevin Liu, Steve Loughran, and Sung Yun pushed through March. Polaris ran the AI-generated Code Wiki linking thread and the advisory Copilot PR review thread this week. Parquet ran the AGENTS.md and AI tooling policy threads in parallel. These conversations are not happening in isolation; they reflect a coordinated community position that AI tools are welcome in Apache contribution flows but require explicit governance, disclosure, and review patterns. The pattern is consistent enough across projects that it looks like an emerging Apache-wide norm rather than three parallel one-offs.
The third pattern is the continued translation of AI workload pressure into format-level proposals. Iceberg's compact bitmap format, V4 aggregate column stats, and partition tuples work all push on metadata efficiency at scale. Parquet's wide schemas thread, the File logical type proposal that's still resolving, and the non-contiguous pages discussion all target the AI workloads where data shapes don't match the assumptions the format was originally designed around. Arrow's BigDecimal canonical extension type and the metadata UTF8 thread both push on cross-language interop for AI-pipeline data. These are not coincidences; they're three projects responding to the same pressure from the same workloads, on the same timeline.
The fourth pattern is the visible security maturation across the stack. Polaris's four CVE disclosures landed with formal coordinated advisories, separate patches, and clear remediation guidance. Iceberg's KMS/Vault credential management work, Arrow's security policy discussion, and Parquet's careful handling of forward-compatibility questions all reflect a shared posture that the lakehouse stack is being deployed in environments where formal security disclosure paths are required, not optional. The CVEs being public is itself a healthy signal — projects with bad security posture don't disclose, they hide.
Looking Ahead
Watch for the Iceberg 1.11.0 vote to close and the release to ship in the coming days. The Polaris 1.5.0 planning email is likely the next major release-side artifact, with the 1.4.1 security work as the floor and 1.5.0 feature scoping as the ceiling. The Iceberg V4 design work — partition tuples, compact bitmap format, aggregate column stats — is converging toward formal proposals that should land in the coming weeks alongside the long-anticipated formal single-file commits write-up. The Parquet release engineering automation thread should converge into a concrete proposal, and the AGENTS.md and AI tooling policy discussions should harden into something the community can adopt.
On the Arrow side, the field/schema metadata UTF8 thread and the FlightSql column default value proposal both look ready to mature into more formal proposals, and the pyarrow-stubs donation should land formally. The Iceberg Community Meetup Europe series across Barcelona, Erlangen, London, Amsterdam, and Basel — plus the Atlanta and Seattle North American meetups — will continue translating the dev-list conversations into in-person community building. Iceberg Summit 2026 session recordings will continue rolling out on YouTube, and the next round of Apache board reports across all four projects will set the formal narrative for what shipped in May.
Resources & Further Learning
Get Started with Dremio
- Try Dremio Free — Build your lakehouse on Iceberg with a free trial
- Build a Lakehouse with Iceberg, Parquet, Polaris & Arrow — Learn how Dremio brings the open lakehouse stack together
Free Downloads
- Apache Iceberg: The Definitive Guide — O'Reilly book, free download
- Apache Polaris: The Definitive Guide — O'Reilly book, free download
Books by Alex Merced