Sagara

Personal Picks: Data Product News (October 1, 2025)

Modern Data Stack Information Summary - October 2025

This article is an English translation of the original Japanese version: https://dev.classmethod.jp/articles/modern-data-stack-info-summary-20251001/

Hello, this is Sagara.

As a consultant specializing in the Modern Data Stack, I see a tremendous amount of new information about the ecosystem every day.

From that flood of information, this article summarizes the Modern Data Stack updates that caught my attention over the past two weeks.

Disclaimer: This is not an exhaustive list of all the latest updates for the mentioned products. The information included here is based on my personal judgment and interests.

Modern Data Stack General

Launch of "Open Semantic Interchange (OSI)"

Snowflake, Salesforce, dbt Labs, and other companies have announced the launch of an open-source initiative called "Open Semantic Interchange (OSI)" to promote data utilization for AI.

The initiative aims to standardize Semantic Layer definitions, which are currently fragmented across products, through a vendor-neutral open specification, and thereby establish a common semantic framework.

The following vendors are listed as Launch Partners:

[Figure: Launch Partners listed in the Open Semantic Interchange press release]

Below are the press releases from Snowflake and Salesforce regarding this announcement:

https://www.snowflake.com/en/news/press-releases/snowflake-salesforce-dbt-labs-and-more-revolutionize-data-readiness-for-ai-with-open-semantic-interchange-initiative/

https://www.salesforce.com/blog/agentic-future-demands-open-semantic-layer/

Other participating vendors have also published blog posts about the announcement, and I found Select Star's post particularly interesting. As the figure quoted from their blog shows, if this vision is realized, Select Star could act as a hub that carries Semantic Layer definitions to BI tools that are not participating in the Open Semantic Interchange initiative. I find that exciting.

https://www.selectstar.com/resources/snowflake-ai-ready-semantic-model

[Figure: diagram quoted from Select Star's blog post]

"Everyone's Strongest Data Platform Architecture Vol. 5 - All-Star Special!!" Event Held

On September 25th, "Everyone's Strongest Data Platform Architecture Vol. 5 - All-Star Special!!" was held.

https://datatech-jp.connpass.com/event/360596/

The event had over 100 in-person attendees and more than 500 online participants. You can get a sense of the event's excitement by checking the hashtag "みん強" (Min-Kyō) at the following link:

https://x.com/hashtag/%E3%81%BF%E3%82%93%E5%BC%B7?src=hashtag_click

Below are links to the presentation materials from each speaker that I was able to find:

https://speakerdeck.com/kaz3284/minqiang-di-5hui-kubellnodetaji-pan-kai-fa-nozui-xin-zhuang-kuang-toainohuo-yong-noshi-jian-nituite

https://speakerdeck.com/tenajima/data-vaultwoyong-itamarutipurodakutonotamenodetaji-pan-kai-fa

https://speakerdeck.com/pei0804/revops-practice-learned

https://speakerdeck.com/foursue/20250924-lt2ben-yaru

https://speakerdeck.com/genshun9/minqiang-nokoremadetokorekara

Data Extract/Load

Airbyte

Airbyte 2.0 Released

Airbyte has released version 2.0, a major version upgrade. (Version 2.0 of the OSS edition has not been released yet.)

https://airbyte.com/v2

https://airbyte.com/blog/airbyte-2-0

According to the links above, the release includes the following:

  • Enterprise Flex: An architecture that separates the control plane and data plane, providing a hybrid model where management is done in the cloud while actual data remains within the customer's infrastructure
  • Data Activation: A feature that directly syncs insights from data warehouses to business applications like Salesforce and HubSpot. This allows the Reverse ETL process to be completed within the platform
  • Speed: Connector architecture has been redesigned to improve data sync speed by 4-10x. For example, MySQL to S3 sync is 4.7x faster, and Postgres to S3 is 12x faster
  • New Pricing Plans: A new plan structure tailored to team growth stages. The "Capacity Based Pricing" introduced for Pro plans and above is particularly notable, as it's based on required parallel processing capacity (Data Workers) rather than data transfer volume
    • Core (formerly OSS): Free open-source version
    • Standard (formerly Cloud): Pay-as-you-go managed service
    • Pro (formerly Teams): Capacity-based pricing with governance features like RBAC and SSO
    • Enterprise Flex: All Pro features plus the ability to deploy data planes anywhere (cloud, multi-cloud, or on-premises)
    • Self-Managed Enterprise: Fully self-managed enterprise version for organizations with strict security requirements

Data Warehouse/Data Lakehouse

Snowflake

FILE Data Type Generally Available

The FILE data type for handling unstructured data in Snowflake is now generally available.

Now that it is GA, you can confidently use the generative AI functions of Cortex AISQL on images and document files!

https://docs.snowflake.com/en/release-notes/2025/other/2025-09-25-file-data-type-ga

Cortex Analyst Feature Enhancements

Cortex Analyst has gained two new features. Derived metrics already existed in other Semantic Layers, and real business questions often require calculations across multiple metrics, so this is a welcome addition!

  • Private facts and metrics: Facts and metrics that are defined in the Semantic Model but cannot be queried directly by end users (primarily intended for building blocks used only inside derived metrics)
  • Derived metrics: A new type of metric defined as a calculation over multiple other metrics

https://docs.snowflake.com/en/release-notes/2025/other/2025-09-30-semantic-model-improvements
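
To make the relationship between private metrics and derived metrics concrete, here is a minimal sketch in plain Python. It is not Cortex Analyst's API or its semantic model YAML format; the class names (Metric, DerivedMetric, ToySemanticModel) and the metric names are hypothetical and exist only to illustrate the idea that a derived metric can read a private metric internally while end users cannot query that private metric directly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Row = Dict[str, float]


@dataclass(frozen=True)
class Metric:
    """A base metric: an aggregation over rows (hypothetical illustration)."""
    name: str
    aggregate: Callable[[Sequence[Row]], float]
    private: bool = False  # private metrics cannot be queried directly


@dataclass(frozen=True)
class DerivedMetric:
    """A metric computed from the values of other metrics."""
    name: str
    inputs: Tuple[str, ...]
    combine: Callable[..., float]


class ToySemanticModel:
    """Toy stand-in for a semantic model; not Snowflake's implementation."""

    def __init__(self, metrics: Sequence[Metric], derived: Sequence[DerivedMetric]) -> None:
        self._metrics = {m.name: m for m in metrics}
        self._derived = {d.name: d for d in derived}

    def query(self, name: str, rows: Sequence[Row]) -> float:
        if name in self._derived:
            d = self._derived[name]
            # Derived metrics may read private inputs internally.
            values = [self._metrics[i].aggregate(rows) for i in d.inputs]
            return d.combine(*values)
        metric = self._metrics[name]
        if metric.private:
            raise PermissionError(f"metric {name!r} is private")
        return metric.aggregate(rows)


model = ToySemanticModel(
    metrics=[
        Metric("total_revenue", lambda rows: sum(r["revenue"] for r in rows)),
        Metric("total_cost", lambda rows: sum(r["cost"] for r in rows), private=True),
    ],
    derived=[
        DerivedMetric(
            "profit_margin",
            inputs=("total_revenue", "total_cost"),
            combine=lambda revenue, cost: (revenue - cost) / revenue,
        )
    ],
)

rows = [{"revenue": 100.0, "cost": 60.0}, {"revenue": 50.0, "cost": 30.0}]
print(model.query("profit_margin", rows))
```

In this toy model, querying profit_margin works because it is derived from total_revenue and total_cost, while querying total_cost directly raises PermissionError because it is marked private.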

dbt Projects on Snowflake Now Supports docs generate

dbt Projects on Snowflake received an unannounced update that adds support for dbt docs generate.

I haven't tested it yet, but if you host dbt docs with GitHub Actions, this should let you run docs generate through the EXECUTE DBT PROJECT command, eliminating the need to rewrite profiles.yml for dbt Core!

https://x.com/SS_chneider/status/1973154146976145839

Claude Sonnet 4.5 Now Available in Snowflake

Claude Sonnet 4.5 is now available within Snowflake. At the time of writing, the official documentation does not mention it yet.

It can also be used from regions where the model is not natively supported by enabling cross-region inference.

https://www.snowflake.com/en/blog/cortex-ai-claude-sonnet-4-5/

SELECT's Summary Article on Snowflake Features Released in Summer 2025

SELECT has published a summary article on Snowflake features released in summer 2025.

https://select.dev/posts/snowflake-summer-2025-product-updates

Best Practices Article for Combining Snowflake × Power BI

phData has published a best practices article for combining Snowflake × Power BI.

The article mainly covers the following topics:

  • Use Power BI's native Snowflake Connector
  • Carefully select connection mode (Import, DirectQuery, or Composite) based on use case
  • Properly model data, including adopting star schema
  • Configure Microsoft Entra SSO for Snowflake
  • Use appropriate Azure VMs for gateways
  • Minimize distance between Snowflake and Power BI data centers
  • Increase concurrent query limits for data models
  • Leverage AI features like Copilot

https://www.phdata.io/blog/how-to-optimize-power-bi-and-snowflake-for-advanced-analyitcs/

BigQuery

Column-Level Lineage Now Available in Dataplex

Column-level lineage views are now generally available in Dataplex.

https://cloud.google.com/dataplex/docs/release-notes#September_29_2025

https://cloud.google.com/dataplex/docs/lineage-views#column-level-lineage

[Figure: column-level lineage view in Dataplex]

Array Unnesting Feature Using Gemini Released

A Gemini-powered feature has been released that unnests an array, expanding each element into its own row.

https://cloud.google.com/bigquery/docs/release-notes#September_29_2025

https://cloud.google.com/bigquery/docs/data-prep-get-suggestions#unnest-arrays
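
For readers unfamiliar with unnesting, the following is a minimal sketch of the transformation itself in plain Python. It does not call BigQuery or Gemini; the function name unnest and the sample data are hypothetical, and the choice to drop rows whose array is empty is an assumption made for this sketch.

```python
from typing import Any, Dict, Iterable, Iterator


def unnest(rows: Iterable[Dict[str, Any]], array_column: str) -> Iterator[Dict[str, Any]]:
    """Yield one output row per array element, copying the other columns."""
    for row in rows:
        for element in row[array_column]:
            flat = {k: v for k, v in row.items() if k != array_column}
            flat[array_column] = element
            yield flat


# Hypothetical sample data: one row per order, with an array of items.
orders = [
    {"order_id": 1, "items": ["apple", "pear"]},
    {"order_id": 2, "items": ["kiwi"]},
    {"order_id": 3, "items": []},  # produces no output rows in this sketch
]

for flat_row in unnest(orders, "items"):
    print(flat_row)
```

Each order with N items becomes N rows, one per item, with order_id repeated on every row.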

Summary Article on New BigQuery SQL Features

yu yamada from Google Cloud has published an article summarizing five new features related to BigQuery SQL, including UNION based on column names and simplified array operations.

https://zenn.dev/google_cloud_jp/articles/3b20a94df7624e

Databricks

Databricks One in Public Preview

"Databricks One," a simple user interface designed for business users, has entered public preview.

https://docs.databricks.com/aws/ja/workspace/databricks-one

As shown in the figure below, it features a UI where you can ask questions about data in natural language and directly link to related dashboards.

[Figure: Databricks One landing page]

Lakeflow Pipelines Editor in Public Preview

Databricks has released "Lakeflow Pipelines Editor," a new IDE for developing and debugging ETL pipelines, as a public preview.

https://docs.databricks.com/aws/en/dlt/dlt-multi-file-editor

As shown in the figure quoted from the link above, the editor is not only for editing pipeline code; it also shows the dependencies between tables.

[Figure: Lakeflow Pipelines Editor overview]

OpenAI GPT-5 and Claude Sonnet 4.5 Now Available in Databricks

While these are separate announcements, both GPT-5 and Sonnet 4.5 are now available within Databricks.

https://www.databricks.com/blog/run-openai-models-directly-databricks

https://www.databricks.com/blog/claude-sonnet-45-here

MotherDuck/DuckDB

DuckDB ducklake Extension and DuckLake 0.3 Released

The DuckDB ducklake extension and DuckLake 0.3 have been released. Using the ducklake extension requires DuckDB v1.4.0.

The main updates appear to be copying data between DuckLake and Iceberg via DuckDB's iceberg extension, and support in the ducklake extension for the MERGE statement introduced in DuckDB v1.4.0.

https://duckdb.org/2025/09/17/ducklake-03.html

MotherDuck Announces First European Cloud Region in Private Preview

MotherDuck has announced its first European cloud region as a private preview.

The new region runs on AWS eu-central-1, and the official release is planned for this fall.

https://motherduck.com/blog/motherduck-in-europe/

Business Intelligence

Looker

Looker Accessible from Gemini CLI

A Gemini CLI extension for accessing Looker has been released.

It appears you can list the available Explores, check the dimensions and measures in a given Explore, and even create Looks and dashboards in Looker.

https://cloud.google.com/looker/docs/release-notes#September_23_2025

https://github.com/gemini-cli-extensions/looker

Data Activation (Reverse ETL)

Hightouch

Dashboards Now Available in Hightouch

Hightouch has added dashboards, which let you combine multiple charts in one place.

This should be useful when you want to do everything within Hightouch, for example with a dashboard for checking campaign performance.

https://changelog.hightouch.io/

https://hightouch.com/docs/campaign-intelligence/dashboards

[Figure: adding an additional chart to a Hightouch dashboard]

Data Orchestration

Airflow

Airflow 3.1 Released

Airflow 3.1, the latest version, has been released.

https://github.com/apache/airflow/releases/tag/3.1.0

Astronomer has published a blog post summarizing the added features.

The release appears to include improved support for AI workflows, updates to the React-based UI, and the ability to mark DAGs as favorites.

https://www.astronomer.io/blog/introducing-apache-airflow-3-1/
