DEV Community

Sagara

Personal Picks: Data Product News (June 11, 2025)

This article is an English translation of the original Japanese article: https://dev.classmethod.jp/articles/modern-data-stack-info-summary-20250611/

Hello, I'm Sagara.

As a consultant specializing in the Modern Data Stack, I see new information about the ecosystem being released every day.

In this article, I've compiled the Modern Data Stack updates that caught my attention over the past two weeks.

Disclaimer: This article doesn't cover every recent update for the products mentioned. The selection reflects my personal judgment of what I found interesting.

Modern Data Stack in General

The State of Data and AI Engineering 2025

lakeFS published a report on its blog titled "The State of Data and AI Engineering 2025," summarizing the current state of data engineering.

The article begins by mentioning the following trends and then touches on the current state of various fields including Ingestion and Data Lakes:

  • The MLOps field is gradually shrinking
  • Solutions for LLM accuracy, monitoring, and performance are thriving
  • AWS Glue is the only escape from vendor lock-in
    • While BigQuery, Databricks, and Snowflake support only read-only federation of Iceberg REST catalogs, AWS Glue enables both read and write operations through its integration with Databricks and Snowflake.
  • Storage providers are prioritizing performance
    • High-performance storage solutions like Google Cloud's GCS Fast Tier, AWS's S3 Express, and CoreWeave have been released

https://lakefs.io/blog/the-state-of-data-ai-engineering-2025/

Data Warehouse/Data Lakehouse

Snowflake

Snowflake Summit 2025 was held

Snowflake's biggest annual event, "Snowflake Summit 2025," was held.

https://reg.snowflake.com/flow/snowflake/summit25/sessions/page/catalog?tab.sessioncatalogtab=1714168666431001NNiH

The official blog post below is a good overview of the new features announced at this event.

https://www.snowflake.com/en/blog/announcements-snowflake-summit-2025/

Pardon the self-promotion, but I've also written an article summarizing what the official blog post says about each new feature, so please take a look at it as well.

https://dev.classmethod.jp/articles/snowflake-summit-2025-platform-keynote-snowflakesummit/

The announcements I found particularly exciting are:

  • Semantic View became generally available and can now be queried with SELECT statements
  • Enhanced Snowflake Horizon Catalog functionality
    • Can now integrate information from other products such as Tableau, and even from competing data warehouses such as Databricks
  • dbt Projects
    • A feature that allows dbt development within Snowflake and scheduled execution via tasks
    • The new dbt engine "Fusion" will also be available
    • Demo video on YouTube
  • Cortex AISQL
    • New functions such as AI_AGG and AI_FILTER have been released; users pass natural-language instructions as function arguments to describe what the query should do
    • Reference blog
  • Adaptive Warehouses
    • A new type of warehouse where Snowflake determines the optimal warehouse specifications without users having to specify size or cluster count
  • Snowflake Intelligence
    • A new feature that allows natural language access to data, with agent functionality planned for executing tasks integrated with external products
    • Demo video on YouTube
  • Snowflake Postgres
    • PostgreSQL running within Snowflake, built on technology from Crunchy Data, which Snowflake acquired
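
To make the Cortex AISQL item above a little more concrete, here is a minimal Python sketch that only builds the SQL text for AI_FILTER and AI_AGG calls; it does not connect to Snowflake or run anything. The table product_reviews and the columns review_text and product_id are hypothetical, and the argument shapes (a PROMPT(...) template for AI_FILTER, a column plus a task description for AI_AGG) reflect my understanding of these functions, so please check Snowflake's documentation before relying on them.

```python
def build_aisql_queries(table: str, text_column: str, group_column: str) -> dict[str, str]:
    """Return example AISQL statements as plain SQL strings (nothing is executed).

    All identifiers passed in are hypothetical placeholders for illustration.
    """
    # AI_FILTER: the natural-language question acts as the WHERE predicate.
    # {{0}} renders as {0}, the placeholder that PROMPT fills with the column value.
    filter_sql = (
        f"SELECT {text_column}\n"
        f"FROM {table}\n"
        f"WHERE AI_FILTER(PROMPT('Does this review mention a delivery problem? {{0}}', {text_column}))"
    )

    # AI_AGG: the natural-language instruction describes how to aggregate
    # the text values within each group.
    agg_sql = (
        f"SELECT {group_column},\n"
        f"       AI_AGG({text_column}, 'Summarize the most common complaints') AS summary\n"
        f"FROM {table}\n"
        f"GROUP BY {group_column}"
    )
    return {"ai_filter": filter_sql, "ai_agg": agg_sql}


queries = build_aisql_queries("product_reviews", "review_text", "product_id")
for name, sql in queries.items():
    print(f"-- {name}\n{sql};\n")
```

The point of the sketch is simply that the "logic" of the filter and the aggregation lives in the English sentences passed as arguments, rather than in SQL expressions.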

BigQuery

Mercari's Case Study: Data Analytics AI Agent "Socrates" and ADK Usage at Mercari

At the ADK User Group kickoff, @na0fu3y presented "Socrates," a data analytics AI agent developed in-house at Mercari using the Agent Development Kit (ADK).

The presentation materials and the accompanying NotebookLM are as follows:

https://speakerdeck.com/na0/merukariniokerudetaanariteikusu-ai-eziento-socrates-to-adk-huo-yong-shi-li

https://notebooklm.google.com/notebook/98dd8491-4fbb-4614-9368-cd3427db716e

Renaming of "BigQuery tables for Apache Iceberg" and "BigQuery metastore"

The June 3, 2025 release announced that "BigQuery tables for Apache Iceberg" and "BigQuery metastore" have been renamed. Both became generally available along with the name change.

  • "BigQuery tables for Apache Iceberg" ⇛ "BigLake tables for Apache Iceberg in BigQuery"
  • "BigQuery metastore" ⇛ "BigLake metastore"

MotherDuck/DuckDB

DuckDB reaches 30,000 stars on GitHub

DuckDB's official blog announced that DuckDB has reached 30,000 stars on GitHub.

The blog also mentions the following metrics, clearly showing DuckDB's growth:

  • DuckDB's website has over 3 million unique visitors per month, more than double the figure from December 2024
  • Traffic volume exceeds 700 terabytes with millions of extension downloads
  • In 2025 alone, it rose 10 positions from 55th to 45th in the DB Engines ranking
  • Currently, monthly downloads on PyPI exceed 20 million

https://duckdb.org/2025/06/06/github-30k-stars.html

Data Transform

dbt

2025 dbt Launch Showcase was held

The 2025 dbt Launch Showcase was held on May 28, 2025 (local time). Many new features were announced at this event.

https://www.getdbt.com/resources/webinars/2025-dbt-cloud-launch-showcase

I've summarized the announcements from this event in a blog post, so please take a look at it as well.

https://dev.classmethod.jp/articles/2025-dbt-launch-showcase-summary/

Additionally, new features not mentioned at this event are also listed in the release notes.

https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes#2025-dbt-launch-showcase

SQLMesh (Tobiko Cloud)

Tutorial using DuckLake and SQLMesh

A tutorial article using DuckLake and SQLMesh was published on Tobiko Data's official blog.

https://www.tobikodata.com/blog/ducklake-sqlmesh-tutorial-a-hands-on

Semantic Layer

Cube

Announced agentic analytics platform "D3"

Cube announced a new agentic analytics platform called "D3".

https://cube.dev/blog/announcing-cube-d3

https://cube.dev/blog/unleashing-agentic-analytics-with-cube-d3

In the interface, users can query in natural language and also review and edit the generated SQL. The announcement also mentions that you can configure the scope of data the agent can access and the rules for how it uses data.

Business Intelligence

ThoughtSpot

Announced Agentic Semantic Layer

ThoughtSpot announced a new feature called "Agentic Semantic Layer".

This feature has the following characteristics (summarized by generative AI):

  • Agentic by Design: Unlike static models, it's dynamic and context-aware, working in conjunction with ThoughtSpot's AI agent "Spotter".
  • Advanced AI interpretation: Using AI-powered synonym registration, search indexing, and data matching technology, it accurately interprets user intent even with ambiguous natural language questions.
  • Continuous learning: Analysts can provide feedback and "coach" the model to continuously improve AI accuracy and contextual understanding.
  • Flexible integration and implementation: In addition to defining business logic within ThoughtSpot, it can also leverage existing logic in data infrastructure like Snowflake and dbt.
  • SDK for embedded analytics: AI-powered analytics capabilities can be easily embedded into any application.

https://www.thoughtspot.com/blog/introducing-the-agentic-semantic-layer

Sigma

Announced integration with Snowflake's Semantic View

Sigma announced integration functionality with Snowflake's Semantic View.

Dimensions and Metrics defined in Semantic View can now be used within Sigma.

https://www.youtube.com/watch?v=-75decD2NPg

Hex

Announced integration with Snowflake's Semantic View

Hex announced integration functionality with Snowflake's Semantic View.

Dimensions and Metrics defined in Semantic View can now be visualized through drag-and-drop operations in Hex's Explore.

https://hex.tech/blog/introducing-snowflake-semantic-sync-aisql/

Completed Series C funding of $70M USD

Hex announced the completion of Series C funding of $70M USD.

https://hex.tech/blog/series-c/

Omni

Announced integration with Snowflake's Semantic View

Omni announced integration functionality with Snowflake's Semantic View.

Omni can use the Dimensions and Metrics defined in a Semantic View, and the video below also shows a Python script generating Semantic Views that reflect changes made in Omni.

https://www.youtube.com/watch?v=m5CnfB9Vb90

Evidence

Announced AI-powered development environment "Evidence Studio"

Evidence announced a new AI-powered development environment called "Evidence Studio".

https://evidence.dev/blog/evidence-studio

According to the announcement at the above link, you can develop with Evidence in Studio and also generate code automatically using its AI functionality.

Note that the announcement also states, "We will be sunsetting the current Cloud product by the end of this year and we are helping existing customers transition to Studio," so existing Cloud users will need to migrate to Evidence Studio.

Data Quality / Data Observability

Soda

Announced acquisition of NannyML

Soda announced the acquisition of NannyML.

https://launch.soda.io/blog/soda-acquires-nannyml

NannyML was new to me, but it appears to be a company that provides an open-source Python library for monitoring ML model performance after deployment, along with a Cloud version of it.

https://www.nannyml.com/

https://nannyml.readthedocs.io/en/stable/

Elementary

Elementary announced AI agent "Ella" as a new feature

Elementary announced a new AI agent called "Ella".

They provide agents such as Test Recommendations Agent, Triage & Resolution Agent, Governance Agent, Performance & Cost Agent, and Catalog Agent.

https://www.elementary-data.com/post/announcing-ai-agents
