<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pranav Aurora</title>
    <description>The latest articles on DEV Community by Pranav Aurora (@pranav_aurora).</description>
    <link>https://dev.to/pranav_aurora</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2647643%2Ff189c3a6-1a1a-4ee2-86f1-e636b1a3cb85.png</url>
      <title>DEV Community: Pranav Aurora</title>
      <link>https://dev.to/pranav_aurora</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pranav_aurora"/>
    <language>en</language>
    <item>
      <title>On Andy Pavlo's DB review</title>
      <dc:creator>Pranav Aurora</dc:creator>
      <pubDate>Thu, 02 Jan 2025 18:37:10 +0000</pubDate>
      <link>https://dev.to/pranav_aurora/on-the-andy-pavlos-db-review-2oh5</link>
      <guid>https://dev.to/pranav_aurora/on-the-andy-pavlos-db-review-2oh5</guid>
      <description>&lt;p&gt;The Andy Pavlo yearly review has a massive chokehold on the DB community. It's like the Oscars of databases.&lt;/p&gt;

&lt;p&gt;This year's review was pretty special: our project, &lt;a href="https://github.com/Mooncake-Labs/pg_mooncake" rel="noopener noreferrer"&gt;pg_mooncake&lt;/a&gt;, was mentioned.&lt;/p&gt;

&lt;p&gt;Here are some thoughts from reading the review, and what we've learnt at &lt;a href="https://mooncake.dev/" rel="noopener noreferrer"&gt;Mooncake Labs&lt;/a&gt; in our first 121 days of existence. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Yes, we're guilty of 'Shoving Ducks everywhere'...&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our first project, &lt;a href="https://github.com/Mooncake-Labs/pg_mooncake" rel="noopener noreferrer"&gt;pg_mooncake&lt;/a&gt;, added a native columnstore table (Iceberg) to Postgres for 1000x faster analytics.&lt;/p&gt;

&lt;p&gt;While there are quite a few extensions on the market bringing DuckDB into Postgres, we focussed on making the columnar storage feel like a regular Postgres table, with things like transactional writes and triggers. See our &lt;a href="https://mooncake.dev/blog/how-we-built-pgmooncake" rel="noopener noreferrer"&gt;architecture&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To us, it feels like the final touch to complete the 'analytics in PG' experience. Almost a decade after early projects like Citus, we're optimistic that analytics in Postgres will be a reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. 2024 felt like the year of the Data Lake.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Snowflake vs Databricks. Elastic's 'search lake' (lol). S3 Tables.&lt;/p&gt;

&lt;p&gt;What I mean by the 'lake': serverless workloads on data in object storage. &lt;/p&gt;

&lt;p&gt;In 2024, analytics (Databricks SQL, Snowflake Iceberg) &amp;amp; vector search (Turbopuffer, Lance) moved to the lake.&lt;/p&gt;

&lt;p&gt;In 2025, I reckon there will be more workloads (lookups, full-text) running in this manner. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. As for vector search...&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agents are everywhere, and yet vector search wasn't a topic at all... A couple of thoughts.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Just use Postgres&lt;/li&gt;
&lt;li&gt;If you have 'big data', use LanceDB / Turbopuffer&lt;/li&gt;
&lt;li&gt;Vector search workloads are moving toward full-text workloads, something we've noticed a lot. Hybrid search results are often ~95%+ full-text results.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;4. As for AI / Agents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A lot of the AI companies we spend time with are building a 'system of record' for each customer... And they're all storing structured/unstructured data in a 'Lake'. See &lt;a href="https://docs.rox.com/development/how-rox-works/rox-platform-architecture" rel="noopener noreferrer"&gt;Rox's architecture&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Another trend we've seen: LLMs being used for data processing and ML tasks (feature extraction, classifiers).&lt;/p&gt;

&lt;p&gt;It kind of makes sense too… on small data. Product engineers can use LLMs out of the box, instead of picking/training/deploying ML models for each task. &lt;/p&gt;

&lt;p&gt;I am super super curious how Snowflake, Databricks and Redshift AI functions will play out this year. &lt;/p&gt;

&lt;p&gt;2025 will be exciting. &lt;/p&gt;

&lt;p&gt;Pranav&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>database</category>
      <category>postgres</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
