<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jared Lewis</title>
    <description>The latest articles on DEV Community by Jared Lewis (@jareddlewis).</description>
    <link>https://dev.to/jareddlewis</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4007251%2F293b2fcb-bc0c-4cae-9333-6653d12ea734.jpg</url>
      <title>DEV Community: Jared Lewis</title>
      <link>https://dev.to/jareddlewis</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jareddlewis"/>
    <language>en</language>
    <item>
      <title>One vector space for photos and words: Bedrock Titan multimodal on Aurora</title>
      <dc:creator>Jared Lewis</dc:creator>
      <pubDate>Mon, 29 Jun 2026 04:33:39 +0000</pubDate>
      <link>https://dev.to/jareddlewis/one-vector-space-for-photos-and-words-bedrock-titan-multimodal-on-aurora-236c</link>
      <guid>https://dev.to/jareddlewis/one-vector-space-for-photos-and-words-bedrock-titan-multimodal-on-aurora-236c</guid>
      <description>&lt;p&gt;In my last post I described &lt;a href="https://opinlog.com/" rel="noopener noreferrer"&gt;OpinLog&lt;/a&gt; — a cross-user review graph where your "burger" and my "burger" resolve to the same canonical item via &lt;code&gt;pgvector&lt;/code&gt; on Amazon Aurora PostgreSQL. This post is about the piece that makes the matching feel like magic: &lt;strong&gt;multimodal embeddings from Amazon Bedrock&lt;/strong&gt;, feeding the matcher inside Aurora, all deployed on Vercel.&lt;/p&gt;

&lt;h2&gt;
  
  
  A photo of a burger should match the word "burger"
&lt;/h2&gt;

&lt;p&gt;Users log items two ways: by typing a name, or by snapping a photo. If text and images lived in different vector spaces, I'd need two matchers and a fusion step. Instead I used &lt;strong&gt;Amazon Titan Multimodal Embeddings G1&lt;/strong&gt; (&lt;code&gt;amazon.titan-embed-image-v1&lt;/code&gt;), which maps &lt;em&gt;both&lt;/em&gt; text and images into the &lt;strong&gt;same 1024-dimensional space&lt;/strong&gt;. One model, one index, one query.&lt;/p&gt;

&lt;p&gt;The entire embedding contract is tiny:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;imageBase64&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;EmbedInput&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;embeddingConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;outputEmbeddingLength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inputText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageBase64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inputImage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;imageBase64&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getClient&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InvokeModelCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;BEDROCK_EMBEDDING_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;}))&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pass &lt;code&gt;text&lt;/code&gt;, &lt;code&gt;imageBase64&lt;/code&gt;, or both. Because a photo of an In-N-Out burger lands near the &lt;em&gt;text&lt;/em&gt; "In-N-Out Double-Double," a user who uploads a picture matches a user who typed the name — &lt;strong&gt;with zero special-casing&lt;/strong&gt;. That single design choice is what makes the "add by photo → it already knows what this is" demo moment work.&lt;/p&gt;

&lt;p&gt;A couple of production notes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Bedrock client uses the &lt;strong&gt;default AWS credential chain&lt;/strong&gt; when explicit keys are absent, so the same code runs locally (SSO) and on Vercel (IAM user keys in env).&lt;/li&gt;
&lt;li&gt;I store the returned array straight into a &lt;code&gt;pgvector&lt;/code&gt; column via a literal helper — &lt;code&gt;`[${vec.join(",")}]`&lt;/code&gt; — cast to &lt;code&gt;::vector&lt;/code&gt; inside a raw &lt;code&gt;sql&lt;/code&gt; template.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  From embedding to match
&lt;/h2&gt;

&lt;p&gt;On every add, I embed the new item and run an HNSW cosine ANN search over the canonical catalog:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;photo_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rating_avg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rating_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;canonical_items&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is pgvector's cosine distance. A high-confidence hit auto-suggests as the default ("we think this is it"); otherwise the user sees the top candidates plus a "None of these — create new" escape hatch. The match is &lt;strong&gt;non-blocking&lt;/strong&gt;: the item saves instantly with &lt;code&gt;canonical_item_id = NULL&lt;/code&gt;, and the link is filled in after.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tuning the floor: where "search" beats "vibes"
&lt;/h2&gt;

&lt;p&gt;Raw vector search will happily return your entire catalog, ranked by ever-fainter similarity. I measured where Titan text embeddings actually sit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;~0.3 cosine&lt;/strong&gt; for unrelated text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~0.5–0.6&lt;/strong&gt; for genuinely on-topic text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…and gated results at &lt;code&gt;1 - distance &amp;gt; 0.45&lt;/code&gt;. Below the floor, results are dropped. It's a single constant, but it's the line between "found what you meant" and "here's everything, sorted by a coin flip." For the matcher's &lt;em&gt;auto-suggest&lt;/em&gt; threshold I set the bar much higher, so we only pre-select a match when we're confident, and otherwise let the human choose.&lt;/p&gt;

&lt;h2&gt;
  
  
  The catalog sharpens itself
&lt;/h2&gt;

&lt;p&gt;Each canonical item's embedding is the &lt;strong&gt;running centroid&lt;/strong&gt; of its linked members. When your burger log links to mine, the canonical vector becomes the average of both. The more people log the same thing, the more representative that vector gets — and the more reliably the &lt;em&gt;next&lt;/em&gt; person's log (text or photo) snaps to it. The embedding model does the understanding; Aurora does the remembering and the averaging, in the same transaction that updates the denormalized rating aggregates.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this lives on Aurora
&lt;/h2&gt;

&lt;p&gt;The qualifying AWS databases were DynamoDB, Aurora DSQL, and Aurora PostgreSQL. Only &lt;strong&gt;Aurora PostgreSQL&lt;/strong&gt; ships &lt;code&gt;pgvector&lt;/code&gt;, so the embedding produced by Bedrock can be indexed (HNSW) and queried (&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt;) right next to the relational data — the ratings, the contributors, the lists. Embedding &lt;em&gt;meaning&lt;/em&gt; (Bedrock) and storing/serving it transactionally (Aurora pgvector) is the whole engine of the app, and keeping them one &lt;code&gt;JOIN&lt;/code&gt; apart made it a weekend project instead of a distributed-systems one.&lt;/p&gt;

&lt;p&gt;Stack: Next.js 16 on &lt;strong&gt;Vercel&lt;/strong&gt; → &lt;strong&gt;Amazon Aurora PostgreSQL Serverless v2&lt;/strong&gt; (pgvector) → &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; (Titan multimodal). Photos live in S3; the matcher is one line of SQL.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built for the H0 Hackathon ("Hack the Zero Stack with Vercel and AWS Databases"). I created this content for the purposes of entering this hackathon.&lt;/em&gt; &lt;strong&gt;#H0Hackathon&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>vectordatabase</category>
      <category>postgres</category>
    </item>
    <item>
      <title>Building a cross-user review graph with pgvector on Amazon Aurora</title>
      <dc:creator>Jared Lewis</dc:creator>
      <pubDate>Mon, 29 Jun 2026 04:19:03 +0000</pubDate>
      <link>https://dev.to/jareddlewis/building-a-cross-user-review-graph-with-pgvector-on-amazon-aurora-7ni</link>
      <guid>https://dev.to/jareddlewis/building-a-cross-user-review-graph-with-pgvector-on-amazon-aurora-7ni</guid>
      <description>&lt;p&gt;Every review app is a silo. Yelp reviews places, Amazon reviews its own catalog, Letterboxd reviews film. I wanted to build the opposite: one place where you can log and rate &lt;strong&gt;anything&lt;/strong&gt; — a ball-point pen, a burger, a cruise — and have your review pool together with everyone else's.&lt;/p&gt;

&lt;p&gt;The hard part isn't the form. It's this: when &lt;em&gt;I&lt;/em&gt; log "In-N-Out Double-Double" and &lt;em&gt;you&lt;/em&gt; log "in n out double double burger," those need to become &lt;strong&gt;the same thing&lt;/strong&gt; so our ratings aggregate. No shared product ID, no barcode, no agreement on spelling. Just two humans describing the same item differently.&lt;/p&gt;

&lt;p&gt;That's a similarity problem, and I solved it with &lt;strong&gt;&lt;code&gt;pgvector&lt;/code&gt; running inside Amazon Aurora PostgreSQL (Serverless v2)&lt;/strong&gt; — the whole app, deployed on Vercel at &lt;a href="https://opinlog.com/" rel="noopener noreferrer"&gt;opinlog.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The data model is the product
&lt;/h2&gt;

&lt;p&gt;Two tables carry the entire idea:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;user_items&lt;/code&gt;&lt;/strong&gt; — "&lt;em&gt;I&lt;/em&gt; logged this, paid $X, rated it 4★." One row per logging event. Personal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;canonical_items&lt;/code&gt;&lt;/strong&gt; — "the &lt;em&gt;one shared thing&lt;/em&gt; everyone's rows point to." Deduplicated; aggregates everyone's ratings.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The matcher's only job is connecting a &lt;code&gt;user_item&lt;/code&gt; to the right &lt;code&gt;canonical_item&lt;/code&gt;. &lt;strong&gt;That edge is the product.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every item carries a 1024-dim embedding. The canonical row's embedding is a &lt;code&gt;vector(1024)&lt;/code&gt;, indexed with HNSW for cosine ANN search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// db/schema.ts (Drizzle)&lt;/span&gt;
&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;embedding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;idx_canon_embedding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;using&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hnsw&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;op&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vector_cosine_ops&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I made the canonical embedding &lt;strong&gt;&lt;code&gt;NOT NULL&lt;/code&gt; on purpose&lt;/strong&gt;: a canonical entry must always be matchable, so every creation path supplies a vector and the recompute logic preserves it rather than ever averaging to NULL.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Is this one of these?" — the matcher
&lt;/h2&gt;

&lt;p&gt;On every add, I embed the new item and run an approximate-nearest-neighbour search over the canonical catalog. Drizzle doesn't model vector operators, so the query is a raw &lt;code&gt;sql&lt;/code&gt; template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;photo_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rating_avg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rating_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;canonical_items&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is pgvector's cosine distance; &lt;code&gt;1 - distance&lt;/code&gt; is similarity. A high-confidence hit auto-suggests as the default ("we think this is it"); otherwise the user sees the top candidates plus a "None of these — create new" escape hatch. &lt;strong&gt;Matching is optional and non-blocking&lt;/strong&gt;: the &lt;code&gt;user_item&lt;/code&gt; is saved instantly with &lt;code&gt;canonical_item_id = NULL&lt;/code&gt;, and the match just fills it in — at upload time or from a queue later. The base loop (track my own stuff) never has friction.&lt;/p&gt;

&lt;h2&gt;
  
  
  The self-sharpening catalog
&lt;/h2&gt;

&lt;p&gt;Here's the part I'm proud of. A canonical item's embedding is the &lt;strong&gt;running centroid&lt;/strong&gt; of its linked members' embeddings. When you link your burger log to mine, the canonical vector moves to the average of both. The more people log the same thing, the &lt;em&gt;sharper and more representative&lt;/em&gt; that vector becomes — the catalog improves itself with use.&lt;/p&gt;

&lt;p&gt;And because "sort by rating" needs to stay fast at scale, the rating aggregates are &lt;strong&gt;denormalized onto the canonical row&lt;/strong&gt;, with &lt;code&gt;rating_avg&lt;/code&gt; as a stored generated column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;rating_avg&lt;/span&gt; &lt;span class="nb"&gt;real&lt;/span&gt; &lt;span class="k"&gt;GENERATED&lt;/span&gt; &lt;span class="n"&gt;ALWAYS&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;rating_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;rating_sum&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;real&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;rating_count&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;STORED&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the canonical page — the payoff screen — reads one row to show "4.6★ from 11 reviews across strangers," no aggregation query at render time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Aurora PostgreSQL specifically
&lt;/h2&gt;

&lt;p&gt;I evaluated the three qualifying AWS databases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DynamoDB&lt;/strong&gt; — great scale story, but no native vector search; I'd have to bolt on OpenSearch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aurora DSQL&lt;/strong&gt; — distributed Postgres, but it lacks the &lt;code&gt;pgvector&lt;/code&gt; and &lt;code&gt;PostGIS&lt;/code&gt; extensions, so it can't do the core matching feature.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aurora PostgreSQL&lt;/strong&gt; — does vectors (pgvector), geo (PostGIS), and relations in &lt;strong&gt;one engine&lt;/strong&gt;. One connection, one query planner, transactional joins between "the vector match" and "the rating rollup."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is the whole reason this was buildable by one person on a deadline. The matcher result and the relational aggregation live in the same database, so linking an item and recomputing its canonical's centroid + ratings is a single transaction — not a dance between a vector store and an RDBMS.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;p&gt;Next.js 16 (App Router, Server Actions) on &lt;strong&gt;Vercel&lt;/strong&gt; → &lt;strong&gt;Aurora PostgreSQL Serverless v2&lt;/strong&gt; via &lt;code&gt;pg&lt;/code&gt;/Drizzle, with &lt;code&gt;pgvector&lt;/code&gt; + &lt;code&gt;PostGIS&lt;/code&gt; + &lt;code&gt;pg_trgm&lt;/code&gt;. Embeddings come from Amazon Bedrock (more on that in the next post). Swapping local Docker Postgres for Aurora was a one-line &lt;code&gt;DATABASE_URL&lt;/code&gt; change — same driver, same SQL.&lt;/p&gt;

&lt;p&gt;The result is a universal, cross-user review graph for literally anything, and the magic is one line of SQL: &lt;code&gt;ORDER BY embedding &amp;lt;=&amp;gt; :q&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built for the H0 Hackathon ("Hack the Zero Stack with Vercel and AWS Databases"). I created this content for the purposes of entering this hackathon.&lt;/em&gt; &lt;strong&gt;#H0Hackathon&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>vectordatabase</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
