(No Vector Database Required)
Everyone talks about AI. I built a product. blackbook.dk is a design portfolio where the primary interface is a conversation: an AI agent that knows every project, skill and client, and retrieves them by meaning, not keywords. Under the hood, a single content layer feeds three outputs: the visual site, a schema.org knowledge graph, and the AI agent. No duplication, no manual sync.
This is how it works.
The core idea: content as structured data
Every project, skill, client and role lives once, as structured content in Sanity CMS. A project isn't a page. It is a document with typed relations to clients, categories (skills), areas of expertise and experience entries.
// Sanity project schema, simplified
defineField({
  name: 'client',
  type: 'reference',
  to: [{ type: 'clientLogo' }],
}),
defineField({
  name: 'areas',
  type: 'array',
  of: [{ type: 'reference', to: [{ type: 'portfolioArea' }] }],
}),
defineField({
  name: 'categories',
  type: 'array',
  of: [{
    type: 'object',
    fields: [
      { name: 'category', type: 'reference', to: [{ type: 'category' }] },
      { name: 'isPrimary', type: 'boolean' },
    ],
  }],
}),
One content model. Three consumers: the visual interface renders it as pages, the knowledge graph serialises it as JSON-LD, and the AI agent queries it at conversation time. Change a project once and it updates everywhere.
Retrieval by meaning, not keywords
This is the part that surprised me most. When someone asks the agent "show me pharma work", it doesn't grep for the word "pharma" in titles. It uses Sanity's built-in semantic similarity: vector embeddings computed server-side, queried with a single GROQ expression.
*[_type == "project" && hidden != true]
  | score(text::semanticSimilarity($q))
  | order(_score desc)[0...12] {
    _score,
    "titleEn": title.en,
    "slug": slug.current,
    "clientName": client->name,
    featured,
    "categories": categories[0..5]{
      "en": category->title.en
    }
  }
The user's full question is passed as $q. Sanity returns the top 12 projects ranked by semantic similarity, each with a _score. Results below a 0.05 threshold are filtered out.
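The threshold step can be sketched in a few lines. This is a minimal illustration, not the site's actual code: the `ScoredProject` shape and the `filterByScore` helper are hypothetical, and only the 0.05 cut-off comes from the text above.

```typescript
// Hypothetical shape of one row returned by the GROQ query above.
interface ScoredProject {
  _score: number
  titleEn: string
  slug: string
}

// Cut-off taken from the 0.05 threshold mentioned in the text.
const MIN_SCORE = 0.05

// Keep only rows at or above the threshold, preserving the ranking order.
function filterByScore(
  rows: ScoredProject[],
  minScore: number = MIN_SCORE,
): ScoredProject[] {
  return rows.filter(row => row._score >= minScore)
}

// Made-up sample data for illustration.
const sample: ScoredProject[] = [
  { _score: 0.42, titleEn: 'Pharma portal', slug: 'pharma-portal' },
  { _score: 0.03, titleEn: 'Unrelated project', slug: 'unrelated' },
]
const relevant = filterByScore(sample)
console.log(relevant.map(r => r.slug))
```

Filtering after the query keeps the GROQ expression simple and makes the cut-off easy to tune without touching the query.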
No Pinecone. No Weaviate. No pgvector. Sanity's own embedding index handles the vector search. The entire retrieval is one GROQ query with a scoring function.
When semantic search returns nothing, which is rare but possible, the system falls back to keyword matching. A simple scoring function counts how many extracted keywords appear in each project's title, tagline, categories and SEO fields.
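function scoreProject(p: Project, keywords: string[]): number {
  // Assumption: categories use the { en } shape projected by the GROQ query above
  const cats = (p.categories ?? []).map(c => c.en).join(' ')
  // Build one lowercase haystack from every searchable field
  const hay = [
    p.titleEn, p.titleDa, p.taglineEn, p.taglineDa,
    p.seoEn, p.seoDa, p.clientName, cats,
  ].join(' ').toLowerCase()
  // Score = number of keywords that occur anywhere in the haystack
  return keywords.filter(k => hay.includes(k.toLowerCase())).length
}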
Semantic first, keyword fallback. The agent logs which method was used on each response.
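A rough sketch of that decision, assuming the semantic results have already been fetched and filtered. The `Retrieval` type and `chooseRetrieval` function are hypothetical names for illustration; the real agent presumably does this asynchronously.

```typescript
type Method = 'semantic' | 'keyword'

interface Retrieval<T> {
  method: Method // recorded so each response can log which path was taken
  results: T[]
}

// Prefer semantic results; run the keyword fallback only when they are empty.
function chooseRetrieval<T>(
  semanticResults: T[],
  keywordFallback: () => T[],
): Retrieval<T> {
  if (semanticResults.length > 0) {
    return { method: 'semantic', results: semanticResults }
  }
  return { method: 'keyword', results: keywordFallback() }
}

// Made-up inputs for illustration.
const hit = chooseRetrieval(['danske-bank'], () => ['unused'])
const miss = chooseRetrieval<string>([], () => ['keyword-match'])
console.log(hit.method, miss.method)
```

Passing the fallback as a function means the keyword scoring only runs when it is actually needed.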
One model, three outputs
The same Sanity content that the AI queries also generates a schema.org graph. The most useful part for search is not the biographical metadata but the service layer.
Every area of expertise becomes a Service entity, generated from the same content the service pages render. Each one carries a service type, an audience, the areas it serves, and an offer that links to contact.
{
  "@type": "Service",
  "name": "UX & Product Design",
  "serviceType": "UX & Product Design",
  "provider": { "@id": "https://www.blackbook.dk/#organization" },
  "areaServed": [
    { "@type": "City", "name": "Copenhagen" },
    { "@type": "Country", "name": "Denmark" }
  ],
  "audience": { "@type": "BusinessAudience" },
  "offers": {
    "@type": "Offer",
    "url": "https://www.blackbook.dk/en/contact/"
  }
}
This is what tells Google there is something here that can be bought, by whom, and where. It is the difference between "a person who knows about UX" and "a UX service available in Copenhagen". The service pages, the area pages and the schema all derive from the same category documents, so the commercial structure stays in sync with the content automatically.
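Generating a Service entity from a category document could look roughly like this. It is a sketch under assumptions: the `CategoryDoc` shape and the `toServiceJsonLd` name are hypothetical, and the fixed values are copied from the JSON-LD example above rather than from the site's source.

```typescript
// Hypothetical, simplified category document; the field name is an assumption.
interface CategoryDoc {
  titleEn: string
}

const SITE = 'https://www.blackbook.dk'

// Map one category document to a schema.org Service object.
function toServiceJsonLd(category: CategoryDoc) {
  return {
    '@type': 'Service',
    name: category.titleEn,
    serviceType: category.titleEn,
    provider: { '@id': `${SITE}/#organization` },
    areaServed: [
      { '@type': 'City', name: 'Copenhagen' },
      { '@type': 'Country', name: 'Denmark' },
    ],
    audience: { '@type': 'BusinessAudience' },
    offers: { '@type': 'Offer', url: `${SITE}/en/contact/` },
  }
}

const service = toServiceJsonLd({ titleEn: 'UX & Product Design' })
// Serialised form, ready for a <script type="application/ld+json"> tag.
const jsonLd = JSON.stringify(service, null, 2)
console.log(jsonLd)
```

Because the function takes the same document the service page renders, a renamed category changes the page and the graph in one step.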
Alongside the services sits the identity layer: a Person with skills linked to Wikidata, and an Organization with a curated knowsAbout that defines what the studio does. Two distinct sets, commercial positioning and individual expertise, connected via founder and worksFor. Every project is a CreativeWork with typed relations, and every reference resolves to Wikidata, so the graph is not just internal labels but linked to the wider web of entities.
{
  "@type": "CreativeWork",
  "@id": "https://www.blackbook.dk/en/portfolio/danske-bank/#work",
  "name": "White-Label Banking Solution, Danske Bank",
  "creator": { "@id": "https://www.blackbook.dk/#person-jeppe" },
  "sourceOrganization": {
    "@type": "Organization",
    "name": "Danske Bank",
    "sameAs": ["https://en.wikipedia.org/wiki/Danske_Bank",
               "https://www.wikidata.org/wiki/Q1636974"]
  },
  "about": [
    { "@type": "Thing", "name": "UX & Product Design",
      "sameAs": ["https://www.wikidata.org/wiki/Q11248500"] },
    { "@type": "Thing", "name": "FinTech",
      "sameAs": ["https://www.wikidata.org/wiki/Q16319025"] }
  ]
}
Google, ChatGPT and Perplexity can all read this. When someone asks an assistant "who does fintech UX in Copenhagen", the structured service data is what makes the answer possible, not the page text.
The hard part: making site and conversation one system
The honest challenge wasn't building an AI chat. It was making the site and the conversation one system.
The AI agent runs on Claude Haiku. Its context is assembled at request time from three sources, all from the same Sanity dataset the site renders from.
// Same Sanity content, three context blocks
const portfolioContext = await fetchPortfolioContext(messages)
const servicesContext = await getServicesContext(lang)
const dynamicBlock = langInstruction
  + (servicesContext ? '\n\n' + servicesContext : '')
  + (portfolioContext ? '\n\n' + portfolioContext : '')
fetchPortfolioContext queries the same project documents the portfolio pages render. getServicesContext reads the same category descriptions that appear on the service pages. There is no separate "AI content". The agent reads the site.
This means when I update a project description, add a new skill page, or change a client name, the AI agent picks it up on the next conversation. No export, no embedding pipeline, no manual sync. One source of truth.
The trade-off: the agent's knowledge is bounded by the content model. It can't know things that aren't structured in Sanity. That is a feature, not a bug. It keeps the agent honest and the content authoritative.
A note on motion
The site has a hand-built motion layer. No template, no plugin, no animation library.
Page transitions use a circle grid: a 15×17 CSS grid of circular <div> elements that scale from center outward, driven by scroll progress or click events. Each cell has a threshold based on its distance from center.
// Each cell scales based on how far past its threshold the progress is
const cp = Math.min(1, Math.max(0,
  (progress * (1 + SPREAD) - cell.threshold) / SPREAD
))
cell.el.style.transform = cp <= 0
  ? 'scale(0)'
  : `scale(${cp * GRID_SCALE})`
On project pages the transition is scroll-driven, so the user controls the reveal by scrolling past the bottom. The same component handles click-based page transitions, image reveals (an iris pattern of circles shrinking to reveal the image) and a scroll-driven Venn diagram animation on the about page. One reusable component, four uses.
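The distance-based threshold each cell carries could be precomputed along these lines. This is an illustrative sketch, not the site's code: `buildThresholds` and the `Cell` shape are hypothetical, the 15 columns by 17 rows orientation is a guess, and normalising the distance to the range 0..1 is an assumption chosen so that the farthest corner reaches full scale exactly when progress hits 1 in the formula above.

```typescript
interface Cell {
  col: number
  row: number
  threshold: number // 0 at the center, 1 at the farthest corner
}

// Assign each grid cell a threshold proportional to its distance from center.
function buildThresholds(cols: number, rows: number): Cell[] {
  const cx = (cols - 1) / 2
  const cy = (rows - 1) / 2
  const maxDist = Math.hypot(cx, cy)
  const cells: Cell[] = []
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      const dist = Math.hypot(col - cx, row - cy)
      cells.push({ col, row, threshold: maxDist === 0 ? 0 : dist / maxDist })
    }
  }
  return cells
}

// Assumed orientation: 15 columns, 17 rows.
const cells = buildThresholds(15, 17)
console.log(cells.length)
```

Computing the thresholds once up front keeps the per-frame work down to one clamp and one style write per cell.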
Project tiles use a physics-based evade effect, deforming and springing away from the cursor with mesh segmentation and spring damping. WebGL is used only for the hero image displacement transitions on the homepage. Everything else is CSS transforms at 60fps.
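For readers unfamiliar with spring damping, here is a one-dimensional sketch of the idea. It is not the tile code: the constants, the `stepSpring` name and the semi-implicit Euler integration are all assumptions chosen for illustration, and the real effect works on a segmented mesh rather than a single value.

```typescript
interface SpringState {
  position: number
  velocity: number
}

// Illustrative constants, not values from the site.
const STIFFNESS = 120
const DAMPING = 14

// Advance a damped spring by one time step (semi-implicit Euler).
function stepSpring(state: SpringState, target: number, dt: number): SpringState {
  // The spring pulls toward the target; damping opposes the current velocity.
  const force = -STIFFNESS * (state.position - target) - DAMPING * state.velocity
  const velocity = state.velocity + force * dt
  const position = state.position + velocity * dt
  return { position, velocity }
}

// A point pushed 40px away by the cursor settles back toward rest at 0.
let spring: SpringState = { position: 40, velocity: 0 }
for (let i = 0; i < 120; i++) {
  spring = stepSpring(spring, 0, 1 / 60) // two seconds at 60 steps per second
}
console.log(spring.position)
```

In a browser the step would typically run inside a requestAnimationFrame callback, with the result written to a CSS transform.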
What I'd do differently
Semantic search should have been the starting point, not an addition. I built keyword scoring first, then added semantic search. In hindsight, Sanity's text::semanticSimilarity() is so simple to implement (a single GROQ function) that it should have been the primary retrieval method from day one, with keyword matching as an emergency fallback.
The schema.org graph took longer than expected. Dozens of entity types, bilingual, with Wikidata cross-references. It works, but the validation and debugging cycle is slow. If I did it again I would build a test suite that validates every page's JSON-LD against schema.org on each deploy.
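A starting point for such a test suite might look like the sketch below. It only checks that every JSON-LD block parses and carries a few required properties; it is not full schema.org validation. The `REQUIRED` map, the function names and the sample HTML are hypothetical, and a real version would also walk nested nodes and `@graph` arrays.

```typescript
// Extract and parse every JSON-LD <script> block from a rendered HTML string.
function extractJsonLd(html: string): unknown[] {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi
  const blocks: unknown[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(html)) !== null) {
    blocks.push(JSON.parse(match[1])) // throws on malformed JSON
  }
  return blocks
}

// Hypothetical minimum: properties each top-level type must carry.
const REQUIRED: Record<string, string[]> = {
  Service: ['name', 'provider'],
  CreativeWork: ['name', 'creator'],
}

// Report the required properties missing from one top-level node.
function findProblems(node: unknown): string[] {
  if (typeof node !== 'object' || node === null) return []
  const obj = node as Record<string, unknown>
  const type = typeof obj['@type'] === 'string' ? (obj['@type'] as string) : ''
  const required = REQUIRED[type] ?? []
  return required
    .filter(key => !(key in obj))
    .map(key => `${type} is missing "${key}"`)
}

// Made-up page fragment: a Service without a provider.
const page =
  '<script type="application/ld+json">{"@type":"Service","name":"UX"}</script>'
const problems = extractJsonLd(page).flatMap(findProblems)
console.log(problems)
```

Run against each built page in CI, a non-empty problem list would fail the deploy before broken structured data reaches production.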
The motion layer is worth it, but expensive. Hand-built transitions mean hand-maintained transitions. Every new page type needs scroll handling, touch handling, and iOS edge cases. A framework like Framer Motion would have saved time, but it wouldn't have given me scroll-driven circle grids.
Try it
Open www.blackbook.dk/en/, click the orb in the bottom right, and ask it anything. Ask about fintech, ask about a specific client, ask it to surprise you.
Or explore the knowledge graph directly at blackbook.dk/en/cv. It is the same data, visualised as a force-directed network.
The source of truth is one Sanity dataset. The AI, the site, and the knowledge graph are three views of the same content. That is the architecture.
Built with Next.js 16, Sanity, Claude Haiku, and way too many requestAnimationFrame callbacks.




