Every major SaaS launched an "AI layer" in 2025. Notion AI, ClickUp AI, Asana Intelligence, Monday AI.
They all have the same quiet architectural problem: they're each scoped to one module.
Notion AI reads your Notion documents. ClickUp AI reads your ClickUp tasks. When you ask either one "what should I focus on today?" — neither can actually answer. The real answer requires your task load, pipeline, calendar, and team capacity simultaneously. That data lives in different databases, owned by different companies.
We built Kobin to solve that specific problem. This is how we approached it.
The core architectural decision: one data model
The reason Notion AI, ClickUp AI, and Asana Intelligence are each scoped to one module isn't primarily a product decision — it's an architecture decision made years before anyone planned an AI layer.
Notion was built as a note-taking tool. ClickUp was built as a task manager. Each product built its own schema optimized for its own use case. When AI arrived, each company bolted an AI layer onto its existing data model. You get siloed AI because the data was siloed first.
We made the opposite decision from day one: every module in Kobin shares one Supabase data model. Tasks, projects, contacts, messages, vault files, calendar events, and team members all share the same foreign keys.
When the AI queries "what tasks are due today for this user," it can also join to the CRM (contacts table) for any contacts linked to those projects, and join to calendar_events for what's on their schedule — in one query, with proper relational joins.
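The value of the shared data model is easiest to see in miniature. The sketch below uses hypothetical row shapes and field names (the post only names the tables) to show how shared foreign keys let one query answer "what should I focus on today?" without crossing a service boundary.

```typescript
// Illustrative only: hypothetical rows showing how shared foreign keys
// let one lookup span tasks, CRM contacts, and calendar events.
type Task = { id: string; projectId: string; title: string; dueDate: string };
type Contact = { id: string; projectId: string; name: string; stage: string };
type CalendarEvent = { id: string; userId: string; title: string; start: string };

const tasks: Task[] = [
  { id: "t1", projectId: "p1", title: "Send revised proposal", dueDate: "2025-06-02" },
];
const contacts: Contact[] = [
  { id: "c1", projectId: "p1", name: "Ahmed", stage: "negotiation" },
];
const events: CalendarEvent[] = [
  { id: "e1", userId: "u1", title: "Call with Ahmed", start: "2025-06-02T15:00" },
];

// "What should I focus on today?" resolved with plain joins on shared keys.
function dailyFocus(userId: string, today: string) {
  const due = tasks.filter((t) => t.dueDate === today);
  return due.map((t) => ({
    task: t.title,
    contact: contacts.find((c) => c.projectId === t.projectId)?.name ?? null,
    meetings: events.filter((e) => e.userId === userId && e.start.startsWith(today)),
  }));
}
```

In production this is a single SQL query with relational joins; the in-memory version just makes the shape of the join visible.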
The architectural decision made the AI possible. It also made the product significantly harder to build.
The AI layer: 8 read tools, 5 write tools
The AI command bar (triggered via ⌘K anywhere in the workspace) runs on Claude Sonnet with a tool-use loop. Tools are defined using Anthropic's tool-use schema:
Read tools:
- `list_tasks` — all tasks across projects, filterable by assignee, priority, due date, and status
- `list_projects` — active and archived projects with status, client link, and completion percentage
- `get_crm_contacts` — full CRM pipeline with deal value, stage, and last interaction date
- `search_vault` — semantic search via pgvector across all uploaded documents
- `get_inbox_messages` — recent messages across all project rooms and DMs
- `get_calendar_events` — upcoming events with Google Meet links and attendees
- `get_team_workload` — live task count and priority distribution per team member
- `get_meeting_history` — past meeting summaries and extracted action items
Write tools:
- `create_task` — creates a task with full context (project, assignee, priority, due date, notes)
- `update_crm_contact` — updates pipeline stage, follow-up date, or adds a meeting outcome
- `draft_message` — composes a message for human review before sending
- `log_meeting_outcome` — records an outcome against a calendar event in the CRM
- `create_project_note` — adds a note to a vault project folder
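A tool definition in Anthropic's tool-use format pairs a name and description with a JSON Schema for the inputs. The sketch below shows one read tool and one write tool; the field names follow the public Messages API, but the schemas themselves (parameter names, the priority enum) are illustrative, not Kobin's actual definitions.

```typescript
// Sketch of two tool definitions in Anthropic's tool-use format.
// input_schema contents are assumptions for illustration.
const tools = [
  {
    name: "list_tasks",
    description:
      "All tasks across projects, filterable by assignee, priority, due date, status.",
    input_schema: {
      type: "object",
      properties: {
        assignee_id: { type: "string" },
        priority: { type: "string", enum: ["low", "medium", "high", "urgent"] },
        due_before: { type: "string", format: "date" },
        status: { type: "string" },
      },
    },
  },
  {
    name: "create_task",
    description: "Creates a task. Requires user confirmation before committing.",
    input_schema: {
      type: "object",
      properties: {
        project_id: { type: "string" },
        assignee_id: { type: "string" },
        title: { type: "string" },
        priority: { type: "string" },
        due_date: { type: "string", format: "date" },
        notes: { type: "string" },
      },
      required: ["project_id", "title"],
    },
  },
];
```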
The agentic loop runs a maximum of 4 steps. A prompt like "brief me for my 3pm call with Ahmed" triggers:
1. get_calendar_events → identify the meeting at 3pm
2. get_crm_contacts → fetch Ahmed's record, deal stage, last contact
3. list_tasks → find open tasks linked to Ahmed's project
4. search_vault → find relevant docs (proposals, briefs, brand guidelines)
→ Synthesize into a structured pre-meeting brief
The key constraint: write tools never execute without explicit user confirmation. We show a preview card in the inbox room — "Create this task?" — before anything is committed. Agentic AI that silently writes to production data destroys trust fast.
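The loop and the confirmation constraint can be sketched together. This is a simplified model, not Kobin's implementation: the "plan" stands in for the model's tool requests, the step cap is enforced by truncation, and write tools are staged for a preview card instead of executing. The tool names come from the post; everything else is illustrative.

```typescript
// Minimal sketch: read tools run immediately, write tools are staged
// for explicit confirmation, and the loop caps at 4 steps.
const WRITE_TOOLS = new Set([
  "create_task", "update_crm_contact", "draft_message",
  "log_meeting_outcome", "create_project_note",
]);

type ToolCall = { tool: string; input: Record<string, unknown> };
type StagedWrite = ToolCall & { status: "awaiting_confirmation" };

function runLoop(plan: ToolCall[], maxSteps = 4) {
  const reads: ToolCall[] = [];
  const staged: StagedWrite[] = [];
  for (const call of plan.slice(0, maxSteps)) {
    if (WRITE_TOOLS.has(call.tool)) {
      // Never executed directly: surfaced as a preview card ("Create this task?").
      staged.push({ ...call, status: "awaiting_confirmation" });
    } else {
      reads.push(call); // read tools execute immediately
    }
  }
  return { reads, staged, stepsUsed: Math.min(plan.length, maxSteps) };
}
```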
Semantic search with pgvector
The vault stores every uploaded file — PDFs, DOCX, images, code files — and runs server-side extraction on upload. We use pdf-parse for PDFs and mammoth for DOCX files. The extracted text is chunked at 512 tokens with 64-token overlap and embedded using OpenAI's text-embedding-3-small.
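The chunking step looks roughly like this. One caveat: a real pipeline counts model tokens with a proper tokenizer; whitespace-separated words stand in for tokens here to keep the sketch self-contained.

```typescript
// Chunking sketch: 512-"token" windows with 64-token overlap.
// Words approximate tokens; a real pipeline would use a tokenizer.
function chunkText(text: string, size = 512, overlap = 64): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = size - overlap; // 448-token stride between window starts
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // final window reached the end
  }
  return chunks;
}
```

The 64-token overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, so its embedding isn't split mid-thought.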
Embeddings are stored in a pgvector column on the vault_items table:
```sql
ALTER TABLE vault_items ADD COLUMN embedding vector(1536);
```
At query time, we embed the user's query and run a cosine similarity search:
```sql
SELECT
  id, title, project_id, file_type,
  1 - (embedding <=> $1::vector) AS similarity
FROM vault_items
WHERE project_id = ANY($2)
  AND 1 - (embedding <=> $1::vector) > 0.72
ORDER BY embedding <=> $1::vector
LIMIT 5;
The 0.72 threshold is the result of tuning against real agency document searches — below that, too many false positives from tangentially related content. Above it, results are consistently relevant.
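For readers less familiar with pgvector: `<=>` is the cosine distance operator, so `1 - distance` recovers cosine similarity. A plain-code equivalent of the SQL filter, with the same 0.72 cutoff:

```typescript
// pgvector's `<=>` is cosine distance; 1 - distance is cosine similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Same cutoff as the WHERE clause in the SQL query.
const passes = (similarity: number) => similarity > 0.72;
```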
This means "find the brand guidelines from the Q3 rebrand" returns the right document even if it was saved as final_v3_APPROVED.pdf. Filename irrelevant. Content is everything.
Proactive AI: the 8am brief
Reactive AI (command bar on demand) is the obvious piece. Proactive AI is where behaviour actually changes.
Every morning at 8am, a Supabase Edge Function runs across all active workspaces. For each user, it:
- Pulls tasks due today and overdue tasks, sorted by urgency
- Pulls calendar events for the day with meeting links
- Queries CRM contacts with `follow_up_date <= NOW()`, sorted by deal value descending
- Checks for tasks with `status = 'blocked'` across all projects
- Generates a brief via Claude Sonnet using all of the above as context
- Delivers it as an inbox message in the user's Kobin workspace
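The context-assembly steps above reduce to two sorted filters. Row shapes and a numeric priority field are assumptions for illustration; the sorting rules (urgency for tasks, deal value descending for follow-ups) come from the post.

```typescript
// Sketch of the brief's context assembly. Field names are illustrative.
type BriefTask = { title: string; due: string; priority: number }; // higher = more urgent
type FollowUp = { contact: string; dealValue: number; followUpDate: string };

function assembleBriefContext(tasks: BriefTask[], followUps: FollowUp[], today: string) {
  const dueOrOverdue = tasks
    .filter((t) => t.due <= today)          // due today or overdue
    .sort((a, b) => b.priority - a.priority); // most urgent first
  const dueFollowUps = followUps
    .filter((f) => f.followUpDate <= today)   // follow_up_date <= NOW()
    .sort((a, b) => b.dealValue - a.dealValue); // biggest deals first
  return { dueOrOverdue, dueFollowUps };
}
```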
No email, no push notification, no new app to check. The brief lands in the same place they check everything else. No additional context switch.
Pre-meeting briefs follow the same pattern: a cron job runs every 5 minutes, checks for calendar events starting in the next 10 minutes without an existing brief, generates one using the same tool chain as above.
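The cron's selection logic is a narrow window filter. Event shape and the epoch-millisecond timestamps are assumptions; the 10-minute window and the "no existing brief" check come from the post.

```typescript
// The 5-minute cron's check: events starting within the next 10 minutes
// that don't yet have a brief. Shapes are illustrative.
type UpcomingEvent = { id: string; startsAt: number; hasBrief: boolean }; // epoch ms

function eventsNeedingBrief(events: UpcomingEvent[], now: number): UpcomingEvent[] {
  const windowEnd = now + 10 * 60 * 1000;
  return events.filter(
    (e) => !e.hasBrief && e.startsAt > now && e.startsAt <= windowEnd
  );
}
```

Running every 5 minutes against a 10-minute window means each event is checked at least once before it starts, and the `hasBrief` flag keeps the overlap from generating duplicates.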
Gmail integration: real-time lead detection
The CRM intelligence layer connects to Gmail via Pub/Sub push notifications — no polling. When a new email thread arrives from an address matching a CRM contact:
- Thread fetched via Gmail API
- Sent to Claude Sonnet for intent classification: `interested | not_interested | requesting_info | requesting_meeting | following_up | objection | ready_to_close`
- Sentiment (`positive | neutral | negative`) and urgency (`high | medium | low`) extracted
- Suggested pipeline stage returned with reasoning
- Lead score updated with a delta from -20 to +20
- Action items extracted and staged as task drafts for one-tap creation
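Applying a classification result safely means clamping the model's delta to the stated range and rejecting intents outside the closed set. The intent list and the -20/+20 delta come from the post; the 0–100 score bounds are an assumption.

```typescript
// Clamp the model's lead-score delta to [-20, +20] and validate intents
// against the closed set. Score bounds (0-100) are an assumption.
const INTENTS = new Set([
  "interested", "not_interested", "requesting_info",
  "requesting_meeting", "following_up", "objection", "ready_to_close",
]);

function applyLeadScoreDelta(score: number, delta: number): number {
  const clamped = Math.max(-20, Math.min(20, delta)); // never trust a raw delta
  return Math.max(0, Math.min(100, score + clamped));
}

function isValidIntent(intent: string): boolean {
  return INTENTS.has(intent);
}
```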
For unknown senders with apparent business intent, a contact is auto-created with a review card in the inbox — "New lead detected from sarah@bloomcreative.co. Add to CRM?" — rather than silent auto-creation. The one-tap confirmation makes adoption much higher than requiring manual entry.
What we learned
Cross-module AI is only possible if you design for it from the start. You can't bolt it onto a siloed architecture. Every module needing to share data means every schema decision is shared — which requires product discipline that's harder to maintain than separate schemas.
Proactive beats reactive for daily workflow adoption. The command bar is useful when users remember to open it. The 8am brief with everything pre-synthesized is the feature our users mention first when describing what changed about their day.
Write access makes AI genuinely useful. Read-only AI that answers questions is a smarter search bar. AI that creates the task, assigns the right person based on live workload data, and logs the meeting outcome in the CRM is something people restructure their workflows around.
Confirmation before write is non-negotiable. Any agent that silently modifies production data will eventually do something the user didn't intend. The cost of a confirmation tap is zero. The cost of an unintended CRM update is a broken client relationship.
If you're building in the agency ops space or working on multi-module AI context problems, happy to discuss architecture in the comments.
Kobin is at kobin.team — the full AI workspace breakdown is at https://www.kobin.team/ai-workspace if you want to see how the technical layer is presented to non-technical users.
Full docs at kobin.team/docs.