Achu Shankar
Building Syllabi – Agentic AI with Vercel AI SDK, Dynamic Tool Loading, and RAG


After 6 months of building, I just launched Syllabi – an open-source platform for creating agentic chatbots that can integrate with ANY tool, search knowledge bases, and deploy across channels.

TL;DR: Built with Vercel AI SDK, dynamic tool selection (semantic vs. direct), modular skills system, and real-time RAG. It's open-source (MIT) and ready for self-hosting.

🔗 Website: https://www.syllabi-ai.com/

GitHub: https://github.com/Achu-shankar/Syllabi


The Problem I Was Solving

Every AI project I worked on needed two things:

  1. Answer questions from knowledge bases (RAG)
  2. Take actions (send Slack messages, create tickets, call APIs)

But building both from scratch is tedious. Existing solutions either lock you into their cloud or don't support agentic tool use properly.

So I built Syllabi: an open-source platform where you can:

  • Transform docs/videos/websites into a knowledge base
  • Add "skills" (integrations + webhooks) for taking actions
  • Deploy to web, Slack, Discord, or custom channels
  • Let the AI decide which tools to use (agentic behavior)

Tech Stack

Frontend (Where Most Magic Happens)

  • Next.js 15 (App Router) with TypeScript
  • Vercel AI SDK v5 for streaming, tool calling, and embeddings
  • Supabase (PostgreSQL + pgvector) for data and vector search
  • TailwindCSS for UI

Backend (Document Processing)

  • Python FastAPI for API endpoints
  • Celery + Redis for async job queue
  • PyMuPDF, pdfplumber for PDF parsing
  • Whisper API for video/audio transcription

Key Insight: Most AI logic lives in Next.js API routes, not a separate AI backend. The Python backend is specifically for heavy document processing (PDFs, videos, audio).
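
To make that split concrete, here is a minimal sketch (not the actual Syllabi code) of how a Next.js route could hand a freshly uploaded document off to the Python service. The /process endpoint, payload shape, and DOC_PROCESSOR_URL variable are assumptions for illustration:

// app/api/documents/route.ts – hypothetical sketch of delegating heavy processing
// to the Python/Celery backend; endpoint and payload are illustrative assumptions
import { NextResponse } from 'next/server';

export async function POST(req: Request) {
  const { chatbotId, fileUrl, contentType } = await req.json();

  // Hand the file off to the FastAPI service, which enqueues a Celery job
  // (parsing, transcription, chunking, embedding) and returns immediately
  const res = await fetch(`${process.env.DOC_PROCESSOR_URL}/process`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ chatbot_id: chatbotId, file_url: fileUrl, content_type: contentType })
  });

  if (!res.ok) {
    return NextResponse.json({ error: 'Failed to enqueue processing job' }, { status: 502 });
  }

  // The chat route never waits on this – chunks show up in pgvector when the job finishes
  return NextResponse.json(await res.json(), { status: 202 });
}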


Challenge 1: Agentic Tool Use with Vercel AI SDK

The Problem: How do you let AI decide which tools to use WITHOUT overwhelming the model with 50+ tool definitions?

My Solution: Dynamic Tool Selection

I built two strategies:

1. Direct Method (for <15 skills)

Load all skills directly into the AI's tool list.

const skills = await getActiveSkillsForChatbot(chatbotId);

// streamText expects tools as a record keyed by tool name, so build an object rather than an array
const tools = Object.fromEntries(
  skills.map(skill => [
    skill.name,
    tool({
      description: skill.description,
      parameters: convertJsonSchemaToZod(skill.function_schema.parameters),
      execute: async (params) => executeSkill(skill, params, context)
    })
  ])
);

2. Semantic Retrieval Method (for 15+ skills)

Use vector search to find only the skills relevant to the user's query.

async function getSemanticSkills(
  chatbotId: string,
  userQuery: string,
  maxTools: number
): Promise<Skill[]> {
  // Vector search skills based on user query
  const relevantSkills = await searchChatbotSkills(
    userQuery, 
    chatbotId, 
    maxTools || 5
  );

  console.log(`Found ${relevantSkills.length} relevant skills via semantic search`);
  return relevantSkills;
}

Optimal Selection Logic

The system automatically chooses the best method:

export async function getOptimalToolSelectionConfig(
  chatbotId: string,
  userQuery?: string
): Promise<ToolSelectionConfig> {
  const skills = await getActiveSkillsForChatbot(chatbotId);
  const skillCount = skills.length;

  if (skillCount <= 5) {
    // Few skills: use direct
    return { method: 'direct', maxTools: skillCount };
  } else if (skillCount <= 15) {
    // Medium: direct with limit
    return { method: 'direct', maxTools: 10 };
  } else {
    // Many skills: semantic retrieval
    return {
      method: 'semantic_retrieval',
      maxTools: 10,
      semanticQuery: userQuery
    };
  }
}

Why This Works:

  • Small chatbots (≤5 skills): AI sees all tools, no performance hit
  • Medium chatbots (6–15 skills): direct loading, capped at 10 tools
  • Large chatbots (16+ skills): vector search finds only the relevant tools

Lesson Learned: Don't pass 50 tool definitions to GPT-4. Either limit by usage stats or use semantic retrieval. Context window isn't the issue – comprehension is!
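
To show how the config feeds into tool loading, here's a simplified sketch of the glue code in the chat route. buildToolsForRequest is a hypothetical wrapper name; the functions it calls are the ones shown above:

// Hypothetical glue: pick a selection strategy, then build the tools record for streamText
async function buildToolsForRequest(
  chatbotId: string,
  userQuery: string,
  context: SkillExecutionContext
) {
  const config = await getOptimalToolSelectionConfig(chatbotId, userQuery);

  const skills = config.method === 'semantic_retrieval'
    ? await getSemanticSkills(chatbotId, userQuery, config.maxTools)
    : (await getActiveSkillsForChatbot(chatbotId)).slice(0, config.maxTools);
    // (the real direct path could rank by usage stats before applying the cap)

  // Same record shape that streamText({ tools }) expects
  return Object.fromEntries(
    skills.map(skill => [
      skill.name,
      tool({
        description: skill.description,
        parameters: convertJsonSchemaToZod(skill.function_schema.parameters),
        execute: async (params) => executeSkill(skill, params, context)
      })
    ])
  );
}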


Challenge 2: Building a Modular Skills System

The Problem: How do you support both built-in integrations (Slack, Gmail, Discord) AND custom user webhooks with the same architecture?

My Solution: Skills Registry + Executor Pattern

// Built-in skills registry
const BUILTIN_SKILLS_REGISTRY: Record<string, Function> = {
  // Slack skills
  slack_send_message: slackSendMessage,
  slack_list_users: slackListUsers,
  slack_create_reminder: slackCreateReminder,

  // Discord skills  
  discord_send_message: discordSendMessage,

  // Gmail skills
  gmail_send_email: gmailSendEmail,

  // Google Calendar skills
  google_calendar_create_event: googleCalendarCreateEvent,

  // ... 50+ built-in skills
};

// Executor routes to correct implementation
export async function executeSkill(
  skill: Skill,
  parameters: Record<string, any>,
  context: SkillExecutionContext
): Promise<SkillExecutionResult> {
  switch (skill.type) {
    case 'builtin': {
      // Execute from registry
      const skillFunction = BUILTIN_SKILLS_REGISTRY[skill.name];
      if (!skillFunction) {
        throw new Error(`Built-in skill not found in registry: ${skill.name}`);
      }
      return await skillFunction(parameters, context);
    }

    case 'custom':
      // Execute user's webhook
      return await executeCustomSkill(skill, parameters);

    default:
      throw new Error(`Unknown skill type: ${skill.type}`);
  }
}

Custom Webhook Skills

Users can define custom skills via webhooks:

async function executeCustomSkill(
  skill: Skill, 
  parameters: Record<string, any>
): Promise<SkillExecutionResult> {
  const config = skill.webhook_config;

  const response = await fetch(config.url, {
    method: config.method || 'POST',
    headers: {
      'Content-Type': 'application/json',
      ...config.headers
    },
    body: JSON.stringify(parameters),
    signal: AbortSignal.timeout(config.timeout_ms || 30000)
  });

  return {
    success: response.ok,
    data: await response.json()
  };
}

Example User Webhook Skill:

{
  "name": "create_jira_ticket",
  "description": "Create a Jira ticket for bug reports",
  "webhook_config": {
    "url": "https://my-api.com/create-ticket",
    "method": "POST",
    "headers": {
      "Authorization": "Bearer YOUR_TOKEN"
    }
  },
  "function_schema": {
    "parameters": {
      "type": "object",
      "properties": {
        "title": { "type": "string" },
        "description": { "type": "string" },
        "priority": { "type": "string", "enum": ["low", "medium", "high"] }
      },
      "required": ["title", "description"]
    }
  }
}

Lesson Learned: Separating "builtin" and "custom" at the executor level (not the AI level) means the AI doesn't care HOW a skill works – it just calls it. This makes adding new integrations trivial.
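
For context, every entry in BUILTIN_SKILLS_REGISTRY is just a function with the shared (parameters, context) signature. Here's an illustrative sketch of what one might look like – getSlackClientForIntegration is a hypothetical helper, not the shipped implementation:

// Illustrative shape of a registry entry; the Slack client helper is assumed
async function slackSendMessage(
  parameters: Record<string, any>,
  context: SkillExecutionContext
): Promise<SkillExecutionResult> {
  // Resolve a client from the integration attached to this chatbot (see Challenge 6)
  const slack = await getSlackClientForIntegration(context.integrationId);

  // Standard @slack/web-api call
  const result = await slack.chat.postMessage({
    channel: parameters.channel,
    text: parameters.message
  });

  return {
    success: result.ok === true,
    data: { ts: result.ts, channel: result.channel }
  };
}

Because executeSkill only cares about that signature, adding a new integration is mostly a matter of writing functions like this and registering them.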


Challenge 3: JSON Schema → Zod Conversion for AI SDK

The Problem: Vercel AI SDK uses Zod for parameter validation, but storing Zod schemas in a database is impractical. Users need a simple JSON format.

My Solution: Dynamic Zod Conversion

Store skills as JSON Schema in the database, convert to Zod at runtime:

export function convertJsonSchemaToZod(jsonSchema: any): z.ZodObject<any> {
  if (!jsonSchema || !jsonSchema.properties) {
    return z.object({});
  }

  const zodFields: Record<string, z.ZodType> = {};
  const required = jsonSchema.required || [];

  Object.entries(jsonSchema.properties).forEach(([key, prop]: [string, any]) => {
    let zodType = convertPropertyToZod(prop);

    // Make optional if not in required array
    if (!required.includes(key)) {
      zodType = zodType.optional();
    }

    zodFields[key] = zodType;
  });

  return z.object(zodFields);
}

function convertPropertyToZod(property: any): z.ZodType {
  const { type, description, format, enum: enumValues } = property;

  switch (type) {
    case 'string': {
      // Check enums first so a provided description isn't dropped
      if (enumValues) {
        const enumSchema = z.enum(enumValues as [string, ...string[]]);
        return description ? enumSchema.describe(description) : enumSchema;
      }

      let stringSchema = z.string();

      if (description) stringSchema = stringSchema.describe(description);

      if (format === 'email') stringSchema = stringSchema.email();
      else if (format === 'url') stringSchema = stringSchema.url();
      else if (format === 'date-time') stringSchema = stringSchema.datetime();

      return stringSchema;
    }

    case 'number':
    case 'integer':
      return z.number().describe(description);

    case 'boolean':
      return z.boolean().describe(description);

    case 'array':
      if (property.items) {
        return z.array(convertPropertyToZod(property.items)).describe(description);
      }
      return z.array(z.any()).describe(description);

    case 'object':
      if (property.properties) {
        return convertJsonSchemaToZod(property).describe(description);
      }
      return z.object({}).describe(description);

    default:
      return z.any().describe(description);
  }
}

Usage in AI SDK:

const parameters = convertJsonSchemaToZod(skill.function_schema.parameters);

tools[skill.name] = tool({
  description: skill.description,
  parameters,  // Zod schema
  execute: async (params) => {
    // Vercel AI SDK validates params automatically
    return await executeSkill(skill, params, context);
  }
});

Lesson Learned: Storing JSON Schema in the database is much more flexible than Zod. Users can define skills via UI or API without writing TypeScript. Convert to Zod at runtime for type safety.
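
As a quick sanity check, the Jira webhook skill from the previous section converts and validates like this (the sample arguments are made up):

// Convert the stored JSON Schema once, then let Zod validate tool arguments at runtime
const jiraParams = convertJsonSchemaToZod({
  type: 'object',
  properties: {
    title: { type: 'string' },
    description: { type: 'string' },
    priority: { type: 'string', enum: ['low', 'medium', 'high'] }
  },
  required: ['title', 'description']
});

// "priority" is optional because it isn't in the required array
jiraParams.parse({ title: 'Login broken', description: 'OAuth redirect loops forever' });

// Missing "description" fails validation before the webhook is ever called
const bad = jiraParams.safeParse({ title: 'Login broken' });
console.log(bad.success); // false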


Challenge 4: RAG with Supabase & Vector Search

The Problem: Different document types (PDFs, videos, websites) need different retrieval strategies. One-size-fits-all RAG fails.

My Solution: Enhanced RPC Functions with Content Type Filtering

Supabase RPC for Vector Search:

CREATE OR REPLACE FUNCTION match_document_chunks_enhanced(
  query_embedding vector(1536),
  chatbot_id_param uuid,
  match_threshold float DEFAULT 0.2,
  match_count int DEFAULT 10,
  content_types text[] DEFAULT ARRAY['document', 'url', 'video', 'audio'],
  max_per_content_type int DEFAULT NULL
)
RETURNS TABLE (
  chunk_id uuid,
  reference_id uuid,
  chunk_text text,
  page_number int,
  similarity float,
  content_type text,
  start_time_seconds float,
  end_time_seconds float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    ranked.chunk_id,
    ranked.reference_id,
    ranked.chunk_text,
    ranked.page_number,
    ranked.similarity,
    ranked.content_type,
    ranked.start_time_seconds,
    ranked.end_time_seconds
  FROM (
    SELECT
      dc.id AS chunk_id,
      dc.reference_id,
      dc.chunk_text,
      dc.page_number,
      1 - (dc.embedding <=> query_embedding) AS similarity,
      r.content_type,
      dc.start_time_seconds,
      dc.end_time_seconds,
      ROW_NUMBER() OVER (
        PARTITION BY r.content_type
        ORDER BY dc.embedding <=> query_embedding
      ) AS type_rank
    FROM document_chunks dc
    JOIN chatbot_content_sources r ON dc.reference_id = r.id
    WHERE r.chatbot_id = chatbot_id_param
      AND r.content_type = ANY(content_types)
      AND (1 - (dc.embedding <=> query_embedding)) > match_threshold
  ) ranked
  -- Cap results per content type when max_per_content_type is provided
  WHERE max_per_content_type IS NULL OR ranked.type_rank <= max_per_content_type
  ORDER BY ranked.similarity DESC
  LIMIT match_count;
END;
$$;

AI SDK Tool for RAG:

tools: {
  getRelevantDocuments: tool({
    description: 'Get information from the chatbot\'s knowledge base.',
    parameters: z.object({
      query: z.string().describe('Search query for finding relevant documents.'),
      contentTypes: z.array(z.enum(['document', 'url', 'video', 'audio'])).optional(),
      maxPerType: z.number().optional()
    }),
    execute: async ({ query, contentTypes, maxPerType }) => {
      // 1. Generate embedding using AI SDK
      const { embedding } = await embed({
        model: openai.embedding('text-embedding-3-small'),
        value: query
      });

      // 2. Vector search in Supabase
      const { data, error } = await supabase.rpc('match_document_chunks_enhanced', {
        query_embedding: embedding,
        chatbot_id_param: chatbotId,
        match_threshold: 0.2,
        match_count: 10,
        content_types: contentTypes || ['document', 'url', 'video', 'audio'],
        max_per_content_type: maxPerType || null
      });

      if (error) {
        return { error: `Failed to retrieve documents: ${error.message}` };
      }

      // 3. Return chunks with metadata
      return { 
        documents: data.map(chunk => ({
          content: chunk.chunk_text,
          page_number: chunk.page_number,
          similarity: chunk.similarity,
          content_type: chunk.content_type,
          start_time_seconds: chunk.start_time_seconds, // for videos
          end_time_seconds: chunk.end_time_seconds
        }))
      };
    }
  })
}

Why This Works:

  • Single query for all content types or filter by type
  • Multimedia timestamps preserved (click citation → jump to video timestamp)
  • pgvector handles cosine similarity efficiently
  • Embeddings are generated with the AI SDK right in the Next.js route (no separate embedding service)

Lesson Learned: Don't build a separate vector DB. Supabase's pgvector extension + RPC functions is perfect for RAG. Keep embeddings and metadata in one place!
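
A nice side effect of keeping timestamps in the chunk metadata is that citations for video/audio sources can deep-link to the exact moment. A rough sketch – the URL format and helper are assumptions, not the actual Syllabi code:

// Hypothetical citation formatter: turns a retrieved chunk into a label plus link
function formatCitation(
  chunk: { content_type: string; page_number: number | null; start_time_seconds: number | null },
  sourceUrl: string
): { label: string; href: string } {
  if ((chunk.content_type === 'video' || chunk.content_type === 'audio') && chunk.start_time_seconds != null) {
    const t = Math.floor(chunk.start_time_seconds);
    // "?t=123" works for YouTube-style players; the real format depends on the host
    return {
      label: `at ${Math.floor(t / 60)}:${String(t % 60).padStart(2, '0')}`,
      href: `${sourceUrl}?t=${t}`
    };
  }
  return { label: `p. ${chunk.page_number ?? 1}`, href: sourceUrl };
}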


Challenge 5: Streaming with Tool Calls

The Problem: Vercel AI SDK's streamText can execute tools mid-stream, but you need to handle the flow carefully.

My Solution: Multi-Step Tool Execution

const result = streamText({
  model: openai(modelToUse),
  system: systemPrompt,
  messages,
  temperature: 0.7,
  maxSteps: 5, // Allow up to 5 tool calls in sequence
  tools: {
    getRelevantDocuments,
    slack_send_message,
    gmail_send_email,
    // ... all other skills
  },
  experimental_activeTools: [
    'getRelevantDocuments',
    'listAvailableDocuments',
    ...skillNames // Dynamically loaded skill names
  ],
  onFinish: async ({ response, usage }) => {
    // Save assistant message with token usage
    await saveOrUpdateChatMessages(
      userId,
      sessionId,
      chatbotSlug,
      response.messages,
      usage.totalTokens
    );
  }
});

result.mergeIntoDataStream(dataStream, {
  sendReasoning: true // Show tool call reasoning to user
});

What maxSteps: 5 Does:

  • AI can call a tool, see the result, and call another tool
  • Example flow:
    1. User: "Email the sales team about Q4 targets from our docs"
    2. AI calls getRelevantDocuments(query: "Q4 targets")
    3. AI reads results
    4. AI calls slack_list_users(exclude_bots: true) to find sales team
    5. AI calls gmail_send_email(to: [...], subject: "Q4 Targets", body: "...")

Lesson Learned: maxSteps is critical for agentic behavior. Without it, the AI can only call ONE tool per turn. With it, the AI can chain tools together (RAG → action).


Challenge 6: Integration Auto-Detection

The Problem: When a skill needs Slack/Discord/Gmail credentials, how do you know which integration to use if the chatbot has multiple?

My Solution: Automatic Integration Lookup

async function ensureIntegrationId(
  skill: { name: string }, 
  context: SkillExecutionContext
): Promise<SkillExecutionContext> {
  if (context.integrationId) {
    return context; // Already provided
  }

  // Detect integration type from skill name
  let integrationType: string | null = null;
  if (skill.name.startsWith('slack_')) integrationType = 'slack';
  else if (skill.name.startsWith('discord_')) integrationType = 'discord';
  else if (skill.name.startsWith('gmail_')) integrationType = 'google';

  if (!integrationType) {
    return context; // No integration needed
  }

  // Look up integration ID for this chatbot
  const integrationId = await getIntegrationIdForChatbot(
    context.chatbotId,
    integrationType
  );

  if (!integrationId) {
    throw new Error(
      `No active ${integrationType} integration found. ` +
      `Please connect ${integrationType} in chatbot settings.`
    );
  }

  return { ...context, integrationId };
}

Why This Works:

  • Skills just declare they need "Slack" – no hardcoded integration IDs
  • System automatically finds the correct integration for the chatbot
  • If multiple integrations exist, uses most recent (with warning)

Lesson Learned: Don't make users pass integration IDs manually. Infer it from context (chatbot + skill type) and handle it automatically!
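
The lookup behind getIntegrationIdForChatbot can be a single Supabase query that prefers the most recently connected integration. The table and column names below are assumptions, not the actual schema:

// Hypothetical implementation – assumed table/column names
async function getIntegrationIdForChatbot(
  chatbotId: string,
  integrationType: string
): Promise<string | null> {
  const { data, error } = await supabase
    .from('chatbot_integrations')
    .select('id')
    .eq('chatbot_id', chatbotId)
    .eq('type', integrationType)
    .eq('status', 'active')
    .order('created_at', { ascending: false });

  if (error || !data || data.length === 0) return null;

  if (data.length > 1) {
    console.warn(`Multiple ${integrationType} integrations found; using the most recent one`);
  }

  return data[0].id;
}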


Architecture Overview

Here's how everything fits together:

┌─────────────────────────────────────────┐
│         Frontend (Next.js API)           │
│  ┌────────────────────────────────────┐ │
│  │  /api/chat/route.ts                │ │
│  │  - Vercel AI SDK (streamText)      │ │
│  │  - Dynamic tool loading            │ │
│  │  - Streaming responses             │ │
│  └────────────────────────────────────┘ │
│                    │                     │
│         ┌──────────┼──────────┐          │
│         │          │          │          │
│  ┌──────▼───┐ ┌───▼────┐ ┌───▼──────┐  │
│  │   RAG    │ │ Skills │ │   User   │  │
│  │  Tools   │ │  Tools │ │  Message │  │
│  └──────────┘ └────────┘ └──────────┘  │
└─────────────────────────────────────────┘
             │              │
       ┌─────▼────┐   ┌─────▼──────┐
       │ Supabase │   │   OpenAI   │
       │ pgvector │   │     API    │
       └──────────┘   └────────────┘

┌─────────────────────────────────────────┐
│      Backend (Python FastAPI)            │
│  ┌────────────────────────────────────┐ │
│  │    Celery Worker (Async Queue)     │ │
│  │    - PDF processing                │ │
│  │    - Video transcription           │ │
│  │    - Audio transcription           │ │
│  │    - Embedding generation          │ │
│  └────────────────────────────────────┘ │
└─────────────────────────────────────────┘

Key Insight: The frontend handles all chat logic. The backend is a specialized service for heavy document processing.


What I'd Do Differently

  1. Start with fewer integrations – I built 50+ built-in skills upfront. Should have shipped with 5-10 and added more based on demand.

  2. Implement skill versioning earlier – When I update a built-in skill's schema, existing chatbots break. Need versioning!

  3. Add skill testing UI sooner – Users need to test webhooks before deploying. I added this late.

  4. Better error messages – When a skill fails (e.g., Slack token expired), the error should guide users to fix it.

  5. Rate limiting per skill – Currently rate-limited per chatbot. Should be per-skill to prevent abuse of expensive APIs.


Key Takeaways

  1. Vercel AI SDK is fantastic – Streaming, tool calling, and embeddings all in one package. Saved weeks of work.

  2. Dynamic tool selection is essential – Don't overwhelm the AI with 50 tool definitions. Use semantic retrieval or prioritize by usage.

  3. JSON Schema → Zod – Store schemas as JSON (database-friendly), convert to Zod at runtime (type-safe).

  4. Supabase pgvector is underrated – You don't need a separate vector DB. Supabase + RPC functions handle RAG beautifully.

  5. Agentic AI needs multi-step execution – maxSteps in Vercel AI SDK lets the AI chain tool calls (RAG → action).

  6. Modular skills system – Separate "builtin" vs "custom" at executor level, not AI level. Makes adding integrations easy.

  7. Auto-detect integrations – Don't make users pass integration IDs. Infer from context and handle automatically.


What's Next

Current priorities:

  • [ ] More AI models (Anthropic Claude, local models via Ollama)
  • [ ] Skill versioning system
  • [ ] Improved analytics (which skills are used most?)
  • [ ] Voice/audio support for chatbot responses
  • [ ] Collaborative features (team management, shared chatbots)

Try It Out

🔗 Website: https://www.syllabi-ai.com/

GitHub: https://github.com/Achu-shankar/Syllabi

📚 Docs: https://www.syllabi-ai.com/docs

Setup in minutes:

git clone https://github.com/Achu-shankar/Syllabi.git
cd Syllabi/frontend
npm install
cp .env.example .env.local
# Add your Supabase & OpenAI keys
npm run dev
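
Not sure which keys to set? A typical .env.local looks something like the following – the variable names may differ from the repo's .env.example, so treat this as a guide only:

# Illustrative only – check .env.example for the exact variable names
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
OPENAI_API_KEY=sk-...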

Honest disclaimer: Some features (Teams deployment, advanced analytics) are still being refined. Core functionality (RAG, skills, web/Slack/Discord deployment, self-hosting) is production-ready.


Let's Discuss!

Questions I'd love your input on:

  1. Tool selection strategies – Have you implemented agentic AI? How do you handle too many tools?

  2. Skills marketplace – Would a marketplace of pre-built skills/integrations be useful?

  3. Local models – Should I prioritize Anthropic Claude or local models (Llama, Mistral)?

  4. Skill testing – What's the best way to let users test webhooks before deploying?

Drop a comment! I'm happy to dive deeper into any of these topics or answer questions about the implementation.


Building in public is scary but rewarding. If you're working on something similar, let's connect! 🚀

P.S. If you found this helpful, a star on GitHub would mean the world!
