NeuroLink AI

Building a Slack AI Assistant with NeuroLink: From Prototype to Production

Internal support consumes engineering time. At Juspay, our 500+ engineers constantly asked questions and made requests like:

  • "What's the status of the Euler payment API?"
  • "How do I get credentials for the sandbox environment?"
  • "Who owns the HyperSDK Android module?"
  • "Deploy the latest Breeze release to staging"

These questions needed answers, but pulling engineers from deep work was expensive. We needed an AI assistant that could:

  • Answer questions using our internal knowledge
  • Execute actions (deployments, credential provisioning)
  • Remember conversation context across sessions
  • Integrate with our existing tools (Jira, Bitbucket, Kubernetes)

Meet Tara — our Slack AI assistant built with NeuroLink and Claude Sonnet.

Architecture Overview

┌──────────────┐     ┌──────────────┐     ┌─────────────────┐
│   Slack      │────▶│  Slack Bolt  │────▶│   Tara Service  │
│   Message    │     │   App        │     │   (FastAPI)     │
└──────────────┘     └──────────────┘     └────────┬────────┘
                                                   │
                          ┌────────────────────────┼────────────────────────┐
                          ▼                        ▼                        ▼
                   ┌──────────────┐      ┌─────────────────┐      ┌──────────────┐
                   │  NeuroLink   │      │  MCP Servers    │      │   Redis      │
                   │  SDK         │      │  - Jira         │      │   Memory     │
                   │  (Claude)    │      │  - Bitbucket    │      │              │
                   └──────────────┘      │  - K8s          │      └──────────────┘
                                          └─────────────────┘

Getting Started: The Prototype

Our first version was surprisingly simple. Here's the core loop:

import { NeuroLink } from "@juspay/neurolink";
import { App } from "@slack/bolt";

// Initialize NeuroLink with Claude
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    enableSummarization: true, // Auto-summarize long conversations
  },
});

// Slack Bolt app
const slack = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});

// Handle direct messages and mentions
slack.event("app_mention", async ({ event, say }) => {
  await handleMessage(event.user, event.text, say);
});

slack.event("message", async ({ event, say }) => {
  // Skip message subtypes (edits, bot messages) so Tara never replies to herself
  if (event.subtype !== undefined) return;
  if (event.channel_type === "im") {
    await handleMessage(event.user, event.text, say);
  }
});

Conversation Handling with Memory

The magic of Tara is maintaining context. NeuroLink's conversation memory handles this automatically:

async function handleMessage(
  userId: string,
  text: string,
  say: (text: string) => Promise<void>
) {
  // Stream the response for better UX
  const result = await neurolink.stream({
    input: { text },
    provider: "anthropic",
    model: "claude-4-sonnet",
    user: userId, // Enables per-user memory automatically
    system: `You are Tara, Juspay's AI assistant. You help engineers with:
             - Finding documentation and code
             - Checking deployment status
             - Answering questions about services
             - Creating Jira tickets and PRs

             Be concise and helpful. If you need to take action,
             use the available tools.`,
    enableOrchestration: true, // Allow tool use
  });

  // Stream chunks back to Slack, posting an interim update
  // roughly every 100 characters of new content
  let response = "";
  let lastSent = 0;
  for await (const chunk of result.stream) {
    if ("content" in chunk) {
      response += chunk.content;
      if (response.length - lastSent >= 100) {
        await say(response);
        lastSent = response.length;
      }
    }
  }

  // Post the final message (a production version would edit a single
  // message via chat.update instead of posting incremental replies)
  await say(response);
}

Adding Tool Capabilities

Tara becomes powerful when she can actually do things. We added MCP servers for our internal tools:

// Jira integration for ticket creation
await neurolink.addExternalMCPServer("jira", {
  transport: "stdio",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-jira"],
  env: {
    JIRA_TOKEN: process.env.JIRA_TOKEN,
    JIRA_HOST: "https://juspay.atlassian.net",
  },
});

// Kubernetes for deployment status
await neurolink.addExternalMCPServer("k8s", {
  transport: "stdio",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-kubernetes"],
});

// Internal API server (custom MCP)
await neurolink.addExternalMCPServer("juspay-api", {
  transport: "http",
  url: "https://internal-api.juspay.net/mcp",
  headers: {
    Authorization: `Bearer ${process.env.INTERNAL_API_TOKEN}`,
  },
});

Now users can say things like:

"Create a Jira ticket for the HyperSDK crash on Android"

And Tara will:

  1. Use the Jira tool to create the ticket
  2. Return the ticket URL
  3. Remember the ticket ID for follow-up questions
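
NeuroLink's conversation memory handles step 3 automatically, but the idea behind it can be sketched in plain TypeScript. The regex and in-memory store below are illustrative assumptions, not NeuroLink APIs:

```typescript
// Illustrative sketch of "remember the ticket ID": pull Jira-style keys
// out of a reply and stash them per user for follow-up questions.
const ticketPattern = /\b[A-Z][A-Z0-9]+-\d+\b/g;

function extractTicketIds(reply: string): string[] {
  return reply.match(ticketPattern) ?? [];
}

// Per-user scratch memory (NeuroLink uses its conversation store instead)
const recentTickets = new Map<string, string[]>();

function rememberTickets(userId: string, reply: string): void {
  const ids = extractTicketIds(reply);
  if (ids.length > 0) recentTickets.set(userId, ids);
}

rememberTickets("U123", "Created HYP-4521: HyperSDK crash on Android");
console.log(recentTickets.get("U123")); // → ["HYP-4521"]
```

A follow-up like "what's the status of that ticket?" can then be grounded without re-asking the user.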

Structured Commands with Zod

For common operations, we use structured output to ensure reliability:

import { z } from "zod";

const DeploymentRequest = z.object({
  service: z.enum(["euler", "breeze", "hyper-sdk", "neurolink"]),
  environment: z.enum(["dev", "staging", "prod"]),
  version: z.string(),
  confirm: z.boolean(),
});

async function handleDeploymentRequest(userId: string, text: string) {
  const result = await neurolink.generate({
    input: {
      text: `Parse this deployment request: "${text}"`,
    },
    provider: "anthropic",
    model: "claude-4-haiku",
    schema: DeploymentRequest,
    output: { format: "json" },
  });

  const deployment = result.parsed as z.infer<typeof DeploymentRequest>;

  if (!deployment.confirm) {
    return `You want to deploy ${deployment.service} v${deployment.version} to ${deployment.environment}. Confirm with "yes"?`;
  }

  // Execute deployment via MCP (orchestration lets the model call tools)
  await neurolink.generate({
    input: {
      text: `Deploy ${deployment.service} version ${deployment.version} to ${deployment.environment}`,
    },
    enableOrchestration: true,
  });

  return `✅ Deployment initiated for ${deployment.service} v${deployment.version} to ${deployment.environment}`;
}

Multi-Modal Support: Screenshots and Logs

Engineers often share screenshots of errors or paste log snippets. Tara handles these with NeuroLink's multimodal capabilities:

slack.event("message", async ({ event, say }) => {
  if (event.files && event.files.length > 0) {
    // Download files
    const filePaths = await downloadSlackFiles(event.files);

    const result = await neurolink.generate({
      input: {
        text: "What's in this screenshot? If it's an error, suggest fixes.",
        files: filePaths,
      },
      provider: "google-ai",
      model: "gemini-2.5-pro", // Vision-capable model
    });

    await say(result.content);
  }
});
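The downloadSlackFiles helper above is ours, not part of Bolt or NeuroLink. A minimal sketch, assuming Slack's url_private file URLs and a Node 18+ global fetch, might look like this (the SlackFile shape is trimmed to the fields we use):

```typescript
import { writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import * as path from "node:path";

interface SlackFile {
  id: string;
  name: string;
  url_private: string; // requires the bot token to fetch
}

// Strip directory components and odd characters so a user-supplied
// file name cannot escape the temp directory
function safeFileName(name: string): string {
  return path.basename(name).replace(/[^\w.\-]/g, "_");
}

async function downloadSlackFiles(files: SlackFile[]): Promise<string[]> {
  const localPaths: string[] = [];
  for (const file of files) {
    const res = await fetch(file.url_private, {
      headers: { Authorization: `Bearer ${process.env.SLACK_BOT_TOKEN}` },
    });
    const dest = path.join(tmpdir(), `${file.id}-${safeFileName(file.name)}`);
    await writeFile(dest, Buffer.from(await res.arrayBuffer()));
    localPaths.push(dest);
  }
  return localPaths;
}

console.log(safeFileName("../etc/passwd")); // → "passwd"
```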

Advanced Features

1. RAG for Documentation

Tara answers questions about our internal docs using RAG:

const answer = await neurolink.generate({
  input: { text: "How does the Euler payment flow work?" },
  rag: {
    files: [
      "./docs/euler/architecture.md",
      "./docs/euler/payment-flow.md",
      "./docs/euler/webhooks.md",
    ],
    strategy: "markdown",
    topK: 5,
  },
});

2. Human-in-the-Loop for Sensitive Actions

For destructive operations, we require approval:

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["deployToProduction", "deleteDatabase", "revokeCredentials"],
    reviewCallback: async (action, context) => {
      // Post to admin Slack channel for approval
      return await requestSlackApproval(action, context.user);
    },
  },
});
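requestSlackApproval is our own helper. Its core is just a promise that stays pending until an admin clicks Approve or Reject; a minimal in-memory sketch (the class and names here are illustrative assumptions):

```typescript
// In-memory broker: park each HITL action as a pending Promise until an
// admin responds. A real version would also post a Block Kit message to
// the admin channel and resolve from the button's interaction handler.
class ApprovalBroker {
  private pending = new Map<string, (approved: boolean) => void>();

  // Called from the HITL reviewCallback
  request(actionId: string): Promise<boolean> {
    return new Promise((resolve) => this.pending.set(actionId, resolve));
  }

  // Called when an admin clicks Approve (true) or Reject (false)
  resolve(actionId: string, approved: boolean): void {
    this.pending.get(actionId)?.(approved);
    this.pending.delete(actionId);
  }
}

const broker = new ApprovalBroker();
const decision = broker.request("deployToProduction:42");
broker.resolve("deployToProduction:42", true);
decision.then((approved) => console.log(approved)); // → true
```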

3. Cost Optimization with Model Routing

We use different models for different tasks:

// Simple queries: fast, cheap model
const quickAnswer = await neurolink.generate({
  input: { text: "What time is it in Bangalore?" },
  provider: "google-ai",
  model: "gemini-2.5-flash",
});

// Complex analysis: reasoning model
const architectureReview = await neurolink.generate({
  input: { text: "Review this system design..." },
  provider: "anthropic",
  model: "claude-4-opus",
  thinkingConfig: { thinkingLevel: "high" },
});

Production Deployment

We run Tara as a containerized service with the following configuration:

// production-config.ts
export const taraConfig = {
  neurolink: {
    conversationMemory: {
      enabled: true,
      redisConfig: {
        host: process.env.REDIS_HOST,
        port: 6379,
        ttl: 86400 * 30, // 30-day retention
      },
    },
    // Multi-provider failover
    fallbackProviders: ["anthropic", "google-ai", "vertex"],
  },
  slack: {
    port: 3000,
    logLevel: "info",
  },
  // Rate limiting per user
  rateLimit: {
    requestsPerMinute: 20,
    burstSize: 5,
  },
};
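The rateLimit settings above need something to enforce them. A token-bucket sketch that matches requestsPerMinute and burstSize might look like this (our illustration, not a NeuroLink feature):

```typescript
interface Bucket {
  tokens: number;
  lastRefill: number; // ms timestamp
}

class RateLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(
    private requestsPerMinute: number,
    private burstSize: number,
  ) {}

  allow(userId: string, now: number = Date.now()): boolean {
    const bucket =
      this.buckets.get(userId) ?? { tokens: this.burstSize, lastRefill: now };
    // Refill at requestsPerMinute tokens per 60s, capped at burstSize
    const refill = ((now - bucket.lastRefill) / 60_000) * this.requestsPerMinute;
    bucket.tokens = Math.min(this.burstSize, bucket.tokens + refill);
    bucket.lastRefill = now;
    this.buckets.set(userId, bucket);
    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return true;
    }
    return false;
  }
}

const limiter = new RateLimiter(20, 5);
// A burst of 5 is allowed; the 6th immediate request is rejected
const results = Array.from({ length: 6 }, () => limiter.allow("U123", 0));
console.log(results); // → [true, true, true, true, true, false]
```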

Results

After deploying Tara to our engineering organization:

Metric                       Before          After
Avg. support response time   4 hours         30 seconds
Tickets created correctly    N/A             98%
Engineer satisfaction        65%             92%
Cost per interaction         $2.50 (human)   $0.03 (AI)

Key Learnings

  1. Conversation memory is essential: Users expect context continuity. Redis-backed memory made Tara feel truly intelligent.

  2. Streaming improves perception: Even if total time is the same, streaming responses feel faster and more engaging.

  3. Tool use requires guardrails: Start with read-only tools, add write operations gradually with HITL.

  4. Model selection matters: Routing simple queries to cheaper models cut costs by 75% without quality loss.

  5. MCP > Custom integrations: Using standard MCP servers for Jira, K8s, etc. meant we spent days, not weeks, on integrations.
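
Learning #4 can be made concrete with a tiny router. The heuristic and keyword list below are illustrative assumptions; any signal (intent classification, query length, channel) could drive the same choice:

```typescript
interface ModelChoice {
  provider: string;
  model: string;
}

// Crude complexity hints; a real router would use a classifier
const COMPLEX_HINTS = ["review", "design", "architecture", "debug", "why"];

function pickModel(query: string): ModelChoice {
  const lower = query.toLowerCase();
  const looksComplex =
    query.length > 200 || COMPLEX_HINTS.some((hint) => lower.includes(hint));
  return looksComplex
    ? { provider: "anthropic", model: "claude-4-opus" }
    : { provider: "google-ai", model: "gemini-2.5-flash" };
}

console.log(pickModel("What time is it in Bangalore?").model); // → "gemini-2.5-flash"
console.log(pickModel("Review this system design...").model); // → "claude-4-opus"
```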

Getting Started

Want to build your own Slack assistant? Here's the minimal setup:

import { NeuroLink } from "@juspay/neurolink";
import { App } from "@slack/bolt";

const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
});

const slack = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});

slack.event("message", async ({ event, say }) => {
  // Skip message subtypes (edits, bot messages) so the bot never replies to itself
  if (event.subtype !== undefined || !event.text) return;
  const result = await neurolink.stream({
    input: { text: event.text },
    provider: "anthropic",
    model: "claude-4-sonnet",
    user: event.user,
  });

  let response = "";
  for await (const chunk of result.stream) {
    if ("content" in chunk) response += chunk.content;
  }
  await say(response);
});

await slack.start(3000);

Conclusion

Building Tara with NeuroLink let us create a production-ready AI assistant in days, not months. The combination of Claude's reasoning, NeuroLink's memory management, and MCP's tool ecosystem gave us everything we needed to automate internal support at scale.

If you're considering an internal AI assistant, start with NeuroLink — the unified API means you can experiment with different models and tools without rewriting your integration code.


NeuroLink — The Universal AI SDK for TypeScript
