Building a Slack AI Assistant with NeuroLink: From Prototype to Production
Internal support consumes engineering time. At Juspay, our 500+ engineers constantly asked questions like:
- "What's the status of the Euler payment API?"
- "How do I get credentials for the sandbox environment?"
- "Who owns the HyperSDK Android module?"
- "Deploy the latest Breeze release to staging"
These questions needed answers, but pulling engineers from deep work was expensive. We needed an AI assistant that could:
- Answer questions using our internal knowledge
- Execute actions (deployments, credential provisioning)
- Remember conversation context across sessions
- Integrate with our existing tools (Jira, Bitbucket, Kubernetes)
Meet Tara — our Slack AI assistant built with NeuroLink and Claude Sonnet.
Architecture Overview
┌──────────────┐     ┌──────────────┐     ┌─────────────────┐
│    Slack     │────▶│  Slack Bolt  │────▶│  Tara Service   │
│   Message    │     │     App      │     │   (FastAPI)     │
└──────────────┘     └──────────────┘     └────────┬────────┘
                                                   │
                     ┌─────────────────────────────┼─────────────────────┐
                     ▼                             ▼                     ▼
              ┌──────────────┐            ┌─────────────────┐     ┌──────────────┐
              │  NeuroLink   │            │   MCP Servers   │     │    Redis     │
              │     SDK      │            │   - Jira        │     │    Memory    │
              │  (Claude)    │            │   - Bitbucket   │     │              │
              └──────────────┘            │   - K8s         │     └──────────────┘
                                          └─────────────────┘
Getting Started: The Prototype
Our first version was surprisingly simple. Here's the core loop:
import { NeuroLink } from "@juspay/neurolink";
import { App } from "@slack/bolt";

// Initialize NeuroLink with Claude
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    enableSummarization: true, // Auto-summarize long conversations
  },
});

// Slack Bolt app
const slack = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});

// Handle direct messages and mentions
slack.event("app_mention", async ({ event, say }) => {
  await handleMessage(event.user, event.text, say);
});

slack.event("message", async ({ event, say }) => {
  if (event.channel_type === "im") {
    await handleMessage(event.user, event.text, say);
  }
});
Conversation Handling with Memory
The magic of Tara is that she maintains context across messages. NeuroLink's conversation memory handles this automatically:
async function handleMessage(
  userId: string,
  text: string,
  say: (text: string) => Promise<void>
) {
  // Stream the response for better UX
  const result = await neurolink.stream({
    input: { text },
    provider: "anthropic",
    model: "claude-4-sonnet",
    user: userId, // Enables per-user memory automatically
    system: `You are Tara, Juspay's AI assistant. You help engineers with:
- Finding documentation and code
- Checking deployment status
- Answering questions about services
- Creating Jira tickets and PRs
Be concise and helpful. If you need to take action,
use the available tools.`,
    enableOrchestration: true, // Allow tool use
  });

  // Stream chunks back to Slack, posting a progress update roughly every
  // 100 characters. (A plain `length % 100 === 0` check would skip most
  // updates, since chunks rarely land exactly on a multiple of 100.)
  let response = "";
  let lastSent = 0;
  for await (const chunk of result.stream) {
    if ("content" in chunk) {
      response += chunk.content;
      if (response.length - lastSent >= 100) {
        await say(response + "⏳");
        lastSent = response.length;
      }
    }
  }
  await say(response);
}
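One caveat in the handler above: `say()` posts a new Slack message on every call, so progress updates pile up in the channel. A cleaner production pattern is to post one placeholder message and then edit it in place with `chat.update`. Here's a minimal sketch; `SlackClientLike`, `shouldFlush`, and `streamToSlack` are our own illustrative names (swap in the real `WebClient` from `@slack/web-api` in practice):

```typescript
// The small slice of the Slack Web API this sketch needs.
interface SlackClientLike {
  chat: {
    postMessage(args: { channel: string; text: string }): Promise<{ ts?: string }>;
    update(args: { channel: string; ts: string; text: string }): Promise<unknown>;
  };
}

// Pure helper: has enough new text accumulated to justify another
// chat.update call? (Slack rate-limits message edits.)
export function shouldFlush(lastSent: number, current: number, threshold = 100): boolean {
  return current - lastSent >= threshold;
}

// Post a placeholder once, then edit that same message as chunks arrive.
export async function streamToSlack(
  client: SlackClientLike,
  channel: string,
  stream: AsyncIterable<{ content?: string }>
): Promise<string> {
  const first = await client.chat.postMessage({ channel, text: "⏳" });
  const ts = first.ts ?? "";
  let response = "";
  let lastSent = 0;
  for await (const chunk of stream) {
    if (chunk.content) {
      response += chunk.content;
      if (shouldFlush(lastSent, response.length)) {
        await client.chat.update({ channel, ts, text: response + "⏳" });
        lastSent = response.length;
      }
    }
  }
  await client.chat.update({ channel, ts, text: response });
  return response;
}
```

The throttle matters: Slack's Tier-3 rate limits make per-token edits a quick way to get 429s.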
Adding Tool Capabilities
Tara becomes powerful when she can actually do things. We added MCP servers for our internal tools:
// Jira integration for ticket creation
await neurolink.addExternalMCPServer("jira", {
  transport: "stdio",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-jira"],
  env: {
    JIRA_TOKEN: process.env.JIRA_TOKEN,
    JIRA_HOST: "https://juspay.atlassian.net",
  },
});

// Kubernetes for deployment status
await neurolink.addExternalMCPServer("k8s", {
  transport: "stdio",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-kubernetes"],
});

// Internal API server (custom MCP)
await neurolink.addExternalMCPServer("juspay-api", {
  transport: "http",
  url: "https://internal-api.juspay.net/mcp",
  headers: {
    Authorization: `Bearer ${process.env.INTERNAL_API_TOKEN}`,
  },
});
Now users can say things like:
"Create a Jira ticket for the HyperSDK crash on Android"
And Tara will:
- Use the Jira tool to create the ticket
- Return the ticket URL
- Remember the ticket ID for follow-up questions
Structured Commands with Zod
For common operations, we use structured output to ensure reliability:
import { z } from "zod";

const DeploymentRequest = z.object({
  service: z.enum(["euler", "breeze", "hyper-sdk", "neurolink"]),
  environment: z.enum(["dev", "staging", "prod"]),
  version: z.string(),
  confirm: z.boolean(),
});

async function handleDeploymentRequest(userId: string, text: string) {
  const result = await neurolink.generate({
    input: {
      text: `Parse this deployment request: "${text}"`,
    },
    provider: "anthropic",
    model: "claude-4-haiku",
    schema: DeploymentRequest,
    output: { format: "json" },
  });

  const deployment = result.parsed as z.infer<typeof DeploymentRequest>;

  if (!deployment.confirm) {
    return `You want to deploy ${deployment.service} v${deployment.version} to ${deployment.environment}. Confirm with "yes".`;
  }

  // Execute deployment via MCP tools
  await neurolink.generate({
    input: {
      text: `Deploy ${deployment.service} version ${deployment.version} to ${deployment.environment}`,
    },
    enableOrchestration: true, // Allow the deployment tool to run
  });

  return `✅ Deployment initiated for ${deployment.service} v${deployment.version} to ${deployment.environment}`;
}
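The confirmation round-trip needs somewhere to park the parsed request between the user's first message and their "yes". A minimal sketch of that state, using hypothetical helpers and an in-memory map (in production you would likely keep this in Redis, alongside conversation memory):

```typescript
// Hypothetical pending-deployment store keyed by Slack user ID.
interface PendingDeployment {
  service: string;
  environment: string;
  version: string;
}

const pending = new Map<string, PendingDeployment>();

// Remember a parsed request until the same user confirms it.
export function stashPending(userId: string, d: PendingDeployment): void {
  pending.set(userId, d);
}

// Return the stashed request only on an affirmative reply; clear it
// either way so a stale "yes" can't trigger an old deployment later.
export function takeIfConfirmed(
  userId: string,
  reply: string
): PendingDeployment | undefined {
  const d = pending.get(userId);
  pending.delete(userId);
  if (d && /^(yes|y|confirm)$/i.test(reply.trim())) return d;
  return undefined;
}
```

`handleDeploymentRequest` would call `stashPending` before asking for confirmation, and the message handler would check `takeIfConfirmed` on the next message from that user.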
Multi-Modal Support: Screenshots and Logs
Engineers often share screenshots of errors or paste log snippets. Tara handles these with NeuroLink's multimodal capabilities:
slack.event("message", async ({ event, say }) => {
  if (event.files && event.files.length > 0) {
    // Download files
    const filePaths = await downloadSlackFiles(event.files);

    const result = await neurolink.generate({
      input: {
        text: "What's in this screenshot? If it's an error, suggest fixes.",
        files: filePaths,
      },
      provider: "google-ai",
      model: "gemini-2.5-pro", // Vision-capable model
    });

    await say(result.content);
  }
});
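`downloadSlackFiles` isn't shown above; here is one way it could look. Slack's `url_private` links require the bot token in an `Authorization` header. The helper and its `SlackFile` shape are our own sketch, not a Slack SDK API:

```typescript
import { mkdtemp, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface SlackFile {
  name: string;
  url_private: string;
}

// Keep only safe filename characters so a Slack-supplied name
// can't escape the download directory.
export function safeName(name: string): string {
  return name.replace(/[^a-zA-Z0-9._-]/g, "_");
}

// Fetch each attachment into a fresh temp directory and return the paths.
export async function downloadSlackFiles(files: SlackFile[]): Promise<string[]> {
  const dir = await mkdtemp(join(tmpdir(), "tara-"));
  const paths: string[] = [];
  for (const file of files) {
    const res = await fetch(file.url_private, {
      headers: { Authorization: `Bearer ${process.env.SLACK_BOT_TOKEN}` },
    });
    if (!res.ok) throw new Error(`Download failed: ${file.name} (${res.status})`);
    const path = join(dir, safeName(file.name));
    await writeFile(path, Buffer.from(await res.arrayBuffer()));
    paths.push(path);
  }
  return paths;
}
```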
Advanced Features
1. RAG for Documentation
Tara answers questions about our internal docs using RAG:
const answer = await neurolink.generate({
  input: { text: "How does the Euler payment flow work?" },
  rag: {
    files: [
      "./docs/euler/architecture.md",
      "./docs/euler/payment-flow.md",
      "./docs/euler/webhooks.md",
    ],
    strategy: "markdown",
    topK: 5,
  },
});
2. Human-in-the-Loop for Sensitive Actions
For destructive operations, we require approval:
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["deployToProduction", "deleteDatabase", "revokeCredentials"],
    reviewCallback: async (action, context) => {
      // Post to admin Slack channel for approval
      return await requestSlackApproval(action, context.user);
    },
  },
});
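The `requestSlackApproval` helper is ours, not a NeuroLink built-in. One way to implement it: post an approve/deny message to the admin channel, park a promise keyed by that message's timestamp, and have a Bolt action handler settle it when an admin clicks a button. A sketch of the promise registry (the Slack posting and button plumbing are omitted):

```typescript
// Pending approvals, keyed by an ID such as the Slack message timestamp.
type Resolver = (approved: boolean) => void;
const waiting = new Map<string, Resolver>();

// Called by requestSlackApproval after posting the approve/deny message:
// resolves only once an admin makes a decision.
export function parkApproval(id: string): Promise<boolean> {
  return new Promise((resolve) => waiting.set(id, resolve));
}

// Called from the Bolt action handler (e.g. slack.action("approve_action"))
// with the ID recovered from the button payload.
export function settleApproval(id: string, approved: boolean): boolean {
  const resolve = waiting.get(id);
  if (!resolve) return false; // unknown or already-settled approval
  waiting.delete(id);
  resolve(approved);
  return true;
}
```

In production you would also want a timeout that settles stale approvals as denied, so a forgotten request can't hold a deployment open forever.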
3. Cost Optimization with Model Routing
We use different models for different tasks:
// Simple queries: fast, cheap model
const quickAnswer = await neurolink.generate({
  input: { text: "What time is it in Bangalore?" },
  provider: "google-ai",
  model: "gemini-2.5-flash",
});

// Complex analysis: reasoning model
const architectureReview = await neurolink.generate({
  input: { text: "Review this system design..." },
  provider: "anthropic",
  model: "claude-4-opus",
  thinkingConfig: { thinkingLevel: "high" },
});
Production Deployment
We run Tara as a containerized service with the following configuration:
// production-config.ts
export const taraConfig = {
  neurolink: {
    conversationMemory: {
      enabled: true,
      redisConfig: {
        host: process.env.REDIS_HOST,
        port: 6379,
        ttl: 86400 * 30, // 30-day retention
      },
    },
    // Multi-provider failover
    fallbackProviders: ["anthropic", "google-ai", "vertex"],
  },
  slack: {
    port: 3000,
    logLevel: "info",
  },
  // Rate limiting per user
  rateLimit: {
    requestsPerMinute: 20,
    burstSize: 5,
  },
};
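The `rateLimit` block is plain config; the enforcement behind it can be a small token bucket: each user starts with `burstSize` tokens and earns them back at `requestsPerMinute`. A sketch (our own helper, not part of NeuroLink or Bolt):

```typescript
// Token bucket: allows short bursts, then throttles to a steady rate.
export class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private readonly requestsPerMinute: number,
    private readonly burstSize: number,
    now: number = Date.now()
  ) {
    this.tokens = burstSize; // start full so a user can burst immediately
    this.last = now;
  }

  // Consume one token if available; `now` is injectable for testing.
  tryAcquire(now: number = Date.now()): boolean {
    const earned = ((now - this.last) / 60_000) * this.requestsPerMinute;
    this.tokens = Math.min(this.burstSize, this.tokens + earned);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Keep one bucket per Slack user ID (e.g. in a `Map<string, TokenBucket>`) and check `tryAcquire()` before calling the model.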
Results
After deploying Tara to our engineering organization:
| Metric | Before | After |
|---|---|---|
| Avg. support response time | 4 hours | 30 seconds |
| Tickets created correctly | N/A | 98% |
| Engineer satisfaction | 65% | 92% |
| Cost per interaction | $2.50 (human) | $0.03 (AI) |
Key Learnings
Conversation memory is essential: Users expect context continuity. Redis-backed memory made Tara feel truly intelligent.
Streaming improves perception: Even if total time is the same, streaming responses feel faster and more engaging.
Tool use requires guardrails: Start with read-only tools, add write operations gradually with HITL.
Model selection matters: Routing simple queries to cheaper models cut costs by 75% without quality loss.
MCP > Custom integrations: Using standard MCP servers for Jira, K8s, etc. meant we spent days, not weeks, on integrations.
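The routing rule behind "model selection matters" can be as simple as a heuristic on the incoming text. A sketch of that kind of router; the length threshold and keyword list here are illustrative, not our tuned values:

```typescript
// Heuristic model router: cheap fast model for short factual queries,
// a stronger model for long or analysis-heavy prompts.
export function pickModel(text: string): { provider: string; model: string } {
  const looksComplex =
    text.length > 400 || /stack trace|design|architecture|review|refactor/i.test(text);
  return looksComplex
    ? { provider: "anthropic", model: "claude-4-sonnet" }
    : { provider: "google-ai", model: "gemini-2.5-flash" };
}
```

The result can be spread straight into the call: `neurolink.generate({ ...pickModel(text), input: { text } })`.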
Getting Started
Want to build your own Slack assistant? Here's the minimal setup:
import { NeuroLink } from "@juspay/neurolink";
import { App } from "@slack/bolt";

const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
});

const slack = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});

slack.event("message", async ({ event, say }) => {
  const result = await neurolink.stream({
    input: { text: event.text },
    provider: "anthropic",
    model: "claude-4-sonnet",
    user: event.user,
  });

  let response = "";
  for await (const chunk of result.stream) {
    if ("content" in chunk) response += chunk.content;
  }
  await say(response);
});

await slack.start(3000);
Conclusion
Building Tara with NeuroLink let us create a production-ready AI assistant in days, not months. The combination of Claude's reasoning, NeuroLink's memory management, and MCP's tool ecosystem gave us everything we needed to automate internal support at scale.
If you're considering an internal AI assistant, start with NeuroLink — the unified API means you can experiment with different models and tools without rewriting your integration code.
NeuroLink — The Universal AI SDK for TypeScript
- GitHub: github.com/juspay/neurolink
- Install: npm install @juspay/neurolink
- Docs: docs.neurolink.ink
- Blog: blog.neurolink.ink — 150+ technical articles