This is Part 6 of a 6-part series. Part 5 covers the knowledge loop.
# Deploy to Production & What to Build Next
You've built an AI assistant that searches your internal tools in parallel, synthesizes answers with Claude, learns from feedback, and lives in Slack. Now let's deploy it to production, look at real costs, and talk about the extensions that take Harper Eye from good to indispensable.
## Step 1: Deploy to Harper Fabric

If you've been developing locally, deploying to production is a single command. Make sure your `.env` has your Fabric credentials:

```
CLI_TARGET=your-cluster.your-org.harperfabric.com:9925
CLI_USERNAME=your-username
CLI_PASSWORD=your-password
```

And your `CONFIG.env` has all your API keys and tokens.

Deploy:

```bash
npm run deploy
```

This runs:

```bash
npx -y dotenv-cli -- harperdb deploy . restart=rolling replicated=true
```
What happens behind the scenes:

- Your application code gets pushed to the Fabric cluster
- Harper reads `schema.graphql` and creates/updates all tables and indexes
- Environment variables from `CONFIG.env` are loaded
- The cluster does a rolling restart with zero downtime
- Your Resource Classes become live HTTPS endpoints
The whole process takes about 5-8 seconds. Your application is now live at:

```
https://your-cluster.your-org.harperfabric.com/
```

Update your Slack app's event URLs to point at this endpoint, and you're done.
## Step 2: Verify Everything Works

Run through the checklist:

```bash
# Health check
curl https://your-cluster.your-org.harperfabric.com/HealthCheck

# API query
curl -X POST https://your-cluster.your-org.harperfabric.com/Api \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'user:pass' | base64)" \
  -d '{"query": "How does replication work?", "mode": "ask"}'
```

Then in Slack:

```
/harper-ask how does sharding work?
```
If you get a structured response with sources from your actual internal systems, you're live. You've just shipped a tool that would cost thousands per month from a vendor.
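If you'd rather script these checks than paste curl commands, the same smoke test can be sketched in Node. This is a minimal sketch: `basicAuthHeader` and `buildApiRequest` are illustrative helper names, not part of the codebase, and the cluster URL is a placeholder.

```javascript
// Build the Authorization header the /Api endpoint expects.
function basicAuthHeader(username, password) {
  return 'Basic ' + Buffer.from(`${username}:${password}`).toString('base64');
}

// Describe the /Api request without sending it, so the same object
// can feed fetch() in a script or assertions in a test.
function buildApiRequest(baseUrl, username, password, query) {
  return {
    url: `${baseUrl}/Api`,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: basicAuthHeader(username, password),
    },
    body: JSON.stringify({ query, mode: 'ask' }),
  };
}

// Usage against your real cluster:
// const req = buildApiRequest('https://your-cluster.your-org.harperfabric.com',
//                             'user', 'pass', 'How does replication work?');
// const res = await fetch(req.url, req);
```

Because the request is built as plain data, you can assert on the headers and body before ever hitting the network.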
## The Real Cost Breakdown
I've been running Harper Eye in production for our team. Here are the actual numbers:
### Monthly Costs
| Resource | Cost | Notes |
|---|---|---|
| Harper Fabric | ~$25 | Single instance handles everything: HTTP, database, vector search |
| Claude API (Anthropic) | ~$30-50 | ~200-400 queries/month at ~$0.10-0.15/query (Sonnet) |
| Gemini Embeddings | $0 | Free tier: 1,500 requests/min. We use maybe 1,000/month |
| Confluence API | $0 | Included in existing Atlassian subscription |
| Zendesk API | $0 | Included in existing subscription |
| Datadog API | $0 | Included in existing subscription |
| GitHub API | $0 | Included (authenticated requests: 5,000/hr) |
| Slack API | $0 | Free for workspace apps |
| Total | ~$55-75/mo | For the entire organization |
### Cost Per Query
With the knowledge base doing its job, many queries hit the cache and never touch Claude:
| Query Type | Cost | Percentage of Queries |
|---|---|---|
| KB exact hit | ~$0.001 (just the embedding call) | ~30% after 2 months |
| Full orchestration | ~$0.10-0.15 (embedding + Claude) | ~70% initially, dropping over time |
| Blended average | ~$0.07/query | Getting cheaper every month |
The knowledge loop is doing real work. Every verified answer that gets cached means one fewer Claude API call in the future. After two months, about 30% of our queries return instant cached results. That percentage keeps climbing.
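The blended average is just a weighted sum of the two paths in the table. A quick sketch with the numbers above; `blendedCost` is an illustrative helper, not part of the codebase.

```javascript
// Average cost per query, given a KB cache-hit rate and the
// per-path costs from the table above.
function blendedCost(hitRate, kbHitCost, fullOrchestrationCost) {
  return hitRate * kbHitCost + (1 - hitRate) * fullOrchestrationCost;
}

// At 30% cache hits: 0.3 * $0.001 + 0.7 * $0.10 ≈ $0.07/query
console.log(blendedCost(0.3, 0.001, 0.10).toFixed(3)); // "0.070"
```

As the hit rate climbs, the blended cost falls toward the near-zero KB cost, which is exactly the trend in the table.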
### What This Replaces
| SaaS Alternative | Monthly Cost |
|---|---|
| Glean or Dashworks (AI search) | $800-2,000 |
| PagerDuty AIOps add-on | $500-1,000 |
| Incident.io or similar | $400-800 |
| Pinecone (vector DB) | $70-200 |
| Total replaced | $1,770-4,000/mo |
Savings: $1,700-3,900/month. That's $20,000-47,000/year.
And your custom system is better — it knows your terminology, your architecture, your people. It learns from your team's feedback. It doesn't forget when you cancel a subscription.
## Extensions Worth Building
Once you have the core running, these extensions are each a day or less of work:
### PagerDuty Webhooks

Create `resources/PagerDutyWebhook.js` to receive PagerDuty incident webhooks. When a new incident fires, Harper Eye automatically runs the full orchestration and posts the analysis to your Slack incident channel before any human even looks at it.
```javascript
// resources/PagerDutyWebhook.js
// `Resource` is a global provided by the Harper runtime. The helpers
// below come from the lib/ modules built in earlier parts of this
// series; adjust the import paths and names to match your project.
import { orchestrate } from '../lib/orchestrator.js';
import { config } from '../lib/config.js';
import {
  getSlackClient,
  formatPagerDutyHeader,
  formatIncidentResponse,
} from '../lib/slack-formatter.js';

export class PagerDutyWebhook extends Resource {
  static loadAsInstance = false;

  async post(target, data) {
    const event = data?.event;
    if (event?.event_type === 'incident.triggered') {
      const incident = event.data;

      // Auto-analyze the incident
      const result = await orchestrate(
        `PagerDuty incident: ${incident.title}. Service: ${incident.service?.summary}`,
        { mode: 'incident' }
      );

      // Post to your incident channel
      const slack = getSlackClient();
      await slack.chat.postMessage({
        channel: config.slack.incidentChannel(),
        blocks: [
          ...formatPagerDutyHeader(incident),
          ...formatIncidentResponse(result),
        ],
      });
    }
    return { ok: true };
  }
}
```
The result: PagerDuty fires at 2 am, and by the time your on-call engineer opens Slack, there's already an AI analysis with root cause candidates, relevant runbooks, customer impact assessment, and which colleague to escalate to.
### Web Dashboard

The `site/` directory serves static HTML at `/app/`. Build a simple dashboard for:
- Knowledge base management: view, edit, and delete KB entries
- Query analytics: most asked questions, resolution times, source utilization
- Feedback trends: which topics get the most negative feedback (signals for documentation gaps)
- Expert map: who on your team is the go-to person for which topics
No React. No build step. Vanilla HTML + CSS + JS that calls your `/Api` endpoint. Harper serves it as static files.
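For example, the dashboard's KB list can be a pure rendering function that the page wires to a `fetch` of your `/Api` endpoint. A sketch: the entry field names (`question`, `score`) and the function names are assumptions, so match them to your schema.

```javascript
// Escape user-authored text before injecting it into the page.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Turn KB entries from the API into list items.
function renderKbEntries(entries) {
  return entries
    .map((e) => `<li><strong>${escapeHtml(e.question)}</strong> (score: ${Number(e.score)})</li>`)
    .join('\n');
}

// In the page (endpoint path is hypothetical):
// const entries = await (await fetch('/Api', { /* ... */ })).json();
// document.querySelector('#kb-list').innerHTML = renderKbEntries(entries);
```

Keeping the rendering pure (data in, string out) is what makes a no-build-step UI easy to test.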
### Knowledge Capture from Slack

Add intent classification to @-mentions so engineers can say `@harper-eye save this` in a Slack thread, and Harper Eye automatically extracts the key Q&A from the thread discussion and saves it to the knowledge base. The tribal knowledge from a debugging session is preserved forever without anyone having to write a wiki page.
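A minimal sketch of the intent check, assuming Slack's `<@U…>` mention syntax in the message text; the phrase list and the `isSaveIntent` name are illustrative.

```javascript
// Detect a "save this" intent in an @-mention, after stripping the
// <@U...> mention token Slack prepends to the message text.
function isSaveIntent(mentionText) {
  const text = mentionText.replace(/<@[^>]+>/g, '').trim().toLowerCase();
  return /\bsave (this|that|thread)\b/.test(text);
}
```

Anything that doesn't match falls through to the normal question-answering path, so the check is safe to add in front of the existing @-mention handler.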
### Source Relevance Learning
Track which data sources actually contribute to useful answers. If Zendesk results never get cited for architecture questions, stop searching Zendesk for those queries. This saves API calls and reduces noise in Claude's context.
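One way to sketch this: count, per topic and source, how often a source's results were actually cited, and skip sources that consistently go unused. `SourceRelevanceTracker` and its thresholds are illustrative, not part of the codebase.

```javascript
// Tracks, per (topic, source) pair, how often searched results
// were actually cited in the final answer.
class SourceRelevanceTracker {
  constructor({ minSamples = 10, minCiteRate = 0.05 } = {}) {
    this.stats = new Map(); // "topic:source" -> { searched, cited }
    this.minSamples = minSamples;
    this.minCiteRate = minCiteRate;
  }

  record(topic, source, wasCited) {
    const key = `${topic}:${source}`;
    const s = this.stats.get(key) ?? { searched: 0, cited: 0 };
    s.searched += 1;
    if (wasCited) s.cited += 1;
    this.stats.set(key, s);
  }

  // Keep searching until we have enough samples; after that, drop
  // sources whose citation rate stays below the threshold.
  shouldSearch(topic, source) {
    const s = this.stats.get(`${topic}:${source}`);
    if (!s || s.searched < this.minSamples) return true;
    return s.cited / s.searched >= this.minCiteRate;
  }
}
```

In production these counters would live in a Harper table rather than in memory, but the decision logic is the same.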
## The Project Structure (Final)
```
harper-eye/
├── config.yaml               # Harper app config
├── schema.graphql            # 7 tables, 5 vector indexes
├── CONFIG.env                # Secrets (gitignored)
├── .env                      # Deploy creds (gitignored)
├── package.json
│
├── resources/                # HTTP endpoints (Resource Classes)
│   ├── SlackEvents.js        # Slack commands + @mentions
│   ├── SlackInteractivity.js # Feedback button handlers
│   ├── PagerDutyWebhook.js   # PagerDuty auto-analysis
│   ├── Api.js                # REST API for web dashboard
│   ├── HealthCheck.js        # GET /HealthCheck
│   └── Debug.js              # Debug endpoint
│
├── lib/                      # Core business logic
│   ├── orchestrator.js       # AI orchestration (the brain)
│   ├── knowledge-base.js     # KB CRUD + vector search + feedback
│   ├── embeddings.js         # Gemini embedding generation
│   ├── config.js             # Config loader
│   ├── slack-formatter.js    # Slack Block Kit formatting
│   └── slack-mentions.js     # @-mention expert suggestions
│
├── mcp/                      # Data source wrappers
│   ├── confluence.js         # Confluence REST API
│   ├── zendesk.js            # Zendesk REST API
│   ├── datadog.js            # Datadog REST API
│   ├── github.js             # GitHub REST API
│   └── harper-docs.js        # Documentation site search
│
└── site/                     # Static web UI
    ├── index.html            # Knowledge base dashboard
    ├── dashboard.html        # Analytics
    └── help.html             # Help page
```
Total lines of core logic: about 700. Total external infrastructure: zero (beyond Harper itself). Total time to build: 3 days.
## What You've Built
Let's take a final inventory. Harper Eye is not a mockup: it's a production system that 35 engineers use every day, running for under $100/month. Here's what you now have:
An AI assistant that:
- Searches 6+ internal data sources in parallel (2-4 seconds)
- Synthesizes results with Claude into structured, cited responses
- Lives in Slack with slash commands, @mentions, and threaded follow-ups
- Returns verified cached answers instantly (sub-second)
- Learns from team feedback without manual curation
- Automatically degrades and purges bad answers
- Knows which engineers are experts on which topics
- Costs under $100/month for your entire organization
Running on a stack of:
- Harper: one platform for HTTP, database, and vector search
- Claude: AI synthesis with structured JSON output
- Gemini: embedding generation (free tier)
- Vanilla JS: served as static files by Harper, no framework, no build step
The compounding effects over time:
- Month 1: ~0% KB cache hits. Every query goes through full orchestration.
- Month 2: ~15-20% cache hits. Common questions return instantly.
- Month 3: ~25-35% cache hits. Negative feedback has pruned bad answers.
- Month 6: ~40-50% cache hits. Your AI costs are dropping while quality improves.
- Month 12+: The knowledge base becomes your most valuable internal asset. It's the institutional memory that survives employee turnover.
## The Bigger Picture
Here's what I've learned from building and running Harper Eye:
**Custom beats generic, always.** No vendor product will ever understand your architecture, your terminology, your people, or your incident history the way a custom tool does. The gap isn't about AI capability; Claude is the same Claude whether you use it through a vendor or directly. The gap is about context. Your context.

**The feedback loop is everything.** An AI assistant without a learning mechanism is a party trick. One that gets smarter every time someone uses it is an institution. The knowledge loop (verified answers, negative feedback, automatic degradation) is what makes Harper Eye more valuable the longer you run it.

**Infrastructure should disappear.** I didn't want to manage Postgres, Pinecone, Redis, Express, and a deployment pipeline. Harper let me put all of that in a single `schema.graphql` and `config.yaml`. The less time you spend on infrastructure, the more time you spend on the logic that actually makes your tool useful.

**Build it yourself, but build it with the right tools.** I built Harper Eye in 3 days, not because I'm fast, but because Claude wrote most of the code and Harper eliminated most of the infrastructure. The combination of AI-assisted development and a unified platform is what makes this feasible for a single engineer.
If you've followed along this far, you have everything you need. The code is real. The architecture is battle-tested. The costs are provably lower than the alternatives.
Build it. Run it. Let your team use it. Watch it get smarter.
And stop paying thousands a month for SaaS that doesn't know your name.
*If you build your own version, I'd love to hear about it. Reach out on the Harper Discord.*



