Likhit Kumar V P

Posted on Apr 5

I Built an MCP Server That Lets AI Autonomously Debug Salesforce - Here's How

#mcp #typescript #ai #opensource

I built sf-log-mcp, an open-source MCP server that gives AI assistants (Claude, Copilot, Cursor) the ability to autonomously fetch, analyze, and manage Salesforce debug logs. It detects "silent failures": transactions that Salesforce marks as "Success" but that are actually broken. Published on npm, 9 tools, 7 parsers, 101 tests.

GitHub: github.com/Likhit-Kumar/SF-Logs

npm: npx sf-log-mcp

The Problem Nobody Talks About

If you've ever debugged a Salesforce integration, you know the drill:

Open Setup > Debug Logs
Stare at a wall of logs showing Status = "Success"
Manually download each .log file
Ctrl+F through 50,000 lines looking for what went wrong
Find out the "successful" callout actually returned {"error":"rate_limit_exceeded"} inside an HTTP 200

The Status field lies. In my experience, over 90% of real production issues are silent failures: the code didn't crash and Apex didn't throw an unhandled exception, but the right thing didn't happen.

Here's what "Success" actually hides:

HTTP 200 with {"error":"rate_limit"} in body - Integration silently failing
Exception caught by try-catch - Error swallowed, moved on
SOQL returned 0 rows - Wrong filter, no data processed
Governor limits at 95% - Works now, breaks at scale
Flow path skipped entirely - Expected automation never fired

Now here's the kicker: none of the existing MCP servers I looked at can even fetch these logs, let alone analyze them.

The Gap in the Ecosystem

I spent weeks researching the Salesforce MCP landscape. Here's what I found:

Certinia's `@certinia/apex-log-mcp`

A solid parser, great for performance profiling and bottleneck detection. But 3 out of 4 tools require a local .log file path as input. It cannot list, fetch, or download logs from an org. You still have to manually download them first.

Salesforce's `@salesforce/mcp`

The official MCP server with 60+ tools across metadata, data, testing, DevOps, and code analysis. Impressive scope. But: zero debug log tools. No list_logs, no fetch_log, no manage_trace_flags. Nothing.

The Workflow Today (Even With MCP)


You: "Something's wrong with our Vendor X integration"

AI: "I'd love to help! Can you download the debug log
and give me the file path?"

You: *opens Setup, clicks Debug Logs, downloads file,
saves to Desktop, types the path*

AI: "Thanks, here's the analysis..."

The AI is supposed to be autonomous. But it can't even get the logs it needs.

What I Built

sf-log-mcp, a standalone MCP server that fills the missing layer between your Salesforce org and your AI assistant.

The Philosophy

Don't filter. Fetch. Read. Reason.

Instead of filtering logs by status (which catches <10% of real issues), sf-log-mcp downloads the raw log content and lets the AI reason about what actually happened: callout responses, DML results, swallowed exceptions, governor limits, and flow paths.

The Workflow Now


You: "Something's off with our Vendor X integration — check the recent logs"

AI autonomously:

1. manage_trace_flags → checks if logging is active
2. fetch_latest_logs → downloads 5 most recent logs
3. analyze_log → health score: 35/100 (CRITICAL)
4. get_log_content → drills into callouts section
5. get_log_content → checks exceptions section

AI: "Found it. Log 07L... shows the callout to
api.vendorx.com/sync returned HTTP 200, but the
response body contains {"error":"rate_limit_exceeded"}.
This happened in 3 of the 5 recent logs. The
integration looks healthy from the Status field
but is actually being rate-limited."

Zero manual steps. The AI fetches, reads, reasons, and explains.

Architecture


You (natural language)
│
v
AI Assistant (Claude / Copilot / Cursor)
│ MCP tool calls via stdio
v
sf-log-mcp (this project)
│ Salesforce Tooling API (REST)
v
Your Salesforce Org (auth via SF CLI)

Key Design Decisions

1. Direct Tooling API, not CLI subprocesses

Instead of shelling out to sf apex list log (which spawns a subprocess, depends on the CLI version, and offers limited filtering), I use @salesforce/core to make direct REST calls to the Salesforce Tooling API. This gives fine-grained query control and eliminates subprocess overhead.
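To make "direct REST calls" a bit more concrete, here is a minimal sketch of how a filtered ApexLog query and its Tooling API paths could be built. The function names, the LogFilter shape, and the v60.0 API version are assumptions for illustration, not sf-log-mcp's actual code; in the real server the authenticated connection comes from @salesforce/core rather than hand-built URLs.

```typescript
// Hypothetical filter shape; the real tool's options may differ.
interface LogFilter {
  userName?: string;     // compared against ApexLog.LogUser.Name
  operation?: string;    // compared against ApexLog.Operation
  minSizeBytes?: number; // compared against ApexLog.LogLength
  limit: number;
}

// Escape a value for use inside a single-quoted SOQL string literal.
function escapeSoql(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Build a SOQL query against the ApexLog Tooling API object.
function buildApexLogQuery(filter: LogFilter): string {
  const where: string[] = [];
  if (filter.userName !== undefined) {
    where.push("LogUser.Name = '" + escapeSoql(filter.userName) + "'");
  }
  if (filter.operation !== undefined) {
    where.push("Operation = '" + escapeSoql(filter.operation) + "'");
  }
  if (filter.minSizeBytes !== undefined) {
    where.push("LogLength >= " + Math.floor(filter.minSizeBytes));
  }
  const whereClause = where.length > 0 ? " WHERE " + where.join(" AND ") : "";
  return (
    "SELECT Id, LogUser.Name, Operation, Status, LogLength, StartTime FROM ApexLog" +
    whereClause +
    " ORDER BY StartTime DESC LIMIT " + Math.floor(filter.limit)
  );
}

// Tooling API REST paths, relative to the org's instance URL.
function toolingQueryPath(soql: string, apiVersion: string): string {
  return "/services/data/" + apiVersion + "/tooling/query/?q=" + encodeURIComponent(soql);
}

function logBodyPath(logId: string, apiVersion: string): string {
  return (
    "/services/data/" + apiVersion + "/tooling/sobjects/ApexLog/" +
    encodeURIComponent(logId) + "/Body"
  );
}
```

Listing logs is a query against ApexLog, and downloading one is a GET on that record's Body resource. Escaping user-supplied filter values before they reach the query string is also what keeps a filter like a user name from altering the SOQL.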

2. Reuse SF CLI auth

No new credentials, no OAuth setup, no tokens to configure. If sf org list shows your org, sf-log-mcp can connect to it. It reads from ~/.sf/ - the same auth your Salesforce CLI already uses.

3. Standalone, not bundled

sf-log-mcp runs alongside Certinia's parser, not replacing it. You get the best of both: sf-log-mcp fetches and analyzes for silent failures, Certinia does deep performance profiling. The AI combines both results.

9 Tools, 4 Tiers

Tier 1: Log Acquisition

list_debug_logs - List logs with rich filtering (user, operation, date range, size)
fetch_debug_log - Download a specific log by ID
fetch_latest_logs - Batch-download the N most recent logs

Tier 2: Content Intelligence

get_log_content - Extract structured sections (callouts, exceptions, SOQL, DML, governor limits, flows, debug messages)
analyze_log - One-call health analysis with a 0-100 score
search_logs - Regex search across all downloaded logs

Tier 3: Lifecycle Management

manage_trace_flags - Create, list, update, delete trace flags
delete_debug_logs - Delete logs (with dry-run mode)

Tier 4: Cross-Log Intelligence

compare_logs - Side-by-side diff of two logs for regression detection

The Health Score: Diagnosing Logs in One Call

The analyze_log tool is the entry point for debugging. It returns a health score from 0-100:


Health Score: 65/100 (DEGRADED)

Critical Issues:
- Silent callout failure: HTTP 200 with error in body (api.vendorx.com)

Warnings:
- 2 handled exceptions (verify error handling is correct)
- Governor limit: SOQL queries at 82% (approaching limit)
- Zero-row SOQL: SELECT Id FROM Account WHERE ExternalId__c = '...'

How it's calculated:


healthScore = 100
healthScore -= (critical issues × 20)
healthScore -= (warnings × 5)

Health Ratings:

HEALTHY (90-100) - No significant issues
WARNING (70-89) - Minor concerns worth checking
DEGRADED (50-69) - Multiple issues, needs attention
CRITICAL (0-49) - Serious failures detected
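The scoring rule and rating bands above can be sketched in a few lines. The function names are hypothetical, and clamping the score at 0 is an assumption (the article only states that the score ranges from 0 to 100).

```typescript
type HealthRating = "HEALTHY" | "WARNING" | "DEGRADED" | "CRITICAL";

// Start at 100; each critical issue costs 20 points, each warning costs 5.
// Clamping at 0 is an assumption so that many issues cannot go negative.
function computeHealthScore(criticalCount: number, warningCount: number): number {
  const raw = 100 - criticalCount * 20 - warningCount * 5;
  return Math.max(0, raw);
}

// Map a score to the rating bands listed above.
function rateHealth(score: number): HealthRating {
  if (score >= 90) return "HEALTHY";
  if (score >= 70) return "WARNING";
  if (score >= 50) return "DEGRADED";
  return "CRITICAL";
}
```

For the sample report (one critical issue and three warning entries), this gives 100 - 20 - 15 = 65, which lands in the DEGRADED band.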

The AI uses this score to decide what to drill into next: callouts, exceptions, or governor limits. It's the triage step that makes the whole workflow efficient.

Detecting Silent Failures: The 7 Parsers

Each parser is purpose-built to extract and warn about a specific class of silent failure:

1. Callout Parser - The HTTP 200 Lie Detector


CALLOUT_REQUEST|[42]|System.HttpCallout[endpoint=https://api.vendor.com/sync]
CALLOUT_RESPONSE|[42]|System.HttpCallout[status=200, body={"error":"rate_limit_exceeded"}]

Most monitoring checks the HTTP status code. 200 = good, right? Wrong. The callout parser pairs every request with its response and scans the body for error keywords. This catches the most common class of integration failure.
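A rough sketch of the pairing-and-scanning idea, written against the simplified log lines shown above. The keyword list, the type names, and the "next response belongs to the last request" pairing rule are assumptions for illustration; the actual parser may work differently.

```typescript
interface CalloutFinding {
  endpoint: string;
  status: number;
  body: string;
  silentFailure: boolean; // 2xx status, but the body looks like an error
}

// Illustrative keyword list; the real parser's list is not given in the article.
const ERROR_KEYWORDS: string[] = ["error", "fail", "exception", "denied", "rate_limit"];

function bodyLooksLikeError(body: string): boolean {
  const lower = body.toLowerCase();
  for (const keyword of ERROR_KEYWORDS) {
    if (lower.indexOf(keyword) !== -1) return true;
  }
  return false;
}

function parseCallouts(logLines: string[]): CalloutFinding[] {
  const findings: CalloutFinding[] = [];
  let pendingEndpoint: string | null = null;

  for (const line of logLines) {
    const request = /CALLOUT_REQUEST\|.*\[endpoint=([^\]]+)\]/.exec(line);
    if (request) {
      pendingEndpoint = request[1];
      continue;
    }
    const response = /CALLOUT_RESPONSE\|.*\[status=(\d+), body=(.*)\]\s*$/.exec(line);
    if (response && pendingEndpoint !== null) {
      const status = parseInt(response[1], 10);
      const body = response[2];
      const isSuccessStatus = status >= 200 && status < 300;
      findings.push({
        endpoint: pendingEndpoint,
        status,
        body,
        silentFailure: isSuccessStatus && bodyLooksLikeError(body),
      });
      pendingEndpoint = null; // the request has been paired
    }
  }
  return findings;
}
```

Keyword matching is a heuristic: a body such as {"errors":[]} would also be flagged, which is why the finding is best treated as a prompt for the AI to read the body rather than as a verdict.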

2. Exception Parser - Handled vs. Unhandled


EXCEPTION_THROWN|[15]|System.NullPointerException: Attempt to de-reference a null object

Salesforce only flags unhandled exceptions in the Status field. But most production code wraps everything in try-catch. The exception parser uses a 10-line lookahead: if EXCEPTION_THROWN is followed by FATAL_ERROR, it's unhandled. If followed by METHOD_EXIT, it was caught. Both are reported, because a caught NullPointerException is still a bug.
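The lookahead rule might be sketched like this. The names are hypothetical, and two details are assumptions: whichever marker appears first within the window wins, and an exception with neither marker in range is reported as "unknown".

```typescript
type ExceptionKind = "unhandled" | "handled" | "unknown";

interface ExceptionFinding {
  lineIndex: number; // index into the array of log lines
  message: string;
  kind: ExceptionKind;
}

const LOOKAHEAD_LINES = 10;

function classifyExceptions(logLines: string[]): ExceptionFinding[] {
  const findings: ExceptionFinding[] = [];
  for (let i = 0; i < logLines.length; i++) {
    const line = logLines[i];
    if (line.indexOf("EXCEPTION_THROWN|") === -1) continue;

    // The exception text is whatever follows the last '|' on the line.
    const message = line.substring(line.lastIndexOf("|") + 1);

    // Scan at most LOOKAHEAD_LINES following lines for the deciding event.
    let kind: ExceptionKind = "unknown";
    const end = Math.min(logLines.length, i + 1 + LOOKAHEAD_LINES);
    for (let j = i + 1; j < end; j++) {
      if (logLines[j].indexOf("FATAL_ERROR") !== -1) {
        kind = "unhandled";
        break;
      }
      if (logLines[j].indexOf("METHOD_EXIT") !== -1) {
        kind = "handled";
        break;
      }
    }
    findings.push({ lineIndex: i, message, kind });
  }
  return findings;
}
```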

3. SOQL Parser - The Zero-Row Detector


SOQL_EXECUTE_BEGIN|[23]|SELECT Id FROM Account WHERE ExternalId__c = 'VND-001'
SOQL_EXECUTE_END|[23]|Rows:0

A query that returns 0 rows isn't an error. But if your integration expects to find a matching record and doesn't, the entire downstream process silently does nothing. The SOQL parser flags zero-row results as data issues.
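As a sketch of the idea, the function below pairs each SOQL_EXECUTE_BEGIN with the next SOQL_EXECUTE_END and keeps the queries that returned no rows. It assumes the simplified line format shown above (real debug log lines carry extra prefix fields), and the names are hypothetical.

```typescript
interface ZeroRowQuery {
  sourceLine: number; // the [NN] Apex source line from the log event
  query: string;
}

function findZeroRowQueries(logLines: string[]): ZeroRowQuery[] {
  const results: ZeroRowQuery[] = [];
  let pending: ZeroRowQuery | null = null;

  for (const line of logLines) {
    const begin = /SOQL_EXECUTE_BEGIN\|\[(\d+)\]\|(.*)$/.exec(line);
    if (begin) {
      pending = { sourceLine: parseInt(begin[1], 10), query: begin[2] };
      continue;
    }
    const end = /SOQL_EXECUTE_END\|\[\d+\]\|Rows:(\d+)/.exec(line);
    if (end && pending !== null) {
      if (parseInt(end[1], 10) === 0) {
        results.push(pending); // the query ran but matched nothing
      }
      pending = null;
    }
  }
  return results;
}
```

Whether a zero-row result is a problem depends on intent, so the output is a list of candidates for the AI to weigh against what the integration was supposed to find.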

4. Governor Limits Parser - The Time Bomb Detector


Number of SOQL queries: 82 out of 100 (82%) → WARNING
Number of DML rows: 9,800 out of 10,000 (98%) → CRITICAL

At 95% of governor limits, everything works. At 101%, everything breaks. The governor parser calculates percentages and flags anything over 80% as a warning.
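The percentage check could look roughly like the sketch below. The 80% warning threshold comes from the text above; the 95% critical threshold is an assumption (the article's example only shows 98% rated CRITICAL), and the names are hypothetical.

```typescript
type LimitSeverity = "OK" | "WARNING" | "CRITICAL";

interface LimitUsage {
  name: string;
  used: number;
  max: number;
  percent: number; // rounded, for display
  severity: LimitSeverity;
}

const WARNING_PERCENT = 80;  // from the article: anything over 80% is a warning
const CRITICAL_PERCENT = 95; // assumption: the exact critical cutoff is not stated

// Parse lines such as "Number of SOQL queries: 82 out of 100".
function parseLimitLine(line: string): LimitUsage | null {
  const match = /^\s*([A-Za-z ]+):\s*([\d,]+) out of ([\d,]+)/.exec(line);
  if (!match) return null;

  const used = parseInt(match[2].replace(/,/g, ""), 10);
  const max = parseInt(match[3].replace(/,/g, ""), 10);
  if (max === 0) return null; // avoid dividing by zero

  const exactPercent = (used * 100) / max;
  let severity: LimitSeverity = "OK";
  if (exactPercent >= CRITICAL_PERCENT) {
    severity = "CRITICAL";
  } else if (exactPercent > WARNING_PERCENT) {
    severity = "WARNING";
  }
  return {
    name: match[1].trim(),
    used,
    max,
    percent: Math.round(exactPercent),
    severity,
  };
}
```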

5-7. DML, Flow, and Debug Message Parsers

DML Parser: Flags bulk operations (>200 rows) that might cause partial failures
Flow Parser: Tracks 16 flow event types, flags FLOW_ELEMENT_ERROR and FLOW_ELEMENT_FAULT
Debug Messages: Extracts System.debug() output where developers log errors the system doesn't track
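For the DML and Flow checks, a minimal sketch might look like this. It assumes DML_BEGIN lines of the form DML_BEGIN|[30]|Op:Insert|Type:Account|Rows:250 and covers only the two flow failure events named above, not all 16 flow event types; the function name and warning wording are hypothetical.

```typescript
const BULK_ROW_THRESHOLD = 200;
const FLOW_FAILURE_EVENTS: string[] = ["FLOW_ELEMENT_ERROR", "FLOW_ELEMENT_FAULT"];

function scanDmlAndFlow(logLines: string[]): string[] {
  const warnings: string[] = [];
  for (const line of logLines) {
    // Flag DML statements that touch more than BULK_ROW_THRESHOLD rows.
    const dml = /DML_BEGIN\|.*\|Op:(\w+)\|Type:([\w.]+)\|Rows:(\d+)/.exec(line);
    if (dml && parseInt(dml[3], 10) > BULK_ROW_THRESHOLD) {
      warnings.push("Bulk DML: " + dml[1] + " on " + dml[2] + " (" + dml[3] + " rows)");
    }
    // Flag flow elements that errored or took a fault path.
    for (const eventName of FLOW_FAILURE_EVENTS) {
      if (line.indexOf(eventName) !== -1) {
        warnings.push("Flow failure event: " + eventName);
      }
    }
  }
  return warnings;
}
```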

Smart Error Handling

Salesforce API errors are notoriously cryptic. sf-log-mcp classifies them into 9 categories with actionable messages:

Session expired → "Re-authenticate with: sf org login web --alias <org>"
API limit exceeded → "Wait and retry, or check API usage in Setup"
Insufficient permissions → "User needs View All Data or Manage Users"
Entity already traced → "Use manage_trace_flags to find the existing flag"

No more googling Salesforce error codes.
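One way to picture the classification is a small rule table that maps raw error text to a category and a hint. The sketch covers only the four categories listed above; the match patterns are assumptions based on common Salesforce error codes and messages, and the category names are hypothetical.

```typescript
interface ErrorRule {
  category: string;
  pattern: RegExp; // assumption: matched against the raw API error text
  hint: string;
}

const ERROR_RULES: ErrorRule[] = [
  {
    category: "SESSION_EXPIRED",
    pattern: /INVALID_SESSION_ID|session expired/i,
    hint: "Re-authenticate with: sf org login web --alias <org>",
  },
  {
    category: "API_LIMIT_EXCEEDED",
    pattern: /REQUEST_LIMIT_EXCEEDED/i,
    hint: "Wait and retry, or check API usage in Setup",
  },
  {
    category: "INSUFFICIENT_PERMISSIONS",
    pattern: /INSUFFICIENT_ACCESS/i,
    hint: "User needs View All Data or Manage Users",
  },
  {
    category: "ENTITY_ALREADY_TRACED",
    pattern: /already being traced/i,
    hint: "Use manage_trace_flags to find the existing flag",
  },
];

// Return the first matching rule, or fall back to the raw message.
function classifyError(message: string): { category: string; hint: string } {
  for (const rule of ERROR_RULES) {
    if (rule.pattern.test(message)) {
      return { category: rule.category, hint: rule.hint };
    }
  }
  return { category: "UNKNOWN", hint: message };
}
```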

Security Model

No credentials stored - Reuses SF CLI auth from ~/.sf/
Org allowlist - --allowed-orgs restricts which orgs the server can access
Stdio transport - No HTTP server, no open ports
SOQL injection protection - All user inputs are escaped
Read-only by default - Only delete_debug_logs and manage_trace_flags modify state (and only debug infrastructure, not business data)

The Numbers

9 MCP Tools
7 Log Parsers
2,488 Source Lines
1,069 Test Lines
101 Tests Passing
15 Test Suites
3 Production Dependencies
44.8 KB npm Package Size
3 Node.js Versions Supported (18, 20, 22)

Try It in 2 Minutes

Prerequisites

Node.js >= 18
Salesforce CLI (sf) authenticated to an org

Setup


npx sf-log-mcp --allowed-orgs ALLOW_ALL_ORGS

Configure Your AI Client

Claude Desktop:


{
    "mcpServers": {
        "sf-log-mcp": {
            "command": "npx",
            "args": ["-y", "sf-log-mcp", "--allowed-orgs", "ALLOW_ALL_ORGS"]
        }
    }
}

VS Code / Cursor:


{
    "servers": {
        "sf-log-mcp": {
            "command": "npx",
            "args": ["-y", "sf-log-mcp", "--allowed-orgs", "ALLOW_ALL_ORGS"]
        }
    }
}

Then ask your AI: "List my recent Salesforce debug logs"

Multi-Server Setup

sf-log-mcp is designed to complement, not replace:


AI Client (Claude Desktop / VS Code / Cursor)
│
├── sf-log-mcp (this project)
│ Fetch, analyze, search debug logs
│ Detect silent failures
│
├── @certinia/apex-log-mcp (optional)
│ Deep performance profiling
│ CPU bottleneck detection
│
└── @salesforce/mcp (optional)
  SOQL queries, metadata, test runs

sf-log-mcp fetches the log and saves it to disk. Certinia's tools read the same file for performance analysis. The AI combines both results: silent failure detection plus performance profiling in one conversation.

What I Learned Building This

1. The Status Field is a Lie

This was the core insight that shaped the entire architecture. Filtering by Status = 'Fatal Error' catches maybe 5-10% of real issues. The rest are silent: HTTP 200s with error bodies, caught exceptions, empty query results, skipped flow paths. The only way to find them is to read the actual log content.

2. MCP is the Right Abstraction

Before MCP, I would have built a CLI tool or a VS Code extension. MCP means I build once and it works everywhere (Claude Desktop, VS Code, Cursor, Windsurf, any future client). The AI decides when and how to use the tools. I just expose the capabilities.

3. Parsers Need to Be Opinionated

A generic parser that returns "here are all the events" is useless to an AI. The parsers need to warn - "this callout returned 200 but the body contains an error keyword." That opinion is what makes the AI's analysis actionable.

4. Health Scores Drive Efficient Debugging

Without the health score, the AI would analyze every section of every log. With it, the AI triages first: "This log is CRITICAL, let me check callouts and exceptions." It cuts the number of tool calls in half.

5. Three Dependencies is Enough

@modelcontextprotocol/sdk for MCP, @salesforce/core for auth + API, zod for validation. That's it. No express, no axios, no lodash. The entire package is 44.8 KB.

What's Next

Windsurf testing - Verifying compatibility with the Windsurf AI IDE
Real-time log tailing - Stream logs as they're generated (SSE transport)
Custom analysis rules - User-defined patterns for domain-specific silent failures
Certinia integration guide - Step-by-step workflow combining both servers

Top comments (3)

Carlos Arias • Apr 6

I'm confused between mcp server and acp server. Can you explain that?

Likhit Kumar V P • Apr 9

Hey, MCP connects an AI assistant to external tools and data sources (like your sf-log-mcp giving Claude the ability to fetch Salesforce logs).
MCP is AI ↔ Tools

ACP is about multiple AI agents talking to and coordinating with each other, passing tasks down a pipeline.
ACP is Agent ↔ Agent.

For example:
MCP -> You ask Cursor/Claude "What's wrong with my Salesforce integration?" Cursor autonomously calls the sf-log-mcp tools, fetches the logs, analyzes them, and replies. One AI, using tools.

ACP -> You ask an orchestrator agent "Debug my Salesforce integration and file a Jira ticket." It delegates to a log-fetching agent, which hands results to an analysis agent, which hands a summary to a Jira-writing agent that creates the ticket. Multiple agents, coordinating.

Harjot Singh • Jun 1

this is such a necessary tool for Salesforce debugging. the silent failures can be really frustrating to track down. at Moonshift, we help developers get a full next.js + postgres + auth app deployed in about 7 minutes, and you own the code on your github. if you're interested, I can set you up with a free build to check it out.