Stephen Goldberg for Harper

Run Your Harper AI Agent on Google Cloud Vertex AI — 3 Files Changed

In Part 1 we built a conversational AI agent on Harper — semantic cache, vector memory, local embeddings, web search, chat UI — all in one process. It talked to Claude through Anthropic's direct API.

That works great for solo developers and startups. But if your org runs on Google Cloud, you probably want Claude going through Vertex AI — same billing, same IAM, same audit logs as everything else in your GCP project.

Good news: it took three file changes and zero rewrites to the agent logic.

Why Vertex AI?

If you're already on GCP, running Claude through Vertex means:

  • Consolidated billing — Claude costs show up in the same invoice as your Compute Engine, BigQuery, and Cloud Storage
  • IAM and org policies — control who can call Claude with the same roles and permissions you already manage
  • Data residency — choose regional, multi-region, or global endpoints depending on where your data needs to stay
  • No API key management — authenticate with GCP service accounts instead of passing around Anthropic API keys
  • Quota controls — set per-project, per-model token limits through GCP's quota system

The underlying model is identical. Same Claude, same capabilities, same response quality. The only difference is the front door.

What We Changed

The Anthropic SDK and the Vertex SDK share the same messages.create() interface. The only difference is how you initialize the client.

1. Install the Vertex SDK

npm install @anthropic-ai/vertex-sdk

One new dependency. It sits alongside @anthropic-ai/sdk — both stay installed so you can switch between providers with an environment variable.

2. Update the config (lib/config.js)

Before — Anthropic-only:

export const config = {
  anthropic: {
    apiKey: () => required('ANTHROPIC_API_KEY'),
    model: () => optional('CLAUDE_MODEL', 'claude-sonnet-4-5-20250929'),
  },
}

After — provider-aware:

export const config = {
  provider: () => optional('LLM_PROVIDER', 'anthropic'),
  anthropic: {
    apiKey: () => required('ANTHROPIC_API_KEY'),
    model: () => optional('CLAUDE_MODEL', 'claude-sonnet-4-5-20250929'),
  },
  vertex: {
    projectId: () => required('VERTEX_PROJECT_ID'),
    region: () => optional('VERTEX_REGION', 'global'),
    model: () => optional('VERTEX_MODEL', 'claude-sonnet-4-6'),
  },
}

LLM_PROVIDER controls which path the agent takes. Default is anthropic, so existing deployments don't break.
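The `required` and `optional` helpers aren't shown in the post. A minimal sketch of how they might look, reading environment variables lazily (the names and signatures here are assumptions based on the config snippet above, not the repo's actual code):

```javascript
// Hypothetical env helpers assumed by lib/config.js.
// Each config entry is a function, so a variable is only read
// (and validated) when the agent actually takes that provider path.
const required = (name) => {
  const value = process.env[name]
  if (!value) throw new Error(`Missing required env var: ${name}`)
  return value
}

const optional = (name, fallback) => process.env[name] ?? fallback
```

The lazy-getter pattern matters here: `VERTEX_PROJECT_ID` is only demanded when the vertex branch runs, so Anthropic-only deployments never need it set.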

3. Update the agent (resources/Agent.js)

The client initialization goes from a one-liner to a conditional:

import Anthropic from '@anthropic-ai/sdk'
import { AnthropicVertex } from '@anthropic-ai/vertex-sdk'

let _client
const getClient = () => {
  if (_client) return _client
  if (config.provider() === 'vertex') {
    _client = new AnthropicVertex({
      projectId: config.vertex.projectId(),
      region: config.vertex.region(),
    })
  } else {
    _client = new Anthropic({ apiKey: config.anthropic.apiKey() })
  }
  return _client
}

Every downstream call — getClient().messages.create(...) — stays exactly the same. The Vertex SDK is API-compatible with the Anthropic SDK. Same messages, same tools, same system, same max_tokens. No refactoring.

The only functional difference: Anthropic's server-side web search tool isn't available on Vertex by default (it requires a GCP org policy change), so we skip it:

const tools = isVertex() ? [] : [WEB_SEARCH_TOOL]

let apiResponse = await getClient().messages.create({
  model: getModel(),
  max_tokens: 1024,
  ...(tools.length && { tools }),
  system: SYSTEM_PROMPT,
  messages,
})
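The `isVertex()` and `getModel()` helpers referenced above aren't defined in the snippets. One plausible shape, with minimal stand-ins for the config getters (these stand-ins are assumptions for illustration, not the repo's actual code):

```javascript
// Minimal stand-ins for the config getters from lib/config.js
const config = {
  provider: () => process.env.LLM_PROVIDER ?? 'anthropic',
  anthropic: {
    model: () => process.env.CLAUDE_MODEL ?? 'claude-sonnet-4-5-20250929',
  },
  vertex: {
    model: () => process.env.VERTEX_MODEL ?? 'claude-sonnet-4-6',
  },
}

// Hypothetical helpers assumed by resources/Agent.js:
// branch once on the provider, resolve the model name accordingly.
const isVertex = () => config.provider() === 'vertex'
const getModel = () =>
  isVertex() ? config.vertex.model() : config.anthropic.model()
```

Everything downstream of `getClient()` and `getModel()` is provider-blind, which is what keeps the diff to three files.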

That's it. The semantic cache, vector context, local embeddings, cost tracking, chat UI — all untouched.

GCP Setup (5 Minutes)

If you don't have a GCP project yet, create one at console.cloud.google.com. Then:

1. Enable the Vertex AI API:

https://console.developers.google.com/apis/api/aiplatform.googleapis.com/overview?project=YOUR_PROJECT_ID

2. Enable Claude in the Model Garden:

Go to Vertex AI Model Garden, search "Claude", pick the model you want (e.g. Claude Sonnet 4.6), and enable it. You'll agree to Anthropic's terms here.

3. Request quota:

New projects start with 0 tokens/min for partner models. Go to IAM & Admin → Quotas, filter for your Claude model, and request an increase. Even 100K tokens/min is plenty for testing.

4. Create a service account:

Go to IAM → Service Accounts → Create. Give it the Vertex AI User role. Download the JSON key.

5. Configure .env:

LLM_PROVIDER=vertex
VERTEX_PROJECT_ID=my-gcp-project
VERTEX_REGION=us-east5
GOOGLE_APPLICATION_CREDENTIALS=./my-service-account-key.json

6. Start the agent:

npm run dev

Open http://localhost:9926/Chat and start chatting. The agent is now running Claude through your GCP project.

Switching Back

Want to go back to Anthropic's direct API? Change one line:

LLM_PROVIDER=anthropic

Restart. Done. Web search comes back automatically.

What Didn't Change

This is the part worth emphasizing. Switching to Vertex AI required zero changes to:

  • The semantic cache — Harper's HNSW vector search doesn't care where the LLM response came from
  • The local embeddings — bge-small-en-v1.5 runs in-process regardless of LLM provider
  • The schema — same three tables, same vector index, same TTL
  • The chat UI — same HTML, same WebSocket-free polling, same sidebar stats
  • The cost tracking — token counts come back in the same format from both SDKs
  • The deployment — harperdb deploy . works the same way

Harper handles everything below the LLM call. The LLM call itself is a single function with two possible backends. Swapping backends is a config change, not a rewrite.
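Both SDKs report token counts in the same `usage` shape on the response (`input_tokens` / `output_tokens`), which is why cost tracking can stay provider-agnostic. A minimal sketch of that idea, with illustrative per-million-token prices (made-up numbers, not real rates):

```javascript
// Provider-agnostic cost estimate: works on the `usage` object from
// either the Anthropic or the Vertex SDK response, since both use the
// same { input_tokens, output_tokens } shape.
// Prices are illustrative placeholders, expressed per million tokens.
const estimateCostUSD = (usage, inputPricePerMTok, outputPricePerMTok) =>
  (usage.input_tokens / 1e6) * inputPricePerMTok +
  (usage.output_tokens / 1e6) * outputPricePerMTok
```

Because the function only touches the `usage` object, swapping `LLM_PROVIDER` doesn't require touching the cost-tracking path at all.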

The Full Picture

.env: LLM_PROVIDER=vertex
         │
         ▼
┌─────────────────────────┐
│      resources/Agent.js │
│                         │
│  getClient() ─────────► AnthropicVertex (GCP credentials)
│  getModel()  ─────────► claude-sonnet-4-6
│                         │
│  Everything else:       │
│  same cache, same       │
│  embeddings, same       │
│  vector search, same    │
│  cost tracking          │
└────────┬────────────────┘
         │
         ▼
┌─────────────────────────┐
│         Harper          │
│  DB + Vector + Cache +  │
│  API + Embeddings       │
│  (unchanged)            │
└─────────────────────────┘

Try It

The repo is at github.com/stephengoldberg/agent-example-harper. Clone it, pick your provider, and npm run dev.

If you're already running the agent from Part 1, the diff is small:

npm install @anthropic-ai/vertex-sdk
# Update .env with your GCP config
npm run dev

Three files changed. Zero agent logic rewritten. Same Claude, enterprise-grade GCP integration.
