Drop-in Perplexity Sonar Replacement with AWS Bedrock Nova Grounding

#aws #bedrock #python #webdev

Part 2 of my series on building a low-cost personal AI stack on AWS.
Part 1 — I Squeezed My $1k Monthly OpenClaw API Bill with ~$20/Month in AWS Credits
Part 3 — From 3-Minute Cold Starts to ~20 Seconds: Whisper on AWS Lambda + EFS

If you're running an AI assistant or agent framework that uses Perplexity's Sonar API for web search, you're paying per query — or burning through your monthly credit allocation faster than you'd like.

I'm on Perplexity Pro, which comes with $5/month in API credits. Sounds fine until you hit mid-month and realize OpenClaw has quietly burned through all of it. I wanted something uncapped that didn't add another bill. If you're an AWS user with any credits sitting around — that $25 from a workshop, an event promo, or re:Invent swag — there's a better option: route those queries through Amazon Bedrock's Nova Premier grounding instead.

I built bedrock-web-search-proxy, a FastAPI proxy that makes Bedrock Nova Premier look like the Perplexity Sonar API to your client. In most apps you change one URL and keep everything else the same.

What is Nova Grounding?

Amazon Nova Premier supports a nova_grounding system tool that lets the model search the web in real time and return answers with citations, much like Perplexity Sonar. The difference is billing: it runs on Bedrock, so usage counts against your AWS credits rather than a separate Perplexity subscription.
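
To make the idea concrete, here is a minimal sketch of what a grounded request to Bedrock might look like when sent directly, without the proxy. It assumes the Bedrock Converse API accepts the grounding tool as a `systemTool` entry in `toolConfig`; the helper names `build_converse_request` and `ask_nova` are made up for illustration and are not taken from the proxy's source. Check the current Bedrock documentation before relying on the exact request shape.

```python
import json

MODEL_ID = "us.amazon.nova-premier-v1:0"  # model ID quoted in the Setup section


def build_converse_request(question: str, max_tokens: int = 200) -> dict:
    """Build keyword arguments for a Converse call with web grounding enabled."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": question}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
        # Assumption: grounding is switched on by listing the system tool here.
        "toolConfig": {"tools": [{"systemTool": {"name": "nova_grounding"}}]},
    }


def ask_nova(question: str, region: str = "us-east-1") -> dict:
    """Send the request and return the raw response (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("bedrock-runtime", region_name=region)
    return client.converse(**build_converse_request(question))


if __name__ == "__main__":
    # Print the request body only; no AWS call is made here.
    print(json.dumps(build_converse_request("What is the Bitcoin price right now?"), indent=2))
```

The proxy's job is essentially to translate a Perplexity-style chat completion into a call like this and to reshape the reply, so your app never has to know about Bedrock.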

Why Not Just Use Brave Search's Free Tier?

Brave does have an AI Answers API that returns synthesized answers with citations — similar to Perplexity. Two catches though:

Credit card required — even the $5/month free tier needs a card on file as an anti-fraud measure
Undocumented model — Brave doesn't clearly disclose which LLM powers the answers, so you're trusting a black box

With Nova grounding, you know exactly what's running (Nova Premier on Bedrock), and it counts against AWS credits you likely already have. No new billing relationship, no mystery model.

Apps That Use Perplexity API

The wrapper is a drop-in replacement for any app that supports Perplexity as a provider:

OpenClaw — tools.web.search.perplexity.baseUrl config
Open WebUI — web search integration
LibreChat — via Perplexity MCP server
Cursor — Perplexity MCP for web research
Continue.dev — Sonar models for codebase context
AnythingLLM — Perplexity as cloud LLM provider
LiteLLM — web search interception

Proof It's Actually Grounded (Not Hallucinated)

Here's a direct API call asking for the current Bitcoin price:

curl -s http://localhost:7000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nova-premier-web-grounding",
    "messages": [{"role": "user", "content": "What is the Bitcoin price right now?"}],
    "max_tokens": 200
  }'

Response:

{
  "choices": [{
    "message": {
      "content": "The current price of Bitcoin (BTC) is $67,254.57 USD, reflecting a 0.54% increase in the last 24 hours. Last updated February 20, 2026 at 04:34 UTC."
    }
  }],
  "citations": [
    "https://www.latestly.com/technology/bitcoin-price-today-february-20-2026-btc-price-at-usd-67243-up-compared-to-yesterdays-usd-66941-mark-7321498.html",
    "https://www.binance.com/en/price/bitcoin"
  ]
}

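The first citation URL carries the date of the request (February 20, 2026) in its slug, matching the timestamp in the answer. That is strong evidence that Nova Premier fetched and synthesized live web content rather than hallucinating it.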
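
If you would rather run the same check from Python, a rough equivalent of the curl call could look like the sketch below. It uses only the standard library and assumes the proxy is listening on localhost:7000 as in the example above. The function names, including the `citations_dated` helper that looks for a spelled-out date in a citation slug, are hypothetical and exist only for this illustration.

```python
import json
import urllib.request
from datetime import date

PROXY_URL = "http://localhost:7000/v1/chat/completions"  # same endpoint as the curl call

MONTHS = ("january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december")


def ask(question: str, max_tokens: int = 200) -> dict:
    """POST a non-streaming chat completion to the proxy and return the parsed JSON."""
    body = json.dumps({
        "model": "nova-premier-web-grounding",
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
    }).encode("utf-8")
    req = urllib.request.Request(
        PROXY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def answer_and_citations(payload: dict) -> tuple[str, list[str]]:
    """Pull the answer text and the top-level citations list out of a response."""
    text = payload["choices"][0]["message"]["content"]
    return text, payload.get("citations", [])


def citations_dated(citations: list[str], day: date) -> list[str]:
    """Return citation URLs whose slug spells out the given day, e.g. 'february-20-2026'."""
    slug = f"{MONTHS[day.month - 1]}-{day.day}-{day.year}"
    return [url for url in citations if slug in url.lower()]
```

Calling `citations_dated(citations, date.today())` on a fresh response gives a quick, if crude, freshness signal. Many sources do not put dates in their URLs, so an empty result does not mean the answer is stale.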

Setup

1. Install and run (one line, no cloning needed)

uvx --from git+https://github.com/gabrielkoo/bedrock-web-search-proxy bedrock-web-search-proxy

Or run directly from the raw script:

uv run https://raw.githubusercontent.com/gabrielkoo/bedrock-web-search-proxy/main/main.py

Both require uv and AWS credentials with bedrock:InvokeModel on us.amazon.nova-premier-v1:0. Region defaults to us-east-1 — override with AWS_DEFAULT_REGION if needed.
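
The region rule described above can be pictured as a one-line lookup. This is only a sketch of the behaviour stated in this post (use AWS_DEFAULT_REGION if set, otherwise us-east-1); `resolve_region` is a hypothetical name, and the proxy's actual code may resolve the region differently.

```python
import os

DEFAULT_REGION = "us-east-1"


def resolve_region(env=None) -> str:
    """Return AWS_DEFAULT_REGION if it is set and non-empty, otherwise us-east-1."""
    env = os.environ if env is None else env
    return env.get("AWS_DEFAULT_REGION") or DEFAULT_REGION


if __name__ == "__main__":
    print("Bedrock region:", resolve_region())
```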

2. Configure your app

For OpenClaw, update ~/.openclaw/openclaw.json:

{
  "tools": {
    "web": {
      "search": {
        "provider": "perplexity",
        "perplexity": {
          "baseUrl": "http://localhost:7000/v1",
          "apiKey": "nova-grounding",
          "model": "nova-premier-web-grounding"
        }
      }
    }
  }
}

⚠️ The apiKey must not be a real pplx- key — OpenClaw detects that prefix and overrides baseUrl back to Perplexity's servers.
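
Because that prefix check fails silently (queries simply go back to Perplexity), it can be worth sanity-checking the config before restarting. The sketch below inspects a config dict with the same shape as the JSON above; `check_perplexity_block` is a hypothetical helper written for this post, not part of OpenClaw or the proxy.

```python
def check_perplexity_block(config: dict) -> list[str]:
    """Return a list of problems with tools.web.search (an empty list means it looks fine)."""
    problems = []
    search = config.get("tools", {}).get("web", {}).get("search", {})
    pplx = search.get("perplexity", {})
    if search.get("provider") != "perplexity":
        problems.append("provider is not 'perplexity'")
    if not pplx.get("baseUrl"):
        problems.append("baseUrl is missing")
    # A real Perplexity key makes OpenClaw ignore baseUrl, per the warning above.
    if str(pplx.get("apiKey", "")).startswith("pplx-"):
        problems.append("apiKey starts with 'pplx-', so baseUrl would be overridden")
    return problems
```

Load ~/.openclaw/openclaw.json with whatever parser matches your file, pass the resulting dict in, and print anything the function returns.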

For other apps, just point the Perplexity base URL to http://your-host:7000/v1 and use any model name — the wrapper routes everything to Nova Premier.
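
For code you write yourself, the same idea applies through the OpenAI Python SDK. The sketch below assumes the `openai` package is installed and the proxy is reachable on port 7000. The API key is a placeholder, as in the OpenClaw config. Reading `citations` off the response relies on the SDK keeping unknown top-level fields as extra attributes, which is an assumption worth verifying against your SDK version. `client_settings` and `grounded_answer` are names invented for this example.

```python
def client_settings(host: str = "localhost", port: int = 7000) -> dict:
    """Connection settings for the proxy; the API key is a placeholder value."""
    return {"base_url": f"http://{host}:{port}/v1", "api_key": "nova-grounding"}


def grounded_answer(question: str, model: str = "sonar-pro") -> tuple[str, list]:
    """Ask the proxy through the OpenAI SDK and return (answer, citations)."""
    from openai import OpenAI  # third-party: pip install openai

    client = OpenAI(**client_settings())
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
        max_tokens=200,
    )
    # `citations` is not part of the OpenAI schema, so fall back to an empty list
    # if the SDK does not expose it as an attribute.
    citations = getattr(resp, "citations", None) or []
    return resp.choices[0].message.content, citations
```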

Model Aliases

All standard Perplexity model names are accepted and routed to Nova Premier (the only Nova model that currently supports the grounding tool):

Request model	Bedrock model
`nova-premier-web-grounding`	`us.amazon.nova-premier-v1:0`
`sonar-pro`, `sonar-pro-online`	`us.amazon.nova-premier-v1:0`
`sonar`, `sonar-mini`, `sonar-turbo`	`us.amazon.nova-premier-v1:0`
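
In code, the table boils down to a lookup with a single fallback. The following is an illustrative sketch of that routing, not a copy of the proxy's source. `MODEL_ALIASES` and `resolve_model` are hypothetical names, and the fallback reflects the earlier note that any model name ends up on Nova Premier.

```python
BEDROCK_MODEL = "us.amazon.nova-premier-v1:0"

# Every alias from the table points at the same Bedrock model.
MODEL_ALIASES = {
    name: BEDROCK_MODEL
    for name in (
        "nova-premier-web-grounding",
        "sonar-pro",
        "sonar-pro-online",
        "sonar",
        "sonar-mini",
        "sonar-turbo",
    )
}


def resolve_model(requested: str) -> str:
    """Map a requested model name to a Bedrock model ID, defaulting to Nova Premier."""
    return MODEL_ALIASES.get(requested, BEDROCK_MODEL)
```

Keeping the aliases in a dict rather than hard-coding one return value leaves room to route specific names elsewhere if another Nova model gains grounding support.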

Cost

Nova Premier usage counts against your AWS credits — so if you have any sitting around (a $25 promo from an AWS event, workshop, or re:Invent swag bag), this is effectively free. Check your Billing console — you might have more than you think.

AWS Community Builders: covered by $500/year credits
Others with AWS credits: same deal — credits apply
No credits: check the Bedrock pricing page for current Nova Premier rates

Caveats

Streaming doesn't return citations[] — Nova limitation. Non-streaming works fine, and OpenClaw's web_search tool uses non-streaming.
MAX_CONCURRENT semaphore defaults to 5 — tune via env var if needed.
Region: Nova Premier grounding requires us-east-1, which is also the proxy's default region.
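
For readers unfamiliar with the semaphore pattern mentioned above, here is a minimal asyncio sketch of capping in-flight work. It assumes the environment variable is literally named MAX_CONCURRENT. `run_limited` is a hypothetical helper, and the proxy may wire its semaphore into FastAPI differently.

```python
import asyncio
import os

# Assumption: the limit is read from an env var named MAX_CONCURRENT, default 5.
MAX_CONCURRENT = int(os.environ.get("MAX_CONCURRENT", "5"))


async def run_limited(jobs, limit: int = MAX_CONCURRENT):
    """Run zero-argument coroutine functions with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Callers beyond the limit wait here instead of failing.
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With this pattern, a burst of requests queues up behind the limit rather than hitting Bedrock all at once, which is the usual reason to raise or lower the value.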

Wrapping Up

If you're already on AWS Bedrock for your LLM workloads, there's no reason to pay Perplexity separately for web-grounded search. The wrapper is ~350 lines of Python, has 44 tests, and is OpenAI SDK-compatible — so it works with anything that speaks the Perplexity or OpenAI chat completions API.

Repo: github.com/gabrielkoo/bedrock-web-search-proxy