DEV Community

Tommaso Bertocchi

Originally published at openosint.tech

I Built an Open-Source, Agentic OSINT Platform You Run With Your Own API Key

TL;DR

  • What: an open-source, agentic OSINT platform — 16 tools, one natural-language interface.
  • How it's different: BYOK (bring your own LLM key) + a client-side agent loop + a stateless backend. Your keys and your queries stay with you.
  • Where: github.com/OpenOSINT/OpenOSINT · live demo at demo.openosint.tech
  • Run it: git clone, uv sync, uv run openosint repl. ~2 minutes.

OSINT tooling forces a bad trade-off. On one side: a pile of disconnected CLI scripts you glue together by hand. On the other: a polished SaaS that wants your investigation data on its servers and a subscription on your card.

I wanted the power of an orchestrated platform with the privacy of a local script — so I built OpenOSINT: the LLM does the orchestration, you bring your own API key, and nothing sensitive touches a server you don't control.

What it actually is

OpenOSINT is a toolbox plus a brain.

The toolbox: 16 OSINT tools — username lookups, email/domain intelligence, IP geolocation, breach checks, metadata extraction, and more. Each is a clean, typed function with one job.

Tool categories:
  • Identity — username enumeration across platforms, email validation & reputation
  • Network — IP geolocation, ASN/WHOIS, reverse DNS
  • Domain — DNS records, subdomain discovery, certificate data
  • Exposure — breach/leak checks, paste monitoring
  • Artifacts — file & image metadata extraction

(Exact tool list lives in the repo README — it grows over time.)
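To make "a clean, typed function with one job" concrete, here is a minimal sketch of what one tool could look like. The names `classify_ip` and `IPClassification` are hypothetical and are not taken from the OpenOSINT codebase; the sketch uses only the standard library and makes no network calls.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IPClassification:
    """Typed result of the hypothetical classify_ip tool."""
    address: str
    version: int
    is_private: bool
    is_global: bool


def classify_ip(address: str) -> IPClassification:
    """Classify one IP address offline. Raises ValueError on malformed input."""
    ip = ipaddress.ip_address(address)
    return IPClassification(
        address=str(ip),
        version=ip.version,
        is_private=ip.is_private,
        is_global=ip.is_global,
    )


if __name__ == "__main__":
    print(classify_ip("192.168.1.10"))
```

One input, one typed result, one failure mode: that shape is what lets a model (or a human) call the tool without guessing.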


The brain: an agent loop. You ask in plain language — "what can you find on this domain?" — the model picks the tools, chains them, and hands back a synthesized answer instead of 12 raw JSON blobs.

The twist: the brain runs on your key, and the loop runs client-side. The backend is a stateless dispatcher.
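The loop itself is a small idea. The sketch below is an illustration under assumptions, not OpenOSINT's implementation: `ToolCall`, `FinalAnswer`, `run_agent`, and the scripted model are all hypothetical names, and the "model" is a plain callable standing in for an LLM request made with your own key.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class ToolCall:
    name: str
    argument: str


@dataclass
class FinalAnswer:
    text: str


# Any callable that reads the transcript and decides the next step.
Model = Callable[[list[str]], Union[ToolCall, FinalAnswer]]
Tools = dict[str, Callable[[str], str]]


def run_agent(question: str, model: Model, tools: Tools, max_steps: int = 5) -> str:
    """Ask the model for a step, run the requested tool, repeat until it answers."""
    transcript = [f"user: {question}"]
    for _ in range(max_steps):
        step = model(transcript)
        if isinstance(step, FinalAnswer):
            return step.text
        tool = tools.get(step.name)
        if tool is None:
            transcript.append(f"error: unknown tool {step.name}")
            continue
        # Only real tool output enters the transcript.
        transcript.append(f"{step.name}({step.argument}) -> {tool(step.argument)}")
    return "stopped: step limit reached"


def scripted_model(transcript: list[str]) -> Union[ToolCall, FinalAnswer]:
    """Stand-in for an LLM: call one tool, then summarize."""
    if len(transcript) == 1:
        return ToolCall("dns_lookup", "example.com")
    return FinalAnswer(f"Summary based on {len(transcript) - 1} tool result(s).")


# Stub tool returning a documentation-range address, not a real lookup.
demo_tools: Tools = {"dns_lookup": lambda domain: f"{domain} A 192.0.2.1"}

if __name__ == "__main__":
    print(run_agent("what can you find on this domain?", scripted_model, demo_tools))
```

The step limit matters: without it, a model that keeps requesting tools would loop forever.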

Orchestration vs. privacy

Every setup makes you choose. OpenOSINT tries not to:

                    Loose scripts    Hosted SaaS    OpenOSINT
Orchestration       ❌ manual
Data stays local
Keys stay yours
One NL interface
Open source         sometimes        rarely         ✅ MIT

The architecture that matters: BYOK

One rule drives everything: the server should know as little as possible.

┌─────────────────────────────────────┐
│  Your machine / browser              │
│   ┌──────────────┐                   │
│   │  Agent loop  │  ← your LLM key   │
│   │ (client-side)│                   │
│   └──────┬───────┘                   │
│          │ tool call (no secrets)    │
└──────────┼──────────────────────────┘
           ▼
┌─────────────────────────────────────┐
│  Stateless FastAPI backend           │
│  - rate limiting                     │
│  - real client-IP detection          │
│  - dispatches the 16 tools           │
│  - holds NO investigation state      │
└─────────────────────────────────────┘

Three deliberate choices:

  1. BYOK. LLM inference is configured client-side, with adapters for multiple providers. Your key talks to your model — the backend never sees it.
  2. Client-side agent loop. The reasoning ("call this, then that") happens next to you, not on a shared server.
  3. Stateless backend. No stored query history. Restart it, scale it, throw it away — there's nothing sitting on a box you don't own.
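A stateless dispatcher can be sketched as a pure function: everything it needs arrives in the request, and nothing is kept afterwards. This is an assumption-laden illustration, not the project's FastAPI code; `dispatch`, `TOOLS`, `ALLOWED_FIELDS`, the request shape, and the toy `domain_labels` tool are all hypothetical.

```python
from types import MappingProxyType
from typing import Any, Callable, Mapping


def domain_labels(target: str) -> dict[str, Any]:
    """Toy tool standing in for a real lookup: split a domain into labels."""
    return {"labels": target.lower().rstrip(".").split(".")}


# Read-only registry: the only module-level data, identical for every request.
TOOLS: Mapping[str, Callable[[str], dict[str, Any]]] = MappingProxyType(
    {"domain_labels": domain_labels}
)

# A tool call carries a tool name and a target, nothing else.
ALLOWED_FIELDS = {"tool", "target"}


def dispatch(request: Mapping[str, Any]) -> dict[str, Any]:
    """Handle one tool call without reading or writing any stored state."""
    extra = set(request) - ALLOWED_FIELDS
    if extra:
        # Reject unexpected fields, for example an API key sent by mistake.
        return {"ok": False, "error": f"unexpected fields: {sorted(extra)}"}
    tool = TOOLS.get(str(request.get("tool")))
    if tool is None:
        return {"ok": False, "error": "unknown tool"}
    return {"ok": True, "result": tool(str(request.get("target", "")))}


if __name__ == "__main__":
    print(dispatch({"tool": "domain_labels", "target": "Example.COM."}))
```

Because the function keeps nothing between calls, any replica can serve any request, which is what makes "restart it, scale it, throw it away" safe.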

One core, four front-ends

The same 16 tools, four ways in — use OpenOSINT however you already work:

  • CLI — scripting and one-off lookups
  • REPL — interactive, conversational investigation in your terminal
  • MCP server — plug it into Claude or any MCP client; the tools become available to your assistant directly
  • Web UI — browser front-end with the visual graph

# CLI: one-shot
openosint lookup username johndoe

# REPL: interactive
openosint repl

Register a tool once → it shows up everywhere.
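One common way to get "register once, available everywhere" is a shared registry that every front-end reads from. The sketch below shows that pattern under assumptions; `REGISTRY`, the `tool` decorator, `cli`, and `mcp_tool_names` are hypothetical names, and the lookup is a stub rather than a real search.

```python
from typing import Callable

# Single source of truth shared by all front-ends.
REGISTRY: dict[str, Callable[[str], str]] = {}


def tool(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that registers a function under a tool name."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        if name in REGISTRY:
            raise ValueError(f"tool already registered: {name}")
        REGISTRY[name] = func
        return func
    return decorator


@tool("username")
def lookup_username(handle: str) -> str:
    # Stub body: a real tool would query platforms here.
    return f"(stub) would search platforms for {handle}"


def cli(argv: list[str]) -> str:
    """CLI-style front-end: expects [tool_name, target]."""
    name, target = argv
    return REGISTRY[name](target)


def mcp_tool_names() -> list[str]:
    """MCP-style front-end: advertise whatever is registered."""
    return sorted(REGISTRY)


if __name__ == "__main__":
    print(cli(["username", "johndoe"]))
    print(mcp_tool_names())
```

Neither front-end knows about `lookup_username` directly; adding a tool means adding one decorated function.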

The Entity Correlation Graph

Raw results are noise. What turns lookups into an investigation is seeing how entities connect: this username links to that email, which resolves to that domain, hosted on that IP.

OpenOSINT renders this as an interactive Entity Correlation Graph (Cytoscape.js): nodes are entities, edges are discovered relationships. You drag, zoom, and watch the picture assemble as the agent works.
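Cytoscape.js accepts a flat list of elements where each node is `{"data": {"id": ...}}` and each edge adds `source` and `target`. A rough sketch of turning discovered relationships into that shape follows; the `Link` type, the `kind:value` id convention, and `to_cytoscape_elements` are assumptions for illustration, not the project's data model.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str    # entity id in "kind:value" form, e.g. "username:johndoe"
    target: str
    relation: str


def to_cytoscape_elements(links: list[Link]) -> list[dict]:
    """Build Cytoscape.js elements: unique nodes first, then one edge per link."""
    node_ids: list[str] = []
    for link in links:
        for node_id in (link.source, link.target):
            if node_id not in node_ids:
                node_ids.append(node_id)
    nodes = [{"data": {"id": n, "type": n.split(":", 1)[0]}} for n in node_ids]
    edges = [
        {"data": {"id": f"e{i}", "source": link.source,
                  "target": link.target, "label": link.relation}}
        for i, link in enumerate(links)
    ]
    return nodes + edges


# Made-up example chain: username -> email -> domain -> IP (documentation range).
example_links = [
    Link("username:johndoe", "email:jd@example.com", "uses"),
    Link("email:jd@example.com", "domain:example.com", "at"),
    Link("domain:example.com", "ip:192.0.2.1", "resolves_to"),
]

if __name__ == "__main__":
    print(json.dumps(to_cytoscape_elements(example_links), indent=2))
```

The resulting list can be passed as the `elements` option when constructing a Cytoscape.js instance in the browser.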

[GIF: OpenOSINT Entity Correlation Graph building in real time]

This is quietly evolving toward a proper ontology + entity-resolution layer — the idea behind platforms like Palantir Gotham, but open and self-hosted.

Quick start (2 minutes)

With Python and uv:

git clone https://github.com/OpenOSINT/OpenOSINT.git
cd OpenOSINT
uv sync
uv run openosint repl

Bring your own key:

export OPENOSINT_LLM_PROVIDER=anthropic   # or openai, etc.
export OPENOSINT_API_KEY=sk-...

Then just ask:

> what can you find about the domain example.com?

The agent picks the tools, runs them, correlates the output, and answers.

[GIF: OpenOSINT REPL running an investigation end to end]

Prefer to look before installing? Live demo: demo.openosint.tech.

Drop it into Claude (or any MCP client)

Because OpenOSINT ships an MCP server, an MCP-aware assistant can call the 16 tools as part of its own reasoning — no copy-pasting between windows:

{
  "mcpServers": {
    "openosint": {
      "command": "uvx",
      "args": ["openosint", "mcp"]
    }
  }
}

Why BYOK is the whole point

"Bring your own key" reads like a cost feature. It isn't — it's a trust feature.

In OSINT, the query itself is sensitive. The username you're profiling, the domain you're digging into — that's signal about what you're working on. A hosted tool that proxies your LLM calls sees all of it.

With BYOK + a client-side loop:

  • Your prompts go straight from your machine to your model provider
  • Your API key is never transmitted to the backend
  • Your investigation state lives with you

Orchestration without handing over the thing you're trying to keep private.
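On the client side, the "key never reaches the backend" property can be enforced with a cheap guard before any tool request leaves your machine. This is a hedged sketch, not something the source says OpenOSINT does; `build_tool_request` and `assert_no_secret` are hypothetical helpers.

```python
import json


def build_tool_request(tool: str, target: str) -> dict:
    """The only data a tool call needs: which tool, and what to look up."""
    return {"tool": tool, "target": target}


def assert_no_secret(payload: dict, api_key: str) -> dict:
    """Return the payload unchanged, or raise if the key appears anywhere in it.

    Simple substring check on the serialized payload; it assumes the key
    contains no characters that JSON would escape (true for typical keys).
    """
    if api_key and api_key in json.dumps(payload):
        raise ValueError("refusing to send: payload contains the LLM API key")
    return payload


if __name__ == "__main__":
    key = "sk-example-not-a-real-key"
    safe = assert_no_secret(build_tool_request("dns", "example.com"), key)
    print(safe)
```

A check like this is a backstop, not the design itself: the real protection is that the tool-call path has no reason to touch the key at all.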

Where it's going

  • A real ontology layer — typed entities and relationships, not labeled blobs
  • Stronger entity resolution (deduping "the same person across three handles")
  • More tools, same client-side, key-safe model
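For a feel of what entity resolution involves, here is one textbook approach: treat each "these two identifiers are the same entity" finding as an edge and merge them with union-find. This illustrates the general idea only and says nothing about how OpenOSINT will implement it; `resolve_entities` and the sample identifiers are made up, and deciding which pairs actually match is the hard part that this sketch skips.

```python
def resolve_entities(same_as: list[tuple[str, str]]) -> list[set[str]]:
    """Group identifiers into clusters from pairwise 'same entity' evidence."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters: dict[str, set[str]] = {}
    for ident in list(parent):
        clusters.setdefault(find(ident), set()).add(ident)
    # Deterministic order: sort clusters by their sorted members.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Made-up evidence: three handles for one person, two for another.
evidence = [
    ("github:johndoe", "reddit:johndoe99"),
    ("reddit:johndoe99", "x:jd"),
    ("github:alice", "gitlab:alice"),
]

if __name__ == "__main__":
    for cluster in resolve_entities(evidence):
        print(sorted(cluster))
```

Note that `github:johndoe` and `x:jd` end up together without ever being compared directly; the merge is transitive, which is also why one bad match can wrongly fuse two people.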

The MIT core stays free and open. The direction is "self-hostable Gotham," not "another locked SaaS."


One question for you: which OSINT source would you wire in first? Drop it in the comments — that's genuinely how the roadmap gets prioritized. 👇

If this is useful, a ⭐ helps it reach more people:

OpenOSINT / OpenOSINT

AI-powered OSINT agent with interactive REPL, MCP server, and CLI. 16 tools. Works with Claude, GPT-4, or local models. For authorized security research only.

mcp-name: io.github.OpenOSINT/openosint

OpenOSINT

OSINT agent for security researchers and analysts: 18 investigation tools behind a natural-language interface.


Use it as a REPL, CLI, MCP server, or browser Web UI.


The AI issues hard-stop tool calls; your code executes the real binary — hallucinated findings are structurally impossible.


Badges: Release · PyPI · PyPI downloads · License MIT · GitHub Stars · MCP · MCP Registry · Sponsored by IP2Location

▶ Try the live demo

Run a real OSINT investigation in your browser — bring your own Anthropic / OpenRouter / Ollama key, no signup.

pip install openosint

Quick Start

# Interactive AI REPL (default)
openosint

# Web interface
openosint web

# Direct tool (no AI)
openosint email target@example.com

Usage

Start the REPL and investigate any target — the agent decides which tools to run and chains them on findings:

openosint > investigate target@example.com
  -> generate_dorks('target@example.com')
  -> search_email('target@example.com')
  Found: Spotify, WordPress, Gravatar, Office365

  -> search_breach('target@example.com')
  Found in 2 breaches: LinkedIn (2016), Adobe (2013)

  -> search_username('johndoe99')   <- pivoted from email findings
  Found: GitHub, Reddit,

Built solo, in the open. Issues and PRs welcome.
