Tommaso Bertocchi

Posted on May 21 • Originally published at openosint.tech

I Built an AI-Powered OSINT Agent That Investigates Targets Autonomously — From Your Terminal

#python #security #osint #mcp

Legal disclaimer: OpenOSINT is intended for legal and authorized use only — penetration testing with permission, investigating your own accounts, journalistic research. Users are solely responsible for compliance with applicable law. See DISCLAIMER.md.

You type a target. An AI agent decides which tools to run. It chains them based on findings. It writes you a structured report. You never touch a prompt.

That's OpenOSINT.

I've been building this since early this year and just hit v2.12.0. The project started as a simple MCP server wrapping a handful of OSINT binaries. It's grown into something I actually use daily: a full agentic OSINT framework with a terminal REPL, a web UI, a direct CLI, and full MCP server support for Claude Code and Claude Desktop.

Let me walk you through what it does and how it works.

What Is OpenOSINT?

OpenOSINT is a modular OSINT framework. The three main interfaces below, plus the local web UI covered later, all share the same 11-tool core:

Interface	How to invoke	What it is
AI REPL	`openosint`	Claude-powered terminal. Type targets in natural language. Agent decides what to run.
Direct CLI	`openosint email addr`	Run individual tools without AI, for scripting or quick lookups.
MCP Server	registered via `claude mcp add`	Exposes all 11 tools to any MCP-compatible client (Claude Code, Claude Desktop).

The framework is written in Python, built on asyncio, uses prompt_toolkit + Rich in the REPL, and the AI layer talks to Anthropic's native tool use API directly.

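No embedded model. No output massaging. When the agent issues a tool call, the real binary executes and its real stdout goes back to the model. Tool results are never generated by the model, so the results themselves cannot be hallucinated, although the model's summary of them can still be wrong.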
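To make the "real stdout goes back" point concrete, here is a minimal sketch of what one dispatch step of a tool use loop could look like. It assumes the Anthropic Messages API block shapes (`tool_use` blocks carrying `id`, `name`, and `input`, answered by `tool_result` blocks with a matching `tool_use_id`). The `TOOLS` registry and `fake_search_whois` are hypothetical stand-ins, not OpenOSINT's actual code.

```python
import asyncio

# Hypothetical stand-in for a real tool wrapper. A real one would run a
# binary or call an API and return its raw output as a string.
async def fake_search_whois(domain: str) -> str:
    return f"registrar lookup for {domain} (placeholder output)"

# Hypothetical registry mapping tool names to async callables.
TOOLS = {"search_whois": fake_search_whois}

async def dispatch_tool_calls(content_blocks: list[dict]) -> list[dict]:
    """Run every tool_use block and return the matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks are not executed
        tool = TOOLS.get(block["name"])
        if tool is None:
            output, is_error = f"unknown tool: {block['name']}", True
        else:
            # The tool's return value is passed through unmodified.
            output, is_error = await tool(**block["input"]), False
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": output,
            "is_error": is_error,
        })
    return results
```

The returned list would be sent back as the content of the next `user` message, so the model only ever sees what the tool actually produced.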

The 11 Tools

Tool	Backend	What it finds
`search_email`	holehe	Social accounts linked to an email (Spotify, Gravatar, Office365…)
`search_username`	sherlock	Accounts across 300+ platforms
`search_breach`	HaveIBeenPwned v3 API	Data breach exposure, leaked data types
`search_whois`	python-whois	Domain registrant, registrar, creation/expiry dates
`search_ip`	ipinfo.io	Geolocation, ASN, hostname
`search_ip2location`	IP2Location API	Enhanced geolocation + VPN/Proxy/Tor/datacenter detection
`search_domain`	sublist3r	Subdomain enumeration
`generate_dorks`	built-in	12 targeted Google dork URLs (no network call)
`search_paste`	psbdmp.ws	Pastebin dump mentions
`search_phone`	phoneinfoga	Carrier, country, line type
`search_censys`	Censys API	Open ports, services, certificate history
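As an illustration of why `generate_dorks` needs no network call, dork generation can be plain string templating. The sketch below is an assumption about the general approach: the templates and the `build_dork_urls` name are hypothetical, and they are not the 12 dorks OpenOSINT ships.

```python
from urllib.parse import quote_plus

# Hypothetical templates; the project's real dork list is not shown in this post.
DORK_TEMPLATES = [
    '"{target}"',
    '"{target}" site:pastebin.com',
    '"{target}" filetype:pdf',
]

def build_dork_urls(target: str) -> list[str]:
    """Turn each template into a Google search URL without any network I/O."""
    return [
        "https://www.google.com/search?q=" + quote_plus(t.format(target=target))
        for t in DORK_TEMPLATES
    ]
```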

If a binary is absent from PATH, that tool returns a descriptive error — the rest of the framework keeps running.
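One way to get this behaviour is to check for the binary before spawning anything and to return the error as an ordinary string. This is a sketch under that assumption, using only `shutil.which` from the standard library; the function names are hypothetical.

```python
import shutil
from typing import Optional

def missing_binary_error(binary: str) -> Optional[str]:
    """Return an error string if `binary` is not on PATH, else None."""
    if shutil.which(binary) is None:
        return f"error: '{binary}' not found in PATH; install it to enable this tool"
    return None

def tool_status(binaries: dict[str, str]) -> dict[str, str]:
    """Map each tool name to 'ok' or a descriptive error, without raising."""
    return {tool: missing_binary_error(b) or "ok" for tool, b in binaries.items()}
```

Because the error is data rather than an exception, one missing dependency only disables its own tool.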

The AI REPL

This is my favourite part. Run `openosint` with no arguments and you land in an interactive session powered by the Anthropic tool use API.

openosint ❯ investigate target@example.com

  → generate_dorks('target@example.com')
  → search_email('target@example.com')
  ✓ Found: Spotify, WordPress, Gravatar, Office365

  → search_breach('target@example.com')
  ✓ Found in 2 breaches: LinkedIn (2016), Adobe (2013)

  ╭──────────────── Report ────────────────╮
  │ ## Summary                             │
  │ Single target — high confidence.       │
  │                                        │
  │ ## Online Presence                     │
  │ Spotify · WordPress · Gravatar         │
  │                                        │
  │ ## Data Breaches                       │
  │ LinkedIn (2016) · Adobe (2013)         │
  ╰────────────────────────────────────────╯

  ✓ Report saved → reports/2026-05-11_report.md

You don't have to specify which tools to run. Type a natural language instruction and the agent figures it out:

investigate target@example.com → email + breach + dorks
find all accounts for johndoe99 → username search across 300+ platforms
what subdomains does example.com have? → domain tool
check if +14155552671 is mobile → phone tool

A report containing the structured findings is auto-saved after every investigation. Available REPL commands:

Command	Description
`clear`	Reset conversation memory
`save`	Save last report to `reports/`
`tools`	List available tools and their status
`config`	Show current configuration
`exit` / Ctrl-D	Exit

The Web Interface

The website (openosint.tech) ships with full documentation in the classic man(1) style. But there's also a local web UI — a browser-based AI chat interface with real-time streaming, tool result cards, and light/dark theme.

pip install "openosint[web]"
openosint web
# → opens automatically at http://localhost:8080

Features:

AI chat with inline tool results
Full conversation history per session
Light/dark theme (preference saved in browser)
Ollama support — run it with local models, no API key required
API key management via Settings modal

A hosted version at openosint.tech/app is coming soon.

MCP Server Mode

This was actually the original reason I built the project. OpenOSINT exposes all 11 tools to any MCP-compatible client. Once registered, you can run full autonomous OSINT investigations directly from Claude Code without leaving your editor.

Claude Code:

claude mcp add openosint python /absolute/path/to/OpenOSINT/openosint/mcp_server.py
claude mcp list

Claude Desktop (macOS): add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "openosint": {
      "command": "python",
      "args": ["/absolute/path/to/OpenOSINT/openosint/mcp_server.py"]
    }
  }
}

Then from Claude Code:

> Investigate target@example.com. If you find an associated username,
  trace it across other platforms and compile a full report.

The model issues tool calls natively. No prompting tricks. No wrappers.

Architecture

The internal layering is strict and intentional:

Layer	Path	Responsibility
Core tools	`openosint/tools/`	Async wrappers around external binaries and APIs. Stateless.
AI agent	`openosint/agent.py`	Anthropic tool use loop. Maintains conversation history per session.
REPL	`openosint/repl.py`	Interactive terminal session. prompt_toolkit + Rich.
MCP server	`openosint/mcp_server.py`	MCP tool schema exposure for external AI clients.
CLI	`openosint/cli.py`	Entry point. Launches REPL or dispatches direct commands.

No layer imports from a layer above it. The core tools have zero knowledge of MCP, argparse, or the agent loop. This makes each surface independently testable and the whole thing easy to extend.
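A rule like this can be checked mechanically. The sketch below parses a module's source with `ast` and reports any banned imports. It is illustrative only: the `banned_imports` helper is hypothetical, and it handles absolute imports but not relative ones such as `from ..agent import x`.

```python
import ast

def banned_imports(source: str, banned: set[str]) -> set[str]:
    """Return the banned modules that `source` imports (absolute imports only)."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            for b in banned:
                # Match the module itself or any submodule of it.
                if name == b or name.startswith(b + "."):
                    found.add(b)
    return found
```

Run over every file in `openosint/tools/` with a banned set such as `{"argparse", "openosint.agent", "openosint.repl", "openosint.mcp_server", "openosint.cli"}`, an empty result would confirm that the tool layer stays at the bottom.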

Installation

git clone https://github.com/OpenOSINT/OpenOSINT.git
cd OpenOSINT
pip install -e .
export ANTHROPIC_API_KEY=sk-ant-...

External dependencies (must be in PATH):

pip install holehe sherlock-project sublist3r
# phoneinfoga: download binary from https://github.com/sundowndev/phoneinfoga/releases

Optional environment variables:

Variable	Tool	Purpose
`HIBP_API_KEY`	`search_breach`	HaveIBeenPwned API key
`IPINFO_TOKEN`	`search_ip`	ipinfo.io token (higher rate limits)
`IP2LOCATION_API_KEY`	`search_ip2location`	IP2Location API key
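To see which of these optional keys are set before starting a session, a small check like the following may help. The variable names come from the table above; the `OPTIONAL_KEYS` mapping and `configured_tools` helper are hypothetical and not part of OpenOSINT.

```python
import os

# Variable names as listed in the table; the mapping itself is illustrative.
OPTIONAL_KEYS = {
    "search_breach": "HIBP_API_KEY",
    "search_ip": "IPINFO_TOKEN",
    "search_ip2location": "IP2LOCATION_API_KEY",
}

def configured_tools(env=os.environ) -> dict[str, bool]:
    """Report, per tool, whether its optional key is set to a non-empty value."""
    return {
        tool: bool(env.get(var, "").strip())
        for tool, var in OPTIONAL_KEYS.items()
    }
```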

A Note on Design

The most interesting architectural decision was keeping the AI agent completely out of the tool layer. The tools are dumb — they wrap a binary or API call, enforce a timeout, and return a string. The agent is the only thing that knows about conversation history, tool chaining, and report generation.

This means you can use the CLI for quick scripting without pulling in any AI overhead, and the same code powers both the REPL and the MCP server.

The other thing I'm proud of is the timeout enforcement. Every external subprocess gets a hard timeout. If holehe or sherlock hangs on a slow platform, the tool aborts cleanly and returns a partial result rather than blocking the whole session.
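A minimal sketch of this pattern with `asyncio` is shown below. It is not OpenOSINT's actual implementation: the `run_with_timeout` name is hypothetical, and reading output line by line is one assumed way to keep a partial result when the deadline hits.

```python
import asyncio

async def run_with_timeout(argv: list[str], timeout: float) -> str:
    """Run a subprocess, returning whatever output arrived before the deadline."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    chunks: list[str] = []

    async def pump() -> None:
        # Collect output line by line so a timeout still leaves what arrived.
        async for line in proc.stdout:
            chunks.append(line.decode(errors="replace"))

    try:
        await asyncio.wait_for(pump(), timeout)
        await proc.wait()
        return "".join(chunks)
    except asyncio.TimeoutError:
        # Hard stop: kill the child, reap it, and return the partial output.
        proc.kill()
        await proc.wait()
        return "".join(chunks) + f"[timed out after {timeout}s]\n"
```

Calling `asyncio.run(run_with_timeout(["sherlock", "johndoe99"], 60))` would then return either the full output or the lines collected so far with a timeout marker appended.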

What's Next

Hosted web app at openosint.tech/app (the web UI currently runs locally only)
More tools (suggestions welcome via GitHub Issues)
Ollama tool use support in the REPL (partially implemented)
