DEV Community

Tommaso Bertocchi

I built an AI agent that runs autonomous OSINT investigations from your terminal

You know the OSINT workflow. Open a terminal. Run holehe against an email. Copy a username you found. Switch tools. Run sherlock. Open a browser. Check HaveIBeenPwned manually. Pull up a WHOIS tab. Take notes. Repeat.

Every tool is a silo. Every pivot is manual. The investigation logic lives entirely in your head.

I wanted to fix that.


What I built

OpenOSINT is an open-source Python framework with an AI agent at its core. You describe a target in natural language — an email address, a username, a domain, an IP, a phone number — and the agent decides which tools to run, chains them based on what it finds, executes everything against the real binaries, and compiles a structured Markdown report.

Three interfaces:

  • Interactive AI REPL (default) — type natural language, agent chains the tools autonomously
  • Direct CLI — run individual tools directly, no AI, perfect for scripting
  • MCP Server — expose all 9 tools to Claude Code or Claude Desktop

The demo

Here's a real session. No mocking. The agent receives an email, generates dorks, discovers linked accounts, checks breaches, pivots to a username it extracted and searches it across 300+ platforms, then saves a report, all chained without manual steps:

$ openosint
openosint ❯ investigate target@example.com

  → generate_dorks('target@example.com')
  → search_email('target@example.com')
  ✓ Found: Spotify, WordPress, Gravatar, Office365

  → search_breach('target@example.com')
  ✓ Found in 2 breaches: LinkedIn (2016), Adobe (2013)

  → search_username('target_handle')
  ✓ Found on: GitHub, Reddit, HackerNews, Twitter

  ╭──────────────── Report ────────────────╮
  │ ## Summary                             │
  │ Single target — high confidence.       │
  │                                        │
  │ ## Online Presence                     │
  │ Spotify · WordPress · Gravatar         │
  │                                        │
  │ ## Data Breaches                       │
  │ LinkedIn (2016) · Adobe (2013)         │
  ╰────────────────────────────────────────╯

  ✓ Report saved → reports/2026-05-11_report.md

The agent went email → accounts → breach check → username pivot → cross-platform search. No human orchestration.


The architecture

The codebase has three layers with a hard no-upward-import rule:

| Layer | Path | Responsibility |
|---|---|---|
| Core tools | openosint/tools/ | Async wrappers around binaries and APIs. Stateless. No AI. |
| AI agent | openosint/agent.py | Anthropic tool use loop. Per-session conversation history. |
| Interfaces | repl.py, mcp_server.py, cli.py | REPL, MCP server, direct CLI. |

The AI layer is optional. The core tools run fine without it — the CLI and MCP server both bypass the agent entirely.
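
To make the "core tools" layer concrete, here is a minimal sketch of what a stateless async wrapper around a command-line binary could look like. The function name `run_binary`, its return shape, and the timeout default are assumptions for illustration, not OpenOSINT's actual code.

```python
import asyncio
import sys


async def run_binary(argv: list[str], timeout: float = 60.0) -> dict:
    """Run an external command and return its output as a plain dict.

    Hypothetical helper: it keeps no state and knows nothing about the AI layer.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        # Stop the child process so it does not linger after the timeout.
        proc.kill()
        await proc.wait()
        return {"ok": False, "error": f"timed out after {timeout}s"}
    return {
        "ok": proc.returncode == 0,
        "stdout": stdout.decode(errors="replace"),
        "stderr": stderr.decode(errors="replace"),
    }


if __name__ == "__main__":
    # Stand-in command so the sketch runs without any OSINT tool installed.
    result = asyncio.run(run_binary([sys.executable, "-c", "print('hello')"]))
    print(result["stdout"].strip())
```

Because the wrapper only takes arguments and returns data, the REPL, the CLI, and the MCP server could all call it without importing anything from the agent.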

Why hallucination in tool results is structurally ruled out

The AI layer uses the Anthropic native tool use API. Here's the exact flow:

  1. Agent receives your prompt
  2. Model decides which tool to call and stops generating (stop_reason == "tool_use")
  3. Real binary executes (holehe, sherlock, etc.)
  4. Real output goes back into the context as a tool_result
  5. Model reads actual output, decides next step

The model never infers or synthesizes what a tool would return. It only ever sees real output. If sherlock finds 47 profiles, that exact number and those exact URLs go back. The agent can't make up results because it never generates them.
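
As a rough illustration of steps 2 to 4, these are the content block shapes the Anthropic Messages API uses for a tool call and its result. The ID, the `email` input key, and the output text are made-up placeholders.

```python
import json

# What the model emits when it decides to call a tool (stop_reason == "tool_use").
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_example_123",                  # placeholder ID
    "name": "search_email",
    "input": {"email": "target@example.com"},   # input key is an assumption
}


def make_tool_result(block: dict, real_output: str) -> dict:
    """Build the block the application sends back after running the tool."""
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],   # ties the result to the request
        "content": real_output,       # captured output, passed through unchanged
    }


result_block = make_tool_result(tool_use_block, "placeholder tool output")
print(json.dumps(result_block, indent=2))
```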


9 tools

| Tool | Backend | What it finds |
|---|---|---|
| search_email | holehe | Social accounts linked to an email |
| search_username | sherlock | Accounts across 300+ platforms |
| search_breach | HaveIBeenPwned v3 | Breach exposure and leaked data types |
| search_whois | python-whois | Registrant, registrar, creation/expiry dates |
| search_ip | ipinfo.io | Geolocation, ASN, hostname, org |
| search_domain | sublist3r | Subdomain enumeration |
| generate_dorks | built-in | 12 targeted Google dork URLs (no network calls) |
| search_paste | psbdmp.ws | Pastebin dump mentions |
| search_phone | phoneinfoga | Carrier, country, line type |
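
generate_dorks is the one tool in the table that needs no backend. A minimal sketch of the idea, assuming a few illustrative query templates (these are not the 12 dorks OpenOSINT ships, and `build_dork_urls` is a hypothetical name):

```python
from urllib.parse import quote_plus

# Illustrative templates only; the real tool's queries may differ.
DORK_TEMPLATES = [
    '"{target}"',
    '"{target}" site:pastebin.com',
    '"{target}" filetype:pdf',
]


def build_dork_urls(target: str) -> list[str]:
    """Build Google search URLs locally; nothing is fetched."""
    return [
        "https://www.google.com/search?q=" + quote_plus(t.format(target=target))
        for t in DORK_TEMPLATES
    ]


for url in build_dork_urls("target@example.com"):
    print(url)
```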

If a dependency is missing, that tool returns a descriptive error and the rest keeps running.
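
One plausible way to get that behaviour is to check for the binary before running it and return an error record instead of raising. This is a sketch under that assumption; `check_dependency`, `run_tools`, and the record shape are hypothetical.

```python
import shutil
from typing import Optional


def check_dependency(binary: str) -> Optional[dict]:
    """Return an error record if `binary` is not on PATH, else None."""
    if shutil.which(binary) is None:
        return {
            "ok": False,
            "error": f"'{binary}' not found on PATH; install it to enable this tool",
        }
    return None


def run_tools(binaries: list[str]) -> dict:
    """Collect a per-tool status so one missing tool doesn't stop the rest."""
    results = {}
    for name in binaries:
        error = check_dependency(name)
        results[name] = error if error else {"ok": True}
    return results


print(run_tools(["holehe", "definitely-not-installed-xyz"]))
```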


Installation

git clone https://github.com/OpenOSINT/OpenOSINT.git
cd OpenOSINT
pip install -e .
export ANTHROPIC_API_KEY=sk-ant-...

External deps (via pip):

pip install holehe sherlock-project sublist3r

phoneinfoga is a standalone binary — download from GitHub releases.

Optional env vars:

export HIBP_API_KEY=your_key     # HaveIBeenPwned v3
export IPINFO_TOKEN=your_token   # higher rate limits on ipinfo.io

Requires Python 3.10+.


Using it

Interactive REPL

$ openosint
openosint ❯ investigate target@example.com
openosint ❯ find all accounts for johndoe99
openosint ❯ what subdomains does example.com have?
openosint ❯ check if +14155552671 is a mobile number

After every investigation, a Markdown report containing the structured findings is auto-saved to reports/.
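
For illustration, saving a dated Markdown report could look like the sketch below. The filename pattern follows the reports/2026-05-11_report.md path shown in the demo; the function name and section layout are assumptions.

```python
import tempfile
from datetime import date
from pathlib import Path


def save_report(sections: dict[str, str], out_dir: str = "reports") -> Path:
    """Write sections as '## Heading' blocks to <out_dir>/<YYYY-MM-DD>_report.md."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{date.today().isoformat()}_report.md"
    body = "\n\n".join(f"## {title}\n{text}" for title, text in sections.items())
    path.write_text(body + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Write into a temporary directory so the sketch leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_report({"Summary": "Single target."}, out_dir=tmp)
        print(saved.name)
```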

REPL commands:

clear     Reset conversation memory
save      Save last report manually
tools     Show available tools and status
config    Show current configuration
help      All commands
exit      Quit

Direct CLI (no AI)

openosint email target@example.com -t 60
openosint username johndoe99
openosint -v email target@example.com   # verbose

MCP Server

All 9 tools are exposed as an MCP server. Register in Claude Code:

claude mcp add openosint python /absolute/path/to/OpenOSINT/openosint/mcp_server.py
claude mcp list   # verify

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "openosint": {
      "command": "python",
      "args": ["/absolute/path/to/OpenOSINT/openosint/mcp_server.py"]
    }
  }
}

Then from Claude Code:

> Investigate target@example.com. If you find a linked username,
  trace it across other platforms and compile a full report.

Claude Code chains the tools much as the agent does in the REPL, but the orchestration runs in Claude Code's own context rather than in OpenOSINT's agent.


How the agent loop works (for the curious)

# Simplified version of openosint/agent.py
# TOOL_SCHEMAS and execute_tool are defined elsewhere in the project.

from anthropic import AsyncAnthropic

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment


async def run_agent(user_input):
    messages = [{"role": "user", "content": user_input}]

    while True:
        response = await client.messages.create(
            model="claude-opus-4-5",
            max_tokens=4096,      # required by the API; the value here is illustrative
            tools=TOOL_SCHEMAS,   # all 9 tools as JSON schemas
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            # "end_turn" means the agent is done; any other reason
            # (e.g. "max_tokens") also ends the loop instead of spinning.
            break

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the REAL binary
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,   # real output, no inference
                })

        # Feed real results back into context
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    return response

The loop runs until stop_reason == "end_turn". The agent decides when it has enough information to write the report.
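
The loop above leans on two names it doesn't define. As a hedged sketch, TOOL_SCHEMAS could hold entries in the Anthropic tool format (`name`, `description`, `input_schema`), and execute_tool could be a small dispatcher. The description text, the `email` parameter, the `HANDLERS` table, and the stub handler are assumptions, not the project's real definitions.

```python
import asyncio

TOOL_SCHEMAS = [
    {
        "name": "search_email",
        "description": "Find accounts linked to an email address.",
        "input_schema": {
            "type": "object",
            "properties": {"email": {"type": "string"}},
            "required": ["email"],
        },
    },
]


async def search_email(email: str) -> str:
    # Stub standing in for a real holehe wrapper.
    return f"(stub) would run holehe against {email}"


# Hypothetical lookup table from tool name to coroutine.
HANDLERS = {"search_email": search_email}


async def execute_tool(name: str, tool_input: dict) -> str:
    """Route a tool_use request to its handler; unknown names yield an error string."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"error: unknown tool '{name}'"
    return await handler(**tool_input)


print(asyncio.run(execute_tool("search_email", {"email": "target@example.com"})))
```

Returning an error string for an unknown tool, rather than raising, lets the model see the failure as a tool_result and choose another step.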


What's next

  • Shodan and Censys integration
  • Support for additional LLM providers (Ollama, GPT-4)
  • JSON and PDF export formats
  • Docker image for zero-setup deployment
  • Async parallel tool execution for multi-target investigations

Legal

OpenOSINT is for authorized security research, penetration testing, and investigative journalism only. Users are solely responsible for compliance with applicable law, including GDPR, CCPA, and the CFAA. See DISCLAIMER.md.


GitHub: github.com/OpenOSINT/OpenOSINT
Docs: openosint.tech

Stars and issues welcome. If you build something with it, drop a comment — curious what use cases people find.
