Surface Watchdog: Building a Self-Running OSINT Daemon in Termux

⚠️ Disclaimer

This project is intended for educational and lawful use only. All monitoring performed by Surface Watchdog must be limited to publicly available information and must not be used to access, interfere with, or exploit any private systems, data, or accounts without permission.

By using this guide or any related code, you accept full responsibility for ensuring compliance with applicable laws (including computer misuse, unauthorized access, and privacy regulations) in your jurisdiction.

The author provides this material "as-is" with no warranty, for research, automation, and learning purposes. If you are uncertain about the legality of any action, consult a qualified professional before proceeding.

Introduction

Most people use Termux to run a few scripts.
I decided to turn it into a fully autonomous OSINT and content-tracking daemon — one that runs 24/7, cleans itself up, uses offline AI, and bundles intelligence into neat, portable reports.

This is not a SaaS. No dashboard logins. No rented VPS.
This is bare-metal automation on a smartphone, and it works.


What Is Surface Watchdog?

Surface Watchdog is a Termux-based automation engine that:

Monitors a directory for new recon or post-monitoring results

Generates Markdown, HTML, and visual charts

Enriches data with snippets and metadata

Runs AI summaries offline via llama-cli (sketched below)

Bundles everything into a ZIP package

Cleans old runs, logs, and stale data automatically

Can run persistently with Termux:Boot (survives reboots)

In other words: it’s a personal OSINT pipeline that doesn’t need the cloud.
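The offline AI piece deserves a quick illustration. The summarizer leans on llama-cli from llama.cpp; what follows is only a rough sketch of the idea, with an assumed model path, prompt, and flags rather than the real post_trace_ai_summary.py:

MODEL="$HOME/models/model.gguf"                        # any local GGUF model pulled onto the device
SNIPPETS=$(head -c 4000 enriched_with_snippets.json)   # crude prompt-size guard
llama-cli -m "$MODEL" -n 256 --temp 0.2 \
  -p "Summarize the key public-exposure findings in this JSON: $SNIPPETS"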


Why I Built This

Because I was tired of:

Manually cleaning old runs

Copy-pasting results into reports

Relying on slow SaaS dashboards

Running five different tools for one job

Instead, I wanted an autonomous system:

Drop a JSON → get a full report.

No babysitting. No manual cleanup.

Pure automation.


How It Works

The script is modular. Each "step" is an independent unit:

1. Analysis

Generates Markdown and HTML reports

Creates domain exposure charts

Enriches data with live snippets

Runs AI summarization (offline)

2. Bundling

Packages the results into a single ZIP

Auto-cleans old bundles

Auto-cleans logs older than 7 days

Removes outdated raw JSON

3. Watchdog Mode

Runs in the background

Watches for new JSON files in ~/post_surface_watchdog/runs

Automatically triggers the analysis pipeline


Code: The Heart of Surface Watchdog

Here’s the core daemon script:

cat > ~/post_surface_watchdog/surface_watchdog.sh <<'EOF'
#!/data/data/com.termux/files/usr/bin/bash

set -e

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
OUTPUT_DIR="$SCRIPT_DIR/output/latest_analysis"
CLEANUP_LOG="$OUTPUT_DIR/cleanup.log"

mkdir -p "$OUTPUT_DIR"

# Send a Termux notification when termux-api is installed; otherwise fall back to stdout.
notify() { command -v termux-notification >/dev/null && termux-notification --title "Surface Watchdog" --content "$1" || echo "$1"; }

# run_step <label> <command>: print progress markers around an eval'd command.
run_step() {
    echo "[*] $1..."
    eval "$2"
    echo "[✓] $1 complete"
}

# Echo cleanup messages and append them to the cleanup log.
log_cleanup() { echo "$1" | tee -a "$CLEANUP_LOG"; }

STEP="$1"
LATEST_JSON=$(ls -t "$SCRIPT_DIR/runs/"*.json 2>/dev/null | head -n 1 || true)

case "$STEP" in
analysis|all)
run_step "Generating Markdown report" "python3 \"$SCRIPT_DIR/post_trace_markdown.py\" \"$LATEST_JSON\" > \"$OUTPUT_DIR/surface_report.md\""
run_step "Generating HTML report" "python3 \"$SCRIPT_DIR/post_trace_html.py\" \"$LATEST_JSON\" > \"$OUTPUT_DIR/surface_report.html\""
run_step "Generating domain exposure chart" "python3 \"$SCRIPT_DIR/post_trace_domain_chart.py\" \"$LATEST_JSON\" \"$OUTPUT_DIR/domain_exposure.png\""
ENRICHED_JSON="$OUTPUT_DIR/enriched_with_snippets.json"
run_step "Enriching data with snippets" "python3 \"$SCRIPT_DIR/post_trace_source_crawler.py\" \"$LATEST_JSON\" \"$ENRICHED_JSON\""
run_step "AI summarizing" "python3 \"$SCRIPT_DIR/post_trace_ai_summary.py\" \"$ENRICHED_JSON\" > \"$OUTPUT_DIR/ai_summaries.txt\""
;;
esac

case "$STEP" in
bundle|all)
TIMESTAMP=$(date +%Y%m%d_%H%M)
BUNDLE_NAME="surface_watchdog_bundle_${TIMESTAMP}.zip"
run_step "Bundling all output" "cd \"$OUTPUT_DIR\" && zip -r \"$BUNDLE_NAME\" . > \"$OUTPUT_DIR/bundle.log\""
echo "[✓] Bundle created: $OUTPUT_DIR/$BUNDLE_NAME"

    log_cleanup "[*] Cleaning up old bundles..."
    cd "$OUTPUT_DIR" && ls -t surface_watchdog_bundle_*.zip | tail -n +6 | while read -r old_bundle; do
        log_cleanup "[DEL] $old_bundle"
        rm -f "$old_bundle"
    done
    log_cleanup "[✓] Bundle cleanup complete"

    log_cleanup "[*] Cleaning up logs older than 7 days..."
    find "$OUTPUT_DIR" -type f -name "*.log" -mtime +7 -exec sh -c '
        for file; do echo "[DEL] $file"; rm -f "$file"; done
    ' sh {} + | tee -a "$CLEANUP_LOG"
    log_cleanup "[✓] Log cleanup complete"

    log_cleanup "[*] Cleaning up old raw JSON..."
    find "$SCRIPT_DIR/runs" -type f -name "enriched_refs_*.json" ! -newer "$OUTPUT_DIR/enriched_with_snippets.json" -exec sh -c '
        for file; do echo "[DEL] $file"; rm -f "$file"; done
    ' sh {} + | tee -a "$CLEANUP_LOG"
    log_cleanup "[✓] Raw JSON cleanup complete"
;;

esac

echo "[✓] Finished step: $STEP"
notify "Surface Watchdog finished: $STEP"
EOF

chmod +x ~/post_surface_watchdog/surface_watchdog.sh
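One thing the script takes for granted: the post_trace_*.py helpers it calls must already sit next to it, and runs/ should exist before you drop results in. A quick sanity check before the first run (the recon JSON name here is just an example):

mkdir -p ~/post_surface_watchdog/runs
ls ~/post_surface_watchdog/*.py        # should list the five post_trace_*.py helpers
cp ~/my_recon_results.json ~/post_surface_watchdog/runs/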


Running It

cd ~/post_surface_watchdog
./surface_watchdog.sh all

Want it to run forever and auto-detect new JSON?

while true; do ./surface_watchdog.sh all; sleep 10; done
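That loop re-runs the full pipeline every ten seconds whether or not anything changed. A slightly smarter wrapper (a sketch; tune the interval to taste) fires only when a new JSON actually lands in runs/:

LAST_SEEN=""
while true; do
    NEWEST=$(ls -t ~/post_surface_watchdog/runs/*.json 2>/dev/null | head -n 1)
    if [ -n "$NEWEST" ] && [ "$NEWEST" != "$LAST_SEEN" ]; then
        ~/post_surface_watchdog/surface_watchdog.sh all
        LAST_SEEN="$NEWEST"
    fi
    sleep 10
done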


What You Get

After a run, you’ll have:

surface_report.md – Clean Markdown report

surface_report.html – Polished HTML report

domain_exposure.png – Visual domain chart

ai_summaries.txt – Offline AI-driven insights

surface_watchdog_bundle_*.zip – One-click bundle with everything inside


Example: Real surface exposure data, instantly visualized on-device. Each wedge represents a domain where public content is found.
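If you want to eyeball any of these on the phone itself, termux-open (bundled with Termux) should hand a file straight to the matching Android app:

cd ~/post_surface_watchdog/output/latest_analysis
termux-open surface_report.html      # opens in the default browser
termux-open domain_exposure.png      # opens in an image viewer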


Why This Matters

This is a portable reconnaissance and monitoring platform:

No server.

No vendor lock-in.

No third-party dependencies.

It’s you vs. the surface web, running from your pocket.


Own your stack.

Automate your workflow.

Don’t outsource what you can build yourself.


Next Steps

In the next iteration, I’ll add:

Background service integration (Termux:Boot, sketched below)

SQLite logging of all runs

Offline dashboard viewer with filters and history
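As a preview of the Termux:Boot piece: Termux:Boot runs any executable script placed in ~/.termux/boot/ once the device finishes booting, so a minimal wrapper like the sketch below should bring the watchdog back up after a reboot (the wake lock and five-minute interval are assumptions to tune):

#!/data/data/com.termux/files/usr/bin/bash
# Save as ~/.termux/boot/start_watchdog.sh and make it executable.
termux-wake-lock                 # keep the session alive through Android doze
cd ~/post_surface_watchdog
while true; do
    ./surface_watchdog.sh all
    sleep 300                    # re-check every 5 minutes
done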

Stepping back from the tooling itself, a few recommendations follow from building and running a system like this.

For Defenders (Blue Teams):
🍌Adopt a Mindset of Persistent Monitoring: Operate under the assumption that your organization's public digital footprint is being continuously and automatically monitored by adversaries.
🍌Enhance External Visibility: Implement your own automated tools to monitor public data sources for changes related to your organization, allowing you to see what an attacker sees.
🍌Strengthen Human-Layer Security: Implement stricter policies and provide better training for employees regarding their social media presence and contributions to public code repositories.
🍌Refine Network Traffic Analysis: Treat traffic originating from mobile carrier networks with a higher degree of scrutiny. While individual requests are innocuous, look for automated, non-human patterns of browsing or API queries that could indicate a surveillance daemon.

For Researchers and Developers:
🍌Acknowledge Dual-Use: Recognize the inherent dual-use nature of OSINT tools and, where feasible, build in ethical safeguards or clear terms of service that discourage malicious use.
🍌Focus on Detection: Direct research efforts toward developing techniques for detecting the patterns of automated, persistent OSINT activity, distinguishing it from legitimate human browsing and research.
🍌Promote Ethical Discourse: Actively engage in public and industry discussions about the legal and ethical boundaries of automated data collection, helping to shape norms and best practices for this powerful technology.
