DEV Community: TechLatest

LinkedIn: https://www.linkedin.com/in/techlatest-net/

OpenClaw Agent Masterclass — Full Tutorial

TechLatest — Fri, 12 Jun 2026 09:13:45 +0000

Everything you need to install, configure, and extend OpenClaw — the open-source personal AI assistant that runs on your machine and talks to you on the chat apps you already use.

Official home: openclaw.ai · Docs: docs.openclaw.ai · Source: github.com/openclaw/openclaw

This guide follows the product story on the homepage (install → gateway → memory → tools → skills → channels → automation), uses prose and lists only (no comparison tables), and ships terminal + diagram GIFs like our Hermes masterclass.

What you’ll have at the end

OpenClaw installed with the Gateway daemon running
Browser Control UI at http://127.0.0.1:18789/
At least one messaging channel (Telegram recommended for first test)
A configured workspace with SOUL.md and optional ClawHub skill
Understanding of cron, heartbeats, and multi-agent routing

Deploy on OpenClaw VM

Want to skip infrastructure setup?

Available with both CPU and GPU configurations for development, testing, and production workloads.

Introduction — the AI that actually does things

OpenClaw is built for a simple promise: message an assistant from your phone, and it does real work on your computer — email triage, calendar checks, shell commands, browser tasks, file edits, and custom workflows via skills.

Unlike a chat-only bot, OpenClaw is self-hosted. Your context, skills, and session history live on your hardware. You pick the model (Anthropic, OpenAI, Google, local Ollama, and more). You control which channels can reach the agent and who is on the allowlist.

Community feedback on openclaw.ai consistently highlights the same strengths: persistent memory, persona onboarding, proactive cron/heartbeats, and the ability to extend the system by chatting (skills, plugins, even prompt hot-reload).

Part 1 — How OpenClaw is structured

OpenClaw centers on one long-running process: the Gateway. It is the control plane for:

Chat channels — WhatsApp, Telegram, Discord, Slack, Signal, iMessage, Matrix, Teams, WebChat, and plugin channels
Agent runtime — tool use, sessions, memory, skills
Control UI — browser dashboard for chat, config, and diagnostics
Companion apps — macOS menu bar, Windows tray, iOS/Android nodes (camera, voice, Canvas)

Docs: Architecture · Gateway

The Gateway is the single source of truth for sessions and routing. CLI commands (openclaw agent, openclaw onboard) and the dashboard all talk to the same core.

Part 2 — Prerequisites

You need Node.js 24 (recommended) or Node 22.19+ for compatibility. OpenClaw fails on older Node versions — if you are stuck on Node 20, use the Node 22 helper from our OpenClaw + Gemma guide.

You also need:

macOS, Linux, Windows 10+, or WSL2
An API key from your chosen provider or a local Ollama install
~5 minutes for onboarding; more if you add WhatsApp or iMessage pairing

Check:

node -v # v22.19+ or v24
which npm
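If you want to script the version check rather than eyeball it, a minimal sketch might look like the following. The 22.19 floor comes from this guide; the function name `node_version_ok` is made up for illustration.

```python
import re
import shutil
import subprocess

MIN_VERSION = (22, 19)  # minimum stated in this guide; Node 24 is recommended


def node_version_ok(version_text: str) -> bool:
    """Return True if `node -v` output (e.g. "v22.19.0") meets the minimum."""
    match = re.match(r"v?(\d+)\.(\d+)", version_text.strip())
    if match is None:
        return False  # unrecognized output counts as a failed check
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_VERSION


if __name__ == "__main__":
    if shutil.which("node") is None:
        print("node not found on PATH")
    else:
        out = subprocess.run(["node", "-v"], capture_output=True, text=True).stdout
        print(out.strip(), "OK" if node_version_ok(out) else "too old")
```

Comparing `(major, minor)` tuples avoids the classic string-comparison trap where "v9" sorts after "v22".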

Part 3 — Install

Three paths match openclaw.ai:

One-liner (macOS, Linux, WSL)

curl -fsSL https://openclaw.ai/install.sh | bash

The installer can pull Node and dependencies. On macOS, first run may prompt for Administrator access (Homebrew).

npm global

npm install -g openclaw@latest

Hackable / from source

# Option 1: let the installer do a git checkout
curl -fsSL https://openclaw.ai/install.sh | bash -s -- --install-method git

# Option 2: clone and bootstrap manually
git clone https://github.com/openclaw/openclaw.git
cd openclaw && corepack enable && pnpm install
pnpm openclaw onboard

Switch release channels later:

openclaw update --channel stable # or dev

Companion apps (beta): native macOS (15+) and Windows tray apps from openclaw.ai — gateway control, chat, and node features without living in the terminal.

Part 4 — Onboard the Gateway

Run the guided wizard:

openclaw onboard --install-daemon

The wizard walks through:

Gateway bind and authentication
LLM provider and model (API key or Ollama)
Workspace path (default under ~/.openclaw/)
Channel setup (Telegram is the fastest smoke test)
Daemon install (launchd on macOS, systemd on Linux) so the Gateway survives reboots

Onboard wizard — animated

Verify:

openclaw doctor
openclaw gateway status

Part 5 — Open the Control UI

openclaw dashboard

Default URL: http://127.0.0.1:18789/

From the dashboard you can chat, inspect sessions, edit config, and diagnose channel connections. Remote access patterns (Tailscale, SSH tunnel) are documented under Remote access.
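When the dashboard does not load, it can help to confirm that something is actually listening on the port. The sketch below is a generic TCP probe, not an OpenClaw API; the only detail taken from this guide is the default address 127.0.0.1:18789.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 18789, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or unreachable


if __name__ == "__main__":
    state = "listening" if port_is_open() else "not reachable"
    print(f"Control UI port 18789 is {state}")
```

An open port only shows that a process is bound there. It does not prove that the process is the Gateway, so pair it with openclaw gateway status.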

Dashboard — animated

CLI chat without the browser:

openclaw agent --message "What can you do on this machine?" --thinking low

Agent CLI message — animated

Part 6 — What lives on disk

After onboarding, OpenClaw owns a home directory. Knowing the layout makes debugging easier.

~/.openclaw/
├── openclaw.json        # Main config (channels, models, security)
├── workspace/           # Agent workspace
│   ├── AGENTS.md
│   ├── SOUL.md          # Persona / identity
│   ├── TOOLS.md
│   └── skills/          # Installed + custom skills
│       └── <name>/
│           └── SKILL.md
├── credentials/         # Channel tokens (permissions-sensitive)
├── sessions/            # Session metadata
└── …                    # Logs, cron output, plugin state

openclaw.json is the source of truth for non-secret settings. Secrets and tokens route to appropriate credential stores.

SOUL.md defines who the agent is — tone, boundaries, and behavior. It is the identity layer (similar in spirit to Hermes's SOUL.md, but living in the workspace).

skills/ is where procedural knowledge lives — bundled skills, ClawHub installs, and agent-authored skills.

Workspace layout — animated

Copy a starter soul from this guide:

cp guides/openclaw/examples/SOUL.md ~/.openclaw/workspace/SOUL.md

Part 7 — Capabilities (from the homepage)

OpenClaw advertises six pillars on openclaw.ai. Here is what each means in practice.

Runs on your machine. macOS, Windows, or Linux. Connect Anthropic, OpenAI, Google, or local models. Data stays on your infrastructure unless a tool explicitly calls an external API.

Any chat app. One Gateway serves many channels. DMs and group chats are supported; group behavior often uses mention rules so the bot does not reply to every message.

Persistent memory. The agent remembers preferences and context across sessions — your assistant becomes specific to you, not a generic chatbot.

Browser control. Navigate pages, fill forms, extract data. Useful for research, booking flows, and admin panels that have no API.

Full system access (configurable). Read/write files, run shell commands, execute scripts. You choose sandbox vs full access based on trust and host environment.

Skills and plugins. Install community skills from ClawHub, add channel plugins, or describe a new workflow in chat and let the agent draft a skill.

Part 8 — Connect messaging channels

Telegram is the quickest first channel: create a bot with @BotFather, paste the token during onboarding or in config.

WhatsApp and iMessage require additional pairing steps documented in the Channels hub.

Minimal allowlist snippet — merge into ~/.openclaw/openclaw.json (full example in examples/openclaw-channels.snippet.json):

{
  channels: {
    whatsapp: {
      allowFrom: ["+15555550123"],
      groups: { "*": { requireMention: true } },
    },
  },
  messages: { groupChat: { mentionPatterns: ["@openclaw"] } },
}

Restart after config changes:

openclaw gateway restart

Security: start restrictive — allowlist phone numbers and require mentions in groups. See Security.
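To make the effect of the allowlist snippet concrete, here is a rough sketch of how such rules could combine. The field names mirror the snippet, but the decision logic and the function `should_reply` are assumptions for illustration, not OpenClaw's actual implementation.

```python
# Hypothetical sketch: same keys as the config snippet, assumed semantics.
CONFIG = {
    "channels": {
        "whatsapp": {
            "allowFrom": ["+15555550123"],
            "groups": {"*": {"requireMention": True}},
        }
    },
    "messages": {"groupChat": {"mentionPatterns": ["@openclaw"]}},
}


def should_reply(config, channel, sender, text, group_id=None):
    """Decide whether an inbound message should reach the agent."""
    chan = config["channels"].get(channel)
    if chan is None or sender not in chan.get("allowFrom", []):
        return False  # unknown channel or sender not on the allowlist
    if group_id is None:
        return True  # direct message from an allowlisted sender
    groups = chan.get("groups", {})
    rule = groups.get(group_id, groups.get("*", {}))
    if not rule.get("requireMention", False):
        return True
    patterns = config["messages"]["groupChat"]["mentionPatterns"]
    return any(p.lower() in text.lower() for p in patterns)
```

The point of the sketch is the order of checks: sender first, then group rules, then mention patterns. A stranger in a group never triggers the agent, even with a mention.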

Supported surfaces include WhatsApp, Telegram, Discord, Slack, Signal, iMessage, Google Chat, Matrix, Microsoft Teams, Zalo, WebChat, and plugin channels — 50+ integrations on the marketing site.

Part 9 — Skills and ClawHub

Skills are Markdown with YAML frontmatter — the agent loads descriptions cheaply and pulls full instructions when a task matches.

Install from ClawHub:

openclaw skills search calendar
openclaw skills install <skill-slug>

Browse clawhub.ai. Recent OpenClaw releases emphasize Skill Cards and security scanning (SkillSpector) for hub skills — see the Skill Workshop blog post.

The agent can also author skills from conversation — e.g. “build a skill that checks my WHOOP metrics” — matching patterns described in community shoutouts on the homepage.

Skill Workshop (2026): review and approve proposed skills before they change agent behavior — product direction toward safer self-modification.

Progressive loading keeps token use sane:

Catalog view — names and descriptions only
Full skill — load SKILL.md when triggered
References — optional deep files inside the skill folder
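The three tiers above can be sketched in a few lines. This is an illustration under assumptions: the parser only handles flat `key: value` frontmatter (real skills may need a full YAML parser), and the function names are invented here rather than taken from OpenClaw.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Parse simple `key: value` pairs between the leading --- markers."""
    meta = {}
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep and not line.startswith((" ", "\t")):
            meta[key.strip()] = value.strip()
    return meta


def build_catalog(skills_dir: Path) -> list[dict]:
    """Tier 1: collect only name and description for every skill."""
    catalog = []
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        catalog.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
            "path": skill_md,
        })
    return catalog


def load_full_skill(entry: dict) -> str:
    """Tier 2: read the whole SKILL.md only once a task matches."""
    return entry["path"].read_text(encoding="utf-8")
```

Only the catalog entries would go into the prompt by default. The full body is read from disk when a task matches, which is what keeps the per-skill cost low.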

Team-private skills: host a Git repo and install via a slug, using the same pattern as Hermes Skills Hub taps.

Part 10 — Models and local inference

Set or switch models:

openclaw models list
openclaw models set anthropic/claude-sonnet-4
# or local:
openclaw models set ollama/gemma4:e2b

For a full local stack (Ollama + RAG skill), follow OpenClaw + Gemma + RAG.

Providers are swappable without rebuilding the Gateway — the agent runtime handles translation to supported API formats.

Part 11 — Proactive automation: cron and heartbeats

OpenClaw is designed to be proactive, not only reactive.

Cron jobs schedule isolated agent runs — daily briefings, inbox sweeps, reminders. Describe schedules in natural language or use cron syntax. Jobs persist in config and survive Gateway restarts.

Example prompt inside a chat session:

Every weekday at 8am, summarize my calendar and unread priority emails.
Deliver the summary here. Set this up as a recurring cron job.

List jobs:

openclaw cron list

Heartbeats are periodic check-ins — the agent may reach out when something needs attention (community reports surprise check-ins during heartbeats). Configure through workspace and gateway settings per docs.

Useful variants:

One-shot delay: /cron add 30m "Remind me to check the build"
Interval: /cron add "every 2h" "Check server status"
Attach a skill: run a job with --skill so the agent loads a playbook first
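For intuition about the one-shot and interval forms above, here is a small parser sketch. The grammar it accepts (a number plus s, m, h, or d, optionally prefixed with "every") is an assumption based only on the two examples shown. OpenClaw's real parser may accept more or behave differently.

```python
import re
from datetime import timedelta

UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_schedule(spec: str) -> tuple[bool, timedelta]:
    """Return (repeats, delay) for specs like "30m" or "every 2h"."""
    match = re.fullmatch(r"(every\s+)?(\d+)([smhd])", spec.strip().lower())
    if match is None:
        raise ValueError(f"unsupported schedule: {spec!r}")
    repeats = match.group(1) is not None
    amount, unit = int(match.group(2)), match.group(3)
    return repeats, timedelta(**{UNITS[unit]: amount})
```

A bare duration is treated as a one-shot delay, and the "every" prefix marks the job as recurring.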

Part 12 — Multi-agent routing

One Gateway can route multiple isolated agents — different workspaces, sessions, or senders. Useful for “work agent” vs “personal agent”, or separate Telegram bots.

Concepts:

Session isolation — conversations do not leak context across routes
Workspace per agent — distinct SOUL.md, skills, and tools
Sender-based routing — map channels or users to different agents
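The three concepts above fit together roughly like this. It is a sketch under assumptions: the `Route` shape, the agent ids, and the session-key format are invented for illustration and do not reflect OpenClaw's config schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    channel: str   # e.g. "telegram"
    sender: str    # user id or phone number; "*" matches any sender
    agent: str     # hypothetical agent id


ROUTES = [
    Route("telegram", "1001", "work"),
    Route("whatsapp", "+15555550123", "personal"),
    Route("telegram", "*", "personal"),
]


def pick_agent(channel: str, sender: str, routes=ROUTES, default="main") -> str:
    """Return the matching agent; exact sender matches beat "*" wildcards."""
    for exact in (True, False):
        for route in routes:
            if route.channel != channel:
                continue
            if (exact and route.sender == sender) or (not exact and route.sender == "*"):
                return route.agent
    return default


def session_key(channel: str, sender: str) -> str:
    """Namespace sessions by agent so context never crosses routes."""
    return f"{pick_agent(channel, sender)}:{channel}:{sender}"
```

Because the agent id is part of the session key, the same person writing to two different bots ends up in two separate conversations.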

Docs: Multi-agent routing

Part 13 — Nodes, voice, and Canvas

Mobile nodes pair iOS/Android apps with the Gateway for camera capture, voice workflows, and Canvas (visual workspace). The macOS/Windows companion apps expose tray controls and local node mode.

Docs: Nodes

This is how users run “fix production from a dog walk” workflows — phone chat triggers agent execution on a home server or Mac mini.

Part 14 — OpenClaw vs Hermes (prose only)

Both are self-hosted, messaging-friendly agent runtimes. Neither is a hosted SaaS.

OpenClaw leads with the Gateway and channels — the product feels like “message your computer from WhatsApp.” Skills extend behavior; the community hub (ClawHub) is large; onboarding and Control UI are polished for personal assistants.

Hermes leads with the learning agent — runtime skill authoring, Curator maintenance, optional GEPA offline validation, and research-oriented tooling (MCP, profiles, training pipeline). See Hermes Agent Masterclass.

You can migrate between them: hermes claw migrate imports OpenClaw-style config into Hermes. Full side-by-side: Hermes vs OpenClaw.

Pick OpenClaw when channel UX, ClawHub, and dashboard-first setup matter most. Pick Hermes when the self-improving skill library and experiment loop matter most. Many operators run one primary runtime and borrow skills from the other ecosystem.

Part 15 — Troubleshooting

openclaw: command not found — reinstall globally or ensure npm global bin is on PATH.

Gateway will not start — run openclaw doctor; check port 18789 conflicts.

Node version errors — upgrade to Node 22.19+ or 24.

Channel connected but no replies — verify allowlists, mention rules in groups, and bot token.

Model errors — confirm API key in config; test with openclaw agent --message hi.

Docs entry: Troubleshooting

Part 16 — Verify this guide

chmod +x guides/openclaw/scripts/verify-openclaw.sh
./guides/openclaw/scripts/verify-openclaw.sh

Official links

openclaw.ai — product home
docs.openclaw.ai — documentation
github.com/openclaw/openclaw — source
clawhub.ai — skill registry
Discord community

Summary

OpenClaw is a Gateway-first personal agent: install with openclaw onboard, chat from the dashboard or your favorite messaging app, extend with skills and cron, and keep data on your machine. Start with Telegram and the Control UI, tighten security with allowlists, then add ClawHub skills and automation once the loop feels natural.

Thank you so much for reading

Like | Follow | Subscribe to the newsletter.

Catch us on

LinkedIn: https://www.linkedin.com/in/techlatest-net/

Anthropic Cybersecurity Skills — Full Tutorial

TechLatest — Thu, 11 Jun 2026 09:51:34 +0000

Give any AI agent the structured decision-making of a senior security analyst — not generic web search, but step-by-step playbooks mapped to MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND, and NIST AI RMF.

Based on mukul975/Anthropic-Cybersecurity-Skills (754 skills · 26 domains · Apache 2.0).

Community project — not affiliated with Anthropic PBC.

What you’ll learn

What the library is and why it exists
How the agentskills.io standard enables progressive disclosure
All five framework mappings and how to use them in compliance workflows
Install on Claude Code, Cursor, Copilot, Codex CLI, Gemini CLI, Hermes, and MCP agents
Skill anatomy — frontmatter, Workflow, Verification, references, scripts
End-to-end examples: memory forensics, threat hunting, cloud IR
All 26 security domains and when to activate each
Contributing, responsible use, citation, and troubleshooting

Part 1 — The problem this solves
Part 2 — Library at a glance
Part 3 — Architecture and progressive disclosure
Part 4 — Five frameworks, one skill library
Part 5 — Quick start installation
Part 6 — Claude Code setup
Part 7 — Cursor setup
Part 8 — GitHub Copilot and Codex CLI
Part 9 — Gemini CLI and other platforms
Part 10 — Hermes Agent integration
Part 11 — Skill anatomy deep dive
Part 12 — How agents discover and execute skills
Part 13 — Walkthrough: credential theft in a memory dump
Part 14 — Walkthrough: hypothesis-driven threat hunting
Part 15 — Walkthrough: multi-cloud breach scoping
Part 16 — All 26 security domains
Part 17 — MITRE ATT&CK v19.1 coverage
Part 18 — Compliance and risk frameworks in practice
Part 19 — Casky Playground and GARS-2026
Part 20 — Contributing your own skill
Part 21 — Security, ethics, and authorized use
Part 22 — Troubleshooting
Part 23 — Citation and license

TL;DR

754 production-grade cybersecurity skills for AI agents — structured playbooks, not random scripts or payload dumps
Community project (mukul975/Anthropic-Cybersecurity-Skills) — not affiliated with Anthropic PBC · Apache 2.0
26 security domains — cloud, DFIR, threat hunting, web app, OT/ICS, red team, and more
5 framework mappings per skill — MITRE ATT&CK v19.1 · NIST CSF 2.0 · MITRE ATLAS · D3FEND · NIST AI RMF
Built on agentskills.io — YAML frontmatter for discovery + Markdown workflows for execution
Progressive disclosure — scan all 754 skills at ~30 tokens each, load only matching playbooks at ~500–2K tokens
One-line install: npx skills add mukul975/Anthropic-Cybersecurity-Skills
Works with Cursor, Claude Code, Copilot, Codex CLI, Gemini CLI, Hermes, and MCP agents
Tutorial includes animated GIFs — install steps, architecture, skill anatomy, DFIR walkthrough, domain + ATT&CK tables
Runnable scripts — inspect real SKILL.md files and walk through a credential-dump IR scenario
Closes the gap between “LLM that searches the web” and “agent that follows a senior analyst playbook.”

Note

BlackArch Linux

We also provide a ready-to-deploy BlackArch Linux VM that can be launched instantly on AWS, GCP, or Azure. No installation, setup, or dependency management required — just spin it up and start using a full arsenal of penetration testing and security auditing tools in minutes.

Kali GUI Linux

Our Kali GUI Linux VM comes fully pre-configured with a graphical interface, making it easy for both beginners and professionals to get started. Deploy directly on AWS, GCP, or Azure with zero setup — no installation hassles, just immediate access to a complete offensive security toolkit.

Browser-Based Kali Linux

We offer a browser-based Kali Linux environment that runs entirely in the cloud. Simply deploy and access it from your browser — no downloads, no local setup, no compatibility issues. Deploy directly on AWS, GCP, or Azure with zero setup — no installation hassles, just immediate access to a complete offensive security toolkit. Perfect for quick testing, learning, and remote security operations from anywhere.

ParrotOS Linux

Our ParrotOS Linux VM is optimized for security, privacy, and development workflows. Available for instant deployment on AWS, GCP, and Azure, it eliminates the need for manual installation — giving you a secure, ready-to-use environment in just a few clicks.

Part 1 — The problem this solves

The cybersecurity workforce gap hit 4.8 million unfilled roles globally in 2024 (ISC2). AI agents can help close that gap — but only if they have structured domain knowledge to work from.

Today’s agents can write code and search the web. They typically cannot:

Pick the right Volatility3 plugin for a suspicious memory dump
Know which Sigma rules catch Kerberoasting
Scope a cloud breach across AWS, Azure, and GCP with consistent playbooks
Map findings to ATT&CK techniques without hallucinating IDs

Existing security repos give you wordlists, payloads, or exploit code. None give an AI agent the decision workflow a senior analyst follows: prerequisites, step order, verification, and framework mapping.

Anthropic Cybersecurity Skills fills that gap: 754 skills, each a practitioner playbook in agentskills.io format — YAML frontmatter for discovery, Markdown body for execution, optional references/scripts/assets for depth.

Part 2 — Library at a glance

What it is not

Not an Anthropic official product
Not a script dump or payload collection
Not a replacement for authorization, legal scope, or human judgment

What it is

An AI-native knowledge base built for agent toolchains
Validated ATT&CK v19.1 mappings via mitreattack-python — zero revoked IDs
The only open-source skills library with unified five-framework coverage per skill

Part 3 — Architecture and progressive disclosure

Part 4 — Five frameworks, one skill library

No other open-source skills library maps every skill to all five frameworks. One skill, five compliance checkboxes.

Example — one skill, five mappings

Skill: analyzing-network-traffic-of-malware

Part 5 — Quick start installation

Option A — npx (recommended)

Works with any agentskills.io-compatible platform:

npx skills add mukul975/Anthropic-Cybersecurity-Skills

The installer registers skills in your agent’s configured skills directory.

Option B — Git clone

git clone https://github.com/mukul975/Anthropic-Cybersecurity-Skills.git
cd Anthropic-Cybersecurity-Skills

Inspect skills/ — each subdirectory is one skill with SKILL.md at the root.

Option C — This guide’s helper script

cd guides/anthropic-cybersecurity-skills
chmod +x install-skills.sh verify-install.sh
./install-skills.sh
./verify-install.sh

Default clone path: ~/.cybersec-skills/Anthropic-Cybersecurity-Skills. Override:

export CYBERSEC_SKILLS_DIR=/opt/security-skills/Anthropic-Cybersecurity-Skills
./install-skills.sh

Part 6 — Claude Code setup

Claude Code — symlink skills to ~/.claude/skills/

Claude Code loads skills from .claude/skills/ (project) or ~/.claude/skills/ (global).

Global install (all projects)

SKILLS_SRC=~/.cybersec-skills/Anthropic-Cybersecurity-Skills/skills
mkdir -p ~/.claude/skills

# Symlink entire library (754 skills — high discovery surface)
ln -sf "${SKILLS_SRC}"/* ~/.claude/skills/

# Or copy a subset — e.g. DFIR only
cp -r "${SKILLS_SRC}"/performing-memory-forensics-with-volatility3 ~/.claude/skills/
cp -r "${SKILLS_SRC}"/hunting-for-credential-dumping-lsass ~/.claude/skills/

Project-scoped (one engagement)

mkdir -p .claude/skills
ln -sf ~/.cybersec-skills/Anthropic-Cybersecurity-Skills/skills/* .claude/skills/

Verify in Claude Code

Start a session and ask:

Use the performing-memory-forensics-with-volatility3 skill. List prerequisites and the first three Workflow steps only.

Claude should read SKILL.md and cite structured sections — not invent generic Volatility commands.

Part 7 — Cursor setup

Cursor — npx or manual symlink to ~/.cursor/skills/

Cursor discovers skills listed in agent configuration and from ~/.cursor/skills/ (user skills).

Install via npx

npx skills add mukul975/Anthropic-Cybersecurity-Skills

Follow Cursor-specific prompts if the installer detects your environment.

Manual symlink

mkdir -p ~/.cursor/skills
ln -sf ~/.cybersec-skills/Anthropic-Cybersecurity-Skills/skills/* ~/.cursor/skills/

Project rules (optional)

Add to .cursor/rules/ or project instructions:

For security investigations, prefer skills from Anthropic Cybersecurity Skills.
Scan skill frontmatter by tags (dfir, threat-hunting, cloud-security) before loading full SKILL.md.
Always complete the Verification section before closing an investigation step.

Verify in Cursor

Open Agent mode and prompt:

I have a Windows memory dump. Which cybersecurity skills apply? Load the best match and show Prerequisites.

Part 8 — GitHub Copilot and Codex CLI

Copilot + Codex CLI — install skills and invoke by name

Both support agentskills.io when configured with a skills path.

Copilot (VS Code / JetBrains)

Clone or npx skills add the repo
Point Copilot’s agent skills setting at skills/
In agent chat: reference skill name in kebab-case (e.g. hunting-for-lateral-movement-with-sysmon)

OpenAI Codex CLI

npx skills add mukul975/Anthropic-Cybersecurity-Skills
codex # or your configured entrypoint

Codex reads frontmatter for routing; load full skills for multi-step IR workflows.

Part 9 — Gemini CLI and other platforms

Gemini CLI — npx install and skill invocation

Compatible without custom forks:

Gemini CLI: install skills via npx skills add, then invoke by skill name in prompts.

LangChain / CrewAI: mount skills/<skill-name>/SKILL.md as tool description or system prompt segment; use frontmatter tags for retrieval routing.

MCP agents: expose skill search as an MCP resource listing frontmatter; fetch full SKILL.md on match.

Part 10 — Hermes Agent integration

Hermes — copy skills into ~/.hermes/skills/

Hermes uses ~/.hermes/skills/ (same agentskills.io layout).

git clone https://github.com/mukul975/Anthropic-Cybersecurity-Skills.git /tmp/cybersec-skills
cp -r /tmp/cybersec-skills/skills/* ~/.hermes/skills/
hermes skills list | head

For SOC automation, combine with Hermes cron/Curator so frequently used skills stay prioritized. See Awesome Hermes Agent tutorial.

Example Hermes prompt:

Run a hypothesis-driven hunt for Kerberoasting using the threat hunting skills. Map hits to ATT&CK T1558.003.

Part 11 — Skill anatomy deep dive

Every skill follows a consistent directory structure:

skills/performing-memory-forensics-with-volatility3/
├── SKILL.md              ← Definition (YAML + Markdown)
├── references/
│   ├── standards.md      ← Framework mappings
│   └── workflows.md      ← Deep technical reference
├── scripts/
│   └── process.py        ← Helper scripts
└── assets/
    └── template.md       ← Report templates

YAML frontmatter (real example)

---
name: performing-memory-forensics-with-volatility3
description: >-
  Analyze memory dumps to extract running processes, network connections,
  injected code, and malware artifacts using the Volatility3 framework.
domain: cybersecurity
subdomain: digital-forensics
tags: [forensics, memory-analysis, volatility3, incident-response, dfir]
atlas_techniques: [AML.T0047]
d3fend_techniques: [D3-MA, D3-PSMD]
nist_ai_rmf: [MEASURE-2.6]
nist_csf: [DE.CM-01, RS.AN-03]
version: "1.2"
author: mukul975
license: Apache-2.0
---
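A quick way to sanity-check a skill before opening a PR is to confirm its frontmatter defines the fields shown in the example. The sketch below treats every key from that example as expected, which is an assumption. The upstream contributing guide may require a different set.

```python
import re

# Keys copied from the example frontmatter; treating all as required is an assumption.
EXPECTED_KEYS = {
    "name", "description", "domain", "subdomain", "tags",
    "atlas_techniques", "d3fend_techniques", "nist_ai_rmf", "nist_csf",
    "version", "author", "license",
}


def top_level_keys(skill_text: str) -> set[str]:
    """Collect unindented `key:` names between the leading --- markers."""
    match = re.match(r"---\n(.*?)\n---", skill_text, flags=re.DOTALL)
    if match is None:
        return set()
    return set(re.findall(r"^([A-Za-z_][\w-]*):", match.group(1), flags=re.MULTILINE))


def missing_keys(skill_text: str) -> set[str]:
    """Return expected frontmatter keys that the skill does not define."""
    return EXPECTED_KEYS - top_level_keys(skill_text)
```

The regex only looks at unindented lines, so colons inside a folded description are not mistaken for keys. It checks presence only, not whether the values are valid IDs.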

Part 12 — How agents discover and execute skills

User prompt: “Analyze this memory dump for signs of credential theft.”

Agent internal process:

Scan 754 frontmatter blocks (~30 tokens each)
→ Match tags: forensics, credential-access, memory-analysis → 12 candidate skills
Load top 3:

performing-memory-forensics-with-volatility3
hunting-for-credential-dumping-lsass
analyzing-windows-event-logs-for-credential-access

Execute Workflow — Volatility3 plugins, LSASS access patterns, event log correlation
Verification — confirm IOCs, map to ATT&CK T1003 (Credential Dumping)

Without skills, the agent guesses commands and skips steps. With skills, it follows the same playbook a senior DFIR analyst would use.
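The discovery step and its token cost can be sketched as follows. The tag-overlap ranking is a stand-in chosen for illustration (real agents may match on descriptions or embeddings), the catalog tags are invented, and the cost figures are the approximate ones quoted in this guide.

```python
CATALOG = [  # tags here are illustrative, not copied from the real skills
    {"name": "performing-memory-forensics-with-volatility3",
     "tags": ["forensics", "memory-analysis", "volatility3"]},
    {"name": "hunting-for-credential-dumping-lsass",
     "tags": ["credential-access", "memory-analysis", "threat-hunting"]},
    {"name": "analyzing-network-traffic-of-malware",
     "tags": ["network", "malware"]},
]


def rank_by_tags(catalog, wanted_tags, top_n=3):
    """Rank skills by how many wanted tags they share; drop non-matches."""
    wanted = set(wanted_tags)
    scored = [(len(wanted & set(s["tags"])), s["name"]) for s in catalog]
    hits = [(score, name) for score, name in scored if score > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in hits[:top_n]]


def token_budget(total_skills=754, scan_cost=30, loaded=3, full_cost=2000):
    """Upper-bound estimate: scan every frontmatter, fully load a few skills."""
    return total_skills * scan_cost + loaded * full_cost
```

With these numbers the scan-then-load approach costs about 754 × 30 + 3 × 2,000 = 28,620 tokens, against roughly 754 × 2,000 = 1,508,000 if every playbook were loaded up front.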

Tips for better agent behavior

Ask the agent to name the skill before executing
Require Verification section output in every response
For red team skills, state authorized scope in the prompt
Use subset installs (10–20 skills) if the agent overloads context
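For the subset-install tip, a small helper can link just the skills you name instead of the whole library. This is a sketch that assumes a POSIX system where symlinks need no special privileges. The function name and its skip-if-present behavior are choices made here, not part of the upstream tooling.

```python
from pathlib import Path


def link_subset(skills_src: Path, target: Path, names: list[str]) -> list[str]:
    """Symlink only the named skill folders into target; return those linked."""
    target.mkdir(parents=True, exist_ok=True)
    linked = []
    for name in names:
        source = skills_src / name
        if not (source / "SKILL.md").is_file():
            continue  # skip names that are not valid skill folders
        link = target / name
        if link.is_symlink() or link.exists():
            continue  # leave anything already installed untouched
        link.symlink_to(source, target_is_directory=True)
        linked.append(name)
    return linked
```

Calling it with the two DFIR skill names from the Claude Code section and a target of `Path.home() / ".claude" / "skills"` would mirror the copy-a-subset commands, but with links that pick up upstream updates on the next pull.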

Part 13 — Walkthrough: credential theft in a memory dump

Scenario: IR ticket — suspected Mimikatz on a Windows server. You have a .raw memory image.

Step 1 — Activate the right skills

Prompt:

Authorized DFIR on image server01.raw. Find skills for memory forensics and credential dumping. List prerequisites.

Expected skills: memory forensics + LSASS hunting + Windows event logs.

Step 2 — Prerequisites check

Agent should verify from SKILL.md:

Volatility3 installed (vol -h)
Symbol tables / Windows profile for OS build
Sufficient disk space for plugin output
Chain of custody documented

Step 3 — Workflow execution

Typical workflow order (from skills):

windows.info / windows.pslist — baseline processes
windows.malfind / windows.vadwalk — injection indicators
LSASS-focused plugins and handle analysis
Correlate with Security Event ID 4656/4663 if disk logs are available
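To keep the plugin order explicit and reviewable, the command lines can be generated before anything runs. The sketch below only builds strings in the usual `vol -f <image> <plugin>` form. It leaves out the LSASS-focused step because the walkthrough does not name specific plugins for it.

```python
import shlex

# Plugin order follows the walkthrough; LSASS-specific plugins are omitted
# because the article does not name them.
PLUGIN_ORDER = ["windows.info", "windows.pslist", "windows.malfind", "windows.vadwalk"]


def build_commands(image: str, plugins=PLUGIN_ORDER) -> list[str]:
    """Return one `vol -f <image> <plugin>` command line per plugin."""
    return [f"vol -f {shlex.quote(image)} {plugin}" for plugin in plugins]


if __name__ == "__main__":
    for command in build_commands("server01.raw"):
        print(command)
```

Printing the plan first gives a human the chance to approve it, which fits the keep-humans-in-the-loop advice later in this guide.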

Step 4 — Verification

Named process accessing lsass.exe with suspicious privileges
In-memory strings or injection matching known dump tools
Timeline aligns with alert timestamp
ATT&CK: T1003.001 OS Credential Dumping: LSASS Memory

Step 5 — Report

Use skill assets/template.md if present; include framework mappings from references/standards.md.

Part 14 — Walkthrough: hypothesis-driven threat hunting

Scenario: Hunt for Kerberoasting in an enterprise SIEM.

Hypothesis

Service accounts may be targeted via Kerberoasting (T1558.003) in the last 30 days.

Skill selection

Tags: threat-hunting, kerberos, sigma, splunk or sentinel.

Agent loads hunting skill → Workflow:

Deploy/validate Sigma rule for Kerberoasting
Query rare RC4-HMAC service ticket requests
Enrich service accounts — SPN exposure, password age
Escalate confirmed anomalies to IR queue
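The query step can be prototyped offline against exported events before it becomes a SIEM rule. This sketch assumes events are dicts with the hypothetical field names shown. It relies on two well-known facts: Windows Event ID 4769 records Kerberos service ticket requests, and ticket encryption type 0x17 is RC4-HMAC.

```python
from collections import Counter

RC4_HMAC = "0x17"  # Kerberos encryption type 23 (RC4-HMAC)


def suspicious_tickets(events):
    """Keep Event ID 4769 requests that used RC4 for a non-machine service."""
    hits = []
    for event in events:
        if event.get("event_id") != 4769:
            continue
        if event.get("ticket_encryption_type", "").lower() != RC4_HMAC:
            continue
        service = event.get("service_name", "")
        if service.endswith("$") or service.lower() == "krbtgt":
            continue  # machine accounts and krbtgt are usually noise
        hits.append(event)
    return hits


def requests_per_account(hits):
    """Count hits per requesting account to surface bulk requesters."""
    return Counter(event["account"] for event in hits)
```

One account requesting RC4 tickets for many distinct services in a short window is the pattern worth escalating. A single hit is often a legacy application.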

Verification

Non-noise hits with service account + weak crypto ticket
ATT&CK technique documented
Hunt notebook updated for repeatability

Part 15 — Walkthrough: multi-cloud breach scoping

Scenario: Credentials leaked; unknown activity in AWS, Azure, and GCP.

Skills to combine

Agent workflow:

Contain — disable keys, force password reset (Incident Response skills)
Discover — each provider’s log skill in parallel
Collect — unified timeline (Digital Forensics)
Map — ATT&CK cloud techniques (T1078, T1530, etc.)
Report — NIST CSF RS.AN / RS.MI categories
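The Collect step boils down to merging per-provider logs into one ordered timeline. A minimal sketch, assuming each provider's events are already normalized to dicts with a timezone-aware `time` field and sorted within their own list (the field names are hypothetical):

```python
import heapq
from datetime import datetime


def parse_time(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-06-01T10:00:00+00:00."""
    return datetime.fromisoformat(stamp)


def unified_timeline(*sources):
    """Merge per-provider event lists (each sorted by time) into one timeline."""
    return list(heapq.merge(*sources, key=lambda event: event["time"]))
```

`heapq.merge` streams the inputs rather than concatenating and re-sorting, which matters once each provider contributes millions of log lines. Normalizing every timestamp to UTC before merging is the step that is easiest to get wrong.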

Part 16 — All 26 security domains

Part 17 — MITRE ATT&CK v19.1 coverage

754/754 skills mapped. Validated with official mitreattack-python — no revoked or deprecated IDs.

v19.1 change: Defense Evasion split into Stealth (TA0005) and Defense Impairment (TA0112).

Part 18 — Compliance and risk frameworks in practice

NIST CSF 2.0

Map skill outputs to Govern, Identify, Protect, Detect, Respond, Recover for audit trails. Example: memory forensics → Detect (DE.CM), Respond (RS.AN).
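Because CSF 2.0 subcategory IDs start with a two-letter function code, grouping a skill's `nist_csf` values for an audit summary is mostly a prefix lookup. A minimal sketch (the helper names are made up for illustration):

```python
CSF_FUNCTIONS = {
    "GV": "Govern", "ID": "Identify", "PR": "Protect",
    "DE": "Detect", "RS": "Respond", "RC": "Recover",
}


def csf_function(subcategory_id: str) -> str:
    """Map an ID such as "DE.CM-01" to its NIST CSF 2.0 function name."""
    prefix = subcategory_id.split(".", 1)[0].upper()
    return CSF_FUNCTIONS[prefix]


def group_by_function(ids):
    """Group subcategory IDs under their function for an audit summary."""
    grouped = {}
    for item in ids:
        grouped.setdefault(csf_function(item), []).append(item)
    return grouped
```

For the memory forensics example, `group_by_function(["DE.CM-01", "RS.AN-03"])` yields one entry under Detect and one under Respond. An unknown prefix raises `KeyError`, which is a useful signal that a mapping needs review.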

MITRE ATLAS

Use when the incident involves ML models — poisoning, evasion, model theft. Frontmatter field: atlas_techniques.

MITRE D3FEND

Pair offensive findings with defensive countermeasures — e.g. D3-NTA for network traffic analysis skills.

NIST AI RMF

For AI governance — document which agent skills were used, human-in-the-loop checkpoints, and measurement (MEASURE-* subcategories).

See Framework mappings for crosswalk tables and reporting templates.

Part 19 — Casky Playground and GARS-2026

Casky.ai Playground

Hands-on exercises without local install:

→ Launch Playground on Casky.ai

Live cybersecurity skill exercises
Real-time agent execution
Interactive ATT&CK-mapped workflows

GARS-2026 Survey

Global Agentic AI Readiness Survey (SRH Berlin) — measures readiness for MCP, tool calling, and governance.

~10 minutes, anonymous
Results published open access (CC-BY 4.0)
Link in upstream README

Part 20 — Contributing your own skill

Fork Anthropic-Cybersecurity-Skills
Copy the skill template from CONTRIBUTING.md
Add skills/your-skill-name/SKILL.md with full frontmatter + four body sections
Add references/standards.md with ATT&CK + framework IDs
PR title: Add skill: your-skill-name
Review within ~48 hours for technical accuracy and agentskills.io compliance

Improve existing skills: framework mappings, fixed commands, new scripts/templates.

Report issues: inaccurate procedures or broken scripts → GitHub Issues.

Project follows Contributor Covenant.

Part 21 — Security, ethics, and authorized use

These skills describe offensive and defensive techniques. Use only:

On systems you own or have written authorization to test
Within bug bounty/pentest/red team scope
With human oversight for destructive or exfiltration steps

AI agents can execute commands quickly — mis-scoped prompts cause real damage. Always:

State authorization in the prompt
Use read-only modes where available
Keep humans in the loop for containment and legal notification

Upstream Security Policy: responsible disclosure, 48-hour acknowledgment.

Part 22 — Troubleshooting

Run ./verify-install.sh after every pull.

Part 23 — Citation and license

@software{anthropic_cybersecurity_skills,
  author = {Jangra, Mahipal},
  title = {Anthropic Cybersecurity Skills},
  year = {2026},
  url = {https://github.com/mukul975/Anthropic-Cybersecurity-Skills},
  license = {Apache-2.0},
  note = {754 structured cybersecurity skills for AI agents,
                  mapped to MITRE ATT\&CK, NIST CSF 2.0, MITRE ATLAS,
                  MITRE D3FEND, and NIST AI RMF}
}

License

Apache License 2.0 — use, modify, and distribute in personal and commercial projects.

Conclusion

The cybersecurity skills gap is not going to close with generic chatbots alone. Analysts do not win investigations because an LLM can search the web — they win because they know which playbook to run, in what order, and how to verify the result before closing the ticket.

Anthropic Cybersecurity Skills (community-built, Apache 2.0) gives AI agents that same structure: 754 skills across 26 domains, each mapped to MITRE ATT&CK, NIST CSF, ATLAS, D3FEND, and NIST AI RMF. The agentskills.io format makes it practical — scan lightweight frontmatter first, load full workflows only when the incident demands it.

You do not need a custom fork or a new agent runtime. One install command works across Cursor, Claude Code, Copilot, Codex CLI, Gemini CLI, and Hermes. Point your agent at the library, name the skill in your prompt, and require the Verification step before it reports done.

Start here:

npx skills add mukul975/Anthropic-Cybersecurity-Skills

Then walk through the tutorial: inspect a real SKILL.md, run the credential-dump walkthrough, and pick skills by domain or ATT&CK tactic. Use them only on authorized systems — these are practitioner playbooks, not toys.

If this helps your SOC or red-team workflow, star the upstream repo and consider contributing a skill in an underrepresented domain like Deception Technology or Compliance & Governance. The library grows on community PRs — and the agents using it get sharper with every one.

Thank you so much for reading

Like | Follow | Subscribe to the newsletter.

Catch us on

LinkedIn: https://www.linkedin.com/in/techlatest-net/

Reddit Community: https://www.reddit.com/user/techlatest_net/

Build an ML Model That Actually Ships: A 6-Step Visual Walkthrough

TechLatest — Wed, 10 Jun 2026 08:14:49 +0000

Most people picture machine learning like this: pick an algorithm, call .fit(), done.

That’s not how it works in real teams.

Training is one stage in a longer pipeline. Skip the early steps and you build the wrong thing. Skip the late steps and nothing ever reaches users, or it breaks quietly in production.

Here are the six stages every serious ML project goes through, what happens in each, and what to watch out for.

TL;DR

Building a model that reaches production is six stages, not one notebook cell:

Define the problem — KPIs and a baseline before any code
Prepare data — clean, feature, split; reject leakage
Choose a model — start simple; match data size and interpretability
Train & tune — loop until validation metrics plateau
Evaluate & test — held-out test set + slice by segment
Deploy & monitor — API in prod, then watch for drift and retrain

The algorithm is roughly 15–25% of the work. Most calendar time sits in data, evaluation, and keeping the model alive after launch.

Each step in the full article has a GIF so you can see the flow — not just read a checklist.

Step 1: Define the problem before you touch data

Start with questions, not notebooks.

What you’re really doing: turning a business or product problem into a measurable ML task.

Ask:

What decision should the model help with? (approve a loan, flag spam, recommend a product)
Is ML the right tool, or would rules or a lookup table work?
What does “good enough” mean — accuracy, speed, cost, fairness?
Who uses the output, and what happens when the model is wrong?

Write down success metrics now. If you can’t define them, you’re not ready to collect data.

Common mistakes

Solving a problem nobody has
Choosing metrics that look good on paper but don’t match the product (e.g., 99% accuracy when 98% of examples already share one label)
No baseline — even “always predict the majority class” should be beaten

Deliverable: one-page problem brief — use case, constraints, KPIs, and a simple baseline plan.
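
To make the baseline point concrete, here is a minimal sketch of the "always predict the majority class" check. The label counts, the `min_gain` margin, and both function names are illustrative assumptions, not part of any library.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy you get by always predicting the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


def beats_baseline(model_accuracy, baseline_accuracy, min_gain=0.005):
    """True only if the model clears the baseline by a margin chosen up front."""
    return model_accuracy >= baseline_accuracy + min_gain


# Made-up label distribution: 98% of examples share one label.
labels = ["ok"] * 98 + ["fraud"] * 2
baseline = majority_baseline_accuracy(labels)

print(f"majority-class baseline: {baseline:.2%}")
print("99.0% model beats it:", beats_baseline(0.99, baseline))
print("98.2% model beats it:", beats_baseline(0.982, baseline))
```

Seen next to the baseline, a 99% model is only one point better than predicting nothing useful at all, which is why accuracy alone is a weak KPI on imbalanced data.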

Step 2: Prepare data (where most of the calendar time goes)

Models learn from examples. Garbage in, garbage out — that phrase exists for a reason.

What you’re really doing: building a dataset that matches the problem you defined in Step 1.

Typical work:

Collect — databases, APIs, logs, labels from humans, public datasets
Clean — missing values, duplicates, typos, timezone bugs, unit mismatches
Explore — distributions, correlations, label balance, leakage (future info sneaking into features)
Engineer features — ratios, aggregates, encodings, text tokens, image resize/normalize
Split — train/validation/test (and time-based splits for forecasting)

Rule of thumb: if Step 1 took a day and Step 2 takes three weeks, you’re probably on track.

Common mistakes

Leakage (e.g. using “total spend after signup” to predict signup completion)
Random split on time-series data
Test set touched during experimentation (it should stay locked until the end)
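
The "random split on time-series data" mistake is easier to avoid with a split that is ordered by time. The sketch below is one way to do it with the standard library only; the row layout, the `day` field, and the 70/15/15 percentages are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def time_based_split(rows, time_key, train_pct=70, val_pct=15):
    """Sort rows by time, then cut: oldest -> train, middle -> validation, newest -> test."""
    if train_pct <= 0 or val_pct < 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must leave room for a test set")
    ordered = sorted(rows, key=lambda row: row[time_key])
    n = len(ordered)
    # Integer arithmetic avoids float rounding surprises at the cut points.
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# Made-up daily records, deliberately supplied out of order.
start = date(2026, 1, 1)
rows = [{"day": start + timedelta(days=i), "value": i} for i in reversed(range(20))]

train, val, test = time_based_split(rows, "day")
print(len(train), len(val), len(test))
print("train ends:", train[-1]["day"], "| test starts:", test[0]["day"])
```

Because every training row is older than every validation and test row, the model never sees the future during fitting.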

Step 3: Pick a modeling approach (smaller than people think)

This is the step that gets all the Twitter threads. In a full project, it’s often 10–20% of the effort — not because it’s easy, but because Steps 1–2 and 5–6 eat the rest.

What you’re really doing: choosing a method that fits data size, latency, interpretability, and maintenance.

**Tabular, medium data, need explanations**  
→ Linear models, tree ensembles (Random Forest, gradient boosting)

**Images, audio, text at scale**  
→ Neural networks (PyTorch, TensorFlow, JAX)

**Small data, strict latency**  
→ Simpler models, or pre-trained + fine-tune

**Need a fast baseline**  
→ Logistic regression, or one strong GBM

Also pick framework and environment early: scikit-learn for classical tabular, PyTorch/TF for deep learning, plus version control and experiment logging from day one.

Don’t marathon-tune a complex model until a simple one fails on your validation set.

Step 4: Train and iterate

Training means showing the model your prepared data so that it learns patterns.

What you’re really doing: running experiments until validation performance stops improving meaningfully.

Loop:

Train on the training set
Tune on the validation set (hyperparameters, architecture tweaks)
Log everything — config, data version, metrics, runtime
Repeat until gains flatten or you hit product targets from Step 1

Hyperparameters (learning rate, tree depth, batch size, regularization) matter, but data and features usually matter more.

Common mistakes

Tuning on the test set (that’s cheating — you’ll overfit to one snapshot)
No reproducibility (can’t rerun the same experiment six months later)
Chasing leaderboard metrics while latency or cost makes deployment impossible
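
The train, tune, log, repeat loop can be sketched as a small search that stops once validation gains flatten. Everything here is a stand-in: `fake_train_and_score` replaces a real training run, the scores are made up, and `patience` and `min_delta` are assumed knobs rather than settings from a specific framework.

```python
import json


def tune_until_plateau(candidates, train_and_score, patience=2, min_delta=0.005):
    """Try configs in order; stop after `patience` runs without a meaningful gain."""
    best_config, best_score = None, float("-inf")
    stale_runs = 0
    log = []
    for config in candidates:
        score = train_and_score(config)  # validation score, never the test set
        improved = score - best_score >= min_delta
        log.append({"config": config, "val_score": score, "improved": improved})
        if improved:
            best_config, best_score, stale_runs = config, score, 0
        else:
            stale_runs += 1
            if stale_runs >= patience:
                break
    return best_config, best_score, log


# Made-up validation scores standing in for real training runs.
FAKE_VAL_SCORES = {1: 0.70, 2: 0.78, 3: 0.81, 4: 0.812, 5: 0.811, 6: 0.810}


def fake_train_and_score(config):
    return FAKE_VAL_SCORES[config["max_depth"]]


candidates = [{"max_depth": depth} for depth in sorted(FAKE_VAL_SCORES)]
best_config, best_score, log = tune_until_plateau(candidates, fake_train_and_score)

for entry in log:
    print(json.dumps(entry))
print("best:", best_config, best_score)
```

Writing each run to a log entry, even in this toy form, is what makes an experiment repeatable six months later.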

Step 5: Evaluate honestly (including fairness)

A model that looks great in a notebook can still fail in the real world.

What you’re really doing: measuring generalization and risk before users see it.

On the held-out test set (touched once, at the end):

Classification: precision, recall, F1, ROC-AUC — pick what matches the cost of false positives vs false negatives
Regression: MAE, RMSE, MAPE
Ranking: NDCG, MAP

Then go deeper:

Slice analysis — performance by region, device, age band, language
Bias/fairness checks — does error concentrate on one group?
Error analysis — open the worst predictions; patterns often point back to Step 2

If test results don’t meet Step 1 KPIs, go back to data or modeling — don’t ship and hope.
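
As an illustration of slice analysis, the sketch below computes precision, recall, and F1 overall and then per segment. The records, the `region` field, and the function names are invented for the example; in practice you would likely use a metrics library instead of hand-rolled counts.

```python
from collections import defaultdict


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def metrics_by_slice(records, slice_key):
    """Group records by a segment field and score each group separately."""
    groups = defaultdict(list)
    for record in records:
        groups[record[slice_key]].append(record)
    return {
        name: precision_recall_f1(
            [r["y_true"] for r in rows], [r["y_pred"] for r in rows]
        )
        for name, rows in groups.items()
    }


# Made-up test-set predictions with a segment label.
records = [
    {"region": "eu", "y_true": 1, "y_pred": 1},
    {"region": "eu", "y_true": 1, "y_pred": 1},
    {"region": "eu", "y_true": 0, "y_pred": 0},
    {"region": "eu", "y_true": 0, "y_pred": 1},
    {"region": "apac", "y_true": 1, "y_pred": 0},
    {"region": "apac", "y_true": 1, "y_pred": 0},
    {"region": "apac", "y_true": 1, "y_pred": 1},
    {"region": "apac", "y_true": 0, "y_pred": 0},
]

overall = precision_recall_f1(
    [r["y_true"] for r in records], [r["y_pred"] for r in records]
)
by_region = metrics_by_slice(records, "region")

print("overall (precision, recall, f1):", overall)
for region, scores in by_region.items():
    print(region, scores)
```

In this toy data the overall recall hides the fact that one region misses most of its positives, which is exactly the kind of gap a single headline metric conceals.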

Step 6: Deploy, monitor, and maintain

Training is a milestone. Production is the job.

What you’re really doing: packaging the model so other systems can call it, then watching it degrade.

Typical path:

Serialize the model (pickle, ONNX, SavedModel, etc.)
Containerize (Docker) for consistent runtime
Deploy — API on cloud (AWS/GCP/Azure), edge device, or batch pipeline
Monitor — latency, error rate, input drift, output drift, business KPIs
Retrain on a schedule or when alerts fire

Models rot. User behavior shifts. New products launch. Upstream data schemas change. Monitoring catches that before revenue or trust does.

Common mistakes

No rollback plan
Monitoring only infrastructure (CPU/RAM) but not prediction quality
Retraining on production traffic without governance
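
One common way to watch input drift is the Population Stability Index (PSI), which compares how a feature is distributed across fixed buckets in training data versus live traffic. The sketch below is a plain-Python version; the bucket edges, the synthetic values, and the 0.2 alert threshold (a widely quoted rule of thumb, not a standard) are all assumptions.

```python
import bisect
import math


def bucket_fractions(values, edges):
    """Share of values in each bucket defined by sorted edges."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect.bisect_right(edges, value)] += 1
    return [count / len(values) for count in counts]


def population_stability_index(reference, live, edges, floor=1e-6):
    """PSI = sum((live - ref) * ln(live / ref)) over buckets; 0 means identical."""
    ref = bucket_fractions(reference, edges)
    cur = bucket_fractions(live, edges)
    total = 0.0
    for r, c in zip(ref, cur):
        r, c = max(r, floor), max(c, floor)  # avoid log(0) on empty buckets
        total += (c - r) * math.log(c / r)
    return total


ALERT_THRESHOLD = 0.2  # assumed rule of thumb; tune for your own data

reference = list(range(100))            # stand-in for a training feature
edges = [25, 50, 75]
live_same = list(range(100))            # no drift
live_shifted = [v + 40 for v in range(100)]  # everything moved upward

psi_same = population_stability_index(reference, live_same, edges)
psi_shifted = population_stability_index(reference, live_shifted, edges)

print("no drift PSI:", psi_same)
print("shifted PSI above threshold:", psi_shifted > ALERT_THRESHOLD)
```

A check like this covers inputs only. Prediction quality still needs its own monitoring once ground-truth labels arrive.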

Final Thought

Most ML content stops at training. That’s why so many “finished” models never leave a laptop.

Shipping means accepting that data prep, leakage checks, slice analysis, and monitoring are part of the product — not optional cleanup. The teams that win aren’t the ones with the fanciest architecture on day one. They’re the ones that pick a clear metric, beat a dumb baseline, and keep the model honest after it goes live.

If you’re early in the journey, don’t optimize for the perfect algorithm. Optimize for clarity at step one and honesty at step five. Everything else gets easier from there.

OpenClaw or Hermes? Choosing the Right AI Agent Stack in 2026

TechLatest — Tue, 09 Jun 2026 10:25:37 +0000

The AI model race is slowing down. The agent runtime race is just getting started.

In 2025, everyone compared Claude, GPT, Gemini, and Qwen. In 2026, the conversation has shifted. The real question is no longer which model you use, but which system orchestrates that model.

For self-hosted agents, two projects stand out: OpenClaw and Hermes Agent.

Both can connect to Telegram, Discord, Slack, WhatsApp, local tools, and cloud models. Both support skills. Both can automate tasks and execute workflows.

Yet after spending time with both systems, I came away with a simple conclusion:

OpenClaw is a better control plane. Hermes is a better self-improving runtime.

The choice depends entirely on what you expect your agent to become.

Repos: NousResearch/hermes-agent · openclaw/openclaw

Part 1 — What problem do they solve?

At first glance, OpenClaw and Hermes look similar.

You connect a model.

You give it tools.

You chat with it through Telegram, Discord, WhatsApp, or the terminal.

But their philosophies diverge quickly.

OpenClaw treats agents as members of a larger system.

Hermes treats agents as individuals that learn and improve over time.

That difference influences everything else.

| Category | OpenClaw | Hermes |
| ------------------ | ------------------------------------------------------- | ---------------------------------------------------- |
| **Core Idea** | Agent control plane | Self-improving runtime |
| **Primary Focus** | Channels, routing, and orchestration | Learning, memory, and automation |
| **Ideal User** | Operators, builders, and teams managing multiple agents | Researchers, automation enthusiasts, and power users |
| **Long-Term Goal** | Manage and coordinate many agents | Continuously improve a single agent over time |

Both projects answer: “How do I talk to an AI agent from Telegram/WhatsApp/Discord and have it use tools on my machine?”

They diverge on what happens after the first week:

| | OpenClaw | Hermes |
|---|----------|--------|
| **Product feel** | Polished personal assistant — gateway, channels, dashboard | Research-grade agent platform — tools, memory, evolution |
| **Skills** | You install or write `SKILL.md`; ClawHub registry | Agent can **author** skills; Curator maintains quality |
| **Stack** | Node.js, TypeScript, npm global | Python CLI, bash installer |
| **Sweet spot** | "Message my assistant anywhere" | "My assistant gets better at my workflows over time" |

Neither is a hosted SaaS. You run the gateway on your laptop, homelab, or VPS.

Part 2 — Architecture side by side

OpenClaw

Gateway = single control plane (default http://127.0.0.1:18789/)
Workspace = ~/.openclaw/workspace with AGENTS.md, SOUL.md, TOOLS.md
Skills = ~/.openclaw/workspace/skills//SKILL.md
Daemon = launchd/systemd user service after openclaw onboard --install-daemon

Docs: Architecture · Gateway

Hermes

CLI + TUI = hermes, hermes --tui
Gateway = hermes gateway for messaging platforms
Skills = procedural memory in ~/.hermes/skills/
Curator (v0.12+) = periodic grading/pruning of learned skills

Docs: Hermes user guide

Shared pattern

Both normalize inbound chat JSON → agent message → tool/skill execution → outbound reply. Both use Markdown skills as the extension point for custom workflows.
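
The shared pattern can be pictured with a toy pipeline. This is not code from either project: the `AgentMessage` type, the `TOOLS` registry, and the slash-command dispatch are hypothetical, and the payloads are heavily simplified versions of what chat platforms send. A real gateway would let the model decide which tool or skill to run.

```python
from dataclasses import dataclass


@dataclass
class AgentMessage:
    """Channel-independent message handed to the agent (hypothetical shape)."""
    channel: str
    sender: str
    text: str


def normalize(channel, payload):
    """Map a simplified platform payload onto one internal message type."""
    if channel == "telegram":
        message = payload["message"]
        return AgentMessage("telegram", str(message["from"]["id"]), message["text"])
    if channel == "discord":
        return AgentMessage("discord", str(payload["author"]["id"]), payload["content"])
    raise ValueError(f"unsupported channel: {channel}")


# Hypothetical tool registry; a real agent would call the model here instead.
TOOLS = {"upper": lambda arg: arg.upper()}


def run_agent(message):
    """Pick a tool from a leading /command and build the outbound reply."""
    name, _, arg = message.text.partition(" ")
    tool = TOOLS.get(name.lstrip("/"))
    body = tool(arg) if tool else f"no tool for: {message.text}"
    return {"channel": message.channel, "to": message.sender, "text": body}


telegram_reply = run_agent(
    normalize("telegram", {"message": {"from": {"id": 42}, "text": "/upper hello"}})
)
discord_reply = run_agent(
    normalize("discord", {"author": {"id": "7"}, "content": "what can you do?"})
)

print(telegram_reply)
print(discord_reply)
```

The point of the normalization step is that everything after it is identical regardless of which chat app the message came from.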

Architecture Verdict

Choose OpenClaw when:

You need multiple agents
You need channel separation
You need orchestration

Choose Hermes when:

You want a single powerful assistant
You care about automation
You value simplicity

Winner: OpenClaw

Skills: Static Catalog vs Living Knowledge

This is where Hermes becomes interesting.

OpenClaw uses a traditional skill ecosystem.

You install skills.

You update skills.

You manage skills.

The model stays mostly separate from the skill lifecycle.

Hermes takes a different approach.

Repeated workflows can become reusable skills.

Instead of treating skills as software packages, Hermes treats them as procedural memory.

Over time, the agent begins to recognize recurring patterns and formalize them.

This fundamentally changes the relationship between user and system.

With OpenClaw, you manage skills.

With Hermes, you train skills.

Skills Verdict

If you want predictability:

OpenClaw

If you want adaptation:

Hermes

Winner: Hermes

Memory: Rich Context vs Focused Context

Memory is often marketed as a feature.

In reality, memory is usually a tradeoff.

OpenClaw maintains richer context across workflows and channels.

That can be incredibly useful.

It can also create noise.

As systems grow, context retrieval becomes harder to manage.

Hermes intentionally keeps memory lean.

Instead of aggressively pulling context into every task, it retrieves information progressively.

The result is a system that often feels more focused.

OpenClaw remembers more.

Hermes remembers more selectively.

Memory Verdict

For long-running agent ecosystems:

OpenClaw

For daily workflows and repeated tasks:

Hermes

Winner: Hermes

User Experience and Control

This was one of the most surprising differences.

OpenClaw generally feels mature and stable.

Once configured, it stays out of the way.

Hermes feels more transparent.

Tool execution is easier to inspect.

Context usage is easier to understand.

Interrupting workflows feels more natural.

If you enjoy seeing what your agent is doing, Hermes provides a clearer window into the system.

If you simply want the system to work, OpenClaw’s maturity is reassuring.

UX Verdict

Transparency: Hermes

Stability: OpenClaw

Overall Winner: Hermes

Part 3 — Prerequisites

| Requirement | OpenClaw | Hermes |
|-------------|----------|--------|
| OS | macOS, Linux, Windows (WSL2) | macOS, Linux, WSL |
| Runtime | Node **22.19+** or **24** | Python (installer handles deps) |
| API key or local model | Yes | Yes |
| Disk | ~500MB+ for Node + workspace | ~1GB+ depending on browser tools |

Check versions:

node -v # v22.19+ or v24 for OpenClaw
which hermes # after Hermes install
which openclaw # after OpenClaw install
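
If you want to script the Node check instead of eyeballing it, a sketch like the following parses the version string. It takes the requirement literally as stated above (22.19 or newer within major 22, or major 24); the function name is made up, and whether other majors work is not something this guide confirms.

```python
import re


def node_version_ok(version_text):
    """Check a `node -v` style string against the stated 22.19+ / 24 requirement."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version_text.strip())
    if not match:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major == 22 and minor >= 19) or major == 24


for sample in ["v22.19.0", "v22.4.1", "v24.1.0", "v20.11.0", "not-a-version"]:
    print(sample, "->", node_version_ok(sample))
```

To check the local install, you could pass in the output of `subprocess.run(["node", "-v"], capture_output=True, text=True).stdout`.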

Part 4 — Install OpenClaw

npm install -g openclaw@latest
openclaw onboard --install-daemon

The onboarding wizard configures:

Gateway bind address and auth
LLM provider (or Ollama for local models)
At least one channel (Telegram is the fastest smoke test)
Workspace path and bundled skills

Verify:

openclaw doctor
openclaw status
# Dashboard (if gateway running):
# http://127.0.0.1:18789/

Local model (optional): follow the OpenClaw + Gemma + RAG tutorial to point OpenClaw at gemma4:e2b via Ollama.

OpenClaw skills smoke test

openclaw skills list
openclaw skills install <skill-from-clawhub> # example — see clawhub.ai

Skills load from (highest priority first):

/skills/
Project /.agents/skills
~/.agents/skills
~/.openclaw/skills
Bundled skills

See Skills docs.
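
"Highest priority first" means a skill found in an earlier location shadows one with the same name further down the list. The sketch below illustrates that lookup against temporary directories. `resolve_skill` is a hypothetical helper, not an OpenClaw function, and the roots are passed in rather than hardcoded.

```python
import tempfile
from pathlib import Path


def resolve_skill(name, roots):
    """Return the SKILL.md from the first root that contains the skill, else None."""
    for root in roots:
        candidate = Path(root).expanduser() / name / "SKILL.md"
        if candidate.is_file():
            return candidate
    return None


def _make_skill(root, name):
    """Create a minimal skill directory for the demo."""
    skill_dir = Path(root) / name
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(f"# {name}\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    high, low = Path(tmp) / "high", Path(tmp) / "low"
    _make_skill(high, "notes")
    _make_skill(low, "notes")    # shadowed by the copy under `high`
    _make_skill(low, "weather")  # only exists in the low-priority root
    roots = [high, low]          # ordered from highest to lowest priority
    notes_root = resolve_skill("notes", roots).parent.parent.name
    weather_root = resolve_skill("weather", roots).parent.parent.name
    missing = resolve_skill("does-not-exist", roots)

print(notes_root, weather_root, missing)
```

When a skill seems to ignore your edits, a shadowing copy in a higher-priority location is a likely cause.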

Part 5 — Install Hermes

curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
source ~/.zshrc # or ~/.bashrc
hermes setup --portal

hermes setup --portal is the fastest path to a working cloud model + tool gateway. For local-only, use hermes model and configure Ollama per Hermes docs.

Verify:

hermes doctor
hermes --tui

First TUI prompts to try:

“List tools you have access to”
“List skills in ~/.hermes/skills”
“What is the Curator and when does it run?”

Full Hermes depth: Awesome Hermes Agent tutorial.

Hermes gateway smoke test

hermes gateway

Configure channel tokens via hermes setup or config files. Run hermes doctor after any gateway change. Keep DM pairing/allowlists enabled until you trust exposure.

Part 6 — Feature comparison (hands-on)

Use the same three prompts on both systems and compare behavior.

| Test prompt | What to observe |
|-------------|-----------------|
| *"What skills do you have?"* | OpenClaw lists workspace/ClawHub skills; Hermes lists `~/.hermes/skills` + may mention learned skills |
| *"Run a shell command: uname -a"* | Tool permission / sandbox behavior |
| *"Remember that my project codename is NEPTUNE"* | Memory persistence on next session |

Record results in a simple table:

| Test | OpenClaw | Hermes |
|------|----------|--------|
| Skill list | | |
| Shell tool | | |
| Memory | | |

Full static matrix: feature matrix.

Part 7 — Skills: same format, different lifecycle

OpenClaw skill anatomy

~/.openclaw/workspace/skills/my-skill/
├── SKILL.md # YAML frontmatter + instructions
└── scripts/ # optional Python/shell helpers
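
To see what "YAML frontmatter + instructions" means in practice, the sketch below splits a skill file into metadata and body. It assumes the frontmatter is delimited by `---` lines and holds flat `key: value` pairs; the `name` and `description` fields and the sample text are illustrative, and real files with nested YAML need a proper YAML parser.

```python
def split_frontmatter(text):
    """Split a SKILL.md-style document into (metadata dict, instruction body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            meta = {}
            for line in lines[1:index]:
                if ":" in line and not line.lstrip().startswith("#"):
                    key, _, value = line.partition(":")
                    meta[key.strip()] = value.strip()
            return meta, "\n".join(lines[index + 1:]).strip()
    return {}, text  # no closing delimiter, so treat everything as body


# Illustrative file content, not taken from a real skill.
SAMPLE = """---
name: my-skill
description: Summarize the latest build log
---
# Steps
1. Read the log file.
2. Report failures first.
"""

meta, body = split_frontmatter(SAMPLE)
print(meta)
print(body.splitlines()[0])
```

Keeping the metadata small is what lets an agent scan many skills cheaply and load the full instructions only for the one it picks.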

Install from ClawHub:

openclaw skills install <skill-id>
openclaw skills verify <skill-id> # trust envelope when available

Operator maintains skills — update via openclaw skills update or ClawHub sync.

Hermes skill anatomy

~/.hermes/skills/my-skill/
└── SKILL.md

Invoke explicitly: /skill my-skill or let the agent auto-select.

Learning loop: after repeated workflows, Hermes can draft new SKILL.md files from session traces. Curator (v0.12+) reviews and prunes them on a ~7-day cycle so quality does not drift.

Porting a skill between stacks

Copy the skill directory to the other runtime’s skills path.
Adjust tool names in SKILL.md (OpenClaw vs Hermes tool schemas differ).
Update any script paths (~/.openclaw ↔ ~/.hermes).
Restart gateway / start a new session.

Example: our agentic-rag skill targets OpenClaw — a Hermes port would call the same LitServe RAG API with Hermes shell tool syntax.
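
The copy and path-rewrite steps can be partly automated. The following is a rough sketch, not a supported migration tool: `port_skill` is a made-up helper, the tool names in the replacement map are hypothetical, and tool schema differences still need a manual review afterwards.

```python
import shutil
import tempfile
from pathlib import Path

TEXT_SUFFIXES = {".md", ".sh", ".py"}


def port_skill(src_dir, dest_root, replacements):
    """Copy a skill directory and apply plain-text replacements to its text files."""
    src = Path(src_dir)
    dest = Path(dest_root) / src.name
    shutil.copytree(src, dest)  # raises if dest exists, so nothing is overwritten
    for path in dest.rglob("*"):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            text = path.read_text(encoding="utf-8")
            for old, new in replacements.items():
                text = text.replace(old, new)
            path.write_text(text, encoding="utf-8")
    return dest


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "openclaw-skills" / "my-skill"
    src.mkdir(parents=True)
    (src / "SKILL.md").write_text(
        "Run ~/.openclaw/workspace/skills/my-skill/scripts/run.sh with the exec tool.\n",
        encoding="utf-8",
    )
    ported = port_skill(
        src,
        Path(tmp) / "hermes-skills",
        {
            "~/.openclaw/workspace/skills": "~/.hermes/skills",
            "exec tool": "terminal tool",  # hypothetical tool names
        },
    )
    ported_name = ported.name
    ported_text = (ported / "SKILL.md").read_text(encoding="utf-8")

print(ported_text)
```

Blind string replacement is crude, so diff the ported directory against the original before enabling the skill.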

Part 8 — Channels & gateway

| Concern | OpenClaw | Hermes |
|---------|----------|--------|
| Start daemon | Installed by onboard | `hermes gateway` (or systemd per your setup) |
| Multi-channel | One gateway, many channels | One gateway, 18+ platforms |
| Config | `openclaw.json` + wizard | Hermes config under `~/.hermes/` |
| Chat commands | `/status`, `/new`, `/restart`, … | Hermes TUI + channel-specific |

Recommendation: enable one channel (Telegram) on both for comparison, then expand. Running both gateways on the same bot token will conflict — use separate bots or run one at a time.

Part 9 — Models: cloud vs local

OpenClaw + Ollama (this repo’s pattern)

ollama pull gemma4:e2b
# Configure in openclaw.json — see openclaw-gemma-rag/config/
openclaw gateway restart

Hermes + local model

Configure via hermes model or the provider section of the Hermes docs. Cloud APIs remain the path of least resistance for tool-heavy tasks on modest hardware.

| Workload | Suggestion |
|----------|------------|
| Phone assistant, mostly chat | Cloud model on either stack |
| Private docs, RAG, homelab | OpenClaw + [Gemma RAG guide](https://ayush7614.github.io/agentic-ai-ecosystem/guides/openclaw-gemma-rag/) |
| Heavy browser automation | Hermes with sandbox backend (Modal/Daytona) or skip browser on small VPS |

Part 10 — Memory & self-improvement

| | OpenClaw | Hermes |
|---|----------|--------|
| **Session history** | Session tools (`sessions_history`, etc.) | Built-in session + TUI history |
| **Long-term memory** | Workspace files + operator-managed | Memory layer + ecosystem plugins (honcho, hindsight, plur) |
| **Automatic skill growth** | No | **Yes** — core differentiator |
| **Quality control** | Manual review, `openclaw skills verify` | **Curator** automated |

Choose Hermes when you want the agent to accumulate procedural memory. Choose OpenClaw when you want predictable, operator-controlled skill sets from ClawHub.

Part 11 — Migrate OpenClaw → Hermes

Hermes ships a native migration path:

hermes claw migrate

This imports OpenClaw workspace layout, channel configuration, and compatible skills where possible.

After migration:

hermes doctor
hermes claw migrate --help # inspect flags
# Compare cron + channel config manually
hermes gateway

Community fallback for older Hermes versions: openclaw-to-hermes.

Side-by-side cutover (recommended for production personal assistants):

Migrate with hermes claw migrate
Run Hermes gateway on a new Telegram bot
Keep OpenClaw on the old bot until Hermes passes your test checklist
Switch DNS/webhooks if applicable
Decommission OpenClaw daemon when satisfied

Part 12 — Security comparison

| Risk | OpenClaw mitigation | Hermes mitigation |
|------|---------------------|-------------------|
| Malicious skill | `openclaw skills verify`, review scripts | Review `SKILL.md` + scripts before enabling |
| Shell/RCE | Docker sandbox (docs strongly recommend) | Remote sandboxes, minimal VPS install (`--skip-browser`) |
| Open gateway | Local bind, auth tokens | `hermes doctor`, pairing/allowlists |
| Prompt injection via chat | Model choice, tool allowlists | Same — use strongest model available |

Rule for both: skills are code. Treat ClawHub and awesome-hermes-agent entries as untrusted until reviewed.

Part 13 — Run both side by side (this repo)

From the repo root:

cd guides/hermes-vs-openclaw
chmod +x verify-comparison.sh
./verify-comparison.sh

Optional full stack:

| Terminal | Command |
|----------|---------|
| A | Start RAG API per [qwen-agentic-rag](https://ayush7614.github.io/agentic-ai-ecosystem/guides/qwen-agentic-rag/) |
| B | `openclaw gateway` (messaging assistant) |
| C | `hermes --tui` (compare tool/skill behavior) |

OpenClaw consumes RAG via the agentic-rag skill. Hermes can call the same HTTP API via a custom skill or MCP wrapper.

Part 14 — Decision guide

| Profile | Pick |
|---------|------|
| Indie hacker, Telegram/WhatsApp only, loves npm | **OpenClaw** |
| ML researcher, multi-agent, Nous ecosystem | **Hermes** |
| Existing OpenClaw user, curious about learning loop | **Hermes** via `hermes claw migrate` |
| Need reproducible skill catalog, not auto-writes | **OpenClaw** + ClawHub |
| Building on this repo's RAG guides | **OpenClaw** primary; Hermes optional second runtime |

You can also run OpenClaw for channels and Hermes for batch/cron evolution against the same RAG API — they are not mutually exclusive at the API layer.

Part 15 — Troubleshooting

| Symptom | OpenClaw fix | Hermes fix |
|---------|--------------|------------|
| CLI not found | `npm i -g openclaw@latest`; check `node -v` | `source ~/.zshrc`; re-run installer |
| Doctor fails | Re-run `openclaw onboard` | `hermes setup --portal` |
| Gateway won't start | `openclaw gateway restart`; check port 18789 | `hermes doctor`; check channel tokens |
| Skills missing | `openclaw skills list`; workspace path | `ls ~/.hermes/skills`; new session |
| Node too old | nvm install 22; [`use-node22.sh`](https://github.com/Ayush7614/agentic-ai-ecosystem/blob/main/guides/openclaw-gemma-rag/use-node22.sh) | N/A |
| Migration incomplete | — | `hermes claw migrate`; compare cron/channels; try [openclaw-to-hermes](https://github.com/0xNyk/openclaw-to-hermes) |
| Both fight for Telegram | Use two bot tokens | Use two bot tokens |

Summary

| Dimension | Winner (typical) |
|-----------|------------------|
| Channel polish + dashboard | OpenClaw |
| Self-improving skills | Hermes |
| npm / TypeScript ecosystem | OpenClaw |
| Multi-agent + research tooling | Hermes |
| Local Gemma + RAG (this repo) | OpenClaw |
| OpenClaw → Hermes migration | Hermes (`hermes claw migrate`) |

Next steps:

Deep dive OpenClaw: openclaw-gemma-rag tutorial
Deep dive Hermes: awesome-hermes-agent tutorial
Feature reference: feature matrix

Real-World Recommendations

Choose OpenClaw if you need:

Telegram and WhatsApp assistants
Multi-agent orchestration
Team-based agent systems
Mature skill marketplaces
Channel-centric workflows

Choose Hermes if you need:

Research automation
Self-improving workflows
Personal knowledge systems
Daily reports and recurring tasks
VPS-friendly automation

Ecosystem and Community

OpenClaw currently has the stronger ecosystem.

ClawHub gives users access to a growing catalog of reusable skills.

Documentation is mature.

Community content is abundant.

Hermes is newer and more experimental.

The ecosystem is smaller, but the pace of innovation is significantly faster.

OpenClaw wins on maturity.

Hermes wins on direction.

Ecosystem Verdict

Winner Today: OpenClaw

Most Interesting Future: Hermes

Final Verdict

The most common mistake is treating OpenClaw and Hermes as direct competitors.

They solve adjacent problems.

OpenClaw is an operating system for agents.

Hermes is an operating system for learning.

If your challenge is coordinating agents across channels, OpenClaw remains the strongest choice.

If your challenge is building an assistant that improves through repetition, Hermes is the more compelling platform.

For most developers building chat-based assistants today, I would recommend OpenClaw.

For researchers, automation enthusiasts, and anyone interested in procedural memory, I would recommend Hermes.

Both are excellent.

The better question is not which one is best.

The better question is what kind of agent you want to build.

Git for Agent Memory: Why You Should Treat Hermes Skills Like Code

TechLatest — Mon, 08 Jun 2026 14:29:53 +0000

Go from zero to a productive Hermes Agent setup with community skills, optional GUI, messaging gateway, and a map of the full ecosystem.

Based on awesome-hermes-agent (last reviewed 2026-05-06, Hermes v0.12.0 “The Curator release”).

What you’ll build

Hermes Agent CLI on your machine
LLM provider + Tool Gateway configured
Starter skills from the ecosystem
Verification scripts for your team
Full coverage of Skills & Plugins, Tools & Utilities, Integrations & Bridges, and Multi-Agent & Swarms

OpenClaw: AI Agent Automation Stack

The OpenClaw VM is a pre-configured cloud image for deploying autonomous AI agents in minutes. It ships with OpenClaw, Ollama, and all required dependencies pre-installed, so there is no manual setup. It is available on AWS, Azure, and Google Cloud in both CPU and GPU configurations. Teams can run system-level AI automation in an isolated cloud environment without exposing local machines, whether they are building AI workflows, testing agentic applications, or running local LLMs.

Part 1 — Install Hermes Agent

macOS / Linux / WSL2 / Termux

curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
source ~/.zshrc # or source ~/.bashrc

Headless VPS (skip browser deps):

curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash -s -- --skip-browser

Windows (PowerShell)

iex (irm https://hermes-agent.nousresearch.com/install.ps1)

Or use the Hermes Desktop installer on macOS/Windows.

Verify from this guide

cd guides/awesome-hermes-agent
chmod +x verify-install.sh
./verify-install.sh

Expected: hermes on PATH, hermes doctor clean or with fixable warnings.

Config lives under ~/.hermes/ (Windows: %LOCALAPPDATA%\hermes).

Part 2 — Choose a provider

Easiest: Nous Portal (recommended for first run)

One OAuth flow — models + Tool Gateway (search, images, TTS, browser):

hermes setup --portal

Interactive picker

hermes model

Bring your own keys

Copy reference keys:

cp .env.example .env
# Edit .env — then configure via:
hermes config set

Ollama (local): set an OpenAI-compatible base URL through hermes model, or follow the configuration docs.

Docs: Configuration · Nous Portal

Part 3 — First conversation

hermes --tui # modern TUI (recommended)
# or
hermes # classic CLI

Try:

“What tools do you have enabled?”
“Create a skill for how I like commit messages formatted.”
hermes --continue — resume last session

Quick reference:

| Command | Purpose |
|---------|---------|
| `hermes` | Chat |
| `hermes doctor` | Diagnose |
| `hermes update` | Upgrade |
| `hermes tools` | Enable/disable tools per platform |
| `hermes gateway` | Start messaging bridge |

Part 4 — Skills & Plugins

Hermes creates skills from experience and maintains them via the Curator (v0.12+). Plugins extend core tools (search, memory, shell compression). Together they are procedural + operational memory.

4.1 — Install skills layer

chmod +x install-ecosystem.sh install-starter-pack.sh
./install-ecosystem.sh skills
# or lightweight starter only:
./install-starter-pack.sh

| Skill | Tag | Install path | Why |
|-------|-----|--------------|-----|
| [wondelai/skills](https://github.com/wondelai/skills) | production | `~/.hermes/skills/wondelai-skills` | 380+ cross-platform skills |
| [litprog-skill](https://github.com/tlehman/litprog-skill) | beta | `~/.hermes/skills/litprog-skill` | Literate programming |
| [youtube-skills](https://github.com/therohitdas/youtube-skills) | production | `~/.hermes/skills/youtube-skills` | VPS-safe YouTube transcripts |
| [drawio-skill](https://github.com/Agents365-ai/drawio-skill) | production | `~/.hermes/skills/drawio-skill` | NL → architecture diagrams |
| [Anthropic-Cybersecurity-Skills](https://github.com/mukul975/Anthropic-Cybersecurity-Skills) | production | optional clone | 753+ MITRE security skills (large) |
| [open-design](https://github.com/nexu-io/open-design) | production | per repo README | 31 design skills, 129 design systems |
| [hermes-skill-factory](https://github.com/Romanescu11/hermes-skill-factory) | beta | skill folder | Auto-generate skills from workflows |
| [hermes-incident-commander](https://github.com/Lethe044/hermes-incident-commander) | beta | skill folder | Autonomous SRE / self-healing |

4.2 — Install plugins layer

./install-ecosystem.sh plugins

Plugins clone to ~/.hermes/plugins/. Enable in Hermes config (see Plugins docs).

| Plugin | Tag | What it does |
|--------|-----|--------------|
| [hermes-web-search-plus](https://github.com/robbyczgw-cla/hermes-web-search-plus) | beta | Route search across Serper, Tavily, Exa |
| [rtk-hermes](https://github.com/ogallotti/rtk-hermes) | beta | Compress shell output 60–90% before LLM |
| [mnemo-hermes](https://github.com/hernanqwz/mnemo-hermes) | beta | pgvector semantic memory on Ollama |
| [Mnemosyne](https://github.com/AxDSan/Mnemosyne) | beta | Local hybrid search + knowledge graph |
| [hermes-curator-evolver](https://github.com/pingchesu/hermes-curator-evolver) | beta | Evidence-driven Curator companion |
| [plur](https://github.com/plur-ai/plur) | beta | Portable shared memory (YAML engrams) |
| [hermes-payguard](https://github.com/nativ3ai/hermes-payguard) | experimental | USDC / x402 payments with limits |
| [agent-analytics-hermes-plugin](https://github.com/Agent-Analytics/agent-analytics-hermes-plugin) | beta | Signals analytics dashboard tab |

4.3 — Curator + skill evolution

Built-in Curator (v0.12+) grades, consolidates, and prunes skills every 7 days. Pair with:

| Tool | Tag | Role |
|------|-----|------|
| Built-in Curator | production | Automatic skill library maintenance |
| [SkillClaw](https://github.com/AMAP-ML/SkillClaw) | production | Evolve/dedupe skills from session data |
| [hermes-dojo](https://github.com/Yonkoo11/hermes-dojo) | beta | Find weak skills, auto-iterate |
| [hermes-agent-self-evolution](https://github.com/NousResearch/hermes-agent-self-evolution) | official | DSPy/GEPA prompt evolution |

Verify skills load:

ls ~/.hermes/skills/
hermes --tui
# Ask: "What skills are available? Try /skill-name if configured."

Part 5 — Tools & Utilities

GUIs, linters, browsers, and operator utilities that sit beside the CLI — not replacements.

./install-ecosystem.sh tools

Clones to ~/.hermes/ecosystem-tools/. Follow each repo's README for npm install, pip install, or Docker.

5.1 — GUI dashboards

| Tool | Tag | Best for | Install notes |
|------|-----|----------|---------------|
| [hermes-workspace](https://github.com/outsourc-e/hermes-workspace) | production | Chat + terminal + skills manager | Nous Hackathon winner; Hermes-native |
| [mission-control](https://github.com/builderz-labs/mission-control) | production | Fleet, tasks, cost tracking | SQLite self-hosted dashboard |
| [hermes-web-ui](https://github.com/EKKOLearnAI/hermes-web-ui) | production | Token/cost analytics, cron, 8 channels | Vue 3 + BFF |
| [hermes-ui](https://github.com/pyrate-llama/hermes-ui) | beta | Single-file glassmorphic UI | Python proxy on :3333 |
| [hermes-desktop](https://github.com/dodo-reach/hermes-desktop) | beta | Native macOS workspace | Direct SSH to host |

Example — hermes-workspace:

cd ~/.hermes/ecosystem-tools/hermes-workspace
# Follow README: typically pnpm install && pnpm dev
# Point at your local Hermes gateway / CLI socket

5.2 — Operator & quality utilities

| Tool | Tag | Role |
|------|-----|------|
| [SkillClaw](https://github.com/AMAP-ML/SkillClaw) | production | `skillclaw doctor hermes` — skill health |
| [lintlang](https://github.com/roli-lpci/lintlang) | beta | Lint prompts/configs (HERM v1.1 score) |
| [agenttrace](https://github.com/luoyuctl/agenttrace) | beta | Post-run session audit TUI |
| [Clarvia](https://github.com/clarvia-project/clarvia) | production | Score MCP servers for agent-readiness |
| [flowstate-qmd](https://github.com/amanning3390/flowstate-qmd) | beta | Anticipatory memory / pre-fetch RAG |

5.3 — Browser & headless tooling

| Tool | Tag | When to use |
|------|-----|-------------|
| [camofox-browser](https://github.com/jo-inc/camofox-browser) | production | VPS blocked by Cloudflare — stealth headless API |
| [vessel-browser](https://github.com/unmodeled-tyler/vessel-browser) | experimental | Full AI-native Linux browser |
| Built-in Playwright | production | Default; skip with `--skip-browser` on install |

5.4 — Deployment utilities

| Tool | Tag | Notes |
|------|-----|-------|
| [hermes-agent-docker](https://github.com/xmbshwll/hermes-agent-docker) | beta | Minimal sandbox image |
| [nix-hermes-agent](https://github.com/0xrsydn/nix-hermes-agent) | beta | Reproducible NixOS module |
| [evey-setup](https://github.com/42-evey/evey-setup) | beta | One-command stack + 29 plugins |
| [openclaw-to-hermes](https://github.com/0xNyk/openclaw-to-hermes) | beta | Migration helper |

Part 6 — Integrations & Bridges

Connect Hermes to memory backends, MCP servers, productivity suites, and other agents.

./install-ecosystem.sh integrations

6.1 — MCP integration pattern

Add server block to Hermes MCP config (see MCP docs)
Restart session; verify with hermes tools or ask Hermes to list MCP tools
Score servers with Clarvia before trusting production workflows

| MCP / integration | Tag | Surface |
|-------------------|-----|---------|
| [MeiGen-AI-Design-MCP](https://github.com/jau123/MeiGen-AI-Design-MCP) | production | Image/video gen (9 models) |
| [mistral-mcp](https://github.com/Swih/mistral-mcp) | beta | OCR, audio, Codestral FIM, agents |
| [Not Human Search](https://github.com/unitedideas/not-human-search) | production | Discover 8,600+ MCP servers |
| [Global Chat](https://github.com/pumanitro/Global-Chat) | production | Cross-protocol agent discovery |
| [hermes-blockchain-oracle](https://github.com/gizdusum/hermes-blockchain-oracle) | experimental | Solana on-chain data |
| [hermes-council](https://github.com/Ridwannurudeen/hermes-council) | experimental | Adversarial multi-perspective debate |

Example MCP config snippet (adjust paths after clone):

# Reference only — merge into your Hermes MCP settings
mcp_servers:
  meigen-design:
    command: node
    args: ["~/.hermes/ecosystem-tools/MeiGen-AI-Design-MCP/dist/index.js"]

6.2 — Memory bridges

| Integration | Tag | Pattern |
|-------------|-----|---------|
| [hindsight](https://github.com/vectorize-io/hindsight) | production | retain / recall / reflect over long history |
| [honcho-self-hosted](https://github.com/elkimek/honcho-self-hosted) | beta | Self-hosted Honcho user modeling |
| [yantrikdb-hermes-plugin](https://github.com/yantrikos/yantrikdb-hermes-plugin) | beta | Rust backend with explainable recall |
| [plur](https://github.com/plur-ai/plur) | beta | Portable YAML engram memory |

Memory hygiene: keep USER.md / MEMORY.md concise; let Curator prune stale skills.

6.3 — Productivity & device bridges

| Integration | Tag | Connects |
|-------------|-----|----------|
| [microsoft-workspace-skill](https://github.com/Andrew-Girgis/microsoft-workspace-skill) | beta | Outlook / M365 via Graph API |
| [hermes-nextcloud](https://github.com/adnw-vinc/hermes-nextcloud) | beta | WebDAV, Notes, CalDAV, CardDAV |
| [hermes-android](https://github.com/raulvidis/hermes-android) | beta | Android device control |
| [agent-android](https://github.com/AIVaneLabs/agent-android) | beta | LAN Android over WiFi |
| [hermes-spotify-skill](https://github.com/Alexeyisme/hermes-spotify-skill) | beta | Headless Linux / Raspberry Pi Spotify |
| [clawsocial-hermes-plugin](https://github.com/mrpeter2025/clawsocial-hermes-plugin) | beta | Social discovery network |

6.4 — Cross-agent bridges

| Bridge | Tag | Handoff |
|--------|-----|---------|
| [evey-bridge-plugin](https://github.com/42-evey/evey-bridge-plugin) | beta | Claude Code ↔ Hermes context share |
| [hermes-agent-acp-skill](https://github.com/Rainhoole/hermes-agent-acp-skill) | beta | Route subtasks to Codex / Claude Code |
| [zouroboros-swarm-executors](https://github.com/marlandoj/zouroboros-swarm-executors) | experimental | Local executor bridge for Claude + Hermes |

Part 7 — Multi-Agent & Swarms

When one Hermes session is not enough: orchestration, delegation, and fleet visibility.

./install-ecosystem.sh multiagent

7.1 — oh-my-hermes (orchestration skills)

| Skill | Purpose |
|-------|---------|
| `deep-research` | Multi-step research pipeline |
| `deep-interview` | Structured requirements gathering |
| `ralplan` | Planner → Architect → Critic consensus |
| `ralph` | Verified execution loop: execute → verify → iterate |
| `triage` | Prioritize incoming work |
| `autopilot` | End-to-end dispatcher playbook |

Install: included in ./install-ecosystem.sh multiagent → ~/.hermes/skills/oh-my-hermes/

7.2 — Specialized agent packs

| Project | Tag | Agents |
|---------|-----|--------|
| [opencode-hermes-multiagent](https://github.com/1ilkhamov/opencode-hermes-multiagent) | beta | 17 role-specialized OpenCode agents |
| [bigiron](https://github.com/supermodeltools/bigiron) | beta | SDLC crew + Supermodel code graph |
| [hermes-plugins](https://github.com/42-evey/hermes-plugins) | beta | Inter-agent bridge between Hermes instances |

7.3 — Fleet dashboards

Pair multi-agent skills with mission-control (Part 5) for:

Task dispatch across agents
Cost tracking per session
SQLite-backed job history

cd ~/.hermes/ecosystem-tools/mission-control
# Follow upstream README for self-hosted deploy

7.4 — Experimental swarms

| Project | Tag | Idea |
|---------|-----|------|
| [Ankh.md](https://github.com/Abruptive/Ankh.md) | experimental | TAW Agent × Hermes swarm framework |
| [gladiator](https://github.com/runtimenoteslabs/gladiator) | experimental | Competing autonomous agent companies |
| [NemoHermes](https://github.com/Hmbown/NemoHermes) | experimental | NVIDIA Spark GPU routing |

7.5 — When to use multi-agent

| Scenario | Use |
|----------|-----|
| Single repo, one developer | Hermes CLI + skills |
| Research → plan → execute chain | oh-my-hermes `ralplan` + `ralph` |
| Best tool per subtask | `hermes-agent-acp-skill` |
| Many agents, cost visibility | mission-control + cron |
| Claude Code already in workflow | evey-bridge + ACP skill |

Part 8 — Messaging gateway (optional)

Hermes ships 18 built-in platforms: Telegram, Discord, Slack, WhatsApp, Signal, Feishu/Lark, WeCom, QQBot, Yuanbao, and more. Microsoft Teams is available via a plugin.

hermes gateway

Configure tokens via hermes setup or config — see Messaging Gateway docs.

Security: keep DM pairing/allowlists on until you trust exposure. Run hermes doctor after gateway changes.

Migrating from OpenClaw

hermes claw migrate

Community fallback: openclaw-to-hermes (older Hermes versions).

Part 9 — Deployment & cron

| Method | Tag | Notes |
|--------|-----|-------|
| Local / `$5 VPS` | — | Default; use `--skip-browser` on headless |
| `hermes-agent-docker` | beta | Minimal sandbox image |
| `nix-hermes-agent` | beta | Reproducible NixOS |
| Modal / Daytona / Vercel Sandbox | — | Serverless terminal backends (built into Hermes) |
| `evey-setup` | beta | Opinionated stack + 29 plugins |

Cron jobs for autonomous loops:

hermes cron # see docs for scheduling nightly evolution, monitoring, etc.

Part 10 — Level-up blueprints

Opinionated bundles from awesome-hermes-agent:

Memory that compounds

Built-in memory → honcho-self-hosted → hindsight → plur (portable engrams) → flowstate-qmd (anticipatory RAG).

Self-improvement without drift

hermes-agent-self-evolution + scheduled regression + lintlang + second evaluation pass.

Operator cockpit

hermes-workspace daily UI + mission-control for fleet/costs.

Multi-agent execution

hermes-agent-acp-skill (route to Codex/Claude Code) + oh-my-hermes + opencode-hermes-multiagent.

Paperclip-managed ops

hermes-paperclip-adapter + cron + dashboard for governed autonomous work.

Full resource list: ecosystem catalog.

Part 11 — End-to-end test

Run the full ecosystem stack:

./verify-install.sh
./install-ecosystem.sh all # or layer by layer: skills, plugins, tools, integrations, multiagent
hermes doctor
hermes --tui

In TUI, verify each layer:

Skills — “List skills in ~/.hermes/skills.”
Plugins — “Which plugins are enabled?”
Tools — open hermes-workspace or mission-control if installed
Integrations — “List MCP tools available.”
Multi-agent — “Use oh-my-hermes triage on this task.”

hermes update

Optional: hermes gateway + Telegram message test.

Troubleshooting

| Symptom | Fix |
|---------|-----|
| `hermes: command not found` | `source ~/.zshrc` or re-run installer |
| Doctor fails on provider | `hermes setup --portal` or `hermes model` |
| YouTube transcripts fail on VPS | Install `youtube-skills` (cloud IP blocked by default) |
| Browser tools OOM on small VPS | Install with `--skip-browser`; use `camofox-browser` plugin |
| Skills not visible | Confirm `SKILL.md` in `~/.hermes/skills/<name>/`; restart session |
| Plugins not loading | `./install-ecosystem.sh plugins`; enable in Hermes config |
| Ecosystem clone failed | Check `git`; retry one layer: `./install-ecosystem.sh skills` |
| MCP tools missing | Add server to Hermes MCP config; restart session |
| Multi-agent handoff fails | Install `hermes-agent-acp-skill`; verify delegate agent installed |
| GUI tool won't start | `cd ~/.hermes/ecosystem-tools/<name>` and follow repo README |
| OpenClaw migration gaps | `hermes claw migrate` then compare cron + channel config |

What’s next

Browse the ecosystem catalog by category
Join Nous Discord
Star NousResearch/hermes-agent and awesome-hermes-agent
Contribute new ecosystem entries via awesome-hermes-agent PRs

Summary

| Step | Command / artifact |
|------|---------------------|
| Install | `curl … install.sh \| bash` |
| Provider | `hermes setup --portal` |
| Verify | `./verify-install.sh` |
| Chat | `hermes --tui` |
| Skills & plugins | `./install-ecosystem.sh skills` + `plugins` |
| Tools & utilities | `./install-ecosystem.sh tools` |
| Integrations | `./install-ecosystem.sh integrations` |
| Multi-agent | `./install-ecosystem.sh multiagent` |
| Full stack | `./install-ecosystem.sh all` |
| Catalog | [ecosystem catalog](https://ayush7614.github.io/agentic-ai-ecosystem/guides/awesome-hermes-agent/ecosystem/) |
| Gateway | `hermes gateway` |

Thank you so much for reading

Like | Follow | Subscribe to the newsletter.

Catch us on

LinkedIn: https://www.linkedin.com/in/techlatest-net/

Commands vs Skills vs Agents in Claude Code — What Goes Where

TechLatest — Fri, 05 Jun 2026 14:19:03 +0000

Configure Claude Code so it knows your stack, follows your conventions, runs repeatable workflows, and delegates to specialists — without repeating yourself every session.

What you’ll build

A production-style Claude Code project layout:

CLAUDE.md — team instructions (committed)
CLAUDE.local.md — personal overrides (gitignored)
.claude/settings.json — permissions and environment (committed)
.claude/rules/ — modular instruction files
.claude/skills/ — slash commands and auto-invoked workflows
.claude/agents/ — isolated subagent personas

Everything Claude needs about your project lives in one place — commit .claude/ to git so the whole team shares it.

Tool stack

| Tool | Role |
|------|------|
| **Claude Code** | CLI agent with tools, memory, skills, subagents |
| **`CLAUDE.md`** | Project memory loaded at session start |
| **`.claude/settings.json`** | Permissions, hooks, env vars |
| **Skills** | Reusable prompts — manual `/name` or automatic |
| **Agents** | Focused sub-sessions with their own tools |
| **Rules** | Path-scoped or global instruction modules |

Layers

| Layer | Always on? | Trigger |
|-------|------------|---------|
| `CLAUDE.md` + `rules/` | Yes — every session | Automatic |
| `settings.json` | Yes — gates tool use | Automatic |
| Skills | On demand or auto | `/project:name` or model decides |
| Agents | On demand | User delegates or Claude spawns |
| Hooks | Yes — around tool calls | `settings.json` → `hooks` |

Session workflow

Developer runs claude in a configured repo
Memory + rules + permissions load automatically
Skills and agents handle specialized work on demand
Team shares the same .claude/ tree via git

Prerequisites

| Requirement | Check |
|-------------|--------|
| Claude Code installed | `claude --version` |
| A git repository | `git status` |
| Terminal access to your project | — |

Install Claude Code: code.claude.com.

Part 1 — Understand the layout

Project root

your-project/
├── CLAUDE.md # Team instructions (committed)
├── CLAUDE.local.md # Personal overrides (gitignored)
└── .claude/
    ├── settings.json # Permissions + config (committed)
    ├── settings.local.json # Personal permissions (gitignored)
    ├── rules/ # Modular instruction files
    ├── skills/ # Workflows with SKILL.md
    ├── commands/ # Legacy single-file skills (optional)
    └── agents/ # Subagent definitions

Global home directory

Claude also reads ~/.claude/ (all projects):

~/.claude/
├── CLAUDE.md # Your global defaults
├── settings.json # Global permissions
├── skills/
├── agents/
└── rules/

Rule of thumb: commit project files; keep *.local.* and CLAUDE.local.md personal.

Part 2 — Bootstrap from this guide’s template

From the ecosystem repo:

cd guides/claude-code-dot-claude
chmod +x install-template.sh
./install-template.sh ~/projects/my-app

The script copies:

CLAUDE.md
Full .claude/ tree (settings, rules, skills, agents, legacy commands/)
CLAUDE.local.md and settings.local.json from examples
Gitignore lines for local files

Verify:

cd ~/projects/my-app
tree -a .claude CLAUDE.md 2>/dev/null || find .claude CLAUDE.md -maxdepth 3
claude

Part 3 — CLAUDE.md (team memory)

CLAUDE.md holds the house rules and is loaded at the start of every session. Keep it short: stack, workflow, and pointers to deeper rules.

Example (from template/CLAUDE.md):

# Project instructions for Claude Code

## Stack
- Python 3.10+

## Workflow
1. Read files before editing.
2. Run `pytest -q` before claiming done.
3. Use `/project:code-review` before opening a PR.

## Agents
- **code-reviewer** — diff review
- **security-auditor** — auth and secrets

Bootstrap with /init

In an existing repo without CLAUDE.md:

claude
/init

Claude scans the repo and drafts a starter file. Edit it — /init is a starting point, not gospel.

Personal overrides — CLAUDE.local.md

Create at project root (gitignored):

# Personal overrides
- Prefer concise answers.
- My API base URL: http://localhost:8080

Claude applies the local file on top of the team instructions. Avoid putting secrets here, since the file could leak; use environment variables instead.

Part 4 — settings.json (permissions)

Permissions control which tools Claude can run without asking every time.

template/.claude/settings.json:

{
  "permissions": {
    "allow": [
      "Read",
      "Edit",
      "Glob",
      "Grep",
      "Bash(pytest *)",
      "Bash(python *)",
      "Bash(git status)",
      "Bash(git diff *)"
    ],
    "deny": [
      "Bash(curl *)",
      "Read(.env)",
      "Read(**/secrets/**)"
    ]
  },
  "env": {
    "PYTHONDONTWRITEBYTECODE": "1"
  }
}

Personal allows — settings.local.json

{
  "permissions": {
    "allow": [
      "Bash(docker *)",
      "WebFetch"
    ]
  }
}

Manage interactively: /permissions inside Claude Code.

Docs: Claude Code settings.

Part 5 — rules/ (modular instructions)

Split large CLAUDE.md files into focused modules under .claude/rules/.

| File | Purpose |
|------|---------|
| `code-style.md` | Naming, line length, types |
| `testing.md` | pytest conventions |
| `api-conventions.md` | REST shape, status codes |

Rules can be path-scoped in frontmatter (Claude Code loads relevant rules based on files being edited):

---
paths:
  - "src/api/**"
  - "**/controllers/**"
---

# API conventions
...

Start with 2–3 rules. Add more when Claude repeatedly makes the same mistake.
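To build intuition for path scoping, here is a rough Python approximation of how `paths` globs could select rule files for a given source file. It is only a sketch: the rule table is hypothetical (the `tests/**` glob and the "no `paths` means global" convention are assumptions), and Python's `fnmatch` is not necessarily the matcher Claude Code uses.

```python
from fnmatch import fnmatch

# Hypothetical rule table: rule file -> `paths` globs from its frontmatter.
# An empty list stands for a rule without `paths`, treated here as global.
RULES = {
    "api-conventions.md": ["src/api/**", "**/controllers/**"],
    "testing.md": ["tests/**"],
    "code-style.md": [],
}


def rules_for(path: str) -> list[str]:
    """Return the rule files whose globs match `path` (approximation only)."""
    selected = []
    for rule, globs in RULES.items():
        # fnmatch lets `*` cross `/`, so `**` behaves like "anything" here.
        if not globs or any(fnmatch(path, glob) for glob in globs):
            selected.append(rule)
    return sorted(selected)


print(rules_for("src/api/users.py"))
print(rules_for("app/controllers/auth.py"))
print(rules_for("README.md"))
```

A quick table like this can help when deciding whether a new rule should be global or scoped to a directory.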

Part 6 — Skills and commands (workflows)

Skills are reusable workflows invoked as /project:skill-name or auto-invoked when Claude decides they're relevant.

Canonical location: skills/

.claude/skills/
├── code-review/
│ └── SKILL.md
├── deploy/
│ └── SKILL.md
└── fix-issue/
    └── SKILL.md

template/.claude/skills/code-review/SKILL.md:

---
name: code-review
description: Structured code review before PRs. Use when the user asks for review.
---

# Code review
1. Check correctness, security, tests, style.
2. Output: Summary → Findings → Verdict.

Legacy: commands/

.claude/commands/review.md still creates /project:review — same mechanism as skills, fewer features. New work should go in skills/.

| Feature | `commands/*.md` | `skills/*/SKILL.md` |
|---------|-----------------|---------------------|
| Slash invocation | ✓ | ✓ |
| Supporting files in folder | ✗ | ✓ |
| Auto-invocation | Limited | ✓ (via `description`) |
| `disable-model-invocation` | ✗ | ✓ |

Docs: Extend Claude with skills.

Test a skill

/project:code-review
/project:deploy
/project:fix-issue 42

List skills: /skills.
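Before testing skills interactively, a small script can catch the most common layout mistakes: a missing `SKILL.md`, a `name` that does not match its folder, or an empty `description`. The sketch below is an assumption-laden helper, not part of Claude Code. It parses only simple `key: value` frontmatter lines rather than full YAML, and the function names are made up for illustration.

```python
import tempfile
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` lines between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing fence missing: treat the frontmatter as invalid


def check_skills(skills_dir: Path) -> list[str]:
    """Return a list of human-readable problems found under skills_dir."""
    problems = []
    for folder in sorted(p for p in skills_dir.iterdir() if p.is_dir()):
        skill_file = folder / "SKILL.md"
        if not skill_file.is_file():
            problems.append(f"{folder.name}: missing SKILL.md")
            continue
        meta = read_frontmatter(skill_file.read_text(encoding="utf-8"))
        if meta.get("name") != folder.name:
            problems.append(f"{folder.name}: name does not match folder")
        if not meta.get("description"):
            problems.append(f"{folder.name}: empty description")
    return problems


# Demo against a throwaway directory instead of a real .claude/skills tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "code-review").mkdir()
    (root / "code-review" / "SKILL.md").write_text(
        "---\nname: code-review\ndescription: Structured code review.\n---\n",
        encoding="utf-8",
    )
    (root / "deploy").mkdir()  # no SKILL.md on purpose
    print(check_skills(root))
```

In a real project you would call `check_skills(Path(".claude/skills"))` and fail the build when the list is not empty.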

Part 7 — agents/ (subagents)

Agents are specialist personas with isolated context and optional tool restrictions.

template/.claude/agents/code-reviewer.md:

---
name: code-reviewer
description: Diff review, naming, tests. Use for PR review.
tools: Read, Glob, Grep, Bash(git diff *)
---

You are a senior engineer doing code review...

template/.claude/agents/security-auditor.md — focused on secrets, injection, auth.

Invoke:

Use the code-reviewer agent on my staged changes.

Or let Claude delegate when the task matches the agent description.

Docs: Subagents (Claude Code docs).

Part 8 — Hooks (optional, deterministic)

Hooks run scripts before or after tool calls — unlike skills, they fire every time.

Add to settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/block-dangerous.sh"
          }
        ]
      }
    ]
  }
}

Use hooks for: block rm -rf, format on save, audit logging. Use skills for: judgment-heavy workflows.
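As an illustration of the "block rm -rf" case, here is a minimal sketch of the decision logic such a hook could use, written in Python rather than the shell script named above. It assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input.command`, and that exit code 2 blocks the call with stderr passed back to Claude. Confirm both against the hooks documentation for your Claude Code version. The matching heuristic is deliberately simple and will not catch every dangerous command.

```python
import json
import shlex
import sys


def is_dangerous(command: str) -> bool:
    """Heuristic: flag `rm` invoked with both recursive and force options."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unbalanced quotes: refuse rather than guess
    for i, token in enumerate(tokens):
        if token != "rm":
            continue
        rest = tokens[i + 1:]
        short = "".join(
            t[1:] for t in rest if t.startswith("-") and not t.startswith("--")
        )
        long_flags = {t for t in rest if t.startswith("--")}
        recursive = "r" in short or "R" in short or "--recursive" in long_flags
        force = "f" in short or "--force" in long_flags
        if recursive and force:
            return True
    return False


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, stderr_message) for one hook payload."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if is_dangerous(command):
        return 2, f"Blocked by hook: {command!r}"
    return 0, ""


def main() -> int:
    """Entry point for real hook use: read the payload from stdin."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code


# Demo with hand-written payloads instead of real stdin.
for cmd in ("git status", "rm -rf build", "rm notes.txt"):
    print(cmd, "->", decide({"tool_name": "Bash", "tool_input": {"command": cmd}}))
```

To use it as an actual hook, save it under a name of your choice (for example a hypothetical `.claude/hooks/block-dangerous.py`), replace the demo loop with `sys.exit(main())`, and point the `command` field in settings.json at it.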

Part 9 — Team git workflow

Commit (shared)

CLAUDE.md
.claude/settings.json
.claude/rules/
.claude/skills/
.claude/agents/
.claude/commands/ # if you still use legacy commands

Gitignore (personal)

CLAUDE.local.md
.claude/settings.local.json

Snippet in template/gitignore.snippet — the install script merges it.

PR checklist for .claude/ changes

No secrets in committed files
settings.json deny blocks .env and broad curl
Skill description fields are accurate (they drive auto-invocation)
New agents have minimal tools — principle of least privilege
Teammates run claude once to pick up new skills (live reload in session)
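The deny-list item in this checklist can be checked mechanically in CI. The following is a minimal sketch, assuming the `permissions.allow` / `permissions.deny` layout shown in Part 4. The specific rules it enforces (`REQUIRED_DENY`, `BROAD_ALLOW`) are illustrative policy choices, not anything Claude Code mandates.

```python
import json

# Illustrative policy: adjust both sets to your team's needs.
REQUIRED_DENY = {"Read(.env)", "Bash(curl *)"}
BROAD_ALLOW = {"Bash", "Bash(*)", "Bash(curl *)"}


def audit_settings(settings: dict) -> list[str]:
    """Return findings for a parsed .claude/settings.json document."""
    permissions = settings.get("permissions", {})
    allow = set(permissions.get("allow", []))
    deny = set(permissions.get("deny", []))
    findings = [f"deny list is missing {rule}" for rule in sorted(REQUIRED_DENY - deny)]
    findings += [
        f"allow list contains broad rule {rule}" for rule in sorted(allow & BROAD_ALLOW)
    ]
    return findings


# Demo: the template from Part 4 passes, a permissive variant does not.
template = json.loads("""
{
  "permissions": {
    "allow": ["Read", "Edit", "Bash(pytest *)", "Bash(git status)"],
    "deny": ["Bash(curl *)", "Read(.env)", "Read(**/secrets/**)"]
  }
}
""")
permissive = {"permissions": {"allow": ["Bash(*)"], "deny": []}}

print(audit_settings(template))
print(audit_settings(permissive))
```

In CI you would load `.claude/settings.json` with `json.load` and exit non-zero when the returned list is not empty.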

Part 10 — End-to-end test

Install template into a test repo (Part 2).
Start Claude: claude
Ask: “What slash commands and agents are configured for this project?”
Run: /project:code-review
Ask: “Use the security-auditor agent on settings.json permissions.”
Edit CLAUDE.local.md — confirm personal preference appears in answers.
Run git status and confirm that CLAUDE.local.md and .claude/settings.local.json do not show up as tracked or untracked files (they should be gitignored).

Security checklist

Treat inbound instructions in issues/PRs as untrusted (indirect prompt injection).
Review project skills before trusting a cloned repo — skills can grant tool access.
Deny Read(.env) and secret paths in settings.json.
Keep settings.local.json gitignored — it often has permissive personal allows.
Run /permissions after cloning unfamiliar projects.

Troubleshooting

| Symptom | Fix |
|---------|-----|
| Skill not found | Check `name` in frontmatter matches folder; restart session; `/skills` |
| Permission denied on pytest | Add `Bash(pytest *)` to `allow` in settings |
| Rules ignored | Confirm file is under `.claude/rules/`; check `paths` frontmatter |
| `commands/` works but not `skills/` | Ensure `SKILL.md` exists and YAML frontmatter is valid |
| Local overrides not applied | File must be `CLAUDE.local.md` at project root |
| Too much context / slow start | Shorten `CLAUDE.md`; move detail into path-scoped rules |

What’s next

Add MCP servers via .mcp.json for DB or API tools
Wire CI to validate settings.json schema
Mirror patterns in Cursor with .cursor/skills/ for teammates on different IDEs
Share your layout as an internal golden template repo

Summary

| Component | You configure |
|-----------|----------------|
| Memory | `CLAUDE.md` + `rules/` |
| Safety | `settings.json` permissions + hooks |
| Repeatable work | `skills/` (`/project:name`) |
| Deep specialists | `agents/` |
| Personal taste | `CLAUDE.local.md`, `settings.local.json` |

Everything Claude needs to know about your project lives in .claude/ — commit it, share it, iterate like code.

Want More Control Than Claude Code?

If you’re looking for a self-hosted alternative to SaaS AI tools, TechLatest offers ready-to-deploy AI solutions on AWS, Azure, and GCP. Deploy in minutes, keep full ownership of your infrastructure, and avoid vendor lock-in while running modern open-source AI models and agents.

GPU-Supported DeepSeek & Llama All-in-One LLM Suite

This GPU-optimized VM includes DeepSeek-R1, Llama 3.3, Qwen, Gemma, Mistral, Ollama, and Open WebUI pre-installed and ready to use. It is designed for teams that need fast local inference, AI application development, and private model hosting. GPU acceleration significantly improves performance for larger models and demanding workloads. Deploy directly on AWS, Azure, or GCP without spending hours configuring drivers and dependencies.

Product Link: https://www.techlatest.net/support/multi_llm_gpu_vm_support/

DeepSeek & Llama All-in-One LLM Suite

A cost-effective CPU-based deployment for organizations that want a private ChatGPT alternative without expensive AI subscriptions. The VM includes popular open-source models, Open WebUI, and Ollama, allowing users to interact through both APIs and a web interface. It is ideal for internal assistants, AI experimentation, model evaluation, and application development. Launch on AWS, Azure, or GCP and start using production-ready AI infrastructure within minutes.

Product Link: https://www.techlatest.net/support/multi_llm_vm_support/


Your AI on WhatsApp — Fully Local, Powered by Gemma

TechLatest — Thu, 04 Jun 2026 12:06:03 +0000

Build a personal AI assistant that answers on Telegram/WhatsApp/CLI using Gemma 4 E2B and delegates research-heavy questions to your local Agentic RAG API.

What you end up with

OpenClaw Gateway — always-on control plane (daemon)
gemma4:e2b — conversational model with tools + optional vision
agentic-rag skill — shells out to rag_query.sh → POST /predict on LitServe
qwen-agentic-rag — CrewAI Researcher + Writer + Qdrant (and optional Firecrawl)

This integration uses one Ollama model everywhere: gemma4:e2b for OpenClaw chat and for the CrewAI RAG agents.

Deploy OpenClaw Without the Setup Hassle

Want to skip the installation and configuration process? We provide a fully managed OpenClaw AI Agent Automation Stack on AWS, Azure, and Google Cloud, complete with OpenClaw, Ollama, dependencies, and optional GPU acceleration already configured. Simply launch the VM and start building AI agents, automation workflows, and local LLM applications immediately. The environment is optimized for performance, securely isolated from your local machine, and designed to get you from deployment to productivity in minutes.

Prerequisites

| Requirement | Check |
|-------------|--------|
| Node **22.12+** or **24** (OpenClaw will not run on Node 20) | `node -v` |
| Ollama | `ollama -v` |
| Python 3.10+ | `python3 --version` |
| curl + jq | `curl --version && jq --version` |

Part 1 — Agentic RAG API

If you already finished the Qwen Agentic RAG tutorial, start the server only:

ollama pull gemma4:e2b
cd guides/qwen-agentic-rag
source .venv/bin/activate
cp ../openclaw-gemma-rag/env.rag.example .env # sets OLLAMA_MODEL=ollama/gemma4:e2b
# First time only:
# pip install -r requirements.txt && python setup_vectordb.py
python server.py

Default URL: http://127.0.0.1:8001 (PORT in .env).

Verify:

python client.py --query "What is cross-validation?"
# or
curl -sS -X POST http://127.0.0.1:8001/predict \
  -H 'Content-Type: application/json' \
  -d '{"query":"What is cross-validation?"}' | jq -r .output

Keep this terminal open. The first crew run may take several minutes.

Part 2 — Pull Gemma 4 E2B

ollama pull gemma4:e2b
ollama run gemma4:e2b "Reply in one sentence: what is Gemma 4?"

Recommended sampling (Ollama may already apply defaults): temperature=1, top_p=0.95, top_k=64.

Part 3 — Install OpenClaw

Node version (required)

OpenClaw needs Node >= 22.12. If node -v shows v20, switch with nvm (you may already have 22 installed):

cd guides/openclaw-gemma-rag
source ./use-node22.sh # uses .nvmrc → 22.22.3
node -v # must be v22.12.0 or higher

Optional — make Node 22 the default in new terminals:

nvm alias default 22

Then install OpenClaw and run onboarding:

npm install -g openclaw@latest
openclaw onboard --install-daemon

Follow prompts for workspace, auth, and optional channels. See Getting started.

Set the primary model:

export OLLAMA_API_KEY="ollama-local"
openclaw models list --provider ollama
openclaw models set ollama/gemma4:e2b

Config snippet

Copy fields from config/openclaw.snippet.json5 in this guide into ~/.openclaw/openclaw.json.

Critical points:

baseUrl: http://127.0.0.1:11434 — no /v1 suffix
api: "ollama" — native tool calling
agents.defaults.model.primary: "ollama/gemma4:e2b"

Restart:

openclaw gateway restart
openclaw gateway status

Part 4 — Install the agentic-rag skill

From this guide directory:

cd guides/openclaw-gemma-rag
chmod +x install-skill.sh skills/agentic-rag/scripts/*.sh
./install-skill.sh

This copies to ~/.openclaw/workspace/skills/agentic-rag/.

Alternative (if your CLI supports it):

openclaw skills install ./guides/openclaw-gemma-rag/skills/agentic-rag --global

Enable in config:

{
  skills: {
    entries: {
      "agentic-rag": {
        enabled: true,
        env: { RAG_API_URL: "http://127.0.0.1:8001" },
      },
    },
  },
}

Optional allowlist so only this skill is injected:

{
  agents: {
    defaults: {
      skills: ["agentic-rag"],
    },
  },
}

Restart the gateway after skill or config changes.

Skill behavior

The skill teaches OpenClaw to run:

~/.openclaw/workspace/skills/agentic-rag/scripts/rag_query.sh "user question"

That POSTs to LitServe and prints the crew answer. The Gemma model decides when to use the skill; the RAG crew uses the same OLLAMA_MODEL=ollama/gemma4:e2b from guides/qwen-agentic-rag/.env (see env.rag.example).

Part 5 — End-to-end test

CLI (no channel)

openclaw agent --message "Using the agentic RAG knowledge base: explain cross-validation in 3 bullets." --thinking low

Watch the gateway logs — you should see an exec invoking rag_query.sh.

Manual script test

export RAG_API_URL=http://127.0.0.1:8001
./skills/agentic-rag/scripts/rag_query.sh "What is regularization?"

Health check

./skills/agentic-rag/scripts/rag_health.sh

Part 6 — Connect a channel (optional)

Example: Telegram

Create a bot via @BotFather
During openclaw onboard or openclaw configure, add the Telegram channel token
Keep DM pairing enabled (dmPolicy: "pairing") until you trust exposure
Approve yourself: openclaw pairing approve telegram

Send: “Search the ML FAQ: what is gradient descent?”

Flow: Telegram → Gateway → Gemma → agentic-rag skill → RAG API → reply on Telegram.

Channel docs: OpenClaw Channels.

Security checklist

Treat inbound DMs as untrusted — keep pairing on for production-adjacent setups
exec (used by the RAG skill) is powerful, so do not expose the gateway to the public internet without following the Security and Exposure runbook
Run openclaw doctor after config changes
RAG API binds to localhost by default — keep it that way

Troubleshooting

| Symptom | Fix |
|---------|-----|
| `connection refused` on :8001 | Start `python server.py` in qwen-agentic-rag |
| RAG very slow | Normal on a laptop; reduce parallel Ollama loads |
| OpenClaw ignores RAG | Confirm skill installed, `enabled: true`, gateway restarted; ask explicitly to "use agentic RAG" |
| `ollama/gemma4:e2b` not found | `ollama pull gemma4:e2b`; check `openclaw models list` |
| Tool calling errors | Ensure `api: "ollama"` and no `/v1` on baseUrl |
| `openclaw requires Node >=22.12.0` | Run `source guides/openclaw-gemma-rag/use-node22.sh` or `nvm use 22` before any `openclaw` command |
| OOM on 16GB Mac | Only run `gemma4:e2b`; quit other Ollama models (`ollama ps`) |
| Skill `curl` fails | `brew install jq` or `apt install jq` |

What’s next

Add your own documents in guides/qwen-agentic-rag/rag_code.py and re-run setup_vectordb.py
Publish a second OpenClaw skill for Gradio (ui.py) health checks
Route work vs personal agents with multi-agent routing

Summary

| Component | You run |
|-----------|---------|
| Ollama | `gemma4:e2b` (chat + RAG) |
| RAG | `guides/qwen-agentic-rag/server.py` |
| OpenClaw | `openclaw gateway` (daemon) |
| Skill | `agentic-rag` → `rag_query.sh` → `/predict` |

You now have a local-first assistant: Gemma for conversation, CrewAI RAG for grounded ML research — no cloud LLM required for either layer.

Catch us on

Reddit Community: https://www.reddit.com/user/techlatest_net/

Deploy a Qwen 3.6 Agentic RAG — Step-by-Step Walkthrough

TechLatest — Wed, 03 Jun 2026 09:55:19 +0000

Today we’ll build and deploy an Agentic RAG powered by Alibaba’s latest Qwen 3.6, running fully on your machine.

What you’ll build

A private API where two AI agents collaborate:

Researcher Agent — retrieves context from a vector database or the web
Writer Agent — turns that research into a polished answer

Tool stack

| Tool | Role |
|------|------|
| **Qwen 3.6** (via Ollama) | Local LLM — no cloud API needed |
| **CrewAI** | Multi-agent orchestration |
| **Firecrawl** | Web search when the vector DB doesn't have the answer |
| **Qdrant** | Local vector database for your knowledge base |
| **LitServe** | Production-style HTTP API deployment |

Architecture

Flow:

Client sends a query to LitServe
Researcher Agent picks the right tool (vector DB or Firecrawl)
Writer Agent synthesizes the final answer
LitServe returns JSON to the client

Prerequisites

1. Remove old models (optional cleanup)

If you had other Ollama models taking disk space:

ollama list
ollama rm gemma4:e2b # example — use your model name

2. Pull Qwen 3.6

On a 16GB Mac, use the 27B variant:

ollama pull qwen3.6:27b

Verify:

ollama run qwen3.6:27b "Say hello in one sentence."

3. Install Python dependencies

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

4. Environment variables

cp .env.example .env

Edit .env:

FIRECRAWL_API_KEY=fc-...
OLLAMA_MODEL=ollama/qwen3.6:27b
OLLAMA_BASE_URL=http://localhost:11434

Get a Firecrawl key at firecrawl.dev.
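A minimal sketch of how a script might read these three settings once they are exported or loaded from .env. The helper name `load_settings` and the fallback defaults are assumptions made for illustration; they are not taken from the tutorial's repository.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the three settings shown above (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        # No default here: the Firecrawl key must come from the environment.
        "firecrawl_api_key": env.get("FIRECRAWL_API_KEY", ""),
        # Defaults mirror the example values in the .env listing.
        "ollama_model": env.get("OLLAMA_MODEL", "ollama/qwen3.6:27b"),
        "ollama_base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
    }


settings = load_settings({"FIRECRAWL_API_KEY": "fc-example"})
print(settings["ollama_model"])
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.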

5. Start Qdrant

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

6. Build the knowledge base

python setup_vectordb.py

This embeds 20 ML FAQ chunks into Qdrant using nomic-embed-text-v1.5.

Step 1 — Set up the LLM

CrewAI integrates with Ollama through its LLM class. We point it at your local Qwen 3.6 model:

Why qwen3.6:27b? Qwen 3.6 adds stronger agentic reasoning and tool use. The quantized 27B model is about 17GB, so on a 16GB machine it does not fit entirely in memory; it is still the variant this guide uses, but expect slow responses.

Step 2 — Define the Research Agent and Task

The Researcher gets two tools:

ml_faq_retrieval_tool — searches your Qdrant vector DB
FirecrawlSearchTool — searches the web for fresh or out-of-scope topics

Vector DB tool (tools.py)

The custom tool wraps Qdrant retrieval:

The agent decides which tool to call — that’s what makes this “agentic” RAG instead of a fixed retrieve-then-generate pipeline.
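To make the contrast with a fixed pipeline concrete, here is a toy sketch of tool selection. It is not CrewAI code: a simple keyword check stands in for the LLM's decision, the FAQ dictionary stands in for Qdrant, and `web_tool` is a placeholder that makes no network call. All names are hypothetical.

```python
from typing import Callable, Dict, List

# Tiny in-memory stand-in for the FAQ collection (hypothetical data).
FAQ = {
    "overfitting": "Use regularization, more data, or early stopping.",
    "cross-validation": "Split data into folds to estimate generalization.",
}


def faq_tool(query: str) -> List[str]:
    """Return FAQ answers whose topic appears in the query."""
    q = query.lower()
    return [answer for topic, answer in FAQ.items() if topic in q]


def web_tool(query: str) -> List[str]:
    """Placeholder for a web search; no network access here."""
    return [f"(web results for: {query})"]


def choose_tool(query: str) -> str:
    """Stand-in for the agent's decision: use the FAQ only when it has a hit."""
    return "faq" if faq_tool(query) else "web"


TOOLS: Dict[str, Callable[[str], List[str]]] = {"faq": faq_tool, "web": web_tool}


def research(query: str) -> List[str]:
    """Route the query to whichever tool was chosen."""
    return TOOLS[choose_tool(query)](query)


print(choose_tool("How do I avoid overfitting?"))
print(choose_tool("What is the latest news about Qwen 3.6?"))
```

A classic pipeline would call `faq_tool` unconditionally; the routing step is the only difference this sketch is meant to show.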

Step 3 — Define the Writer Agent and Task

The Writer receives the Researcher’s output via context=[researcher_task]:

Step 4 — Set up the Crew

Orchestrate both agents inside LitServe’s setup() method (runs once at startup):

Step 5 — Decode request

Extract the user query from the incoming JSON body:
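A hedged sketch of what this step might look like as a plain function. In the real server this logic would sit inside the LitServe API class; the validation and error message here are assumptions added for illustration.

```python
def decode_request(request: dict) -> str:
    """Pull the user query out of an already-parsed JSON body."""
    query = request.get("query")
    if not isinstance(query, str) or not query.strip():
        # Rejecting empty input early avoids starting a long agent run for nothing.
        raise ValueError("request body must contain a non-empty 'query' string")
    return query.strip()


print(decode_request({"query": "What is cross-validation and why is it important?"}))
```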

Example request:

{"query": "What is cross-validation and why is it important?"}

Step 6 — Predict

Pass the query to the Crew. The {query} placeholder in task descriptions is filled from inputs:
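CrewAI performs this substitution itself when the crew is kicked off with an inputs dictionary. Purely to illustrate the idea, plain `str.format` gives the same effect; the task wording below is hypothetical and not copied from the tutorial.

```python
# Hypothetical task description containing the {query} placeholder.
TASK_TEMPLATE = "Research the following question and collect supporting facts: {query}"


def fill_task(template: str, inputs: dict) -> str:
    """Substitute named placeholders in a task description."""
    return template.format(**inputs)


print(fill_task(TASK_TEMPLATE, {"query": "What is dropout?"}))
```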

Behind the scenes:

Researcher runs and may call vector DB and/or Firecrawl
Writer reads those findings and drafts the answer
Qwen 3.6 powers both agents through Ollama

Step 7 — Encode response

Return the final answer as JSON:
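One possible shape for this step, written as a standalone function. The response key `output` is an assumption; use whatever key your client expects.

```python
import json


def encode_response(output: object) -> dict:
    """Wrap the crew's final answer in a JSON-serializable dictionary."""
    # str() covers result objects that are not plain strings.
    return {"output": str(output)}


print(json.dumps(encode_response("Cross-validation estimates how well a model generalizes.")))
```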

Step 8 — Start the server

timeout=False is important — agent crews with tool calls can take several minutes on local hardware.

Client code

client.py sends a POST to /predict:
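A minimal standard-library sketch of such a client. The base URL `http://127.0.0.1:8000` is an assumption (the tutorial does not state the port), and the real client.py may use a different HTTP library.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumed host and port


def build_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Create the POST request for the /predict endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, base_url: str = BASE_URL) -> dict:
    """Send the query and return the decoded JSON response."""
    # No timeout: local agent runs can take several minutes.
    with urllib.request.urlopen(build_request(query, base_url), timeout=None) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_request("How do I avoid overfitting?")
    print(request.get_method(), request.full_url)
```

Calling `ask(...)` requires the server from Terminal 1 to be running; `build_request` can be inspected without any network access.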

Run it:

# Terminal 1
python server.py

# Terminal 2
python client.py --query "How do I avoid overfitting?"
python client.py --query "What is the latest news about Qwen 3.6?"

The second query should trigger Firecrawl because it’s not in the ML FAQ knowledge base.

Full server code

For reference, here is the complete server.py:

Agentic RAG vs classic RAG

| Classic RAG | Agentic RAG (this tutorial) |
|-------------|----------------------------|
| Fixed: always retrieve → generate | Agent chooses tools dynamically |
| Single LLM call | Multi-agent pipeline |
| One data source | Vector DB + web fallback |
| Hard to extend | Add tools without rewriting the pipeline |

Troubleshooting

| Issue | Fix |
|-------|-----|
| `connection refused` on port 6333 | Start Qdrant with Docker |
| Ollama model not found | Run `ollama pull qwen3.6:27b` |
| Very slow responses | Normal on 16GB RAM; close other apps |
| Firecrawl errors | Check `FIRECRAWL_API_KEY` in `.env` |
| Empty vector results | Run `python setup_vectordb.py` first |

What’s next

Replace the sample FAQ with your own documents in rag_code.py
Add a Gradio UI in front of the LitServe API
Swap Firecrawl for another search provider
Deploy LitServe behind Docker or Lightning AI Cloud

Summary

You deployed a fully private Qwen 3.6 Agentic RAG:

Qwen 3.6 runs locally via Ollama
CrewAI orchestrates Researcher + Writer agents
Qdrant stores your knowledge base
Firecrawl fills gaps with live web data
LitServe exposes everything as a clean REST API

Done!

CVE MCP Server: Turn Claude Into a Full-Spectrum Security Analyst

TechLatest — Mon, 01 Jun 2026 15:04:27 +0000

27 tools. 21 data sources. One protocol. Zero browser tabs.

If you’ve ever triaged a CVE, you know the drill. Open NVD for the CVSS score. Check EPSS for exploitation probability. Cross-reference CISA KEV for active exploitation. Search GitHub for PoCs. Maybe pull VirusTotal or Shodan if it’s tied to an IP. Then sit there and mentally stitch it all together.

For one CVE, that’s 15–20 minutes. For fifty? That’s your entire day gone.

CVE MCP Server fixes that — an open-source, production-grade Model Context Protocol (MCP) server built by Mahipal Jangra. It gives Claude direct access to 27 security intelligence tools across 21 APIs. Ask one question. Get correlated, prioritized intelligence in seconds.

In this guide, we will walk through installing it on macOS, connecting it to Claude Code, and running real queries — with screenshots at every step.

The Problem: CVE Triage Shouldn’t Be a Tab Marathon

Security analysts, DevSecOps engineers, and bug bounty hunters all hit the same wall. Triaging a single vulnerability means querying:

NVD — CVSS scores, affected products, references
EPSS — statistical likelihood of exploitation
CISA KEV — confirmed in-the-wild exploitation
GitHub — patches, advisories, public exploit code
VirusTotal / Shodan / GreyNoise — if there’s a network or malware angle

Each source lives in its own silo. You’re the glue holding it together — manually, repeatedly, expensively.

CVE MCP Server removes that glue work. Claude orchestrates every relevant lookup in parallel, runs a composite risk calculation, and delivers a recommendation with evidence attached.

What You Get

| Feature | Description |
| ---------------------------- | ----------------------------------------------------------------------------------------------------- |
| **27 MCP tools** | CVE lookup, EPSS, KEV, MITRE ATT&CK, Shodan, VirusTotal, dependency scanning, and more |
| **21 data sources** | NVD, EPSS, CISA KEV, OSV.dev, GitHub GHSA, AbuseIPDB, GreyNoise, MalwareBazaar, ThreatFox, and others |
| **Composite risk engine** | Weighted 0–100 score combining CVSS, EPSS, KEV status, and PoC availability |
| **SQLite cache + audit log** | Fast repeat lookups, full tool invocation history |
| **Zero-key start** | 8 tools work with no API keys at all |
| **Outbound HTTPS only** | No inbound ports, no telemetry, private IPs blocked |

Built with Python 3.10+, FastMCP, httpx, aiosqlite, Pydantic v2, and defusedxml.
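To illustrate what a SQLite-backed response cache involves, here is a small sketch using the standard-library `sqlite3` module. It is synchronous and simplified, whereas the project uses aiosqlite; the table layout, class name, and one-hour TTL are assumptions, not the server's actual schema.

```python
import json
import sqlite3
import time
from typing import Any, Optional


class ResponseCache:
    """Key/value cache with a time-to-live, stored in SQLite."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    def put(self, key: str, value: Any, now: Optional[float] = None) -> None:
        """Store a JSON-serializable value under the given key."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
            (key, json.dumps(value), now),
        )
        self.db.commit()

    def get(self, key: str, now: Optional[float] = None) -> Optional[Any]:
        """Return the cached value, or None if missing or older than the TTL."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or now - row[1] > self.ttl:
            return None
        return json.loads(row[0])


cache = ResponseCache()
cache.put("cve:CVE-2021-44228", {"cvss": 10.0})
print(cache.get("cve:CVE-2021-44228"))
```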

GitHub: github.com/mukul975/cve-mcp-server

Architecture at a Glance

Claude Desktop / Claude Code (MCP Client)
              │
              │ Model Context Protocol (stdio)
              ▼
       CVE MCP Server (Python)
  ┌─────────────┬──────────────┬───────────────┐
  │ 27 Tools │ Risk Engine │ SQLite Cache │
  └──────┬──────┴──────┬───────┴───────┬───────┘
         │ │ │
         └─────────────┴───────────────┘
                       │
              Async HTTP (httpx)
         Rate Limiter · Response Cache
                       │
         ┌─────────────┼─────────────┐
         ▼ ▼ ▼
   Vulnerability Network Threat
   Intelligence Intelligence Intelligence
   (NVD, EPSS, (Shodan, (VirusTotal,
    KEV, OSV) GreyNoise) MalwareBazaar)

All traffic is outbound HTTPS only. API keys load from environment variables and are never logged. Private and reserved IP ranges are blocked before any network lookup.
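A check of this kind can be sketched with Python's standard `ipaddress` module. The function name and the exact set of categories rejected are assumptions; the server's own rules may differ.

```python
import ipaddress


def is_blocked(address: str) -> bool:
    """Return True for addresses that should never be sent to an external lookup."""
    ip = ipaddress.ip_address(address)  # raises ValueError on malformed input
    return (
        ip.is_private
        or ip.is_reserved
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_unspecified
    )


for candidate in ("10.0.0.5", "127.0.0.1", "8.8.8.8"):
    print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```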

The 27 Tools (Organized by Category)

Core Vulnerability Intelligence (8 tools)

| Tool | What It Does |
|------|-------------|
| `lookup_cve` | Full NVD record — CVSS, CWEs, affected products, and vulnerability timeline |
| `search_cves` | Search NVD by keyword, product, severity, or date range |
| `get_epss_score` | EPSS exploitation probability (0–1) and percentile ranking |
| `check_kev_status` | Check whether a CVE is listed in the CISA Known Exploited Vulnerabilities (KEV) catalog |
| `get_cvss_details` | Parse and explain a CVSS v3.1 vector string |
| `get_cwe_info` | Retrieve CWE information from the embedded database |
| `get_cve_references` | Categorize patch, advisory, and exploit reference links |
| `bulk_cve_lookup` | Batch-fetch up to 20 CVEs with parallel enrichment and analysis |

Exploit & Attack Intelligence (4 tools)

| Tool | What It Does |
|------|-------------|
| `search_exploits` | Search GitHub PoCs and exploit repositories for publicly available exploits |
| `get_mitre_techniques` | Map CVEs and CWEs to relevant MITRE ATT&CK techniques |
| `check_poc_availability` | Check multiple sources for proof-of-concept (PoC) exploit availability |
| `get_attack_patterns` | Retrieve CAPEC attack pattern details and associated attack methodologies |

Advanced Risk & Reporting (4 tools)

| Tool | What It Does |
|------|-------------|
| `calculate_risk_score` | Calculate a composite 0–100 risk score based on multiple vulnerability signals |
| `generate_risk_report` | Generate an executive-formatted security risk report |
| `prioritize_cves` | Rank and prioritize CVEs for remediation and triage |
| `get_trending_cves` | Identify trending vulnerabilities based on high EPSS scores and recent KEV additions |
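As a rough illustration of a weighted composite score, the sketch below combines the four signals named earlier (CVSS, EPSS, KEV status, PoC availability) into a 0–100 value. The weights are invented for this example and are not the ones `calculate_risk_score` uses.

```python
# Hypothetical weights; they sum to 1.0 so the result stays within 0-100.
WEIGHTS = {"cvss": 0.40, "epss": 0.30, "kev": 0.20, "poc": 0.10}


def composite_risk(cvss: float, epss: float, in_kev: bool, has_poc: bool) -> float:
    """Combine four signals into a single 0-100 score."""
    if not 0.0 <= cvss <= 10.0:
        raise ValueError("cvss must be between 0 and 10")
    if not 0.0 <= epss <= 1.0:
        raise ValueError("epss must be between 0 and 1")
    score = (
        WEIGHTS["cvss"] * (cvss / 10.0)  # normalize CVSS to 0-1
        + WEIGHTS["epss"] * epss
        + WEIGHTS["kev"] * float(in_kev)
        + WEIGHTS["poc"] * float(has_poc)
    )
    return round(100.0 * score, 1)


print(composite_risk(cvss=10.0, epss=0.9436, in_kev=True, has_poc=True))
print(composite_risk(cvss=5.3, epss=0.02, in_kev=False, has_poc=False))
```

Keeping KEV and PoC as binary bumps means a confirmed in-the-wild exploit always outranks a similar CVE without one, whatever the EPSS estimate says.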

Network Intelligence (4 tools)

| Tool | What It Does |
|------|-------------|
| `lookup_ip_reputation` | Retrieve AbuseIPDB abuse history, reputation score, and confidence level for an IP address |
| `check_ip_noise` | Query GreyNoise to classify IPs based on scanning, attack, and internet background noise activity |
| `shodan_host_lookup` | Retrieve open ports, running services, banners, and associated CVEs from Shodan |
| `passive_dns_lookup` | Access CIRCL passive DNS data for historical DNS resolutions and domain associations |

Threat Intelligence (4 tools)

| Tool | What It Does |
|------|-------------|
| `virustotal_lookup` | Check a file hash, URL, domain, or IP address against 70+ antivirus and threat intelligence engines |
| `search_malware` | Search MalwareBazaar for malware samples, hashes, and related metadata |
| `search_iocs` | Look up Indicators of Compromise (IOCs) in ThreatFox by malware family or threat actor |
| `check_ransomware` | Check ransomware-related Bitcoin addresses and associated threat intelligence data |

DevSecOps (3 tools)


| Tool | What It Does |
|------|-------------|
| `scan_dependencies` | Scan software dependencies for known vulnerabilities using OSV.dev vulnerability data |
| `scan_github_advisories` | Search GitHub Security Advisories (GHSA) for vulnerability information and remediation guidance |
| `urlscan_check` | Submit URLs to URLScan.io and retrieve analysis results, screenshots, and threat intelligence data |

Installation: Step by Step

We’ll walk through the full setup — from clone to your first Claude query.

Prerequisites

Python 3.10+ (3.11 or 3.12 recommended)
pip or uv
Git
Claude Desktop or Claude Code

Step 1: Clone the Repository

git clone https://github.com/mukul975/cve-mcp-server.git
cd cve-mcp-server

Step 2: Create a Virtual Environment

macOS / Linux:

python -m venv venv
source venv/bin/activate

Windows (PowerShell):

python -m venv venv
.\venv\Scripts\Activate.ps1

Windows (CMD):

python -m venv venv
venv\Scripts\activate.bat

Step 3: Install Dependencies

pip install -e .

For development with tests:

pip install -e ".[test]"

Faster alternative with uv:

uv venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -e .

Step 4: Verify the Server Starts

python -m cve_mcp.server

You should see the FastMCP server initialize without errors. Press Ctrl+C to stop — we’ll wire it into Claude next.

Step 5: Configure API Keys

API keys are optional for CVE MCP Server. Eight tools work with no keys (EPSS, CISA KEV, OSV.dev, MITRE ATT&CK, CWE lookups, CVSS parsing, Ransomwhere, and NVD at 5 req/30s).

For this guide, we add one key: a GitHub personal access token. It’s free, takes about a minute, and needs no organization details (unlike NVD, which can ask for org info and take longer to approve).

What a GitHub token unlocks:


| Tool | What You Get |
|------|--------------|
| `search_exploits` | Access to public PoC and exploit repositories hosted on GitHub |
| `check_poc_availability` | Multi-source proof-of-concept (PoC) availability checks, including GitHub-based sources |
| `scan_github_advisories` | Access to GitHub Security Advisories (GHSA) for vulnerability research and remediation guidance |

Rate limit: 60 requests/hour without a token → 5,000/hour with a token.

What still works without NVD_API_KEY:

NVD-backed tools (lookup_cve, search_cves, calculate_risk_score, etc.) still work at the free tier: 5 requests per 30 seconds. Fine for blog demos and a few CVEs at a time.

On startup, you’ll still see:

WARNING: NVD_API_KEY not set — using unauthenticated rate limit (5 req/30s)

That’s expected, not an error. Add NVD_API_KEY later when you have it.
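To show what a limit such as 5 requests per 30 seconds means in practice, here is a small sliding-window limiter. It is a generic sketch, not the server's implementation; the class name is hypothetical and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window = window_seconds
        self._stamps: deque = deque()

    def allow(self, now: float) -> bool:
        """Record and permit the call if the window still has room."""
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_requests=5, window_seconds=30.0)
print([limiter.allow(float(t)) for t in range(6)])
```

A caller that gets `False` would normally sleep until the oldest timestamp leaves the window and then retry.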

Step 5a: Copy the environment file

From your project directory (with venv active):

cd ~/Desktop/cve-mcp-server
source venv/bin/activate
cp .env.example .env

.env is gitignored — your keys stay local and are never committed.

Step 5b: Create a GitHub token

Open github.com/settings/tokens
Generate a new token (classic)
Name it e.g. cve-mcp-server
Expiration: 90 days or “No expiration” (your choice)
Scopes: leave empty — public advisory and code search don’t need repo scopes
Generate and copy the token once (ghp_...)

Step 5c: Edit .env

Open .env in your editor and set:

# Optional — add later for 10× NVD speed (50 req/30s)
NVD_API_KEY=

# Tier 1 — GitHub (this guide)
GITHUB_TOKEN=ghp_your_token_here

# Tier 2 — leave empty unless you need IP/malware demos
ABUSEIPDB_KEY=
VIRUSTOTAL_KEY=
GREYNOISE_API_KEY=
SHODAN_KEY=
URLSCAN_KEY=

Step 5d: Verify the server loads .env

python -m cve_mcp.server

You should see:

NVD warning (OK without NVD key)
KEV catalog loaded with ~1600+ entries
Server running — waiting for MCP client on stdio

Press Ctrl+C to stop.

python-dotenv loads .env automatically when the server runs from the project folder.
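If you want to confirm which keys in your .env are actually filled in, a simplified parser like the one below can help. It only handles plain KEY=VALUE lines and comments, so it is a stand-in for illustration rather than a replacement for python-dotenv; the sample text and token value are made up.

```python
SAMPLE_ENV = """\
# Optional NVD key
NVD_API_KEY=

# GitHub token
GITHUB_TOKEN=ghp_example

SHODAN_KEY=
"""


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def configured_keys(values: dict) -> list:
    """Return the names of keys that have a non-empty value."""
    return sorted(name for name, value in values.items() if value)


print(configured_keys(parse_env(SAMPLE_ENV)))
```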

Step 6: Connect CVE MCP Server to Claude Code

You installed the server in Steps 1–4 and added a GitHub token in Step 5. Step 6 wires that server into Claude Code so Claude can call all 27 security tools during a session.

This walkthrough uses Claude Code, not Claude Desktop.

Why use the project venv Python?

Claude spawns the MCP server as a subprocess. If it uses the system python, it may not find the cve_mcp package or your .env.

Use the venv interpreter and set cwd to the project folder so:

cve_mcp is importable
python-dotenv loads .env (including GITHUB_TOKEN)
the KEV catalog and tools start correctly

Step 6a: Register the MCP server

From the project directory:

cd ~/xxxx/cve-mcp-server
source venv/bin/activate

claude mcp add --scope project cve-mcp -- \
  /Users/xxxxx/xxxx/cve-mcp-server/venv/bin/python \
  -m cve_mcp.server

Replace the path if your clone lives elsewhere — always use absolute paths.

Verify:

claude mcp list

Expected:

cve-mcp: .../venv/bin/python -m cve_mcp.server
  Scope: Project config (shared via .mcp.json)
  Status: ✓ Connected

Step 6b: Approve the server (one-time)

The first time you open Claude in this project, you may see:

⏸ Pending approval (run `claude` to approve)

Run claude from ~/Desktop/cve-mcp-server
When prompted, trust/approve cve-mcp for this project
Run claude mcp list again — status should be Connected

This is a security gate: Claude won’t run project MCP servers until you explicitly allow them.

Step 6c: Project config (.mcp.json)

Claude Code stores project MCP settings in .mcp.json. Example for macOS:

{
  "mcpServers": {
    "cve-mcp": {
      "command": "/Users/xxxx/xxxx/cve-mcp-server/venv/bin/python",
      "args": ["-m", "cve_mcp.server"],
      "cwd": "/Users/xxxx/xxxx/cve-mcp-server"
    }
  }
}

Notes:

No API keys in JSON if you use .env — keep secrets in .env only
Windows readers: use venv\Scripts\python.exe and a Windows absolute cwd
Commit .mcp.json only if paths are generic or documented; machine-specific paths are often kept local
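Because the file must contain absolute paths, it can be convenient to generate it from the project directory. The sketch below builds the same structure as the example above for a macOS or Linux layout; the helper name and the `/opt/example/...` path are hypothetical.

```python
import json
from pathlib import PurePosixPath


def build_mcp_config(project_dir: str) -> dict:
    """Build the .mcp.json structure shown above for a POSIX-style venv layout."""
    root = PurePosixPath(project_dir)
    if not root.is_absolute():
        raise ValueError("project_dir must be an absolute path")
    return {
        "mcpServers": {
            "cve-mcp": {
                "command": str(root / "venv" / "bin" / "python"),
                "args": ["-m", "cve_mcp.server"],
                "cwd": str(root),
            }
        }
    }


print(json.dumps(build_mcp_config("/opt/example/cve-mcp-server"), indent=2))
```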

Alternative: pass GITHUB_TOKEN via CLI

If .env isn’t loaded (unusual when cwd is correct):

claude mcp remove cve-mcp -s project

claude mcp add cve-mcp \
  -e GITHUB_TOKEN=ghp_your_token_here \
  -- /Users/xxxx/xxxx/cve-mcp-server/venv/bin/python \
  -m cve_mcp.server

Step 7: Your First Real Queries

After Steps 1–6, Claude Code is connected to cve-mcp with your GitHub token in .env. Step 7 is where it pays off: one question, many APIs, correlated answers.

Before you ask anything

cd ~/xxxx/cve-mcp-server
Run claude
Approve cve-mcp (pick option 2 — trust for all future sessions in this project)
Confirm: claude mcp list → ✓ Connected

Query 1: Log4Shell triage (free tools + GitHub token)

Prompt:

What is CVE-2021-44228? Is it in CISA KEV? What is the EPSS score? Are there public exploits on GitHub? Be concise and cite tool results.

Tools Claude used (live run):

| Tool | Source |
| ------------------------ | -------------------------------------------------- |
| `lookup_cve` | NVD (free tier) |
| `check_kev_status` | CISA Known Exploited Vulnerabilities (KEV) Catalog |
| `get_epss_score` | EPSS (Exploit Prediction Scoring System) |
| `check_poc_availability` | GitHub (using your personal access token) |

Actual result summary:

CVSS 10.0 CRITICAL — Log4j2 RCE
CISA KEV: Yes, added 2021-12-10, known ransomware use
EPSS: 94.36% (100th percentile)
GitHub PoCs: 7 repos (e.g., Puliczek bypass PoC ★950)
Verdict: Emergency patch priority

No NVD_API_KEY needed for this demo; NVD ran at 5 req/30s.

Query 2: Scan Python dependencies (no keys)

Prompt:

Scan these PyPI packages for vulnerabilities: requests 2.28.0, flask 2.2.0, django 3.2.0. List CVEs found and severity.

Tool: scan_dependencies → OSV.dev (free)

Actual result summary:

| Package | CVEs | Worst Finding |
|---------|------|---------------|
| `requests` 2.28.0 | 5 | **MEDIUM** (e.g., CVE-2023-32681) → upgrade to ≥ 2.32.4 |
| `flask` 2.2.0 | 3 | **HIGH** CVE-2023-30861 → upgrade to ≥ 2.2.5 |
| `django` 3.2.0 | 55 | **CRITICAL** CVE-2022-34265 (CVSS 9.8), EPSS 92.83% |

Verdict: Upgrade django to the latest 3.2.x LTS immediately.

Query 3: GitHub Security Advisories (uses your token)

Prompt:

Search GitHub security advisories for django in the pip ecosystem. Summarize top findings.

Tool: scan_github_advisories (benefits from GITHUB_TOKEN)

Actual result summary:

~300 advisories spanning 2008–2026
Recent (2025–2026): DoS, algorithmic complexity, and timing issues
High-impact classics: SQLi (CVE-2022-28346, CVE-2020-9402)
Takeaway: Stay on a supported Django LTS

Conclusion

CVE triage used to mean a dozen browser tabs and mental glue work — NVD, EPSS, CISA KEV, GitHub, and more. In this walkthrough we installed CVE MCP Server (open source, by Mahipal Jangra) and wired it into Claude Code so Claude can call 27 tools across 21 data sources over a single protocol.

You cloned the repo, created a venv, installed the package, confirmed the server starts, added a GitHub token (without waiting on NVD approval), approved the MCP server in Claude Code, and ran three real queries:

Log4Shell — CVSS, KEV, EPSS, and public PoCs in one answer
PyPI dependency scan — no extra API keys
Django GitHub advisories — powered by your GitHub token

That’s the point: one question, correlated intelligence, seconds instead of minutes per CVE.

What to do next

Keep using it: paste CVE IDs, requirements.txt lines, or suspicious IPs into Claude and let the server orchestrate lookups.
Add NVD_API_KEY when you can: it is free from NIST and raises the NVD limit from 5 to 50 requests per 30 seconds, which speeds up NVD-heavy workflows.
Add Tier 2 keys only if you need them: AbuseIPDB, GreyNoise, Shodan, and VirusTotal for IP and malware demos.
Star the repo if this saved you time: github.com/mukul975/cve-mcp-server. Contributions and issues are welcome on the upstream project.
Report bugs upstream: problems following this post are separate from bugs in the server itself, and the latter belong on the project’s GitHub.

A note on scope

CVE MCP Server is read-only intelligence — it does not scan your network or exploit targets. API keys stay in .env; use redacted values in screenshots and posts. All traffic is outbound HTTPS; private IPs are blocked on network tools.

GARS-2026

If you use agentic AI in security workflows, consider the GARS-2026 (Global Agentic AI Readiness Survey) — 60 questions, anonymous, supervised by SRH Berlin. It measures how ready teams are for MCP, tool calling, and human-in-the-loop security automation.

Survey: mahipal.engineer/survey

Security work shouldn’t require fifteen tabs for one CVE. CVE MCP Server turns that workflow into a conversation — and after Steps 1–7, you’ve got it running on your machine.

This was an independent setup guide. Credit for the project goes to Mahipal Jangra. MIT licensed.

Claude Opus 4.8: The Complete Guide to Anthropic’s Most Powerful AI Model Yet

TechLatest — Fri, 29 May 2026 08:08:48 +0000

Anthropic has officially released Claude Opus 4.8, its most capable generally available AI model to date. Building upon the strong foundation of Claude Opus 4.7, the new release introduces improvements across coding, agentic workflows, reasoning, tool usage, long-context handling, and developer productivity.

The launch also introduces several ecosystem enhancements, including Dynamic Workflows for Claude Code, Effort Control, Fast Mode, Mid-Conversation System Messages, and improved prompt caching.

For developers, AI engineers, DevRel teams, cybersecurity researchers, and enterprises building AI-native products, Claude Opus 4.8 represents one of the most significant upgrades in the Anthropic ecosystem.

In this guide, we’ll cover:

What Claude Opus 4.8 is
Key improvements over Opus 4.7
Benchmark performance
Claude Code enhancements
Cursor workflows
API changes
Effort levels explained
Fast Mode
Long-context capabilities
Migration guide
Practical developer workflows
Pricing
What comes next

What is Claude Opus 4.8?

Claude Opus 4.8 is Anthropic’s flagship large language model designed for:

Advanced reasoning
Long-horizon agentic coding
Software engineering
Research workflows
Multi-step planning
Enterprise automation
Cybersecurity analysis
Large context understanding

Anthropic describes it as their most capable generally available model, surpassing Claude Opus 4.7 in nearly every major category while maintaining API compatibility.

Unlike many benchmark-focused releases, Opus 4.8 focuses heavily on:

Reliability
Honest reasoning
Reduced hallucinations
Better judgment
Stronger agent workflows

Why Claude Opus 4.8 Matters

Modern AI development increasingly relies on autonomous systems that can:

Analyze repositories
Refactor codebases
Perform migrations
Run tools
Execute commands
Verify outputs

The challenge has never been raw intelligence alone.

The challenge is:

Can the model consistently make good decisions over long periods of time?

Anthropic’s answer with Opus 4.8 is improved:

Agent reliability
Long-context retention
Tool usage accuracy
Self-correction
Uncertainty reporting

This makes Opus 4.8 particularly valuable for engineering teams using AI in production.

Benchmarks

| Benchmark | Claude Opus 4.8 | Claude Opus 4.7 | GPT-5.5 | Gemini 3.1 Pro |
| ------------------------------------------------------------------- | --------------- | --------------- | --------- | -------------- |
| **Agentic Coding (SWE-Bench Pro)** | **69.2%** | 64.3% | 58.6% | 54.2% |
| **Agentic Terminal Coding (Terminal-Bench 2.1)** | 74.6% | 66.1% | **78.2%** | 70.3% |
| **Multidisciplinary Reasoning (Humanity's Last Exam - No Tools)** | **49.8%** | 46.9% | 41.4% | 44.4% |
| **Multidisciplinary Reasoning (Humanity's Last Exam - With Tools)** | **57.9%** | 54.7% | 52.2% | 51.4% |
| **Agentic Computer Use (OSWorld-Verified)** | **83.4%** | 82.8% | 78.7% | 76.2% |
| **Knowledge Work (GDPval-AA)** | **1890** | 1753 | 1769 | 1314 |
| **Agentic Financial Analysis (Finance Agent v2)** | **53.9%** | 51.5% | 51.8% | 43.0% |

Key Takeaways

Claude Opus 4.8 leads in 6 out of 7 benchmarks.
It achieves the highest score in SWE-Bench Pro (69.2%), demonstrating strong real-world software engineering capabilities.
GPT-5.5 remains the leader in Terminal-Bench 2.1 (78.2%), indicating stronger terminal-based agent performance.
Claude Opus 4.8 delivers the best results in:

✅ Agentic Coding

✅ Multidisciplinary Reasoning

✅ Computer Use

✅ Knowledge Work

✅ Financial Analysis

The jump from Opus 4.7 → Opus 4.8 is consistent across every benchmark, showing Anthropic’s focus on improving reliability, reasoning, and long-horizon agent workflows.

Major Improvements in Claude Opus 4.8

1. Better Agentic Coding

One of the largest improvements is in long-running coding tasks.

Anthropic specifically optimized:

Codebase-scale understanding
Refactoring
Repository navigation
Large-scale migrations
Multi-step engineering tasks

Developers reported that Opus 4.8:

Gets lost less frequently
Handles context better
Produces fewer broken implementations
Recovers better after context compression

This is especially important for:

Claude Code
Cursor
IDE agents
Autonomous software engineering systems

2. Improved Honesty and Reliability

A common AI problem is premature confidence.

Models often:

Assume success
Hide uncertainty
Miss edge cases
Claim tasks are completed when they are not

Anthropic reports that Opus 4.8 is approximately 4× less likely to let flaws in generated code pass without mentioning them.

Instead, it more frequently:

Flags uncertainty
Requests clarification
Notes limitations
Reports incomplete work

For production engineering environments, this behavior is extremely valuable.

3. Better Tool Usage

Tool calling is critical for modern AI agents.

Opus 4.8 improves:

Tool selection
Tool triggering
Multi-step tool chains
Agent decision making

Anthropic specifically targeted a weakness in Opus 4.7 where the model occasionally skipped tools that should have been used.

The new version is significantly more reliable when deciding:

When to search
When to execute
When to inspect files
When to call APIs

4. Long Context Improvements

Claude Opus 4.8 includes a 1 million token context window.

Available on:

Claude API
Amazon Bedrock
Google Vertex AI

Microsoft Foundry currently supports a 200K token context.

This massive context window allows developers to work with:

Entire repositories
Large documentation sets
Enterprise knowledge bases
Massive logs
Multi-file projects

without aggressive chunking strategies.

Getting Started with Claude Opus 4.8 in Anthropic Workbench

Before exploring advanced workflows, developers can experiment with Claude Opus 4.8 directly inside Anthropic’s Workbench. The environment allows prompt engineering, model evaluation, API testing, and workflow prototyping without writing any application code.

Anthropic Workbench provides a playground for testing Claude Opus 4.8 prompts, system instructions, and model configurations before deploying them into production.

Dynamic Workflows in Claude Code

Perhaps the most exciting release is Dynamic Workflows.

This feature enables Claude Code to:

Plan work
Spawn hundreds of parallel sub-agents
Execute tasks simultaneously
Verify outputs
Merge findings

Instead of a single linear agent workflow, Claude can coordinate large numbers of specialized workers.

Example:

A large enterprise migration involving:

300,000+ lines of code
Hundreds of files
Multiple frameworks

can now be broken into parallel tasks and completed significantly faster.

Anthropic positions this as the future of AI-assisted software engineering.

Effort Control: A New Way to Use Claude

Anthropic now gives users direct control over how much reasoning Claude performs.

Available Effort Levels

Low

Best for:

Quick answers
Documentation lookup
Fast interactions

Benefits:

Lower latency
Lower token consumption

Medium

Good balance between:

Cost
Speed
Quality

Ideal for most day-to-day work.

High (Default)

The new default setting.

Optimized for:

Coding
Analysis
Research
Agent workflows

Provides stronger reasoning while maintaining reasonable response times.

Extra / XHigh

Recommended for:

Difficult engineering tasks
Architecture reviews
Complex debugging
Long-running workflows

Uses more reasoning tokens for higher quality outputs.

Max

Highest reasoning investment.

Best reserved for:

Mission-critical tasks
Research
Advanced problem solving

Fast Mode

Anthropic also introduced Claude Opus 4.8 Fast Mode, which can generate outputs up to 2.5× faster than standard Opus execution.

This is particularly useful for:

Coding assistants
Interactive IDE workflows
Enterprise applications
Agent pipelines

Fast Mode delivers:

Higher throughput
Reduced waiting times
Improved developer experience

while still using the same underlying Opus 4.8 model.

Claude Code Workflows

Opus 4.8 shines inside Claude Code.

Workflow #1: Large Repository Refactoring

Example prompt:

Analyze this repository and migrate all legacy authentication middleware to the new architecture.

Opus 4.8 can:

Discover affected files
Create migration plans
Apply changes
Run tests
Verify results

Workflow #2: Architecture Reviews

Prompt:

Review the codebase for scalability bottlenecks and propose improvements.

Claude can:

Identify hotspots
Suggest patterns
Recommend optimizations
Generate implementation plans

Workflow #3: Automated Bug Hunting

Prompt:

Investigate intermittent failures in CI and determine likely root causes.

Opus 4.8 performs:

Log analysis
Dependency inspection
Code tracing
Hypothesis generation

Using Claude Opus 4.8 in Cursor

Cursor users can benefit significantly from Opus 4.8.

Recommended use cases:

Code Reviews

Pull request reviews
Security analysis
Performance audits

Repository Understanding

Ask Claude:

Explain this architecture and identify technical debt.

The 1M context window allows much deeper repository understanding.

Multi-File Refactoring

Claude excels at:

Framework migrations
API upgrades
Dependency modernization

across large codebases.

Documentation Generation

Generate:

Architecture docs
README files
API documentation
Internal onboarding guides

with significantly better context awareness.

API Enhancements

Mid-Conversation System Messages

One of the most important API updates.

Previously:

Updating instructions often required rebuilding conversation history.

Now developers can inject:

{
  "role": "system",
  "content": "Updated instructions"
}

mid-conversation.

Benefits:

Better prompt caching
Lower costs
Cleaner agent architectures
Dynamic permissions

This is particularly useful for:

Multi-agent systems
Autonomous workflows
Long-running tasks
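The caching benefit comes from appending new instructions instead of rewriting earlier turns, so the existing conversation prefix stays identical. The sketch below shows only that data-structure idea, using the message shape from the snippet above; it does not call any API, and whether a particular endpoint accepts this shape is not verified here. The helper name is hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


def inject_system_message(messages: List[Message], instructions: str) -> List[Message]:
    """Return a new history with a system message appended at the end."""
    # The original list is left untouched, so the earlier prefix is unchanged.
    return [*messages, {"role": "system", "content": instructions}]


history = [
    {"role": "user", "content": "Refactor the auth module."},
    {"role": "assistant", "content": "Starting with the middleware."},
]
updated = inject_system_message(history, "Updated instructions")
print(len(history), len(updated), updated[-1]["role"])
```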

Refusal Stop Details

Refusals now provide richer metadata.

Applications can distinguish between:

Safety refusals
Capability limitations
Policy constraints

allowing better routing and user experiences.

Lower Prompt Cache Threshold

Previous minimum:

Higher token requirement

New minimum:

1,024 tokens

Benefits:

More cache hits
Lower costs
Faster repeated workflows

without requiring code changes.

Adaptive Thinking

Claude Opus 4.8 continues using Adaptive Thinking.

Instead of always reasoning, the model decides:

When deep thinking is necessary
When a direct response is sufficient

Advantages:

Reduced token waste
Faster responses
Improved efficiency

Simple questions receive direct answers.

Complex problems trigger deeper reasoning automatically.

Benchmark Performance

Anthropic reports improvements across:

Coding
Agentic tasks
Tool usage
Reasoning
Practical knowledge work

Key highlights include:

Better long-horizon performance
Stronger software engineering capabilities
Improved real-world task completion
More reliable autonomous workflows

Perhaps most importantly:

The gains are not limited to benchmark scores.

They are visible in actual developer workflows.

Migration Guide

Upgrading from Opus 4.7 is straightforward.

Change Model Name

Before:

model = "claude-opus-4-7"

After:

model = "claude-opus-4-8"

Review Effort Settings

Opus 4.8 defaults to:

effort = "high"

For coding workflows:

effort = "xhigh"

is often recommended.

Remove Context Window Beta Headers

The 1M token context window is now standard.

Legacy beta headers can be removed.

Adopt Mid-Conversation System Messages

This is one of the easiest ways to:

Reduce costs
Improve caching
Simplify agent design

Pricing

Standard Mode:

$5 / million input tokens
$25 / million output tokens

Fast Mode:

$10 / million input tokens
$50 / million output tokens

Despite the capability improvements, standard pricing remains unchanged from Opus 4.7.
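To turn these rates into a per-request figure, here is a small Python sketch that uses only the prices quoted above. The function name `estimate_cost` is made up for this example, and the calculation ignores any cache discounts or other billing adjustments.

```python
# Prices in USD per million tokens, as quoted in this article.
PRICES = {
    "standard": {"input": 5.0, "output": 25.0},
    "fast": {"input": 10.0, "output": 50.0},
}


def estimate_cost(input_tokens: int, output_tokens: int, mode: str = "standard") -> float:
    """Estimate the USD cost of one request, ignoring cache discounts."""
    rate = PRICES[mode]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000


# 200,000 input tokens and 10,000 output tokens:
print(estimate_cost(200_000, 10_000))          # 1.25 in standard mode
print(estimate_cost(200_000, 10_000, "fast"))  # 2.5 in fast mode
```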

What About Claude Mythos?

Anthropic also revealed progress on:

Claude Mythos

Currently available to a limited group of organizations under Project Glasswing.

Mythos is expected to:

Exceed Opus-level intelligence
Target cybersecurity workloads
Require stronger safeguards

Anthropic plans broader availability after completing safety evaluations.

This suggests Opus 4.8 may be the final major step before Anthropic introduces an entirely new capability tier.

Final Verdict

Claude Opus 4.8 is not a revolutionary jump over Opus 4.7, but it is a meaningful upgrade in the areas that matter most to developers.

Its strengths include:

✅ Better coding performance

✅ Improved agent reliability

✅ Stronger long-context handling

✅ Better tool usage

✅ More honest reasoning

✅ Dynamic Workflows in Claude Code

✅ 1M token context window

✅ Effort control

✅ Faster execution options

For developers using Claude Code, Cursor, IDE agents, autonomous coding systems, or enterprise AI workflows, Claude Opus 4.8 is currently one of the strongest AI models available in production.

The combination of stronger reasoning, improved honesty, large-context understanding, and scalable agent workflows makes it a compelling choice for teams building the next generation of AI-powered software.

CVE Lite CLI: The Dependency Scanner That Actually Tells You What to Run (Not Just What’s Broken)

TechLatest — Mon, 25 May 2026 17:24:25 +0000

Last week, I was 20 minutes from pushing a hotfix. CI passed. Tests green. Then Dependabot pinged: “12 vulnerabilities found.”

I clicked through. Got a list of CVE IDs. No fix commands. No “upgrade this, not that.” Just a wall of red and a vague sense of dread.

I spent the next hour:

Googling each CVE
Checking if it was direct or transitive
Figuring out which parent package to bump
Testing if the upgrade broke anything
Finally, writing the right npm install command

By the time I pushed, the “quick fix” wasn’t quick at all.

If you’ve shipped JavaScript or TypeScript, you know this feeling. The gap between “something’s vulnerable” and “here’s exactly what to run to fix it” is where good intentions go to die.

That’s the exact problem CVE Lite CLI tries to solve.

It’s not another dashboard. Not another CI gate that blocks your PR at 2 AM. It’s a lightweight, local-first CLI that reads your lockfile, checks for known vulnerabilities, and spits out copy-and-run fix commands.

No account. No config. No source code leaves your machine.

I installed it yesterday. Scanned a few real projects. Here’s what actually happened — and whether it’s worth adding to your workflow.

First Things First: What Is This Thing, Really?

CVE Lite CLI is an OWASP Incubator Project — peer-reviewed by the same org behind the OWASP Top 10 — that scans your package-lock.json, pnpm-lock.yaml, yarn.lock, or bun.lock for known vulnerabilities.

But here’s the twist: instead of dumping a list of CVE IDs and calling it a day, it gives you:

✅ Copy-and-run fix commands: npm install pkg@version, pnpm add pkg@version, etc.

✅ Direct vs. transitive visibility: shows whether the vulnerability is in something you installed or buried three levels deep

✅ Parent-aware remediation: for transitive deps, it tells you whether npm update is enough or whether you need to bump the parent itself

✅ Offline mode: sync the advisory DB once, scan forever with zero network calls

✅ Usage-aware filtering: optionally check whether vulnerable packages are actually imported in your code (cuts noise fast)

It’s built for the moment right before you push: fast, honest, and actionable.
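The usage-aware idea can be approximated in a few lines. This is a rough Python sketch, not CVE Lite CLI's implementation (the article does not show how the tool does it): it assumes a simple regex over JS/TS source text is good enough, and the names `imported_packages` and `package_name` are hypothetical.

```python
import re
from typing import Set

# Matches require("x"), import ... from "x", and bare import "x".
IMPORT_RE = re.compile(r"""(?:require\(\s*|from\s+|import\s+)['"]([^'"]+)['"]""")


def package_name(specifier: str) -> str:
    """Reduce an import specifier to its package name ("lodash/fp" -> "lodash")."""
    parts = specifier.split("/")
    if specifier.startswith("@") and len(parts) >= 2:
        return "/".join(parts[:2])  # scoped package: @scope/name
    return parts[0]


def imported_packages(source: str) -> Set[str]:
    """Collect package names referenced in JS/TS source text."""
    found = set()
    for specifier in IMPORT_RE.findall(source):
        if specifier.startswith((".", "/")):
            continue  # relative or absolute path, not a package
        found.add(package_name(specifier))
    return found


sample = '''
const express = require("express");
import fp from "lodash/fp";
import { helper } from "./utils";
'''

vulnerable = {"lodash", "express", "qs"}
# Packages that are both flagged and referenced directly in source.
print(sorted(vulnerable & imported_packages(sample)))  # ['express', 'lodash']
```

A regex pass like this misses dynamic imports built from variables and does not tell you whether the vulnerable code path is reached, so treat it as a noise filter rather than proof of safety.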

Why This Feels Different (The Philosophy)

Most security tooling is designed for pipelines, not people.

Dependabot files PRs you’ll merge eventually. CI scanners block builds hours after you’ve context-switched. Dashboards surface CVE IDs with no clear path to resolution.

By the time you see a finding, the code is already reviewed, the momentum is gone, and you’re just trying to unblock the merge.

CVE Lite CLI flips that. It assumes:

“The best time to fix a vulnerable dependency is when you’re already in the terminal, about to push — not after CI fails.”

So it runs locally. It’s fast. It gives you the exact command to run. And it gets out of your way.

That’s not flashy. But it’s how real developers work.

Step 1: Installing CVE Lite CLI

Getting started takes less than a minute. No accounts, no cloud onboarding, no configuration files.

# Create a working directory
mkdir cve-lite-blog-test
cd cve-lite-blog-test

# Verify local environment
npm -v
# 10.8.2

node -v
# v20.20.2

# Install globally
npm install -g cve-lite-cli

The install pulls in ~43 packages and completes in ~16 seconds on a standard connection. A deprecation warning for prebuild-install may appear; this is a transitive dependency notice and doesn’t block functionality. npm may also surface a version update prompt; neither requires action to run the scanner.

Step 2: Preparing a Controlled Test Environment

To evaluate CVE Lite CLI against a known baseline, we scaffolded a minimal Node.js project and intentionally installed dependency versions with documented vulnerabilities.

# Initialize a default package.json
npm init -y

# Install known vulnerable versions for testing
npm install lodash@4.17.20 express@4.17.1

npm init -y generates a standard package.json with default fields. The subsequent install pulls in lodash@4.17.20 and express@4.17.1, along with their transitive dependencies.

npm’s built-in audit immediately flags the risk:

Added 51 packages, and audited 52 packages in 2s

8 vulnerabilities (3 low, 5 high)

To address all issues, run:

npm audit fix

This output is familiar to any JavaScript developer. It confirms vulnerabilities exist and suggests a bulk fix command. However, it doesn’t clarify which vulnerabilities are direct vs. transitive, whether npm audit fix will introduce breaking changes, or which parent packages actually need updating.

This is where CVE Lite CLI’s workflow diverges. Instead of a generic fix suggestion, it parses the same lockfile and returns a structured remediation plan with package-manager-aware commands, dependency path context, and severity prioritization.
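To make the direct-versus-transitive distinction concrete, here is a minimal Python sketch that reads the "packages" map of an npm lockfile (assuming the v2/v3 layout) and labels each entry. It illustrates the idea only and is not CVE Lite CLI's code; the function name `classify_lockfile` and the trimmed lockfile are made up for this example.

```python
import json
from typing import Dict, Tuple


def classify_lockfile(lock: dict) -> Dict[str, Tuple[str, str]]:
    """Map each install path to (version, "direct" | "transitive").

    Assumes the npm lockfile v2/v3 layout, where "packages" is keyed by
    install path and the "" entry describes the project itself.
    """
    packages = lock.get("packages", {})
    root = packages.get("", {})
    direct = set(root.get("dependencies", {})) | set(root.get("devDependencies", {}))

    result = {}
    for path, meta in packages.items():
        if not path:
            continue  # skip the root project entry
        # "node_modules/a/node_modules/b" -> "b"
        name = path.split("node_modules/")[-1]
        top_level = path == f"node_modules/{name}"
        kind = "direct" if top_level and name in direct else "transitive"
        result[path] = (meta.get("version", "?"), kind)
    return result


# A trimmed, hand-written lockfile for illustration only.
lock = json.loads("""
{
  "lockfileVersion": 3,
  "packages": {
    "": {"dependencies": {"express": "4.17.1", "lodash": "4.17.20"}},
    "node_modules/express": {"version": "4.17.1"},
    "node_modules/lodash": {"version": "4.17.20"},
    "node_modules/body-parser": {"version": "1.19.0"}
  }
}
""")

for path, (version, kind) in classify_lockfile(lock).items():
    print(f"{path}@{version}: {kind}")
```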

Step 3: Running the First Scan (And Dealing With Unexpected Results)

With the test project ready, we ran the initial CVE Lite CLI scan:

cve-lite .

The output was immediate:

CVE Lite CLI (1.17.3)
✓ Scan dependencies
✓ Highlight critical issues
✓ Show a clear fix plan

Fast. Local. Developer-first.

Advisory source: OSV (https://api.osv.dev)
Parsed 69 packages from package-lock (package-lock.json)
✓ Queried OSV in 1 batch
✓ Scan complete. No known vulnerabilities found.

npm audit just reported 8 vulnerabilities, but CVE Lite found none.

This isn’t a bug. It’s a feature of how different vulnerability databases work:

npm audit checks against the npm security advisory database, which includes npm-specific metadata and sometimes broader matching rules
CVE Lite CLI queries the OSV (Open Source Vulnerabilities) database, which aggregates advisories from multiple sources across ecosystems in a common schema

The discrepancy likely means:

npm’s database has broader matching (e.g., flagging version ranges rather than exact versions)
Some npm advisories haven’t been mirrored to OSV yet
The resolved tree may not contain the versions you asked for
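It helps to see what an OSV lookup is actually asked. The sketch below builds a request body for OSV's batch query endpoint (POST /v1/querybatch) from a name-to-version map; it makes no network call. How CVE Lite CLI constructs its own requests is not shown in this article, so treat `build_osv_batch` as a hypothetical helper.

```python
import json
from typing import Dict, List


def build_osv_batch(packages: Dict[str, str]) -> dict:
    """Build a request body for OSV's batch query endpoint (POST /v1/querybatch)."""
    queries: List[dict] = [
        {"package": {"name": name, "ecosystem": "npm"}, "version": version}
        for name, version in sorted(packages.items())
    ]
    return {"queries": queries}


payload = build_osv_batch({"lodash": "4.17.20", "express": "4.17.1"})
print(json.dumps(payload, indent=2))
```

Each query names one exact version, so the answer depends entirely on which versions ended up in the lockfile, not on the versions typed into the install command.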

To verify what’s actually installed:

npm list lodash express

This shows the exact resolved versions in the dependency tree. If the resolved versions are newer than the ones you requested, the vulnerable versions might already be gone.

Step 4: Forcing the Vulnerable Baseline (Why npm “Helped” Too Much)

The npm list output confirms what happened:

express@4.22.2
lodash@4.18.1

Instead of express@4.17.1 and lodash@4.17.20, the resolved tree contained the latest patched releases within each major range. By default npm saves a caret range (such as ^4.17.1) to package.json rather than the exact version, so whenever the tree is resolved without a lockfile pinning it, newer compatible releases can be picked up. Ending up on patched versions is exactly what you want in production.
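Unless --save-exact is passed, npm records a caret range in package.json, and a caret range accepts any newer release with the same major version. Here is a simplified Python sketch of that rule, assuming plain major.minor.patch versions at 1.0.0 or above (pre-release tags and 0.x ranges are deliberately not handled); the function names are illustrative.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse a plain "major.minor.patch" version (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def caret_allows(base: str, candidate: str) -> bool:
    """Return True if candidate satisfies ^base, for base versions >= 1.0.0."""
    b, c = parse(base), parse(candidate)
    if b[0] == 0:
        raise ValueError("0.x caret ranges follow narrower rules; not handled here")
    return c[0] == b[0] and c >= b


print(caret_allows("4.17.1", "4.22.2"))   # True: same major, newer release
print(caret_allows("4.17.20", "4.18.1"))  # True
print(caret_allows("4.17.1", "5.0.0"))    # False: major version changed
```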

For testing purposes, however, it means our dependency tree is already clean. To demonstrate CVE Lite CLI’s remediation workflow, we need to pin the exact vulnerable versions and prevent automatic resolution.

# Remove existing modules and lockfile to start fresh
rm -rf node_modules package-lock.json

# Force exact vulnerable versions in package.json
npm install lodash@4.17.20 express@4.17.1 --save-exact

# Verify the resolved versions
npm list lodash express

Expected output:

cve-lite-blog-test@1.0.0 /path/to/project
├── express@4.17.1
└── lodash@4.17.20

With the vulnerable baseline locked in place, we can now run CVE Lite CLI against a dependency tree that actually contains known advisory matches.
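The same check can be scripted so a test baseline never drifts silently. This is a minimal Python sketch that compares expected pins against the resolved versions in a lockfile, assuming the npm v2/v3 "packages" layout; `pin_mismatches` is a hypothetical helper, and the two sample trees reuse the versions from the npm list outputs in this walkthrough.

```python
from typing import Dict


def pin_mismatches(lock: dict, expected: Dict[str, str]) -> Dict[str, str]:
    """Return {package: resolved_version} for pins the lockfile does not honor.

    Assumes the npm lockfile v2/v3 "packages" layout.
    """
    packages = lock.get("packages", {})
    mismatches = {}
    for name, wanted in expected.items():
        resolved = packages.get(f"node_modules/{name}", {}).get("version", "missing")
        if resolved != wanted:
            mismatches[name] = resolved
    return mismatches


expected = {"express": "4.17.1", "lodash": "4.17.20"}

# Versions taken from the two npm list outputs in this walkthrough.
floated = {"packages": {"node_modules/express": {"version": "4.22.2"},
                        "node_modules/lodash": {"version": "4.18.1"}}}
pinned = {"packages": {"node_modules/express": {"version": "4.17.1"},
                       "node_modules/lodash": {"version": "4.17.20"}}}

print(pin_mismatches(floated, expected))  # {'express': '4.22.2', 'lodash': '4.18.1'}
print(pin_mismatches(pinned, expected))   # {}
```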

Figure: Terminal showing npm list output with express@4.22.2 and lodash@4.18.1, followed by the clean reinstall and verification commands.

Next: Running cve-lite . against the pinned vulnerable versions to capture the actual findings, dependency path context, and generated fix commands.

Step 5: Running the Scan Against a Vulnerable Baseline (And Reading the Output)

After pinning the exact vulnerable versions (lodash@4.17.20 and express@4.17.1) and regenerating the lockfile, we ran the scanner:

cve-lite .

Here’s the actual output from our test environment:

>_ CVE Lite CLI (1.17.3)
────────────────────────────────
✔ Scan dependencies
✔ Highlight critical issues
✔ Show a clear fix plan

Fast. Local. Developer-first.

Advisory source: OSV (https://api.osv.dev)
Parsed 51 packages from package-lock (package-lock.json)
✓ Queried OSV in 1 batch
✓ Loaded 17 vulnerability detail records
⠙ Analyzing vulnerability findings 1/14: validating fix target for body-parser
⠹ Analyzing vulnerability findings 2/14: validating fix target for cookie@0.4.
⠸ Analyzing vulnerability findings 2/14: validating fix target for cookie@0.4.
⠼ Analyzing vulnerability findings 3/14: validating fix target for express@4.1
⠴ Analyzing vulnerability findings 4/14: validating fix target for lodash@4.17
⠦ Analyzing vulnerability findings 4/14: validating fix target for lodash@4.17
⠧ Analyzing vulnerability findings 5/14: validating fix target for path-to-reg
⠋ Analyzing vulnerability findings 7/14: validating fix target for send@0.17.1
⠙ Analyzing vulnerability findings 8/14: validating fix target for serve-stati
⠹ Analyzing vulnerability findings 8/14: validating fix target for serve-stati
⠸ Analyzing vulnerability findings 9/14: resolving remediation for body-parser
⠼ Analyzing vulnerability findings 10/14: resolving remediation for cookie@0.4
⠴ Analyzing vulnerability findings 11/14: resolving remediation for path-to-re
⠦ Analyzing vulnerability findings 12/14: resolving remediation for qs@6.7.0..
⠧ Analyzing vulnerability findings 13/14: resolving remediation for send@0.17.
⠇ Analyzing vulnerability findings 14/14: resolving remediation for serve-stat
✓ Analyzed vulnerability findings

────────────────────────────────
📦 Vulnerabilities found
────────────────────────────────

HIGH lodash@4.17.20
            Direct dependency
            Fix: upgrade to 4.18.0

HIGH body-parser@1.19.0
            Transitive dependency
            Fix: upgrade express to 4.22.0

HIGH path-to-regexp@0.1.7
            Transitive dependency
            Fix: upgrade express to 4.22.0

────────────────────────────────
🛠 Copy And Run These Fix Commands
────────────────────────────────

Detected package manager: npm (package-lock.json)
1 command group ready across 2 packages (1 high).
Validation: scanned 3 package versions; 2 are still known vulnerable.

High severity fix commands
> npm install express@4.22.0 lodash@4.18.0

────────────────────────────────
Summary
────────────────────────────────

8 packages · 17 CVEs
4 high · 1 medium · 3 low
2 direct · 6 transitive

✖ Scan complete. 4 urgent issues found.
Run with --verbose for fix plan, paths, and full table.

How to Read This Output (Without Getting Overwhelmed)

The scan completes in under 3 seconds and structures findings around action, not just awareness.

| Section | What it tells you | Why it matters for engineering teams |
| ----------------------------------------------- | ---------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| `Parsed 51 packages` | Scope of the dependency tree | Confirms the scanner is analyzing your actual lockfile, not a cached snapshot |
| `HIGH / MEDIUM / LOW` | Severity tier mapped to CVSS/OSV scoring | Enables triage by business impact, not just vulnerability count |
| `Direct dependency` / `Transitive dependency` | Ownership context | Tells you whether your team controls the fix or needs to coordinate with a parent package maintainer |
| `Fix: upgrade to X.Y.Z` | Target version for the fix, carried into the package-manager-aware command | Copy, paste, run. No advisory page hunting, no version guessing |
| `1 command group ready across 2 packages` | Consolidated remediation | Instead of multiple separate `npm install` commands, you get one grouped command that resolves multiple findings |

Key observation: The scanner identified that upgrading express to 4.22.0 resolves both the body-parser and path-to-regexp transitive vulnerabilities. This parent-aware logic prevents the common anti-pattern of manually pinning transitive dependencies, which often breaks future semver resolution or introduces compatibility drift.
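The grouping step is simple to picture in code. Below is a hedged Python sketch of one way to collapse findings into a single command: transitive findings are redirected to their parent, and the highest required version per package wins. This is an illustration under those assumptions, not the tool's actual algorithm, and `fix_command` and the tuple layout are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

# (package, version to install, parent to upgrade or None for direct deps).
# For transitive findings the version refers to the parent package.
Finding = Tuple[str, str, Optional[str]]


def version_key(version: str) -> Tuple[int, ...]:
    """Turn "4.22.0" into (4, 22, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def fix_command(findings: List[Finding], manager: str = "npm install") -> str:
    """Collapse findings into one install command, one target per package."""
    targets: Dict[str, str] = {}
    for package, target, parent in findings:
        name = parent or package  # transitive findings are fixed via the parent
        if name not in targets or version_key(target) > version_key(targets[name]):
            targets[name] = target  # keep the highest required version
    return manager + " " + " ".join(f"{n}@{v}" for n, v in sorted(targets.items()))


findings = [
    ("lodash", "4.18.0", None),
    ("body-parser", "4.22.0", "express"),
    ("path-to-regexp", "4.22.0", "express"),
]
print(fix_command(findings))  # npm install express@4.22.0 lodash@4.18.0
```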

What This Means for Your Workflow

Before CVE Lite CLI, resolving these four high-severity findings would typically involve:

Opening each CVE link in a browser
Checking whether the vulnerability applies to your usage pattern
Determining if the package is direct or transitive
Researching the minimum safe version for each dependency
Constructing the correct npm install or npm update command
Testing whether the upgrade introduces breaking changes

With CVE Lite CLI, that workflow collapses to:

Run cve-lite .
Copy the suggested command: npm install express@4.22.0 lodash@4.18.0
Run it
Rescan to verify

That’s not automation replacing judgment. It’s tooling removing friction so engineers can focus on what actually requires human insight: impact assessment, compatibility testing, and release coordination.

Figure: Terminal output showing the structured finding list with severity badges, dependency types, and the consolidated fix command.

Step 6: Applying the Fix and Verifying the Result (Real Iterative Workflow)

CVE Lite CLI surfaced four high-severity findings and returned a consolidated remediation command. We applied the fix:

# Apply the consolidated fix command from Step 5
npm install express@4.22.0 lodash@4.18.0

npm upgraded both packages, updated the lockfile, and reinstalled affected transitive dependencies. Then we rescanned to verify:

cve-lite .

Here’s the actual output after the first round of fixes:

>_ CVE Lite CLI (1.17.3)
────────────────────────────────
✔ Scan dependencies
✔ Highlight critical issues
✔ Show a clear fix plan

Fast. Local. Developer-first.

Advisory source: OSV (https://api.osv.dev)
Parsed 70 packages from package-lock (package-lock.json)
Cache: 51 package match records, 17 advisory detail records
✓ Queried OSV in 1 batch
✓ Loaded 1 vulnerability detail record
✓ Analyzed vulnerability findings

────────────────────────────────
📦 Vulnerabilities found
────────────────────────────────

────────────────────────────────
🛠 Copy And Run These Fix Commands
────────────────────────────────

Detected package manager: npm (package-lock.json)
1 command group ready across 1 package (1 medium).

Medium severity parent upgrades
> npm install express@4.22.2

────────────────────────────────
Summary
────────────────────────────────

1 package · 1 CVE
1 medium
0 direct · 1 transitive

▲ Scan complete. 1 issue found.
Run with --verbose for fix plan, paths, and full table.

What This Output Tells You (And Why It’s Actually Good News)

| Observation | What it means | Why it matters |
| ----------------------------------------------------- | ----------------------------------------------------- | ---------------------------------------------------------------- |
| `Parsed 70 packages (up from 51)` | New dependencies resolved during upgrade | Confirms the lockfile reflects the actual installed tree |
| `Loaded 1 vulnerability detail record (down from 17)` | Most findings resolved by the first fix | Shows measurable progress, not just “still broken” |
| `1 medium severity (down from 4 high)` | Risk reduced, not eliminated | Realistic expectation: remediation is iterative |
| `0 direct · 1 transitive` | Remaining issue is in a dependency of a dependency | Tells you the fix requires updating a parent, not pinning a leaf |
| `npm install express@4.22.2` | Consolidated command to resolve the remaining finding | One command, not three. Less cognitive load |

Key insight: Dependency remediation is rarely a one-shot operation. You fix the highest-severity issues, rescan, and address the next layer. CVE Lite CLI makes this iterative loop visible and actionable instead of hiding it behind a generic “run npm audit fix” suggestion.
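The fix, rescan, repeat loop can be written down as a small control structure. The Python sketch below takes the scan and apply steps as plain callables, because the article does not document machine-readable output or exit codes for the CLI; the stand-in functions simply replay the two rounds from this walkthrough, and every name here is hypothetical.

```python
from typing import Callable, List


def remediate(scan: Callable[[], List[str]],
              apply: Callable[[str], None],
              max_rounds: int = 5) -> int:
    """Scan, apply each suggested command, and rescan until clean.

    Returns the number of fix rounds that were needed.
    """
    for round_number in range(max_rounds + 1):
        commands = scan()
        if not commands:
            return round_number  # tree is clean
        if round_number == max_rounds:
            break
        for command in commands:
            apply(command)
    raise RuntimeError(f"still vulnerable after {max_rounds} rounds")


# Stand-ins that replay the two rounds from this walkthrough.
pending = [["npm install express@4.22.0 lodash@4.18.0"],
           ["npm install express@4.22.2"]]
applied: List[str] = []


def fake_scan() -> List[str]:
    return pending[0] if pending else []


def fake_apply(command: str) -> None:
    applied.append(command)
    if pending and command in pending[0]:
        pending.pop(0)


print(remediate(fake_scan, fake_apply))  # 2
```

The round limit matters in practice: an upgrade can surface new advisories, and a hard stop keeps the loop from running indefinitely.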

Step 7: Applying the Final Fix

The scanner recommends a single command to resolve the remaining medium-severity finding:

# Apply the final parent upgrade
npm install express@4.22.2

Then rescan to confirm the tree is clean:

cve-lite .

Expected clean output:

>_ CVE Lite CLI (1.17.3)
────────────────────────────────
✔ Scan dependencies
✔ Highlight critical issues
✔ Show a clear fix plan

Fast. Local. Developer-first.

Advisory source: OSV (https://api.osv.dev)
Parsed 70 packages from package-lock (package-lock.json)
Cache: 51 package match records, 17 advisory detail records
✓ Queried OSV in 1 batch
✓ Loaded 0 vulnerability detail records
✓ Analyzed vulnerability findings

────────────────────────────────
📦 Vulnerabilities found
────────────────────────────────
✓ No known vulnerabilities found.

────────────────────────────────
Summary
────────────────────────────────

0 packages · 0 CVEs
0 high · 0 medium · 0 low

✓ Scan complete. All dependencies clean.

Verification: Cross-Check with npm Audit (Optional but Recommended)

To ensure alignment between scanning tools, cross-check with npm’s built-in audit:

npm audit
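For a scripted cross-check, npm can emit the audit as JSON (npm audit --json). The Python sketch below counts vulnerable packages by severity, assuming the npm 7+ report shape with a top-level "vulnerabilities" map; the sample data is hand-written for illustration and `summarize_audit` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_audit(report_json: str) -> Counter:
    """Count vulnerable packages by severity from `npm audit --json` output.

    Assumes the npm 7+ report shape with a top-level "vulnerabilities" map.
    """
    report = json.loads(report_json)
    return Counter(entry["severity"] for entry in report.get("vulnerabilities", {}).values())


# Hand-written sample in the assumed shape, not real audit output.
sample = json.dumps({
    "auditReportVersion": 2,
    "vulnerabilities": {
        "lodash": {"severity": "high", "isDirect": True},
        "qs": {"severity": "high", "isDirect": False},
        "cookie": {"severity": "low", "isDirect": False},
    },
})

counts = summarize_audit(sample)
print(dict(counts))
# A clean tree yields an empty counter.
print(summarize_audit('{"vulnerabilities": {}}') == Counter())
```

When comparing the two tools, note that npm labels the middle tier "moderate" while the CVE Lite CLI output above prints "medium", so match on tiers rather than on exact strings.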

What This Means for Your Release Workflow

Before CVE Lite CLI, verifying a multi-stage fix required:

Running npm audit fix or manually constructing upgrade commands
Waiting for CI to re-run and report results
Checking dashboards to confirm findings were resolved
Often repeating the cycle if new transitive issues surfaced

With CVE Lite CLI, the loop collapses to:

Run cve-lite . → get fix command
Apply fix → rescan locally in seconds
Push when clean

That shift — from “wait for CI to tell me it’s broken” to “verify before I push” — is what reduces release friction and prevents vulnerable code from reaching review queues in the first place.

Figure: Terminal output showing post-fix scan with “No known vulnerabilities found” and clean npm audit output.

Step 8: Generating a Shareable HTML Report (For Compliance and Team Visibility)

Once the dependency tree is clean — or while findings still need remediation — teams often need to document the security posture for compliance audits, stakeholder updates, or handoff to other engineers. CVE Lite CLI can generate a self-contained HTML report that consolidates findings, fix commands, and severity summaries in a shareable format.

# Generate and automatically open HTML report
cve-lite . --report

Step 9: Testing Against Real-World Repositories (Beyond the Toy Project)

The minimal test project proved the workflow works. But engineering teams care about how tools behave against real codebases with complex dependency trees, monorepos, and transitive chains.

We tested CVE Lite CLI against three real projects to see how it scales:

Option A: OWASP Juice Shop (Deliberately Vulnerable)

OWASP Juice Shop is a deliberately insecure Node.js application designed for security training. It’s the perfect safe, legal target for testing vulnerability scanners.

# Clone Juice Shop
git clone https://github.com/juice-shop/juice-shop.git
cd juice-shop

# Install dependencies (this pulls in known vulnerable packages)
npm install

# Run CVE Lite CLI scan
cve-lite .

# Generate verbose output with full dependency paths
cve-lite . --verbose

# Create HTML report for documentation
cve-lite . --report ./juice-shop-report --no-open

Auto-Open in Browser

# Scan and automatically open report in your default browser
cve-lite . --report

This will:

Generate the HTML report in the ./report directory (relative to your current working directory)
Automatically open report/index.html in your system's default browser
Keep the terminal free for other commands

Final Thoughts

Most vulnerability scanners are good at telling developers what’s broken.

Far fewer are good at telling them what to actually do next.

That’s where CVE Lite CLI feels different.

After testing it across both controlled environments and real-world repositories, the biggest takeaway wasn’t just that it detected vulnerabilities correctly — most modern scanners can do that. The real value was how much friction it removed from the remediation process itself.

Instead of:

digging through advisory pages
tracing transitive dependency chains manually
guessing safe upgrade versions
constructing install commands by hand

The workflow became:

cve-lite .

Copy the suggested fix command.

Run it.

Rescan.

Done.

That sounds simple, but simplicity is exactly what modern dependency security tooling has been missing.

The project also gets an important philosophical point right: developers are far more likely to fix vulnerabilities when the feedback loop happens locally, immediately, and inside their normal workflow — not hours later in a failing CI pipeline or buried inside a security dashboard.

And because the tool:

works offline
supports npm, pnpm, Yarn, and Bun
understands transitive remediation paths
integrates with SARIF and CI pipelines
generates shareable HTML reports
and now even plugs into AI coding assistants

…it fits naturally into both solo developer workflows and larger engineering environments.

Is it a replacement for full AppSec platforms? No.

It won’t detect malware hidden in packages before advisories exist. It won’t replace SAST, DAST, container scanning, SBOM management, or runtime protection. And it shouldn’t.

What it does instead is narrower — and arguably more useful day-to-day:

It helps developers fix dependency vulnerabilities faster, with less noise and less guesswork.

That’s a surprisingly important gap in the JavaScript ecosystem.

If your current workflow involves waiting for CI to fail, opening five browser tabs for every CVE, and manually piecing together remediation commands, CVE Lite CLI is absolutely worth testing.

Because at the end of the day, the best security tool is usually the one developers will actually use before they push code.
