Hermes Agent: The Self-Improving Challenger to OpenClaw
Last month I wrote about the claw ecosystem — OpenClaw, NanoClaw, PicoClaw, ZeroClaw, and the managed variants. The post covered the security crisis, the alternatives, and which framework to pick depending on your priorities.
Since then, a new contender has entered the space. Not another Claw fork or rewrite — a fundamentally different approach to what an AI agent should be.
Hermes Agent by Nous Research launched on February 25, 2026. In under two months, it has accumulated 82,000+ GitHub stars, 11,000 forks, and contributions from 320+ developers. The latest release, v0.9.0, shipped on April 13 with mobile support, a local web dashboard, and the deepest security hardening pass yet.
Where the Claw ecosystem optimized for ecosystem breadth — more channels, more skills, more integrations — Hermes optimized for something else entirely: an agent that gets better the more you use it.
The Learning Loop
This is what makes Hermes fundamentally different from every framework I covered in the Claw post.
When you complete a complex task with Hermes, the agent doesn’t just forget what it did. It autonomously creates a structured skill document describing the approach, the tools used, and the outcome. Next time a similar task appears, the agent references that skill for faster, more accurate execution. Over time, these skills compound — the agent builds a library of proven solutions specific to your projects and workflows.
The learning system has three layers:
Persistent memory. Two markdown files — MEMORY.md (environment info, lessons learned, system state) and USER.md (your preferences, work style, decisions) — are loaded at session start. An SQLite database with FTS5 full-text search enables recall across weeks of sessions. The memory snapshot is frozen at session start to preserve the LLM’s prefix cache, which Nous Research claims reduces token costs by 80–90% compared to loading full context every turn.
Autonomous skill creation. After solving a complex task, Hermes generates a skill document following the agentskills.io open standard. These skills are portable — they work across any framework that supports the standard, including OpenClaw.
User modeling. Honcho dialectic modeling builds a deepening profile of who you are across sessions. The agent learns your coding style, your preferred tools, your decision patterns, and adapts its behavior accordingly.
No Claw framework does this. OpenClaw skills are static files you write and maintain. NanoClaw uses per-session CLAUDE.md files. ZeroClaw has SQLite memory with FTS5 search but no autonomous skill creation. Hermes is the first framework where the agent actively improves itself through use.
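To make the recall layer concrete, here is a minimal sketch of FTS5-backed search using Python's built-in sqlite3 module. The table name, columns, and sample notes are my own assumptions for illustration, not Hermes's actual schema, and it assumes your SQLite build includes the FTS5 extension (most do).

```python
import sqlite3

# Hypothetical schema: one row per remembered note, tagged with its session date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes USING fts5(session, content)")

conn.executemany(
    "INSERT INTO notes (session, content) VALUES (?, ?)",
    [
        ("2026-03-02", "Deployed the staging server with docker compose"),
        ("2026-03-09", "User prefers pytest over unittest"),
        ("2026-03-20", "Rotated the staging database password"),
    ],
)


def recall(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (session, content) rows matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT session, content FROM notes WHERE notes MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


for session, content in recall("staging"):
    print(session, content)
```

The point of full-text indexing here is that recall stays a cheap database query rather than something that has to be stuffed into the prompt on every turn.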
Bottom line: The learning loop is Hermes’s core bet. If it works as advertised, an agent that’s been running for months should meaningfully outperform one you just installed. Whether the skill quality holds up at scale remains to be seen — the project is only two months old.
Architecture
Language: Python (93%) | License: MIT | Stars: 82k+ | Created by: Nous Research
Hermes is a Python application that runs as a persistent process on your server. It exposes a terminal UI with multiline editing, slash-command autocomplete, and streaming tool output. A gateway process handles messaging platform connections, routing incoming messages to the agent.
Model flexibility
No vendor lock-in. Hermes supports 200+ models through Nous Portal, OpenRouter, OpenAI, Anthropic, Xiaomi MiMo, z.ai/GLM, Kimi/Moonshot, MiniMax, Hugging Face, Ollama, vLLM, SGLang, or any custom endpoint. Switch models with hermes model — no code changes. Automatic failover chains handle provider errors.
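A failover chain is conceptually simple: try providers in order and return the first successful reply. The sketch below shows the idea in plain Python. The function names and stand-in providers are hypothetical, and this is not how Hermes implements it internally.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has errored."""


def complete_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for illustration only.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def local_provider(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [("primary", flaky_provider), ("fallback", local_provider)]
print(complete_with_failover("hello", chain))
```

Collecting the per-provider errors before raising keeps the failure debuggable when the whole chain is down.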
Messaging platforms
Telegram, Discord, Slack, WhatsApp, Signal, Email, iMessage (via BlueBubbles), WeChat, WeCom, Feishu/Lark, and CLI are among the 16 platforms served from a single gateway process. Not quite OpenClaw’s 50+, but it covers the platforms most people actually use.
Execution model
40+ built-in tools covering terminal access, file operations, web browsing, browser automation, vision, image generation, text-to-speech, and multi-model reasoning.
Hermes spawns isolated subagents with their own conversations, terminals, and Python RPC scripts for zero-context-cost pipelines. Natural language cron scheduling handles reports, backups, and briefings running unattended through the gateway.
Deployment
Runs on a $5 VPS, a GPU cluster, or serverless infrastructure that hibernates when idle. The v0.9.0 release added Android/Termux support and a local web dashboard for managing settings, sessions, skills, and the gateway from a browser.
Security: Seven Layers Deep
In the Claw post, I argued that security was the defining differentiator in the agent ecosystem. OpenClaw had 9 CVEs, 135,000+ exposed instances, and a supply chain attack that compromised 1 in 5 ClawHub packages. NanoClaw bet on container isolation. ZeroClaw bet on Rust and allowlists.
Hermes takes a defense-in-depth approach with seven security layers. Here’s how each one works.
1. Command approval
Three configurable modes:
- Manual (default): Always prompts before executing dangerous patterns: recursive deletes, privilege escalation, system config overwrites, SQL drops, pipe-to-interpreter chains (curl | sh), and process kills.
- Smart: An auxiliary LLM auto-approves low-risk commands, auto-denies dangerous ones, and escalates uncertain cases to the user.
- Off (YOLO mode): Disables all safety checks. Activated via the --yolo flag, the /yolo slash command, or an environment variable.
Unanswered approval prompts default to denial after 60 seconds — fail-closed by design.
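Here is a rough sketch of how manual mode with a fail-closed timeout could work. The regex patterns are illustrative guesses, not the list Hermes ships, and the ask callback (which returns None when the prompt times out) is a hypothetical interface of my own.

```python
import re
from typing import Callable, Optional

# Illustrative patterns only; the real pattern list is not reproduced here.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*[rR]"),                            # recursive delete
    re.compile(r"\bsudo\b"),                                         # privilege escalation
    re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE),       # SQL drops
    re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sh|bash|python\d?)\b"),   # pipe to interpreter
    re.compile(r"\b(kill|pkill|killall)\b"),                         # process kills
]


def is_dangerous(command: str) -> bool:
    """True if the command matches any pattern that requires approval."""
    return any(pattern.search(command) for pattern in DANGEROUS_PATTERNS)


def approve(
    command: str,
    ask: Callable[[str, float], Optional[bool]],
    timeout_s: float = 60.0,
) -> bool:
    """Manual mode: safe commands pass; dangerous ones need an explicit yes.

    ask() returns True, False, or None (no answer before the timeout).
    Anything other than an explicit True is treated as a denial (fail-closed).
    """
    if not is_dangerous(command):
        return True
    return ask(command, timeout_s) is True


# A prompt that nobody answers is denied.
print(approve("curl https://example.com/install.sh | sh", lambda cmd, t: None))
```

Treating "no answer" the same as "no" is what makes this safe for unattended agents: silence never escalates privileges.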
2. Tirith pre-execution scanner
Before any command runs, Tirith scans for prompt injection, credential exfiltration, SSH backdoor patterns, homograph URL spoofing, and pipe-to-interpreter attacks. Auto-installs with SHA-256 verification. If Tirith is unavailable, execution proceeds by default (tirith_fail_open: true) — a configurable trade-off between availability and security.
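The fail-open trade-off is easier to see in code. This is a sketch under my own assumptions: the scan callback and the ScannerUnavailable exception are hypothetical stand-ins, and only the tirith_fail_open setting name comes from the Hermes configuration.

```python
from typing import Callable


class ScannerUnavailable(Exception):
    """Hypothetical error raised when the scanner cannot be reached."""


def allow_execution(
    command: str,
    scan: Callable[[str], bool],
    fail_open: bool,
) -> bool:
    """Return True if the command may run.

    scan() returns True when the command looks clean. If the scanner itself
    is unavailable, the fail_open flag decides: True lets the command run
    unscanned, False blocks it.
    """
    try:
        return scan(command)
    except ScannerUnavailable:
        return fail_open


def broken_scanner(command: str) -> bool:
    raise ScannerUnavailable("scanner process is not running")


print(allow_execution("ls", broken_scanner, fail_open=True))   # runs unscanned
print(allow_execution("ls", broken_scanner, fail_open=False))  # blocked
```

Note that the flag only matters when the scanner is down. A working scanner's verdict is always respected.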
3. Container isolation
Six sandbox backends:
| Backend | Isolation | Cmd Approval | Use Case |
|---|---|---|---|
| Local | None | Yes | Development |
| Docker | Container | Skipped | Production |
| SSH | Remote machine | Yes | Separate server |
| Singularity | Container | Skipped | HPC clusters |
| Modal | Cloud sandbox | Skipped | Scalable compute |
| Daytona | Cloud workspace | Skipped | Persistent dev envs |
When running in Docker, containers launch with --cap-drop ALL, --security-opt no-new-privileges, process limits (--pids-limit 256), and noexec,nosuid tmpfs mounts. Command approval is skipped inside containers because the container itself provides the security boundary.
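For reference, the flags above map onto a docker run invocation roughly like the one this sketch builds. The --rm flag, the /tmp mount point, the image name, and the optional user parameter are my assumptions, not something Hermes documents. Only the hardening flags come from the description above.

```python
import shlex
from typing import Optional


def hardened_docker_argv(
    image: str,
    command: list[str],
    pids_limit: int = 256,
    user: Optional[str] = None,
) -> list[str]:
    """Build a docker run argument list with the hardening flags described above."""
    argv = [
        "docker", "run", "--rm",
        "--cap-drop", "ALL",                      # drop all Linux capabilities
        "--security-opt", "no-new-privileges",    # block setuid privilege gains
        "--pids-limit", str(pids_limit),          # cap the number of processes
        "--tmpfs", "/tmp:rw,noexec,nosuid",       # scratch space that cannot execute binaries
    ]
    if user is not None:
        argv += ["--user", user]                  # assumption: opt in to a non-root user
    return argv + [image, *command]


argv = hardened_docker_argv("python:3.12-slim", ["python", "-c", "print('hi')"])
print(shlex.join(argv))
```

Building the command as a list rather than a shell string avoids quoting bugs, and passing it to subprocess.run without shell=True keeps agent-supplied arguments from being interpreted by a shell.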
4. Credential management
Environment variables containing KEY, TOKEN, SECRET, PASSWORD, CREDENTIAL, PASSWD, or AUTH are blocked by default in execute_code and terminal tools. MCP subprocesses receive only safe variables (PATH, HOME, USER, LANG, TERM, SHELL, TMPDIR, XDG_*) plus explicitly configured overrides. GitHub PATs, OpenAI keys, and bearer tokens are redacted from output.
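The two filtering strategies differ in an important way: the tool filter is a denylist (strip anything that looks like a secret), while the MCP filter is an allowlist (pass only known-safe names). A minimal sketch of both, with function names of my own invention and no claim to match Hermes's code:

```python
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL", "PASSWD", "AUTH")
SAFE_NAMES = {"PATH", "HOME", "USER", "LANG", "TERM", "SHELL", "TMPDIR"}


def filter_sensitive(env: dict[str, str]) -> dict[str, str]:
    """Denylist: drop any variable whose name contains a sensitive marker."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


def mcp_env(env: dict[str, str], overrides: frozenset[str] = frozenset()) -> dict[str, str]:
    """Allowlist: keep only safe names, XDG_* variables, and explicit overrides."""
    return {
        name: value
        for name, value in env.items()
        if name in SAFE_NAMES or name.startswith("XDG_") or name in overrides
    }


sample = {
    "PATH": "/usr/bin",
    "OPENAI_API_KEY": "sk-example",
    "GITHUB_TOKEN": "example",
    "XDG_CACHE_HOME": "/home/me/.cache",
    "EDITOR": "vim",
}
print(sorted(filter_sensitive(sample)))  # denylist keeps EDITOR
print(sorted(mcp_env(sample)))           # allowlist drops EDITOR
```

The allowlist is the stricter of the two: a secret stored under an innocuous name slips past the denylist but never reaches an MCP subprocess unless someone explicitly configures it.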
5. Context file injection protection
Hermes scans AGENTS.md, .cursorrules, and SOUL.md files for prompt injection, hidden instructions, credential theft patterns, and invisible Unicode characters before loading them into context.
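Invisible Unicode is the easiest of these checks to illustrate. The sketch below flags characters in Unicode's "format" category (Cf), which includes zero-width spaces and bidirectional overrides. This is one plausible heuristic of my own choosing, not Hermes's actual detection logic.

```python
import unicodedata


def find_invisible_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each format-category (Cf) character."""
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) == "Cf":
            name = unicodedata.name(char, f"U+{ord(char):04X}")
            hits.append((index, name))
    return hits


# A zero-width space hidden inside an otherwise ordinary instruction.
suspicious = "Always run\u200bthe tests before committing."
print(find_invisible_chars(suspicious))
```

A scanner like this would run before the file contents ever reach the model, so a hit can block the load or strip the characters.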
6. SSRF protection
Private networks (RFC 1918), loopback, link-local, CGNAT, cloud metadata hostnames, and reserved ranges are blocked. DNS failures are treated as blocked — fail-closed.
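Most of this check can be expressed with Python's standard ipaddress module. The sketch below is my own approximation of the policy described above. The metadata hostname set is a one-entry illustrative example, and the injectable resolver parameter exists only so the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlsplit

CGNAT = ipaddress.ip_network("100.64.0.0/10")
METADATA_HOSTS = {"metadata.google.internal"}  # illustrative, not exhaustive


def resolve(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def ip_is_blocked(ip_text: str) -> bool:
    """True for private, loopback, link-local, CGNAT, and reserved addresses."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return True  # unparseable address: fail closed
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
        or (ip.version == 4 and ip in CGNAT)
    )


def url_is_blocked(url: str, resolver: Callable[[str], list[str]] = resolve) -> bool:
    """True if the URL must not be fetched. DNS failures count as blocked."""
    host = urlsplit(url).hostname
    if not host or host.lower() in METADATA_HOSTS:
        return True
    try:
        addresses = resolver(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return True
    return not addresses or any(ip_is_blocked(address) for address in addresses)


print(url_is_blocked("http://internal.example/", resolver=lambda host: ["10.0.0.5"]))
```

Checking every resolved address, rather than just the first, matters because a hostname can return a mix of public and private records.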
7. DM pairing authentication
New messaging connections require an 8-character pairing code (drawn from a 32-character unambiguous alphabet) with a 1-hour TTL. Requests are rate-limited to 1 per 10 minutes with a maximum of 3 pending codes, and 5 failed attempts trigger a 1-hour lockout. Pairing files are created with mode 0600.
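An 8-character code over a 32-character alphabet gives 32^8 = 2^40 (about 1.1 trillion) possibilities, which is plenty when guesses are capped at five per hour. The sketch below models code generation, expiry, and lockout. The exact alphabet is my guess (letters and digits minus the look-alikes I, O, 0, and 1), the class is hypothetical, and the request rate limit and pending-code cap are left out for brevity.

```python
import secrets
import time
from typing import Callable

# Assumed alphabet: 24 letters (no I or O) plus digits 2-9, 32 characters total.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
CODE_LENGTH = 8
TTL_SECONDS = 3600
MAX_FAILURES = 5
LOCKOUT_SECONDS = 3600


class PairingState:
    """Tracks one pending pairing code with expiry and failed-attempt lockout."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now
        self.code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
        self.expires_at = now() + TTL_SECONDS
        self.failures = 0
        self.locked_until = 0.0

    def verify(self, attempt: str) -> bool:
        """Return True only for the right code, before expiry, while not locked out."""
        current = self._now()
        if current < self.locked_until or current >= self.expires_at:
            return False
        # Constant-time comparison on bytes avoids leaking how many characters matched.
        if secrets.compare_digest(attempt.upper().encode(), self.code.encode()):
            return True
        self.failures += 1
        if self.failures >= MAX_FAILURES:
            self.locked_until = current + LOCKOUT_SECONDS
        return False


pairing = PairingState()
print(len(pairing.code), pairing.verify(pairing.code))
```

Using the secrets module rather than random matters here: pairing codes are credentials, so they need a cryptographically secure source.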
Bottom line: Hermes ships with more security layers than any Claw framework except ZeroClaw. The Tirith scanner and context file injection protection address threat vectors that no Claw framework handles. The weak spots are Docker running as root by default and tirith_fail_open: true as the default — both configurable, but insecure defaults remain insecure defaults.
Hermes vs OpenClaw
| | OpenClaw | Hermes Agent |
|---|---|---|
| Language | TypeScript | Python |
| GitHub Stars | 345k+ | 82k+ |
| License | MIT | MIT |
| Core Philosophy | Gateway — routing, permissions, channels | Learning loop — skills that improve over time |
| Skill Ecosystem | ClawHub (5,000+ static skills) | Auto-generated + agentskills.io |
| Memory Model | Manual (developer-maintained files) | Autonomous (FTS5 + user modeling) |
| LLM Providers | 15+ | 200+ (via OpenRouter + direct) |
| Messaging Channels | 50+ | 16 |
| Sandbox | Docker (documented escapes) | 6 backends + container hardening |
| Pre-exec Scanning | None | Tirith (injection, exfiltration, backdoors) |
| Supply Chain Risk | High (36% of audited skills contained prompt injection) | Low (no marketplace, conservative vetting) |
| Self-Improvement | None | Autonomous skill creation + refinement |
Security Head-to-Head
Readers of the Claw post asked me to compare security models more directly. Here’s how Hermes stacks up against OpenClaw on the dimensions that matter most.
Default network exposure. OpenClaw binds to 0.0.0.0:18789 by default — the single decision responsible for 135,000+ exposed instances. Hermes binds to localhost by default. The gateway requires explicit DM pairing with rate limiting and lockout. Advantage: Hermes.
Supply chain risk. OpenClaw’s ClawHub had 36% of audited skills containing prompt injection, with over 1,184 malicious packages in the ClawHavoc campaign. Hermes has no centralized skill marketplace. Skills are generated locally by the agent or manually installed. The agentskills.io standard is emerging but hasn’t had a comparable supply chain incident. Advantage: Hermes.
Pre-execution scanning. OpenClaw has no built-in command scanning — dangerous commands run if the user (or the LLM) approves them. Hermes’s Tirith scanner inspects every command for injection patterns, credential exfiltration, and backdoor signatures before execution. This is a capability no Claw framework offers. Advantage: Hermes.
Container isolation. OpenClaw’s Docker sandbox has documented escape vulnerabilities (CVE-2026-24763). Hermes Docker containers run with --cap-drop ALL, --security-opt no-new-privileges, process limits, and noexec tmpfs mounts. However, a community security review found that Hermes containers run as root by default with no USER directive, and retained DAC_OVERRIDE capabilities increase the blast radius of any in-container compromise. Neither approach is airtight — NanoClaw’s per-session container isolation and ZeroClaw’s Rust memory safety remain stronger primitives.
Credential management. OpenClaw historically stored credentials in plaintext configuration files. Hermes filters environment variables by pattern, redacts tokens from output, and mounts credential files read-only in containers. ZeroClaw still leads here with encrypted-at-rest secrets, but Hermes is a significant step up from OpenClaw.
Context file protection. Hermes scans project configuration files (.cursorrules, AGENTS.md, SOUL.md) for prompt injection before loading them into context. No Claw framework does this. This matters because prompt injection via project files is an increasingly common attack vector in autonomous agents.
Bottom line: Hermes is meaningfully more secure than OpenClaw out of the box. It’s not as hardened as ZeroClaw (Rust memory safety, encrypted secrets, workspace scoping) or as isolated as NanoClaw (container-per-session), but it addresses threat vectors — pre-execution scanning, context file injection, credential filtering — that no Claw framework handles.
Where Hermes Falls Short
No framework is a free lunch. Here’s where Hermes has real weaknesses.
Smaller ecosystem. 82k stars is impressive for two months, but OpenClaw’s 345k stars and 5,000+ ClawHub skills represent a significantly larger community. If you need a pre-built integration for a niche messaging platform (IRC, Nostr, Twitch, Zalo), OpenClaw probably has it. Hermes probably doesn’t.
Messaging coverage gap. 16 platforms vs OpenClaw’s 50+. Hermes covers the mainstream channels well, but if your use case is multi-channel presence across every conceivable platform, OpenClaw is still the only option.
Docker root-by-default. The container runs as root with no USER directive. Combined with retained DAC_OVERRIDE capabilities, this weakens the container security boundary. Configurable, but the default should be non-root.
Tirith fail-open default. If the Tirith scanner is unavailable, commands execute without scanning. For an unattended gateway agent, this means a Tirith crash silently degrades the security posture. The default should be fail-closed.
Python dependency chain. Hermes is 93% Python. While this makes it accessible to contributors (Python is the lingua franca of ML/AI), it inherits Python’s dependency management challenges and lacks the memory safety guarantees of ZeroClaw (Rust) or the minimal dependency chain of PicoClaw (Go single binary).
Learning loop is unproven at scale. The autonomous skill creation is the core value proposition, but the project is two months old. How well do auto-generated skills hold up after six months of use? Do they accumulate noise? Does skill quality degrade as the library grows? These are open questions.
execute_code sandbox bypass. The execute_code tool includes terminal in its allowed tools, so code run through execute_code can potentially reach the terminal without going through command approval. It is a known gap that undermines the otherwise thoughtful security model.
Updated Decision Framework
My Claw ecosystem post ended with a decision framework. Here’s the updated version with Hermes in the picture.
You want an agent that improves over time → Hermes Agent. The learning loop, persistent memory, and autonomous skill creation are unique in this space. Use Docker or Modal backend for production isolation.
You want maximum integrations and ecosystem → OpenClaw, but invest heavily in hardening. The security situation hasn’t materially changed since last month.
You want something you can fully audit → NanoClaw. 500 lines of TypeScript, container isolation, Anthropic-only.
You want to run agents on constrained hardware → PicoClaw. Still the only option for $10 RISC-V boards.
Security is your primary concern → ZeroClaw. Rust memory safety and deny-by-default allowlists remain the strongest security primitives.
You don’t want to manage infrastructure → MaxClaw ($19/mo) or KimiClaw (browser-based).
You want both breadth and learning → Run both. Hermes and OpenClaw both support the agentskills.io standard. You can use OpenClaw for its gateway routing and channel breadth while using Hermes for its learning loop. The ecosystem is converging on shared abstractions that make this increasingly practical.
Final Thoughts
The Claw ecosystem proved that people want autonomous AI agents. The security crisis proved that “move fast” without “secure by default” has real consequences.
Hermes represents a philosophical shift. The Claw frameworks all compete on the same axes — more channels, more skills, more integrations, better security. Hermes competes on a different axis entirely: an agent that gets better the more you use it.
This raises an interesting question that none of these frameworks have fully answered: who owns the learned knowledge? As agents accumulate understanding of your codebase, your workflows, and your decision patterns, that learned context becomes genuinely valuable. It’s not in the model weights — it’s in the skills, the memory files, the user profiles that Hermes builds over time. Today it lives on your server. But as managed hosting options emerge for Hermes (MiniMax already has a partnership), the question of data ownership and portability becomes critical.
The agentskills.io standard is a step toward portability — skills created by Hermes can theoretically run in any compatible framework. But the memory and user modeling data doesn’t have an equivalent standard yet.
The agent landscape is moving from “how many integrations can we support” to “how well can the agent learn.” Hermes is the first framework to make that bet explicitly. Whether it pays off depends on how well the learning loop scales — and that’s a question only time and real-world usage can answer.