Pattern 2: Outbound — GitHub Issues with Tool Whitelist
The GitHub MCP server exposes a large surface area by default — pull request management, repo creation, branch operations, issue management, gist operations. In most workflows you only want a subset. Use the include key to whitelist exactly the tools Hermes is allowed to call:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6
mcpServers:
  github:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-github"
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
      - list_pull_requests
      - get_pull_request
The include key is a capability filter. Tools not in the list are invisible to the model — they are filtered out during the MCP handshake response. This matters for two reasons. First, it reduces the tool count in the model’s context, which measurably improves tool selection accuracy. When Hermes has 60 GitHub tools available, it occasionally picks the wrong one. With 6 relevant tools, it consistently picks correctly. Second, it eliminates accidental destructive operations. A model that cannot see delete_repository cannot call delete_repository, no matter what a user asks.
Note the ${GITHUB_TOKEN} syntax — Hermes resolves environment variables in the config file at startup. Set the variable in your shell profile before running Hermes. Never hardcode tokens in the config file.
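To make the substitution step concrete, here is a minimal Python sketch of how a `${NAME}` placeholder could be resolved at startup. It is not Hermes's implementation: `resolve_env` is a hypothetical helper, and raising an error on an unset variable is an assumption about sensible behavior, not something the Hermes docs confirm.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(value, env=None):
    """Replace every ${NAME} in value with env[NAME]; raise if NAME is unset."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            # Assumed behavior: fail loudly instead of passing an empty token.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(substitute, value)
```

Calling `resolve_env("${GITHUB_TOKEN}")` with the variable exported in your shell returns the token. With it unset, the sketch fails before any MCP server is started, which is the behavior you want from a config loader.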
Pattern 3: Outbound — OAuth-Protected Remote Servers (Stripe Example)
Not all MCP servers run as local stdio processes. Some run as remote HTTP servers that use OAuth for authentication. The Stripe MCP server is the canonical example, and it is the pattern I use for payment-related Hermes tasks:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6
mcpServers:
  stripe:
    transport: http
    url: "https://mcp.stripe.com/v1"
    auth:
      type: oauth2
      clientId: "${STRIPE_MCP_CLIENT_ID}"
      clientSecret: "${STRIPE_MCP_CLIENT_SECRET}"
      tokenUrl: "https://mcp.stripe.com/oauth/token"
      scopes:
        - customers:read
        - charges:read
        - subscriptions:read
    include:
      - list_customers
      - retrieve_customer
      - list_charges
      - retrieve_subscription
The transport: http key switches Hermes from stdio to HTTP/SSE transport. Hermes handles the OAuth token lifecycle automatically — it fetches a token on first use and refreshes it before expiry. You do not need to manage token rotation in your application code.
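The "fetch on first use, refresh before expiry" lifecycle can be sketched in a few lines of Python. This is an illustration of the general technique, not Hermes's code: `TokenCache`, the injected `fetch` callable, and the 60-second refresh margin are all assumptions made for the example.

```python
import time


class TokenCache:
    """Caches an access token and refetches it shortly before it expires."""

    def __init__(self, fetch, refresh_margin_s=60.0, clock=time.monotonic):
        self._fetch = fetch                  # returns (token, lifetime_seconds)
        self._margin = refresh_margin_s      # assumed safety margin
        self._clock = clock                  # injectable for testing
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, fetching a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime_s = self._fetch()
            self._token = token
            self._expires_at = now + lifetime_s
        return self._token
```

In a real client, `fetch` would POST the client credentials to the configured tokenUrl. Keeping it as a parameter lets the expiry logic be tested without a network.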
For the Stripe MCP server specifically, request only read scopes unless your workflow explicitly requires write operations. Hermes is good at knowing when to call tools, but defense in depth means limiting what a tool can do even when called correctly. I use customers:read, charges:read, and subscriptions:read for my billing workflow — no write access, no refund operations, no configuration changes.
Pattern 4: Multi-Server Composition
The real power of the mcpServers key is that you can define multiple servers simultaneously. Hermes aggregates all the tools from all connected servers into a single unified toolset. The model sees one flat list of available tools and picks from it based on the task. Here is my baseline five-server composition:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6
systemPrompt: |
  You are a senior developer assistant with access to filesystem tools,
  git operations, GitHub API, a Postgres database, and project documentation.
  Always prefer reading documentation before modifying code. Always commit
  changes to git before marking a task complete.
mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]", "/Users/yourname/projects"]
  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]"]
    env:
      GIT_AUTHOR_NAME: "Hermes Agent"
      GIT_AUTHOR_EMAIL: "[email protected]"
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
  postgres:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-postgres"
      - "${DATABASE_URL}"
    include:
      - query
      - list_tables
      - describe_table
  docs:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-fetch"]
    env:
      ALLOWED_DOMAINS: "docs.yourproject.com,api.yourproject.com"
    include:
      - fetch
      - search
The systemPrompt key at the top-level config is essential in multi-server setups. Without a system prompt, the model treats all tools as equally available and equally appropriate. With a system prompt that sets behavioral priorities — “read documentation before modifying code”, “commit before marking complete” — tool selection becomes more intentional and the overall task completion quality improves significantly.
One thing to watch in multi-server setups: tool name collisions. If two MCP servers expose a tool called search, Hermes namespaces them as mcp:docs:search and mcp:github:search in its internal registry but presents both to the model. Whether the model picks the right one depends heavily on how well the server’s tool descriptions distinguish the two operations. If you see the model consistently picking the wrong search variant, add an include filter to one of the servers to remove the ambiguous tool.
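A quick way to spot collisions before the model trips over them is to compare the tool lists of all servers. The Python sketch below assumes you already have each server's tool names (for example, copied from a tools listing). `find_collisions` is a hypothetical helper, and the `mcp:<server>:<tool>` format simply mirrors the naming described above.

```python
from collections import defaultdict


def find_collisions(servers):
    """Map each tool name exposed by more than one server to its namespaced ids.

    servers: dict of server name -> list of tool names that server exposes.
    """
    owners = defaultdict(list)
    for server, tools in servers.items():
        for tool in tools:
            owners[tool].append(f"mcp:{server}:{tool}")
    # Keep only the names that appear under two or more servers.
    return {tool: ids for tool, ids in owners.items() if len(ids) > 1}
```

Any name this returns is a candidate for an include filter on one of the servers involved.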
Pattern 5: Inbound — Hermes as MCP Server
Everything so far has been outbound: Hermes calling external MCP servers. Pattern 5 flips the direction. Hermes itself becomes an MCP server that external agents — Claude Code, Cursor, custom tools — can call as a tool.
# Start Hermes as an MCP server
hermes mcp serve --transport stdio --name "hermes-orchestrator" --description "Multi-step task orchestrator with filesystem, git, and GitHub access" --port 0
When run in this mode, Hermes exposes itself via stdio using the standard MCP protocol. External agents connect to it exactly as they connect to any other MCP server. From the external agent’s perspective, Hermes is a tool with one primary method: run_task, which accepts a natural-language task description and returns the result of Hermes completing that task using its own tool chain.
To make this setup persistent instead of passing flags on every start, define it in ~/.hermes/server.yaml:
# ~/.hermes/server.yaml
serve:
  transport: stdio
  name: hermes-orchestrator
  description: |
    Hermes orchestration agent. Accepts natural language task descriptions
    and completes them autonomously using filesystem, git, GitHub, and
    Postgres tools. Returns structured results with tool call traces.
  tools:
    - name: run_task
      description: |
        Execute a multi-step development task. Provide a clear task description
        including success criteria. Hermes will plan and execute the task using
        its available tools and return a structured result.
      inputSchema:
        type: object
        properties:
          task:
            type: string
            description: Natural language task description with success criteria
          context:
            type: string
            description: Optional additional context (file paths, constraints, etc.)
          max_steps:
            type: integer
            default: 20
            description: Maximum tool calls before halting
        required: [task]
Start it:
hermes mcp serve --config ~/.hermes/server.yaml
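From the client's side, invoking run_task is an ordinary MCP `tools/call` request. The Python sketch below builds that JSON-RPC message using the input schema shown above. `build_run_task_request` is a hypothetical helper, and a real client would first complete the MCP initialize handshake, which is omitted here.

```python
import json


def build_run_task_request(request_id, task, context=None, max_steps=20):
    """Serialize an MCP tools/call request for the run_task tool."""
    if not task.strip():
        # "task" is the only required property in the input schema.
        raise ValueError("task is required by the input schema")
    arguments = {"task": task, "max_steps": max_steps}
    if context is not None:
        arguments["context"] = context
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "run_task", "arguments": arguments},
    }
    return json.dumps(message)
```

Most MCP clients build this message for you. It is shown here only to make clear what an external agent actually sends to Hermes.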
Pattern 6: The Dual-Stack — Claude Code as Coder, Hermes as Orchestrator
This is the pattern I use every day and the one that makes the biggest difference to how I work. Claude Code handles code generation, edits, and TypeScript/React tasks. Hermes handles orchestration — running sequences of tasks, managing git state, interacting with GitHub and Postgres, and coordinating work across multiple files and tools. Both agents share tool access, but they have different roles.
Here is how to wire it up. First, configure Hermes to connect to its own tool servers and to expose itself as an MCP server:
# ~/.hermes/config.yaml (Hermes side)
model: anthropic/claude-sonnet-4-6
systemPrompt: |
  You are an orchestration agent. You plan multi-step development workflows,
  manage git state, coordinate with GitHub, and delegate code-generation
  subtasks to Claude Code when needed. You do not write code directly; you
  describe what code is needed and call the claude_code tool to generate it.
  Always verify git status before and after code changes.
mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]", "/Users/yourname/projects"]
  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]"]
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - create_pull_request
      - add_issue_comment
serve:
  transport: stdio
  name: hermes-orchestrator
Second, add Hermes as an MCP server in Claude Code’s .claude.json:
~/.claude.json (Claude Code side):
{
  "mcpServers": {
    "hermes": {
      "command": "hermes",
      "args": ["mcp", "serve", "--config", "/Users/yourname/.hermes/config.yaml"],
      "description": "Hermes orchestration agent: use for multi-step tasks, git workflows, GitHub operations, and database queries"
    }
  }
}
Third, add Claude Code as an MCP server in Hermes’s config, completing the bidirectional bridge:
# Add to ~/.hermes/config.yaml under mcpServers:
  claude_code:
    command: claude
    args: ["mcp", "serve"]
    description: "Claude Code: use for code generation, TypeScript, React, file edits"
    include:
      - edit_file
      - create_file
      - read_file
      - run_bash_command
With both sides configured, the workflow looks like this in practice:
- I describe a feature to Hermes: “Implement the UserProfile component, wire it to the /api/profile endpoint, commit to a feature branch, and open a draft PR.”
- Hermes plans the steps, reads the existing codebase via filesystem tools, and calls Claude Code via MCP to generate the actual component code.
- Claude Code returns the generated code. Hermes writes it to disk via filesystem tools, runs the git commit via git tools, and creates the PR via GitHub tools.
- Hermes reports the PR URL and a summary of what was done.
Neither agent is trying to do everything. Claude Code is better at code generation. Hermes is better at multi-step planning and tool orchestration. The dual-stack lets each do what it does best.
One critical thing to get right: the systemPrompt on the Hermes side must explicitly tell it NOT to write code directly and to call Claude Code for code generation. Without that instruction, Hermes will try to generate code itself using its own model — which works, but loses the specialization advantage. The system prompt is the architectural boundary.
Pattern 7: Cron-Driven Scheduled MCP Workflows
Hermes supports scheduled task execution via its cron integration. This is useful for automated workflows — daily digest emails, scheduled database cleanups, periodic GitHub sync, and similar recurring operations. The cron config goes in a separate cron.yaml file:
# ~/.hermes/cron.yaml
jobs:
  daily-digest:
    # Run at 8 AM IST (2:30 AM UTC) every weekday
    schedule: "30 2 * * 1-5"
    task: |
      1. Query the Postgres database for all open issues created in the last 24 hours.
      2. Fetch the corresponding GitHub issues to get current status.
      3. Generate a markdown summary grouped by priority.
      4. Create a new page in the project docs directory with today's date as the filename.
      5. Post the summary as a comment on the tracking GitHub issue #1.
    model: anthropic/claude-haiku-4-5-20251001  # cheaper model for scheduled tasks
    max_steps: 15
    on_failure:
      notify: "hermes-alerts"  # Telegram integration (configured separately)
  weekly-cleanup:
    # Every Sunday at midnight UTC
    schedule: "0 0 * * 0"
    task: |
      Query the Postgres database for rows in the task_log table older than 30 days.
      Delete them in batches of 100 to avoid locking. Report the row count deleted.
    model: anthropic/claude-haiku-4-5-20251001
    max_steps: 10
    dry_run: false  # set to true to preview without executing
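The schedules above are written in UTC, so a local time such as 8 AM IST has to be converted by hand. The Python sketch below shows that arithmetic. The helper names are hypothetical, and it assumes a fixed UTC offset, which holds for IST because India does not observe daylight saving time.

```python
IST_OFFSET_MIN = 5 * 60 + 30  # India Standard Time is UTC+05:30


def local_to_utc(hour, minute, offset_min):
    """Convert a local wall-clock time to UTC.

    Returns (utc_hour, utc_minute, day_shift); day_shift is -1, 0 or +1 when
    the UTC time falls on the previous, same or next calendar day.
    """
    total = hour * 60 + minute - offset_min
    day_shift, within_day = divmod(total, 24 * 60)
    return within_day // 60, within_day % 60, day_shift


def daily_cron(hour, minute, offset_min, days="*"):
    """Build a five-field cron expression in UTC for a local time of day."""
    utc_hour, utc_minute, day_shift = local_to_utc(hour, minute, offset_min)
    if day_shift != 0 and days != "*":
        # The day-of-week field would also need shifting; not handled here.
        raise ValueError("local time crosses midnight in UTC; adjust days manually")
    return f"{utc_minute} {utc_hour} * * {days}"
```

`daily_cron(8, 0, IST_OFFSET_MIN, "1-5")` reproduces the `30 2 * * 1-5` schedule used for daily-digest. A 3 AM IST job would land on the previous UTC day, which is why the sketch refuses to guess the weekday field in that case.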
Start the cron runner:
hermes cron start --config ~/.hermes/cron.yaml --daemon
Check cron status:
hermes cron status
# Output:
# daily-digest next run: 2026-05-18T02:30:00Z last: completed (14 steps, 42s)
# weekly-cleanup next run: 2026-05-17T00:00:00Z last: completed (8 steps, 12s)
The model override in cron jobs is important. Scheduled tasks that run unattended do not need the highest-capability model. Using Haiku for daily digest generation costs roughly 10x less than Sonnet and produces equivalent quality for structured, well-defined tasks. I only use Sonnet or Opus in cron jobs when the task involves genuine reasoning or ambiguous input.
Pattern 8: Per-Server Gateway Routing with use_gateway
By default, all MCP tool calls go through Hermes’s primary model. The use_gateway key lets you route one server’s tool calls through a different configuration: a different model, different retry behavior, or a different timeout. This is useful when one server in your stack is unreliable or slow:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6
mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]", "/Users/yourname/projects"]
  slow-api:
    command: npx
    args: ["-y", "@company/internal-mcp-server"]
    env:
      API_URL: "${INTERNAL_API_URL}"
      API_KEY: "${INTERNAL_API_KEY}"
    use_gateway:
      timeout_seconds: 120  # Default is 30s; this server is slow
      retry:
        max_attempts: 3
        backoff: exponential
        initial_delay_ms: 1000
      circuit_breaker:
        failure_threshold: 5    # Open circuit after 5 consecutive failures
        recovery_timeout_s: 60  # Try again after 60s
The use_gateway block is per-server, not per-tool. Every tool from slow-api inherits the 120-second timeout and exponential backoff. If you need different timeouts for different tools from the same server, you need to run two instances of the server with different configs — there is no per-tool timeout override in v0.13.0.
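It helps to work out what these retry settings add up to in the worst case. The Python sketch below does so under two assumptions that the config does not spell out: that max_attempts counts the initial attempt, and that "exponential" means the delay doubles on each retry. The helper names are hypothetical.

```python
def backoff_delays_ms(max_attempts, initial_delay_ms, factor=2.0):
    """Delays before each retry; the first attempt runs immediately."""
    retries = max(max_attempts - 1, 0)
    return [int(initial_delay_ms * factor ** i) for i in range(retries)]


def worst_case_seconds(timeout_s, max_attempts, initial_delay_ms):
    """Upper bound on one tool call: every attempt times out, plus all waits."""
    waits_s = sum(backoff_delays_ms(max_attempts, initial_delay_ms)) / 1000
    return timeout_s * max_attempts + waits_s
```

Under those assumptions, the slow-api settings (120-second timeout, 3 attempts, 1000 ms initial delay) allow a single tool call to block for up to 363 seconds. Keep that figure in mind when choosing max_steps and cron timeouts.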
The circuit breaker is the most important part of the use_gateway config for production. Without it, Hermes will keep trying a failing MCP server on every tool call, accumulating timeouts and burning model tokens on retries. With a circuit breaker, after 5 consecutive failures the server is marked as unavailable for 60 seconds. Hermes continues with the remaining tools and reports that the unavailable server’s tools are offline. This degrades gracefully instead of hanging.
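The mechanism is simple enough to sketch in Python. This is the generic circuit breaker pattern, not Hermes's internal code. The class and method names are hypothetical, and the half-open behavior (one trial call allowed after the cooldown) is an assumption about how recovery works.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures, allows a retry after a cooldown."""

    def __init__(self, failure_threshold=5, recovery_timeout_s=60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout_s = recovery_timeout_s
        self.clock = clock                # injectable for testing
        self.consecutive_failures = 0
        self.opened_at = None             # None means the circuit is closed

    def allow_call(self):
        """True if a tool call may be attempted right now."""
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial call through (half-open state).
        return self.clock() - self.opened_at >= self.recovery_timeout_s

    def record_success(self):
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # (Re)open the circuit and restart the cooldown.
            self.opened_at = self.clock()
```

While `allow_call()` returns False, the caller skips the server entirely and returns an "offline" result at once. That is what keeps a dead server from consuming a full timeout on every call.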
Pattern 9: Auxiliary Models for Sub-Tasks
Hermes supports a models block that lets you define auxiliary model configurations for specific purposes. The three patterns I use most are: a fast router model for initial task classification, a reasoning model for complex multi-step planning, and a cheap model for tool result summarization:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6  # primary model
models:
  router:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0  # deterministic routing decisions
    use_for: task_classification  # Hermes uses this model to classify tasks before routing
  reasoner:
    provider: anthropic
    model: claude-opus-4-7
    temperature: 0.3
    use_for: complex_planning  # Used when a task needs more steps than routing.complexity_threshold
    max_tokens: 8192
  summarizer:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: result_summarization  # Condenses verbose tool output before adding to context
mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]", "/Users/yourname/projects"]
routing:
  classify_tasks: true      # Enable automatic task classification via router model
  auto_route_complex: true  # Auto-upgrade to reasoner for complex tasks
  complexity_threshold: 8   # Tasks requiring > 8 steps use the reasoner model
The routing.complexity_threshold is calibrated by trial and error. At 8, about 20% of my tasks get routed to Opus. Those are the tasks where the reasoning model’s stronger multi-step planning genuinely produces better outcomes. If you set it too low, you pay Opus rates for tasks that Sonnet handles fine. If you set it too high, complex orchestration tasks fail partway through because Sonnet loses the thread.
The summarizer model is particularly valuable in multi-server setups where some tools return verbose output. Postgres query results with hundreds of rows, GitHub API responses with nested JSON, fetch results from documentation pages — all of these can balloon the context window quickly. The summarizer runs after each tool call and condenses the output to the relevant facts before it goes into the conversation context. This alone cuts my average token cost by roughly 30% on database-heavy workflows.
Pattern 10: Fallback Provider for Resilience
If Anthropic’s API is degraded — which happens a few times a year — a Hermes instance without a fallback becomes completely non-functional. The fallback key configures an alternative provider that Hermes switches to automatically when the primary provider returns errors:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6
fallback:
model: openai/gpt-4o
env:
OPENAI_API_KEY: "${OPENAI_API_KEY}"
trigger:
on_error_codes: [429, 500, 502, 503, 504]
consecutive_failures: 3 # Switch after 3 consecutive failures
recovery:
check_interval_seconds: 300 # Check primary provider every 5 minutes
return_after_successful_checks: 2 # Return to primary after 2 successful health checks
The fallback model does not need to match the primary model’s capability tier exactly — it needs to be good enough to handle the tasks your cron jobs and automated workflows run while the primary provider is down. I use GPT-4o as the fallback for Sonnet because the capability overlap is high and the MCP tool calling behavior is similar enough that my existing workflows work without modification.
One thing to watch: fallback models may have different tool calling output formats, especially for complex nested tool calls. If your downstream code parses Hermes’s output programmatically, test it against the fallback model before relying on automatic switching in production.
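The trigger and recovery rules amount to a small state machine. The Python sketch below is one plausible reading of the config, not Hermes's implementation. `ProviderSwitch` is a hypothetical name, and it assumes that any non-listed response resets the failure streak and that the successful health checks must be consecutive.

```python
class ProviderSwitch:
    """Tracks whether requests should go to the primary or fallback provider."""

    RETRYABLE = frozenset({429, 500, 502, 503, 504})  # on_error_codes

    def __init__(self, consecutive_failures=3, return_after_successful_checks=2):
        self.failure_limit = consecutive_failures
        self.checks_needed = return_after_successful_checks
        self.failures = 0
        self.healthy_checks = 0
        self.active = "primary"

    def record_response(self, status_code):
        """Call after each request sent to the primary provider."""
        if self.active != "primary":
            return
        if status_code in self.RETRYABLE:
            self.failures += 1
            if self.failures >= self.failure_limit:
                self.active = "fallback"
                self.healthy_checks = 0
        else:
            self.failures = 0  # assumed: any other response breaks the streak

    def record_health_check(self, ok):
        """Call every check_interval_seconds while on the fallback."""
        if self.active != "fallback":
            return
        self.healthy_checks = self.healthy_checks + 1 if ok else 0
        if self.healthy_checks >= self.checks_needed:
            self.active = "primary"
            self.failures = 0
```

With a 300-second check interval and two required successes, this reading implies at least ten minutes on the fallback once it trips. That is a useful number to know when estimating how many scheduled jobs will run against the fallback model.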
Pattern 11: Capability Filter Discipline
The include and exclude keys apply at the server level, but there are two additional capability types you can filter that most documentation does not mention: prompts and resources. MCP servers can expose three capability categories — tools, prompts, and resources — and Hermes enables all three by default:
# ~/.hermes/config.yaml
model: anthropic/claude-sonnet-4-6
mcpServers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    # Tool capability filter (whitelist)
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
    # Explicitly disable prompt and resource capabilities
    # These add unnecessary context overhead in tool-only workflows
    capabilities:
      prompts: false
      resources: false
      tools: true
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]", "/Users/yourname/projects"]
    capabilities:
      prompts: false   # No prompt templates needed; we use the top-level systemPrompt
      resources: true  # Keep resource access for file content fetching
      tools: true
Disabling unused capability types reduces the size of the MCP capability advertisement that Hermes sends to the model at the start of each conversation. In a five-server setup, a full capability advertisement can run to 4,000-6,000 tokens before the user has said a word. Turning off prompts and resources on servers where you only need tools cuts this by 30-50%.
The exclude key is the alternative to include when you want most tools but need to block specific ones:
mcpServers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    exclude:
      - delete_repository
      - create_repository
      - delete_branch
      - force_push
Use include when you want a small, well-defined set of tools (fewer than 10). Use exclude when you want most tools but need to block specific destructive operations. Never use both on the same server — if you specify both include and exclude, Hermes v0.13.0 applies include first and ignores exclude. This behavior may change in future versions.
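The filter semantics described above, including the "include wins" rule in v0.13.0, can be written down as a short Python function. This is a sketch of the documented behavior, not code taken from Hermes, and `visible_tools` is a hypothetical name.

```python
def visible_tools(advertised, include=None, exclude=None):
    """Return the tools the model may see, in the order the server lists them."""
    if include is not None:
        # Whitelist mode: include is applied and exclude is ignored.
        allowed = set(include)
        return [tool for tool in advertised if tool in allowed]
    # Blacklist mode: everything except the excluded names.
    blocked = set(exclude or [])
    return [tool for tool in advertised if tool not in blocked]
```

Note the asymmetry when a server ships a new release: with include, a newly added tool stays hidden until you list it, while with exclude it becomes visible automatically. That is one more reason to prefer include for anything security-sensitive.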
Pattern 12: Production Diagnostics
When something goes wrong in production — a tool call hangs, a cron job fails, the dual-stack produces unexpected results — these are the diagnostic commands I run in order:
Step 1: Full doctor check
hermes doctor --verbose
# The --verbose flag checks MCP server reachability individually
# Output includes per-server connection test results
Step 2: List connected tools
hermes tools list --mcp
# Shows all tools currently registered from all connected MCP servers
# With --mcp flag, groups by server and shows server health status
# Example output:
#
# [mcp:filesystem] HEALTHY (7 tools)
# read_file, write_file, list_directory, create_directory,
# move_file, search_files, get_file_info
#
# [mcp:github] HEALTHY (4 tools, 2 excluded)
# list_issues, get_issue, create_issue, add_issue_comment
#
# [mcp:slow-api] CIRCUIT_OPEN (circuit opened 3m ago, recovery in 57s)
# (no tools available — circuit breaker active)
Step 3: Inspect recent tool call logs
hermes logs --last 50 --format json | jq '.[] | select(.type == "tool_call")'
# Shows the last 50 log entries filtered to tool calls only
# Each entry includes: timestamp, tool_name, server, duration_ms, status, error (if any)
Step 4: Replay a failed task with debug logging
hermes run --task "your failed task description" --debug --log-file /tmp/hermes-debug-$(date +%Y%m%d-%H%M%S).json
# Runs the task with full tool call tracing enabled
# Writes structured log to /tmp/ for inspection
Step 5: Test a specific MCP server connection in isolation
hermes mcp test --server github
# Runs the connection handshake for a single MCP server
# Reports capability negotiation, tool list, and a test tool call
# Useful for isolating whether a failure is in Hermes itself or the MCP server
Step 6: Cron job inspection
hermes cron logs --job daily-digest --last 10
# Shows the last 10 runs of a specific cron job
# Includes task output, step count, duration, and failure reason if applicable
The structured JSON log format (step 3 above) is essential for debugging failures that are intermittent or time-sensitive. I pipe these logs to a simple monitoring script that alerts via Telegram when error rates exceed a threshold. The log format is stable across Hermes patch versions: the fields type, tool_name, server, duration_ms, and status are documented in the v0.13.0 changelog as stable public API.
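The core of such a monitoring script is a per-server error-rate calculation over the tool_call entries. The Python sketch below is illustrative rather than the script described above. It assumes the log is a JSON array of objects, that a failed call has status "error" (the actual status values are not listed here), and it uses an arbitrary 20% threshold.

```python
from collections import Counter


def error_rates(entries):
    """Fraction of failed tool calls per server."""
    calls, errors = Counter(), Counter()
    for entry in entries:
        if entry.get("type") != "tool_call":
            continue  # skip non-tool log entries
        server = entry["server"]
        calls[server] += 1
        if entry.get("status") == "error":  # assumed status value
            errors[server] += 1
    return {server: errors[server] / total for server, total in calls.items()}


def servers_over_threshold(entries, threshold=0.2):
    """Sorted names of servers whose error rate exceeds the threshold."""
    rates = error_rates(entries)
    return sorted(server for server, rate in rates.items() if rate > threshold)
```

To use it, parse the output of the logs command with `json.load(sys.stdin)` and pass the resulting list to `servers_over_threshold`. The alerting step (Telegram or anything else) is left out on purpose.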
Full Production Config Reference
Here is the complete config.yaml that combines the config-file patterns above into a single production-ready file (cron jobs stay in their own cron.yaml). This is my actual config, sanitized for sharing:
# ~/.hermes/config.yaml
# Hermes v0.13.0: Full production config
# Last updated: 2026-05-16
model: anthropic/claude-sonnet-4-6
systemPrompt: |
  You are a senior development orchestration agent. You plan and execute
  multi-step development workflows using your available tools. You do not
  write code directly; delegate code generation to the claude_code tool.
  Behavioral rules:
  - Always read the current git status before modifying files
  - Always run tests after modifying code (use bash tool to run npm test)
  - Always commit changes to a feature branch, never directly to main
  - Prefer reading documentation before making architectural decisions
  - Summarize what you did at the end of every task
models:
  router:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: task_classification
  reasoner:
    provider: anthropic
    model: claude-opus-4-7
    temperature: 0.3
    use_for: complex_planning
    max_tokens: 8192
  summarizer:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: result_summarization
fallback:
  model: openai/gpt-4o
  env:
    OPENAI_API_KEY: "${OPENAI_API_KEY}"
  trigger:
    on_error_codes: [429, 500, 502, 503, 504]
    consecutive_failures: 3
  recovery:
    check_interval_seconds: 300
    return_after_successful_checks: 2
routing:
  classify_tasks: true
  auto_route_complex: true
  complexity_threshold: 8
mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]", "/Users/yourname/projects"]
    capabilities:
      prompts: false
      resources: true
      tools: true
  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/[email protected]"]
    env:
      GIT_AUTHOR_NAME: "Hermes Agent"
      GIT_AUTHOR_EMAIL: "[email protected]"
    capabilities:
      prompts: false
      resources: false
      tools: true
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
      - list_pull_requests
      - create_pull_request
    capabilities:
      prompts: false
      resources: false
      tools: true
  postgres:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-postgres", "${DATABASE_URL}"]
    include:
      - query
      - list_tables
      - describe_table
    use_gateway:
      timeout_seconds: 60
      retry:
        max_attempts: 2
        backoff: exponential
        initial_delay_ms: 500
      circuit_breaker:
        failure_threshold: 3
        recovery_timeout_s: 120
    capabilities:
      prompts: false
      resources: false
      tools: true
  claude_code:
    command: claude
    args: ["mcp", "serve"]
    description: "Claude Code: use for all code generation, TypeScript, React, and file edits"
    include:
      - edit_file
      - create_file
      - read_file
      - run_bash_command
serve:
  transport: stdio
  name: hermes-orchestrator
  description: |
    Hermes multi-step development orchestrator. Accepts natural language task
    descriptions and executes them autonomously using filesystem, git, GitHub,
    Postgres, and Claude Code tools.
What the Dual-Stack Changes Day-to-Day
After running this setup for three months, the practical difference is that I spend significantly less time context-switching between tools. A task like “implement the subscription cancellation flow, write the API route, the client component, and the email notification, then open a PR” — which previously required me to coordinate multiple Claude Code sessions, manually run git commands, and interact with the GitHub UI — now runs end-to-end in a single Hermes invocation. Claude Code handles the code. Hermes handles the coordination.
The failure modes are real but manageable. MCP server startup time adds 2-5 seconds to the first tool call in a new session — this is the child process startup latency. In interactive sessions this is invisible. In cron jobs it is worth noting in your timeout calculations. The circuit breaker (Pattern 8) has saved me from cascading failures twice when my internal API was down. Without it, those cron jobs would have hung for their full timeout duration on every scheduled run until I manually intervened.
Version pinning (Pattern 1, the @1.9.2 syntax) is not optional in production. MCP server packages ship frequently and minor versions occasionally contain breaking changes in tool output format. Pin your versions, test before upgrading, and keep a note of which version is in production. I learned this the hard way when server-filesystem changed the list_directory output format in a patch version and broke a downstream script that was parsing the output.
Run hermes doctor as part of your deployment verification. If your deployment process changes environment variables or Node versions, a doctor check immediately after deploy catches configuration problems before they cause silent failures in production workflows. I have it in my post-deploy script alongside the standard HTTP health check.