Getting started
AgentGuards checks every message before it reaches your AI agent, and every response before it reaches your users. This guide walks you through signup, installation, and what to do if a check blocks something unexpectedly.
Dashboard setup
Everything you need is in the dashboard — no CLI tools required to configure checks.
Sign up
Go to agentguards.co and click Start free. No credit card required. You get 5,000 checked requests per month on the free plan.
Copy your API key
In the dashboard, open Settings → API Keys. Click New key, give it a name (e.g. "local-dev"), and copy the ag_... token. You will not be able to see it again after closing the dialog.
Configure checks
Go to Checks. Every check is on by default. Toggle any check off, or click a check to tune its threshold — useful if a particular check is too sensitive for your use case.
Set up an integration
Follow the install steps below for your agent. Paste your API key where prompted. Restart your agent and send a test message. The request should be counted in your dashboard usage.
Check your usage
The dashboard shows how many requests you have checked and how much of your monthly quota is consumed. Per-request logs are not available yet — they are coming in a future update.
Supported agents
AgentGuards works with any agent that accepts MCP servers, hooks, or can call a REST API.
Claude Code
Hooks run at the OS level — the check happens before Claude processes your message. MCP runs inside the session and requires a CLAUDE.md instruction to call check_input.
Setup guide →
OpenAI Codex
Add AgentGuards as an MCP server in ~/.codex/config.toml. Codex calls check_input on each turn via the MCP protocol.
Setup guide →
VS Code Copilot
Add the MCP server to .vscode/mcp.json in your workspace. Requires VS Code 1.99+ with agent mode enabled.
Setup guide →
Any LLM app (API)
Call /v1/guardrails/evaluate-input before sending the prompt to your model, or route through the Gateway to check and forward in one step.
Setup guide →
Install
Pick the method that matches how you use your agent. Hooks are the strongest option for Claude Code — they run before Claude sees the message.
Claude Code — Hooks (recommended)
The hook script runs as a separate system process. If the hook blocks a message, Claude never processes it, so an injected instruction inside that message cannot bypass the check. The env vars go in settings.json, not your shell profile, so every session sees them.
# 1. Download the hook script
curl -o ~/.claude/agentguards_hook.py \
https://prod.agentguards.co/static/agentguards_hook.py
# 2. Add your API key to Claude Code settings
# Open ~/.claude/settings.json and add the block below
{
"env": {
"AGENTGUARDS_URL": "https://prod.agentguards.co",
"AGENTGUARDS_API_KEY": "ag_YOUR_TOKEN_HERE"
},
"hooks": {
"UserPromptSubmit": [
{
"hooks": [{
"type": "command",
"command": "python3 ~/.claude/agentguards_hook.py UserPromptSubmit"
}]
}
],
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [{
"type": "command",
"command": "python3 ~/.claude/agentguards_hook.py PreToolUse"
}]
}
],
"PostToolUse": [
{
"matcher": "Bash|WebFetch|WebSearch",
"hooks": [{
"type": "command",
"command": "python3 ~/.claude/agentguards_hook.py PostToolUse"
}]
}
]
}
}
Restart Claude Code. The hook fires on every message, before every Bash tool call, and after every Bash, WebFetch, or WebSearch tool call.
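If ~/.claude/settings.json already contains other env vars or hooks, pasting the block over the file would drop them. The following is a minimal sketch of a merge that keeps existing entries. The helper `merge_settings` is hypothetical and not part of AgentGuards; the env values and hook commands are copied from the block above.

```python
import copy

# Hook entries copied from the settings block above. The merge helper itself
# is a hypothetical convenience, not something AgentGuards ships.
HOOK_SCRIPT = "python3 ~/.claude/agentguards_hook.py"


def _hook(event: str) -> dict:
    """Build one command hook entry for the given event name."""
    return {"type": "command", "command": f"{HOOK_SCRIPT} {event}"}


AGENTGUARDS_BLOCK = {
    "env": {
        "AGENTGUARDS_URL": "https://prod.agentguards.co",
        "AGENTGUARDS_API_KEY": "ag_YOUR_TOKEN_HERE",
    },
    "hooks": {
        "UserPromptSubmit": [{"hooks": [_hook("UserPromptSubmit")]}],
        "PreToolUse": [{"matcher": "Bash", "hooks": [_hook("PreToolUse")]}],
        "PostToolUse": [
            {"matcher": "Bash|WebFetch|WebSearch", "hooks": [_hook("PostToolUse")]}
        ],
    },
}


def merge_settings(existing: dict, addition: dict = AGENTGUARDS_BLOCK) -> dict:
    """Return a copy of `existing` with the AgentGuards env vars and hooks added."""
    merged = copy.deepcopy(existing)
    merged.setdefault("env", {}).update(addition["env"])
    hooks = merged.setdefault("hooks", {})
    for event, entries in addition["hooks"].items():
        current = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in current:  # skip entries that are already present
                current.append(copy.deepcopy(entry))
    return merged
```

To apply it, load the existing file with `json.load`, pass the result through `merge_settings`, and write it back with `json.dump`. Running the merge twice leaves the settings unchanged, so it is safe to repeat.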
Claude Code — MCP server
The MCP approach runs inside the Claude session. It works well alongside a CLAUDE.md that instructs Claude to call check_input on each turn. Because it depends on Claude following that instruction, it is weaker than hooks at blocking attacks, but it is useful for building guardrails into your own agent workflows.
claude mcp add agentguards \
--env AGENTGUARD_URL=https://prod.agentguards.co \
--env AGENTGUARD_API_KEY=ag_YOUR_TOKEN_HERE \
-- npx -y @agentguards/mcp
OpenAI Codex
Add the MCP server entry to your Codex config. Codex will call AgentGuards on each turn via the MCP protocol.
# ~/.codex/config.toml
[mcp_servers.agentguards]
command = "npx"
args = ["-y", "@agentguards/mcp"]
env = { AGENTGUARD_URL = "https://prod.agentguards.co", AGENTGUARD_API_KEY = "ag_YOUR_TOKEN_HERE" }
Any app — REST API
Call the check endpoint directly before passing a prompt to your model. Works from any language or framework.
curl -X POST https://prod.agentguards.co/v1/guardrails/evaluate-input \
-H "X-API-Key: ag_YOUR_TOKEN_HERE" \
-H "Content-Type: application/json" \
-d '{"text": "your prompt here"}'
See the Gateway docs to check and forward to an LLM in a single call.
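For applications written in Python, the same request can be sketched with the standard library alone. The URL, header names, and body shape are taken from the curl example above. The function names, the 5-second timeout, and the assumption that the response body is JSON are illustrative and not confirmed by this guide.

```python
import json
import urllib.request

API_URL = "https://prod.agentguards.co/v1/guardrails/evaluate-input"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def evaluate_input(text: str, api_key: str, timeout: float = 5.0) -> dict:
    """Send the request and return the parsed response body.

    Assumes the endpoint answers with JSON. The response schema is not
    documented in this guide, so the result is returned without interpretation.
    """
    request = build_request(text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call `evaluate_input(prompt, api_key)` before passing `prompt` to your model, and inspect the returned dictionary to decide whether to continue. Reading the key from an environment variable rather than hardcoding it keeps the token out of source control.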
What gets approved, what gets blocked
AgentGuards returns one of four decisions on every request. The worst decision across all checks wins.
Normal traffic — no issues found
The request passes through unchanged. Your agent receives it and responds as normal.
PII detected (email, phone, SSN, credit card)
The matched text is replaced with a placeholder (e.g. [EMAIL]) before the prompt reaches the model. Your agent still runs — it just does not see the raw sensitive value.
Prompt injection, jailbreak, secret, data exfiltration, restricted topic, or toxicity
The request is stopped. The agent never receives the message. The user sees a formatted block message explaining which check triggered.
Borderline — matched but below the block threshold
The request is allowed, but counted separately in your usage stats. Useful for tuning thresholds before enabling hard blocks.
To see which checks produce which decisions, read the supported checks reference.
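The "worst decision wins" rule can be sketched as taking the maximum over a severity ranking. The decision names (`allow`, `flag`, `redact`, `block`), their relative order, and the check names other than `prompt_injection` are assumptions made for illustration; this guide describes the four outcomes but does not name or rank them.

```python
# Hypothetical decision names and severity order. The guide describes four
# outcomes but does not name them or rank the middle two.
SEVERITY = {"allow": 0, "flag": 1, "redact": 2, "block": 3}


def overall_decision(check_decisions: dict[str, str]) -> str:
    """Return the most severe decision across all checks."""
    if not check_decisions:
        return "allow"  # no checks ran, so nothing objected
    return max(check_decisions.values(), key=SEVERITY.__getitem__)


# One check wants to redact, another wants to block: the block wins.
result = overall_decision({"pii": "redact", "prompt_injection": "block"})
```

Under this ranking a single blocking check stops the request even if every other check passed it.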
What happens to your prompt data
The short version: we check it, then discard it.
Processed in AWS — EU region
All checks run on infrastructure hosted in AWS eu-north-1 (Stockholm). Data does not leave the EU.
Not stored by default
Prompt content is evaluated in memory and discarded. We store the decision, the check results, and metadata (timestamp, tenant ID, use-case) — not the prompt text itself. Log-level prompt storage can be enabled per tenant for debugging and is opt-in.
Not used for training
We do not use your prompt content to train models, fine-tune classifiers, or improve AgentGuards checks. The ML models we use are pre-trained and run locally in our inference environment.
Not sold or shared
Prompt data is not shared with third parties, sold, or used for any purpose outside of providing the service to you.
API keys are encrypted at rest
Your tenant API key is stored encrypted. The AgentGuards system key used to call upstream model providers is never exposed in logs or API responses.
Something got blocked unexpectedly
False positives happen, especially with domain-specific language or code that superficially resembles an attack pattern. Here is how to fix it.
Check which rule fired
When a block happens, the agent displays a message naming which check triggered (e.g. "prompt_injection") and the reason. Per-request logs in the dashboard are coming soon — for now, use this message to identify the check to tune.
Tune the threshold
In dashboard → Checks, click the check that fired. Raise the threshold (e.g. injection score from 0.7 to 0.85) to reduce sensitivity. Changes take effect immediately — no restart needed.
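To see why raising a threshold reduces false positives, consider a simplified model in which a check blocks whenever its score reaches the threshold. The comparison rule and the sample scores below are made up for illustration; the actual scoring inside AgentGuards is not described in this guide.

```python
def is_blocked(score: float, threshold: float) -> bool:
    """Assumed rule: a check blocks when its score reaches the threshold."""
    return score >= threshold


# Made-up injection scores for four requests.
scores = [0.55, 0.72, 0.80, 0.91]

blocked_at_070 = [s for s in scores if is_blocked(s, 0.70)]
blocked_at_085 = [s for s in scores if is_blocked(s, 0.85)]
```

With the threshold at 0.70, three of the four requests are blocked. At 0.85 only the highest-scoring one is, so the requests scoring 0.72 and 0.80 now pass.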
Disable a specific check
If a check is consistently producing false positives for your use case, toggle it off in dashboard → Checks. You can re-enable it at any time.
Fail open while debugging check errors
Set AGENTGUARDS_FAIL_OPEN=true in the env block of your settings.json. This lets traffic through when a check returns an error. It does NOT bypass explicit block decisions, so it will not unblock a false positive. Use it only while diagnosing, and remove it when done.
# In your dashboard → Checks, toggle off the check causing the block,
# or tune the threshold for that check to be less sensitive.
# To let traffic through when a check itself returns an error, add:
"AGENTGUARDS_FAIL_OPEN": "true"
# to the "env" block of your settings.json. This does NOT bypass blocks
# that explicitly matched.
Still stuck? Email support with the correlation ID from the log entry and we will look at the specific request.
Uninstall
Remove AgentGuards from your agent in a few steps.
Claude Code — Hooks
Remove the hook entries from ~/.claude/settings.json. You can also remove the env block if you added it only for AgentGuards.
# Remove the hook entries from ~/.claude/settings.json
# Then optionally delete the script
rm ~/.claude/agentguards_hook.py
Claude Code — MCP
claude mcp remove agentguards
OpenAI Codex
Remove the [mcp_servers.agentguards] section from ~/.codex/config.toml.
REST API / Gateway
Remove the evaluate-input call from your application code. No client software to uninstall.
To revoke your API key, go to dashboard → Settings → API Keys and delete the key. The key will stop working immediately.