judge.arnao.ai

Every production agent must answer seven questions. Most fail five.

Live reference implementation of the Kill-Switch Contract. Verdicts currently come from a single model, Claude Sonnet 4.5; Phase 2 adds the full 3-model council.

v0.3 · maintained by Byron Arnao · attribution: Nate B. Jones (May 20, 2026)

Try It

How this works

  1. Paste a description of an AI agent action below (what it does, what it can access, what stops it)
  2. Click "Judge this action" to send the description to Claude Sonnet 4.5 (a real model call, about 10–15 seconds)
  3. Get a structured verdict on the 7 questions, plus recommended fixes

The Seven Questions

  1. Does this agent have a clear, specific, and bounded purpose?
  2. Are all inputs and outputs explicitly defined and limited?
  3. Are all credentials, permissions, and access scopes precisely minimal?
  4. Is there a hard spend ceiling or resource limit?
  5. Is there a comprehensive, immutable audit log of all decisions and actions?
  6. Does this agent have at least two independent, immediate kill switches?
  7. Is this agent's behavior fully reversible or indemnifiable?

Four Kill-Switch Planes

Runtime Cancel

Stop the agent's execution at the source.

Example: Disable a Lambda function, delete a cron job, kill a Docker container.

Identity Revoke

Remove the agent's ability to authenticate or authorize.

Example: Revoke IAM role, delete API key, remove OAuth token.

Gateway Block

Prevent the agent's network traffic from reaching its targets.

Example: Firewall rule, security group block, API Gateway throttle.

Payment Freeze

Halt any financial transactions initiated by the agent.

Example: Freeze Stripe account, block credit card, suspend bank transfers.
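
Question 6 asks for at least two independent kill switches. One way to read "independent" is that the switches sit on different planes from the four above, so a single failure cannot disable both. The sketch below encodes that reading; the Plane enum, the KillSwitch type, and has_independent_kill_switches are hypothetical names chosen for illustration, not part of Judge.

```python
from dataclasses import dataclass
from enum import Enum


class Plane(Enum):
    """The four kill-switch planes described above."""
    RUNTIME_CANCEL = "runtime_cancel"
    IDENTITY_REVOKE = "identity_revoke"
    GATEWAY_BLOCK = "gateway_block"
    PAYMENT_FREEZE = "payment_freeze"


@dataclass(frozen=True)
class KillSwitch:
    plane: Plane
    description: str


def has_independent_kill_switches(switches: list[KillSwitch], minimum: int = 2) -> bool:
    """Assumption: switches are independent only if they sit on different planes."""
    return len({s.plane for s in switches}) >= minimum


# The payroll-runner example declares a single switch ("cron disable").
payroll = [KillSwitch(Plane.RUNTIME_CANCEL, "cron disable")]
print(has_independent_kill_switches(payroll))  # False

# Adding a switch on a second plane satisfies the check.
payroll.append(KillSwitch(Plane.IDENTITY_REVOKE, "revoke bank-write credential"))
print(has_independent_kill_switches(payroll))  # True
```

Two switches on the same plane (for example, two ways of stopping the same cron job) would still count as one under this reading.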

What this is / what this isn't

This IS:

  • A live reference implementation of the Kill-Switch Contract.
  • A tool for builders to self-evaluate their agent designs.
  • A starting point for a robust agent governance framework.
  • An example of using LLMs for structured policy evaluation.
  • Open-source and built in public.

This IS NOT:

  • A substitute for human oversight or legal review.
  • A guarantee of safety or compliance.
  • A comprehensive security audit tool.
  • A magic bullet for all AI safety concerns.
  • A product of AWS (this is a personal project).

Call this from another agent

Judge is not just a webpage. The same evaluator is exposed as an HTTP endpoint — any agent in your stack can POST an action description and get back a structured verdict. Useful as a pre-flight check in any production agent loop before the model executes a risky action.

Use case: pre-flight gate in an autonomous agent

An autonomous email-triage agent is about to execute a write action (move money, send email, modify a calendar). Before it does, it asks Judge whether the action passes the Kill-Switch Contract. If overall is "BLOCKED", the agent stops and pings the human. If it is "WARN", the agent requires explicit human confirmation before proceeding. If it is "CLEARED", the agent proceeds.

curl

curl -X POST https://judge.demo.arnao.ai/api/judge \
  -H "Content-Type: application/json" \
  -d '{"action":"Agent: payroll-runner\nRuntime: cron weekly\nActs for: cfo@acme.com\nOutputs: ACH transfers to employee bank accounts\nCredentials: bank-write\nSpend ceiling: $50K/run\nKill switch: cron disable"}'

Python (use inside another agent)

import httpx

async def judge_action(action_desc: str) -> dict:
    """Pre-flight check: call before executing a risky agent action."""
    async with httpx.AsyncClient(timeout=30) as client:
        r = await client.post(
            "https://judge.demo.arnao.ai/api/judge",
            json={"action": action_desc},
        )
        r.raise_for_status()  # surface HTTP errors instead of parsing an error body
        return r.json()

# In your agent loop (await is only valid inside an async function):
async def preflight(proposed_action: str) -> None:
    verdict = await judge_action(proposed_action)
    if verdict["overall"] == "BLOCKED":
        raise PermissionError(f"Judge blocked: {verdict['verdict_reason']}")
    elif verdict["overall"] == "WARN":
        # request_human_approval is your own helper; Judge does not provide it
        await request_human_approval(verdict["fixes"])
    # else: CLEARED, proceed
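
The branching above assumes that overall is always one of the three documented values. A more defensive gate might map the verdict to a decision and treat anything unexpected (a missing key, an unknown value, a non-dict payload) as a stop. The Decision enum and decide function below are hypothetical, and failing closed is an assumption about how a gate should behave, not documented Judge behavior.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    ASK_HUMAN = "ask_human"
    STOP = "stop"


# The three documented values of "overall" and what the agent does with each.
_BY_OVERALL = {
    "CLEARED": Decision.PROCEED,
    "WARN": Decision.ASK_HUMAN,
    "BLOCKED": Decision.STOP,
}


def decide(verdict: object) -> Decision:
    """Map a Judge verdict to a decision, failing closed on anything unexpected."""
    if not isinstance(verdict, dict):
        return Decision.STOP
    overall = verdict.get("overall")
    if not isinstance(overall, str):
        return Decision.STOP
    return _BY_OVERALL.get(overall, Decision.STOP)


print(decide({"overall": "WARN"}))  # Decision.ASK_HUMAN
print(decide({"overall": "maybe"}))  # Decision.STOP
```

The same rule could extend to network failures: if the call to Judge times out or returns an HTTP error, the caller can catch the exception and treat it as Decision.STOP.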

MCP-style tool definition (Claude/Cursor/Cline-callable)

{
  "name": "judge_kill_switch_contract",
  "description": "Pre-flight check for an AI agent action against the Seven-Question Kill-Switch Contract. Returns a structured verdict (CLEARED, WARN, or BLOCKED) with recommended fixes. Use before any agent action that touches money, identity, external systems, or sensitive data.",
  "input_schema": {
    "type": "object",
    "properties": {
      "action": {
        "type": "string",
        "description": "Plain-text description of the proposed agent action: runtime, identity, scope, mutations, spend, audit, kill switch."
      }
    },
    "required": ["action"]
  }
}
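
On the agent-runtime side, something has to receive the model's tool call and forward it to Judge. The sketch below shows one possible handler. The handle_tool_call name and the injected judge callable are hypothetical; the only details taken from the definition above are the tool name and the required string field action. Rejecting empty strings is an extra assumption, since the schema itself does not require it.

```python
from typing import Any, Callable

TOOL_NAME = "judge_kill_switch_contract"


def handle_tool_call(
    name: str,
    arguments: dict[str, Any],
    judge: Callable[[str], dict],
) -> dict:
    """Validate a tool call against the schema above, then forward it to `judge`."""
    if name != TOOL_NAME:
        raise ValueError(f"unknown tool: {name}")
    action = arguments.get("action")
    # The schema requires a string; refusing blank input is this sketch's assumption.
    if not isinstance(action, str) or not action.strip():
        raise ValueError("'action' is required and must be a non-empty string")
    return judge(action)


# Usage with a stand-in for the real HTTP call:
def fake_judge(action: str) -> dict:
    return {"overall": "WARN", "fixes": ["add a second kill switch"]}


result = handle_tool_call(TOOL_NAME, {"action": "Agent: payroll-runner"}, fake_judge)
print(result["overall"])  # WARN
```

Passing the judge function in as a parameter keeps the handler testable without a network call; in production it would wrap the HTTP request shown earlier.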

Response schema

{
  "verdicts": [
    {"q":1, "title":"Where does it run?", "status":"pass|partial|fail", "reasoning":"..."},
    ... 7 entries ...
  ],
  "overall": "CLEARED" | "WARN" | "BLOCKED",
  "verdict_reason": "one-sentence summary",
  "fixes": ["actionable fix 1", "fix 2", ...],
  "model": "claude-sonnet-4-5",
  "latency_ms": 12340,
  "council_note": "v0.3 single-model. Phase 2 wires full 3-model council.",
  "timestamp": "2026-05-24T03:30:00Z"
}
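
A caller may want to check the response against this shape before acting on it. The sketch below parses a payload into typed records and raises ValueError when a field is missing or holds an undocumented value. The QuestionVerdict, JudgeResponse, and parse_response names are hypothetical, and requiring exactly seven verdict entries is an assumption drawn from the "7 entries" note in the schema.

```python
from dataclasses import dataclass

VALID_STATUS = {"pass", "partial", "fail"}
VALID_OVERALL = {"CLEARED", "WARN", "BLOCKED"}


@dataclass(frozen=True)
class QuestionVerdict:
    q: int
    title: str
    status: str
    reasoning: str


@dataclass(frozen=True)
class JudgeResponse:
    verdicts: list[QuestionVerdict]
    overall: str
    verdict_reason: str
    fixes: list[str]


def parse_response(payload: dict) -> JudgeResponse:
    """Return a typed response, or raise ValueError if the shape is unexpected."""
    overall = payload.get("overall")
    if not isinstance(overall, str) or overall not in VALID_OVERALL:
        raise ValueError(f"unexpected overall: {overall!r}")
    try:
        verdicts = [
            QuestionVerdict(
                q=int(v["q"]),
                title=str(v["title"]),
                status=str(v["status"]),
                reasoning=str(v["reasoning"]),
            )
            for v in payload["verdicts"]
        ]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"malformed verdicts: {exc!r}") from exc
    if len(verdicts) != 7:  # assumption: one entry per question
        raise ValueError(f"expected 7 verdicts, got {len(verdicts)}")
    bad = [v.q for v in verdicts if v.status not in VALID_STATUS]
    if bad:
        raise ValueError(f"unexpected status on questions {bad}")
    return JudgeResponse(
        verdicts=verdicts,
        overall=overall,
        verdict_reason=str(payload.get("verdict_reason", "")),
        fixes=[str(f) for f in (payload.get("fixes") or [])],
    )
```

Fields such as model, latency_ms, council_note, and timestamp are ignored here; a stricter parser could capture them as well.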

The v0.3 endpoint is open and unauthenticated for demo purposes. Phase 2 adds API keys, rate limits, a persistent audit log, and the full 3-model council with dissent surfaced.