judge.arnao.ai v0.3

Try It

How this works

Paste a description of an AI agent action below (what it does, what it can access, what stops it)
Click "Judge this action" to send it to Claude Sonnet 4.5 (a real call, about 10–15 seconds)
Get a structured verdict on the seven questions, plus recommended fixes

The Seven Questions

Does this agent have a clear, specific, and bounded purpose?
Are all inputs and outputs explicitly defined and limited?
Are all credentials, permissions, and access scopes precisely minimal?
Is there a hard spend ceiling or resource limit?
Is there a comprehensive, immutable audit log of all decisions and actions?
Does this agent have at least two independent, immediate kill switches?
Is this agent's behavior fully reversible or indemnifiable?

Four Kill-Switch Planes

Runtime Cancel

Stop the agent's execution at the source.

Example: Disable a Lambda function, delete a cron job, kill a Docker container.

Identity Revoke

Remove the agent's ability to authenticate or authorize.

Example: Revoke IAM role, delete API key, remove OAuth token.

Gateway Block

Prevent the agent's network traffic from reaching its targets.

Example: Firewall rule, security group block, API Gateway throttle.

Payment Freeze

Halt any financial transactions initiated by the agent.

Example: Freeze Stripe account, block credit card, suspend bank transfers.
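
Question 6 asks for at least two independent kill switches, and the four planes give one possible way to test that. The sketch below assumes that two switches count as independent only when they sit on different planes; that reading is an assumption, not something Judge is documented to enforce. The `Plane`, `KillSwitch`, and `has_independent_switches` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Plane(Enum):
    RUNTIME_CANCEL = "runtime"
    IDENTITY_REVOKE = "identity"
    GATEWAY_BLOCK = "gateway"
    PAYMENT_FREEZE = "payment"


@dataclass(frozen=True)
class KillSwitch:
    name: str
    plane: Plane


def has_independent_switches(switches: list[KillSwitch], minimum: int = 2) -> bool:
    """Assumption: switches are independent only if they sit on different planes."""
    return len({s.plane for s in switches}) >= minimum


# A single runtime-plane switch, then the same agent with an identity-plane switch added.
payroll = [KillSwitch("cron disable", Plane.RUNTIME_CANCEL)]
hardened = payroll + [KillSwitch("revoke bank-write key", Plane.IDENTITY_REVOKE)]
```

Under this reading, `payroll` does not meet the two-switch bar and `hardened` does.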

What this is / what this isn't

This IS:

A live reference implementation of the Kill-Switch Contract.
A tool for builders to self-evaluate their agent designs.
A starting point for a robust agent governance framework.
An example of using LLMs for structured policy evaluation.
Open-source and built in public.

This IS NOT:

A substitute for human oversight or legal review.
A guarantee of safety or compliance.
A comprehensive security audit tool.
A magic bullet for all AI safety concerns.
A product of AWS (this is a personal project).

Call this from another agent

Judge is not just a webpage. The same evaluator is exposed as an HTTP endpoint — any agent in your stack can POST an action description and get back a structured verdict. Useful as a pre-flight check in any production agent loop before the model executes a risky action.

Use case: pre-flight gate in an autonomous agent

An autonomous email-triage agent is about to execute a write action (move money, send email, modify a calendar). Before it does, it asks Judge whether the action passes the Kill-Switch Contract. If overall is "BLOCKED", the agent stops and pings a human. If it is "WARN", the agent requires explicit human confirmation. If it is "CLEARED", the agent proceeds.

curl

curl -X POST https://judge.demo.arnao.ai/api/judge \
  -H "Content-Type: application/json" \
  -d '{"action":"Agent: payroll-runner\nRuntime: cron weekly\nActs for: cfo@acme.com\nOutputs: ACH transfers to employee bank accounts\nCredentials: bank-write\nSpend ceiling: $50K/run\nKill switch: cron disable"}'
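
When the description spans several lines, the line breaks need to reach the server as JSON `\n` escapes. The sketch below builds the same payroll-runner description from a dict and lets `json.dumps` handle the escaping. `build_payload` is a hypothetical helper, and the field labels are simply those used in the curl example.

```python
import json


def build_payload(fields: dict[str, str]) -> str:
    """Join "Label: value" lines and serialize them as the request body."""
    action = "\n".join(f"{label}: {value}" for label, value in fields.items())
    return json.dumps({"action": action})


body = build_payload({
    "Agent": "payroll-runner",
    "Runtime": "cron weekly",
    "Acts for": "cfo@acme.com",
    "Outputs": "ACH transfers to employee bank accounts",
    "Credentials": "bank-write",
    "Spend ceiling": "$50K/run",
    "Kill switch": "cron disable",
})
```

The resulting `body` is a single-line string that can be sent as the POST body as-is.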

Python (use inside another agent)

import httpx

async def judge_action(action_desc: str) -> dict:
    """Pre-flight check: call before executing a risky agent action."""
    async with httpx.AsyncClient(timeout=30) as client:
        r = await client.post(
            "https://judge.demo.arnao.ai/api/judge",
            json={"action": action_desc},
        )
        r.raise_for_status()  # surface HTTP errors instead of parsing an error body
        return r.json()

# In your agent loop (await is only valid inside an async function).
# request_human_approval is your own hook; Judge does not provide it.
async def preflight(proposed_action: str) -> None:
    verdict = await judge_action(proposed_action)
    if verdict["overall"] == "BLOCKED":
        raise PermissionError(f"Judge blocked: {verdict['verdict_reason']}")
    elif verdict["overall"] == "WARN":
        await request_human_approval(verdict["fixes"])
    # else: CLEARED, proceed

MCP-style tool definition (Claude/Cursor/Cline-callable)

{
  "name": "judge_kill_switch_contract",
  "description": "Pre-flight check for an AI agent action against the Seven-Question Kill-Switch Contract. Returns a structured verdict (CLEARED, WARN, or BLOCKED) with recommended fixes. Use before any agent action that touches money, identity, external systems, or sensitive data.",
  "input_schema": {
    "type": "object",
    "properties": {
      "action": {
        "type": "string",
        "description": "Plain-text description of the proposed agent action: runtime, identity, scope, mutations, spend, audit, kill switch."
      }
    },
    "required": ["action"]
  }
}
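
The definition only declares the tool; the host still has to turn a tool call into a request to the endpoint. Here is a minimal sketch of that dispatch step, assuming the host passes the tool name and an arguments dict. `handle_tool_call`, the injectable `post` callable, and `fake_post` are hypothetical. The non-empty check on `action` is stricter than the schema, which only requires a string.

```python
from typing import Any, Callable

JUDGE_URL = "https://judge.demo.arnao.ai/api/judge"
TOOL_NAME = "judge_kill_switch_contract"

# post(url, json_body) -> parsed JSON response; injected so the HTTP client is swappable.
PostFn = Callable[[str, dict[str, Any]], dict[str, Any]]


def handle_tool_call(name: str, arguments: dict[str, Any], post: PostFn) -> dict[str, Any]:
    """Validate a tool call against the input schema, then forward it to Judge."""
    if name != TOOL_NAME:
        raise ValueError(f"unknown tool: {name}")
    action = arguments.get("action")
    if not isinstance(action, str) or not action.strip():
        # "action" is the only required property; rejecting blanks is an added assumption.
        raise ValueError("'action' must be a non-empty string")
    return post(JUDGE_URL, {"action": action})


def fake_post(url: str, body: dict[str, Any]) -> dict[str, Any]:
    """Stand-in transport for local testing; returns a canned, made-up verdict."""
    return {"overall": "WARN", "fixes": ["add a second kill switch"], "echo": body}
```

In production the `post` argument would wrap a real HTTP client; passing `fake_post` lets the dispatch logic be exercised without touching the network.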

Response schema

{
  "verdicts": [
    {"q":1, "title":"Where does it run?", "status":"pass|partial|fail", "reasoning":"..."},
    ... 7 entries ...
  ],
  "overall": "CLEARED" | "WARN" | "BLOCKED",
  "verdict_reason": "one-sentence summary",
  "fixes": ["actionable fix 1", "fix 2", ...],
  "model": "claude-sonnet-4-5",
  "latency_ms": 12340,
  "council_note": "v0.3 single-model. Phase 2 wires full 3-model council.",
  "timestamp": "2026-05-24T03:30:00Z"
}
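
A caller will often want to know which of the seven questions failed, not just the overall status. The sketch below pulls those out of a response shaped like the schema. `failing_questions` is a hypothetical helper, and the sample dict is illustration data, not captured Judge output.

```python
from typing import Any


def failing_questions(verdict: dict[str, Any], statuses: tuple[str, ...] = ("fail",)) -> list[int]:
    """Return the question numbers whose status is in `statuses`, in ascending order."""
    return sorted(
        entry["q"]
        for entry in verdict.get("verdicts", [])
        if entry.get("status") in statuses
    )


# Illustrative data only; shaped like the schema, not captured from the endpoint.
sample = {
    "verdicts": [
        {"q": 1, "title": "Where does it run?", "status": "pass", "reasoning": "..."},
        {"q": 4, "title": "...", "status": "partial", "reasoning": "..."},
        {"q": 6, "title": "...", "status": "fail", "reasoning": "..."},
    ],
    "overall": "BLOCKED",
}

print(failing_questions(sample))                       # [6]
print(failing_questions(sample, ("fail", "partial")))  # [4, 6]
```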

The v0.3 endpoint is open and unauthenticated for demo purposes. Phase 2 adds API keys, rate limits, a persistent audit log, and the full 3-model council with dissent surfaced.
