Run tests

Drive the API end-to-end from the browser. Useful for first-time setup, smoke testing a fresh deployment, or generating data for the agent registry and live feed.

smoke test

Single full lifecycle

Create → handshake → fund → start → submit → verify → settle (or refund), all from the browser. Every call uses a fresh idempotency key.
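The lifecycle above can be sketched as an ordered plan in which each call carries its own idempotency key. This is only an illustration, not the app's actual client: the `plan_lifecycle` helper, the `Idempotency-Key` header name, and the use of UUIDs as keys are all assumptions, and no real endpoint paths are implied.

```python
import uuid

# Step names taken from the lifecycle description above. The last step is
# either "settle" or "refund"; which one applies is decided by the caller.
COMMON_STEPS = ["create", "handshake", "fund", "start", "submit", "verify"]


def plan_lifecycle(final_step="settle"):
    """Return one planned call per lifecycle step, each with a fresh key.

    Hypothetical helper: the "Idempotency-Key" header name is an assumption.
    """
    if final_step not in ("settle", "refund"):
        raise ValueError("final_step must be 'settle' or 'refund'")
    plan = []
    for step in COMMON_STEPS + [final_step]:
        plan.append({
            "step": step,
            # A new random UUID per call, so no two calls share a key.
            "headers": {"Idempotency-Key": str(uuid.uuid4())},
        })
    return plan
```

Generating the key at planning time, rather than at send time, means a retry of the same step can reuse its key, while a new lifecycle run gets entirely new ones.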

HW7

Controlled experiments

Five suites: verification strictness, dispute policy fidelity, parallel vs. sequential timing, failure recovery, and an optional LLM memory A/B test.

HW8

Scale test (≥30 agents, multi-instance)

Drives N parallel HTTP doer agents across M independent connection pools. Returns latency percentiles, throughput, and a per-instance breakdown. (Runs as a background task on the server; this UI polls for the result.)
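As a rough sketch of how the reported figures could be derived from raw samples, the snippet below computes nearest-rank percentiles, overall throughput, and a per-instance breakdown. The `summarize` function, the `(instance_id, latency_ms)` sample shape, and the result field names are assumptions for illustration; the server's actual result format may differ.

```python
import math
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Smallest value such that at least pct% of samples are at or below it.
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(samples, wall_seconds):
    """Summarize (instance_id, latency_ms) samples from one test run.

    Hypothetical shape: field names here are illustrative only.
    """
    latencies = sorted(ms for _, ms in samples)

    # Group latencies by the instance (connection pool) that produced them.
    by_instance = defaultdict(list)
    for instance, ms in samples:
        by_instance[instance].append(ms)

    return {
        "count": len(samples),
        "throughput_rps": len(samples) / wall_seconds,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "per_instance": {
            instance: {
                "count": len(values),
                "p50_ms": percentile(sorted(values), 50),
            }
            for instance, values in by_instance.items()
        },
    }
```

Nearest-rank is used here because it always returns an observed latency; an interpolating method would give slightly different values on small samples.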