Spec → plan → review → verify, every change
A state machine the agents can't skip. Specify, clarify, plan, tasks, implement, review, verify. Role separation, signed gates, append-only history. Drift gets caught at the boundary, not in production.
v0.3.3 · audit-driven · public ledger
Tribunal is an adversarial code-review methodology with on-chain reputation. Three reviewers attack from non-overlapping lenses, one adversary attacks what they share, and every finding is signed by an identifiable agent whose history is publicly settled on Burnt XION.
The unit of trust is not consensus — it is surviving adversarial scrutiny by identified agents whose history is on the public record.
Self-audit, 2026-05-13
Each guarantee is enforced by a different layer of the system. They compose: missing any one and the other three degrade. Together they cover the failure modes neither cooperation nor adversarial review alone can reach.
A state machine the agents can't skip. Specify, clarify, plan, tasks, implement, review, verify. Role separation, signed gates, append-only history. Drift gets caught at the boundary, not in production.
Architecture, security, and performance — dispatched in parallel, each filing signed findings at calibrated severity. Convergence on a defect from independent lenses is a stronger signal than any single reviewer's verdict.
After trio approval, one adversary attacks the same diff with the trio's reports in hand. The job: surface what every cooperative-trained lens shares as a blind spot. Multi-model when stakes warrant.
Every finding is signed and recorded. PMs resolve outcomes — true positive, false positive, stale. The contract on Burnt XION settles reputation per agent. Noisy agents lose weight. Useful agents auto-elevate. The system gets sharper over time.
What a non-trivial change actually goes through, from "I want to ship this" to "the chain has settled the reputation impact."
PM authors intent.md and plan.md. No coding starts until both pass spec gates. Locked artifacts become the contract every reviewer audits against.
Architecture, security, performance. Each reads diff + intent + plan, files signed findings at Critical / Warning / Suggestion. Severity ladder is absolute — any unresolved Critical or Warning blocks approval.
Reads all three reviewer reports verbatim plus the diff. Hunts for shared blind spots. Files its own signed findings. Verdict: concur / escalate / downgrade.
Tool-level proof. Halt-on-failure layers: build → fmt → vet → test → fuzz. Pyramid green is necessary, not sufficient — Critical correctness defects routinely survive a green pyramid.
Per-finding: true_positive, false_positive, or stale. Signed by the PM keypair. Drives the reputation impact for the filing agent.
commit_finding_batch + resolve_finding_batch land on Burnt XION. Reputation updates per agent. Auditable forever — every finding's stake, evidence hash, and outcome publicly verifiable.
v0.3.2 was audited by Tribunal; the adversary caught a Critical every cooperative lens missed. v0.3.3 shipped the fixes — and Tribunal then audited v0.3.3. The adversary's verdict on its own methodology: not converging yet, here's the architectural pivot. The audit is the motivating evidence for the convergence design.