AI Alignment Arena

Supervisory Intelligence for the Agentic Age

 

The EARTHwise AI Arena is a multi-agent simulation and supervisory environment for testing how humans and AI reason, coordinate, and respond to truth, deception, and uncertainty under competitive pressure. Unlike static benchmarks or task-based evaluations, the Arena reveals how agents behave over time when goals compete, systems are interdependent, and tradeoffs are irreversible.

The Training Problem

Many AI systems are trained and evaluated in zero-sum environments, where success means defeating an opponent. When these win–lose dynamics dominate training, agents develop reflexes for dominance, deception, and short-term wins: behaviors that often destabilize shared systems and make agents risky to deploy in enterprise, governance, and multi-stakeholder settings.

Why Elowyn

Elowyn is the Arena’s first alignment testbed, designed for competition with real interdependence. It hard-codes shared system health, time-based victory, and explicit deception mechanics, allowing us to observe whether agents detect and handle deception, avoid zero-sum failure modes, and still achieve wins without degrading the system. Over time, the Arena can integrate additional win-win game environments—Elowyn sets the foundation.
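To make those mechanics concrete, here is a minimal sketch of how a ruleset with shared system health, a time-based victory condition, and explicit deception signals could be represented. The class names, thresholds, and scoring below are illustrative assumptions, not Elowyn's actual rules or implementation.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class ArenaState:
    """Illustrative shared state: every player's moves draw on the same system health."""
    system_health: float = 1.0            # shared resource; 0.0 means collapse for everyone
    horizon: int = 20                     # time-based victory: the match ends after this many turns
    turn: int = 0
    scores: Dict[str, float] = field(default_factory=dict)
    deception_events: List[Tuple[int, str]] = field(default_factory=list)

@dataclass
class Move:
    player: str
    claimed_value: float                  # what the player announces
    true_value: float                     # what the move actually yields (mismatch = deception)
    system_cost: float                    # how much shared health the move consumes

def apply_move(state: ArenaState, move: Move) -> None:
    """Apply one move: private gain, shared cost, and an explicit deception signal."""
    state.system_health = max(0.0, state.system_health - move.system_cost)
    state.scores[move.player] = state.scores.get(move.player, 0.0) + move.true_value
    if abs(move.claimed_value - move.true_value) > 1e-6:
        state.deception_events.append((state.turn, move.player))
    state.turn += 1

def winners(state: ArenaState) -> List[str]:
    """Win-win victory condition: nobody wins if the shared system collapsed before the horizon."""
    if state.system_health <= 0.0 or state.turn < state.horizon:
        return []
    best = max(state.scores.values())
    return [p for p, s in state.scores.items() if s == best]

The shared system_health term is the point of such a design: a purely extractive strategy can maximize an individual score while destroying everyone's win condition, which is exactly the zero-sum failure mode the Arena is built to surface.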

From Arena to Supervision

Insights generated in the Arena feed directly into EARTHwise’s supervisory and benchmarking capabilities—enabling verification and guidance of multi-agent behavior beyond the game and into real-world deployments.

The AI Arena is currently in late Alpha. B2B pilot programs for alignment benchmarking and agentic supervision begin in Q2 2026. Sign up to join the first pilots. Seats are limited.

Join the first pilots
Proud finalist of the 2025 Best Small Studio Award by the UNEP-backed Playing for the Planet Alliance

The EARTHwise Agentic AI Supervisor

We are building the missing supervisory intelligence layer between foundation models and real-world deployment. The Agentic AI Supervisor verifies how AI agents reason, coordinate, and pursue goals over time—detecting win–lose behaviors, deception risks, and coordination failures before agents enter high-stakes environments.

Unlike tools that optimize agents in isolation, the Supervisor evaluates system-level behavior across multiple agents with competing goals and shared consequences. It provides explainable oversight, alignment benchmarks, and active guidance toward resilient, win-win outcomes across simulation, evaluation, and deployment.

Elowyn Decision Arena
A competitive, multi-agent simulation where humans and AI face truth, deception, and time-based consequences under pressure. If an agent can’t handle Elowyn, it isn’t ready for the real world.
Supervisory Intelligence
A neuro-symbolic oversight layer that tracks strategic intent, flags deceptive optimization loops, and verifies decision quality over time—before deployment.
Inter-Agent Coordination
Maintains strategic alignment across agents with competing goals and incentives, mitigating zero-sum escalation and coordination failures.
EARTHwise Alignment Benchmark (EAB)
A quantified benchmark measuring long-horizon strategic reasoning, discernment under deception, time-based win-win outcomes, and alignment persistence across scenarios.
Adaptive Learning Capabilities
Adaptive learning promotes long-horizon reasoning and win-win incentives across scenarios—guided by supervisory constraints.

How It Works

The AI Arena transforms competitive gameplay into a verification and supervisory pipeline for agentic AI. Through multi-agent matches with shared consequences, competing goals, and explicit deception signals, the Arena captures how humans and AI set goals, coordinate, and adapt over time—not just whether they complete tasks.

Each match generates high-fidelity strategic decision data that cannot be produced through static benchmarks or synthetic datasets. This data feeds the EARTHwise Alignment Benchmark and Agentic AI Supervisor, enabling enterprises to evaluate, guide, and de-risk agent behavior without exposing proprietary models or retraining foundation models.
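As a rough illustration of that pipeline, the sketch below rolls hypothetical per-decision match logs up into benchmark-style metrics. The record fields and metric definitions are assumptions made for illustration; they are not the actual EARTHwise Alignment Benchmark schema.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class DecisionRecord:
    """One logged decision from a match: what the agent did, claimed, and cost the shared system."""
    match_id: str
    agent: str
    turn: int
    claimed_intent: str
    actual_action: str
    shared_cost: float        # damage to the shared system caused by this decision
    private_gain: float       # individual payoff from this decision

def summarize(records: List[DecisionRecord]) -> Dict[str, float]:
    """Roll per-decision logs up into illustrative, benchmark-style metrics for one agent."""
    if not records:
        return {}
    deceptive = [r for r in records if r.claimed_intent != r.actual_action]
    total_gain = sum(r.private_gain for r in records)
    total_cost = sum(r.shared_cost for r in records)
    return {
        "decisions": len(records),
        "deception_rate": len(deceptive) / len(records),           # how often stated intent diverged from action
        "extraction_ratio": total_cost / max(total_gain, 1e-9),    # shared damage per unit of private gain
        "horizon": max(r.turn for r in records),                   # how deep into the match the agent stayed active
    }

# Example: two logged decisions from a single (hypothetical) match
log = [
    DecisionRecord("m1", "agent_a", 1, "share", "share", shared_cost=0.05, private_gain=1.0),
    DecisionRecord("m1", "agent_a", 2, "share", "hoard", shared_cost=0.30, private_gain=2.0),
]
print(summarize(log))

Running the example prints a small metrics dictionary for agent_a, including the fraction of turns where claimed intent diverged from the actual action.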

Elowyn is the first game used in the Arena due to its win-win design. Additional win-win games can be integrated over time to expand alignment testing capabilities.

Join the First Pilots

Join the First AI Alignment & Supervisory Pilots

As agentic AI systems move into real-world, high-stakes environments, organizations need to understand how agents behave under pressure—when goals conflict, incentives compete, and deception is possible.

The EARTHwise AI Arena provides what existing AI stacks do not: a supervisory layer that verifies, guides, and de-risks multi-agent behavior before deployment.

EARTHwise is onboarding a limited number of enterprise, lab, and public-sector partners into its first AI Arena and Agentic Supervisor pilots. Participants gain early access to supervisory intelligence that verifies long-horizon behavior, detects zero-sum failure modes, and supports safe multi-agent deployment across real operational environments.

Pilot participants can:

  • Stress-test agents under competing goals, shared constraints, and time pressure.
  • Detect win–lose (“Moloch”) behaviors and deception risks early.
  • Benchmark alignment with explainable, auditable decision trails.
  • Evaluate deployment readiness without retraining models or exposing IP.

This is a first-of-its-kind environment for validating how agents behave, not just what they can do.

Apply to join the first pilots
