AI Alignment Arena

Supervisory Intelligence for the Age of AI Agents

The EARTHwise AI Arena is a multi-agent simulation and supervisory environment for testing how humans and AI reason, coordinate, and respond to truth, deception, and uncertainty under competitive pressure. Unlike task-based evaluations or static benchmarks, the Arena reveals not just what agents decide, but how they behave over time when faced with competing goals, interdependence, and irreversible tradeoffs.

The Training Problem

Many reinforcement-learning systems are trained in zero-sum games where success is defined by defeating an opponent. When such win–lose environments dominate training, agents learn strategies optimized for domination rather than coordination. Under real-world conditions—ambiguity, time pressure, and competing interests—this manifests as adversarial play, deceptive signaling, and short-term optimization, making these systems risky to deploy in enterprise, governance, and multi-stakeholder settings.

Why Elowyn

Elowyn is the first game introduced into the Arena, purpose-built for alignment testing through competitive yet interdependent play. Instead of rewarding domination, it embeds shared system health, time-based victory conditions, and explicit deception signaling into gameplay. This allows the Arena to test whether intelligence can detect and counter deception, resist zero-sum traps, and pursue victory without collapsing the shared system—capabilities required for safe deployment.

Over time, the Arena can support additional win-win game environments. Elowyn establishes the foundation.

The Agentic AI Supervisor

Beyond the game environment, EARTHwise is developing an Agentic AI Supervisor that applies these insights to real-world, multi-agent deployments. The Supervisor detects win–lose (“Moloch”) behaviors, monitors coordination under competing goals, and guides agents toward resilient, win-win outcomes—before and during high-stakes operations.

The AI Arena is currently in late Alpha. B2B pilot programs for alignment benchmarking and agentic supervision begin in Q2 2026. Sign up below to join the first pilots. Seats are limited.

Join the first pilots
Proud finalist of the 2025 Best Small Studio Award by the UNEP-backed Playing for the Planet Alliance

The EARTHwise Agentic AI Supervisor

We are building the missing supervisory intelligence layer between foundation models and real-world deployment. The Agentic AI Supervisor verifies how AI agents reason, coordinate, and pursue goals over time—detecting win–lose behaviors, deception risks, and coordination failures before agents enter high-stakes environments.

Unlike tools that optimize agents in isolation, the Supervisor evaluates system-level behavior across multiple agents with competing goals and shared consequences. It provides explainable oversight, alignment benchmarks, and active guidance toward resilient, win-win outcomes across simulation, evaluation, and deployment.

Elowyn Decision Arena
A competitive, multi-agent simulation where humans and AI face truth, deception, and time-based consequences under pressure. If an agent can’t handle Elowyn, it isn’t ready for the real world.
Supervisory Intelligence
A neuro-symbolic oversight layer that tracks strategic intent, flags deceptive optimization loops, and verifies decision quality over time—before deployment.
Inter-Agent Coordination
Maintains strategic alignment across agents with competing goals and incentives, mitigating zero-sum escalation and coordination failures.
EARTHwise Alignment Benchmark (EAB)
A quantified benchmark measuring long-horizon strategic reasoning, discernment under deception, time-based win-win outcomes, and alignment persistence across scenarios.
Adaptive Learning Capabilities
Reinforces long-horizon reasoning and win-win incentives across scenarios, guided by supervisory constraints.

How It Works

The AI Arena transforms competitive gameplay into a verification and supervisory pipeline for agentic AI. Through multi-agent matches with shared consequences, competing goals, and explicit deception signals, the Arena captures how humans and AI set goals, coordinate, and adapt over time—not just whether they complete tasks.

Each match generates high-fidelity strategic decision data that cannot be produced through static benchmarks or synthetic datasets. This data feeds the EARTHwise Alignment Benchmark and Agentic AI Supervisor, enabling enterprises to evaluate, guide, and de-risk agent behavior without exposing proprietary models or retraining foundation models.

Elowyn is the first game used in the Arena due to its win-win design. Additional win-win games can be integrated over time to expand alignment testing capabilities.

Join the First Pilots

Join the First AI Alignment & Supervisory Pilots

As agentic AI systems move into real-world, high-stakes environments, organizations need to understand how agents behave under pressure—when goals conflict, incentives compete, and deception is possible.

The EARTHwise AI Arena provides what existing AI stacks do not: a supervisory layer that verifies, guides, and de-risks multi-agent behavior before deployment.

EARTHwise is onboarding a limited number of enterprise, lab, and public-sector partners into its first AI Arena and Agentic Supervisor pilots. Participants gain early access to supervisory intelligence that verifies long-horizon behavior, detects zero-sum failure modes, and supports safe multi-agent deployment across real operational environments.

Pilot participants can:

  • Stress-test agents under competing goals, shared constraints, and time pressure.
  • Detect win–lose (“Moloch”) behaviors and deception risks early.
  • Benchmark alignment with explainable, auditable decision trails.
  • Evaluate deployment readiness without retraining models or exposing IP.

This is a first-of-its-kind environment for validating how agents behave, not just what they can do.

Apply to join the first pilots

Fields marked with * are mandatory
