LLM Evaluation + Reinforcement Learning

ReinforceTactics

The open-source tactical strategy environment for evaluating large language models and training reinforcement learning agents. Benchmark GPT-5, Claude, Gemini, and custom AI on strategic reasoning, multi-step planning, and resource management.

Evaluate Large Language Models

ReinforceTactics provides a rigorous benchmark for testing LLM capabilities in strategic reasoning, spatial awareness, and long-horizon planning. Compare models head-to-head in competitive tournaments.

🟢

OpenAI GPT-5

Evaluate GPT-5 and GPT-5 Mini on complex tactical scenarios requiring multi-step planning.

🟣

Anthropic Claude

Benchmark Claude 4.5 Sonnet, Claude 4.5 Opus, and Claude Haiku 4.5 on strategic reasoning tasks.

🔵

Google Gemini

Test Gemini Pro and Gemini Ultra on spatial reasoning and resource management.

Custom Models

Integrate any LLM via API or local inference for comparative evaluation.

Run automated tournaments, generate Elo ratings, and analyze decision-making patterns across different model architectures and prompting strategies.
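Plugging a custom model into an evaluation loop could look like the sketch below. The `Agent` interface, `CustomLLMAgent` class, and the prompt/reply format are illustrative assumptions, not ReinforceTactics' actual API; `call_model` stands in for any callable that wraps an HTTP API or local inference server.

```python
# Hypothetical adapter for evaluating a custom LLM; names and the
# comma-separated reply format are assumptions for illustration only.
from abc import ABC, abstractmethod


class Agent(ABC):
    """Minimal interface a custom model adapter might implement."""

    @abstractmethod
    def choose_action(self, observation: dict) -> dict: ...


class CustomLLMAgent(Agent):
    def __init__(self, call_model):
        # call_model: any callable mapping a prompt string to a reply string.
        self.call_model = call_model

    def choose_action(self, observation: dict) -> dict:
        prompt = f"Board state: {observation}. Reply with unit_id,action,target."
        reply = self.call_model(prompt)
        unit_id, action, target = reply.split(",")
        return {
            "unit": int(unit_id),
            "action": action.strip(),
            "target": target.strip(),
        }
```

Because the adapter only depends on a plain callable, the same evaluation code can drive a hosted API, a local model, or a scripted baseline.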

Learn About Tournaments

Rich Tactical Environment

Four distinct unit types create a complex decision space that challenges AI agents to reason about positioning, resource allocation, and opponent modeling.

Warrior
Frontline Fighter
Stalwart defenders who excel in close combat. High durability makes them perfect for holding the line.
HP 15 · Attack 10 · Defense 6 · Movement 3

Mage
Arcane Striker
Masters of mystical arts who can strike from afar and paralyze enemies for 3 turns.
HP 10 · Attack 12 · Defense 4 · Movement 2

Cleric
Support Healer
Devoted healers who restore allies and cure status effects. Essential for sustained campaigns.
HP 8 · Attack 2 · Defense 4 · Movement 2

Archer
Ranged Specialist
Precise marksmen with extended range from high ground. Enemies cannot counter-attack.
HP 15 · Attack 5 · Defense 1 · Movement 3
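For programmatic use, the unit stats above can be captured as plain Python data. The `UnitStats` structure is an illustrative sketch, not the project's internal schema; only the numbers come from the stat cards.

```python
# Unit stats transcribed from the cards above; the dataclass layout
# is an assumption for illustration, not ReinforceTactics' own schema.
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitStats:
    hp: int
    attack: int
    defense: int
    movement: int


UNITS = {
    "Warrior": UnitStats(hp=15, attack=10, defense=6, movement=3),
    "Mage":    UnitStats(hp=10, attack=12, defense=4, movement=2),
    "Cleric":  UnitStats(hp=8,  attack=2,  defense=4, movement=2),
    "Archer":  UnitStats(hp=15, attack=5,  defense=1, movement=3),
}
```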

Built for AI Research

A complete tactical environment designed for reinforcement learning experimentation, LLM benchmarking, and AI development.

🎮

Turn-Based Tactical Combat

Strategic grid-based battles with attacks, counter-attacks, paralysis, and healing mechanics inspired by Fire Emblem and Advance Wars.

🤖

Gymnasium RL Environment

Full Gymnasium compatibility with multi-discrete action space, configurable reward shaping, and headless mode for high-speed training.
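The Gymnasium contract that such an environment follows can be sketched with a tiny stand-in class. Everything here is illustrative: the class name, observation fields, and the `(4, 3, 16)` multi-discrete shape (unit, action type, target cell) are assumptions, not the environment's real spaces; what it demonstrates is the `reset`/`step` return signatures Gymnasium defines.

```python
# Toy stand-in for a Gymnasium-style env with a MultiDiscrete-shaped action.
# All names and dimensions are illustrative assumptions.
import random


class TinyTacticsEnv:
    """Minimal env following Gymnasium's reset/step signatures."""

    nvec = (4, 3, 16)  # e.g. 4 units, 3 action types, 16 target cells

    def reset(self, seed=None):
        self.rng = random.Random(seed)
        self.turn = 0
        # Gymnasium's reset returns (observation, info)
        return {"turn": self.turn}, {}

    def sample_action(self):
        # Mimics sampling from a MultiDiscrete action space
        return tuple(self.rng.randrange(n) for n in self.nvec)

    def step(self, action):
        self.turn += 1
        terminated = self.turn >= 10  # toy episode cap
        # Gymnasium's step returns (obs, reward, terminated, truncated, info)
        return {"turn": self.turn}, 0.0, terminated, False, {}
```

With the real environment, the same loop would run against `gym.make(...)` in headless mode for high-throughput training.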

🧠

LLM Evaluation Framework

Benchmark GPT-5, Claude, Gemini, and other large language models on strategic reasoning, planning, and multi-step decision making.

🏆

Tournament System

Run automated tournaments between AI agents, track Elo ratings, and generate detailed performance analytics and leaderboards.
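The leaderboard math behind such ratings is the standard Elo update, shown below as a minimal sketch. The K-factor of 32 is a conventional default, not necessarily what the tournament system uses.

```python
# Standard Elo rating update (a common leaderboard scheme);
# K=32 is a conventional choice, not confirmed by the project.
def elo_update(rating_a, rating_b, score_a, k=32):
    """Return updated (rating_a, rating_b).

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a loss.
    """
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```

For example, if two 1200-rated agents play and A wins, A gains 16 points and B loses 16; a draw between equals changes nothing.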

📊

Replay & Analysis Tools

Record battles, export replays to video, and analyze decision patterns. Essential for AI research and model interpretability.

🔧

Modular Architecture

Clean, extensible Python codebase for adding new units, mechanics, reward functions, and custom AI agents.

Start Evaluating Your AI Models

Clone the repository, run your first LLM tournament, and discover how different models perform on strategic reasoning tasks. Open source and ready for research.