Pydantic AI Agents playing Diplomacy
- Add color:power:model key to map
- Run a big experiment set
- Analyze results in R
- Update Readme
brew install git uv
git clone git@github.com:zachmayer/diplomacy-agents.git
cd diplomacy-agents
make install
make types
GitHub Actions runs make check-ci for every push & PR.
All config (ruff, pyright) lives in pyproject.toml. The Makefile is self-documenting:
make help
Uses the diplomacy Python package for the game engine. The API docs are very useful:
- https://diplomacy.readthedocs.io/en/stable/api/diplomacy.engine.game.html
- https://diplomacy.readthedocs.io/en/stable/api/diplomacy.engine.map.html
- https://diplomacy.readthedocs.io/en/stable/api/diplomacy.engine.message.html
- https://diplomacy.readthedocs.io/en/stable/api/diplomacy.engine.power.html
- https://diplomacy.readthedocs.io/en/stable/api/diplomacy.engine.renderer.html
- https://diplomacy.readthedocs.io/en/stable/api/diplomacy.utils.export.html
- No cast() or # type: ignore outside diplomacy_agents/engine.py.
  • If another file needs a cast, create a typed helper in engine.py instead.
  • The only allowed suppressions are for untyped third-party libraries.
- No raw str in public APIs.
  • Plain strings are for human-readable text.
  • For tokens, use explicit type aliases from diplomacy_agents/types.py (e.g., Phase) or Literal types from diplomacy_agents/literals.py (e.g., Power).
- Runtime-validated I/O.
  • All data structures are defined as Pydantic models in diplomacy_agents/models.py to ensure data is valid at module boundaries.
  • All function parameters are explicitly typed; never use bare *args or **kwargs.
- Full static safety.
  • pyright --strict must report zero errors; CI enforces this via make lint types.
- Keep the helpers façade thin.
  • engine.py is the only module that talks directly to the untyped diplomacy library. Keep it minimal and well-commented.
- Zero-tolerance for legacy fallbacks.
  • Code must target the dependency versions pinned in pyproject.toml: no runtime version checks, try/except branches, or alternate code paths for "other" versions.
  • If an API changes, update the project; do not add fallbacks or shims.
Before every commit, run make check-all locally.
- Primary entry point: make check-all runs formatting, linting, type-checking, and tests. CI enforces the same.
- One-off script: uv run script.py (never python …).
- diplomacy_agents/models.py: Contains all Pydantic data models.
- diplomacy_agents/types.py: Contains all semantic type aliases (e.g., type Order = str).
- diplomacy_agents/literals.py: Contains all typing.Literal definitions for constrained value sets.

- Imports: All imports must be absolute and placed at the top of the file.
- Names: Use snake_case for variables and functions. Export public symbols via __all__.
- Simplicity: Keep code linear and obvious. Avoid clever one-liners or unnecessary branching. Use keyword-only arguments for clarity.
- Forbidden: Any, Optional where a default is possible, unchecked reflection, and silent # type: ignore comments.
- Single escape hatch: engine.py is the only place cast() or # type: ignore may be used to handle the untyped diplomacy library.
- Prefer explicit types: Use Literal, NewType, and Pydantic models over raw primitives like str or dict.
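The style rules above might look like this in a small module. This is only a sketch: GameId, Season, and describe_turn are hypothetical names chosen for illustration and are not taken from the project.

```python
from typing import Literal, NewType

# Export public symbols explicitly.
__all__ = ["GameId", "Season", "describe_turn"]

# NewType gives identifiers a distinct static type while staying a str at runtime.
GameId = NewType("GameId", str)

# Literal constrains a token to a closed set of values.
Season = Literal["SPRING", "FALL", "WINTER"]


def describe_turn(*, game_id: GameId, season: Season, year: int = 1901) -> str:
    """Return human-readable text; a plain str is appropriate for prose.

    All arguments are keyword-only, and year has a default rather than
    being Optional.
    """
    return f"game {game_id}: {season.lower()} {year}"
```

A call such as describe_turn(game_id=GameId("g1"), season="FALL") reads unambiguously at the call site, while a positional call raises TypeError.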
- Validate external inputs at boundaries using Pydantic models.
- Favor immutability: use frozen Pydantic models or tuples for data you don't intend to mutate.
- Fail fast with clear exceptions; never use silent fallbacks.
- Never "swallow" exceptions (e.g. blanket except Exception: pass or logging-only): either
  - re-raise the error,
  - raise a domain-specific exception, or
  - handle it in a way that guarantees program correctness.
Logging and continuing is only acceptable for non-critical, optional features that don't affect core behaviour and where the log message explicitly says what failed and why. In all other cases, let the error propagate so CI catches it.