Add ABIDES arena #104
@Muhtasham thanks so much for all this push, this is really exciting! I ended up taking a more thorough look though, and for a couple reasons, let's shelve this one for now (leave PR open rather than merge). To list the reasons:
I do really want to incorporate a market-making arena here; I think the head-to-head format is a key missing component that would be really desirable. For instance, the Jane Street ETC competition unfortunately doesn't seem to have an open-source edition, but that'd be closer to what I'd imagine is appropriate for CodeClash. But I'll leave this open. Perhaps we can keep chatting about it, but for now, no pressure to make this one work.
Thanks John, that makes sense. I agree this is the right place to pause rather than force it in.
To clarify the current design: #104 is independent score-maximization over matched seeded ABIDES worlds. Each player gets the same config/seed schedule with a market maker and ZI background traders, but those worlds are instantiated separately per player. So the submitted policies do not directly interact in the same market, and I agree that this is not head-to-head in the Bomberland/CoreWar sense.
I also agree the scoring is harder to explain than ideal. It uses runtime-owned ledgers updated from trusted ABIDES exchange execution messages, but that does require stitching into ABIDES internals.
I think the next useful version would be a redesign around shared-market competition: multiple CodeClash submissions trading in the same ABIDES market, with runtime-owned ledgers/scoring per participant and careful controls for ordering/position artifacts. That is more work and should probably be designed explicitly rather than patched onto this PR.
I’ll leave this open as a reference implementation for the Docker/runtime adapter and restricted policy boundary, but won’t push for merge as-is.
Summary
- Pin `abides-sim/abides` to `c4bf157678928934417aba6073eb0651aeaf6d15`, constrain Python dependencies, and pin `pip` in the arena image
- `abides_agent.py` defines `decide(observation)` and returns declarative buy/sell limit-order intents
Competition Format
ABIDES is an independent score-maximization arena, not direct model-vs-model trading in the same market instance.
Each CodeClash player is evaluated in an isolated ABIDES market process using matched seeds/configuration: exchange, market maker, and background zero-intelligence traders. CodeClash compares players by average trusted mark-to-market profit across those matched market runs.
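The mark-to-market comparison described above can be illustrated with a small ledger. This is a minimal sketch, not the PR's code: the `Ledger` class, its method names, and the `"buy"`/`"sell"` side strings are hypothetical, and it assumes profit is measured as final cash plus the open position valued at the final exchange price, minus starting cash.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Ledger:
    """Hypothetical runtime-owned book for one player in one market run."""

    starting_cash: float
    cash: float = field(init=False)
    shares: int = 0

    def __post_init__(self) -> None:
        self.cash = self.starting_cash

    def apply_fill(self, side: str, quantity: int, price: float) -> None:
        # Meant to be called only for executions reported by the exchange,
        # never with numbers supplied by submitted code.
        if side == "buy":
            self.cash -= quantity * price
            self.shares += quantity
        elif side == "sell":
            self.cash += quantity * price
            self.shares -= quantity
        else:
            raise ValueError(f"unknown side: {side!r}")

    def mark_to_market_profit(self, final_price: float) -> float:
        # Value any open position at the final exchange price.
        return self.cash + self.shares * final_price - self.starting_cash


def average_score(profits: list[float]) -> float:
    """Average profit across the matched market runs for one player."""
    return mean(profits)


ledger = Ledger(starting_cash=10_000.0)
ledger.apply_fill("buy", 10, 100.0)
ledger.apply_fill("sell", 4, 105.0)
print(ledger.mark_to_market_profit(final_price=102.0))  # 32.0
```

Because each player gets a separate ledger in a separately instantiated world, two players' scores are comparable only through the matched seeds, not through trades against each other.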
This follows the now-merged restricted-protocol pattern from CybORG (#110) and SCML (#111): submitted policies do not own mutable simulator objects, and the arena runtime owns validation, scoring, timeouts, and simulator integration. The competition is over who writes the better policy for the same seeded simulator conditions.
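To make the restricted-protocol idea concrete, here is a rough sketch of a policy that returns declarative intents and a runtime-side validator that copies them into fresh dictionaries. The observation keys (`best_bid`, `best_ask`), the intent fields (`side`, `price`, `quantity`), the integer price assumption, and the `MAX_QUANTITY` cap are all illustrative assumptions; the PR's actual schema is not shown in this conversation.

```python
from typing import Any

ALLOWED_SIDES = {"buy", "sell"}
MAX_QUANTITY = 100  # hypothetical per-order cap


def decide(observation: dict[str, Any]) -> list[dict[str, Any]]:
    """Example policy: quote one tick on either side of the mid price."""
    bid, ask = observation.get("best_bid"), observation.get("best_ask")
    if bid is None or ask is None:
        return []  # no two-sided book, so submit nothing
    mid = (bid + ask) // 2
    return [
        {"side": "buy", "price": mid - 1, "quantity": 10},
        {"side": "sell", "price": mid + 1, "quantity": 10},
    ]


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_intents(raw: Any) -> list[dict[str, Any]]:
    """Runtime-side check: accept only well-formed intents, copied afresh."""
    if not isinstance(raw, list):
        raise TypeError("decide() must return a list of intents")
    clean = []
    for item in raw:
        if not isinstance(item, dict):
            raise TypeError("each intent must be a dict")
        side, price, qty = item.get("side"), item.get("price"), item.get("quantity")
        if not isinstance(side, str) or side not in ALLOWED_SIDES:
            raise ValueError(f"invalid side: {side!r}")
        if not _is_int(price) or price <= 0:
            raise ValueError(f"invalid price: {price!r}")
        if not _is_int(qty) or not 0 < qty <= MAX_QUANTITY:
            raise ValueError(f"invalid quantity: {qty!r}")
        clean.append({"side": side, "price": price, "quantity": qty})
    return clean


intents = validate_intents(decide({"best_bid": 9990, "best_ask": 10010}))
```

The point of the copy step is that the runtime only ever acts on plain data it has checked itself, so a submission cannot hand it an object with surprising behavior.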
Runtime Behavior
- Runs `decide(observation)` code out-of-process with a per-decision timeout and passes only plain observation dictionaries
- Converts validated intents into `LimitOrder`s
- Computes scores from `ORDER_EXECUTED` messages plus final exchange price, not from mutable submitted-code state
- Assigns failing players `CRASH_SCORE` without stalling the whole round
- Lets local imports such as `from helper import X` work
Hardening
- Replaces the `TradingAgent` subclass submission contract with a restricted protocol boundary
- Adds `validation_timeout`, `decision_timeout`, and `player_timeout` config knobs
- Changes `abides_results.json` handling from neutral `0.0` ties to `CRASH_SCORE` with error details
Verification
- Rebased on `main` after Add Bomberland arena #105, Restrict CybORG player protocol #110, and Restrict SCML player protocol #111 merged
- `UV_CACHE_DIR=/private/tmp/codeclash-uv-cache uv run ruff check codeclash/arenas/abides/abides.py codeclash/arenas/abides/runtime/run_abides.py tests/arenas/test_abides.py` -> passed
- `UV_CACHE_DIR=/private/tmp/codeclash-uv-cache uv run pytest -q tests/arenas/test_abides.py` -> 13 passed
- `UV_CACHE_DIR=/private/tmp/codeclash-uv-cache uv run pre-commit run --files codeclash/arenas/abides/runtime/run_abides.py tests/arenas/test_abides.py` -> passed
- `docker build -t codeclash/abides -f codeclash/arenas/abides/ABIDES.Dockerfile .` -> passed
- Arena run reported `status: "ok"`, `orders_submitted: 18`, `policy_errors: 0`, and average scores of `-900.0` for both players
- Results include `cash`, `shares`, `orders_submitted`, `policy_errors`, and `status`
- Slow `decide()` calls hit the decision timeout without stalling the simulation
Note: local `UV_CACHE_DIR=/private/tmp/codeclash-uv-cache uv run pytest -q tests/arenas` reached 219 passed and 2 Figgie failures because Docker is not running locally (`/Users/muhtasham/.docker/run/docker.sock` missing). The same latest branch's GitHub pytest check is green.
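For readers unfamiliar with the out-of-process decision pattern, the sketch below shows one simple way a per-decision timeout could be enforced. It is an assumption-laden illustration, not the PR's runtime: it spawns a fresh interpreter per decision via `subprocess.run` (a real runtime would more likely keep a long-lived worker), and `WORKER`, `call_decide`, and the error strings are hypothetical names.

```python
import json
import subprocess
import sys

# Hypothetical worker script: reads a JSON request on stdin, executes the
# policy source, calls decide(), and writes the intents as JSON to stdout.
WORKER = """
import json, sys
request = json.load(sys.stdin)
namespace = {}
exec(request["policy_source"], namespace)
json.dump(namespace["decide"](request["observation"]), sys.stdout)
"""


def call_decide(policy_source: str, observation: dict, decision_timeout: float):
    """Return (intents, error). On timeout or crash, intents is empty."""
    payload = json.dumps(
        {"policy_source": policy_source, "observation": observation}
    )
    try:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER],
            input=payload,
            capture_output=True,
            text=True,
            timeout=decision_timeout,  # the child is killed if this expires
        )
    except subprocess.TimeoutExpired:
        return [], "decision timeout"
    if proc.returncode != 0:
        return [], "policy error"
    try:
        return json.loads(proc.stdout), None
    except json.JSONDecodeError:
        return [], "invalid policy output"


fast_policy = (
    "def decide(observation):\n"
    "    return [{'side': 'buy', 'price': observation['best_bid'], 'quantity': 1}]\n"
)
intents, error = call_decide(fast_policy, {"best_bid": 100}, decision_timeout=10.0)
```

Only JSON crosses the process boundary in either direction, so the policy sees plain observation dictionaries and a hung `decide()` costs at most one timeout interval rather than the whole simulation.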