chini-015-er-triage
Emergency Department Triage
Five severity levels, finite beds, one CT scanner. The wrong queue means someone dies.
Source: Emergency medicine literature, ESI (Emergency Severity Index) protocol, hospital ops research
Prompt
Design the patient flow for a mid-size hospital emergency department.

Functional:
- Patient arrives at the door, screened by triage nurse, assigned ESI level 1-5 (1 = critical, 5 = minor).
- ESI 1-2 routes directly to a resus bay (4 beds). ESI 3-5 routes to a regular bay (12 beds) or fast-track (6 beds for ESI 4-5).
- Diagnostic resources: CT (1), X-ray (2), labs (shared). Shared across all bays.
- Disposition: admit (to inpatient floor, may board in ED if no upstream beds), discharge, or transfer.

Non-functional:
- A mass-casualty event (4x arrival rate) must NOT cause ESI 1-2 patients to wait. Lower-severity flow must be paced or diverted.
- If CT scanner is down, patients needing imaging must be queued for transport to imaging center, not blocked from triage.
- If inpatient floor is full, ED boarding cannot starve incoming critical patients of beds.

Return a Chinilla CanvasState.
Components: door, triage, bays, resources, disposition.
Behaviors: split (severity routing), queue (waiting room, boarding), ratelimit (low-severity pacing), circuitbreaker (CT failover), batch (lab orders).
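
The severity split described in the prompt can be sketched in plain Python. This is only an illustration of the routing rule, not a Chinilla CanvasState (whose schema is not shown here). The bed counts come from the prompt; the pool names, `EDState`, and `route` are hypothetical, and what happens when all four resus beds are occupied is deliberately left out.

```python
from dataclasses import dataclass, field

# Bed counts come from the prompt; the pool names are illustrative.
CAPACITY = {"resus": 4, "regular": 12, "fast_track": 6}


@dataclass
class EDState:
    """Hypothetical snapshot of bed usage: pool name -> beds in use."""
    occupied: dict = field(default_factory=dict)

    def has_free(self, pool: str) -> bool:
        return self.occupied.get(pool, 0) < CAPACITY[pool]


def route(esi: int, state: EDState) -> str:
    """Return the destination for a patient with the given ESI level."""
    if esi not in (1, 2, 3, 4, 5):
        raise ValueError(f"ESI level must be 1-5, got {esi}")
    if esi <= 2:
        # Critical patients go straight to resus and never join the waiting room.
        # Resus overflow handling is out of scope for this sketch.
        return "resus"
    if esi >= 4 and state.has_free("fast_track"):
        return "fast_track"
    if state.has_free("regular"):
        return "regular"
    return "waiting_room"
```

For example, `route(4, EDState({"fast_track": 6}))` falls back to a regular bay because fast-track is full, and an ESI 5 patient with both pools full is sent to the waiting room.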
Constraints
- Max components: 14
- Required behaviors: split, queue, ratelimit
- Monthly budget: $1,200,000
Stress scenarios
- Steady arrivals (baseline): Normal mix of ESI levels, all resources up.
- Bus accident (spike): Arrivals 4x baseline, severity skewed high. ESI 1-2 must NOT wait.
- CT scanner offline (outage): CT down for maintenance. Imaging needs reroute, triage must keep flowing.
- Inpatient floor full (latency): Admits can't move upstairs, board in ED. Door must keep accepting critical patients.
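
One way to read the "Bus accident" requirement is that the ratelimit behavior applies only to lower-severity patients, while ESI 1-2 bypasses it entirely. The sketch below uses a token bucket, which is a swapped-in technique for illustration; the benchmark does not say how its ratelimit behavior works internally. `TokenBucket`, `pace`, and the rate and burst values are all hypothetical.

```python
class TokenBucket:
    """Simple token bucket driven by an explicit clock (minutes)."""

    def __init__(self, rate_per_min: float, burst: int):
        self.rate = rate_per_min
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now_min: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now_min - self.last
        self.last = now_min
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def pace(esi: int, now_min: float, bucket: TokenBucket) -> str:
    """Return 'proceed' or 'hold' for a triaged patient."""
    if esi <= 2:
        return "proceed"  # critical patients are never rate limited
    return "proceed" if bucket.allow(now_min) else "hold"
```

With `TokenBucket(rate_per_min=1.0, burst=2)`, two low-severity patients arriving at minute 0 proceed, a third is held, and an ESI 1 patient arriving at the same moment still proceeds.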
Pass criteria (overall)
- Min stability score: 60
- Max drop rate: 10.0%
- Min delivery rate: 85.0%
- Max errors: 8
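
The four thresholds above can be checked with a few comparisons. The following is a minimal sketch, not the official scorer: the `RunMetrics` field names are hypothetical, and treating each bound as inclusive (a stability of exactly 60 passes) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunMetrics:
    """Hypothetical container for the overall metrics of one run."""
    stability: float
    drop_rate_pct: float
    delivery_rate_pct: float
    errors: int


def failed_criteria(m: RunMetrics) -> List[str]:
    """Return the names of the overall pass criteria that the run misses."""
    failures = []
    if m.stability < 60:
        failures.append("stability")
    if m.drop_rate_pct > 10.0:
        failures.append("drop_rate")
    if m.delivery_rate_pct < 85.0:
        failures.append("delivery_rate")
    if m.errors > 8:
        failures.append("errors")
    return failures
```

As a usage example, a run with stability 59.0 and delivery 100.0 (placeholder values of 0.5% drop rate and 0 errors) yields `["stability"]`, since 59.0 is below the minimum of 60.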
Submit your run
Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.
End-to-end:
pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...
chini-bench run chini-015-er-triage \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice --x alice --linkedin alice-builds
Or inspect the prompt first:
chini-bench prompt chini-015-er-triage
Providers: openai · anthropic · google · openrouter · ollama
Leaderboard
| Rank | Submitter | Model | Score | Stability | Delivery | Design | Pass | Links |
|---|---|---|---|---|---|---|---|---|
| #1 | alex default | anthropic/claude-sonnet-4.6 | 84 | 73.0 | 100.0 | 100.0 | ✗ | X |
| #2 | alex default | openai/gpt-5.4 | 80 | 59.0 | 100.0 | 100.0 | ✗ | X |
| #3 | alex default | x-ai/grok-4.20 | 79 | 66.0 | 100.0 | 100.0 | ✗ | X |
| #4 | alex default | google/gemini-3.1-pro-preview | 61 | 22.0 | 100.0 | 100.0 | ✗ | X |
Per-scenario breakdown of the top run
| Scenario | Health | Drop rate | Delivered | Pass |
|---|---|---|---|---|
| baseline | 75.0 | 0.6% | 233 | ✓ |
| mass-casualty | 74.0 | 0.7% | 867 | ✓ |
| ct-down | 68.0 | 0.0% | 162 | ✓ |
| boarding | 75.0 | 0.7% | 214 | ✓ |
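
The per-scenario Health values of the top run average to the Stability figure shown for it on the leaderboard (73.0). Whether the scorer actually computes Stability as an unweighted mean is an inference from these numbers, not something the page states. The arithmetic can be verified with a short sketch:

```python
from statistics import mean

# Health values copied from the per-scenario breakdown of the top run.
health = {
    "baseline": 75.0,
    "mass-casualty": 74.0,
    "ct-down": 68.0,
    "boarding": 75.0,
}

# Unweighted mean across the four scenarios: (75 + 74 + 68 + 75) / 4.
average_health = mean(health.values())
print(average_health)  # 73.0
```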