chini-023-airline-gate-turnaround
Airline Gate Turnaround
25 minutes to deplane, refuel, clean, cater, and board 180 passengers. Anything later costs the airline a delay slot.
Source: Operations research, airline ground ops manuals
Prompt
Design the workflow for a single-gate aircraft turnaround at a major hub.
Functional:
- Inbound aircraft arrives. Passengers deplane via jetbridge. Baggage handlers unload cargo hold.
- Cleaning crew enters cabin. Caterers restock galley. Fueler tops off tanks. Lavatory truck services waste.
- Outbound passengers board via the same jetbridge once cleaning is done. Baggage handlers load outbound cargo.
- Pushback tug pushes the aircraft back. Gate is now free for the next inbound.
Non-functional:
- Total turnaround target: 25 minutes. Some steps run in parallel (cleaning + fueling), others must serialize (cannot board while cleaning).
- A 3x arrival surge during weather recovery must not collapse the gate (queue inbound flights or divert).
- If the fueler truck breaks down, the design must reroute to a backup truck or hold the next departure rather than send an aircraft out unfueled.
- Boarding cannot start before cleaning finishes (safety).
Return a Chinilla CanvasState modeling the parallel and serial workflow stages.
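The mix of parallel and serial stages in the prompt can be sanity-checked with a small critical-path calculation. The sketch below is illustrative only: every stage duration is a hypothetical value (the prompt states only the 25-minute total), and all dependencies except "boarding waits for cleaning" are assumptions made for the example. It does not use or imply any CanvasState schema.

```python
# Hypothetical stage durations in minutes; the prompt gives only the 25-minute target.
DURATIONS = {
    "deplane": 6, "unload_bags": 8, "clean": 8, "cater": 7,
    "fuel": 10, "lavatory": 5, "board": 10, "load_bags": 8, "pushback": 1,
}

# Each stage maps to the stages that must finish before it can start.
# Only "board" after "clean" comes from the prompt; the rest are assumptions.
DEPENDS_ON = {
    "deplane": [],
    "unload_bags": [],
    "clean": ["deplane"],
    "cater": ["deplane"],
    "fuel": [],
    "lavatory": [],
    "board": ["clean"],
    "load_bags": ["unload_bags"],
    "pushback": ["board", "load_bags", "fuel", "cater", "lavatory"],
}


def finish_times(durations, depends_on):
    """Return the earliest finish time of every stage, in minutes after arrival."""
    finish = {}

    def resolve(stage):
        if stage not in finish:
            # A stage starts once its slowest prerequisite has finished.
            start = max((resolve(dep) for dep in depends_on[stage]), default=0)
            finish[stage] = start + durations[stage]
        return finish[stage]

    for stage in depends_on:
        resolve(stage)
    return finish


def turnaround_minutes(durations=DURATIONS, depends_on=DEPENDS_ON):
    """Total turnaround is the finish time of the last stage to complete."""
    return max(finish_times(durations, depends_on).values())
```

With these assumed numbers the critical path is deplane, clean, board, pushback, which totals 25 minutes. Adding 4 minutes to `clean` moves the total to 29, so under these assumptions any cleaning delay lands directly on the turnaround time.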
Constraints
- Max components: 14
- Required behaviors: queue, circuitbreaker, split
- Monthly budget: $95000
Stress scenarios
- Standard turnaround (baseline): Normal arrival cadence, no failures. Aircraft turn in target window.
- Weather recovery surge (spike): Three-hour storm clears, 3x backlog hits the gate at once. Queue or divert.
- Fueler truck dies (outage): Primary fueler breaks down mid-shift. Reroute to backup or hold departures.
- Cleaning runs long (latency): Crew is short-staffed and each clean takes 4 extra minutes. Boarding must wait.
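Two of the scenarios map onto the required `queue` and `circuitbreaker` behaviors. The following is a minimal sketch of both ideas in plain Python, not the benchmark's component model: `GateQueue` and `FuelerBreaker` are hypothetical names, the capacity and failure threshold are arbitrary, and the breaker omits the recovery (half-open) step a production circuit breaker would normally have.

```python
from collections import deque


class GateQueue:
    """Bounded FIFO of inbound flights; overflow is diverted rather than dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.waiting = deque()
        self.diverted = []

    def arrive(self, flight):
        if len(self.waiting) < self.capacity:
            self.waiting.append(flight)
            return "queued"
        # Queue is full: record the flight as diverted to another gate.
        self.diverted.append(flight)
        return "diverted"


class FuelerBreaker:
    """Opens after `threshold` consecutive primary failures, then skips the primary."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.threshold

    def fuel(self, primary_ok, backup_ok=True):
        if not self.is_open:
            if primary_ok:
                self.failures = 0  # a success resets the failure count
                return "primary"
            self.failures += 1
        # Primary failed or breaker is open: try the backup truck.
        if backup_ok:
            return "backup"
        # No truck available: hold the departure instead of leaving unfueled.
        return "hold"
```

The `"hold"` result reflects the prompt's rule that an aircraft must never depart unfueled; how a held departure is represented in an actual canvas is left open here.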
Pass criteria (overall)
- Min stability score: 65
- Max drop rate: 10.0%
- Min delivery rate: 85.0%
- Max errors: 8
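Read literally, the four thresholds above amount to a simple conjunction. The sketch below shows that reading only; `RunMetrics` and `meets_pass_criteria` are hypothetical names, and the actual chini-bench scorer may apply further checks that are not listed on this page.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical container for the four metrics named in the pass criteria."""
    stability: float
    drop_rate_pct: float
    delivery_rate_pct: float
    errors: int


def meets_pass_criteria(m: RunMetrics) -> bool:
    """True only if every listed threshold is satisfied."""
    return (
        m.stability >= 65
        and m.drop_rate_pct <= 10.0
        and m.delivery_rate_pct >= 85.0
        and m.errors <= 8
    )
```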
Submit your run
Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.
End-to-end:
pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...
chini-bench run chini-023-airline-gate-turnaround \
--provider openrouter --model google/gemini-2.0-flash-001 \
--as alice
Or inspect the prompt first:
chini-bench prompt chini-023-airline-gate-turnaround
Providers: openai · anthropic · google · openrouter · ollama
Leaderboard
| Rank | Submitter | Model | Score | Stability | Delivery | Design | Pass |
|---|---|---|---|---|---|---|---|
| #1 | alex | google/gemini-3.1-pro-preview default reflexion | 92 | 85.0 | 100.0 | 100.0 | ✗ |
| #2 | alex | anthropic/claude-sonnet-4.6 default single-shot | 81 | 47.0 | 100.0 | 100.0 | ✗ |
| #3 | alex | anthropic/claude-sonnet-4.6 default reflexion | 77 | 45.0 | 100.0 | 100.0 | ✗ |
| #4 | alex | x-ai/grok-4.20 default reflexion | 72 | 48.0 | 61.0 | 100.0 | ✗ |
| #5 | alex | x-ai/grok-4.20 default single-shot | 66 | 74.0 | 0.0 | 100.0 | ✗ |
| #6 | alex | openai/gpt-5.4 default single-shot | 59 | 69.0 | 0.0 | 75.0 | ✗ |
| #7 | alex | google/gemini-3.1-pro-preview default single-shot | 48 | 0.0 | 30.0 | 100.0 | ✗ |
| #8 | alex | openai/gpt-5.4 default reflexion | 35 | 0.0 | 0.0 | 75.0 | ✗ |
Per-scenario breakdown of the top run
| Scenario | Health | Drop rate | Delivered | Pass |
|---|---|---|---|---|
| baseline | 85.0 | 0.0% | 72 | ✓ |
| weather-surge | 85.0 | 0.0% | 216 | ✓ |
| fueler-breakdown | 85.0 | 0.0% | 64 | ✓ |
| slow-cleaning | 85.0 | 0.0% | 64 | ✓ |