chini-004-uber-dispatch

Ride Dispatch (Uber-style matching)

Match riders to drivers in real time. Stay alive when a region's matcher dies.

Source: Classic system-design interview corpus (Uber / Lyft dispatch)

Prompt

Design a real-time ride-dispatch system.

Functional:
- POST /ride accepts a rider's request (location).
- The system matches the rider to a nearby driver and returns the match.
- Driver location updates stream in continuously.

Non-functional:
- A 5x spike in ride requests (rain in a city, end of a concert) must not degrade matching latency.
- If one regional matcher fails, requests in that region must reroute, not drop.
- Driver-location ingestion must not back-pressure the matching path.

Return a Chinilla CanvasState. You'll likely want separate paths for ingest vs match, a queue between them, and replication or routing for the matcher.
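
The queue and reroute requirements above can be sketched in plain Python. This is only an illustration of the behavior, not the CanvasState schema, which this page does not show. The names `dispatch`, `drain`, `MatcherDown`, and the region keys are all hypothetical, and matchers are modeled as simple callables.

```python
import queue
from typing import Callable, Dict, List, Optional, Tuple

Location = Tuple[float, float]
# Hypothetical matcher shape: takes a rider location, returns a driver id.
Matcher = Callable[[Location], str]


class MatcherDown(Exception):
    """Raised by a matcher that is currently unavailable (assumed failure signal)."""


def dispatch(rider: Location, home_region: str,
             matchers: Dict[str, Matcher], max_attempts: int = 3) -> Optional[str]:
    """Try the rider's home region first, then reroute to the other regions."""
    # Stable sort: the home region (key False) comes before all others (key True).
    order = sorted(matchers, key=lambda region: region != home_region)
    for region in order[:max_attempts]:
        try:
            return matchers[region](rider)
        except MatcherDown:
            continue  # retry on the next region instead of dropping the request
    return None  # every attempted matcher was down


def drain(requests: "queue.Queue[Tuple[Location, str]]",
          matchers: Dict[str, Matcher]) -> List[Optional[str]]:
    """Pull queued ride requests and dispatch each one until the queue is empty."""
    results: List[Optional[str]] = []
    while True:
        try:
            rider, region = requests.get_nowait()
        except queue.Empty:
            return results
        results.append(dispatch(rider, region, matchers))


def down_matcher(_rider: Location) -> str:
    raise MatcherDown()


def east_matcher(_rider: Location) -> str:
    return "driver-east-1"


# A bounded queue sits between request intake and matching.
ride_requests: "queue.Queue[Tuple[Location, str]]" = queue.Queue(maxsize=100)
ride_requests.put(((37.77, -122.42), "west"))
matched = drain(ride_requests, {"west": down_matcher, "east": east_matcher})
print(matched)  # ['driver-east-1']
```

The west matcher is down, so the request is retried against the east matcher rather than dropped. A real design would also bound retry time and decide what to do when the queue is full; both are left out here.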

Constraints

Max components: 14
Required behaviors: queue, retry
Monthly budget: $1200

Stress scenarios

Scenario	Type	Description
Baseline rides	baseline	Normal ride-request volume.
5x rush spike	spike	Ride volume jumps 5x (concert lets out).
Matcher outage	outage	A matching component fails. Requests must reroute.
Driver-location flood	cascade	Heavy location-update traffic with jitter. Must not back-pressure matches.
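
One common way to keep a location flood from back-pressuring matches is to store only the newest position per driver, so writes overwrite rather than queue. The sketch below assumes that approach; `LatestLocations` is a hypothetical name, and squared distance on raw coordinates stands in for a real geo index.

```python
import threading
from typing import Dict, Optional, Tuple

Location = Tuple[float, float]


class LatestLocations:
    """Keeps only the newest location per driver, so updates never pile up."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Dict[str, Location] = {}

    def update(self, driver_id: str, loc: Location) -> None:
        # Overwrite in place: memory is bounded by the number of drivers,
        # not by the number of updates received.
        with self._lock:
            self._latest[driver_id] = loc

    def nearest(self, rider: Location) -> Optional[str]:
        # Copy under the lock, then search outside it so ingest is not blocked.
        with self._lock:
            snapshot = dict(self._latest)
        if not snapshot:
            return None

        def squared_distance(driver_id: str) -> float:
            lat, lon = snapshot[driver_id]
            return (lat - rider[0]) ** 2 + (lon - rider[1]) ** 2

        return min(snapshot, key=squared_distance)

    def __len__(self) -> int:
        with self._lock:
            return len(self._latest)


store = LatestLocations()
# Simulate a flood: many updates, but only two distinct drivers.
for i in range(10_000):
    store.update("driver-a", (10.0 + i * 1e-6, 10.0))
    store.update("driver-b", (50.0, 50.0))
```

After the loop the store still holds two entries, and `store.nearest((10.0, 10.0))` returns `"driver-a"`.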

Pass criteria (overall)

Min stability score: 70
Max drop rate: 5.0%
Min delivery rate: 90.0%
Max errors: 5
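
The four thresholds above can be read as a single predicate. The sketch below assumes inclusive bounds and that the check applies to overall run metrics; the actual scorer may differ (for example, by also requiring every scenario to pass). `RunMetrics` and `passes` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    stability: float          # stability score, 0-100
    drop_rate_pct: float      # dropped requests, percent
    delivery_rate_pct: float  # delivered requests, percent
    errors: int               # error count


def passes(m: RunMetrics) -> bool:
    """Apply the overall pass criteria; inclusive bounds are an assumption."""
    return (
        m.stability >= 70
        and m.drop_rate_pct <= 5.0
        and m.delivery_rate_pct >= 90.0
        and m.errors <= 5
    )


# Illustrative values only: drop rate and error count here are made up.
ok = passes(RunMetrics(stability=78.0, drop_rate_pct=0.0,
                       delivery_rate_pct=100.0, errors=0))
low_delivery = passes(RunMetrics(stability=82.0, drop_rate_pct=0.0,
                                 delivery_rate_pct=87.0, errors=0))
```

Here `ok` is `True` and `low_delivery` is `False`, because 87.0% delivery falls below the 90.0% minimum.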

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-004-uber-dispatch \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-004-uber-dispatch

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	rl_v06_run2	rl_policy custom single-shot	89	78.0	100.0	60.0	✓
#2	rl_v06_run2	rl_policy custom single-shot	89	83.0	100.0	50.0	✗
#3	rl_v06_run2	rl_policy custom single-shot	89	82.0	94.0	75.0	✗
#4	rl_v06_run2	rl_policy custom single-shot	89	83.0	92.0	75.0	✗
#5	rl_v06_run1	rl_policy custom single-shot	87	82.0	87.0	60.0	✗
#6	rl_v06_run1	rl_policy custom single-shot	87	73.0	100.0	60.0	✓
#7	rl_v06_run1	rl_policy custom single-shot	87	82.0	87.0	85.0	✗
#8	rl_v06_run2	rl_policy custom single-shot	87	77.0	94.0	60.0	✗
#9	rl_v06_run2	rl_policy custom single-shot	87	83.0	92.0	35.0	✗
#10	rl_v06_run2	rl_policy custom single-shot	87	80.0	89.0	75.0	✗
#11	rl_v06_run2	rl_policy custom single-shot	87	80.0	91.0	85.0	✗
#12	rl_v06_run1	rl_policy custom single-shot	86	76.0	100.0	60.0	✗
#13	rl_v06_run2	rl_policy custom single-shot	86	80.0	87.0	85.0	✗
#14	rl_v06_run1	rl_policy custom single-shot	85	82.0	81.0	75.0	✗
#15	rl_v06_run2	rl_policy custom single-shot	85	80.0	100.0	25.0	✗
#16	rl_v06_run2	rl_policy custom single-shot	85	73.0	94.0	85.0	✗
#17	rl_v06_run2	rl_policy custom single-shot	85	74.0	92.0	85.0	✗
#18	rl_v06_run2	rl_policy custom single-shot	85	75.0	92.0	70.0	✗
#19	rl_v06_run2	rl_policy custom single-shot	85	78.0	87.0	60.0	✗
#20	rl_v06_run2	rl_policy custom single-shot	85	84.0	75.0	75.0	✗
#21	rl_v06_run2	rl_policy custom single-shot	85	84.0	75.0	75.0	✗
#22	rl_v06_run2	rl_policy custom single-shot	85	80.0	83.0	75.0	✗
#23	rl_v06_run2	rl_policy custom single-shot	85	84.0	75.0	75.0	✗
#24	rl_v06_run1	rl_policy custom single-shot	84	77.0	84.0	60.0	✗
#25	rl_v06_run2	rl_policy custom single-shot	84	78.0	84.0	75.0	✗
#26	rl_v06_run1	rl_policy custom single-shot	83	81.0	75.0	100.0	✗
#27	rl_v06_run1	rl_policy custom single-shot	83	77.0	83.0	85.0	✗
#28	rl_v06_run1	rl_policy custom single-shot	83	76.0	83.0	85.0	✗
#29	rl_v06_run1	rl_policy custom single-shot	83	80.0	75.0	75.0	✗
#30	rl_v06_run1	rl_policy custom single-shot	83	80.0	75.0	85.0	✗
#31	rl_v06_run2	rl_policy custom single-shot	83	81.0	75.0	60.0	✗
#32	rl_v06_run2	rl_policy custom single-shot	83	83.0	70.0	75.0	✗
#33	rl_v06_run2	rl_policy custom single-shot	83	71.0	100.0	50.0	✗
#34	rl_v06_run2	rl_policy custom single-shot	83	81.0	75.0	60.0	✗
#35	alex	anthropic/claude-sonnet-4.6 default single-shot	82	72.0	88.0	100.0	✗
#36	rl_v06_run1	rl_policy custom single-shot	82	80.0	74.0	75.0	✗
#37	rl_v06_run2	rl_policy custom single-shot	82	79.0	75.0	75.0	✗
#38	rl_v06_run1	rl_policy custom single-shot	81	76.0	92.0	25.0	✗
#39	rl_v06_run1	rl_policy custom single-shot	81	77.0	75.0	75.0	✗
#40	rl_v06_run2	rl_policy custom single-shot	81	80.0	78.0	60.0	✗
#41	rl_v06_run2	rl_policy custom single-shot	81	72.0	83.0	100.0	✗
#42	rl_v06_run2	rl_policy custom single-shot	81	81.0	75.0	60.0	✗
#43	rl_v06_run1	rl_policy custom single-shot	80	80.0	74.0	35.0	✗
#44	rl_v06_run2	rl_policy custom single-shot	80	80.0	67.0	75.0	✗
#45	rl_v06_run2	rl_policy custom single-shot	80	84.0	75.0	50.0	✗
#46	alex	google/gemini-3.1-pro-preview default reflexion	79	73.0	75.0	100.0	✗
#47	rl_v06_run2	rl_policy custom single-shot	79	83.0	75.0	50.0	✗
#48	rl_v06_run2	rl_policy custom single-shot	79	75.0	71.0	85.0	✗
#49	rl_v06_run2	rl_policy custom single-shot	79	72.0	75.0	60.0	✗
#50	rl_v06_run1	rl_policy custom single-shot	77	83.0	69.0	25.0	✗
#51	rl_v06_run2	rl_policy custom single-shot	77	69.0	75.0	85.0	✗
#52	rl_v06_run2	rl_policy custom single-shot	77	78.0	75.0	25.0	✗
#53	rl_v06_run2	rl_policy custom single-shot	77	69.0	75.0	100.0	✗
#54	rl_v06_run1	rl_policy custom single-shot	76	76.0	75.0	25.0	✗
#55	rl_v06_run2	rl_policy custom single-shot	75	77.0	55.0	60.0	✗
#56	rl_v06_run2	rl_policy custom single-shot	74	86.0	38.0	60.0	✗
#57	rl_v06_run1	rl_policy custom single-shot	72	77.0	46.0	85.0	✗
#58	alex	x-ai/grok-4.20 default reflexion	69	38.0	100.0	100.0	✗
#59	alex	openai/gpt-5.4 default reflexion	68	44.0	100.0	100.0	✗
#60	rl_v06_run1	rl_policy custom single-shot	68	81.0	25.0	75.0	✗
#61	alex	openai/gpt-5.4 default single-shot	53	65.0	0.0	75.0	✗
#62	alex	anthropic/claude-sonnet-4.6 default reflexion	46	0.0	100.0	100.0	✗
#63	rl_v06_run2	rl_policy custom single-shot	46	51.0	0.0	35.0	✗
#64	rl_v06_run2	rl_policy custom single-shot	36	32.0	0.0	45.0	✗
#65	alex	x-ai/grok-4.20 default single-shot	33	26.0	0.0	75.0	✗
#66	rl_v06_run2	rl_policy custom single-shot	33	25.0	0.0	35.0	✗
#67	alex	google/gemini-3.1-pro-preview default single-shot	30	19.0	0.0	100.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	81.0	0.0%	300	✓
rush-spike	80.0	0.0%	1500	✓
matcher-outage	71.0	0.0%	72	✓
ingest-pressure	80.0	0.0%	300	✓
