chini-001-url-shortener

URL Shortener (TinyURL)

Map long URLs to short tokens. Survive traffic spikes on the redirect path.

Source: Classic system-design interview corpus (TinyURL / bit.ly clone)

Prompt

Design a URL shortener.

Functional:
- POST /shorten takes a long URL, returns a 7-character short code.
- GET /:code returns a 302 redirect to the original URL.
- Reads (redirects) are 100x more frequent than writes.
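
One common way to produce a 7-character code is to base-62 encode a unique integer ID, sketched below. The prompt does not prescribe a scheme, so the alphabet order, the zero padding, and the names `encode`, `decode`, and `record_id` are illustrative assumptions rather than part of the benchmark.

```python
import string

# Assumption: digits, then lowercase, then uppercase. Any fixed ordering of
# 62 distinct symbols would work equally well.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CODE_LENGTH = 7
CAPACITY = len(ALPHABET) ** CODE_LENGTH  # 62**7 distinct codes


def encode(record_id: int) -> str:
    """Turn a non-negative integer ID into a fixed-width 7-character code."""
    if not 0 <= record_id < CAPACITY:
        raise ValueError("record_id does not fit in 7 base-62 characters")
    chars = []
    for _ in range(CODE_LENGTH):
        record_id, remainder = divmod(record_id, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode(code: str) -> int:
    """Inverse of encode: map a short code back to its integer ID."""
    value = 0
    for ch in code:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return value
```

Seven base-62 characters give 62^7 = 3,521,614,606,208 possible codes. Sequential IDs produce guessable codes; hashing or a randomized ID space are alternatives if that matters for the design.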

Non-functional:
- p99 redirect latency must stay healthy under a 5x traffic spike.
- A cache outage must not bring down the system.
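
The cache-outage requirement is often met with a cache-aside read that treats any cache failure as a miss. The sketch below shows that pattern with in-memory stand-ins; `DictCache`, `CacheUnavailable`, and `resolve` are hypothetical names, and storage is modeled as a plain dict rather than a real database.

```python
class CacheUnavailable(Exception):
    """Raised by the hypothetical cache client when the cache layer is down."""


class DictCache:
    """In-memory stand-in for a real cache; `available` simulates an outage."""

    def __init__(self):
        self.data = {}
        self.available = True

    def get(self, key):
        if not self.available:
            raise CacheUnavailable(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.available:
            raise CacheUnavailable(key)
        self.data[key] = value


def resolve(code, cache, storage):
    """Return the long URL for `code`, or None if the code is unknown."""
    try:
        url = cache.get(code)
        if url is not None:
            return url  # cache hit
    except CacheUnavailable:
        pass  # treat an outage as a miss and fall through to storage
    url = storage.get(code)
    if url is not None:
        try:
            cache.set(code, url)  # best-effort repopulation
        except CacheUnavailable:
            pass  # a failed cache write must not fail the redirect
    return url
```

During an outage every read reaches storage, so storage has to be sized (or protected) for the full read rate rather than only the cache-miss rate.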

Return a Chinilla CanvasState that models the request flow end to end. Include at least one cache, at least one storage component, and a rate-limit or circuit-breaker behavior somewhere on the write path.
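
For intuition about the rate-limit behavior on the write path, here is a minimal token-bucket sketch. It is not Chinilla's component model: the class name, parameters, and the choice of a token bucket over other limiters are assumptions made for illustration. The caller passes the current time explicitly so the behavior is deterministic; a real service would typically pass `time.monotonic()`.

```python
class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float = 0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller would reject the write, e.g. with HTTP 429
```

With `TokenBucket(rate_per_sec=1.0, capacity=2.0)`, two writes at t=0 are accepted, a third at t=0 is rejected, and one more is accepted at t=1 after a single token has been refilled.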

Constraints

Max components: 10
Required behaviors: storage, ratelimit
Monthly budget: $400

Stress scenarios

- Baseline reads (baseline): Steady redirect traffic with no failures.
- 5x read spike (spike): Traffic suddenly multiplies 5x. Redirect path must hold.
- Cache outage (outage): Cache layer disabled. All reads must fall through to storage.

Pass criteria (overall)

Min stability score: 70
Max drop rate: 5.0%
Min delivery rate: 90.0%
Max errors: 5
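
The four thresholds above can be read as a simple conjunction, sketched below. This is not the chini-bench scorer: `RunMetrics`, its field names, and `failed_criteria` are hypothetical, and the real scorer may apply further checks that are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical container for the four metrics named in the pass criteria."""
    stability: float          # stability score
    drop_rate_pct: float      # dropped requests, in percent
    delivery_rate_pct: float  # delivered requests, in percent
    errors: int               # error count


def failed_criteria(m: RunMetrics) -> list:
    """Return the names of the listed thresholds that the run misses."""
    failures = []
    if m.stability < 70:
        failures.append("stability")
    if m.drop_rate_pct > 5.0:
        failures.append("drop_rate")
    if m.delivery_rate_pct < 90.0:
        failures.append("delivery_rate")
    if m.errors > 5:
        failures.append("errors")
    return failures
```

An empty list means all four listed thresholds are met. For example, a run with stability 82 and delivery rate 89% would return `["delivery_rate"]` if its drop rate and error count are within limits.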

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your own API key, scores the result locally, and posts to the leaderboard. Other than the request to your model provider, nothing leaves your machine except the canvas the model produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-001-url-shortener \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-001-url-shortener

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	rl_v06_run1	rl_policy custom single-shot	94	87.0	100.0	50.0	✓
#2	rl_smoke8	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#3	rl_v06_run1	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#4	rl_v06_run1	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#5	rl_v06_run1	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#6	rl_smoke_v2_persist	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#7	rl_v06_run2	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#8	rl_v06_run2	rl_policy custom single-shot	92	83.0	100.0	50.0	✓
#9	rl_v06_run1	rl_policy custom single-shot	91	87.0	100.0	25.0	✗
#10	rl_v06_run1	rl_policy custom single-shot	90	80.0	100.0	60.0	✓
#11	rl_v06_run2	rl_policy custom single-shot	90	80.0	100.0	60.0	✓
#12	alex	google/gemini-3.1-pro-preview default single-shot	89	77.0	100.0	75.0	✓
#13	rl_v06_run1	rl_policy custom single-shot	89	78.0	100.0	50.0	✓
#14	rl_v06_run2	rl_policy custom single-shot	89	83.0	100.0	50.0	✗
#15	rl_v06_run2	rl_policy custom single-shot	89	77.0	100.0	50.0	✓
#16	rl_v06_run2	rl_policy custom single-shot	89	83.0	100.0	25.0	✗
#17	rl_v06_run1	rl_policy custom single-shot	88	79.0	96.0	50.0	✓
#18	rl_v06_run2	rl_policy custom single-shot	88	82.0	89.0	50.0	✗
#19	rl_v06_run1	rl_policy custom single-shot	87	73.0	100.0	50.0	✓
#20	rl_v06_run1	rl_policy custom single-shot	87	73.0	100.0	50.0	✓
#21	rl_v06_run2	rl_policy custom single-shot	87	73.0	100.0	50.0	✓
#22	rl_v06_run2	rl_policy custom single-shot	87	73.0	100.0	50.0	✓
#23	rl_v06_run2	rl_policy custom single-shot	87	73.0	100.0	50.0	✓
#24	rl_v06_run1	rl_policy custom single-shot	86	83.0	83.0	35.0	✗
#25	rl_v06_run2	rl_policy custom single-shot	86	83.0	83.0	50.0	✗
#26	rl_v06_run2	rl_policy custom single-shot	86	83.0	83.0	50.0	✗
#27	rl_v06_run2	rl_policy custom single-shot	86	83.0	83.0	50.0	✗
#28	alex	google/gemini-3.1-pro-preview default reflexion	84	78.0	82.0	75.0	✗
#29	rl_v06_run2	rl_policy custom single-shot	84	67.0	100.0	50.0	✓
#30	rl_v06_run2	rl_policy custom single-shot	84	83.0	83.0	50.0	✗
#31	rl_v06_run2	rl_policy custom single-shot	84	83.0	83.0	25.0	✗
#32	rl_v06_run2	rl_policy custom single-shot	84	83.0	83.0	50.0	✗
#33	rl_v06_run2	rl_policy custom single-shot	84	83.0	83.0	25.0	✗
#34	rl_v06_run2	rl_policy custom single-shot	84	73.0	100.0	25.0	✗
#35	rl_v06_run2	rl_policy custom single-shot	84	68.0	100.0	50.0	✓
#36	rl_v06_run2	rl_policy custom single-shot	84	83.0	83.0	50.0	✗
#37	rl_smoke8	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#38	rl_v06_run1	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#39	rl_v06_run1	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#40	rl_v06_run1	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#41	rl_v06_run1	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#42	rl_v06_run1	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#43	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#44	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#45	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#46	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#47	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#48	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#49	rl_v06_run2	rl_policy custom single-shot	82	84.0	67.0	50.0	✗
#50	rl_v06_run2	rl_policy custom single-shot	82	83.0	67.0	50.0	✗
#51	rl_v06_run2	rl_policy custom single-shot	82	78.0	78.0	50.0	✗
#52	rl_v06_run2	rl_policy custom single-shot	82	84.0	67.0	50.0	✗
#53	rl_v06_run2	rl_policy custom single-shot	81	78.0	83.0	50.0	✗
#54	rl_v06_run1	rl_policy custom single-shot	79	83.0	67.0	25.0	✗
#55	rl_v06_run2	rl_policy custom single-shot	79	83.0	67.0	50.0	✗
#56	rl_v06_run2	rl_policy custom single-shot	79	83.0	67.0	25.0	✗
#57	rl_v06_run2	rl_policy custom single-shot	79	83.0	67.0	50.0	✗
#58	rl_v06_run2	rl_policy custom single-shot	79	83.0	67.0	50.0	✗
#59	rl_v06_run2	rl_policy custom single-shot	78	55.0	100.0	75.0	✗
#60	rl_smoke8	rl_policy custom single-shot	77	73.0	67.0	50.0	✗
#61	rl_v06_run2	rl_policy custom single-shot	77	83.0	67.0	25.0	✗
#62	alex	anthropic/claude-sonnet-4.6 default reflexion	75	57.0	100.0	100.0	✗
#63	rl_v06_run1	rl_policy custom single-shot	71	83.0	33.0	50.0	✗
#64	alex	anthropic/claude-sonnet-4.6 default single-shot	63	25.0	100.0	100.0	✗
#65	alex	openai/gpt-5.4 default reflexion	61	22.0	100.0	100.0	✗
#66	alex	x-ai/grok-4.20 default reflexion	60	20.0	100.0	100.0	✗
#67	rl_smoke8	rl_policy custom single-shot	55	52.0	29.0	75.0	✗
#68	alex	x-ai/grok-4.20 default single-shot	54	67.0	0.0	25.0	✗
#69	alex	openai/gpt-5.4 default single-shot	47	13.0	67.0	100.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	87.0	0.0%	80	✓
5x-spike	85.0	0.0%	400	✓
cache-outage	88.0	0.0%	15	✓
