chini-009-video-upload

Video Upload Pipeline

Accept large uploads. Transcode in the background. Survive a worker meltdown.

Source: Classic system-design interview corpus (YouTube / TikTok upload + transcode)

Prompt

Design a video upload and transcode pipeline.

Functional:
- POST /upload accepts a video file and returns immediately with an upload id.
- A background pipeline transcodes the source into multiple resolutions (480p, 720p, 1080p).
- When transcoding finishes, the asset is published and watchable.

Non-functional:
- A 4x burst of uploads must not break the accept path; backlog is acceptable, dropped uploads are not.
- If the transcode worker pool partially fails, in-flight jobs should retry and complete.
- Upload acceptance must not block on transcode completion (decoupled paths).

Return a Chinilla CanvasState. Expect a queue between accept and workers, retry behavior, and storage for the source + variants.
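The three required behaviors can be illustrated in plain Python. This is a minimal in-memory sketch, not a CanvasState (whose schema is not shown on this page) and not the benchmark's reference design. The names `accept_upload`, `drain`, `storage`, `status`, and the retry budget `MAX_ATTEMPTS` are all assumptions made for illustration.

```python
import queue
import uuid

RESOLUTIONS = ("480p", "720p", "1080p")
MAX_ATTEMPTS = 3  # assumed retry budget; the task does not specify one

storage = {}          # stand-in for object storage: key -> bytes
status = {}           # upload id -> "queued" | "published" | "failed"
jobs = queue.Queue()  # buffer between the accept path and the workers


def accept_upload(data: bytes) -> str:
    """Store the source, enqueue a job, and return without transcoding."""
    upload_id = uuid.uuid4().hex
    storage[f"{upload_id}/source"] = data
    status[upload_id] = "queued"
    jobs.put((upload_id, 1))  # (upload id, attempt number)
    return upload_id


def drain(transcode) -> None:
    """Process queued jobs; re-enqueue a job whose transcode raised."""
    while not jobs.empty():
        upload_id, attempt = jobs.get()
        try:
            source = storage[f"{upload_id}/source"]
            variants = {res: transcode(source, res) for res in RESOLUTIONS}
        except Exception:
            if attempt < MAX_ATTEMPTS:
                jobs.put((upload_id, attempt + 1))  # retry later
            else:
                status[upload_id] = "failed"
            continue
        # Publish only once every variant has been produced.
        for res, blob in variants.items():
            storage[f"{upload_id}/{res}"] = blob
        status[upload_id] = "published"
```

The accept path touches only storage and the queue, so a slow or failing `transcode` callable cannot delay it; a real design would replace the in-memory structures with durable ones.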

Constraints

Max components: 13
Required behaviors: queue, retry, storage
Monthly budget: $1500

Stress scenarios

Scenario	Type	Description
Baseline uploads	baseline	Normal upload + transcode volume.
4x upload burst	spike	Creator event causes 4x uploads. Accept path must hold.
Worker failure	outage	A transcode worker dies. Jobs must retry and complete elsewhere.
Slow transcode	latency	Transcode latency triples. Pipeline must absorb without dropping uploads.
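The burst and slow-transcode scenarios both come down to arrivals outpacing worker capacity for a while. A rough way to reason about how much backlog the queue must hold is sketched below; the per-tick rates are hypothetical numbers chosen for illustration, not values used by the benchmark's simulator.

```python
def backlog_after(arrivals_per_tick: int, capacity_per_tick: int,
                  ticks: int, start: int = 0) -> int:
    """Queue depth after `ticks` steps of constant arrival and service rates."""
    backlog = start
    for _ in range(ticks):
        # The queue cannot go negative: idle capacity is simply unused.
        backlog = max(0, backlog + arrivals_per_tick - capacity_per_tick)
    return backlog


# Hypothetical rates: 10 uploads per tick at baseline, workers finish 12 per tick.
baseline = backlog_after(10, 12, ticks=5)  # workers keep up
burst = backlog_after(40, 12, ticks=5)     # 4x arrivals
slow = backlog_after(10, 4, ticks=5)       # capacity cut to one third
```

Under these assumed rates the queue stays empty at baseline, grows by 28 jobs per tick during the burst, and by 6 per tick when transcodes slow down, so the queue's capacity bounds how long either condition can last before uploads are dropped.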

Pass criteria (overall)

Min stability score: 65
Max drop rate: 6.0%
Min delivery rate: 88.0%
Max errors: 5
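Read as a predicate, the four thresholds above look roughly like the sketch below. The `RunMetrics` type and `passes` function are hypothetical names, the comparisons are assumed to be inclusive, and the leaderboard's Pass column may depend on further checks not listed here.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    stability: float      # stability score, 0-100
    drop_rate: float      # percent of uploads dropped
    delivery_rate: float  # percent of uploads delivered
    errors: int           # error count reported for the run


def passes(m: RunMetrics) -> bool:
    """Apply the overall pass thresholds, assuming inclusive bounds."""
    return (
        m.stability >= 65
        and m.drop_rate <= 6.0
        and m.delivery_rate >= 88.0
        and m.errors <= 5
    )
```

For example, `passes(RunMetrics(84.0, 0.1, 100.0, 0))` is true, while a run with stability 63.0 fails regardless of its delivery rate.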

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your own API key, scores the result locally, and posts to the leaderboard. Apart from the request to your model provider, nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-009-video-upload \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-009-video-upload

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	rl_v06_run2	rl_policy custom single-shot	93	84.0	100.0	100.0	✓
#2	rl_v06_run1	rl_policy custom single-shot	92	83.0	100.0	100.0	✓
#3	rl_v06_run1	rl_policy custom single-shot	92	82.0	100.0	100.0	✓
#4	rl_v06_run1	rl_policy custom single-shot	91	86.0	100.0	75.0	✗
#5	rl_v06_run2	rl_policy custom single-shot	91	81.0	100.0	100.0	✓
#6	rl_v06_run2	rl_policy custom single-shot	91	86.0	100.0	75.0	✗
#7	rl_v06_run2	rl_policy custom single-shot	90	79.0	98.0	100.0	✓
#8	rl_v06_run2	rl_policy custom single-shot	90	84.0	100.0	75.0	✗
#9	rl_v06_run2	rl_policy custom single-shot	90	83.0	100.0	75.0	✗
#10	rl_v06_run2	rl_policy custom single-shot	90	83.0	100.0	100.0	✗
#11	rl_v06_run2	rl_policy custom single-shot	90	83.0	100.0	75.0	✗
#12	rl_v06_run1	rl_policy custom single-shot	89	82.0	92.0	85.0	✗
#13	rl_v06_run2	rl_policy custom single-shot	89	82.0	92.0	85.0	✗
#14	rl_v06_run2	rl_policy custom single-shot	89	82.0	92.0	85.0	✗
#15	rl_v06_run2	rl_policy custom single-shot	88	83.0	88.0	85.0	✗
#16	rl_v06_run1	rl_policy custom single-shot	87	83.0	92.0	60.0	✗
#17	rl_v06_run1	rl_policy custom single-shot	87	81.0	87.0	85.0	✗
#18	rl_v06_run2	rl_policy custom single-shot	87	78.0	97.0	75.0	✗
#19	rl_v06_run2	rl_policy custom single-shot	87	81.0	87.0	85.0	✗
#20	rl_v06_run2	rl_policy custom single-shot	87	81.0	87.0	85.0	✗
#21	rl_v06_run2	rl_policy custom single-shot	87	81.0	87.0	85.0	✗
#22	rl_v06_run2	rl_policy custom single-shot	85	81.0	88.0	60.0	✗
#23	rl_v06_run2	rl_policy custom single-shot	85	82.0	87.0	60.0	✗
#24	rl_v06_run1	rl_policy custom single-shot	84	81.0	87.0	60.0	✗
#25	rl_v06_run2	rl_policy custom single-shot	84	85.0	88.0	60.0	✗
#26	alex	anthropic/claude-sonnet-4.6 default single-shot	83	72.0	88.0	100.0	✗
#27	alex	google/gemini-3.1-pro-preview default single-shot	83	72.0	88.0	100.0	✗
#28	rl_v06_run2	rl_policy custom single-shot	83	82.0	75.0	85.0	✗
#29	rl_v06_run2	rl_policy custom single-shot	83	88.0	75.0	85.0	✗
#30	rl_v06_run2	rl_policy custom single-shot	83	77.0	88.0	60.0	✗
#31	alex	x-ai/grok-4.20 default single-shot	82	80.0	75.0	100.0	✗
#32	rl_v06_run1	rl_policy custom single-shot	82	79.0	75.0	100.0	✗
#33	rl_v06_run2	rl_policy custom single-shot	82	80.0	75.0	85.0	✗
#34	rl_v06_run2	rl_policy custom single-shot	82	80.0	75.0	85.0	✗
#35	rl_v06_run2	rl_policy custom single-shot	82	79.0	75.0	85.0	✗
#36	rl_v06_run2	rl_policy custom single-shot	82	79.0	75.0	100.0	✗
#37	alex	openai/gpt-5.4 default single-shot	81	63.0	100.0	100.0	✗
#38	rl_v06_run1	rl_policy custom single-shot	81	84.0	67.0	100.0	✗
#39	rl_v06_run1	rl_policy custom single-shot	81	78.0	75.0	85.0	✗
#40	rl_v06_run2	rl_policy custom single-shot	81	82.0	75.0	60.0	✗
#41	rl_v06_run2	rl_policy custom single-shot	81	82.0	75.0	75.0	✗
#42	rl_v06_run2	rl_policy custom single-shot	81	78.0	75.0	85.0	✗
#43	rl_v06_run2	rl_policy custom single-shot	81	82.0	75.0	60.0	✗
#44	rl_v06_run2	rl_policy custom single-shot	81	82.0	75.0	75.0	✗
#45	rl_v06_run2	rl_policy custom single-shot	81	82.0	83.0	60.0	✗
#46	alex	google/gemini-3.1-pro-preview default reflexion	80	74.0	75.0	100.0	✗
#47	rl_v06_run2	rl_policy custom single-shot	80	76.0	75.0	85.0	✗
#48	rl_v06_run2	rl_policy custom single-shot	80	81.0	75.0	60.0	✗
#49	rl_v06_run2	rl_policy custom single-shot	80	81.0	75.0	60.0	✗
#50	rl_v06_run1	rl_policy custom single-shot	79	83.0	75.0	50.0	✗
#51	rl_v06_run1	rl_policy custom single-shot	79	84.0	75.0	50.0	✗
#52	rl_v06_run2	rl_policy custom single-shot	79	73.0	75.0	100.0	✗
#53	rl_v06_run1	rl_policy custom single-shot	78	81.0	75.0	60.0	✗
#54	rl_v06_run1	rl_policy custom single-shot	78	82.0	75.0	60.0	✗
#55	rl_v06_run2	rl_policy custom single-shot	77	73.0	75.0	60.0	✗
#56	rl_v06_run2	rl_policy custom single-shot	77	69.0	75.0	85.0	✗
#57	rl_v06_run2	rl_policy custom single-shot	76	83.0	75.0	50.0	✗
#58	rl_v06_run2	rl_policy custom single-shot	76	72.0	75.0	60.0	✗
#59	alex	x-ai/grok-4.20 default reflexion	74	61.0	75.0	100.0	✗
#60	alex	openai/gpt-5.4 default reflexion	68	64.0	54.0	100.0	✗
#61	rl_v06_run1	rl_policy custom single-shot	68	83.0	37.0	60.0	✗
#62	alex	anthropic/claude-sonnet-4.6 default reflexion	57	20.0	88.0	100.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	85.0	0.0%	112	✓
upload-burst	81.0	0.1%	510	✓
worker-failure	85.0	0.0%	24	✓
slow-transcode	84.0	0.0%	28	✓
