chini-003-twitter-timeline

Social Timeline (Twitter-style fanout)

Generate a personalized timeline for millions of users. Don't melt when a celebrity posts.

Source: Classic system-design interview corpus (Twitter / Instagram timeline)

Prompt

Design the timeline-generation system for a social network.

Functional:
- POST /tweet appends a post by a user.
- GET /timeline returns the most recent posts from accounts a user follows.
- Reads vastly outnumber writes (~100:1).
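One common way to serve these two endpoints under a read-heavy load is a hybrid fanout: posts from ordinary accounts are pushed into each follower's precomputed inbox at write time, while posts from very large accounts are pulled and merged at read time. The sketch below is an in-memory illustration only, not the reference solution. `TimelineService`, `CELEBRITY_THRESHOLD`, and `INBOX_LIMIT` are hypothetical names and values chosen for the example.

```python
import heapq
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 3   # hypothetical cutoff, kept tiny for the demo
INBOX_LIMIT = 50          # hypothetical cap on precomputed entries per user


class TimelineService:
    """In-memory sketch of hybrid fanout (push for most, pull for celebrities)."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = defaultdict(list)      # author -> [(seq, author, text)], oldest first
        self.inbox = defaultdict(lambda: deque(maxlen=INBOX_LIMIT))
        self.seq = 0                        # monotonically increasing post id

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, text):
        """POST /tweet: always store; fan out only for low-fanout authors."""
        self.seq += 1
        entry = (self.seq, author, text)
        self.posts[author].append(entry)
        if not self.is_celebrity(author):
            for follower in self.followers[author]:
                self.inbox[follower].append(entry)
        return self.seq

    def timeline(self, user, limit=10):
        """GET /timeline: merge the pushed inbox with pulled celebrity posts."""
        merged = set(self.inbox[user])      # set drops duplicates if an author
        for author in self.following[user]:  # crossed the threshold after posting
            if self.is_celebrity(author):
                merged.update(self.posts[author][-limit:])
        return heapq.nlargest(limit, merged)  # newest (highest seq) first
```

With this split, a celebrity write costs one append regardless of follower count, and the extra work moves to a bounded merge on the read side.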

Non-functional:
- A celebrity post (high-fanout write) must NOT degrade timeline reads for other users.
- Survive a 10x read spike (a viral event).
- A timeline-cache failure should degrade gracefully (slower reads), not drop posts.
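The graceful-degradation requirement can be illustrated with a read path that treats the cache as optional: a cache error triggers a rebuild from durable storage instead of an error response. This is a minimal sketch under stated assumptions. `TimelineCache`, `CacheUnavailable`, and `read_timeline` are hypothetical names, and `rebuild_from_store` stands in for whatever slower query the real design uses.

```python
class CacheUnavailable(Exception):
    """Raised by the toy cache when it is marked unhealthy."""


class TimelineCache:
    """Toy cache whose outage can be simulated by flipping `healthy`."""

    def __init__(self):
        self.data = {}
        self.healthy = True

    def get(self, user):
        if not self.healthy:
            raise CacheUnavailable(user)
        return self.data.get(user)

    def put(self, user, timeline):
        if not self.healthy:
            raise CacheUnavailable(user)
        self.data[user] = timeline


def read_timeline(user, cache, rebuild_from_store):
    """Return (timeline, source). Never raises because of a cache outage."""
    try:
        cached = cache.get(user)
        if cached is not None:
            return cached, "cache"
    except CacheUnavailable:
        # Cache is down: serve the slower path, skip the write-back.
        return rebuild_from_store(user), "store-fallback"

    # Plain cache miss: rebuild, then try to repopulate the cache.
    timeline = rebuild_from_store(user)
    try:
        cache.put(user, timeline)
    except CacheUnavailable:
        pass  # losing the write-back only costs latency on the next read
    return timeline, "store"
```

Because posts live in the store and the cache only holds derived timelines, an outage slows reads but cannot drop posts.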

Return a Chinilla CanvasState. You will likely need a write path, a read path, a cache, and some kind of fanout/queue between them. Include a rate-limit behavior on the write side.
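For the rate-limit behavior on the write side, a token bucket is one widely used option: it permits short bursts up to a fixed capacity and then throttles to a steady rate. The sketch below is illustrative only and says nothing about how a CanvasState expresses rate limiting. `TokenBucket` and its parameters are hypothetical, and the clock is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Token-bucket limiter: bursts up to `capacity`, refills at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.last = now                 # timestamp of the last refill

    def allow(self, now):
        """Return True and consume one token if a write is allowed at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Usage: one write per second sustained, bursts of two.
bucket = TokenBucket(rate_per_sec=1.0, capacity=2)
decisions = [bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)]
```

Rejected writes can be dropped, retried, or queued; which of these a design picks affects the drop and delivery rates measured by the stress scenarios.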

Constraints

Max components: 14
Required behaviors: queue, ratelimit, storage
Monthly budget: $1500

Stress scenarios

- Baseline reads + writes (baseline): Mixed traffic at normal rates.
- 10x viral spike (spike): Read traffic explodes 10x. Reads must survive even if the write path slows.
- Timeline cache failure (outage): Cache layer drops out. Reads must fall back, not error.
- Noisy network (cascade): Process-time jitter at 25%. Tests resilience to flaky internal calls.

Pass criteria (overall)

Min stability score: 70
Max drop rate: 5.0%
Min delivery rate: 90.0%
Max errors: 5
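The four thresholds can be read as a simple conjunction: a run passes only if every one of them holds. The helper below is a sketch of that reading, not the scorer used by the bench. The function name and parameter names are hypothetical, and it assumes the four values have already been aggregated across scenarios, since the page does not state how that aggregation is done.

```python
# Thresholds copied from the pass criteria above.
MIN_STABILITY = 70.0
MAX_DROP_RATE_PCT = 5.0
MIN_DELIVERY_RATE_PCT = 90.0
MAX_ERRORS = 5


def failed_criteria(stability, drop_rate_pct, delivery_rate_pct, errors):
    """Return the names of the criteria a run misses (empty list means pass)."""
    failures = []
    if stability < MIN_STABILITY:
        failures.append("stability")
    if drop_rate_pct > MAX_DROP_RATE_PCT:
        failures.append("drop_rate")
    if delivery_rate_pct < MIN_DELIVERY_RATE_PCT:
        failures.append("delivery_rate")
    if errors > MAX_ERRORS:
        failures.append("errors")
    return failures
```

Returning the list of misses rather than a single boolean makes it easier to see which part of a design needs work.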

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:
pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-003-twitter-timeline \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice
Or inspect the prompt first:
chini-bench prompt chini-003-twitter-timeline
Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank | Submitter | Model (mode) | Score | Stability | Delivery | Design
#1 | rl_v06_run2 | rl_policy (custom single-shot) | 89 | 77.0 | 100.0 | 75.0
#2 | rl_v06_run2 | rl_policy (custom single-shot) | 87 | 74.0 | 100.0 | 75.0
#3 | rl_v06_run1 | rl_policy (custom single-shot) | 86 | 72.0 | 100.0 | 75.0
#4 | rl_v06_run2 | rl_policy (custom single-shot) | 86 | 81.0 | 84.0 | 75.0
#5 | rl_v06_run2 | rl_policy (custom single-shot) | 86 | 81.0 | 100.0 | 25.0
#6 | rl_v06_run2 | rl_policy (custom single-shot) | 85 | 69.0 | 100.0 | 75.0
#7 | rl_v06_run2 | rl_policy (custom single-shot) | 85 | 75.0 | 100.0 | 50.0
#8 | rl_v06_run2 | rl_policy (custom single-shot) | 85 | 83.0 | 85.0 | 75.0
#9 | rl_v06_run2 | rl_policy (custom single-shot) | 85 | 80.0 | 100.0 | 25.0
#10 | rl_v06_run2 | rl_policy (custom single-shot) | 85 | 80.0 | 83.0 | 75.0
#11 | rl_v06_run2 | rl_policy (custom single-shot) | 84 | 79.0 | 81.0 | 75.0
#12 | rl_v06_run2 | rl_policy (custom single-shot) | 84 | 80.0 | 88.0 | 75.0
#13 | rl_v06_run1 | rl_policy (custom single-shot) | 82 | 80.0 | 90.0 | 25.0
#14 | rl_v06_run2 | rl_policy (custom single-shot) | 82 | 74.0 | 92.0 | 50.0
#15 | rl_v06_run1 | rl_policy (custom single-shot) | 81 | 80.0 | 88.0 | 25.0
#16 | rl_v06_run2 | rl_policy (custom single-shot) | 81 | 66.0 | 100.0 | 50.0
#17 | rl_v06_run2 | rl_policy (custom single-shot) | 81 | 66.0 | 100.0 | 50.0
#18 | rl_v06_run2 | rl_policy (custom single-shot) | 81 | 81.0 | 84.0 | 25.0
#19 | rl_v06_run2 | rl_policy (custom single-shot) | 81 | 71.0 | 100.0 | 25.0
#20 | rl_v06_run2 | rl_policy (custom single-shot) | 81 | 77.0 | 75.0 | 75.0
#21 | rl_v06_run2 | rl_policy (custom single-shot) | 81 | 80.0 | 70.0 | 75.0
#22 | alex | x-ai/grok-4.20 (default reflexion) | 80 | 79.0 | 68.0 | 100.0
#23 | rl_v06_run2 | rl_policy (custom single-shot) | 80 | 78.0 | 69.0 | 75.0
#24 | rl_v06_run2 | rl_policy (custom single-shot) | 80 | 79.0 | 68.0 | 75.0
#25 | rl_v06_run2 | rl_policy (custom single-shot) | 80 | 80.0 | 84.0 | 25.0
#26 | rl_v06_run2 | rl_policy (custom single-shot) | 80 | 79.0 | 68.0 | 75.0
#27 | rl_v06_run2 | rl_policy (custom single-shot) | 80 | 79.0 | 68.0 | 75.0
#28 | alex | google/gemini-3.1-pro-preview (default single-shot) | 79 | 57.0 | 100.0 | 100.0
#29 | rl_v06_run1 | rl_policy (custom single-shot) | 79 | 79.0 | 83.0 | 35.0
#30 | rl_v06_run1 | rl_policy (custom single-shot) | 79 | 78.0 | 68.0 | 75.0
#31 | rl_v06_run1 | rl_policy (custom single-shot) | 79 | 78.0 | 68.0 | 75.0
#32 | rl_v06_run2 | rl_policy (custom single-shot) | 79 | 78.0 | 68.0 | 75.0
#33 | rl_v06_run2 | rl_policy (custom single-shot) | 79 | 67.0 | 100.0 | 25.0
#34 | rl_v06_run2 | rl_policy (custom single-shot) | 79 | 78.0 | 68.0 | 75.0
#35 | rl_v06_run2 | rl_policy (custom single-shot) | 79 | 77.0 | 75.0 | 75.0
#36 | alex | openai/gpt-5.4 (default single-shot) | 78 | 60.0 | 100.0 | 100.0
#37 | rl_v06_run1 | rl_policy (custom single-shot) | 78 | 76.0 | 65.0 | 75.0
#38 | rl_v06_run2 | rl_policy (custom single-shot) | 78 | 76.0 | 65.0 | 75.0
#39 | rl_v06_run2 | rl_policy (custom single-shot) | 78 | 76.0 | 65.0 | 75.0
#40 | rl_v06_run2 | rl_policy (custom single-shot) | 78 | 77.0 | 65.0 | 75.0
#41 | rl_v06_run1 | rl_policy (custom single-shot) | 77 | 78.0 | 68.0 | 50.0
#42 | rl_v06_run1 | rl_policy (custom single-shot) | 77 | 79.0 | 82.0 | 25.0
#43 | rl_v06_run1 | rl_policy (custom single-shot) | 77 | 64.0 | 100.0 | 25.0
#44 | rl_v06_run2 | rl_policy (custom single-shot) | 77 | 78.0 | 68.0 | 50.0
#45 | rl_v06_run2 | rl_policy (custom single-shot) | 77 | 81.0 | 64.0 | 50.0
#46 | alex | anthropic/claude-sonnet-4.6 (default single-shot) | 76 | 52.0 | 100.0 | 100.0
#47 | alex | google/gemini-3.1-pro-preview (default reflexion) | 76 | 51.0 | 100.0 | 100.0
#48 | rl_v06_run1 | rl_policy (custom single-shot) | 76 | 77.0 | 65.0 | 50.0
#49 | rl_v06_run2 | rl_policy (custom single-shot) | 74 | 75.0 | 64.0 | 75.0
#50 | rl_v06_run2 | rl_policy (custom single-shot) | 74 | 78.0 | 68.0 | 25.0
#51 | rl_v06_run1 | rl_policy (custom single-shot) | 73 | 62.0 | 83.0 | 50.0
#52 | rl_v06_run1 | rl_policy (custom single-shot) | 73 | 77.0 | 66.0 | 25.0
#53 | rl_v06_run1 | rl_policy (custom single-shot) | 73 | 77.0 | 66.0 | 25.0
#54 | rl_v06_run2 | rl_policy (custom single-shot) | 73 | 76.0 | 65.0 | 25.0
#55 | rl_v06_run2 | rl_policy (custom single-shot) | 73 | 77.0 | 65.0 | 25.0
#56 | rl_v06_run2 | rl_policy (custom single-shot) | 73 | 77.0 | 66.0 | 25.0
#57 | alex | anthropic/claude-sonnet-4.6 (default reflexion) | 68 | 44.0 | 100.0 | 100.0
#58 | rl_v06_run2 | rl_policy (custom single-shot) | 68 | 80.0 | 34.0 | 75.0
#59 | rl_v06_run1 | rl_policy (custom single-shot) | 67 | 62.0 | 54.0 | 75.0
#60 | rl_v06_run1 | rl_policy (custom single-shot) | 64 | 62.0 | 53.0 | 50.0
#61 | alex | x-ai/grok-4.20 (default single-shot) | 42 | 16.0 | 47.0 | 100.0
#62 | alex | openai/gpt-5.4 (default reflexion) | 24 | 9.0 | 0.0 | 75.0

Per-scenario breakdown of the top run

Scenario | Health | Drop rate | Delivered
baseline | 81.0 | 0.0% | 280
viral-spike | 82.0 | 0.0% | 2520
cache-failure | 72.0 | 0.0% | 45
fanout-jitter | 73.0 | 5.3% | 71