chini-005-chat-fanout

Group Chat Fanout (WhatsApp-style)

Deliver messages to large group chats in order. No drops, no duplicates.

Source: Classic system-design interview corpus (WhatsApp / Slack messaging)

Prompt

Design the message-delivery backbone for a group-chat product.

Functional:
- POST /message sends a message to a group of N users.
- Every recipient must receive every message exactly once, in send order.
- Some users are offline and must receive messages when they reconnect.
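
The three functional requirements can be sketched in memory as follows. This is a minimal illustration, not a Chinilla CanvasState and not the benchmark's reference design. The class `GroupFanout` and all of its fields are hypothetical names, and the sketch assumes a single process that assigns one sequence number per group message.

```python
from collections import defaultdict


class GroupFanout:
    """Toy fanout: per-group sequence numbers plus per-recipient dedup."""

    def __init__(self, members, online):
        self.members = list(members)               # recipients in the group
        self.online = set(online)                  # currently connected users
        self.next_seq = 0                          # next sequence number to assign
        self.last_seen = defaultdict(lambda: -1)   # highest seq delivered per user
        self.inbox = defaultdict(list)             # pending (seq, text) per offline user
        self.delivered = defaultdict(list)         # what each user actually received

    def send(self, text):
        """Assign the next sequence number and fan out to every member."""
        seq = self.next_seq
        self.next_seq += 1
        for user in self.members:
            if user in self.online:
                self._deliver(user, seq, text)
            else:
                # Offline users get the message parked until they reconnect.
                self.inbox[user].append((seq, text))
        return seq

    def _deliver(self, user, seq, text):
        """Deliver only if seq is new for this user, so a redelivery is a no-op."""
        if seq <= self.last_seen[user]:
            return False
        self.last_seen[user] = seq
        self.delivered[user].append(text)
        return True

    def reconnect(self, user):
        """Mark the user online and drain the parked messages in send order."""
        self.online.add(user)
        for seq, text in sorted(self.inbox.pop(user, [])):
            self._deliver(user, seq, text)
```

Usage: with members `a`, `b`, `c` and only `a` and `b` online, two calls to `send` reach `a` and `b` immediately, and `c` receives both in order after `reconnect("c")`. Replaying an already delivered sequence number through `_deliver` returns `False` and changes nothing.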

Non-functional:
- A burst of group messages (5x normal) must not delay 1:1 messages.
- Loss of the offline-storage component must not lose in-flight messages.
- The system must enforce a per-sender rate limit to prevent abuse.

Return a Chinilla CanvasState. You'll need a queue, storage for offline users, fanout to recipients, and a rate-limit on the inbound side.
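
For the inbound rate limit, one common choice is a token bucket per sender. The sketch below is an assumption about how such a limiter could work, not how the Chinilla ratelimit behavior is implemented. `TokenBucket`, `PerSenderLimiter`, `rate`, and `burst` are hypothetical names, and time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Refills at `rate` tokens per second, capped at `burst` tokens."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # start full so a new sender can burst
        self.updated = now           # timestamp of the last refill

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerSenderLimiter:
    """Keeps one independent bucket per sender id."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow(self, sender, now):
        """Return True if this sender may send a message at time `now`."""
        bucket = self.buckets.get(sender)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self.buckets[sender] = bucket
        return bucket.allow(now)
```

Usage: with `PerSenderLimiter(rate=1.0, burst=2)`, a sender's first two messages at the same instant are accepted and the third is rejected. A different sender is unaffected, because each sender has its own bucket.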

Constraints

Max components: 13
Required behaviors: queue, ratelimit, storage
Monthly budget: $900

Stress scenarios

Baseline messages

baseline

Normal mix of 1:1 and group chats.

5x group burst

spike

Group-message volume spikes 5x.

Offline storage outage

outage

The storage component for offline users is down.

Noisy network

cascade

Process-time jitter of 30% with an ambient drop rate of 5%. Tests resilience.
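
A 5% ambient drop is higher than the 3.0% maximum drop rate in the pass criteria, so a design that sends each message once would likely miss that threshold in this scenario. The arithmetic below shows how retries change the residual loss. It assumes each attempt is dropped independently with the same probability and that a message counts as dropped only when every attempt fails. Both are assumptions about the simulator, not documented behavior, and the function names are hypothetical.

```python
def residual_drop_rate(drop_prob, retries):
    """Probability that the first attempt and every retry are all dropped."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if retries < 0:
        raise ValueError("retries must be non-negative")
    return drop_prob ** (retries + 1)


def retries_needed(drop_prob, max_drop):
    """Smallest retry count whose residual drop rate is at most max_drop."""
    if max_drop <= 0.0:
        raise ValueError("max_drop must be positive")
    retries = 0
    while residual_drop_rate(drop_prob, retries) > max_drop:
        retries += 1
    return retries
```

Under these assumptions, `retries_needed(0.05, 0.03)` returns 1: a single retry takes the residual loss from 5% to 0.25%.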

Pass criteria (overall)

Min stability score: 72
Max drop rate: 3.0%
Min delivery rate: 93.0%
Max errors: 4
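
Read as plain threshold comparisons, these criteria could be checked as in the sketch below. The `RunResult` type, its field names, and the inclusive comparisons are assumptions made for illustration. The actual scorer in chini-bench may aggregate scenarios or round values differently.

```python
from dataclasses import dataclass

# Thresholds copied from the pass criteria listed above.
MIN_STABILITY = 72.0
MAX_DROP_RATE_PCT = 3.0
MIN_DELIVERY_RATE_PCT = 93.0
MAX_ERRORS = 4


@dataclass
class RunResult:
    """Hypothetical summary of one run; not the CLI's real output schema."""
    stability: float
    drop_rate_pct: float
    delivery_rate_pct: float
    errors: int


def failed_criteria(result):
    """Return the names of the criteria the run does not meet."""
    failures = []
    if result.stability < MIN_STABILITY:
        failures.append("stability")
    if result.drop_rate_pct > MAX_DROP_RATE_PCT:
        failures.append("drop_rate")
    if result.delivery_rate_pct < MIN_DELIVERY_RATE_PCT:
        failures.append("delivery_rate")
    if result.errors > MAX_ERRORS:
        failures.append("errors")
    return failures
```

Usage, with illustrative values that are not taken from a real run: `failed_criteria(RunResult(80.0, 4.0, 95.0, 1))` returns `["drop_rate"]`, because a 4.0% drop rate exceeds the 3.0% maximum even though the other three thresholds are met.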

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-005-chat-fanout \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-005-chat-fanout

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	rl_v06_run2	rl_policy custom single-shot	92	82.0	100.0	75.0	✗
#2	rl_v06_run1	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#3	rl_v06_run1	rl_policy custom single-shot	91	79.0	100.0	75.0	✗
#4	rl_v06_run1	rl_policy custom single-shot	91	79.0	100.0	75.0	✗
#5	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#6	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#7	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#8	rl_v06_run2	rl_policy custom single-shot	91	81.0	100.0	75.0	✗
#9	rl_v06_run2	rl_policy custom single-shot	91	81.0	100.0	75.0	✗
#10	rl_v06_run2	rl_policy custom single-shot	91	81.0	100.0	75.0	✗
#11	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#12	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#13	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#14	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#15	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#16	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#17	rl_v06_run2	rl_policy custom single-shot	91	80.0	100.0	75.0	✗
#18	rl_v06_run2	rl_policy custom single-shot	91	79.0	100.0	75.0	✗
#19	rl_v06_run2	rl_policy custom single-shot	90	77.0	100.0	75.0	✗
#20	rl_v06_run2	rl_policy custom single-shot	90	77.0	100.0	75.0	✗
#21	alex	google/gemini-3.1-pro-preview default reflexion	89	76.0	100.0	100.0	✗
#22	rl_v06_run1	rl_policy custom single-shot	89	75.0	100.0	75.0	✗
#23	rl_v06_run1	rl_policy custom single-shot	89	81.0	100.0	50.0	✗
#24	rl_v06_run1	rl_policy custom single-shot	89	81.0	100.0	50.0	✗
#25	rl_v06_run2	rl_policy custom single-shot	89	81.0	100.0	50.0	✗
#26	rl_v06_run2	rl_policy custom single-shot	89	81.0	100.0	50.0	✗
#27	rl_v06_run2	rl_policy custom single-shot	89	81.0	100.0	75.0	✗
#28	rl_v06_run2	rl_policy custom single-shot	89	80.0	100.0	75.0	✗
#29	rl_v06_run1	rl_policy custom single-shot	88	74.0	100.0	75.0	✗
#30	rl_v06_run1	rl_policy custom single-shot	88	80.0	92.0	75.0	✗
#31	rl_v06_run2	rl_policy custom single-shot	88	74.0	100.0	75.0	✗
#32	rl_v06_run1	rl_policy custom single-shot	87	82.0	100.0	25.0	✗
#33	rl_v06_run1	rl_policy custom single-shot	87	77.0	92.0	75.0	✗
#34	rl_v06_run1	rl_policy custom single-shot	86	81.0	100.0	25.0	✗
#35	rl_v06_run1	rl_policy custom single-shot	86	81.0	100.0	25.0	✗
#36	rl_v06_run1	rl_policy custom single-shot	86	81.0	84.0	75.0	✗
#37	rl_v06_run1	rl_policy custom single-shot	86	68.0	100.0	75.0	✗
#38	rl_v06_run1	rl_policy custom single-shot	86	81.0	84.0	75.0	✗
#39	rl_v06_run2	rl_policy custom single-shot	86	81.0	100.0	25.0	✗
#40	rl_v06_run2	rl_policy custom single-shot	86	81.0	92.0	50.0	✗
#41	rl_v06_run2	rl_policy custom single-shot	86	79.0	100.0	25.0	✗
#42	rl_v06_run2	rl_policy custom single-shot	86	75.0	100.0	50.0	✗
#43	rl_v06_run2	rl_policy custom single-shot	86	79.0	100.0	50.0	✗
#44	rl_v06_run2	rl_policy custom single-shot	86	82.0	91.0	75.0	✗
#45	rl_v06_run1	rl_policy custom single-shot	85	77.0	100.0	25.0	✗
#46	rl_v06_run1	rl_policy custom single-shot	85	77.0	100.0	25.0	✗
#47	rl_v06_run2	rl_policy custom single-shot	85	80.0	83.0	85.0	✗
#48	rl_v06_run1	rl_policy custom single-shot	84	81.0	100.0	25.0	✗
#49	rl_v06_run2	rl_policy custom single-shot	84	64.0	100.0	75.0	✗
#50	rl_v06_run2	rl_policy custom single-shot	83	75.0	83.0	75.0	✗
#51	rl_v06_run2	rl_policy custom single-shot	83	81.0	84.0	75.0	✗
#52	rl_v06_run2	rl_policy custom single-shot	83	81.0	84.0	75.0	✗
#53	rl_v06_run2	rl_policy custom single-shot	82	79.0	82.0	50.0	✗
#54	rl_v06_run2	rl_policy custom single-shot	81	78.0	80.0	50.0	✗
#55	alex	openai/gpt-5.4 default reflexion	80	66.0	88.0	100.0	✗
#56	rl_v06_run2	rl_policy custom single-shot	80	80.0	69.0	75.0	✗
#57	rl_v06_run1	rl_policy custom single-shot	79	79.0	82.0	35.0	✗
#58	rl_v06_run1	rl_policy custom single-shot	78	78.0	81.0	50.0	✗
#59	alex	google/gemini-3.1-pro-preview default single-shot	76	65.0	75.0	100.0	✗
#60	rl_v06_run2	rl_policy custom single-shot	74	79.0	68.0	50.0	✗
#61	alex	x-ai/grok-4.20 default single-shot	73	48.0	91.0	100.0	✗
#62	alex	x-ai/grok-4.20 default reflexion	72	57.0	75.0	100.0	✗
#63	alex	anthropic/claude-sonnet-4.6 default single-shot	71	69.0	57.0	100.0	✗
#64	alex	anthropic/claude-sonnet-4.6 default reflexion	58	11.0	100.0	100.0	✗
#65	alex	openai/gpt-5.4 default single-shot	45	40.0	20.0	100.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	86.0	0.0%	216	✓
group-burst	85.0	0.0%	1080	✓
offline-storage-down	77.0	0.0%	30	✓
noisy-network	78.0	4.3%	54	✗
