chini-010-notification-fanout

Notification Fanout (Push + Email + SMS)

One event, three channels. A slow SMS provider must not block push or email.

Source: Classic system-design interview corpus (cross-channel notification service)

Prompt

Design a notification service that delivers a single event across three channels: push, email, and SMS.

Functional:
- POST /notify takes an event and a recipient. The service fans out to all enabled channels.
- Each channel has a different downstream provider with different latency and reliability profiles.
- Per-channel preferences (for example, a user who has opted out of SMS) are honored.
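
As a minimal sketch of the preference check, the fan-out step can filter the channel list before enqueueing anything. The `Recipient` type, its `opted_out` field, and the `enabled_channels` helper are hypothetical names used only for illustration; the prompt does not define a data model.

```python
from dataclasses import dataclass, field

# The three channels named in the prompt, in a fixed order.
CHANNELS = ("push", "email", "sms")

@dataclass
class Recipient:
    user_id: str
    # Channels this user has opted out of (hypothetical field).
    opted_out: set[str] = field(default_factory=set)

def enabled_channels(recipient: Recipient) -> list[str]:
    """Return the channels to fan out to, skipping opted-out ones."""
    return [c for c in CHANNELS if c not in recipient.opted_out]
```

A recipient who has opted out of SMS would yield only `push` and `email`, so no SMS work is created for them in the first place.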

Non-functional:
- A 5x burst of notifications must not collapse the system.
- If the SMS provider becomes slow, push and email must still go out on time.
- A failed delivery on one channel must not block the others.

Return a Chinilla CanvasState. Expect a split into per-channel paths, a queue per channel, and a circuit breaker on flaky downstreams.
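
To illustrate why a queue per channel isolates a slow provider, here is a small asyncio sketch. It is not a CanvasState and not the benchmark's simulator; the function names, the delay values, and the queue size are assumptions chosen for the demonstration.

```python
import asyncio

async def worker(name, queue, delay, delivered):
    # Each channel drains its own queue, so a slow provider only delays itself.
    while True:
        event = await queue.get()
        await asyncio.sleep(delay)  # stand-in for the downstream provider call
        delivered.append((name, event))
        queue.task_done()

def fan_out(event, queues, channels):
    # Split: enqueue one copy per enabled channel without waiting on delivery.
    # put_nowait raises asyncio.QueueFull when a channel's queue is at capacity.
    for ch in channels:
        queues[ch].put_nowait(event)

async def demo():
    delays = {"push": 0.0, "email": 0.0, "sms": 0.2}  # SMS is the slow provider
    queues = {ch: asyncio.Queue(maxsize=100) for ch in delays}
    delivered = []
    tasks = [asyncio.create_task(worker(ch, q, delays[ch], delivered))
             for ch, q in queues.items()]
    fan_out("evt-1", queues, ["push", "email", "sms"])
    # Push and email finish without waiting for SMS.
    await asyncio.gather(queues["push"].join(), queues["email"].join())
    fast = list(delivered)  # snapshot taken while SMS is still in flight
    await queues["sms"].join()
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return fast, delivered
```

Running `asyncio.run(demo())` returns a snapshot containing only the push and email deliveries, followed by the full list once SMS completes. A single shared queue would not give this property, because a slow SMS call at the head of the line would hold back everything behind it.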

Constraints

Max components: 12
Required behaviors: split, queue, circuitbreaker
Monthly budget: $700

Stress scenarios

Baseline notifications

baseline

Steady cross-channel notification volume.

5x broadcast burst

spike

A marketing broadcast multiplies the load by five.
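
One common way to survive a burst is a bounded queue that sheds load when full rather than growing without limit. The sketch below shows that idea and how a drop rate in percent could be derived from it. The `BoundedQueue` class is hypothetical, and the benchmark's own definition of drop rate may differ.

```python
from collections import deque

class BoundedQueue:
    """Fixed-capacity FIFO that rejects new items when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.accepted = 0
        self.dropped = 0

    def offer(self, item):
        # Reject instead of blocking, so a burst cannot exhaust memory.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        self.accepted += 1
        return True

    def poll(self):
        # Return the oldest item, or None when the queue is empty.
        return self.items.popleft() if self.items else None

    def drop_rate(self):
        # Dropped items as a percentage of everything offered so far.
        total = self.accepted + self.dropped
        return 100.0 * self.dropped / total if total else 0.0
```

Sizing the capacity is the trade-off: too small and the burst is dropped, too large and queued notifications go out late.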

SMS provider slowdown

latency

The SMS provider slows down. Push and email must not be held back.
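
A circuit breaker in front of the SMS provider is one way to stop a slow downstream from tying up workers. The following is a generic sketch of the pattern, not the behavior of the Chinilla `circuitbreaker` component. The threshold, the cool-down, and the injectable `clock` are assumptions, and the caller is assumed to report a timed-out call as a failure.

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None           # None means the breaker is closed

    def allow(self):
        # Closed: always allow. Open: reject until the cool-down has elapsed,
        # then allow calls again until the next result is recorded.
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        # A success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        # Open (or re-open) once the failure count reaches the threshold.
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

An SMS worker would call `allow()` before each send and skip or defer the message while the breaker is open, which keeps the slowdown confined to the SMS path.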

Pass criteria (overall)

Min stability score: 70
Max drop rate: 5.0%
Min delivery rate: 90.0%
Max errors: 5
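
The four thresholds above can be read as a single predicate. This sketch assumes they are inclusive and must all hold at once. The page does not say whether the scorer applies them per scenario or to aggregates, so `RunResult` and `passes` are hypothetical and not the chini-bench implementation.

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    stability: float      # stability score, 0-100
    drop_rate: float      # percent
    delivery_rate: float  # percent
    errors: int

def passes(r, min_stability=70.0, max_drop_rate=5.0,
           min_delivery_rate=90.0, max_errors=5):
    # Defaults mirror the listed criteria; all four must hold (assumption).
    return (r.stability >= min_stability
            and r.drop_rate <= max_drop_rate
            and r.delivery_rate >= min_delivery_rate
            and r.errors <= max_errors)
```

Under these assumptions a run fails as soon as any one threshold is missed, even if the other three are comfortably met.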

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-010-notification-fanout \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-010-notification-fanout

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	rl_v06_run2	rl_policy custom single-shot	92	84.0	99.0	75.0	✓
#2	rl_v06_run2	rl_policy custom single-shot	92	85.0	99.0	75.0	✓
#3	rl_v06_run2	rl_policy custom single-shot	92	84.0	99.0	100.0	✓
#4	rl_v06_run1	rl_policy custom single-shot	91	83.0	97.0	75.0	✓
#5	rl_v06_run1	rl_policy custom single-shot	91	83.0	98.0	75.0	✓
#6	rl_v06_run2	rl_policy custom single-shot	91	83.0	98.0	100.0	✓
#7	rl_v06_run2	rl_policy custom single-shot	91	83.0	97.0	75.0	✓
#8	rl_v06_run1	rl_policy custom single-shot	90	82.0	97.0	75.0	✓
#9	rl_v06_run2	rl_policy custom single-shot	90	82.0	96.0	100.0	✓
#10	rl_v06_run2	rl_policy custom single-shot	90	82.0	95.0	75.0	✓
#11	rl_v06_run2	rl_policy custom single-shot	90	83.0	97.0	100.0	✗
#12	alex	google/gemini-3.1-pro-preview default reflexion	89	83.0	99.0	100.0	✗
#13	rl_v06_run1	rl_policy custom single-shot	89	80.0	95.0	75.0	✗
#14	rl_v06_run1	rl_policy custom single-shot	89	81.0	96.0	75.0	✗
#15	rl_v06_run1	rl_policy custom single-shot	89	80.0	95.0	75.0	✓
#16	rl_v06_run2	rl_policy custom single-shot	89	81.0	96.0	75.0	✓
#17	rl_v06_run2	rl_policy custom single-shot	89	81.0	96.0	75.0	✗
#18	rl_v06_run2	rl_policy custom single-shot	89	81.0	96.0	75.0	✓
#19	rl_v06_run2	rl_policy custom single-shot	89	81.0	95.0	75.0	✗
#20	rl_v06_run2	rl_policy custom single-shot	89	80.0	95.0	75.0	✗
#21	rl_v06_run2	rl_policy custom single-shot	89	81.0	96.0	75.0	✓
#22	rl_v06_run2	rl_policy custom single-shot	89	81.0	96.0	75.0	✓
#23	rl_v06_run2	rl_policy custom single-shot	89	81.0	96.0	75.0	✗
#24	rl_v06_run2	rl_policy custom single-shot	89	84.0	99.0	50.0	✗
#25	alex	anthropic/claude-sonnet-4.6 default single-shot	88	79.0	94.0	100.0	✓
#26	rl_v06_run1	rl_policy custom single-shot	88	80.0	94.0	75.0	✗
#27	rl_v06_run1	rl_policy custom single-shot	88	85.0	100.0	50.0	✗
#28	rl_v06_run2	rl_policy custom single-shot	88	82.0	97.0	50.0	✗
#29	rl_v06_run2	rl_policy custom single-shot	87	78.0	92.0	75.0	✗
#30	rl_v06_run2	rl_policy custom single-shot	87	78.0	93.0	75.0	✗
#31	rl_v06_run2	rl_policy custom single-shot	86	78.0	91.0	75.0	✗
#32	rl_v06_run2	rl_policy custom single-shot	86	85.0	79.0	75.0	✗
#33	alex	google/gemini-3.1-pro-preview default single-shot	85	73.0	94.0	75.0	✗
#34	rl_v06_run1	rl_policy custom single-shot	85	85.0	100.0	50.0	✗
#35	rl_v06_run2	rl_policy custom single-shot	85	85.0	100.0	50.0	✗
#36	rl_v06_run2	rl_policy custom single-shot	85	84.0	100.0	50.0	✗
#37	rl_v06_run1	rl_policy custom single-shot	84	67.0	100.0	75.0	✗
#38	rl_v06_run2	rl_policy custom single-shot	84	75.0	88.0	75.0	✗
#39	rl_v06_run2	rl_policy custom single-shot	84	75.0	88.0	75.0	✗
#40	rl_v06_run2	rl_policy custom single-shot	84	75.0	90.0	100.0	✗
#41	rl_v06_run2	rl_policy custom single-shot	84	78.0	91.0	100.0	✗
#42	rl_v06_run2	rl_policy custom single-shot	84	75.0	88.0	75.0	✗
#43	rl_v06_run1	rl_policy custom single-shot	83	83.0	72.0	75.0	✗
#44	rl_v06_run2	rl_policy custom single-shot	83	76.0	90.0	50.0	✗
#45	rl_v06_run2	rl_policy custom single-shot	83	74.0	88.0	100.0	✗
#46	rl_v06_run2	rl_policy custom single-shot	83	83.0	72.0	100.0	✗
#47	alex	x-ai/grok-4.20 default single-shot	82	80.0	72.0	75.0	✗
#48	rl_v06_run2	rl_policy custom single-shot	81	62.0	100.0	75.0	✗
#49	rl_v06_run1	rl_policy custom single-shot	79	68.0	84.0	60.0	✗
#50	rl_v06_run1	rl_policy custom single-shot	77	69.0	82.0	100.0	✗
#51	alex	openai/gpt-5.4 default reflexion	75	58.0	100.0	100.0	✗
#52	rl_v06_run2	rl_policy custom single-shot	75	82.0	46.0	75.0	✗
#53	rl_v06_run2	rl_policy custom single-shot	75	64.0	75.0	75.0	✗
#54	rl_v06_run1	rl_policy custom single-shot	74	81.0	45.0	60.0	✗
#55	alex	openai/gpt-5.4 default single-shot	71	57.0	74.0	100.0	✗
#56	rl_v06_run1	rl_policy custom single-shot	71	65.0	77.0	75.0	✗
#57	rl_v06_run2	rl_policy custom single-shot	71	77.0	40.0	75.0	✗
#58	rl_v06_run2	rl_policy custom single-shot	67	55.0	65.0	85.0	✗
#59	alex	x-ai/grok-4.20 default reflexion	66	55.0	61.0	100.0	✗
#60	rl_v06_run1	rl_policy custom single-shot	66	49.0	73.0	75.0	✗
#61	rl_v06_run2	rl_policy custom single-shot	54	52.0	28.0	70.0	✗
#62	alex	anthropic/claude-sonnet-4.6 default reflexion	48	4.0	100.0	100.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	85.0	0.3%	286	✓
broadcast-burst	83.0	1.1%	1252	✓
sms-slow	85.0	0.0%	60	✓
