chini-002-checkout

E-commerce Checkout with Idempotent Payments

Process checkouts without ever charging a customer twice. Survive a downstream payment-API outage.

Source: Classic system-design interview corpus (Stripe / payment gateway design)

Prompt

Design a checkout system that takes a cart and processes payment.

Functional:
- POST /checkout accepts an order, returns an order id.
- Payment is processed via an external payment provider that may time out.
- Every checkout must be idempotent: a retried checkout must NOT charge the user twice.
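
One common way to meet the idempotency requirement is to have the client send an idempotency key with each checkout and to store the resulting order under that key. The sketch below is a minimal in-memory illustration, not part of the benchmark: `CheckoutService`, `charge`, and the key format are hypothetical names, and it assumes the payment provider also deduplicates on the forwarded key.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class CheckoutService:
    # Hypothetical provider call: charge(idempotency_key, amount_cents).
    charge: Callable[[str, int], None]
    # idempotency_key -> order_id. In-memory here; a real system would
    # need a durable store with a uniqueness guarantee on the key.
    orders: Dict[str, str] = field(default_factory=dict)

    def checkout(self, idempotency_key: str, amount_cents: int) -> str:
        existing = self.orders.get(idempotency_key)
        if existing is not None:
            # Retried request: return the original order, charge nothing.
            return existing
        # Forward the same key so that a provider supporting idempotency
        # keys can also dedupe if an earlier attempt timed out mid-flight.
        self.charge(idempotency_key, amount_cents)
        order_id = str(uuid.uuid4())
        self.orders[idempotency_key] = order_id
        return order_id
```

Calling `checkout` twice with the same key returns the same order id and invokes `charge` once. If `charge` raises (for example on a timeout), no order is recorded, so the retry reaches the provider again with the same key; that path is only safe under the stated assumption that the provider dedupes on it.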

Non-functional:
- Survive a 5x peak (e.g., flash sale) without dropping more than 5% of orders.
- Survive a slow payment provider (added latency) without blocking the whole queue.
- Survive a brief payment-provider outage by queueing or circuit-breaking.
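
To make the peak-load requirement concrete, the sketch below shows a bounded buffer in front of payment processing: orders that fit are accepted for later processing, and orders that do not fit are dropped and counted. It uses only the standard library `queue` module; the `admit` helper and the capacity of 95 are illustrative assumptions, not values from the benchmark.

```python
import queue
from typing import Iterable, Tuple


def admit(orders: Iterable[object], buffer: queue.Queue) -> Tuple[int, int]:
    """Try to enqueue each order without blocking; return (accepted, dropped)."""
    accepted = dropped = 0
    for order in orders:
        try:
            buffer.put_nowait(order)
            accepted += 1
        except queue.Full:
            # The buffer is at capacity: shed this order instead of blocking.
            dropped += 1
    return accepted, dropped


# A burst of 100 orders against a buffer of 95 with no consumer running.
buffer = queue.Queue(maxsize=95)
accepted, dropped = admit(range(100), buffer)
drop_rate_pct = 100.0 * dropped / (accepted + dropped)
```

With no consumer draining the buffer, 5 of the 100 orders are dropped, a drop rate of 5.0%. In a real design the buffer is drained concurrently by payment workers, so the capacity needed depends on how far the arrival rate exceeds the processing rate during the spike.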

Return a Chinilla CanvasState. Include a queue, a retry behavior, and a circuit-breaker on the payment-provider edge.
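
The retry behavior named in the prompt can be illustrated with exponential backoff. This is a generic sketch of the behavior, not the CanvasState format (which this page does not specify); `retry_with_backoff` and its parameters are hypothetical, and retrying is only safe here because the checkout is required to be idempotent.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TimeoutError with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the timeout to the caller.
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the delays can be observed in tests without waiting. Production retry policies usually also add random jitter so that many clients do not retry in lockstep.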

Constraints

- Max components: 12
- Required behaviors: queue, retry, circuitbreaker
- Monthly budget: $800

Stress scenarios

- Baseline orders (baseline): Steady checkout traffic.
- 5x flash sale (spike): Order volume spikes 5x for the duration.
- Slow payment provider (latency): Payment provider response time grows by 1500ms.
- Payment provider outage (outage): Payment provider is down. System must queue or fail safely (no double-charge).
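
For the outage scenario, "fail safely" is commonly implemented with a circuit breaker on the provider edge: after a run of failures, calls are rejected immediately instead of piling up on a dead dependency. The sketch below is a minimal single-threaded version; the class name, the threshold of 3, and the 30-second reset window are assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the provider."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("provider circuit is open")
            # Half-open: let one trial call through; a single failure reopens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        return result
```

While the circuit is open, a caller can leave the order in a queue for later processing rather than charging, which is one way to read "queue or fail safely".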

Pass criteria (overall)

- Min stability score: 75
- Max drop rate: 3.0%
- Min delivery rate: 95.0%
- Max errors: 3
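
The four thresholds above can be read as a single pass/fail predicate. The sketch below is one plausible reading, not the scorer's actual code: `RunSummary` and `passes` are hypothetical names, and it assumes the bounds are inclusive and apply to run-level aggregates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSummary:
    stability: float          # stability score for the run
    drop_rate_pct: float      # percentage of orders dropped
    delivery_rate_pct: float  # percentage of orders delivered
    errors: int               # number of errors reported


def passes(
    run: RunSummary,
    min_stability: float = 75.0,
    max_drop_rate_pct: float = 3.0,
    min_delivery_rate_pct: float = 95.0,
    max_errors: int = 3,
) -> bool:
    """Return True only if every threshold is met (bounds assumed inclusive)."""
    return (
        run.stability >= min_stability
        and run.drop_rate_pct <= max_drop_rate_pct
        and run.delivery_rate_pct >= min_delivery_rate_pct
        and run.errors <= max_errors
    )
```

For example, `passes(RunSummary(87.0, 0.0, 100.0, 0))` is true under these assumptions, while a run with a stability score of 69.0 fails regardless of its delivery rate. The error count of 0 in that example is an illustrative value.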

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-002-checkout \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-002-checkout

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank | Submitter | Model | Score | Stability | Delivery | Design | Pass
#1 | alex | anthropic/claude-sonnet-4.6 (default single-shot) | 94 | 87.0 | 100.0 | 100.0 |
#2 | alex | openai/gpt-5.4 (default single-shot) | 89 | 75.0 | 100.0 | 100.0 |
#3 | alex | x-ai/grok-4.20 (default single-shot) | 86 | 69.0 | 100.0 | 100.0 |
#4 | alex | google/gemini-3.1-pro-preview (default reflexion) | 84 | 65.0 | 100.0 | 100.0 |
#5 | rl_v06_run1 | rl_policy (custom single-shot) | 82 | 61.0 | 100.0 | 100.0 |
#6 | rl_v06_run2 | rl_policy (custom single-shot) | 82 | 70.0 | 88.0 | 75.0 |
#7 | alex | google/gemini-3.1-pro-preview (default single-shot) | 80 | 56.0 | 100.0 | 100.0 |
#8 | rl_v06_run2 | rl_policy (custom single-shot) | 79 | 54.0 | 100.0 | 75.0 |
#9 | rl_v06_run2 | rl_policy (custom single-shot) | 79 | 72.0 | 77.0 | 75.0 |
#10 | rl_v06_run2 | rl_policy (custom single-shot) | 78 | 71.0 | 73.0 | 100.0 |
#11 | rl_v06_run2 | rl_policy (custom single-shot) | 77 | 75.0 | 76.0 | 75.0 |
#12 | alex | anthropic/claude-sonnet-4.6 (default reflexion) | 76 | 61.0 | 100.0 | 100.0 |
#13 | rl_v06_run1 | rl_policy (custom single-shot) | 75 | 82.0 | 83.0 | 25.0 |
#14 | rl_v06_run2 | rl_policy (custom single-shot) | 75 | 81.0 | 75.0 | 60.0 |
#15 | rl_v06_run2 | rl_policy (custom single-shot) | 74 | 70.0 | 63.0 | 100.0 |
#16 | rl_v06_run1 | rl_policy (custom single-shot) | 73 | 74.0 | 66.0 | 75.0 |
#17 | rl_smoke8 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#18 | rl_v06_run1 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#19 | rl_v06_run1 | rl_policy (custom single-shot) | 72 | 83.0 | 75.0 | 25.0 |
#20 | rl_v06_run1 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#21 | rl_v06_run1 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#22 | rl_v06_run1 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#23 | rl_v06_run1 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#24 | rl_v06_run2 | rl_policy (custom single-shot) | 72 | 82.0 | 75.0 | 25.0 |
#25 | rl_v06_run2 | rl_policy (custom single-shot) | 72 | 83.0 | 75.0 | 25.0 |
#26 | rl_smoke8 | rl_policy (custom single-shot) | 71 | 81.0 | 75.0 | 25.0 |
#27 | rl_v06_run1 | rl_policy (custom single-shot) | 71 | 81.0 | 75.0 | 25.0 |
#28 | rl_v06_run1 | rl_policy (custom single-shot) | 71 | 81.0 | 75.0 | 25.0 |
#29 | rl_smoke8 | rl_policy (custom single-shot) | 70 | 66.0 | 58.0 | 100.0 |
#30 | rl_v06_run2 | rl_policy (custom single-shot) | 70 | 58.0 | 67.0 | 100.0 |
#31 | rl_v06_run2 | rl_policy (custom single-shot) | 69 | 65.0 | 55.0 | 60.0 |
#32 | rl_v06_run2 | rl_policy (custom single-shot) | 69 | 69.0 | 61.0 | 50.0 |
#33 | rl_v06_run2 | rl_policy (custom single-shot) | 69 | 66.0 | 56.0 | 75.0 |
#34 | rl_v06_run2 | rl_policy (custom single-shot) | 68 | 64.0 | 56.0 | 75.0 |
#35 | rl_v06_run1 | rl_policy (custom single-shot) | 67 | 64.0 | 53.0 | 75.0 |
#36 | rl_v06_run1 | rl_policy (custom single-shot) | 67 | 63.0 | 52.0 | 100.0 |
#37 | rl_v06_run2 | rl_policy (custom single-shot) | 67 | 56.0 | 63.0 | 75.0 |
#38 | alex | x-ai/grok-4.20 (default reflexion) | 66 | 63.0 | 50.0 | 100.0 |
#39 | rl_v06_run1 | rl_policy (custom single-shot) | 66 | 62.0 | 51.0 | 100.0 |
#40 | rl_v06_run2 | rl_policy (custom single-shot) | 66 | 62.0 | 52.0 | 100.0 |
#41 | rl_v06_run2 | rl_policy (custom single-shot) | 66 | 62.0 | 51.0 | 100.0 |
#42 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 60.0 | 51.0 | 100.0 |
#43 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 62.0 | 50.0 | 75.0 |
#44 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 60.0 | 51.0 | 100.0 |
#45 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 61.0 | 49.0 | 75.0 |
#46 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 61.0 | 49.0 | 75.0 |
#47 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 60.0 | 51.0 | 75.0 |
#48 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 60.0 | 51.0 | 75.0 |
#49 | rl_v06_run2 | rl_policy (custom single-shot) | 65 | 60.0 | 51.0 | 75.0 |
#50 | rl_v06_run2 | rl_policy (custom single-shot) | 64 | 58.0 | 51.0 | 100.0 |
#51 | rl_v06_run2 | rl_policy (custom single-shot) | 63 | 60.0 | 47.0 | 85.0 |
#52 | rl_v06_run2 | rl_policy (custom single-shot) | 63 | 59.0 | 46.0 | 100.0 |
#53 | rl_v06_run1 | rl_policy (custom single-shot) | 62 | 62.0 | 51.0 | 50.0 |
#54 | rl_v06_run1 | rl_policy (custom single-shot) | 62 | 72.0 | 37.0 | 85.0 |
#55 | rl_v06_run2 | rl_policy (custom single-shot) | 62 | 62.0 | 51.0 | 60.0 |
#56 | rl_v06_run2 | rl_policy (custom single-shot) | 61 | 60.0 | 51.0 | 75.0 |
#57 | rl_v06_run1 | rl_policy (custom single-shot) | 60 | 54.0 | 45.0 | 100.0 |
#58 | rl_v06_run2 | rl_policy (custom single-shot) | 60 | 54.0 | 55.0 | 75.0 |
#59 | rl_v06_run2 | rl_policy (custom single-shot) | 59 | 55.0 | 42.0 | 100.0 |
#60 | rl_v06_run2 | rl_policy (custom single-shot) | 59 | 54.0 | 41.0 | 75.0 |
#61 | rl_v06_run2 | rl_policy (custom single-shot) | 59 | 55.0 | 40.0 | 85.0 |
#62 | rl_v06_run2 | rl_policy (custom single-shot) | 59 | 59.0 | 46.0 | 75.0 |
#63 | rl_v06_run1 | rl_policy (custom single-shot) | 58 | 52.0 | 41.0 | 75.0 |
#64 | rl_v06_run2 | rl_policy (custom single-shot) | 58 | 58.0 | 44.0 | 60.0 |
#65 | rl_v06_run2 | rl_policy (custom single-shot) | 57 | 52.0 | 40.0 | 100.0 |
#66 | rl_v06_run2 | rl_policy (custom single-shot) | 56 | 51.0 | 36.0 | 75.0 |
#67 | rl_v06_run2 | rl_policy (custom single-shot) | 55 | 53.0 | 42.0 | 50.0 |
#68 | rl_v06_run1 | rl_policy (custom single-shot) | 54 | 54.0 | 39.0 | 50.0 |
#69 | rl_v06_run1 | rl_policy (custom single-shot) | 54 | 50.0 | 43.0 | 75.0 |
#70 | rl_v06_run2 | rl_policy (custom single-shot) | 53 | 37.0 | 47.0 | 75.0 |
#71 | alex | openai/gpt-5.4 (default reflexion) | 36 | 38.0 | 0.0 | 75.0 |

Per-scenario breakdown of the top run

Scenario | Health | Drop rate | Delivered | Pass
baseline | 87.0 | 0.0% | 60 |
flash-sale | 85.0 | 0.0% | 300 |
payment-slow | 87.0 | 0.0% | 15 |
payment-outage | 88.0 | 0.0% | 12 |