chini-008-search-autocomplete

Search Autocomplete

Suggest as you type. Stay snappy when one shard goes dark.

Source: Classic system-design interview corpus (Google / Amazon search suggest)

Prompt

Design a search autocomplete service.

Functional:
- GET /suggest?q=<prefix> returns up to 10 ranked suggestions in <50ms p99.
- The suggestion index is updated nightly from a batch job; queries hit the live index.
- Popular queries should be served from cache; tail queries fall through to the index.
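
The cache-then-index path in the last bullet can be sketched as a cache-aside lookup. This is a minimal illustration, not the benchmark's reference design: the names `PrefixIndex` and `SuggestService`, the LRU eviction policy, and the in-memory sorted-list index are all assumptions made for the example.

```python
from bisect import bisect_left
from collections import OrderedDict


class PrefixIndex:
    """Hypothetical index: a sorted list keeps queries sharing a prefix contiguous."""

    def __init__(self, scored_queries):
        self._scores = dict(scored_queries)  # query -> popularity score
        self._sorted = sorted(self._scores)

    def lookup(self, prefix, limit=10):
        # Binary-search to the first candidate, then scan while the prefix matches.
        i = bisect_left(self._sorted, prefix)
        matches = []
        while i < len(self._sorted) and self._sorted[i].startswith(prefix):
            matches.append(self._sorted[i])
            i += 1
        # Rank by descending score, breaking ties alphabetically.
        matches.sort(key=lambda q: (-self._scores[q], q))
        return matches[:limit]


class SuggestService:
    """Cache-aside: popular prefixes hit the LRU cache, tail prefixes reach the index."""

    def __init__(self, index, cache_size=1000):
        self.index = index
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.hits = 0
        self.misses = 0

    def suggest(self, prefix):
        if prefix in self.cache:
            self.cache.move_to_end(prefix)  # mark as most recently used
            self.hits += 1
            return self.cache[prefix]
        self.misses += 1
        result = self.index.lookup(prefix)
        self.cache[prefix] = result
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used prefix
        return result


index = PrefixIndex({"car": 5, "cart": 9, "care": 2, "dog": 7})
service = SuggestService(index, cache_size=2)
print(service.suggest("car"))  # first call misses the cache and reads the index
print(service.suggest("car"))  # second call is served from the cache
```

Because the index only changes nightly, a real deployment would probably also clear or version the cache when a new index goes live; that step is left out here.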

Non-functional:
- A 6x query spike (a celebrity event) must not collapse latency.
- If one index shard fails, queries that would have hit it should degrade to results from neighboring shards rather than 5xx.
- The nightly batch update must not interrupt live serving.
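
One common way to meet the last requirement above is to build the new index off to the side and publish it with a single reference swap, so readers always see either the old snapshot or the new one. The sketch below assumes an in-process index and CPython, where rebinding an attribute is atomic; `LiveIndex`, `build_snapshot`, and `publish` are hypothetical names, not part of Chinilla or chini-bench.

```python
import threading


def build_snapshot(scored_queries):
    """Batch step: produce an immutable snapshot ranked by descending score."""
    ranked = sorted(scored_queries, key=lambda q: (-scored_queries[q], q))
    return tuple(ranked)


class LiveIndex:
    """Serves reads from an immutable snapshot that the batch job swaps whole."""

    def __init__(self, snapshot):
        self._snapshot = snapshot
        self._publish_lock = threading.Lock()  # serializes writers only

    def lookup(self, prefix, limit=10):
        snapshot = self._snapshot  # one reference read; never a half-built index
        return [q for q in snapshot if q.startswith(prefix)][:limit]

    def publish(self, new_snapshot):
        # Readers are not blocked: they keep whichever snapshot they already hold.
        with self._publish_lock:
            self._snapshot = new_snapshot


live = LiveIndex(build_snapshot({"car": 1}))
print(live.lookup("c"))
live.publish(build_snapshot({"cat": 3, "car": 1}))  # nightly job finishes
print(live.lookup("c"))
```

The same idea applies across machines as a blue/green index: load the new version next to the old one, then flip the routing pointer.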

Return a Chinilla CanvasState. Expect a cache, an index store (storage), and replication / routing for shard failures.
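
The "replication / routing for shard failures" expectation might look like the following router, which tries the home shard first and then walks to neighboring shards instead of returning a 5xx. It is a sketch under stated assumptions: every shard here holds a full replica of the entries (a simplification of neighbor replication), routing is by first character, and `Shard`, `ShardRouter`, and `ShardUnavailable` are hypothetical names.

```python
class ShardUnavailable(Exception):
    """Raised by a shard that is offline."""


class Shard:
    def __init__(self, name, entries):
        self.name = name
        self.entries = entries  # queries already ranked, best first
        self.healthy = True

    def lookup(self, prefix, limit=10):
        if not self.healthy:
            raise ShardUnavailable(self.name)
        return [q for q in self.entries if q.startswith(prefix)][:limit]


class ShardRouter:
    def __init__(self, shards):
        self.shards = shards

    def _home(self, prefix):
        # Assumed routing rule: the first character picks the home shard.
        return (ord(prefix[0]) if prefix else 0) % len(self.shards)

    def suggest(self, prefix):
        """Return (suggestions, degraded); degraded is True if a fallback was used."""
        home = self._home(prefix)
        for offset in range(len(self.shards)):
            shard = self.shards[(home + offset) % len(self.shards)]
            try:
                return shard.lookup(prefix), offset > 0
            except ShardUnavailable:
                continue  # try the next neighbor
        return [], True  # every shard is down: empty result, still no exception


entries = ["cart", "car", "dog"]
router = ShardRouter([Shard(f"shard-{i}", entries) for i in range(3)])
router.shards[router._home("ca")].healthy = False  # simulate the outage scenario
print(router.suggest("ca"))
```

Returning the `degraded` flag lets the caller log or count fallbacks without turning them into errors.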

Constraints

Max components: 12
Required behaviors: storage, replicate
Monthly budget: $800

Stress scenarios

- Baseline queries (baseline): Steady prefix-suggest traffic.
- 6x query spike (spike): Trending event causes 6x suggest traffic.
- Index shard outage (outage): An index shard goes offline. Queries should degrade, not fail.

Pass criteria (overall)

Min stability score: 70
Max drop rate: 5.0%
Min delivery rate: 90.0%
Max errors: 5
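
To make the four thresholds above concrete, here is a small checker. How chini-bench actually combines them is not stated on this page, so the sketch assumes that all four must hold and that the bounds are inclusive. `RunSummary` and `failed_criteria` are hypothetical names, and the sample values are made up for illustration.

```python
from dataclasses import dataclass

# Thresholds copied from the pass criteria listed above.
MIN_STABILITY = 70.0
MAX_DROP_RATE_PCT = 5.0
MIN_DELIVERY_RATE_PCT = 90.0
MAX_ERRORS = 5


@dataclass
class RunSummary:
    stability: float
    drop_rate_pct: float
    delivery_rate_pct: float
    errors: int


def failed_criteria(run):
    """Return the names of the criteria the run misses (empty list means pass)."""
    failures = []
    if run.stability < MIN_STABILITY:
        failures.append("stability")
    if run.drop_rate_pct > MAX_DROP_RATE_PCT:
        failures.append("drop_rate")
    if run.delivery_rate_pct < MIN_DELIVERY_RATE_PCT:
        failures.append("delivery")
    if run.errors > MAX_ERRORS:
        failures.append("errors")
    return failures


# Illustrative values only, not taken from a real submission.
print(failed_criteria(RunSummary(82.0, 0.0, 100.0, 0)))
print(failed_criteria(RunSummary(83.0, 0.0, 67.0, 0)))
```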

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your own API key, scores the result locally, and posts to the leaderboard. Apart from the request to your model provider, nothing leaves your machine except the canvas the model produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-008-search-autocomplete \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-008-search-autocomplete

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank  Submitter    Model                                                Score  Stability  Delivery  Design  Pass
#1    rl_v06_run2  rl_policy (custom single-shot)                       91     82.0       100.0     50.0
#2    rl_v06_run2  rl_policy (custom single-shot)                       91     86.0       100.0     25.0
#3    rl_v06_run2  rl_policy (custom single-shot)                       90     79.0       100.0     25.0
#4    alex         google/gemini-3.1-pro-preview (default reflexion)    89     83.0       100.0     75.0
#5    rl_v06_run2  rl_policy (custom single-shot)                       89     83.0       100.0     50.0
#6    alex         google/gemini-3.1-pro-preview (default single-shot)  88     75.0       100.0     75.0
#7    rl_v06_run2  rl_policy (custom single-shot)                       88     82.0       91.0      50.0
#8    rl_v06_run1  rl_policy (custom single-shot)                       87     79.0       100.0     50.0
#9    rl_v06_run1  rl_policy (custom single-shot)                       87     83.0       84.0      25.0
#10   rl_v06_run1  rl_policy (custom single-shot)                       87     78.0       92.0      50.0
#11   rl_v06_run2  rl_policy (custom single-shot)                       86     82.0       83.0      60.0
#12   rl_v06_run2  rl_policy (custom single-shot)                       86     77.0       100.0     50.0
#13   rl_v06_run2  rl_policy (custom single-shot)                       86     79.0       89.0      50.0
#14   rl_v06_run2  rl_policy (custom single-shot)                       85     69.0       100.0     75.0
#15   rl_v06_run2  rl_policy (custom single-shot)                       85     81.0       82.0      50.0
#16   rl_v06_run2  rl_policy (custom single-shot)                       85     83.0       86.0      50.0
#17   alex         anthropic/claude-sonnet-4.6 (default single-shot)    84     67.0       100.0     100.0
#18   alex         openai/gpt-5.4 (default reflexion)                   84     74.0       100.0     100.0
#19   rl_v06_run1  rl_policy (custom single-shot)                       84     67.0       100.0     50.0
#20   rl_v06_run2  rl_policy (custom single-shot)                       84     68.0       100.0     50.0
#21   rl_v06_run2  rl_policy (custom single-shot)                       84     72.0       100.0     25.0
#22   rl_v06_run2  rl_policy (custom single-shot)                       83     70.0       100.0     25.0
#23   rl_v06_run2  rl_policy (custom single-shot)                       83     70.0       100.0     50.0
#24   rl_v06_run2  rl_policy (custom single-shot)                       83     78.0       89.0      50.0
#25   rl_v06_run2  rl_policy (custom single-shot)                       83     81.0       83.0      25.0
#26   rl_v06_run1  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#27   rl_v06_run1  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#28   rl_v06_run1  rl_policy (custom single-shot)                       82     79.0       75.0      25.0
#29   rl_v06_run1  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#30   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      25.0
#31   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      25.0
#32   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#33   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      25.0
#34   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#35   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#36   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#37   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      25.0
#38   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      25.0
#39   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#40   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      25.0
#41   rl_v06_run2  rl_policy (custom single-shot)                       82     83.0       67.0      50.0
#42   rl_v06_run2  rl_policy (custom single-shot)                       82     72.0       96.0      25.0
#43   rl_v06_run1  rl_policy (custom single-shot)                       81     82.0       67.0      25.0
#44   rl_v06_run2  rl_policy (custom single-shot)                       81     76.0       75.0      50.0
#45   rl_v06_run1  rl_policy (custom single-shot)                       79     83.0       67.0      50.0
#46   rl_v06_run1  rl_policy (custom single-shot)                       79     72.0       78.0      50.0
#47   rl_v06_run2  rl_policy (custom single-shot)                       79     83.0       67.0      25.0
#48   rl_v06_run2  rl_policy (custom single-shot)                       78     81.0       67.0      35.0
#49   rl_v06_run1  rl_policy (custom single-shot)                       77     83.0       50.0      50.0
#50   rl_v06_run2  rl_policy (custom single-shot)                       71     83.0       33.0      50.0
#51   rl_v06_run2  rl_policy (custom single-shot)                       70     76.0       41.0      50.0
#52   rl_v06_run2  rl_policy (custom single-shot)                       69     83.0       33.0      25.0
#53   rl_v06_run1  rl_policy (custom single-shot)                       67     79.0       33.0      60.0
#54   alex         x-ai/grok-4.20 (default reflexion)                   62     83.0       0.0       75.0
#55   rl_v06_run1  rl_policy (custom single-shot)                       62     83.0       0.0       35.0
#56   rl_v06_run2  rl_policy (custom single-shot)                       59     82.0       0.0       25.0
#57   rl_v06_run2  rl_policy (custom single-shot)                       59     77.0       0.0       35.0
#58   rl_v06_run1  rl_policy (custom single-shot)                       55     69.0       0.0       10.0
#59   alex         x-ai/grok-4.20 (default single-shot)                 54     67.0       0.0       25.0
#60   rl_v06_run1  rl_policy (custom single-shot)                       54     67.0       0.0       0.0
#61   alex         openai/gpt-5.4 (default single-shot)                 53     65.0       0.0       50.0
#62   rl_v06_run2  rl_policy (custom single-shot)                       53     70.0       0.0       35.0
#63   rl_v06_run2  rl_policy (custom single-shot)                       40     44.0       0.0       20.0
#64   alex         anthropic/claude-sonnet-4.6 (default reflexion)      27     26.0       0.0       75.0

Per-scenario breakdown of the top run

Scenario      Health  Drop rate  Delivered  Pass
baseline      86.0    0.0%       352
celeb-spike   85.0    0.0%       1728
shard-outage  76.0    0.0%       30