chini-028-credential-stuffing

Credential Stuffing Defense

100k stolen credentials replayed against your login. Block the attack without locking out 50k real users.

Source: Application security, OWASP authentication threat model

Prompt

Design the authentication path for a consumer web service under credential-stuffing attack.

Functional:
- Users submit email + password to /login. Backend checks against hashed store, returns session token on success.
- Attacker replays a leaked credential dump (email/password pairs) at high volume from a botnet of distributed IPs.
- Some attacker pairs will succeed (real users reuse passwords). Most will fail.
- Real users continue to log in throughout the attack.

Non-functional:
- Block at least 70% of attack volume before it reaches the password-check stage.
- Real-user login success rate must stay above 80% during the attack (no global lockout).
- Defenses available: rate-limit per IP, rate-limit per account, captcha challenge on suspicious patterns, device-fingerprint, breach-password-list rejection, MFA enrollment ramp.
- Cannot rely on a single IP-based block: attacker is distributed.
- Cannot rely on a single account-based block: would lock out real users on shared passwords.
- Layered defense required: at least two independent gating mechanisms.
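
The layering requirement above can be sketched as two independent gates that run before any password check. This is a minimal illustration, not the benchmark's reference design: the names `SlidingWindowLimiter` and `gate`, the thresholds, and the choice to send a rate-limited account to a captcha instead of locking it are all assumptions.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class SlidingWindowLimiter:
    """Allow at most `limit` events per key within `window_s` seconds."""
    limit: int
    window_s: float
    _events: dict = field(default_factory=lambda: defaultdict(deque))

    def hit(self, key: str, now: float) -> bool:
        """Record an event; return True while the key is within its limit."""
        q = self._events[key]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        q.append(now)
        return len(q) <= self.limit


# Hypothetical thresholds, chosen only for illustration.
ip_limiter = SlidingWindowLimiter(limit=5, window_s=60.0)
account_limiter = SlidingWindowLimiter(limit=3, window_s=300.0)


def gate(ip: str, email: str, known_device: bool, now: float) -> str:
    """Return "allow", "challenge", or "block" before the password check."""
    ip_ok = ip_limiter.hit(ip, now)
    account_ok = account_limiter.hit(email, now)
    if not ip_ok:
        return "block"  # a single IP is hammering the endpoint
    if not account_ok:
        # Assumed policy: never hard-lock the account. A recognized device
        # passes, anything else must solve a captcha first.
        return "allow" if known_device else "challenge"
    return "allow"
```

Neither gate depends on the other: the per-IP limiter stops a noisy bot outright, while the per-account limiter degrades to a challenge, so the real owner of a targeted account is slowed down but not locked out.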

Return a CanvasState modeling the login path and layered defenses.

Constraints

Max components: 12
Required behaviors: ratelimit, filter, circuitbreaker
Monthly budget: $4500
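
Of the required behaviors, the circuit breaker is the one the prompt does not describe. As a rough sketch of the general pattern (closed, open, half-open), assuming it guards the password-check backend and that a "failure" means a backend error or timeout, not a wrong password; the class name, threshold, and cooldown are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CircuitBreaker:
    failure_threshold: int = 5         # consecutive failures before opening (assumed)
    cooldown_s: float = 30.0           # time to stay open before probing (assumed)
    failures: int = 0
    opened_at: Optional[float] = None  # None means the breaker is closed

    def state(self, now: float) -> str:
        if self.opened_at is None:
            return "closed"
        if now - self.opened_at >= self.cooldown_s:
            return "half-open"
        return "open"

    def allow(self, now: float) -> bool:
        """Pass requests when closed or half-open (probe); shed them when open."""
        return self.state(now) != "open"

    def record(self, success: bool, now: float) -> None:
        """Feed back the outcome of a request that was allowed through."""
        if success:
            # Any success closes the breaker and resets the failure count.
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                # Opens the breaker, or re-opens it after a failed probe.
                self.opened_at = now
```

While the breaker is open, the login path can shed or queue requests instead of piling more load onto a backend that is already failing.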

Stress scenarios

Normal login traffic (baseline)
Standard daytime login volume, no attack.

Credential stuffing flood (adversarial)
Distributed botnet replays leaked credentials. Block attack, preserve real users.

Low-and-slow attack (adversarial)
Attacker spreads attempts across many IPs at low rate to evade per-IP limits.
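
Because the low-and-slow scenario defeats per-IP counters by construction, a defense needs a signal that aggregates across IPs. One possible sketch keys on the device fingerprint and counts the distinct accounts it has tried. It assumes the fingerprint stays stable while the bot rotates IPs, which a careful attacker may defeat; `FingerprintTracker` and its thresholds are hypothetical.

```python
from collections import defaultdict, deque


class FingerprintTracker:
    """Flag fingerprints that try many distinct accounts within a time window."""

    def __init__(self, max_accounts: int = 3, window_s: float = 3600.0):
        self.max_accounts = max_accounts   # assumed threshold
        self.window_s = window_s           # assumed window
        self._attempts = defaultdict(deque)  # fingerprint -> deque of (time, email)

    def suspicious(self, fingerprint: str, email: str, now: float) -> bool:
        """Record a login attempt; return True once the fingerprint looks like a bot."""
        q = self._attempts[fingerprint]
        # Forget attempts older than the window.
        while q and now - q[0][0] >= self.window_s:
            q.popleft()
        q.append((now, email))
        distinct_accounts = {e for _, e in q}
        return len(distinct_accounts) > self.max_accounts
```

The source IP is deliberately not an input: a real user retrying their own password touches a single account however often they try, while a replayed dump touches a new account on almost every attempt.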

Pass criteria (overall)

Min stability score: 60
Max drop rate: 50.0%
Min delivery rate: 40.0%
Max errors: 8
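
Read as a predicate, the four thresholds above combine roughly as follows. The page does not say how the bench aggregates per-scenario numbers into the overall figures, nor whether the bounds are inclusive, so `RunResult`, `passes`, and the inclusive comparisons are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    stability: float      # stability score
    drop_rate: float      # percent of requests dropped
    delivery_rate: float  # percent of requests delivered
    errors: int           # error count


def passes(r: RunResult) -> bool:
    """Apply the overall pass criteria, treating every bound as inclusive (assumed)."""
    return (
        r.stability >= 60
        and r.drop_rate <= 50.0
        and r.delivery_rate >= 40.0
        and r.errors <= 8
    )
```

A run must clear all four bounds at once; a high delivery rate does not compensate for a stability score under 60.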

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

    pip install git+https://github.com/collapseindex/chini-bench-cli.git
    export OPENROUTER_API_KEY=...

    chini-bench run chini-028-credential-stuffing \
      --provider openrouter --model google/gemini-2.0-flash-001 \
      --as alice --x alice --linkedin alice-builds

Or inspect the prompt first:

    chini-bench prompt chini-028-credential-stuffing

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank Submitter Model Score Stability Delivery Design Pass Links
#1 alex default x-ai/grok-4.20 77 47.0 96.0 75.0 X
#2 alex default openai/gpt-5.4 74 20.0 100.0 75.0 X
#3 alex default google/gemini-3.1-pro-preview 72 12.0 100.0 75.0 X
#4 alex default anthropic/claude-sonnet-4.6 65 43.0 100.0 50.0 X

Per-scenario breakdown of the top run

Scenario Health Drop rate Delivered Pass
baseline 79.0 3.9% 85
stuffing-attack 31.0 100.0% 172
low-and-slow 31.0 100.0% 98