chini-028-credential-stuffing
Credential Stuffing Defense
100k stolen credentials replayed against your login. Block the attack without locking out 50k real users.
Source: Application security, OWASP authentication threat model
Prompt
Design the authentication path for a consumer web service under credential-stuffing attack.
Functional:
- Users submit email + password to /login. The backend checks against a hashed store and returns a session token on success.
- The attacker replays a leaked credential dump (email/password pairs) at high volume from a botnet of distributed IPs.
- Some attacker pairs will succeed (real users reuse passwords). Most will fail.
- Real users continue to log in throughout the attack.
Non-functional:
- Block at least 70% of attack volume before it reaches the password-check stage.
- Real-user login success rate must stay above 80% during the attack (no global lockout).
- Defenses available: rate-limit per IP, rate-limit per account, captcha challenge on suspicious patterns, device-fingerprint, breach-password-list rejection, MFA enrollment ramp.
- Cannot rely on a single IP-based block: the attacker is distributed.
- Cannot rely on a single account-based block: it would lock out real users on shared passwords.
- Layered defense required: at least two independent gating mechanisms.
Return a CanvasState modeling the login path and layered defenses.
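As an illustration of what "at least two independent gating mechanisms" can mean in practice, here is a minimal Python sketch that places a per-IP limiter, a per-account limiter, and a breached-password filter in front of the password check. It is not part of the benchmark and does not produce a CanvasState. The class names, the limits (5 per minute per IP, 10 per 5 minutes per account), the decision labels, and the use of SHA-1 digests for the breach list are all assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._events = defaultdict(deque)  # key -> timestamps of allowed events

    def allow(self, key: str, now: float) -> bool:
        q = self._events[key]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True


class LoginGate:
    """Hypothetical layered gate that runs before the password hash check."""

    def __init__(self, breached_sha1: set[str]):
        # Illustrative limits, not tuned values.
        self.per_ip = SlidingWindowLimiter(limit=5, window=60.0)
        self.per_account = SlidingWindowLimiter(limit=10, window=300.0)
        self.breached_sha1 = breached_sha1

    def decide(self, ip: str, email: str, password: str, now: float) -> str:
        # Layer 1: per-IP limit stops noisy individual bots.
        if not self.per_ip.allow(ip, now):
            return "block_ip"
        # Layer 2: per-account limit catches distributed attempts on one
        # account; it challenges instead of locking the real user out.
        if not self.per_account.allow(email.lower(), now):
            return "challenge"
        # Layer 3: passwords on a known breach list never reach the hash check.
        digest = hashlib.sha1(password.encode("utf-8")).hexdigest()
        if digest in self.breached_sha1:
            return "force_reset"
        return "check_password"
```

Returning "challenge" rather than a hard block on the account layer is a deliberate choice in this sketch: a captcha keeps the real owner of a targeted account able to log in, which matters for the 80% real-user success requirement.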
Constraints
- Max components: 12
- Required behaviors: ratelimit, filter, circuitbreaker
- Monthly budget: $4500
Stress scenarios
- Normal login traffic (baseline): Standard daytime login volume, no attack.
- Credential stuffing flood (adversarial): Distributed botnet replays leaked credentials. Block attack, preserve real users.
- Low-and-slow attack (adversarial): Attacker spreads attempts across many IPs at low rate to evade per-IP limits.
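The low-and-slow scenario is the one where per-IP limits see nothing unusual, so a signal aggregated across all traffic is needed. One possible approach, sketched below in Python, is a circuit breaker on the recent login failure ratio: when failures dominate, logins from unrecognized devices get a challenge. The class name, thresholds, and the `known_device` flag are assumptions made for this example, not anything defined by the benchmark.

```python
from collections import deque


class FailureRatioBreaker:
    """Trips when the share of failed logins in a recent window is high."""

    def __init__(self, window: float = 300.0, min_samples: int = 50,
                 trip_ratio: float = 0.5):
        self.window = window            # seconds of history to keep
        self.min_samples = min_samples  # avoid tripping on tiny samples
        self.trip_ratio = trip_ratio    # failure share that trips the breaker
        self._outcomes = deque()        # (timestamp, succeeded)
        self._failures = 0

    def _evict(self, now: float) -> None:
        # Remove outcomes older than the window and keep the counter in sync.
        while self._outcomes and now - self._outcomes[0][0] >= self.window:
            _, succeeded = self._outcomes.popleft()
            if not succeeded:
                self._failures -= 1

    def record(self, now: float, succeeded: bool) -> None:
        self._outcomes.append((now, succeeded))
        if not succeeded:
            self._failures += 1
        self._evict(now)

    def tripped(self, now: float) -> bool:
        self._evict(now)
        total = len(self._outcomes)
        if total < self.min_samples:
            return False
        return self._failures / total >= self.trip_ratio


def requires_challenge(breaker: FailureRatioBreaker, now: float,
                       known_device: bool) -> bool:
    # Only unrecognized devices are challenged, so returning users on a
    # previously seen device are not slowed down while the breaker is open.
    return breaker.tripped(now) and not known_device
```

Because the breaker counts outcomes regardless of source IP, spreading attempts across many addresses does not hide the attack from it. The trade-off is that it reacts to the aggregate and cannot by itself tell which requests are malicious, which is why this sketch pairs it with a device signal.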
Pass criteria (overall)
- Min stability score: 60
- Max drop rate: 50.0%
- Min delivery rate: 40.0%
- Max errors: 8
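Read as plain threshold checks, the four criteria could be evaluated as in the Python sketch below. How the scorer actually aggregates per-scenario results into overall numbers is not described here, so the `RunMetrics` fields and the `failed_criteria` function are hypothetical names used only to make the thresholds concrete.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical overall metrics for one run."""
    stability: float      # stability score
    drop_rate: float      # percent of requests dropped
    delivery_rate: float  # percent of requests delivered
    errors: int           # error count


def failed_criteria(m: RunMetrics) -> list[str]:
    """Return the names of the overall pass criteria that a run misses."""
    failures = []
    if m.stability < 60:
        failures.append("stability")
    if m.drop_rate > 50.0:
        failures.append("drop_rate")
    if m.delivery_rate < 40.0:
        failures.append("delivery_rate")
    if m.errors > 8:
        failures.append("errors")
    return failures
```

Under this reading a run passes only when the returned list is empty, and a value exactly on a threshold counts as passing.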
Submit your run
Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.
End-to-end:
pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...
chini-bench run chini-028-credential-stuffing \
--provider openrouter --model google/gemini-2.0-flash-001 \
--as alice --x alice --linkedin alice-builds
Or inspect the prompt first:
chini-bench prompt chini-028-credential-stuffing
Providers: openai · anthropic · google · openrouter · ollama
Leaderboard
| Rank | Submitter | Model | Score | Stability | Delivery | Design | Pass | Links |
|---|---|---|---|---|---|---|---|---|
| #1 | alex default | x-ai/grok-4.20 | 77 | 47.0 | 96.0 | 75.0 | ✗ | X |
| #2 | alex default | openai/gpt-5.4 | 74 | 20.0 | 100.0 | 75.0 | ✗ | X |
| #3 | alex default | google/gemini-3.1-pro-preview | 72 | 12.0 | 100.0 | 75.0 | ✗ | X |
| #4 | alex default | anthropic/claude-sonnet-4.6 | 65 | 43.0 | 100.0 | 50.0 | ✗ | X |
Per-scenario breakdown of the top run
| Scenario | Health | Drop rate | Delivered | Pass |
|---|---|---|---|---|
| baseline | 79.0 | 3.9% | 85 | ✓ |
| stuffing-attack | 31.0 | 100.0% | 172 | ✗ |
| low-and-slow | 31.0 | 100.0% | 98 | ✗ |