chini-021-ddos-shield

DDoS Mitigation Shield

100M packets per second of garbage. Your customer's checkout still has to clear in 200ms.

Source: Network security operations, Cloudflare/Akamai post-mortems, real-world DDoS event analysis

Prompt

Design a DDoS mitigation layer in front of an e-commerce origin.

Functional:
- Inbound traffic terminates at edge scrubbers across 3 regions. Each scrubber classifies: clean, suspicious, attack.
- Clean traffic is forwarded to origin. Suspicious traffic is challenged (JS challenge or CAPTCHA). Attack traffic is dropped at the edge.
- Behavioral baselines per route. Sudden 50x deviation from baseline triggers automatic challenge mode for that route.
- Customer support bypass: requests carrying a known authenticated session cookie skip the challenge, even under attack.
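
The functional rules above can be sketched as a small routing function. This is only an illustration, not the benchmark's reference design: `RouteBaseline`, `route_request`, and the use of requests per second as the baseline metric are hypothetical names and choices. It also assumes an "attack" verdict is dropped even when a trusted cookie is present, since the prompt says the cookie skips the challenge, not the drop.

```python
from dataclasses import dataclass

DEVIATION_FACTOR = 50  # from the prompt: a 50x deviation triggers challenge mode


@dataclass
class RouteBaseline:
    """Hypothetical per-route baseline, measured in requests per second."""
    expected_rps: float

    def in_challenge_mode(self, observed_rps: float) -> bool:
        # Challenge mode starts once observed load reaches 50x the baseline.
        return observed_rps >= DEVIATION_FACTOR * self.expected_rps


def route_request(verdict: str, route_in_challenge_mode: bool,
                  has_trusted_session: bool) -> str:
    """Return 'forward', 'challenge', or 'drop' for one request.

    verdict is the scrubber's classification: 'clean', 'suspicious', or 'attack'.
    """
    if verdict == "attack":
        return "drop"        # assumption: the cookie bypass does not cover drops
    if has_trusted_session:
        return "forward"     # known authenticated session skips the challenge
    if verdict == "suspicious" or route_in_challenge_mode:
        return "challenge"
    return "forward"
```

For example, with a baseline of 10 requests per second, 500 requests per second puts the route in challenge mode; a clean request on that route is then challenged unless it carries a trusted session cookie.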

Non-functional:
- A 100x volumetric attack must NOT cause clean checkout traffic to be dropped at the origin. Edge absorbs.
- An L7 application-layer attack mimicking real users must be detected (as an anomaly in the request mix) without producing false positives on real customers.
- If one region's scrubber is compromised, the attack must NOT reach origin via that region. Failover blackholes the region instead of routing dirty traffic through.
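
The region failover requirement could look roughly like the breaker below. It is a minimal sketch under stated assumptions: `RegionBreaker`, the `leak_threshold` value, and the idea that some downstream check reports how much dirty traffic a region forwarded are all hypothetical. The point it illustrates is that a tripped region resolves to the blackhole rather than to another path toward origin.

```python
from dataclasses import dataclass, field


@dataclass
class RegionBreaker:
    """Hypothetical circuit breaker that blackholes a leaking region."""
    leak_threshold: float = 0.01          # assumed: more than 1% dirty trips it
    blackholed: set = field(default_factory=set)

    def report(self, region: str, dirty_forwarded: int, total_forwarded: int) -> None:
        # Trip the breaker when the share of dirty traffic the region's
        # scrubber let through exceeds the threshold.
        if total_forwarded > 0 and dirty_forwarded / total_forwarded > self.leak_threshold:
            self.blackholed.add(region)

    def next_hop(self, region: str) -> str:
        # A tripped region goes to the blackhole, never to origin.
        return "blackhole" if region in self.blackholed else "origin-gateway"
```

This sketch never resets a tripped region; when and how a region is trusted again is left open by the prompt.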

Return a Chinilla CanvasState. Components: edge scrubbers, classifier, challenge layer, origin gateway, blackhole. Behaviors: filter (attack detection), ratelimit (per-route caps), split (clean/challenge/drop routing), circuitbreaker (region blackhole on compromise), replicate (multi-region scrubbing).
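
The prompt does not say how the ratelimit behavior should enforce per-route caps. One common option is a token bucket per route, sketched below. `TokenBucket` and its parameters are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Hypothetical per-route cap: `rate_per_s` sustained, `burst` peak."""

    def __init__(self, rate_per_s: float, burst: float) -> None:
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, capped at `burst`.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

One bucket per route (for example, a dictionary keyed by route path) keeps a flood on one route from using up the allowance of another.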

Constraints

Max components: 12
Required behaviors: filter, ratelimit, circuitbreaker
Monthly budget: $50000

Stress scenarios

Clean traffic (baseline)

Normal customer mix, no attack. Latency must not be inflated by mitigation overhead.

100x volumetric attack (adversarial)

Edge must absorb the flood, clean checkout must still reach origin.

Region scrubber compromised (outage)

One region's scrubber is leaking dirty traffic. System must blackhole the region, not route through.

Application-layer attack (adversarial)

Slow, low-volume attack mimicking real users. Detection must fire without flagging customers.
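
A low-volume attack will not trip a rate threshold, so detection has to look at the shape of the traffic instead. One possible reading of "anomaly in request mix" is the distance between the observed share of requests per route and the baseline share. The sketch below uses total variation distance. The function name and the example routes are hypothetical, and no alert threshold is suggested, since that depends on the real traffic.

```python
from collections import Counter


def mix_distance(baseline: dict[str, float], observed: Counter) -> float:
    """Total variation distance between baseline and observed route shares.

    baseline maps route -> expected fraction of requests (sums to 1.0).
    observed maps route -> request count in the current window.
    Returns 0.0 for an identical mix and approaches 1.0 for a disjoint one.
    """
    total = sum(observed.values())
    if total == 0:
        return 0.0
    routes = set(baseline) | set(observed)
    return 0.5 * sum(
        abs(baseline.get(route, 0.0) - observed[route] / total)
        for route in routes
    )
```

Because the score describes the aggregate window, not individual requests, a high value can switch a route into challenge mode without labeling any single customer as an attacker.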

Pass criteria (overall)

Min stability score: 65
Max drop rate: 5.0%
Min delivery rate: 92.0%
Max errors: 5
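
Read literally, the four thresholds combine into a single pass/fail check like the one below. This is a sketch of that reading only. `RunResult` and its field names are hypothetical, and how the benchmark aggregates per-scenario results into these overall numbers is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Hypothetical container for a run's overall metrics."""
    stability: float          # stability score
    drop_rate_pct: float      # percent of traffic dropped
    delivery_rate_pct: float  # percent of traffic delivered
    errors: int


def passes(r: RunResult) -> bool:
    # All four thresholds from the pass criteria must hold.
    return (
        r.stability >= 65
        and r.drop_rate_pct <= 5.0
        and r.delivery_rate_pct >= 92.0
        and r.errors <= 5
    )
```

Under this reading, a run with perfect delivery still fails if its stability score is below 65.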

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-021-ddos-shield \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-021-ddos-shield

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	alex	openai/gpt-5.4 default reflexion	83	37.0	100.0	100.0	✗
#2	alex	openai/gpt-5.4 default single-shot	75	25.0	100.0	75.0	✗
#3	alex	google/gemini-3.1-pro-preview default reflexion	75	48.0	84.0	75.0	✗
#4	alex	google/gemini-3.1-pro-preview default single-shot	73	49.0	68.0	75.0	✗
#5	alex	anthropic/claude-sonnet-4.6 default single-shot	70	0.0	100.0	75.0	✗
#6	alex	x-ai/grok-4.20 default single-shot	69	0.0	100.0	75.0	✗
#7	alex	x-ai/grok-4.20 default reflexion	48	39.0	0.0	50.0	✗
#8	alex	anthropic/claude-sonnet-4.6 default reflexion	44	0.0	0.0	75.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	44.0	8.3%	399	✗
volumetric	14.0	100.0%	1757	✗
region-compromised	76.0	0.0%	88	✓
l7-attack	14.0	100.0%	911	✗
