chini-017-couch-to-5k
Couch to 5K
Three runs a week, nine weeks, one knee that hurts on Wednesdays. Get to the 5K without quitting.
Source: Couch to 5K running program, behavior change literature, the eternal hope of the new year
Prompt
Design a personal running progression system to take a sedentary adult from zero to a 5K run in 9 weeks without injury or dropout.
Functional:
- Three runs per week. Each session has a target structure (walk/jog intervals, then continuous jog, then 5K continuous).
- Recovery days between runs. Sleep, hydration, soreness check before each session.
- Weekly progression: intervals lengthen, walks shorten. Milestones: first 10-min jog, first 20-min, first 5K.
- Optional cross-training (bike, swim) on recovery days when feeling good.
Non-functional:
- A bad-week event (work travel, sickness, life chaos = 4x friction) must not cause program abandonment. System auto-defers, repeats the prior week instead of skipping.
- If knee pain or excessive soreness reported, the session must downgrade (more walk, less jog) before the user gets injured. Not skip the run entirely (loss of habit).
- Missing two consecutive sessions must trigger a recovery week, not a guilt-driven catch-up double.
Return a Chinilla CanvasState.
Components: scheduler, sessions, recovery checks, progression rules, fallback paths.
Behaviors: queue (session backlog), ratelimit (max sessions/week), circuitbreaker (pain trigger downgrade), retry (repeat-week deferral), split (training vs recovery routing).
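The three non-functional rules in the prompt read as a small decision procedure. The sketch below is one possible reading, not part of the benchmark or of Chinilla: the names `CheckIn`, `Action`, and `next_action` are hypothetical, and the priority order between rules is an assumption, since the prompt does not rank them.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RUN_AS_PLANNED = "run_as_planned"
    DOWNGRADE = "downgrade"          # more walking, less jogging
    REPEAT_WEEK = "repeat_week"      # defer by redoing the prior week
    RECOVERY_WEEK = "recovery_week"  # instead of a catch-up double


@dataclass
class CheckIn:
    """Hypothetical pre-session check; field names are illustrative."""
    knee_pain: bool = False
    excessive_soreness: bool = False
    bad_week: bool = False           # travel, sickness, life chaos
    consecutive_missed: int = 0


def next_action(c: CheckIn) -> Action:
    # Assumed priority: pain first, then missed sessions, then bad week.
    if c.knee_pain or c.excessive_soreness:
        return Action.DOWNGRADE      # never skip outright
    if c.consecutive_missed >= 2:
        return Action.RECOVERY_WEEK
    if c.bad_week:
        return Action.REPEAT_WEEK
    return Action.RUN_AS_PLANNED
```

Under this ordering, a runner who reports knee pain during a travel week still gets a downgraded session rather than a deferral; a different ordering would be an equally valid reading of the prompt.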
Constraints
- Max components: 11
- Required behaviors: ratelimit, circuitbreaker, split
- Monthly budget: $30
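As a rough illustration, the component and behavior constraints can be checked before submitting. This is a minimal sketch that assumes you have already extracted a component count and a set of behavior names from your canvas; the function `constraint_violations` is hypothetical and does not reflect how chini-bench validates a CanvasState. The monthly budget is left out because the page does not say how cost is computed.

```python
MAX_COMPONENTS = 11
REQUIRED_BEHAVIORS = {"ratelimit", "circuitbreaker", "split"}


def constraint_violations(component_count: int, behaviors: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []
    if component_count > MAX_COMPONENTS:
        problems.append(
            f"too many components: {component_count} > {MAX_COMPONENTS}"
        )
    # Sorted so the message is stable across runs.
    missing = sorted(REQUIRED_BEHAVIORS - behaviors)
    if missing:
        problems.append("missing behaviors: " + ", ".join(missing))
    return problems
```

For example, `constraint_violations(12, {"queue", "split"})` reports both the extra component and the two missing behaviors.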
Stress scenarios
Normal week
baseline: Three sessions land on schedule, recovery clean.
Travel + cold week
spike: Friction 4x normal. System must defer without breaking the program.
Knee pain reported
outage: Pain trigger fires. Session must downgrade, not be skipped outright.
Slow recovery between runs
latency: Soreness lingers. Scheduler must extend gaps without dumping the queue.
Pass criteria (overall)
- Min stability score: 60
- Max drop rate: 12.0%
- Min delivery rate: 82.0%
- Max errors: 6
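The four thresholds can be read as a simple pass/fail check. The sketch below assumes the comparisons are inclusive (a stability of exactly 60 passes) and that a run must meet all four at once; neither detail is stated on this page. `RunMetrics` and `failed_criteria` are hypothetical names, not part of the chini-bench CLI.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical container for a run's overall numbers."""
    stability: float          # stability score
    drop_rate_pct: float      # percent
    delivery_rate_pct: float  # percent
    errors: int


def failed_criteria(m: RunMetrics) -> list[str]:
    """Names of the criteria a run misses; an empty list means a pass."""
    failed = []
    if m.stability < 60:
        failed.append("stability")
    if m.drop_rate_pct > 12.0:
        failed.append("drop_rate")
    if m.delivery_rate_pct < 82.0:
        failed.append("delivery_rate")
    if m.errors > 6:
        failed.append("errors")
    return failed
```

With illustrative values, `failed_criteria(RunMetrics(80.0, 3.0, 54.0, 0))` returns `["delivery_rate"]`: a stable run still fails if too little is delivered.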
Submit your run
Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.
End-to-end:
pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...
chini-bench run chini-017-couch-to-5k \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice --x alice --linkedin alice-builds
Or inspect the prompt first:
chini-bench prompt chini-017-couch-to-5k
Providers: openai · anthropic · google · openrouter · ollama
Leaderboard
| Rank | Submitter | Model | Score | Stability | Delivery | Design | Pass | Links |
|---|---|---|---|---|---|---|---|---|
| #1 | alex default | x-ai/grok-4.20 | 76 | 80.0 | 54.0 | 100.0 | ✗ | X |
| #2 | alex default | google/gemini-3.1-pro-preview | 70 | 39.0 | 100.0 | 100.0 | ✗ | X |
| #3 | alex default | openai/gpt-5.4 | 50 | 60.0 | 0.0 | 75.0 | ✗ | X |
| #4 | alex default | anthropic/claude-sonnet-4.6 | 49 | 58.0 | 0.0 | 75.0 | ✗ | X |
Per-scenario breakdown of the top run
| Scenario | Health | Drop rate | Delivered | Pass |
|---|---|---|---|---|
| baseline | 82.0 | 3.2% | 52 | ✗ |
| bad-week | 80.0 | 3.6% | 184 | ✓ |
| knee-pain | 76.0 | 0.0% | 0 | ✗ |
| long-recovery | 82.0 | 3.2% | 52 | ✗ |