chini-024-meal-prep-sunday

Meal Prep Sunday

Cook once, eat for a week, without the Wednesday-night takeout collapse.

Source: Personal systems, batch cooking, decision fatigue research

Prompt

Design the personal system for batch-cooking a week of meals every Sunday.

Functional:
- Sunday: pick recipes, shop ingredients, batch-cook 5 dinners + 5 lunches in 4 hours.
- Mon-Fri: pull pre-made meal from fridge, reheat, eat. No mid-week cooking decisions.
- Wed evening: leftovers may be running thin or losing freshness. The system must detect this and adapt (swap in a fresher ingredient, eat out, or do a top-up shop).
- A failed Sunday (sick, traveling, exhausted) cannot cause the whole week to collapse to takeout.
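
One way to read the failed-Sunday requirement is as a circuit breaker around Sunday prep: when prep fails, the week is routed to a fallback source instead of defaulting straight to takeout. The sketch below is only an illustration and is not part of the benchmark; `PrepBreaker`, `frozen_backups`, and the source labels are hypothetical names, and the freezer stock of 5 meals is an assumed value.

```python
from dataclasses import dataclass


@dataclass
class PrepBreaker:
    """Routes meals to a fallback when Sunday prep fails (hypothetical model)."""

    frozen_backups: int = 5  # assumed freezer stock, in meals
    is_open: bool = False    # True means Sunday prep failed this week

    def record_sunday(self, prep_succeeded: bool) -> None:
        # A failed Sunday opens the breaker; a successful one closes it.
        self.is_open = not prep_succeeded

    def meal_source(self) -> str:
        # Normal path: the fridge holds this week's batch.
        if not self.is_open:
            return "fridge"
        # Fallback path: draw down frozen backups before resorting to takeout.
        if self.frozen_backups > 0:
            self.frozen_backups -= 1
            return "freezer"
        return "takeout"
```

With two frozen backups and a failed Sunday, the first two meals come from the freezer and only the third falls through to takeout. A successful Sunday closes the breaker again.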

Non-functional:
- 4-hour Sunday budget. Going over consistently means the system will be abandoned within 3 weeks (real burnout failure mode).
- Perishable ingredients (proteins, leafy greens) lose viability after day 3-4. The design must sequence meals by perishability, most perishable first.
- A 2x guest spike (partner brings friends over) on Tuesday should not consume Friday's portions.
- If motivation craters Sunday morning, the system needs a fallback (frozen backup meals, simpler 1-hour menu, schedule slip to Monday).
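
The perishability requirement above can be sketched as a simple ordering rule: eat the meals with the shortest shelf life first, then flag anything still scheduled past its shelf life. This is a minimal sketch under stated assumptions, not the benchmark's model; `Meal`, `shelf_life_days`, `schedule`, and `at_risk` are hypothetical names, and shelf life is assumed to be counted in days after Sunday (Monday is day 1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meal:
    name: str
    shelf_life_days: int  # assumed: days the meal stays good after Sunday


def schedule(meals: list[Meal]) -> list[Meal]:
    """Order meals so the most perishable ones are eaten first."""
    return sorted(meals, key=lambda m: m.shelf_life_days)


def at_risk(plan: list[Meal]) -> list[str]:
    """Names of meals scheduled on a day later than their shelf life."""
    return [
        m.name
        for day, m in enumerate(plan, start=1)  # day 1 = Monday
        if day > m.shelf_life_days
    ]
```

Any name returned by `at_risk` is a candidate for the Wednesday adaptation described in the functional list: swap it out, eat out, or do a top-up shop.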

Return a CanvasState modeling the weekly loop, perishability decay, and motivation/availability failures.

Constraints

Max components: 11
Required behaviors: queue, batch, circuitbreaker
Monthly budget: $800

Stress scenarios

Standard week

baseline

Normal week, Sunday prep happens, all meals consumed.

Tuesday guests

spike

2x meal demand mid-week. System must not eat into Friday's portions.
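
One possible way to protect Friday's portions is to serve each day only from that day's allotment plus a small shared buffer, and to send any remaining demand elsewhere (for example, eating out). The function below is a hedged sketch of that idea; `serve`, `todays_portions`, and `buffer` are hypothetical names and are not taken from the benchmark.

```python
def serve(demand: int, todays_portions: int, buffer: int) -> tuple[int, int, int]:
    """Return (served, buffer_left, unmet) without touching later days' portions."""
    # First use the portions reserved for today.
    from_today = min(demand, todays_portions)
    # Then draw on the shared buffer, never on another day's allotment.
    from_buffer = min(demand - from_today, buffer)
    # Whatever is left must be handled outside the system (e.g. eating out).
    unmet = demand - from_today - from_buffer
    return from_today + from_buffer, buffer - from_buffer, unmet
```

For a Tuesday with 2 reserved portions, a buffer of 1, and a doubled demand of 4, this serves 3 meals, empties the buffer, and leaves 1 meal unmet. Friday's allotment is never read.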

Sick Sunday

outage

Cook is sick. Sunday prep doesn't happen. Fallback path must engage.

Greens go bad on day 4

latency

Salad ingredients lose viability mid-week. Design must reorder consumption.

Pass criteria (overall)

Min stability score: 60
Max drop rate: 15.0%
Min delivery rate: 80.0%
Max errors: 6
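
The four thresholds above can be read as a single conjunction: a run passes only if every one holds. The check below is a minimal sketch of that reading. The `RunMetrics` type and `passes` function are hypothetical, the inclusive comparisons are an assumption, and the real scorer may aggregate per-scenario results differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    stability: float          # stability score
    drop_rate_pct: float      # percent of meals dropped
    delivery_rate_pct: float  # percent of meals delivered
    errors: int               # error count


def passes(m: RunMetrics) -> bool:
    """True when all four overall pass criteria are met (assumed inclusive)."""
    return (
        m.stability >= 60
        and m.drop_rate_pct <= 15.0
        and m.delivery_rate_pct >= 80.0
        and m.errors <= 6
    )
```

Under this reading, a run with high delivery but a stability score below 60 still fails, because no single metric can compensate for another.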

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-024-meal-prep-sunday \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-024-meal-prep-sunday

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	alex	openai/gpt-5.4 default single-shot	89	69.0	100.0	100.0	✓
#2	alex	google/gemini-3.1-pro-preview default single-shot	88	74.0	89.0	100.0	✓
#3	alex	openai/gpt-5.4 default reflexion	83	66.0	97.0	100.0	✗
#4	alex	x-ai/grok-4.20 default reflexion	82	73.0	75.0	100.0	✗
#5	alex	google/gemini-3.1-pro-preview default reflexion	71	55.0	45.0	100.0	✗
#6	alex	x-ai/grok-4.20 default single-shot	68	79.0	0.0	100.0	✗
#7	alex	anthropic/claude-sonnet-4.6 default single-shot	65	0.0	100.0	100.0	✗
#8	alex	anthropic/claude-sonnet-4.6 default reflexion	31	0.0	0.0	75.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	69.0	0.5%	274	✓
guest-spike	67.0	1.2%	456	✓
sick-sunday	69.0	0.5%	274	✓
perishable-decay	69.0	0.5%	274	✓
