chini-025-job-search-pipeline

Job Search Pipeline

100 applications in, 3 offers out, without ghosting yourself in the middle.

Source: Personal systems, sales funnel theory applied to careers

Prompt

Design the personal pipeline for a 12-week job search.

Functional:
- Top of funnel: source companies (referrals, LinkedIn, job boards). Apply with tailored material.
- Middle: recruiter screen, technical screen, take-home, onsite loop. Each stage has a typical drop rate.
- Bottom: offer, negotiation, accept or decline.
- Parallel pipelines for different role types (IC, manager, founding) with shared application bandwidth.
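
The funnel stages above multiply together, which is why a "100 applications in, 3 offers out" outcome is plausible. The sketch below shows that arithmetic. The per-stage pass rates are assumptions chosen for illustration (the task gives no stage-level numbers), and `STAGE_PASS_RATES`, `expected_survivors`, and `end_to_end_rate` are hypothetical names, not part of chini-bench.

```python
# Hypothetical pass rates per gate. These numbers are assumptions, picked so
# the product lands near the "100 in, 3 out" tagline; they are not from the task.
STAGE_PASS_RATES = [
    ("application review", 0.30),
    ("recruiter screen", 0.60),
    ("technical screen", 0.50),
    ("take-home", 0.70),
    ("onsite loop", 0.50),
]


def expected_survivors(applications, stages=STAGE_PASS_RATES):
    """Return (stage name, expected count still alive) after each gate."""
    alive = float(applications)
    result = []
    for name, pass_rate in stages:
        alive *= pass_rate
        result.append((name, alive))
    return result


def end_to_end_rate(stages=STAGE_PASS_RATES):
    """Fraction of applications expected to end in an offer."""
    rate = 1.0
    for _, pass_rate in stages:
        rate *= pass_rate
    return rate


if __name__ == "__main__":
    for name, alive in expected_survivors(100):
        print(f"after {name}: {alive:.2f}")
    print(f"end-to-end: {end_to_end_rate():.2%}")
```

Under these assumed rates the end-to-end conversion is a little over 3%, so roughly 100 applications are needed for 3 offers. Changing any single stage rate shifts the required top-of-funnel volume proportionally.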

Non-functional:
- Application velocity: 5-8 quality applications per week. Going dark for 2 weeks collapses momentum.
- Rejection batch: 5+ rejections in one week is normal but emotionally compounding. The system needs a cooldown / reflection step that engages before rage-applying.
- Offer collision: 2 offers arriving in the same week with different deadlines must not force a bad choice. Stall, accelerate, or create deadline parity.
- A 3x application surge (laid-off mode) must not mean a 3x quality drop. Throughput must be bounded.
- A bad week (interview face-plant, recruiter ghost) must not freeze the whole funnel.

Return a CanvasState modeling the funnel, drop rates per stage, and emotional/cognitive failure modes.

Constraints

Max components: 12
Required behaviors: queue, ratelimit, circuitbreaker
Monthly budget: $200

Stress scenarios

Steady search (baseline)

Normal application cadence, normal drop rates. A few offers expected over 12 weeks.

Laid-off mode (spike)

3x application volume with finite hours. Either the rate limit holds or quality collapses.
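
One way to keep a surge from degrading quality is a weekly cap with a backlog queue: extra leads wait rather than being rushed out. The sketch below assumes a cap of 8 per week (the upper end of the 5-8 range stated in the prompt). The class and method names are hypothetical and do not reflect the CanvasState schema.

```python
from collections import deque


class WeeklyApplicationLimiter:
    """Caps applications sent per week; surplus leads wait in a FIFO backlog."""

    def __init__(self, weekly_cap=8):
        # Assumed cap: upper bound of the 5-8 per week range in the prompt.
        self.weekly_cap = weekly_cap
        self.backlog = deque()

    def enqueue(self, company):
        """Add a sourced lead to the backlog."""
        self.backlog.append(company)

    def run_week(self):
        """Send at most weekly_cap applications and return the ones sent."""
        sent = []
        while self.backlog and len(sent) < self.weekly_cap:
            sent.append(self.backlog.popleft())
        return sent


if __name__ == "__main__":
    limiter = WeeklyApplicationLimiter()
    for i in range(24):  # a 3x surge relative to the weekly cap
        limiter.enqueue(f"company-{i}")
    print(len(limiter.run_week()), "sent,", len(limiter.backlog), "queued")
```

With 24 leads arriving in one week, only 8 go out and the remaining 16 stay queued for later weeks, so the surge stretches the timeline instead of cutting the time spent per application.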

5 rejections in one week (cascade)

Compound emotional load. The cooldown step must engage before the next batch of applications.
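
The cooldown step maps naturally onto a circuit breaker: after enough rejections inside a rolling week, applying is blocked for a fixed period. The sketch below is a minimal version under stated assumptions. The threshold of 5 comes from the scenario, while the 3-day cooldown, the integer day counter, and the `RejectionBreaker` name are illustrative choices, not values from the task.

```python
class RejectionBreaker:
    """Blocks new applications for a cooldown after a burst of rejections."""

    def __init__(self, threshold=5, cooldown_days=3, window_days=7):
        self.threshold = threshold          # rejections that trip the breaker
        self.cooldown_days = cooldown_days  # assumed length of the pause
        self.window_days = window_days      # rolling window, in days
        self.rejection_days = []
        self.open_until = None              # day on which applying resumes

    def record_rejection(self, day):
        """Log a rejection on an integer day and trip the breaker if needed."""
        self.rejection_days.append(day)
        # Keep only rejections that fall inside the rolling window.
        self.rejection_days = [
            d for d in self.rejection_days if day - d < self.window_days
        ]
        if len(self.rejection_days) >= self.threshold:
            self.open_until = day + self.cooldown_days
            self.rejection_days = []  # start fresh after the cooldown

    def can_apply(self, day):
        """True when no cooldown is active on the given day."""
        return self.open_until is None or day >= self.open_until


if __name__ == "__main__":
    breaker = RejectionBreaker()
    for day in (1, 2, 3, 4, 5):
        breaker.record_rejection(day)
    print([breaker.can_apply(day) for day in (5, 6, 7, 8)])
```

Clearing the rejection history when the breaker trips is a design choice: it stops the same batch from re-tripping the breaker the moment the cooldown ends.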

Recruiter ghost (outage)

The primary recruiter pipeline goes silent. The funnel must continue via other channels.
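
A simple way to keep the funnel moving when one channel goes silent is to treat silence as a health signal and reallocate the weekly application budget to the channels that still respond. The sketch below assumes a 10-day silence limit and an even split. Both are illustrative choices, and the function names are hypothetical.

```python
def active_channels(last_response_day, today, silence_limit_days=10):
    """Return channel names that responded within the silence limit."""
    return sorted(
        name
        for name, last in last_response_day.items()
        if today - last <= silence_limit_days
    )


def allocate(weekly_cap, channels):
    """Split the weekly cap evenly; earlier channels absorb any remainder."""
    if not channels:
        return {}
    base, extra = divmod(weekly_cap, len(channels))
    return {
        name: base + (1 if i < extra else 0)
        for i, name in enumerate(channels)
    }


if __name__ == "__main__":
    # The recruiter last replied on day 0; the other channels are recent.
    last_seen = {"recruiter": 0, "referrals": 18, "job boards": 19}
    healthy = active_channels(last_seen, today=20)
    print(allocate(8, healthy))
```

In this example the recruiter channel drops out of the allocation and its share of the 8 weekly applications is split between referrals and job boards.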

Pass criteria (overall)

Min stability score: 60
Max drop rate: 40.0%
Min delivery rate: 5.0%
Max errors: 8
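
The four thresholds above can be read as a simple conjunction: a run passes only if every one holds. The sketch below encodes that reading. It is an assumption about how the criteria combine, not the benchmark's actual scorer. The `RunMetrics` fields are hypothetical names, and treating each bound as inclusive is also assumed.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    stability: float          # stability score, 0-100
    drop_rate_pct: float      # percent of work dropped
    delivery_rate_pct: float  # percent of work delivered
    errors: int               # error count


def failed_criteria(m):
    """Return the names of the criteria a run fails (empty list means pass)."""
    failures = []
    if m.stability < 60:
        failures.append("stability")
    if m.drop_rate_pct > 40.0:
        failures.append("drop_rate")
    if m.delivery_rate_pct < 5.0:
        failures.append("delivery_rate")
    if m.errors > 8:
        failures.append("errors")
    return failures


if __name__ == "__main__":
    # Illustrative values only; not taken from any real submission.
    print(failed_criteria(RunMetrics(24.0, 30.0, 25.0, 2)))
```

Returning the list of failed criteria rather than a single boolean makes it easier to see which constraint a design should address first.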

Submit your run

Submissions go through the chini-bench CLI. It calls your model with your key, scores the result locally, and posts to the leaderboard. Nothing leaves your machine except the canvas it produces.

End-to-end:

pip install git+https://github.com/collapseindex/chini-bench-cli.git
export OPENROUTER_API_KEY=...

chini-bench run chini-025-job-search-pipeline \
  --provider openrouter --model google/gemini-2.0-flash-001 \
  --as alice

Or inspect the prompt first:

chini-bench prompt chini-025-job-search-pipeline

Providers: openai · anthropic · google · openrouter · ollama

Leaderboard

Rank	Submitter	Model	Score	Stability	Delivery	Design	Pass
#1	alex	google/gemini-3.1-pro-preview default reflexion	55	24.0	25.0	100.0	✗
#2	alex	x-ai/grok-4.20 default reflexion	54	10.0	25.0	100.0	✗
#3	alex	openai/gpt-5.4 default reflexion	49	10.0	26.0	100.0	✗
#4	alex	x-ai/grok-4.20 default single-shot	48	22.0	0.0	75.0	✗
#5	alex	anthropic/claude-sonnet-4.6 default single-shot	48	22.0	0.0	75.0	✗
#6	alex	google/gemini-3.1-pro-preview default single-shot	47	7.0	0.0	100.0	✗
#7	alex	openai/gpt-5.4 default single-shot	45	15.0	0.0	75.0	✗
#8	alex	anthropic/claude-sonnet-4.6 default reflexion	34	0.0	0.0	75.0	✗

Per-scenario breakdown of the top run

Scenario	Health	Drop rate	Delivered	Pass
baseline	8.0	100.0%	0	✗
laidoff-surge	6.0	100.0%	0	✗
rejection-batch	9.0	100.0%	0	✗
recruiter-ghost	72.0	0.0%	104	✓
