Monte Carlo

The Monte Carlo tab in the Overview panel runs the same design many times with different random seeds and aggregates the results into a distribution. It answers the question “how stable is this design under variation, not just at one seed?”

When to use it

A single deterministic run tells you what happens at one specific seed. That’s good for debugging a specific failure mode but doesn’t tell you whether the design is reliably good. Monte Carlo answers questions like:

“What’s the worst-case health across 50 random seeds?”
“Does my design hit its SLO 95% of the time?”
“How wide is the spread between best and worst run?”
“Which component is the most variable in peak queue depth?”

How to set it up

Open the Overview panel (O).
Click the Monte Carlo tab.
Pick a run count (5-100). Each run uses a different seed.
Optionally set SLO targets (min delivery rate, max dropped packets, min health).
Click Run Monte Carlo.

The progress indicator shows runs completed. Each run is a full deterministic simulation — same as hitting Play with a different seed — so total time scales linearly with run count. A 50-run batch on a typical 8-component design finishes in 5-10 seconds on the WASM engine.
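Conceptually the batch is a loop over seeds, which is why run time grows linearly with run count. The sketch below is illustrative only: `simulate`, the toy 2% drop filter, and the consecutive-seed scheme are assumptions made for the example, not the engine's actual API or seed selection.

```python
import random

def simulate(seed: int) -> dict:
    """Hypothetical stand-in for one full deterministic run of a design."""
    rng = random.Random(seed)  # every random decision draws from this PRNG
    # Toy filter: each of 1000 packets is dropped with probability 2%.
    dropped = sum(rng.random() < 0.02 for _ in range(1000))
    return {"seed": seed, "dropped": dropped, "delivered": 1000 - dropped}

def run_monte_carlo(run_count: int, base_seed: int = 1) -> list[dict]:
    """Run the same design once per seed; total cost is run_count times one run."""
    seeds = [base_seed + i for i in range(run_count)]  # assumed seed scheme
    return [simulate(s) for s in seeds]

results = run_monte_carlo(50)
```

Each element of `results` is one run's outcome; the aggregation steps operate on this list.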

What the results show

Distribution metrics

Avg / best / worst health — mean, max, and min health score across all runs.
Percentiles — P10, P50 (median), P90 to give you the shape of the distribution. A tight design has small spread between P10 and P90; a fragile one has wide spread.
Std dev — standard deviation of the health scores. Lower is more predictable.
Failure rate — fraction of runs where ANY packet was dropped. A 0% failure rate means every seed produced a clean run; anything higher means the design is fragile to seed variation.
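As a rough illustration, these metrics can be computed from per-run health scores and drop counts with Python's standard library. The percentile interpolation method (`inclusive`) and the use of the population standard deviation are assumptions; the panel may use different conventions.

```python
import statistics

def summarize(health: list[float], dropped: list[int]) -> dict:
    """Aggregate per-run health scores and drop counts into distribution metrics."""
    # quantiles(n=10) returns 9 cut points: index 0 is P10, 4 is P50, 8 is P90.
    deciles = statistics.quantiles(health, n=10, method="inclusive")
    return {
        "avg": statistics.fmean(health),
        "best": max(health),
        "worst": min(health),
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "std_dev": statistics.pstdev(health),  # assumption: population std dev
        # A run counts as a failure if it dropped any packet at all.
        "failure_rate": sum(d > 0 for d in dropped) / len(dropped),
    }

# Made-up sample data for five runs.
health = [92.0, 88.0, 95.0, 70.0, 91.0]
dropped = [0, 3, 0, 12, 0]
summary = summarize(health, dropped)
```

In this made-up sample, two of the five runs dropped packets, so the failure rate is 40% even though the median health is high.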

Per-component reliability

Each component gets its own row showing the distribution across runs:

Peak queue P5/P95 range bar — the band shows where the peak queue depth landed in 90% of runs. A tight band means the queue size is predictable; a wide band means a few runs hit much higher peaks than typical.
Drop rate — average fraction of packets the component dropped across all runs.
Processing volume — average packets processed per run.

This is where you find components that “usually work but sometimes don’t” — the peak-queue band reveals tail risk that a single run hides.
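The sketch below shows how a single bad run widens the P5/P95 band for one component. The record fields, the sample values, and the definitions of drop rate (mean of per-run dropped/received) and processing volume (received minus dropped) are assumptions chosen for illustration.

```python
import statistics

# Hypothetical per-run records for one component; the fourth run hits a queue spike.
runs = [
    {"peak_queue": 4, "received": 500, "dropped": 0},
    {"peak_queue": 5, "received": 500, "dropped": 0},
    {"peak_queue": 4, "received": 500, "dropped": 0},
    {"peak_queue": 19, "received": 500, "dropped": 30},
    {"peak_queue": 5, "received": 500, "dropped": 0},
]

def component_row(runs: list[dict]) -> dict:
    peaks = [r["peak_queue"] for r in runs]
    # n=20 yields 19 cut points in 5% steps: index 0 is P5, index 18 is P95.
    cuts = statistics.quantiles(peaks, n=20, method="inclusive")
    return {
        "peak_queue_p5": cuts[0],
        "peak_queue_p95": cuts[18],
        # Assumption: drop rate is the mean of per-run dropped/received.
        "drop_rate": statistics.fmean([r["dropped"] / r["received"] for r in runs]),
        # Assumption: processed packets = received - dropped.
        "processing_volume": statistics.fmean(
            [r["received"] - r["dropped"] for r in runs]
        ),
    }

row = component_row(runs)
```

Four of the five runs peak at a queue depth of 4 or 5, yet the upper edge of the band is pulled far above that by the one outlier, which is the tail risk a single run would hide.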

SLO compliance

If you set SLO targets, the panel shows what fraction of runs PASSED the SLO:

Min delivery rate — fail if delivered/total < this percentage.
Max dropped — fail if dropped > this count.
Min health — fail if health < this score.

A design that passes 95%+ of the time meets the SLO confidently. 50-90% means it’s marginal. Below 50% means the SLO is unrealistic for this design.
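The three checks can be sketched as a per-run predicate plus a pass fraction. The `Slo` class, the run record fields, and the sample numbers are hypothetical; only the comparison rules come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Slo:
    """Optional targets; a target left as None is not checked."""
    min_delivery_pct: Optional[float] = None
    max_dropped: Optional[int] = None
    min_health: Optional[float] = None

def passes(run: dict, slo: Slo) -> bool:
    # Assumes run["total"] > 0.
    if slo.min_delivery_pct is not None:
        if 100.0 * run["delivered"] / run["total"] < slo.min_delivery_pct:
            return False
    if slo.max_dropped is not None and run["dropped"] > slo.max_dropped:
        return False
    if slo.min_health is not None and run["health"] < slo.min_health:
        return False
    return True

def compliance(runs: list[dict], slo: Slo) -> float:
    """Fraction of runs that satisfy every configured target."""
    return sum(passes(r, slo) for r in runs) / len(runs)

# Made-up results for four runs.
runs = [
    {"total": 1000, "delivered": 1000, "dropped": 0, "health": 96.0},
    {"total": 1000, "delivered": 995, "dropped": 5, "health": 91.0},
    {"total": 1000, "delivered": 970, "dropped": 30, "health": 78.0},
    {"total": 1000, "delivered": 999, "dropped": 1, "health": 94.0},
]
slo = Slo(min_delivery_pct=99.0, max_dropped=10, min_health=90.0)
rate = compliance(runs, slo)
```

Here the third run fails all three targets while the others pass, so three of four runs comply.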

Seed list

The seeds used for each run are preserved in the Repro bundle export. A collaborator who imports the bundle and re-runs Monte Carlo gets bit-identical numbers — useful for reproducibility on bug reports, paper appendices, or before/after comparisons.
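The reproducibility property follows from the engine being deterministic for a given seed. A toy demonstration, assuming a hypothetical `simulate` function and a made-up JSON shape for the exported seed list (the real bundle format is not described here):

```python
import json
import random

def simulate(seed: int) -> int:
    """Hypothetical stand-in for one run: returns a toy health score."""
    rng = random.Random(seed)
    drops = sum(rng.random() < 0.05 for _ in range(200))
    return 100 - drops

seeds = [101, 102, 103, 104, 105]
bundle = json.dumps({"seeds": seeds})  # hypothetical export shape

# The original author runs the batch.
original = [simulate(s) for s in seeds]

# A collaborator imports the bundle and replays the same seeds.
replayed = [simulate(s) for s in json.loads(bundle)["seeds"]]

identical = original == replayed
```

Because every random decision is derived from the seed, replaying the same seed list reproduces the same per-run results.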

How seeds affect the run

The PRNG seed determines every random decision in the run: filter drop choices, retry jitter, weighted-split routing, circuit-breaker probe timing. Different seeds produce different random outcomes for the same design.

What CAN change between seeds:

Which specific packets a filter drops (the count is constrained by the drop rate, but which exact packets are dropped varies)
The order in which weighted-split routing picks downstream paths (over many packets the ratio approaches the weights, but the per-packet ordering varies)
When each jittered retry fires
Whether a circuit breaker happens to probe during a particular failure window

What does NOT change:

The design structure (components, connections, behaviors)
The number of seed packets
Aggregate rates and capacities
Deterministic component behavior (passthrough, delay, batch — these have no randomness)

A design that’s robust to seed variation will show small spread (P10 ≈ P90). A design that depends on a specific lucky seed will show wide spread and high std dev.
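Weighted-split routing is a convenient way to see both effects at once. In the sketch below, the path names, the 70/30 weights, and the `route` helper are illustrative assumptions; `random.Random.choices` stands in for whatever selection logic the engine uses.

```python
import random
from collections import Counter

def route(seed: int, n_packets: int,
          paths=("A", "B"), weights=(0.7, 0.3)) -> list[str]:
    """Pick a downstream path for each packet using a seeded PRNG."""
    rng = random.Random(seed)
    return rng.choices(paths, weights=weights, k=n_packets)

run1 = route(seed=1, n_packets=10_000)
run2 = route(seed=2, n_packets=10_000)

# Per-packet ordering differs between seeds...
same_order = run1 == run2

# ...but the aggregate share sent to path A stays close to its 0.7 weight.
share1 = Counter(run1)["A"] / len(run1)
share2 = Counter(run2)["A"] / len(run2)
```

With enough packets, both shares land near 0.7 even though the two sequences differ packet by packet. With only a handful of packets the shares can differ noticeably between seeds, which is one way low-volume designs end up with a wide spread.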

Health weights

Monte Carlo uses the same per-block health formula as Stability analysis (throughput × wT + drop × wD + queue × wQ). If you’ve adjusted the health weights in the Stability tab, those weights apply to Monte Carlo too — the panel uses one source of truth.

See Stability Analysis for the formula and per-weight tradeoffs.
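As a rough worked example of the weighted sum, the sketch below assumes each sub-score is already normalized to a 0-100 scale where higher is better and that the weights sum to 1. Both are assumptions for illustration; the Stability Analysis page is the reference for how the sub-scores and weights are actually defined.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HealthWeights:
    w_t: float  # throughput weight (wT)
    w_d: float  # drop weight (wD)
    w_q: float  # queue weight (wQ)

def block_health(throughput: float, drop: float, queue: float,
                 w: HealthWeights) -> float:
    """throughput x wT + drop x wD + queue x wQ."""
    return throughput * w.w_t + drop * w.w_d + queue * w.w_q

# Hypothetical sub-scores for one block (0-100, higher is better).
scores = {"throughput": 90.0, "drop": 100.0, "queue": 60.0}

balanced = HealthWeights(w_t=0.5, w_d=0.3, w_q=0.2)
queue_heavy = HealthWeights(w_t=0.2, w_d=0.2, w_q=0.6)

health_balanced = block_health(**scores, w=balanced)
health_queue_heavy = block_health(**scores, w=queue_heavy)
```

The same block scores lower under the queue-heavy weights because its weakest sub-score counts for more, so changing the weights shifts every run's health and therefore the whole Monte Carlo distribution.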

Pro feature

Monte Carlo is a Pro feature. Free accounts get the Overview tab and stress test panel; Stability, Monte Carlo, and Parameter Sweep all require Pro.

Related

Overview Panel — the dashboard that hosts the Monte Carlo tab
Stability Analysis — frame-to-frame consistency on a single seed
Parameter Sweep — Monte Carlo across a grid of behavior knobs
Engine Methodology — how seeds drive the deterministic engine