
Stress Testing

The simulator runs your design step by step. Same design + same settings = same result every time, so you can fix something and tell whether it actually helped.

  1. Find the entry blocks (anything with no incoming line).
  2. Drop a batch of items at each entry block.
  3. Each step pushes one layer of the diagram.
  4. Each block runs its behavior (passthrough, filter, delay, etc.). Blocks with a capacity hold items in a queue and drop the oldest one when the queue fills up.
  5. Items move along the outgoing lines.
  6. Backpressure: if the next block’s queue is at 80% or more, items wait one step before arriving instead of showing up instantly.
  7. Queue overflow drops the oldest item to keep order.
  8. The run stops when every item has been handled, or it hits a step limit.
  9. If the run would take more than an hour of simulated time, time compression scales every duration down to fit (see Time compression).
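A minimal Python sketch of three rules from the steps above: finding entry blocks, dropping the oldest item on overflow, and the 80% backpressure delay. The names (`find_entry_blocks`, `BoundedQueue`, `arrival_delay`) and the data shapes are illustrative assumptions, not the simulator's actual code.

```python
from collections import deque

BACKPRESSURE_THRESHOLD = 0.8  # from the text: queue at 80% or more


def find_entry_blocks(blocks, lines):
    """Entry blocks are blocks with no incoming line.

    `lines` is assumed to be a list of (source, destination) pairs.
    """
    targets = {dst for _, dst in lines}
    return [b for b in blocks if b not in targets]


class BoundedQueue:
    """FIFO queue that drops the oldest item when it is full (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def push(self, item):
        if len(self.items) >= self.capacity:
            self.items.popleft()  # overflow: drop the oldest, keep arrival order
            self.dropped += 1
        self.items.append(item)

    def under_backpressure(self):
        return len(self.items) / self.capacity >= BACKPRESSURE_THRESHOLD


def arrival_delay(queue):
    """Extra steps an item waits before arriving at this queue."""
    return 1 if queue.under_backpressure() else 0


blocks = ["api", "worker", "db"]
lines = [("api", "worker"), ("worker", "db")]
entries = find_entry_blocks(blocks, lines)  # only "api" has no incoming line

q = BoundedQueue(capacity=5)
for i in range(7):
    q.push(i)  # the two oldest items (0 and 1) are dropped
```

After the loop the queue holds items 2 through 6, `q.dropped` is 2, and `arrival_delay(q)` returns 1 because the queue is full.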

The simulator uses a fixed random seed (42) for anything random (like filter drop rates). That gives you:

  • Same design + same settings = same numbers every time.
  • A specific failure case you can reproduce on demand.
  • Apples-to-apples before and after a change.
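The effect of a fixed seed can be sketched in a few lines of Python. The seed value 42 comes from the text; the `run_filter` function and the way the drop rate is applied are assumptions made for illustration.

```python
import random

SEED = 42  # the fixed seed named in the text


def run_filter(items, drop_rate, seed=SEED):
    """Return the items that survive a random filter (hypothetical helper)."""
    rng = random.Random(seed)  # private generator, so runs do not share state
    return [item for item in items if rng.random() >= drop_rate]


first = run_filter(range(100), drop_rate=0.3)
second = run_filter(range(100), drop_rate=0.3)
same_result = first == second  # same inputs + same seed = same survivors
```

Because each run builds its own generator from the same seed, the two runs drop exactly the same items, which is what makes a before-and-after comparison meaningful.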

Monte Carlo runs the same design many times with slightly different seeds, so you can see the spread instead of one snapshot. Each block’s processing time gets nudged a bit per run. The nudge grows with how deep the design is:

  • Base nudge: ±15%.
  • For deep designs (4 or more blocks deep), it grows by 1 percentage point per extra layer, up to ±25%.
  • Formula: 0.15 + max(0, 0.01 * (depth - 3)), capped at 0.25.
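The formula translates directly into code. In the sketch below, `nudge_fraction` follows the formula as written; `nudged_time` assumes the nudge is drawn uniformly from the ± range, which the text does not specify.

```python
import random


def nudge_fraction(depth):
    """Nudge size as a fraction: 0.15 base, +0.01 per layer past 3, capped at 0.25."""
    return min(0.25, 0.15 + max(0.0, 0.01 * (depth - 3)))


def nudged_time(base_time, depth, rng):
    """Apply a random nudge to a processing time (uniform draw is an assumption)."""
    f = nudge_fraction(depth)
    return base_time * (1 + rng.uniform(-f, f))


for depth in (2, 4, 8, 13, 20):
    print(depth, round(nudge_fraction(depth), 2))
```

A design of depth 3 or less stays at ±15%, depth 8 reaches ±20%, and anything at depth 13 or more hits the ±25% cap.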

Deep nudges make downstream failures look correlated, the way they would in real life.

After a Monte Carlo run, you can check the results against goals you set. Open SLO Targets in the Monte Carlo tab and set up to three:

| Target | Range | What it means |
| --- | --- | --- |
| Min Delivery Rate | 0-100% | Lowest share of items that have to make it through. |
| Max Dropped | 0+ | Most items you’ll tolerate dropping. |
| Min Health | 0-100% | Lowest average system health for the run. |

Leave a field empty to skip it. Once set, every run is checked against your goals:

  • Compliance card: shows how many runs hit the goals (e.g. “85% of runs met SLO”) with a strip of green (pass) and red (fail) blocks, one per run.
  • Per-run column: the results table gets an SLO column showing pass/fail for each run.
  • Pass rules: a run only passes if it hits all the goals you set.

Goals save with the project and update as you re-run.
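The pass rule can be sketched as follows: an empty goal is skipped, and a run passes only if it meets every goal that is set. The class and field names are hypothetical, and treating a run with no goals set as a pass is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SloTargets:
    """Hypothetical container; None means the field was left empty."""
    min_delivery_rate: Optional[float] = None  # percent, 0-100
    max_dropped: Optional[int] = None          # item count, 0 or more
    min_health: Optional[float] = None         # percent, 0-100


@dataclass
class RunResult:
    delivery_rate: float
    dropped: int
    health: float


def run_passes(run, slo):
    """A run passes only if it hits every goal that is set."""
    checks = []
    if slo.min_delivery_rate is not None:
        checks.append(run.delivery_rate >= slo.min_delivery_rate)
    if slo.max_dropped is not None:
        checks.append(run.dropped <= slo.max_dropped)
    if slo.min_health is not None:
        checks.append(run.health >= slo.min_health)
    return all(checks)  # assumption: no goals set counts as a pass


def compliance_percent(runs, slo):
    """Share of runs that met the SLO, as a percentage."""
    if not runs:
        return 0.0
    return 100.0 * sum(run_passes(r, slo) for r in runs) / len(runs)


runs = [
    RunResult(98.0, 2, 91.0),
    RunResult(80.0, 20, 70.0),
    RunResult(96.0, 4, 88.0),
    RunResult(99.0, 1, 95.0),
]
slo = SloTargets(min_delivery_rate=95.0, max_dropped=5)  # Min Health left empty
strip = ["pass" if run_passes(r, slo) else "fail" for r in runs]
```

Here three of the four runs pass, so `compliance_percent(runs, slo)` is 75.0, and `strip` mirrors the green and red blocks on the compliance card.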

If any block has a monthly cost set (Cost section in the block editor), Monte Carlo runs include a cost breakdown:

| Number | What it means |
| --- | --- |
| Total Monthly | Total monthly cost of every block. |
| Cost per Delivered | Total monthly cost ÷ average items delivered across runs. |
| Cost per Dropped | Total monthly cost ÷ average items dropped (only shows if there were drops). |

The cost card appears below the SLO compliance card. It tells you the cost per item that actually made it through. A design that drops a lot costs more per successful delivery.
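A small worked example of the three numbers, assuming per-block monthly costs and per-run delivered and dropped counts are available as plain lists. The function name and the returned keys are illustrative.

```python
from statistics import fmean


def cost_breakdown(block_costs, delivered_per_run, dropped_per_run):
    """Compute the three cost numbers from the table (hypothetical helper)."""
    total = sum(block_costs)
    avg_delivered = fmean(delivered_per_run)
    avg_dropped = fmean(dropped_per_run)
    return {
        "total_monthly": total,
        # None stands in for "not shown" when the divisor would be zero.
        "cost_per_delivered": total / avg_delivered if avg_delivered > 0 else None,
        "cost_per_dropped": total / avg_dropped if avg_dropped > 0 else None,
    }


# Two blocks costing 100 and 50 per month, two Monte Carlo runs.
result = cost_breakdown([100, 50], delivered_per_run=[90, 110], dropped_per_run=[10, 0])
```

With a total of 150 and an average of 100 delivered items, the cost per delivered item is 1.5. The average of 5 dropped items gives a cost per dropped item of 30.0.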

Open the Overview panel from the left toolbar during a run. Three tabs at the top:

| Tab | What it shows |
| --- | --- |
| Overview | Health score, system profile radar (THROUGHPUT / LATENCY / RETENTION / RECOVERY / HEADROOM with traffic chips), performance bars (Utilization / Delivery / Active / Queue Peak), bottleneck callout, recommended fix. |
| Stability | Two sub-tabs: Health (verdict + delivery health, instability bar, system-health-over-time chart, insights) and Timeline (step-by-step table with per-block detail). Pro only. |
| Monte Carlo | Many-run analysis with nudged seeds, SLO compliance, cost breakdown, health distribution histogram, percentile bars, per-run table. Pro only. |

The lists of blocks and lines moved out of Overview to a separate Explorer panel (toolbar → Explorer, or press E). Each panel has a Copy for Docs button that copies exactly what you see.

The stress test controls live in their own floating panel, opened from the Overview panel. It offers one-click scenarios plus manual breakage controls.

| Preset | What it does |
| --- | --- |
| Baseline | Normal traffic (1x), single burst, nothing broken. |
| Peak Traffic | 5x traffic with a steady stream of arrivals. |
| Outage | Kills a random block, normal traffic. |
| Slowdown | 3x traffic with a steady stream, plus +3 steps of slowdown on a random block. |

Click any preset to set everything up at once.

| Control | Range | What it does |
| --- | --- | --- |
| Traffic multiplier | 1-5x | Multiplies the seed count (total = seeds × multiplier). |
| Arrival pattern | Single burst or every 1-5 passes | How traffic arrives over time. |

Set the seed count (1-1000) in the timeline bar at the bottom. Type a number or use the arrows. The traffic multiplier in the stress panel multiplies that number.
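The relationship between the seed count and the traffic multiplier is simple arithmetic. The sketch below uses the ranges and preset multipliers stated in this section; the function name and the dictionary are illustrative assumptions.

```python
# Multipliers taken from the preset descriptions (Outage uses normal traffic, 1x).
PRESET_MULTIPLIERS = {
    "Baseline": 1,
    "Peak Traffic": 5,
    "Outage": 1,
    "Slowdown": 3,
}


def total_items(seed_count, multiplier):
    """total = seeds × multiplier, with the documented ranges enforced."""
    if not 1 <= seed_count <= 1000:
        raise ValueError("seed count must be between 1 and 1000")
    if not 1 <= multiplier <= 5:
        raise ValueError("traffic multiplier must be between 1x and 5x")
    return seed_count * multiplier


peak_total = total_items(200, PRESET_MULTIPLIERS["Peak Traffic"])
```

With a seed count of 200, the Peak Traffic preset puts 1000 items into the run.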

| Control | What it does |
| --- | --- |
| Kill block | Take a block completely offline (drops everything). |
| Slow block | Add extra processing time to a block. |
| Extra delay | 1-5 extra steps of slowdown on the slowed block. |

While a run plays:

  • Queue heat — blocks tint orange to red as their queues fill.
  • Queue badge — a number appears when more than 3 items are waiting.
  • Line thickness — lines get thicker for paths carrying more traffic.
  • Line fill — every line has a colored fill that creeps from start to end as items pass through. Step 0 = empty line. By the end of a run, busy lines are fully filled. Heavy paths fill faster than light ones, so the design “lights up” path by path as you scrub the timeline.
  • Bottleneck alert — the current bottleneck swaps its icon for a red triangle (with a flash animation).
  • Per-step badges — show items in (+), dropped (-), and queued for each step.
  • Backpressure warnings — fire when a queue hits 80% full.
  • Throttle drops — fire when items get dropped because of a rate limit.
  • Dim what’s not running — blocks not involved this run dim to 30%, lines to 15%.

Hover a block during a run to see live numbers:

  • Received — total items that came in across the whole run.
  • Delivered — total items that left across the whole run.
  • Queued — items waiting at this point in the run.
  • In transit — items still traveling on lines toward this block.
  • Filtered — items intentionally dropped by this block (filter mode only, shown in purple).
  • Lost — items dropped because of failures (queue overflow, rate limit, circuit breaker, retries used up).

Numbers only show when they’re not zero.

Behavior events fly off blocks as colored numbers, like damage numbers in a game. Each kind has its own color and lane so a flurry of events reads as a stream, not a wall of text.

| Float | Color | When it fires |
| --- | --- | --- |
| ↓ IN +N | Sky | Items entering the inbox. |
| ↑ OUT +N | Emerald | Items leaving the block. |
| ✗ LOST +N | Red | Queue overflow, killed block, retries used up. Pops with a bubble animation. |
| FLT +N | Violet | Filter-mode drops (intentional, not failures). |
| QUEUE +N / -N | Yellow / Orange / Red | Queue depth changes; color gets hotter as the queue fills. |
| TRIPPED / HALF / CLOSED | Rose / Emerald | Circuit breaker state changes. |
| RETRY xN | Orange | A retry happened. |
| THROTTLED | Yellow | A rate limit dropped this item. |
| BATCH xN | Teal | A batch released N items. |
| TRANSFORMED | Amber | Transform-mode change happened. |
| QUEUE FULL | Red, bubble-pop | Queue overflow alarm. |

Floats stick around for 1.6 seconds (1.9 for big pops) so fast scrubs still catch them. Bigger numbers get bigger fonts: “↑ OUT +1” stays small, “↑ OUT +137” pops loud. Same-frame events stack in the same column instead of jittering, so the eye can follow a stream when you scrub fast.

Matching counters (SEED / OUT / FLT / LOST / QUEUE) live in the bottom strip and stay in sync with the floats.

The Stability tab shows a CI-1T verdict:

| Verdict | Range | What it means |
| --- | --- | --- |
| Stable Behavior | ≤ 10% | Low variation across steps, consistent. |
| Drifting Behavior | ≤ 30% | Some variation, performance changing over time. |
| Variable Behavior | ≤ 55% | High variation, hard to predict. |
| Erratic Behavior | > 55% | Wildly inconsistent, system breaking down. |

Important: the verdict measures consistency, not correctness. A block that fails the same way every time will look “Stable” because it’s predictable. The system overrides Stable when real problems show up:

  • Errors found: Stable becomes Volatile.
  • Drops found: Stable becomes Drift (only real failure drops count, not intentional filter drops).

The recommendations explain when a low instability number is hiding real failures.
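The verdict thresholds and the Stable overrides can be sketched as below. The thresholds and labels are copied from this section; the function names are hypothetical, and letting errors take precedence over drops when both occur is an assumption the text does not settle.

```python
def base_verdict(instability_pct):
    """Map an instability percentage to a verdict using the table's thresholds."""
    if instability_pct <= 10:
        return "Stable Behavior"
    if instability_pct <= 30:
        return "Drifting Behavior"
    if instability_pct <= 55:
        return "Variable Behavior"
    return "Erratic Behavior"


def final_verdict(instability_pct, errors, failure_drops):
    """Apply the Stable overrides.

    `failure_drops` must exclude intentional filter drops.
    """
    verdict = base_verdict(instability_pct)
    if verdict == "Stable Behavior":
        if errors > 0:
            return "Volatile"  # label as written in the text
        if failure_drops > 0:
            return "Drift"     # label as written in the text
    return verdict


print(final_verdict(4, errors=0, failure_drops=0))
print(final_verdict(4, errors=0, failure_drops=12))
print(final_verdict(40, errors=3, failure_drops=0))
```

The second call shows the case the text warns about: a 4% instability number would read as Stable, but real failure drops move the verdict to Drift. The third call is unchanged because overrides only apply to Stable.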

The simulator follows the line direction:

  • Forward lines carry items from start to end (the default).
  • Backward lines carry items from end to start.

For a “send and wait for reply” handshake (HTTP RPC, blocking DB query), use one forward line with the Sync flag in the line editor, not two arrows pointing in opposite directions.
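As a rough illustration of line direction, the sketch below resolves which block an item leaves and which block it enters. The dictionary fields (`start`, `end`, `direction`, `sync`) are hypothetical names, not the simulator's actual data model.

```python
def hop(line):
    """Return (source, destination) for an item traveling on this line."""
    direction = line.get("direction", "forward")  # forward is the default
    if direction == "forward":
        return (line["start"], line["end"])
    if direction == "backward":
        return (line["end"], line["start"])
    raise ValueError(f"unknown direction: {direction!r}")


forward_line = {"start": "api", "end": "db"}
backward_line = {"start": "api", "end": "db", "direction": "backward"}

# Request/reply handshake: one forward line with the Sync flag set,
# rather than two lines pointing in opposite directions.
sync_line = {"start": "api", "end": "db", "direction": "forward", "sync": True}

print(hop(forward_line))
print(hop(backward_line))
```

The forward line moves items from `api` to `db`, and the backward line moves them from `db` to `api`. The sync line still counts as a single forward hop.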