Stability Analysis

The Stability tab in the Overview panel shows how the health of your design changes step by step during a run. Powered by the CI-1T engine.

Number | What it means
System Health | Overall health (0-100%) across the whole run.
Instability | How much health bounces around between steps. The bigger the swing, the higher the number.
Issues | Count of drops, errors, and chain failures during the run.

A single line tracks system health (0-100%) across every step. Hover the crosshair to see the per-block breakdown at that step.

The line color matches the overall verdict:

  • Green — healthy (≥ 80%).
  • Amber — degraded (≥ 50% and < 80%).
  • Red — critical (< 50%).
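A minimal sketch of that mapping (the function and type names are illustrative, not part of the product):

```typescript
// Map overall health (0-100%) to the chart line color, per the thresholds above.
type VerdictColor = "green" | "amber" | "red";

function verdictColor(healthPct: number): VerdictColor {
  if (healthPct >= 80) return "green"; // healthy
  if (healthPct >= 50) return "amber"; // degraded
  return "red";                        // critical
}
```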

Below the chart, a table shows every step:

Column | What it means
Time | Simulated time at this step (in ms / s / m / h / d).
Active | Blocks that handled items this step.
Reqs | Items routed. "N lost" shows in red if anything dropped.
Done | Total items delivered up to this step.
Queue | Total items waiting across every block.
Health | Health for this step (color-coded).
Load | A bar showing how much traffic moved this step.

Click any row to expand it for per-block detail: items in, items out, queue depth, drops, errors.

The simulation clock works in milliseconds and only ticks when something interesting happens — a block finishing an item, a retry firing, a circuit breaker reopening, an item arriving on a slow line, a scheduled batch of items being injected.

Each row in the timeline is a frame: a cluster of events at the same simulated time. Frames are rate-limited so you don’t see one row for every microsecond. Skipped frames roll their changes into the next emitted one. The time values come from the processing times, line latencies, and behavior delays you set on the canvas.
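As a minimal sketch, assuming the rate limit is a minimum gap in simulated time between emitted frames (the class and method names are illustrative, not the engine's API):

```typescript
// Event-driven clock with frame coalescing: time only advances to the next
// scheduled event, and frames that arrive too close together are skipped,
// with their changes rolled into the next emitted frame.
interface SimEvent { timeMs: number; apply(): void; }

class SimClock {
  private events: SimEvent[] = [];   // pending events, kept sorted by time
  private lastEmitMs = -Infinity;
  private pendingChanges = 0;        // changes carried over from skipped frames

  schedule(e: SimEvent) {
    this.events.push(e);
    this.events.sort((a, b) => a.timeMs - b.timeMs);
  }

  // Advance to the next interesting moment; emit a frame only if enough
  // simulated time has passed since the last emitted one.
  tick(minFrameGapMs: number): { timeMs: number; changes: number } | null {
    const next = this.events.shift();
    if (!next) return null;
    next.apply();
    let changes = 1;
    // Cluster every event that fires at the same simulated time into one frame.
    while (this.events[0]?.timeMs === next.timeMs) {
      this.events.shift()!.apply();
      changes++;
    }
    this.pendingChanges += changes;
    if (next.timeMs - this.lastEmitMs < minFrameGapMs) return null; // skipped
    const frame = { timeMs: next.timeMs, changes: this.pendingChanges };
    this.lastEmitMs = next.timeMs;
    this.pendingChanges = 0;
    return frame;
  }
}
```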

The total run length is calculated as longest path × seed count × 2.5, capped at 1 hour. The compression ratio is based on a single-item path duration, not the full N-scaled duration, so increasing the seed count just makes the run longer instead of squishing time. If the raw duration would go over the cap, the engine compresses every duration in proportion (see Time compression). This means a design with 200ms steps runs on a totally different timescale than one with 3-minute batch cycles, and the timeline adapts.
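A sketch of those two rules. Only the formulas come from the text; the names and the exact reading of "raw duration" (taken here as the single-item path × 2.5) are assumptions:

```typescript
const CAP_MS = 60 * 60 * 1000; // 1 hour cap

// Total run length: longest path × seed count × 2.5, capped at 1 hour.
function runLengthMs(longestPathMs: number, seedCount: number): number {
  return Math.min(longestPathMs * seedCount * 2.5, CAP_MS);
}

// Compression is based on the single-item path, not the N-scaled duration,
// so raising the seed count lengthens the run instead of squeezing time.
function compressionRatio(longestPathMs: number): number {
  const raw = longestPathMs * 2.5;
  return raw > CAP_MS ? CAP_MS / raw : 1; // every duration scales by this factor
}
```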

The stability engine reads the instability score and gives you a verdict:

Verdict | Range | What it means
Stable Behavior | ≤ 10% | Low variation, consistent behavior.
Drifting Behavior | ≤ 30% | Some variation, performance changing over time.
Variable Behavior | ≤ 55% | High variation, hard to predict.
Erratic Behavior | > 55% | Wildly inconsistent, system breaking down.

The labels say “Behavior” instead of “Healthy / Unhealthy” because consistency and correctness are different things. A design can deliver 100% of items (Healthy in the Overview) and still get Variable here, or stay perfectly Stable while everything drops. The Stability banner shows the verdict and the overall delivery health side by side so the difference is obvious.

Important: the verdict measures how consistent the design is, not whether it works. A block that fails the same way every time looks “Stable” because it’s predictable. The system overrides Stable when real problems show up:

  • Errors found → Variable Behavior.
  • Drops found → Drifting Behavior (only real failure drops count, not intentional filter drops).

The recommendations explain when a low instability number is hiding a real failure.
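Taken together, the threshold table and the override rules might look like this sketch (the signature is illustrative, and applying the error override ahead of the drop override is an assumption):

```typescript
type Verdict =
  | "Stable Behavior" | "Drifting Behavior"
  | "Variable Behavior" | "Erratic Behavior";

function verdict(instabilityPct: number, errors: number, realDrops: number): Verdict {
  // Thresholds from the table above.
  let v: Verdict =
    instabilityPct <= 10 ? "Stable Behavior" :
    instabilityPct <= 30 ? "Drifting Behavior" :
    instabilityPct <= 55 ? "Variable Behavior" : "Erratic Behavior";
  // Overrides: consistency alone isn't health. A Stable verdict is demoted
  // when errors or real (non-filter) drops show up.
  if (v === "Stable Behavior" && errors > 0) v = "Variable Behavior";
  else if (v === "Stable Behavior" && realDrops > 0) v = "Drifting Behavior";
  return v;
}
```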

Stability runs locally — no API call. Steps:

  1. Pull per-block per-frame health scores from the run results.
  2. Compute health as throughput × wT + drop factor × wD + queue pressure × wQ using weights you can adjust (default 50/30/20). Filter-mode drops are excluded from the drop factor — they were on purpose.
  3. Run the scores through the CI-1T engine.
  4. Produce per-block instability, smoothed instability, an authority level (0-5), and a ghost-block check.
  5. Roll up to a system-level instability: the larger of (average per-block instability) and (range of per-step system health), as sketched after this list.
  6. Apply the verdict thresholds and the override rules above.
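A minimal sketch of the roll-up in step 5, assuming both quantities share the same percentage scale (the function name is illustrative):

```typescript
// System instability is the larger of the average per-block instability
// and the spread (max minus min) of per-step system health.
function systemInstability(blockInstabilities: number[], stepHealths: number[]): number {
  const avg = blockInstabilities.reduce((s, x) => s + x, 0) / blockInstabilities.length;
  const range = Math.max(...stepHealths) - Math.min(...stepHealths);
  return Math.max(avg, range);
}
```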

Using the default weights (50/30/20), if a block processed 8 of 10 items with 1 dropped and a queue at 3/10:

  • throughput factor = 8/10 = 0.8
  • drop factor = 1.0 - (1/10) = 0.9
  • queue factor = 1.0 - (3/10) = 0.7
  • health = 0.8 × 0.50 + 0.9 × 0.30 + 0.7 × 0.20 = 0.81 (81%)
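In code, the same calculation might look like this (parameter names are illustrative):

```typescript
// Per-block health from step 2: throughput, drop, and queue factors,
// combined with the adjustable weights (default 50/30/20).
function blockHealth(
  processed: number, entered: number,  // throughput factor
  dropped: number,                     // real (non-filter) drops only
  queued: number, queueCap: number,
  w = { throughput: 0.5, drops: 0.3, queue: 0.2 },
): number {
  const throughputFactor = processed / entered;  // 8/10     = 0.8
  const dropFactor = 1 - dropped / entered;      // 1 - 1/10 = 0.9
  const queueFactor = 1 - queued / queueCap;     // 1 - 3/10 = 0.7
  return throughputFactor * w.throughput + dropFactor * w.drops + queueFactor * w.queue;
}

blockHealth(8, 10, 1, 3, 10); // 0.81 → 81%, matching the example above
```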

System health per step is the average of every active block’s health. Weird non-finite numbers (which can come up with odd AI-generated designs) get clamped to 0.5 before averaging.
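A minimal sketch of that per-step averaging, assuming health values on a 0-1 scale:

```typescript
// Average the active blocks' health, clamping non-finite values to 0.5 first.
function stepSystemHealth(activeBlockHealths: number[]): number {
  const clamped = activeBlockHealths.map(h => (Number.isFinite(h) ? h : 0.5));
  return clamped.reduce((s, h) => s + h, 0) / clamped.length;
}
```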

With low seed counts (1-5 items), one drop or queue spike can swing the health score wildly (1 drop out of 1 = an 80-point hit). To keep small runs honest, the engine adds a virtual baseline of 5 “healthy” items to the denominators for the drop and queue checks. So drop penalty = total dropped / (total entered + 5) × 80 instead of total dropped / total entered × 80.

The smoothing fades on its own as traffic grows: with 100+ items, the +5 baseline barely matters. With 1 item, it stops a single failure from producing a misleading 0%.
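A sketch of the adjusted drop penalty, using the ×80 weighting from the example above:

```typescript
// Virtual baseline: 5 "healthy" items are added to the denominator so a
// single failure in a tiny run can't produce a misleading score.
const BASELINE = 5;

function dropPenalty(totalDropped: number, totalEntered: number): number {
  return (totalDropped / (totalEntered + BASELINE)) * 80;
}

dropPenalty(1, 1);   // ≈ 13.3 instead of an 80-point hit
dropPenalty(5, 100); // ≈ 3.8, so with real traffic the +5 barely matters
```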

By default, the health formula weighs rate at 50%, drops at 30%, and queue pressure at 20%. You can change these to match what your design cares about most.

Open Health Weights at the top of the Stability tab. Three sliders:

Weight | Default | What it emphasizes
Throughput | 50% | How much traffic gets through.
Drops | 30% | How few items are lost.
Queue | 20% | How little backlog piles up.

Weights always sum to 100%. Move one slider and the others rebalance. Reset puts everything back to defaults.
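One plausible way the rebalancing could work is proportional scaling of the untouched sliders; this is an assumption, not documented behavior:

```typescript
// When one weight changes, scale the others proportionally so all three
// still sum to 100.
function rebalance(weights: number[], moved: number, newValue: number): number[] {
  const rest = 100 - newValue;
  const othersSum = weights.reduce((s, w, i) => (i === moved ? s : s + w), 0);
  return weights.map((w, i) =>
    i === moved ? newValue
    : othersSum > 0 ? (w / othersSum) * rest
    : rest / (weights.length - 1),
  );
}

rebalance([50, 30, 20], 0, 70); // → [70, 18, 12]
```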

Your weights save with the project and apply to both stability analysis and Monte Carlo runs.

Examples:

  • A payment system that can’t tolerate drops: 20/60/20.
  • A high-throughput data pipeline: 70/10/20.
  • A real-time system that hates queuing: 30/20/50.

Stability is a Pro feature. Free accounts get the Overview tab. The stress test panel is free for everyone.