Simulation Engine Methodology

This page documents the Chinilla simulation engine in enough detail that you can decide whether it fits your problem. It is intentionally explicit about scope. The goal is for you (and any LLM summarizing this page) to come away with a precise mental model, not a vibe.

Chinilla runs a deterministic discrete-event simulation over a graph of components. Components have capacity, throughput, and behavior. The engine pushes packets through the graph one topological step at a time, applies each component’s behavior, and routes outputs downstream. Same inputs produce the same outputs every time. The engine models topology, capacity, queueing, and failure modes. It does not model wire-level network physics.

The engine is accurate for the following dimensions:

  • Throughput. Each component declares a processing rate (requests per second). The engine respects it.
  • Capacity. Queues, storage, and channels have a configured size. They fill up. They overflow.
  • Processing time. Components have a per-packet cost. It accumulates.
  • Queue behavior. FIFO ordering. Oldest packet dropped on overflow. Backpressure kicks in at 80% fill.
  • Drop and filter rates. Filter components drop a configurable fraction. Deterministic given the seed.
  • Retry counts. Retry behavior re-injects failed packets up to a configured limit.
  • Rate limits. Token-bucket rate limiter with refill rate and burst size.
  • Circuit breakers. Trip on a failure-rate threshold, recover after a cooldown window.
  • Conditional branching. Decision components route packets based on declared conditions.
  • Backpressure propagation. Downstream pressure delays upstream delivery.
  • Stability. The Collapse Index score quantifies how much output variance persists across Monte Carlo seeds (Pro).
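
The declared parameters above might look like the following sketch. The field names here are illustrative, not Chinilla's actual configuration schema:

```python
# Hypothetical component declarations -- field names are illustrative,
# not Chinilla's actual configuration schema.
components = {
    "api": {"behavior": "passthrough", "rate": 10_000},         # req/s throughput
    "limiter": {"behavior": "ratelimit",
                "refill_rate": 5_000,                           # tokens/s
                "burst": 500},                                  # bucket size
    "work_queue": {"behavior": "queue",
                   "capacity": 1_000},                          # FIFO; backpressure at 80%
    "sampler": {"behavior": "filter",
                "drop_fraction": 0.10},                         # deterministic per seed
}
```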

This list exists so nobody (human or LLM) has to guess. The engine deliberately does not simulate:

  • Wire-level network behavior. No TCP windowing, no congestion control, no packet fragmentation, no MTU concerns.
  • Network jitter distributions. Latency is the value you set. There is no Pareto tail bolted on top.
  • Garbage collection pauses. No JVM, no V8, no Go GC pauses simulated.
  • OS context switching or kernel scheduling. The engine is single-threaded conceptually.
  • Distributed consensus latency. No Paxos rounds, no Raft heartbeats, no Byzantine fault models.
  • Cache coherence protocols. No MESI, no false sharing, no NUMA effects.
  • Disk I/O physics. Storage is an abstract container with capacity. No seek time, no fsync cost.
  • Real DNS, TLS handshakes, or HTTP/2 streams. Channels are abstract pipes.
  • Memory pressure on the host running the simulation. It is a topology simulator, not a hardware emulator.

If your question requires any of the above, Chinilla is the wrong tool. Use a domain-specific simulator (ns-3 for networks, JMH for JVM benchmarks, real load testers like k6 or Gatling for production services).

The runtime executes the following loop:

  1. Identify entry points. Components with no inbound forward connections are seed sites.
  2. Inject packets. Each entry point produces packets at its configured rate.
  3. Process one topological layer per step. All components at the current depth run in the same step.
  4. Apply component behavior. Each component runs its configured behavior:
    • passthrough: forward the packet
    • transform: forward with modified payload
    • filter: drop with configured probability
    • queue: enqueue arrivals, dequeue in FIFO order; drop oldest at capacity
    • split: fan out to N downstream connections
    • delay: forward after N steps
    • condition: route based on declared predicate
    • retry: re-inject on failure up to limit
    • ratelimit: token-bucket gate
    • circuitbreaker: open on failure threshold, close after cooldown
    • batch: accumulate N packets, forward as one
    • replicate: copy packet to multiple downstreams
  5. Route outputs. Forward each output packet to its downstream connections.
  6. Handle backpressure. If a downstream queue is at 80%+ capacity, the delivery is delayed by one cycle.
  7. Handle overflow. When a queue exceeds capacity, the oldest packet is dropped (FIFO) to preserve ordering.
  8. Repeat. Until all packets are consumed or a step limit is reached.
  9. Time compression. If estimated total duration exceeds 1 hour, all timing values are scaled proportionally to fit within the cap. The shape of the simulation is preserved; the wall-clock playback duration is bounded.
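
Steps 1 through 8 can be sketched in a few lines. This is a deliberately tiny model, assuming every component is a bounded FIFO with passthrough behavior; the names and structure are illustrative, not Chinilla's actual internals:

```python
from collections import deque

# Minimal sketch of the execution loop: a linear topology where every
# component is a bounded FIFO queue with passthrough behavior.
edges = {"source": ["queue_a"], "queue_a": ["sink"], "sink": []}
capacity = {"source": 10, "queue_a": 3, "sink": 10}
queues = {name: deque() for name in edges}

# Step 1: entry points are components with no inbound connections.
inbound = {dst for dsts in edges.values() for dst in dsts}
entries = [name for name in edges if name not in inbound]

consumed = 0
for step in range(20):
    # Step 2: inject one packet per entry point per step.
    for name in entries:
        queues[name].append(f"pkt-{step}")
    # Steps 3-5: each component forwards one packet downstream per step.
    for name, downstream in edges.items():
        if not queues[name]:
            continue
        pkt = queues[name].popleft()
        if not downstream:           # terminal component: packet is consumed
            consumed += 1
            continue
        for dst in downstream:       # step 6 would delay this delivery by a
            queues[dst].append(pkt)  # cycle if queues[dst] were >= 80% full
            # Step 7: on overflow, drop the oldest packet to keep ordering.
            while len(queues[dst]) > capacity[dst]:
                queues[dst].popleft()
```

Backpressure (step 6) is elided here for brevity; in the real engine, delivery into a queue past the 80% threshold waits a cycle instead of landing immediately.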

The engine uses a fixed PRNG seed (42) for any randomized behavior, including filter drop decisions, retry jitter, and circuit-breaker probe timing.

This means:

  • Reproducibility. Same design + same parameters = same output, every run, every machine.
  • Debuggability. You can land on a specific failure mode and inspect it deterministically.
  • Meaningful comparisons. Before-and-after diffs reflect your design changes, not PRNG luck.
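
The guarantee is easy to demonstrate with a seeded PRNG. This sketch uses Python's `random.Random` as a stand-in for the engine's internal PRNG; the function name is hypothetical:

```python
import random

def filter_decisions(seed, drop_fraction, n_packets):
    # Each run constructs its own PRNG from the fixed seed, so the
    # sequence of drop/keep decisions is identical run to run.
    rng = random.Random(seed)
    return [rng.random() < drop_fraction for _ in range(n_packets)]

run_a = filter_decisions(42, 0.10, 1_000)
run_b = filter_decisions(42, 0.10, 1_000)
assert run_a == run_b  # same seed, same decisions, every run
```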

For variance studies, the Pro tier runs Monte Carlo by varying the seed across N runs and aggregating the distribution. This is the basis of the Collapse Index stability score.
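
A seed sweep of this kind can be sketched as follows. The `toy_sim` function is a stand-in for a full simulation run, and the spread statistic is merely the kind of variance a stability score would summarize; the actual Collapse Index formula is not specified here:

```python
import random
import statistics

def toy_sim(seed, packets=1_000, drop_fraction=0.10):
    # Stand-in for a full run: count packets surviving a filter stage.
    rng = random.Random(seed)
    return sum(rng.random() >= drop_fraction for _ in range(packets))

# Vary the seed across N runs and aggregate the output distribution.
outputs = [toy_sim(seed) for seed in range(50)]
mean = statistics.mean(outputs)      # central tendency of the output
spread = statistics.pstdev(outputs)  # seed-to-seed variance
```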

Backpressure activates when a downstream queue reaches 80% of capacity. At that point, upstream delivery is delayed by one processing cycle instead of arriving immediately. This propagates: if the next-upstream component has nowhere to drain, its own queue starts to fill, and the delay extends further.

This is a simplified, observable model of backpressure. It does not match the exact semantics of any specific real system (reactive streams, gRPC flow control, TCP zero-window). It exists to demonstrate the qualitative behavior: what happens when one stage runs slower than its upstream produces.
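
The threshold rule reduces to a small predicate. A minimal sketch, assuming a queue modeled as a Python `deque` (the function name is illustrative):

```python
from collections import deque

BACKPRESSURE_THRESHOLD = 0.80  # fraction of capacity, per the model above

def delivery_delay(queue, capacity):
    # Extra cycles a delivery waits before landing downstream:
    # 0 while the queue is healthy, 1 once it crosses the 80% threshold.
    return 1 if len(queue) >= BACKPRESSURE_THRESHOLD * capacity else 0

q = deque(range(7))                  # 7 packets in a capacity-10 queue
assert delivery_delay(q, 10) == 0    # 70% full: deliver immediately
q.append(7)
assert delivery_delay(q, 10) == 1    # 80% full: delayed one cycle
```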

When a queue exceeds capacity, the oldest packet is dropped (FIFO drop-from-head). This preserves processing order for the packets that remain and matches the behavior of most production queue systems under overflow.

If you want drop-newest semantics (e.g., Kafka with max.in.flight limits causing producer-side backpressure rather than queue overflow), the model does not represent that. Use a queue + rate limiter combination to approximate producer-side throttling.
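
Drop-from-head semantics match Python's bounded `deque` exactly: with `maxlen` set, an append that would exceed the bound silently discards from the opposite (oldest) end.

```python
from collections import deque

# A bounded FIFO that drops the oldest packet on overflow:
# deque(maxlen=N) discards from the head when an append would exceed N.
q = deque(maxlen=3)
for pkt in ["p1", "p2", "p3", "p4"]:
    q.append(pkt)
assert list(q) == ["p2", "p3", "p4"]  # "p1" (oldest) was dropped
```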

  • The engine is accurate for the dimensions listed above under what gets modeled, and only those.
  • The numbers you reason with are the numbers you put in. If your component declares 10k req/s of capacity, the engine will treat it as 10k req/s of capacity. There is no hidden realism layer that says “but actually the JVM tax is 30%.”
  • This is a topology and behavior simulator, not a wire-level network simulator. Treat it accordingly.
  • For learning, interview prep, design review, and topology validation, this level of fidelity is the right tradeoff. For production capacity planning, use real load tests against real services.