Synchronization between simulators is one of the core technical pieces in SimBricks. SimBricks combines simulators with seemingly incompatible simulation modes, such as discrete event simulation, cycle-by-cycle circuit simulation, and mathematical models. For ease of integration and performance, each simulator retains its own local clock (see our earlier post). For meaningful performance measurements, we synchronize these clocks across all simulators. SimBricks implements fully accurate (conservative) clock synchronization that is both efficient and scalable, using message passing between connected peer simulators.

Pairwise Local Synchronization is Sufficient

Global synchronization, i.e. aligning all individual simulator clocks tick-by-tick through barriers, trivially ensures correct simulations but incurs high overhead and scales poorly. In SimBricks we implement a more efficient and scalable synchronization approach that does not compromise accuracy: pairwise synchronization between communicating simulators through our message passing channels. SimBricks establishes static point-to-point communication channels between exactly those component simulators whose components communicate directly in the real system.

In SimBricks, message passing is the only communication mechanism and simulators share no other state. As a consequence of this architecture, the necessary and sufficient requirement for accurate simulations is that incoming messages are processed at the correct arrival time. For this, we annotate each message with the simulation time at which the receiver must process it. The SimBricks channels between component simulators deliver messages in order, so message timestamps on a channel increase monotonically. A message with a given timestamp is therefore also an implicit promise that no message with an earlier timestamp will arrive on that channel.
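The timestamp promise can be illustrated with a minimal sketch. The `Message` and `Channel` names below are hypothetical and chosen only for illustration; they are not the SimBricks API, and the real channels are not Python queues. The sketch assumes timestamps in nanoseconds and simply rejects any message that would break the in-order guarantee.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    timestamp_ns: int  # simulation time at which the receiver must process it
    payload: bytes


class Channel:
    """Hypothetical in-order, one-directional channel (illustration only)."""

    def __init__(self):
        self._queue = deque()
        self.last_timestamp_ns = 0  # latest timestamp promised on this channel

    def send(self, msg):
        # In-order delivery implies timestamps on a channel never decrease,
        # so a message from the "past" would break the promise to the receiver.
        if msg.timestamp_ns < self.last_timestamp_ns:
            raise ValueError("timestamps on a channel must not decrease")
        self.last_timestamp_ns = msg.timestamp_ns
        self._queue.append(msg)

    def receive(self):
        # Return the oldest pending message, or None if the channel is empty.
        return self._queue.popleft() if self._queue else None
```

Once a message with timestamp 200 ns has been sent on such a channel, the receiver can rely on never seeing a later message with a timestamp below 200 ns.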

SimBricks simulation with four connected simulators exchanging timestamped messages.

For example, the slowest simulator in the figure above can safely advance to the minimum of the timestamps of the latest incoming messages from all its peers, i.e. min(1000 ns, 400 ns, 200 ns) = 200 ns, since no messages with timestamps for earlier processing can arrive.
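The rule in this example fits in a few lines. The function name and the peer labels below are hypothetical; only the three timestamps come from the example above.

```python
def safe_advance_time(latest_timestamps_ns):
    """Return the time up to which a simulator may safely advance.

    latest_timestamps_ns maps each peer to the timestamp of the latest
    message received from it. No channel can still deliver a message
    with a timestamp below the minimum of these values.
    """
    return min(latest_timestamps_ns.values())


# Peer labels are placeholders; the values are those from the example.
latest = {"peer_1": 1000, "peer_2": 400, "peer_3": 200}
horizon_ns = safe_advance_time(latest)
print(horizon_ns)  # prints 200
```

Note that only directly connected peers appear in the map; a simulator never needs to know the clocks of simulators it has no channel to.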

A corollary of this approach is that non-communicating simulators, e.g. peers A and B in the figure above, need not synchronize directly. Instead, the pairwise synchronization transitively ensures that messages can be processed on time at each boundary. To guarantee liveness even when simulators have no data messages to exchange, we introduce dummy synchronization messages, which our simulators send in the absence of data messages to allow their peers to proceed.

SimBricks leverages link latencies between components in the simulated system for efficient synchronization. In physical systems, links between components have a propagation delay or link latency between when the sender issues an operation and when it arrives. SimBricks synchronization uses this link latency Δ for synchronization slack.

SimBricks exploits link latencies for synchronization slack.

When simulator B sends a message at time 200 ns over a link with a 200 ns latency, the receiving simulator only needs to process this message at time 400 ns. So when simulator A sees this message, it knows it can safely advance its clock to 400 ns without additional synchronization until then. This also implies that simulators only need to send dummy synchronization messages after not sending any messages for the last Δ interval. In the figure above, our slowest simulator can safely advance from 350 to 400 ns, after which it has to wait for further incoming messages. So peer simulator D limits the progress that A can make.
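As a rough sketch of the sender side, assuming a single outgoing channel and nanosecond timestamps: the class and method names below are hypothetical, not SimBricks code. A message is stamped with its send time plus Δ, and a dummy synchronization message becomes necessary only once a full Δ has passed without sending anything.

```python
class SenderClock:
    """Hypothetical sender-side bookkeeping for one channel (illustration only)."""

    def __init__(self, link_latency_ns):
        self.link_latency_ns = link_latency_ns  # the link latency, Δ
        self.last_send_ns = None                # time of the most recent send

    def stamp(self, now_ns):
        # A message sent at now_ns must be processed by the receiver at
        # now_ns + Δ, so that is the timestamp it carries.
        self.last_send_ns = now_ns
        return now_ns + self.link_latency_ns

    def needs_sync(self, now_ns):
        # Assumption: before anything was sent, the peer has no promise yet.
        if self.last_send_ns is None:
            return True
        # The last message lets the peer advance to last_send_ns + Δ; a dummy
        # sync message is only needed once that slack has been used up.
        return now_ns - self.last_send_ns >= self.link_latency_ns


sender = SenderClock(link_latency_ns=200)
print(sender.stamp(200))       # prints 400, as in the example above
print(sender.needs_sync(350))  # prints False
print(sender.needs_sync(400))  # prints True
```

The larger Δ is, the fewer synchronization messages are needed and the further a receiver can run ahead on a single message.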

Synchronized modular simulations can never run faster than the slowest component. We have shown that our SimBricks synchronization mechanism is efficient, adding negligible overhead even when we scale to thousands of simulators. In particular, in combination with our lightweight message passing, SimBricks minimizes overhead at the bottleneck simulator, which never has to wait for messages to continue executing.

Disabling Synchronization for Fast Functional Testing

Synchronization in SimBricks is optional. When accurate performance is not required, e.g. during functional testing, users can easily disable synchronization on a per-component basis. Here, each simulator advances time as quickly as possible and ignores the timestamps on incoming messages, instead handling them as soon as they arrive.
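The difference between the two modes on the receiving side can be sketched as follows. The function and its `synchronized` parameter are hypothetical and only illustrate the behavior described above; they do not reflect how SimBricks exposes this setting.

```python
def ready_messages(pending, now_ns, synchronized):
    """Return the messages a simulator should handle right now.

    pending is a list of (timestamp_ns, payload) tuples in arrival order.
    """
    if not synchronized:
        # Unsynchronized mode: ignore timestamps, handle everything on arrival.
        return list(pending)
    # Synchronized mode: handle only messages whose processing time has come.
    return [msg for msg in pending if msg[0] <= now_ns]


pending = [(400, "read"), (600, "write")]
print(ready_messages(pending, now_ns=450, synchronized=True))   # [(400, 'read')]
print(ready_messages(pending, now_ns=450, synchronized=False))  # both messages
```

In the unsynchronized case the resulting timing no longer reflects the simulated system, which is acceptable for functional testing but not for performance measurements.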

If you have questions or would like to learn more: