I am just finishing off reading the chapters of the Processor and System-on-Chip Simulation book (to which I contributed a chapter), and just read through the chapter about the Tensilica instruction-set simulator (ISS) solutions, written by Grant Martin, Nenad Nedeljkovic and David Heine. They have a slightly different architecture from most other ISS solutions, since they have an inherently variable target in the configurable and extensible Tensilica cores. However, the more interesting part of the chapter was the discussion of system modeling beyond the core – in particular, how they deal with interrupts to the core in the context of a temporally decoupled simulation.
This is a small detail, but one where I have always had the feeling that some fundamental assumption was missing in my discussions with various people from the hardware design community. It always seemed that hardware designers assumed a different basic design, and Grant Martin explained very well just what that design is: they only check for interrupts at the beginning of a time slice. This makes interrupts less precise relative to the code, but it also keeps the core interpreter fairly simple, since all it has to do is churn through instructions.
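A minimal sketch of that design might look like the following (the class and all names are my own, for illustration only, and not anything from the Tensilica simulator): a device can raise an interrupt at any time, but the core only samples the pending flag at the start of each slice.

```python
# Illustrative sketch of slice-boundary interrupt checking.
# All names are hypothetical, not any real simulator's API.

SLICE_LENGTH = 1000  # instructions per time slice (arbitrary choice)

class SliceCore:
    def __init__(self):
        self.pending_irq = False
        self.executed = 0         # total instructions executed
        self.irq_taken_at = None  # instruction count when the IRQ was taken

    def raise_irq(self):
        # A device may set this at any time; the core will not notice
        # until the next slice begins.
        self.pending_irq = True

    def run_slice(self):
        # The one and only interrupt check, at the slice boundary.
        if self.pending_irq:
            self.irq_taken_at = self.executed
            self.pending_irq = False
        for _ in range(SLICE_LENGTH):
            self.executed += 1  # stand-in for fetch/decode/execute
```

An interrupt raised mid-slice is delayed until the next boundary, which is exactly the imprecision relative to the code described above.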
There is another solution, employed in Simics, where the processor can take an interrupt at any point in a time quantum. To do this, the processor needs to be aware of what is going to happen. The essence of the solution is to have devices call the processor and tell it that they intend to interrupt it at some point T in time. The processor simulator then makes sure to stop and give the device model a chance to act at that exact point in time. This solution is easily generalized to cover all timed callbacks needed to drive device work. In effect, a significant part of the responsibility for running the event-driven simulation is moved into the processor core.
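A sketch of this scheme, under my own naming rather than Simics' actual API: devices post timed callbacks to the processor, and the processor runs instruction batches only up to the next posted event, so every callback fires at exactly its requested time.

```python
import heapq

class EventAwareCore:
    """Hypothetical sketch of a core that owns the event queue.
    Devices post timed callbacks; the core stops at each posted time."""

    def __init__(self):
        self.now = 0
        self._events = []  # min-heap of (time, seq, callback)
        self._seq = 0      # tie-breaker so callbacks are never compared

    def post_event(self, time, callback):
        # A device announces: "I intend to act at exactly this time."
        heapq.heappush(self._events, (time, self._seq, callback))
        self._seq += 1

    def run_until(self, end_time):
        while self.now < end_time:
            # Run instructions only up to the next event or quantum end.
            if self._events and self._events[0][0] <= end_time:
                next_stop = self._events[0][0]
            else:
                next_stop = end_time
            self.now = next_stop  # stand-in for executing instructions
            # Let every device due at this time act at its exact time.
            while self._events and self._events[0][0] <= self.now:
                _, _, callback = heapq.heappop(self._events)
                callback(self.now)
```

A device posting an event for time 500 is called back at exactly 500, even when the quantum runs to 1000.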
Making the event queue visible to the processor also gives the processor a chance to hypersimulate, or skip idle time. Since it knows the next point in time that something will happen (either the end of a time quantum or an event posted by a device), it can very easily, safely, and repeatably jump forward in time without any impact on simulation semantics.
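Building on the same idea, a hedged sketch of idle-time skipping (again with names of my own invention): when the core is idle, nothing it does can matter before the next event or the quantum end, so it can jump straight there.

```python
import heapq

class IdleSkippingCore:
    """Illustrative sketch of hypersimulation: an idle core jumps
    directly to the next event or the end of its time quantum."""

    def __init__(self, quantum_end):
        self.now = 0
        self.quantum_end = quantum_end
        self.idle = False
        self._events = []  # min-heap of (time, seq, callback)
        self._seq = 0

    def post_event(self, time, callback):
        heapq.heappush(self._events, (time, self._seq, callback))
        self._seq += 1

    def step(self):
        if self.idle:
            # Nothing can affect an idle core before the next event or
            # the quantum end, so skipping ahead is safe and repeatable.
            next_event = self._events[0][0] if self._events else self.quantum_end
            self.now = min(next_event, self.quantum_end)
        else:
            self.now += 1  # stand-in for executing one instruction
        while self._events and self._events[0][0] <= self.now:
            _, _, callback = heapq.heappop(self._events)
            callback(self)  # e.g. a timer interrupt might clear self.idle
```

The jump changes nothing observable: the core lands at exactly the time the next event would have fired anyway.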
When dealing with multiple processors, this means that each processor will see precise interrupts from the devices that are close to it. Timers and I/O interrupts tend to work closely with a certain processor for a prolonged period of time. Interrupts between processors sometimes suffer a time-quantum delay, but that is no worse than checking all interrupts at time-quantum boundaries.
QEMU uses a solution that is a mix of the two. According to the 2005 USENIX paper, devices do call into the processor to announce an interrupt, but this is handled by returning to the processor main loop "soon". Processors are not responsible for keeping track of interrupts, which makes it imprecise and not very repeatable exactly when interrupts will happen.
Thus, we can see that there are a few different ways to implement interrupts in virtual platforms. Each approach comes from a different tradition and features different trade-offs.
I was a bit surprised by the comment in the Tensilica chapter that only correctly synchronized programs will work on a temporally decoupled simulation. In my experience, temporal decoupling is transparent to software functionality – all software runs. The perceived timing of operations can be different, and some tightly coupled code might behave in suboptimal ways, but it certainly runs and works. It even lets you observe parallel-code errors.
Temporal decoupling is necessary in any fast platform, and its effects on semantics are really minor. With the simple tweak of having the processor know when interrupts might happen, it also barely affects the device-processor interface, maintaining very tight synchronization between processors and the hardware they control.