Being a bit of a computer history buff, I am often struck by how most key concepts and ideas in computer science and computer architecture were invented in some form or another before 1970. And commonly by IBM. This goes for caches, virtual memory, pipelining, out-of-order execution, virtual machines, operating systems, multitasking, byte-code machines, etc. Even so, I recently found a quite extraordinary example of this that surprised me with the range of modern techniques it employed. This is a follow-up to a previous post, now that I have actually digested the paper I mentioned there.
The paper in question was published in 1969 and is titled “A program simulator by partial interpretation”. In the previous post, I took note of its use of direct execution of software plus trapping of privileged instructions, but those were not really the most interesting bits in there.
They lay out in quite simple terms most of the key ideas behind today’s fast virtual platforms. Here are the best parts:
- They note that simulation of a computer is often used to overcome debugging difficulties, in particular repeating failed runs and tracing all that is going on in the target machine.
- They are hunting down race conditions using the simulator.
- They use recorded input and output to drive a deterministic simulation even of workloads involving communication with the external world.
- They simulate multiple processors on top of a single physical processor by means of giving each processor a certain time slice to do its work before switching to the next processor. This is known as temporal decoupling or quantized simulation today, and is a key to the high speed of solutions such as Simics. They note the same tradeoffs as we see today, 40 years later, for doing this: shorter slices more accurately depict the parallelism, but also cost performance.
- The temporally decoupled simulation also includes timers and similar non-CPU-hardware. Just like we do it today for virtual platforms.
- In a temporally decoupled simulation, they optimize the simulation of the IDL (Idle) instruction: when it is encountered, they skip immediately to the end of the time slice. This is what we today call idle-loop optimization or hypersimulation, which is absolutely key to achieving scalable simulation of large multiprocessor and multi-machine setups (since most parts of a system are not usually maximally loaded).
- They are debugging operating systems on the simulator, not just user-level code.
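The record/replay idea in the list above is easy to sketch in modern terms. The following is a minimal, hypothetical illustration (not code from the 1969 paper): external inputs are logged on a first run and fed back verbatim on later runs, so a failed run can be repeated deterministically. The `IOChannel` class and its file format are my own invention for the example.

```python
import json

class IOChannel:
    """Records external input on a live run; replays it on later runs."""

    def __init__(self, mode, log_path, real_input=None):
        self.mode = mode                  # "record" or "replay"
        self.log_path = log_path
        self.real_input = real_input or []   # stand-in for the real world
        if mode == "record":
            self.log = []
        else:
            with open(log_path) as f:
                self.log = json.load(f)  # previously recorded inputs
        self.pos = 0

    def read(self):
        if self.mode == "record":
            value = self.real_input[self.pos]  # talk to the "real world"...
            self.log.append(value)             # ...and remember what it said
        else:
            value = self.log[self.pos]         # replay the recorded value
        self.pos += 1
        return value

    def save(self):
        if self.mode == "record":
            with open(self.log_path, "w") as f:
                json.dump(self.log, f)
```

The simulated software reads all input through such a channel; as long as everything nondeterministic is routed this way, the replay run is a bit-exact repetition of the recorded one, which is exactly what makes chasing a race condition tractable.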
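To make the temporal-decoupling and idle-skip points above concrete, here is a minimal sketch in modern pseudo-implementation style (again my own construction, not the paper's): each simulated processor runs for a fixed quantum of cycles before the round-robin scheduler moves on, and an idle instruction jumps local time straight to the end of the slice. All names (`SimCPU`, `QUANTUM`, the toy instruction set) are hypothetical.

```python
QUANTUM = 1000  # cycles each simulated CPU runs before the next gets a turn

class SimCPU:
    def __init__(self, name, program):
        self.name = name
        self.program = program  # list of (opcode, cycle_cost), looped forever
        self.pc = 0             # count of instructions executed
        self.cycles = 0         # this CPU's local virtual time

    def run_quantum(self, quantum_end):
        # Execute instructions until local time reaches the end of the slice.
        while self.cycles < quantum_end:
            op, cost = self.program[self.pc % len(self.program)]
            self.pc += 1
            if op == "IDL":
                # Idle-loop optimization (hypersimulation): nothing can
                # happen before an external event, so skip straight to the
                # end of the time slice instead of spinning.
                self.cycles = quantum_end
            else:
                self.cycles += cost  # "execute" the instruction

def simulate(cpus, total_cycles):
    # Temporal decoupling: round-robin, one full quantum per CPU per turn.
    global_time = 0
    while global_time < total_cycles:
        global_time += QUANTUM
        for cpu in cpus:
            cpu.run_quantum(global_time)

cpus = [SimCPU("cpu0", [("ADD", 1), ("LOAD", 3)]),
        SimCPU("cpu1", [("IDL", 1)])]  # cpu1 spends all its time idling
simulate(cpus, 10_000)
```

The tradeoff from the list above falls out directly: a smaller `QUANTUM` interleaves the processors more finely (better fidelity to real parallelism) but forces more scheduler switches per simulated second, and the idle CPU costs one instruction dispatch per quantum instead of thousands.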
The computer in question is a Japanese System/360-compatible machine called the HITAC-8400. The work was reported in 1969, but actually carried out in 1967.
There are some differences in scale and kind compared to today’s virtual platforms, but none that detract from the underlying principles. The 1967 system is host-on-host, so it is not the kind of cross-environment that is most common in today’s virtual platforms (Power Arch on x86, ARM on x86, etc.). The IO system is much easier to simulate since it is part of the instruction set of the processor rather than being a set of complex memory-mapped peripherals.
So the 1970 rule strikes again. Not the IBM rule this time, though; this was all done by Hitachi. There are traces of similar work at IBM in other papers, but I have not been able to locate actual copies of any publication.