Unknown to most, IBM has one of the world’s longest records of using virtual platforms for software and firmware development and verification. This work has been ongoing since at least the days of the zSeries 900 machines, through z990, z9, and now z10. An excellent article on this virtual platform and its uses is found in the IBM Journal of Research and Development, number 1, 2009. It is called “IBM System z10 Firmware Simulation”, by Körner et al.
The z10 is the latest generation of the classic IBM mainframe family that started with S/360 back in the 1960s. The simulation needed just to run the firmware of these beasts makes most other virtual platforms, which typically focus on single SoCs for consumer or digital devices, look positively puny. It also shows that virtual platforms as a technology can scale all the way from simple single-core bare-metal machines, useful for developing initial software for simple embedded systems, up to servers and racks containing hundreds of processing units and very diverse hardware.
The terminology used is unusual compared to the EDA/ESL and computer architecture research worlds, but it is good. The key concept is a “VPO”, Virtual Power On. For a computer of this class, doing Power On is a major event, and calling it a “boot” does not really cover its full complexity, which involves many different layers of software running on the same and different computers. The VPO was targeted at four months prior to hardware tape-out, meaning that by that point in time the virtual system would be complete and the firmware complete enough to do a power on.
The simulation system used for the z10 mixes IBM’s in-house CECsim with Virtutech Simics. CECsim executes the code for the central zSeries processors, while Simics simulates the FSP-1 “flexible support processor” based on the Power Architecture. In previous generations of simulation, the FSP code had been host-compiled and run on an x86 workstation instead of running the actual Power Architecture binaries. Running the real binaries brought additional verification value to the software, finding 3 times more bugs than in the previous host-based simulation:
“Because the Simics environment now enables us to execute all FSP code in simulation, a far greater amount of code is simulated. Correspondingly, the number of defects found in simulation also increased, by more than 3× (Table 2).”
The article also describes how hardware-accelerated simulation of the actual VHDL of complex new IO chips was used to validate the bits-and-cycles-level interfacing between code and the logic, as well as to validate the logic design itself.
Overall, the article is one of the best presentations of the comprehensive use of various types of simulation tools and techniques to remove firmware defects as early as possible in a system development project.
For more on the history of this, I refer to a previous blog post here, “The 1970 rules strikes again”, where I described some late 1960s mainframe simulation technology and its uses. Also, browse the back issues of the IBM JRD archives; there are lots of nuggets to be found there!