Schlock Mercenary is a very funny web (and print) comic that I discovered earlier this year via a list at ArsTechnica. In reading up on back issues and back stories, I came across a nice little gem about simulation.
Category Archives: Computer Simulation Technology
Carbon Design Systems keeps putting out interesting blog posts at a good pace. Bill Neifert recently put up a blog post about the various speed/accuracy tradeoffs you can make when building virtual platforms. The main message of the post is that you should use a mix of fast models (TLM + JIT, like the ARM Fast Models) and cycle-accurate generated-from-RTL models (like the models generated by Carbon’s tools). By switching between the levels of abstraction when you need to go fast or go deep, you get something that is pretty much the best of both worlds (I already blogged about changing between abstraction levels before). It makes perfect sense, and I am all with him. There are dragons in the middle land.
However, I do not quite agree with Bill about the absolute uselessness of the intermediate types of models, like SystemC TLM-2.0 AT. Basically, this is what is traditionally called “cycle accurate modeling” (while not derived from RTL).
I recently read the classic book The Soul of a New Machine by Tracy Kidder. Even though it describes the project to build a machine that was launched more than 30 years ago, the story is still fresh and familiar. Corporate intrigue, managing difficult people, clever engineering, high pressure: all are familiar ingredients in computing today, just as they were back then. With my interest in computer history and simulation, I was delighted to actually find a simulator in the story too! It was a cycle-accurate simulator of the design, programmed in 1979.
Carbon Design Systems have been on a veritable blogging spree recently, pushing out a large number of posts around various topics. Maybe a bit brief for my taste in most cases (I have a tendency to throw out 1000+ word pseudo-articles when I take the time to write a blog), but sometimes very interesting nevertheless. I particularly liked a few posts on cache analysis, as they presented some good insight into not-quite-expected processor and cache behaviors.
Carbon Design Systems have been quite busy lately with a flurry of blog posts about various aspects of virtual prototype technology. Mostly good stuff, and I tend to agree with their push that a good approach is to mix fast timing-simplified models with RTL-derived cycle-accurate models. There are exceptions to this, in particular exploratory architecture and design where AT-style models are needed. Recently, they posted about their new Swap ‘n’ Play technology, which is an old proven idea that has now been reimplemented using ARM fast simulators and Carbon-generated ARM processor models.
On my Wind River blog, I just posted a fairly long post about simulation abstraction levels. It was inspired by a cool article in ArsTechnica about Nintendo emulators, and the costs and benefits of being ever more faithful to the hardware.
I just read a quite interesting article by Christian Pinto et al., “GPGPU-Accelerated Parallel and Fast Simulation of Thousand-core Platforms”, published at the CCGRID 2011 conference. It discusses some work in using a GPGPU to run simulations of massively parallel computers, using the parallelism of the GPU to speed up the simulation. Intriguing concept, but the execution is not without its flaws, and it is unclear, at least from the paper, just how well this generalizes, scales, or compares to parallel simulation on a general-purpose multicore machine.
There is a new post at my Wind River blog, about hypersimulation in virtual platforms and how it lets virtual time fly much faster than real time. It was the result of the simple mistake of leaving Simics running in the background as I did other work on my machine.
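To make the idea concrete, here is a minimal sketch of hypersimulation in a toy event-driven simulator. It is only an illustration under stated assumptions, not how Simics implements it: the `run` function, its `idle_step` parameter, and the event names are all hypothetical. When the simulated processor has nothing to do, the simulator can jump virtual time straight to the next pending event instead of stepping through the idle period.

```python
import heapq


def run(events, end_time, idle_step=None):
    """Advance a toy simulation until end_time (hypothetical helper).

    events: list of (time, name) tuples for pending timer events.
    idle_step: if None, hypersimulate (jump straight to the next event);
               otherwise step through idle time in increments of idle_step.
    Returns (fired_event_names, number_of_loop_iterations).
    """
    queue = list(events)
    heapq.heapify(queue)
    now = 0
    fired = []
    iterations = 0
    while queue and queue[0][0] <= end_time:
        iterations += 1
        next_time = queue[0][0]
        if idle_step is None or now + idle_step >= next_time:
            # Nothing happens before next_time, so move virtual time there.
            now = next_time
            _, name = heapq.heappop(queue)
            fired.append(name)
        else:
            # Naive simulation: burn host time stepping through idleness.
            now += idle_step
    return fired, iterations


pending = [(1_000, "tick"), (2_000_000, "timer")]
hyper_fired, hyper_iters = run(pending, end_time=3_000_000)
step_fired, step_iters = run(pending, end_time=3_000_000, idle_step=1_000)
print(hyper_fired, hyper_iters, step_iters)
```

Both runs fire the same events in the same order; the hypersimulating run simply needs far fewer loop iterations, which is why virtual time can run ahead of real time when the target is mostly idle.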
I just read an interview with Steve Furber, the original ARM designer, in the May 2011 issue of the Communications of the ACM. It is a good read about the early days of the home computing revolution in the UK. He not only designed the ARM processor, but also the BBC Micro and some other early machines.
There is a new post at my Wind River blog, about how you build virtual platforms with Simics. The post is more about the methodology than the nature of models, cycle accuracy, endianness, and all the other details of virtual platform modeling. I have written about modeling methodology on this blog too, and in particular I would recommend looking at “Two perspectives on modeling”.
By chance, I got to attend a day at the UPMARC Summer School with a very enjoyable talk by Francesco Zappa Nardelli from INRIA. He described his work (along with others) on understanding and modeling multiprocessor memory models. It is a very complex subject, but he managed to explain it very well.
There is a new post at my Wind River blog, about some computing history. Wind River turns thirty this year, Simics twenty, and simulation for debug (and probably debug in general) turns sixty. Computing has come a long way.
James Aldis of TI has published an article in the EEtimes about how Texas Instruments uses SystemC in the modeling of their OMAP2 platform. SystemC is used for early architecture modeling and performance analysis, but not really for a virtual platform that can actually run software. The article offers a good insight into the virtual platform use of hardware designers, which is significantly different from the virtual platform use of software designers.
I am just finishing off reading the chapters of the Processor and System-on-Chip Simulation book (to which I contributed part of a chapter), and just read through the chapter about the Tensilica instruction-set simulator (ISS) solutions written by Grant Martin, Nenad Nedeljkovic and David Heine. They have a slightly different architecture from most other ISS solutions, since they have an inherently variable target in the configurable and extensible Tensilica cores. However, the more interesting part of the chapter was the discussion on system modeling beyond the core. In particular, how they deal with interrupts to the core in the context of a temporally decoupled simulation.
I previously blogged about the HAVEGE algorithm that is billed as extracting randomness from microarchitectural variations in modern processors. Since it was supposed to rely on hardware timing variations, I wondered what would happen if I ran it on Simics that does not model the processor pipeline, caches, and branch predictor. Wouldn’t that make the randomness of HAVEGE go away?
In the June 2010 issue of Communications of the ACM, as well as the April 2010 edition of the ACM Queue magazine, George Phillips discusses the development of a simulator for the graphics system of the 1977 Tandy-RadioShack TRS-80 home computer. It is a very interesting read for all interested in simulation, as well as a good example of just why this kind of old hardware is much harder to simulate than more recent machines.
Endianness is a topic in computer architecture that can give anyone a headache trying to understand exactly what is happening and why. In the field of computer simulation, it is a pervasive problem that takes some thinking to solve in an efficient, composable, and portable way.
This blog post describes how I am used to working with endianness in virtual platforms, and why this approach makes sense to me. There are other ways of dealing with endianness, with different trade-offs and overriding goals.
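As a minimal illustration of why endianness matters to a simulator (not the approach described in the post, which is not reproduced here), the sketch below reads the same four bytes of simulated memory as a 32-bit value under both byte orders. The `read_register` helper is hypothetical.

```python
def read_register(memory: bytes, offset: int, size: int, byteorder: str) -> int:
    """Interpret `size` bytes of simulated memory as an unsigned integer.

    byteorder is "big" or "little", i.e. the endianness of the simulated
    target, which need not match the endianness of the host machine.
    """
    return int.from_bytes(memory[offset:offset + size], byteorder)


# The same four bytes, as they would sit in target memory.
memory = bytes([0x12, 0x34, 0x56, 0x78])

big = read_register(memory, 0, 4, "big")
little = read_register(memory, 0, 4, "little")
print(hex(big), hex(little))
```

The bytes in memory are identical in both cases; only the interpretation differs, so a model has to state explicitly which byte order each access uses rather than rely on whatever the host happens to do.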
I just found a recent paper on the topic of parallel simulation of computer systems. Christopher Schumacher et al. published an article at CODES+ISSS in October of 2010 titled “parSC: Synchronous Parallel SystemC Simulation on Multicore Architectures”. Essentially, parallel SystemC.
Looks like S4D (and the co-located FDL) is becoming my most regular conference. S4D is a very interactive event. With some 20 to 30 people in the room, many of them also presenting papers at the conference, it turns into a workshop at its best. There was plenty of discussion going on during sessions and the breaks, and I think we all got new insights and ideas.
This post features some additional notes on the topic of transporting bugs with checkpoints, which is the subject of a paper at the S4D 2010 conference.
The idea of transporting bugs with checkpoints is in some ways obvious. If you have a checkpoint of a state, of course you move it. Right? However, changing how you think about reporting bugs takes time. There are also some practical issues to be resolved. The S4D paper goes into some of the aspects of making checkpointing practical.
I have a new post at my Wind River blog, about variability and determinism and how these two concepts interact. In short, even a deterministic simulator can expose great variability in a software workload and target system behavior.
I have a paper about “Transporting Bugs with Checkpoints” to be presented at the S4D (System, Software, SoC and Silicon Debug) conference in Southampton, UK, on September 15 and 16, 2010. The core concept presented is to leverage Simics checkpointing to capture and move a bug from the bug reporter to the responsible developer. It is a fairly simple idea, but getting it to work efficiently does require that some things are done right. See the longer Wind River blog posting about this topic for a few more details.
I have just found what almost has to be the first cycle-accurate computer simulator in history. According to the article “Stretch-ing is Great Exercise — It Gets You in Shape to Win” by Frederick Brooks (the man behind the Mythical Man-Month) in the January-March 2010 issue of IEEE Annals of the History of Computing, IBM created a simulator of the pipeline for the IBM 7030 “Stretch” computer developed from 1956 to 1961 (photo from IBM.com).
One of the many nice effects of the Wind River acquisition of Simics is that I will be blogging as part of the Wind River Blog network. My first post there is up now, and it is a short (at least compared to a textbook, I admit it looks terribly long for a blog post) overview of how Simics works inside.
I think it is important for users of technologically advanced tools to know a bit of how they work. A classic example of this is compilers, where I taught an ESC class almost a decade ago which is my most popular piece of writing to date…
I just read a short paper by Antoine Trouvé and Kazuaki Murakami from the RAPIDO 2010 workshop on “rapid simulation and performance evaluation”. The paper is “FFast: Efficient Application of Compiled Simulation Techniques To A Fast ISS Over a Virtual Machine”. It explores the interesting idea of how an existing virtual machine infrastructure can be used to build a fast instruction-set simulator, and by extension, a full system simulator.
To me, this idea is worth exploring, since using a mature VM like the .net CLR (used in this paper) or a JVM would offer a shortcut to get high-quality code generation for a JIT compiler. It could also offer other benefits, as these environments support many advanced configuration and management features. I have touched on this topic before, in the posts “Dream ESL Language” (VM as the basis for a simulator) and “The JVM as Universal Parallel Glue” (that a common VM can offer huge benefits for an ecosystem).
The discussion on my previous blog post about “the ideal ESL language” made me think some more about the purpose of a hardware modeling or description language. If you look closely, you realize that there are two quite different goals being pursued by the tools and languages discussed there.
On one hand, we have the task of supporting the design of new hardware bits, for the purpose of creating it. On the other hand, we have the task of describing a particular design for the purpose of simulating it. These two are not necessarily the same.
Continuing my series of posts about checkpointing in virtual platforms (see previous posts Simics, Cadence, our FDL paper), I have finally found a decent description of how CoWare does things for SystemC. It is pretty much the same approach as that taken by Cadence, in that it stores a complete process state to disk, and uses special callbacks to handle the connection to open files and similar local resources on a system. The approach is described in a paper called “A Checkpoint/Restore Framework for SystemC-Based Virtual Platforms”, by Stefan Kraemer and Reiner Leupers of RWTH Aachen, and Dietmar Petras and Thomas Philipp of CoWare, published at the International Symposium on System-on-Chip, in Tampere, Finland, in October of 2009.