There is a new post at my Wind River blog, about the new Simics 4.6 release. 4.6 has some serious new goodies in it, including an Eclipse source-code debugger and a way to build blinking lights front panels for boards.
There is a new post at my Wind River blog, about some computing history. Wind River turns thirty this year, Simics twenty, and simulation for debug (and probably debug in general) turns sixty. Computing has come a long way.
There is a new post at my Wind River blog, about how Simics was used to kick-start the development of the 64-bit version of VxWorks. It is an interesting example of how to use a virtual platform as a model of something much simpler and gentler than actual hardware systems.
I have a fairly lengthy new blog post at my Wind River blog. This time, I interview Tennessee Carmel-Veilleux, a Canadian MSc student who has done some very smart things with Simics. His research is in IMA, Integrated Modular Avionics, and how to make that work on multicore.
Last week, I posted a discussion about fault injection in virtual systems, using Basil Fawlty as the perfect example of a fault injection agent.
I have a post at my Wind River blog, about the difference between virtual and physical systems. The key idea is this:
Comparing virtual and physical systems is like comparing apples and apples, not apples and oranges: while apples are mostly interchangeable, there is certainly variation between them. Some apples are best for eating, some are better for making sauce, some are pie material, and some are best for fermenting cider. The type you select depends on what you want to cook. The difference between physical and virtual hardware is similar: they can be used as replacements for each other to some extent, but the connoisseur can make much better use of both by looking at the differences.
Go there now and read it!
This post features some additional notes on the topic of transporting bugs with checkpoints, which is the subject of a paper at the S4D 2010 conference.
The idea of transporting bugs with checkpoints is in some ways obvious. If you have a checkpoint of a state, of course you can move it. Right? However, changing how you think about reporting bugs takes time. There are also some practical issues to be resolved. The S4D paper goes into some of the aspects of making checkpointing practical.
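To make the idea concrete, here is a minimal sketch in Python. It is not Simics code and not the method from the S4D paper: `ToyMachine`, its deliberately planted bug, and the JSON checkpoint format are all assumptions made up for illustration. The point is only that if the complete state (including the state of any pseudo-random source) goes into the checkpoint, whoever receives the checkpoint hits the same bug at the same cycle.

```python
import json
import random


class ToyMachine:
    """A hypothetical, tiny deterministic target system (not Simics)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.cycle = 0
        self.counter = 0

    def step(self):
        # Advance one cycle. All pseudo-randomness comes from self.rng,
        # so execution is fully determined by the saved state.
        self.cycle += 1
        self.counter += self.rng.randint(1, 6)
        if self.counter % 13 == 0:
            # Deliberately planted bug for the demonstration.
            raise RuntimeError(f"bug hit at cycle {self.cycle}")

    def save_checkpoint(self):
        # Serialize the complete state to text that can be shipped around.
        version, internal, gauss = self.rng.getstate()
        return json.dumps({
            "cycle": self.cycle,
            "counter": self.counter,
            "rng": [version, list(internal), gauss],
        })

    @classmethod
    def load_checkpoint(cls, text):
        data = json.loads(text)
        machine = cls()
        machine.cycle = data["cycle"]
        machine.counter = data["counter"]
        version, internal, gauss = data["rng"]
        machine.rng.setstate((version, tuple(internal), gauss))
        return machine


def run_and_capture(machine, max_steps=1000):
    """Run until the bug fires, keeping the most recent checkpoint.

    Returns (checkpoint_text, failing_cycle); failing_cycle is None
    if the bug did not fire within max_steps.
    """
    checkpoint = machine.save_checkpoint()
    for _ in range(max_steps):
        if machine.cycle % 10 == 0:
            # Periodic checkpoint, taken before the next step executes.
            checkpoint = machine.save_checkpoint()
        try:
            machine.step()
        except RuntimeError:
            return checkpoint, machine.cycle
    return checkpoint, None


if __name__ == "__main__":
    # Reporter side: run into the bug, keep the last checkpoint before it.
    report, failing_cycle = run_and_capture(ToyMachine(seed=42))
    # Developer side: load the shipped checkpoint and replay.
    _, replayed_cycle = run_and_capture(ToyMachine.load_checkpoint(report))
    print("reported:", failing_cycle, "replayed:", replayed_cycle)
```

The bug report in this sketch is just the checkpoint text, so the developer needs no description of how to reproduce the problem, only the checkpoint and the same model.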
I recently read a couple of articles on multicore that felt a bit like jumping back in time. In IEEE Spectrum, David Patterson at Berkeley’s parallel computing lab brings up the issue of just how hard it is to program in parallel and that this makes the wholesale move to multicore into something like a “hail Mary pass” for the computer industry. In Computer World, Chris Nicols at NICTA in Australia asks what you will do with a hundred cores – implying that there is not much you can do today. While both articles make some good points, I also think they should be taken with a grain of salt. Things are better than they make them seem.
I have another blog up at Wind River. This one is about multicore bugs that cannot happen on multithreaded systems, and is called True Concurrency is Truly Different (Again). It bounces from a recent interesting Windows security flaw into how Simics works with multicore systems.
One of the many nice effects of the Wind River acquisition of Simics is that I will be blogging as part of the Wind River Blog network. My first post there is up now, and it is a short (at least compared to a textbook, I admit it looks terribly long for a blog post) overview of how Simics works inside.
I think it is important for users of technologically advanced tools to know a bit of how they work. A classic example of this is compilers: I taught an ESC class on them almost a decade ago, and it remains my most popular piece of writing to date…
Continuing my series of posts about checkpointing in virtual platforms (see previous posts Simics, Cadence, our FDL paper), I have finally found a decent description of how CoWare does things for SystemC. It is pretty much the same approach as that taken by Cadence, in that it stores a complete process state to disk, and uses special callbacks to handle the connection to open files and similar local resources on a system. The approach is described in a paper called “A Checkpoint/Restore Framework for SystemC-Based Virtual Platforms”, by Stefan Kraemer and Reiner Leupers of RWTH Aachen, and Dietmar Petras and Thomas Philipp of CoWare, published at the International Symposium on System-on-Chip, in Tampere, Finland, in October of 2009.
Part of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like “let’s show how we unravel a randomly occurring hard-to-reproduce bug using Simics”. This then turns into a hard hunt for a program with a suitable bug in it, rather than work on the Simics tooling to resolve the bug. For some reason, when I most need bugs, I have a hard time getting them into my code.
I guess it is Murphy’s law — if you really set out to want a bug to show up in your code, your code will stubbornly be perfect and refuse to break. If you set out to build a perfect piece of software, it will never work…
So I was actually quite happy a few weeks ago when I started to get random freezes in a test program I wrote to show multicore scaling. It was the perfect bug! It broke some demos that I wanted to have working, but fixing the code to make the other demos work was a very instructive lesson in multicore debug that would make for a nice demo in its own right. In the end, it managed to nicely illustrate some common wisdom about multicore software. It was not a trivial problem, fortunately.
This past Tuesday, I attended the Freescale Design With Freescale (DWF) one-day technology event in Kista, Stockholm. This is a small-scale version of the big Freescale Technology Forum, and featured four tracks of talks running from the morning into the afternoon. All very technical, aimed at design engineers.
Last Friday, I attended this year’s edition of the SiCS Multicore Day. It was smaller in scale than last year, being only a single day rather than two days. The program was very high quality nevertheless, with keynote talks from Hazim Shafi of Microsoft, Richard Kaufmann of HP, and Anders Landin of Sun. Additionally, there was a mid-day three-track session with research and industry talks from the Swedish multicore community.
The paper will explain how we did Simics-style checkpointing in SystemC, using the GreenSocs GreenConfig mechanisms to obtain an approximation for the Simics attribute system.
The past few days here at DAC, a big theme has been transaction level modeling (TLM).
TLM is often considered to be SystemC TLM-2.0. Most of the statements from the EDA companies are to the effect that SystemC TLM-2.0 solves the problem of combining models from different sources. Scratching the surface of this happy picture, it is clear that it is not that simple…
A while ago I wrote a blog post on checkpointing in virtual platforms, and what it is good for. Checkpointing has been a fairly rare feature in virtual platform tools for some reason, but it seems to be picking up some implementations. In particular, I recently noticed that Cadence added it to their simulator solutions a while ago (2007 according to their blog posts). There are two blog posts by George Frazier of Cadence (“saving boot time” and “advanced usage”) that offer some insight into what is going on.
Virtutech and Cadence yesterday announced the integration of Virtutech Simics and Cadence ISX (Incisive Software Extensions), which is essentially a directed random test framework for software. With this tool integration, you can systematically test low-level software and the hardware-software (device driver) interface of a system, leveraging a virtual platform.
In my series (well, I have one previous post about checkpointing) about misunderstood simulation technology items, the turn has come to the most difficult of all it seems: determinism. Determinism is often misunderstood as meaning “unchanging” or “constant” behavior of the simulation. People tend to assume that a deterministic simulation will not reveal errors due to nondeterministic behavior or races in the modeled system, which is a complete misunderstanding. Determinism is a necessary feature of any simulation system that wants to be really helpful to its users, not an evil that hides errors.
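A small Python sketch can illustrate the distinction. Everything in it is an assumption made up for the illustration (two simulated cores, a non-atomic increment split into a load and a store, and a scheduler driven by a seeded pseudo-random generator); it is not how Simics schedules anything. The same seed always gives the same result, so any failure can be replayed exactly, yet the race in the modeled software still shows up as lost updates.

```python
import random


def simulate(seed, increments=100):
    """Two simulated cores each perform `increments` non-atomic increments
    of a shared counter. The interleaving of their load and store steps is
    chosen by a pseudo-random generator seeded with `seed`, so the whole
    run is a pure function of the seed.
    """
    rng = random.Random(seed)
    shared = 0
    # Per-core state: remaining increments and a private register that
    # holds the loaded value between the load and the store.
    cores = [{"left": increments, "reg": None} for _ in range(2)]
    while any(core["left"] for core in cores):
        runnable = [core for core in cores if core["left"]]
        core = rng.choice(runnable)
        if core["reg"] is None:
            core["reg"] = shared          # load step
        else:
            shared = core["reg"] + 1      # store step (may overwrite an update)
            core["reg"] = None
            core["left"] -= 1
    return shared


if __name__ == "__main__":
    # Same seed, same result: the simulation is deterministic.
    print(simulate(7), simulate(7))
    # Different seeds explore different interleavings of the same race.
    print([simulate(seed) for seed in range(5)])
```

A correct program would always end with 200 in the counter. Here the deterministic simulation reveals the race, and because the run depends only on the seed, the failing interleaving can be repeated as many times as needed while debugging.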
One thing that surprises me is how rare the feature of checkpointing or snapshotting is in the land of virtual platforms, despite the obvious benefits of that feature. Indeed, checkpointing was one of the first cool things demonstrated to me when I joined Virtutech back in 2002. Today, I could not ever imagine doing without it. Not having checkpointing is like having a word processor where you only get to save once, when your document is finished, with no option of saving intermediate states.
But not everyone seems to consider this an important feature, judging from its relative rarity in the world of EDA and virtual platforms. Why is this? Let’s look at some possible explanations.
There is an eternal debate going on in virtual platform land over what the right kind of abstraction is for each job. Depending on background, people favor different levels. For those with a hardware background, more detail tends to be the comfort zone, while those with a software background, like myself, are quite comfortable with fewer details. I recently did some experiments on the use of quite low levels of hardware modeling detail for early architecture exploration and system specification.
Frank Schirrmeister of Synopsys recently published a blog post called “Busting Virtual Platform Myths – Part 1: ‘Virtual Platforms are for application software only’”. In it, he refutes a claim by Eve that virtual platforms are for application-level software development only, arguing that they are mostly used for driver and OS development and citing some Synopsys-Virtio Innovator examples of such uses. In his view, most application software is being developed using host-compiled techniques. I want to add to this refutation by pointing out that application software is surely a very important (and large) use case for virtual platforms.
Unknown to most, IBM has one of the world’s longest records of using virtual platforms for software and firmware development and verification. This project has been ongoing since at least the days of the zSeries 900 machines, through z990, z9, and now z10. An excellent article on this virtual platform and its uses is found in the IBM Journal of Research and Development, number 1, 2009. It is called “IBM System z10 Firmware Simulation”, by Körner et al.
A common question from simulation users to us simulation providers is “can I simulate a machine with N cores”, where N is “large”. As if running lots of cores was a simulation system or even a hardware problem. In almost all cases, the problem is with software. Creating an arbitrary configuration in a virtual platform is easy. Creating a software stack for that arbitrary platform is a lot harder, since an SMP software stack needs to understand the cores and how they communicate.
Essentially, what you need is a hardware design that has addressing room for lots of cores, and a software stack that is capable of using lots of cores, even if such configurations do not exist in hardware. Unfortunately, since software is normally written to run on real existing machines, there tend to be unexpected limitations even where scalability should be feasible “in principle”.
Here is the story of how I convinced Linux to handle more than two cores in a virtual MPC8641D machine.
Traditional hardware design languages like Verilog were designed to model naturally concurrent behavior, and they naturally leaned on a concept of threads to express this. This idea of independent threads was brought over into the design of SystemC, where it was manifested as cooperative multitasking using a user-level threading package. While threads might at first glance look “natural” as a modeling paradigm for hardware simulations, it is really not a good choice for high-performance simulation.
In practice, threading as a paradigm for software models of hardware circuits connected to a programmable processor brings more problems than it provides benefits in terms of “natural” modeling.
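As a rough illustration of the alternative, here is a minimal event-driven sketch in Python. The `EventQueue` and `PeriodicTimer` names are hypothetical and not taken from Simics or SystemC. The device model is an ordinary object whose behavior runs as short callbacks posted on a time-ordered queue, so there is no per-device thread, no stack to switch, and no blocking wait.

```python
import heapq
import itertools


class EventQueue:
    """Time-ordered queue of callbacks; the only scheduler in the model."""

    def __init__(self):
        self.now = 0
        self._heap = []
        # Sequence numbers break ties so that events posted for the same
        # time run in posting order, which keeps the run deterministic.
        self._seq = itertools.count()

    def post(self, delay, callback):
        heapq.heappush(self._heap, (self.now + delay, next(self._seq), callback))

    def run_until(self, end_time):
        while self._heap and self._heap[0][0] <= end_time:
            self.now, _, callback = heapq.heappop(self._heap)
            callback()
        self.now = end_time


class PeriodicTimer:
    """Hypothetical timer device that raises an interrupt every `period` cycles."""

    def __init__(self, queue, period, raise_irq):
        self.queue = queue
        self.period = period
        self.raise_irq = raise_irq
        queue.post(period, self._expire)

    def _expire(self):
        # Runs to completion as a plain function call, then re-arms itself.
        self.raise_irq(self.queue.now)
        self.queue.post(self.period, self._expire)


if __name__ == "__main__":
    interrupts = []
    queue = EventQueue()
    PeriodicTimer(queue, period=100, raise_irq=interrupts.append)
    queue.run_until(350)
    print(interrupts)
```

A threaded version of the same timer would sit in a loop that waits for `period` cycles and then raises the interrupt, which looks natural but costs a context switch on every expiry. In the callback style, all device state also lives in plain object fields rather than on a thread stack.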
Now I am home again, and some days have passed since the IP 08 panel discussion about software and hardware virtual platforms. This was an EDA hardware-oriented conference, and thus the audience was quite interested in how to tie things to hardware design. In any case, it was a fun panel, and Pierre Bricaud did a good job of moderating and keeping things interesting.
There are times when working with virtual hardware and not real hardware feels very liberating and efficient (not to mention safe). Bringing up, modifying, and extending operating systems is one obvious such case. Recently, I have been preparing an open-source-based demonstration and education system based on embedded PowerPC machines, and teaching myself how to do Linux device drivers in the process. This really brought out the best in virtual platform use.
As might be evident from this blog, I do have a certain interest in history and the history of computing in particular. One aspect where computing and history collide in a not-so-nice way today is in the archiving of digital data for the long term. I just read an article at Forskning och Framsteg where they discuss some of the issues that use of digital computer systems and digital non-physical documents have on the long-term archival of our intellectual world of today. Basically, digital archives tend to rot in a variety of ways. I think virtual platform technology could play a role in preserving our digital heritage for the future.
Only half an hour ago, the embargoes lifted. Freescale announced its new QorIQ series of multicore (and some single- and dual-core) processors. For the top-end of that line, the P4080, Freescale and Virtutech (where I work, remember) have developed a virtual platform solution to help Freescale customers get to working products faster. The virtual platform is available now, and is already running several operating systems including VxWorks, QNX, and a variety of Linuxes. Apart from the fairly large scale of this SoC, the really new part of the virtual platform is the so-called Hybrid solution, where the fast models are combined with detailed models from Freescale themselves. This creates a cycle-level detailed model with validated timing, “from the source”, but without the performance issues of having to run everything at a great level of detail. Rather, you use the fast model to steer the simulation of a workload to an interesting spot, and then turn up the level of detail then and there. You can also select which components of the chip are actually detailed and which parts are modeled with the fast functional models, avoiding the incredible slow-down of running an entire virtual platform at a great level of detail.
If you happen to be at the FTF in Orlando, do come by and look at the demos!
I have been involved in this work for the past year, and it is wonderful to finally see it coming out and be able to talk about it.
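For readers who want a feel for the hybrid idea, here is a toy Python sketch. It says nothing about how the actual P4080 models work: the `FastModel` and `DetailedModel` classes, the one-cycle-per-instruction rule, and the made-up 10-cycle stall on every fourth instruction are all assumptions for illustration. What it does show is the key structural point that both models operate on the same architectural state, so the level of detail can be changed in the middle of a run.

```python
class FastModel:
    """Functional model: executes instructions with no timing detail.
    Assumption: every instruction is charged exactly one cycle."""

    def step(self, state):
        state["pc"] += 1
        state["cycles"] += 1


class DetailedModel:
    """Stand-in for a cycle-detailed model.
    Assumption: every fourth instruction incurs a 10-cycle stall."""

    def step(self, state):
        state["pc"] += 1
        stall = 10 if state["pc"] % 4 == 0 else 0
        state["cycles"] += 1 + stall


def run_hybrid(total_instructions, detail_start, detail_end):
    """Run fast up to the interesting region, switch to the detailed model
    for instructions in [detail_start, detail_end), then go fast again."""
    state = {"pc": 0, "cycles": 0}  # shared architectural state
    fast, detailed = FastModel(), DetailedModel()
    while state["pc"] < total_instructions:
        in_region = detail_start <= state["pc"] < detail_end
        model = detailed if in_region else fast
        model.step(state)
    return state


if __name__ == "__main__":
    print(run_hybrid(100, detail_start=40, detail_end=60))
```

In this sketch only the 20 instructions inside the region of interest pay for the extra timing work; the other 80 run at the speed of the fast model.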
On Tuesday next week, I will be presenting at the Power Architecture Conference (PAC) in München, Germany. The topics will be multicore debug using virtual hardware, and the new Simics Accelerator technology. Simics Accelerator in particular is a pretty interesting technology.
It is a simple idea, using multiple host cores to run a virtual platform, with fairly amazing results. Now, using a single computer we can run fairly incredible simulations that were the realm of pure fantasy just a few years ago. We also got a nice new little box to demonstrate it with, an eight-core Dell with 16 GB of RAM. With 64-bit Linux, this thing makes my Core 2 Duo laptop with 32-bit Vista look like yesteryear’s snail… And creates that giggling feeling that a really impressive new toy brings up in even the most grown up boys. Booting a 16-machine network of PowerPC boards was so fast it was not demoworthy. I think we have to up the ante to some 100 target machines to make it interesting, and I have no doubt that a combination of multithreading and idle-loop optimization will make that thing be usefully interactive from the target command lines. There are many other wild things we could try on that demo box, once it gets back from the Power Architecture Conferences tour.