inThere will be a session on checkpointing in SystemC at the upcoming SystemC Evolution Day in München on October 18, 2017. I will be presenting it, together with some colleagues from Intel. Checkpointing is a very interesting topic in its own right, and I have written lots about it in the past – both as a technology and it applications.
I have posted a two-part blog post to the public Intel Developer Zone blog, about the “Small Batches Principle” and how simulation helps us achieve it for complicated hardware-software systems. I found the idea of the “small batch” a very good way to frame my thinking about what it is that simulation really brings to system development. The key idea I want to get at is this:
[…] the small batches principle: it is better to do work in small batches than big leaps. Small batches permit us to deliver results faster, with higher quality and less stress.
Integration is hard, that is well-known. For computer chip and system-on-chip design, integration has to be done pre-silicon in order to find integration issues early so that designs can be updated without expensive silicon re-spins. Such integration involves a lot of pieces and many cross-connections, and in order to do integration pre-silicon, we need a virtual platform.
I have just published a piece about the Intel Excite project on my Software Evangelist blog at the Intel Developer Zone. The Excite project is using a combination of of symbolic execution, fuzzing, and concrete testing to find vulnerabilities in UEFI code, in particular in SMM. By combining symbolic and concrete techniques plus fuzzing, Excite achieves better performance and effect than using either technique alone.
Today, when developing embedded control systems, it is standard practice to test control algorithms against some kind of “world model”, “plant model” or “environment simulator”.
Using a simulated control system or a virtual platform running the actual control system code, connected to the world model lets you test the control system in a completely virtual and simulated environment (see for example my Trinity of Simulation blog post from a few years ago). This practice of simulating the environment for a control computer is long-standing in the aerospace field in particular, and I have found that it goes back at least to the Apollo program.
In the early 1990s, “PC graphics” was almost an oxymoron. If you wanted to do real graphics, you bought a “real machine”, most likely a Silicon Graphics workstation. At the PC price-point, fast hardware-accelerated 3D graphics wasn’t doable… until it suddenly was, thanks to Moore’s law. 3dfx was the first company to create fast 3D graphics for PC gamers. To get off the ground and get funded, 3dfx had to prove that their ideas were workable – and that proof came in the shape of a simulator. They used the simulator to demo their ideas, try out different design points, develop software pre-silicon, and validate the silicon once it arrived. Read the full story on my Intel blog, “How Simulation Started a Billion-Dollar Company”, found at the Intel Developer Zone blogs.
I had many interesting conversations at the HiPEAC 2017 conference in Stockholm back in January 2017. One topic that came up several times was the GEM5 research simulator, and some cool tricks implemented in it in order to speed up the execution of computer architecture experiments. Later, I located some research papers explaining the “full speed ahead” technology in more detail. The mix of fast simulation using virtualization and clever tricks with cache warming is worth a blog post.
Doing continuous integration and continuous delivery for embedded systems is not necessarily all that easy. You need to get tools in place to support automatic testing, and free yourself from unneeded hardware dependencies. Based on an inspiring talk by Mike Long from Norway, I have a piece on how simulation helps with embedded CI and CD on my Software Evangelist blog on the Intel Developer Zone.
It is really sad that the European Space Agency (ESA) lost their Schiaparelli lander last year, as we will miss out on a lot of Mars science. From a software engineering and testing perspective, the story of why the landing failed rather instructive, though. It gets down to how software can be written and tested to deal with unexpected inputs in unexpected circumstances. I wrote a piece about this on my blog at the Intel Developer Zone.
A new entry just showed up in the world of reverse debugging – Simulics, from German company Simulics. It does seem like the company and the tool are called the same. Simulics is a rather rare breed, the full-system-simulation-based reverse debugger. We have actually only seen a few these in history, with Simics being the primary example. Most reverse debuggers apply to user-level code and use various forms of OS call intercepts to create a reproducible run. Since the Simulics company clearly comes from the deeply embedded systems field, it makes sense to take the full-system approach since that makes it possible to debug code such as interrupt handlers.
I have also updated my history of commercial reverse debuggers to include Simulics.
Simics and other simulation solutions are a great way to add more variation to your software testing. I have just documented a nice case of this on my blog at the Intel Developer Zone (IDZ), where the Simics team found a bug in how Xen deals with MPX instructions when using VT-x. Thanks to running on Simics, where scenarios not available in current hardware are easy to set up.
Last year (2015), a paper called “Don’t Panic: Reverse Debugging of Kernel Drivers” was presented at the ESEC/FSE (European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering) conference. The paper was written by Pavel Dovgalyuk, Denis Dmitriev, and Vladimir Makarov from the Russian Academy of Sciences. It describes a rather interesting approach to Linux kernel device driver debug, using a deterministic variant of Qemu along with record/replay of hardware interactions. I think this is the first published instance of using reverse debugging in a simulator together with real hardware.
I am going to present a paper about our new SystemC Library in Simics, at the DVCon Europe conference taking place in München next month. The paper is titled “Integrating Different Types of Models into a Complete Virtual System – The Simics SystemC* Library”, and I authored it together with my Intel colleagues Andreas Hedström, Xiuliang Wang, and Håkan Zeffer.
On my Intel Software Evangelist blog, I just published an updated version of an interview I first published back in May, about how to use Intel CoFluent Studio for IoT system architecture. This is a really cool story, about how you can use a calibrated simulation model to architect and analyze software performance before actually writing the code! I
My first blog post as a software evangelist at Intel was published last week. In it, I tell the story of how our development teams used Simics to test the software behavior (UEFI, in particular) when a server is configured with several terabytes of RAM. Without having said server in physical form – just as a simulation. And running that simulation on a small host with just 256 GB of RAM. I.e., the host RAM is just a small fraction of the target. That’s the kind of things that you can do with Simics – the framework has a lot of smarts in it.
It was rather interesting to realize that just the OS page tables for this kind of system occupies gigabytes of RAM… but that just underscores just how gigantic six terabytes of memory really is.
This really happened last week, but I was in the US for the DAC then. I did another blog on Intel Software blog, about a white paper that Wind River put out about how they use Simics internally. The white paper is a really good set of examples of how Simics can be used for software development, test, and debug – regardless of how old or new the hardware is. It also touches my favorite topic of IoT simulation and scaling up – Wind River is actually using Simics for 1000+ node tests of IoT software! Read on at https://blogs.intel.com/evangelists/2016/06/06/wind-river-uses-simics-test-massive-iot-networks/
I love bug and debug stories in general. Bugs are a fun and interesting part of software engineering, programming, and systems development. Stories that involve running Simics on Simics to find bugs are a particular category that is fascinating, as it shows how to apply serious software technology to solve problems related to said serious software technology. On the Intel Software and Services blog, I just posted a story about just that: debugging a Linux kernel bug provoked by Simics, by running Simics on a small network of machines inside of Simics. See https://blogs.intel.com/evangelists/2016/05/30/finding-kernel-1-2-3-bug-running-wind-river-simics-simics/ for the full story.
I have posted my first blog post to the Intel Software and Services blog channel. The Intel Software and Services blog is one channel in the Intel corporate blog you find at https://blogs.intel.com/. Other bloggers on the Software and Services channel write about security, UEFI, cloud, graphics, open source software, and other topics. Intel has a large software development community, and we produce quite a bit of software – and we do write about the innovations that come out of Intel that rely on software.
On my part, I will be posting more materials on simulation at Intel, as part of my role as a simulation evangelist on the Software and Service blog channel.
Even though I am now working for Intel, the nice folks at Wind River have let me do blogging on the Wind River blog as a guest anyway. I first blogged about the fantastic world of simulators that I have found inside Intel, and now a longer technical piece has appeared on a use of Intel CoFluent Studio. I interviewed Sangeeta Ghangam at Intel, who used CoFluent Studio to model the behavior of a complex software load on a gateway, connected to a set of sensor nodes. It is rather different from the very concrete software execution I work on with Simics. Being able to model and estimate the performance and cost and size of systems before you go to the concrete implementation is an important part of software and systems architecture, and CoFluent offers a neat tool for that.
Read the full story on the Wind River blog!
IEEE Micro published an article called “Architectural Simulators Considered Harmful”, by Nowatski et al, in the November-December 2015 issue. It is a harsh critique of how computer architecture research is performed today, and its uninformed overreliance on architectural simulators. I have to say I mostly agree with what they say. The article follows in a good tradition of articles from the University of Wisconsin-Madison of critiquing how computer architecture research is performed, and I definitely applaud this type of critique.
There are still some articles being published that I wrote while at Wind River. The latest is a piece on just what you could do with a lab in cloud – in particular, a lab based on virtual platforms like Simics. Eva Skoglund at Wind River and I wrote this together, and it is a nice high-level summary of why you really need to have a virtual cloud-based lab if you are doing embedded systems development. It is published in the online European magazine Electropages.
A long time ago, when I was a PhD student at Uppsala University, I supervised a few Master’s students at the company CC-Systems, in some topics related to the simulation of real-time distributed computer systems for the purpose of software testing. One of the students, Magnus Nilsson, worked on a concept called “Time-Accurate Simulation”, where we annotated the source code of a program with the time it would take to execute (roughly) on the its eventual hardware platform. It was a workable idea at the time that we used for the simulation of distributed CAN systems. So, I was surprised and intrigued when I saw the same idea pop up in a paper written last year – only taken to the next level (or two) and used for detailed hardware design!
Continue reading “Time-Accurate Simulation Revisited – 15 years later”
Intel is a big Simics user, but most of the time Intel internal use of Simics is kept internal. However, we recently had the chance to interview Karthik Kumar and Thomas Willhalm of Intel about how they used Simics to interact with external companies and improve Intel hardware designs. The interview is found on the Wind River blog network.
It is also my last blog post written at Wind River; since January 18, I am working at Intel. I am working on ways to keep publishing texts about Simics and simulation, but the details are not yet clear.
I just posted a short blog post on the Wind River blog, introducing a video demo of the Web API to Wind River Helix Lab Cloud. In the post and video, I show how the Lab Cloud Web API works. For someone familiar with REST-style APIs, this is probably baby-level, but for me and probably most of our user base, it is something new and a rather interesting style for an API. Thus, doing a video that shows the first few steps of authentication and getting things going seems like a good idea.
On November 3, 2015, I will give a presentation at the Embedded Conference Scandinavia about simulating IoT systems. The conference program can be found at http://www.svenskelektronik.se/ECS/ECS15/Program.html, with my session detailed at http://www.svenskelektronik.se/ECS/ECS15/Program/IoT%20Development.html.
My topic is how to realistically simulate very large IoT networks for software testing and system development. This is a fun field where I have spent significant time recently. Only a couple of weeks ago, I tried my hand simulating a 1000-node network. Which worked! I had 1000 ARM-based nodes running VxWorks running at the same time, inside a single Simics process, and at speeds close to real time! It did use some 55GB of RAM, which I think is a personal record for largest use of system resources from a single process. Still, it only took a dozen processors to do it.
There is a new post at my Wind River blog, about the new Wind River Helix Lab Cloud product that we launched for real last week. The Lab Cloud is a really cool way to expose Simics-style functionality, and my blog goes through some of the more prominent use cases for a simulator in the cloud. There a couple of demo videos linked from the blog, and I have also set up a Youtube playlist collecting the Simics demos and other videos that we have posted there. Quite a set over the past few years, actually!
There is a new post at my Wind River blog, about how I helped a colleague resolve a real problem using the preview version of the new Helix Lab Cloud system. The Lab Cloud right now is basically Simics behind a simplified web user interface, exposing the checkpointing and record-replay facilities in a very clear way. You can also share your sessions for live interactions with other people, which is truly cool.
I just added a new blog post on the Wind River blog, about how you do fault injection with Simics. This blog post covers the new fault injection framework we added in Simics 5, and the interesting things you can do when you add record and replay capabilities to spontaneous interactive work with Simics. There is also a Youtube demo video of the system in action.
While I was on vacation, Wind River published a blog post I wrote about the new multicore accelerator feature of Simics 5. The post has some details on what we did, and some of the things we learnt about simulation performance.
I have read a few news items and blog posts recently about how various types of software running on top of virtual machines and emulators have managed to either break the emulators or at least detect their presence and self-destruct. This is a fascinating topic, as it touches on the deep principles of computing: just because a piece of software can be Turing-equivalent to a piece of hardware does not mean that software that goes looking for the differences won’t find any or won’t be able to behave differently on a simulator and on the real thing.