Injecting faults into systems and subjecting them to extreme situations at or beyond their nominal operating conditions is an important part of making sure they keep working even when things go bad. It was realized very early in the history of Simics (and the same observation had been made by other virtual platform and simulator providers) that using a virtual platform makes it much easier to provide cheap, reliable, and repeatable fault injection for software testing. In an Intel Developer Zone (IDZ) blog post, I describe some early cases of fault injection with Simics.
I work with virtual platforms and software simulation technology, and for us most simulation is done on standard servers, PCs, or latptops. Sometimes we connect up an FPGA prototype or emulator box to run some RTL, or maybe a real-world PCIe device, but most of the time a simulator is just another general-purpose computer with no special distinguishing properties. When connecting to the real world, it is simple standard things like Ethernet, serial ports, or USB.
There are other types of simulators in the world however – still based on computers running software, but running it somehow closer to the real world, and with actual physical connections to real hardware beyond basic Ethernet and USB. I saw a couple of nice examples of this at the Embedded World back in February, where full-height racks were basically “simulators”.
Back in 2004, the startup Virtutech built a crazy demo for the 2004 Embedded Systems Conference (ESC). Back then, ESC was the place to be, and Virtutech was there with a battery of demos to blast the competition. The most interesting demo from a technology perspective was the 1002-machine network, as described in an Intel Developer Zone blog post of mine.
I have just released a new blog post on my Intel Developer Zone blog, about how Simics runs
large huge workloads. I look back at the kinds of workloads that ran on Simics back in 1998 when the product first went commercial, and then look at some current examples running on Simics. This is the first post in a series intended to celebrate 20 years of Simics as a commercial product.
There is a blog post out on my Intel Developer Zone blog where I take a look at the Gartner “Top Ten Tech Trends” for 2018. There are a couple of them where I found clear roles for the kinds of simulation tools we build in my little corner of Intel. In particular, Digital Twins is a concept that is all about simulation. To find the other trend where I found a big role for simulation, read the full blog post.
Over time, Intel and other processor core designers add more and more instructions to the cores in our machines. A good question is how quickly and easily new instructions added to an Instruction-Set Architecture (ISA) actually gets employed by software to improve performance and add new capabilities. Considering that our operating systems and programs are generally backwards-compatible, and run on all kind of hardware, can they actually take advantage of new instructions?
inThere will be a session on checkpointing in SystemC at the upcoming SystemC Evolution Day in München on October 18, 2017. I will be presenting it, together with some colleagues from Intel. Checkpointing is a very interesting topic in its own right, and I have written lots about it in the past – both as a technology and it applications.
I have posted a two-part blog post to the public Intel Developer Zone blog, about the “Small Batches Principle” and how simulation helps us achieve it for complicated hardware-software systems. I found the idea of the “small batch” a very good way to frame my thinking about what it is that simulation really brings to system development. The key idea I want to get at is this:
[…] the small batches principle: it is better to do work in small batches than big leaps. Small batches permit us to deliver results faster, with higher quality and less stress.
Integration is hard, that is well-known. For computer chip and system-on-chip design, integration has to be done pre-silicon in order to find integration issues early so that designs can be updated without expensive silicon re-spins. Such integration involves a lot of pieces and many cross-connections, and in order to do integration pre-silicon, we need a virtual platform.
I have just published a piece about the Intel Excite project on my Software Evangelist blog at the Intel Developer Zone. The Excite project is using a combination of of symbolic execution, fuzzing, and concrete testing to find vulnerabilities in UEFI code, in particular in SMM. By combining symbolic and concrete techniques plus fuzzing, Excite achieves better performance and effect than using either technique alone.
Today, when developing embedded control systems, it is standard practice to test control algorithms against some kind of “world model”, “plant model” or “environment simulator”.
Using a simulated control system or a virtual platform running the actual control system code, connected to the world model lets you test the control system in a completely virtual and simulated environment (see for example my Trinity of Simulation blog post from a few years ago). This practice of simulating the environment for a control computer is long-standing in the aerospace field in particular, and I have found that it goes back at least to the Apollo program.
In the early 1990s, “PC graphics” was almost an oxymoron. If you wanted to do real graphics, you bought a “real machine”, most likely a Silicon Graphics workstation. At the PC price-point, fast hardware-accelerated 3D graphics wasn’t doable… until it suddenly was, thanks to Moore’s law. 3dfx was the first company to create fast 3D graphics for PC gamers. To get off the ground and get funded, 3dfx had to prove that their ideas were workable – and that proof came in the shape of a simulator. They used the simulator to demo their ideas, try out different design points, develop software pre-silicon, and validate the silicon once it arrived. Read the full story on my Intel blog, “How Simulation Started a Billion-Dollar Company”, found at the Intel Developer Zone blogs.
I had many interesting conversations at the HiPEAC 2017 conference in Stockholm back in January 2017. One topic that came up several times was the GEM5 research simulator, and some cool tricks implemented in it in order to speed up the execution of computer architecture experiments. Later, I located some research papers explaining the “full speed ahead” technology in more detail. The mix of fast simulation using virtualization and clever tricks with cache warming is worth a blog post.
Doing continuous integration and continuous delivery for embedded systems is not necessarily all that easy. You need to get tools in place to support automatic testing, and free yourself from unneeded hardware dependencies. Based on an inspiring talk by Mike Long from Norway, I have a piece on how simulation helps with embedded CI and CD on my Software Evangelist blog on the Intel Developer Zone.
It is really sad that the European Space Agency (ESA) lost their Schiaparelli lander last year, as we will miss out on a lot of Mars science. From a software engineering and testing perspective, the story of why the landing failed rather instructive, though. It gets down to how software can be written and tested to deal with unexpected inputs in unexpected circumstances. I wrote a piece about this on my blog at the Intel Developer Zone.
A new entry just showed up in the world of reverse debugging – Simulics, from German company Simulics. It does seem like the company and the tool are called the same. Simulics is a rather rare breed, the full-system-simulation-based reverse debugger. We have actually only seen a few these in history, with Simics being the primary example. Most reverse debuggers apply to user-level code and use various forms of OS call intercepts to create a reproducible run. Since the Simulics company clearly comes from the deeply embedded systems field, it makes sense to take the full-system approach since that makes it possible to debug code such as interrupt handlers.
I have also updated my history of commercial reverse debuggers to include Simulics.
Simics and other simulation solutions are a great way to add more variation to your software testing. I have just documented a nice case of this on my blog at the Intel Developer Zone (IDZ), where the Simics team found a bug in how Xen deals with MPX instructions when using VT-x. Thanks to running on Simics, where scenarios not available in current hardware are easy to set up.
I am going to present a paper about our new SystemC Library in Simics, at the DVCon Europe conference taking place in München next month. The paper is titled “Integrating Different Types of Models into a Complete Virtual System – The Simics SystemC* Library”, and I authored it together with my Intel colleagues Andreas Hedström, Xiuliang Wang, and Håkan Zeffer.
On my Intel Software Evangelist blog, I just published an updated version of an interview I first published back in May, about how to use Intel CoFluent Studio for IoT system architecture. This is a really cool story, about how you can use a calibrated simulation model to architect and analyze software performance before actually writing the code! I
My first blog post as a software evangelist at Intel was published last week. In it, I tell the story of how our development teams used Simics to test the software behavior (UEFI, in particular) when a server is configured with several terabytes of RAM. Without having said server in physical form – just as a simulation. And running that simulation on a small host with just 256 GB of RAM. I.e., the host RAM is just a small fraction of the target. That’s the kind of things that you can do with Simics – the framework has a lot of smarts in it.
It was rather interesting to realize that just the OS page tables for this kind of system occupies gigabytes of RAM… but that just underscores just how gigantic six terabytes of memory really is.
This really happened last week, but I was in the US for the DAC then. I did another blog on Intel Software blog, about a white paper that Wind River put out about how they use Simics internally. The white paper is a really good set of examples of how Simics can be used for software development, test, and debug – regardless of how old or new the hardware is. It also touches my favorite topic of IoT simulation and scaling up – Wind River is actually using Simics for 1000+ node tests of IoT software! Read on at https://blogs.intel.com/evangelists/2016/06/06/wind-river-uses-simics-test-massive-iot-networks/
I love bug and debug stories in general. Bugs are a fun and interesting part of software engineering, programming, and systems development. Stories that involve running Simics on Simics to find bugs are a particular category that is fascinating, as it shows how to apply serious software technology to solve problems related to said serious software technology. On the Intel Software and Services blog, I just posted a story about just that: debugging a Linux kernel bug provoked by Simics, by running Simics on a small network of machines inside of Simics. See https://blogs.intel.com/evangelists/2016/05/30/finding-kernel-1-2-3-bug-running-wind-river-simics-simics/ for the full story.
I have posted my first blog post to the Intel Software and Services blog channel. The Intel Software and Services blog is one channel in the Intel corporate blog you find at https://blogs.intel.com/. Other bloggers on the Software and Services channel write about security, UEFI, cloud, graphics, open source software, and other topics. Intel has a large software development community, and we produce quite a bit of software – and we do write about the innovations that come out of Intel that rely on software.
On my part, I will be posting more materials on simulation at Intel, as part of my role as a simulation evangelist on the Software and Service blog channel.
Even though I am now working for Intel, the nice folks at Wind River have let me do blogging on the Wind River blog as a guest anyway. I first blogged about the fantastic world of simulators that I have found inside Intel, and now a longer technical piece has appeared on a use of Intel CoFluent Studio. I interviewed Sangeeta Ghangam at Intel, who used CoFluent Studio to model the behavior of a complex software load on a gateway, connected to a set of sensor nodes. It is rather different from the very concrete software execution I work on with Simics. Being able to model and estimate the performance and cost and size of systems before you go to the concrete implementation is an important part of software and systems architecture, and CoFluent offers a neat tool for that.
Read the full story on the Wind River blog!
IEEE Micro published an article called “Architectural Simulators Considered Harmful”, by Nowatski et al, in the November-December 2015 issue. It is a harsh critique of how computer architecture research is performed today, and its uninformed overreliance on architectural simulators. I have to say I mostly agree with what they say. The article follows in a good tradition of articles from the University of Wisconsin-Madison of critiquing how computer architecture research is performed, and I definitely applaud this type of critique.
A long time ago, when I was a PhD student at Uppsala University, I supervised a few Master’s students at the company CC-Systems, in some topics related to the simulation of real-time distributed computer systems for the purpose of software testing. One of the students, Magnus Nilsson, worked on a concept called “Time-Accurate Simulation”, where we annotated the source code of a program with the time it would take to execute (roughly) on the its eventual hardware platform. It was a workable idea at the time that we used for the simulation of distributed CAN systems. So, I was surprised and intrigued when I saw the same idea pop up in a paper written last year – only taken to the next level (or two) and used for detailed hardware design!
Continue reading “Time-Accurate Simulation Revisited – 15 years later”
Intel is a big Simics user, but most of the time Intel internal use of Simics is kept internal. However, we recently had the chance to interview Karthik Kumar and Thomas Willhalm of Intel about how they used Simics to interact with external companies and improve Intel hardware designs. The interview is found on the Wind River blog network.
It is also my last blog post written at Wind River; since January 18, I am working at Intel. I am working on ways to keep publishing texts about Simics and simulation, but the details are not yet clear.
On November 3, 2015, I will give a presentation at the Embedded Conference Scandinavia about simulating IoT systems. The conference program can be found at http://www.svenskelektronik.se/ECS/ECS15/Program.html, with my session detailed at http://www.svenskelektronik.se/ECS/ECS15/Program/IoT%20Development.html.
My topic is how to realistically simulate very large IoT networks for software testing and system development. This is a fun field where I have spent significant time recently. Only a couple of weeks ago, I tried my hand simulating a 1000-node network. Which worked! I had 1000 ARM-based nodes running VxWorks running at the same time, inside a single Simics process, and at speeds close to real time! It did use some 55GB of RAM, which I think is a personal record for largest use of system resources from a single process. Still, it only took a dozen processors to do it.
There is a new post at my Wind River blog, about the new Wind River Helix Lab Cloud product that we launched for real last week. The Lab Cloud is a really cool way to expose Simics-style functionality, and my blog goes through some of the more prominent use cases for a simulator in the cloud. There a couple of demo videos linked from the blog, and I have also set up a Youtube playlist collecting the Simics demos and other videos that we have posted there. Quite a set over the past few years, actually!
There is a new post at my Wind River blog, about how I helped a colleague resolve a real problem using the preview version of the new Helix Lab Cloud system. The Lab Cloud right now is basically Simics behind a simplified web user interface, exposing the checkpointing and record-replay facilities in a very clear way. You can also share your sessions for live interactions with other people, which is truly cool.