For Simicstraining and demo purposes, we often use Linux* running on the virtual platforms. In the early days of Simics and embedded Linux, we built our own minimal configurations by hand to run on simple target systems. Most recently, we changed our Linux default demo and training setup to use Clear Linux*. This change showed us just how sophisticated modern Linux setups are – which is good in general, but it also can make some low-level details more complicated.
I heard about the DOOM Game Engine Black Book by Fabien Sanglard on the Hanselminutes podcast episode 666, and immediately ordered the book. It was a riveting read – at least for someone who likes technology and computer history like I do. The book walks through how the ID Software classic DOOM game from 1993 works and the tricks and techniques used to get sufficient performance out of the hardware of 1993. As background to how the software was written, the book contains a great description of the hardware design of IBM-compatible PCs, gaming consoles, and NeXT machines circa 1992-1994. It covers software design, game design, marketing, and how ID Software worked.
The US Defense Advanced Projects Agency (DARPA) ran a “Cyber Grand Challenge” in 2016, where automated cyber-attack and cyber-defense systems were pitted against each other to drive progress in autonomous cyber-security. The competition was run on physical computers (obviously), but Simics was used in a parallel flow to check that competitors’ programs were not trying to undermine the infrastructure of the competition rather than compete fairly inside the rules of the competition.
There are some things in computing that seem “obviously” true and that “clearly” make it “impossible” to do some things. One example of this is the idea that you cannot go backwards in time from the current state of a program or computer system and recover previous state by just reversing the semantics of the instructions in the program. In particular, that you cannot take a core dump from a failed system and reverse-execute back from it – how could you? In order to do reverse debugging and reverse execution, you “have to” record the state at the first point in time that you want to be able to go back to, and then record all changes to the state. Turns out I was wrong, as shown by a recent Usenix OSDI paper.
DVCon Europe took place in München, Bayern, Germany, on October 24 and 25, 2018. Here are some notes from the conference, including both general observations and some details on a few papers that were really quite interesting. This is not intended as an exhaustive replay, just my personal notes on what I found interesting.
This year’s Design and Verification Conference and Exhibition (DVCon Europe) takes place on October 24 and 25 (2018). DVCon Europe has turned into the best conference for virtual platform topics, and this year is no exception. There are some good talks coming!
I have a documented love for keyboards with RGB lighting. So I was rather annoyed when one of my Corsair K65 keyboards suddenly seemed to lose its entire red color component. The keyboard is supposed to default to all-red color scheme with the WASD and arrow keys highlighted in white when no user is logged in to the machine it is connected to – but all of a sudden, it went all dark except a light-blue color on the “white” keys. I guessed it was just a random misconfiguration, but it turned out to be worse than that.
Recently I stumbled on a nice piece called “Always Measure One Level Deeper” by John Ousterhout, from Communications of the ACM, July 2018. https://cacm.acm.org/magazines/2018/7/229031-always-measure-one-level-deeper/fulltext. The article is about performance analysis, and how important it is to not just look at the top-level numbers and easy-to-see aspects of a system, but to also go (at least) one level deeper to measure the components and subsystems that affect the overall system performance.
I recently asked myself the question of just how many Powerpoint files I had on my work laptop and on my home machines. It turns out that it was pretty easy to figure that out using Windows Powershell, with some commands I found on a random website.
I just found a story about Undo software that was rather interesting from a strategic perspective. “Patient capital from CIC gives ‘time travelling’ company Undo space to pivot“, from the BusinessWeekly in the UK. The article describes a change from selling to individual developers, towards selling to enterprises. This is an important business change, but it also marks I think a technology thinking shift: from single-session debug to record-replay.
Injecting faults into systems and subjecting them to extreme situations at or beyond their nominal operating conditions is an important part of making sure they keep working even when things go bad. It was realized very early in the history of Simics (and the same observation had been made by other virtual platform and simulator providers) that using a virtual platform makes it much easier to provide cheap, reliable, and repeatable fault injection for software testing. In an Intel Developer Zone (IDZ) blog post, I describe some early cases of fault injection with Simics.
There have been quite a few security exploits and covert channels based on timing measurements in recent years. Some examples include Spectre and Meltdown, Etienne Martineau’s technique from Def Con 23, the technique by Maurice et al from NDSS 2017, and attacks on crypto algorithms by observing the timing of execution. There are many more examples, and it is clear that measuring time, in particular in order to tell cache hits and cache misses apart, is a very useful primitive. Thus, it seems to make sense to make it harder for software to measure time, by reducing the precision of or adding jitter to timing sources. But it seems such attempts are rather useless in practice.
The introduction of non-volatile memory that is accessed and addressed like traditional RAM instead of using a special interface has some rather interesting effects on software. It blurs the traditional line between persistent long-term mass storage and volatile memory. On the surface, it sounds pretty simple: you can keep things living in RAM-like memory across reboots and shutdowns of a system. Suddenly, there is no need to reload things into RAM for execution following a reboot. Every piece of data and code can be kept immediately accessible in the memory that the processor uses. A computer could in principle just get rid of the whole disk/memory split and just get a single huge magic pool of storage that makes life easier. No file system, no complications, easy programmer life. Or is it that simple?
Over time, Intel and other processor core designers add more and more instructions to the cores in our machines. A good question is how quickly and easily new instructions added to an Instruction-Set Architecture (ISA) actually gets employed by software to improve performance and add new capabilities. Considering that our operating systems and programs are generally backwards-compatible, and run on all kind of hardware, can they actually take advantage of new instructions?
A blog post from Undo Software informed me that Microsoft has rather quietly released a reverse debugger tool for Windows programs – WinDbg with Time Travel Debug. It is available in the latest preview of WinDbg, as available through the Windows Store, for the most recent Windows 10 versions (Anniversary update or later). According to a CPPcon talk about the tool (Youtube recording of the talk) the technology has a decade-long history internally at Microsoft, but is only now being released to the public after a few years of development. So it is a new old thing 🙂
Intel Software Guard Extensions (SGX) is a pretty cool piece of technology that aims to make it possible for user programs to hide secrets from other user programs and the operating system itself. It establishes enclaves in the system that hides the data being processed and the code processing it from all other software. The original application for SGX was to support client-machine features like DRM, to create a safe space on a client that a server can trust. Recently, the people behind the Signal messaging system have provided a really interesting example of an application that makes use of the of SGX “in reverse”, to make it possible for a client to trust a server.
Integration is hard, that is well-known. For computer chip and system-on-chip design, integration has to be done pre-silicon in order to find integration issues early so that designs can be updated without expensive silicon re-spins. Such integration involves a lot of pieces and many cross-connections, and in order to do integration pre-silicon, we need a virtual platform.
IEEE Spectrum ran a short interview with Thomas Knoll, the creator of Photoshop, who made a very interesting point about the move to subscription-based software rather than one-time buys plus upgrades. His point is that if you are building based sold using the “upgrade model”, developers have to create features that cause users to upgrade. In his opinion, that means you have to focus on flashy features that demo well and catch people’s attention – but that likely do not actually help users in the end.
I have just published a piece about the Intel Excite project on my Software Evangelist blog at the Intel Developer Zone. The Excite project is using a combination of of symbolic execution, fuzzing, and concrete testing to find vulnerabilities in UEFI code, in particular in SMM. By combining symbolic and concrete techniques plus fuzzing, Excite achieves better performance and effect than using either technique alone.
Human Resource Machine, from the Tomorrow Corporation, is a puzzle game that basically boils down to assembly-language programming. It has a very charming graphical style and plenty of entertaining sideshows, hiding the rather dry core game in a way that really works! It is really great fun and challenging – even for myself who has a reasonable amount of experience coding at this low level. It feels a bit like coding in the 1980s, except that all instructions take the same amount of time in the game.
Today, when developing embedded control systems, it is standard practice to test control algorithms against some kind of “world model”, “plant model” or “environment simulator”.
Using a simulated control system or a virtual platform running the actual control system code, connected to the world model lets you test the control system in a completely virtual and simulated environment (see for example my Trinity of Simulation blog post from a few years ago). This practice of simulating the environment for a control computer is long-standing in the aerospace field in particular, and I have found that it goes back at least to the Apollo program.
In the early 1990s, “PC graphics” was almost an oxymoron. If you wanted to do real graphics, you bought a “real machine”, most likely a Silicon Graphics workstation. At the PC price-point, fast hardware-accelerated 3D graphics wasn’t doable… until it suddenly was, thanks to Moore’s law. 3dfx was the first company to create fast 3D graphics for PC gamers. To get off the ground and get funded, 3dfx had to prove that their ideas were workable – and that proof came in the shape of a simulator. They used the simulator to demo their ideas, try out different design points, develop software pre-silicon, and validate the silicon once it arrived. Read the full story on my Intel blog, “How Simulation Started a Billion-Dollar Company”, found at the Intel Developer Zone blogs.
A new entry just showed up in the world of reverse debugging – Simulics, from German company Simulics. It does seem like the company and the tool are called the same. Simulics is a rather rare breed, the full-system-simulation-based reverse debugger. We have actually only seen a few these in history, with Simics being the primary example. Most reverse debuggers apply to user-level code and use various forms of OS call intercepts to create a reproducible run. Since the Simulics company clearly comes from the deeply embedded systems field, it makes sense to take the full-system approach since that makes it possible to debug code such as interrupt handlers.
Simics and other simulation solutions are a great way to add more variation to your software testing. I have just documented a nice case of this on my blog at the Intel Developer Zone (IDZ), where the Simics team found a bug in how Xen deals with MPX instructions when using VT-x. Thanks to running on Simics, where scenarios not available in current hardware are easy to set up.
My first blog post as a software evangelist at Intel was published last week. In it, I tell the story of how our development teams used Simics to test the software behavior (UEFI, in particular) when a server is configured with several terabytes of RAM. Without having said server in physical form – just as a simulation. And running that simulation on a small host with just 256 GB of RAM. I.e., the host RAM is just a small fraction of the target. That’s the kind of things that you can do with Simics – the framework has a lot of smarts in it.
It was rather interesting to realize that just the OS page tables for this kind of system occupies gigabytes of RAM… but that just underscores just how gigantic six terabytes of memory really is.