I just added a new blog post on the Wind River blog, about how you do fault injection with Simics. This blog post covers the new fault injection framework we added in Simics 5, and the interesting things you can do when you add record and replay capabilities to spontaneous interactive work with Simics. There is also a Youtube demo video of the system in action.
I have read a few news items and blog posts recently about how various types of software running on top of virtual machines and emulators have managed to either break the emulators or at least detect their presence and self-destruct. This is a fascinating topic, as it touches on the deep principles of computing: just because a piece of software can be Turing-equivalent to a piece of hardware does not mean that software that goes looking for the differences won’t find any or won’t be able to behave differently on a simulator and on the real thing.
The Security Now Podcast number 497 dealt with the topic of Vehicle Hacking. It was fairly interesting, if a bit too light on the really interesting thing which is what actually went on in the vechicle hack that was apparently demonstrated on US national television at some point earlier this year (I guess this CBS News transcript fits the description). It was still good to hear the guys from the Galois consulting firm (Lee Pike and Pat Hickey) talking about what they did. Sobering to realize just how little even a smart guy like Steve Gibson really knows about embedded systems and the reality of their programming. Embedded software really is pretty invisible in both a good way and a bad way.
I have been thinking about the role and prestige of testing for the past several years. Many things I have read and things companies have done indicate that “testing” is something that is considered a bit passe and old-school. Testers are dead weight that get into the way of releases, and they are unproductive barnacles that slow development down. Testers can all be replaced by automatic testing put in place by brilliant developers. The creative developer types are the guys with the status anyway. I might be exaggerating, but there is an issue here. I think we need to be acknowledge that testers are a critical part of the software quality puzzle, and that testing is not just something developers can do with one hand tied behind their back.
In a dusty bookshelf at work I found an ancient tome of wisdom, long abandoned by its previous owner. I was pointed to it by a fellow explorer of the dark arts of computer system design as something that you really should read. The book was “Fortress Rochester”, written by Frank Soltis, and published in 2001.
Last year, I concluded a programming project at work that clearly demonstrated that real programming tasks tend to involve multiple languages. I once made a remark to a journalist that there is a zoo of languages inside all real products, and my little project provided a very clear example of this. The project, as discussed previously, was to build an automated integration between a simple Simics target system and the Simulink processor-in-the-loop code testing system. In the course of this project, I used six or seven languages (depending on how you count), three C compilers, and three tools. Eight different compilers were involved in total.
I am going to be speaking at the 2015 Embedded World Conference in Nürnberg, Germany. My talk is about Continuous Integration for embedded systems, and in particular how to enable it using simulation technology such as Simics.
My talk is at 16.00 to 16.30, in session 03/II, Software Quality I – Design & Verification Methods.
There is a new post at my Wind River blog, about how you can use Simics to enable the automatic testing of pretty much any computer system (as long as we can put it inside a simulator). This is a natural follow-up to the earlier post about continuous integration with Simics and Simics-Simulink integrations — automated test runs is a mandatory and necessary part of all modern software development.
I just found and read an old text in the computer systems field, “Why Do Computers Fail and What Can Be Done About It?” , written by Jim Gray at Tandem Computers in 1985. It is a really nice overview of the issues that Tandem had encountered in their customer based, back in the early 1980s. The report is really a classic in the computer systems field, but I did not read it until now. Tandem was an early manufacturer of explicitly fault tolerant and highly reliable and available computers. In this technical report Jim Gray describes the basic principles of fault tolerance, and what kinds of faults happen in the field and that need to be tolerated.
At the ISCA 2014 conference (the biggest event in computer architecture research), a group of researchers from Microsoft Research presented a paper on their Catapult system. The full title of the paper is “A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services“, and it is about using FPGAs to accelerate search engine queries at datacenter scale. It has 23 authors, which is probably the most I have ever seen on an interesting paper. There are many things to be learnt from and discussed about this paper, and here are my thoughts on it.
The Mill is a new general-purpose high-performance processor design from out-of-the-box computing (http://ootbcomp.com/). They claim to beat typical high-end out-of-order (OOO) designs like the Intel Haswell generation by crazy factors, such as being 2.3x faster while using 2.3x less power compared to a Haswell. All the while costing less. Ignoring the cost aspect, the power and performance numbers are truly impressive – especially for general code. How can they do something so much better than what we have today? For general code? That requires some serious innovation. With that perspective, I ask myself where the Mill is really significantly different from what we have seen before.
I recently made my first acquaintance with Windows 8, having bought a new Sony ultrabook for the family. Including a touch screen. The combination of the touch-based interface and the phone-like look of Windows 8 even on a PC has led me to think about the (unconscious) expectations that I have come to have on how systems behave and how services are accessed, from how smart phones and tablets have come to work in the past few years. In particular, where are web-based services going?
Apple just released their new iPhone 5s, where the biggest news is really the 64-bit processor core inside the new A7 SoC. Sixty four bits in a phone is a first, and it immediately raises the old question of just what 64 bits gives you. We saw this when AMD launched the Opteron and 64-bit x86 PC computing back in the early 2000’s, and in a less public market the same question was asked as 64-bit MIPS took huge chunks out of the networking processor market in the mid-2000s. It was never questioned in servers, however.
I have a silly demo program that I have been using for a few years to demonstrate the Simics Analyzer ability to track software programs as they are executing and plot which threads run where and when. This demo involves using that plot window to virtually draw text, in a way akin to how I used to make my old ZX Spectrum “draw” things in the border. But when I brought it up in a new setting it failed to run properly and actually starting hanging on me. Strange, but also quite funny when I realized that I had originally foreseen this very problem and consciously decided not to put in a fix for it… which now came back to bite me in a pretty spectacular way. But at least I did get an interesting bug to write about.
Debugging – the 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems by David Agans was published in 2002, based on several decades of practical experience in debugging embedded systems. Compared to the other debugging book I read this Summer, Debugging is much more a book for the active professional currently working on embedded products. It is more of a guidebook for the practitioner than a textbook for students that need to learn the basics.
Via the EETimes, I found a very interesting talk by Bristol professor David May, presented at the 4th Annual Bristol Multicore Challenge, in June of 2013. The talk can be found as a Youtube movie here, and the slides are available here. The EETimes focused on the idea to cut down ARM to be really RISC, but I think the more interesting part is Professor May’s observations on multicore computing in general, and the case for and against heterogeneity in (parallel) computers.
There is a new post at my Wind River blog, about how Simics sessions are started and the mechanics of system setups in Simics. It also has a link to a Youtube video demonstrating various ways of starting Simics simulation sessions.
Nudge theory is an intriguing fairly new take on how to manage societies and modulate the behavior of people to make them behave in a “better” way. My layman’s interpretation of the idea is that instead of threating punishment and setting up rules, you try to encourage desired behavior with rewards and discourage undesired behavior by making it require an effort. When working with the new Test Runner view in the Simics 4.8 GUI, I realized that we had invented exactly such a nudge.
Last year, I did a Simics webinar which included a two-part demo of how to use Simics to debug an endianness bug in a networked system as it migrates from big-endian to a little-endian system. Along the way, I also showed off various Simics features like reverse execution and checkpointing and scripted execution.
The demo is now online at the Wind River Youtube channel, and the setup is explained in a blog post at the Wind River company blog which is worth reading before watching the video.
LEGOs seem to be a favorite analogy for people bemoaning the state of software development today. “If only it would be as simple as putting Legos together” is a common enough statement, along with various proposals to make software that is Lego-like. Sometime, I wonder if people making these statements have actually tried to build anything non-trivial from Lego recently. Here, I will look a bit closer at the Lego-programming analogy. There is indeed quite a lot to it, but it is not all about child-level simplicity. I think there are some good lessons that can be learnt from analogizing Lego and programming.
There is a new post at my Wind River blog, telling the story of how some of the Simics developers used Simics itself to debug an intermittent Simics program crash caused by a timing-sensitive race condition.
Running Simics on itself is pretty cool, and shows the power of the simulator and its applicability even to really complex software.
Logging as as debug method is not new, and I have been writing about it to and from over the past few years myself. At the S4D conference, tracing and logging keeps coming up as a topic (see my reports from 2009, 2010 and 2012 ). I recently found an interesting piece on logging from the IT world in the ACM Queue (“Advances and Challenges in Log Analysis“, Adam Oliner, ACM Queue December 2011). Here, I want to address some of my current thoughts on the topic.
After some discussions at the S4D conference last week, I have some additional updates to the history and technologies of reverse execution. I have found one new commercial product at a much earlier point in time, and an interesting note on memory consistency.
There is a new blog post on my Wind River blog, about the Landslide system from CMU. It is a pretty impressive Master’s Thesis project that used the control that Simics has over interrupts to systematically try different OS kernel thread interleavings to find concurrency bugs. The blog is an interview with Ben Blum, the student who did the work. Ben is now a PhD student, and I bet that he will continue to generate cool stuff in the future.
The 2012 edition of the SiCS Multicore Day was fun, like they have always been in the past. I missed it in 2010 and 2011, but could make it back this year. It was interesting to see that the points where keynote speakers disagreed was similar to previous years, albeit with some new twists. There was also a trend in architecture, moving crypto operations into the core processor ISA, that indicates another angle on the hardware accelerator space.
I am going to the S4D conference for the third year in a row. This year, I have a paper on reverse debugging, reviewing the technology, products, and history of the idea. I will probably write a longer blog post after the conference, interesting things tend to come up.
In the past year, I have started listening to various podcast from the “Skeptic” community. Although much of the discussion tends to center on medicine (because of the sadly enormous market for quackery) and natural science (because the sad fight over evolution), it has made me think and reflect more about the nature of science and publishing. Indeed, it would have been great if this kind of material would have been easily found back when I was doing my PhD.
I am going to be talking about how to transport bugs with virtual platform checkpoints, in the Software Tools track at the Embedded Conference Scandinavia, on October 3, 2012, in Stockholm (Sweden). The ECS is a nice event, and there are several tracks to choose from both on October 2 and October 3. In addition to the tracks, Jan Bosch from Chalmers is going to present a keynote that I am sure will be very entertaining (see my notes from a presentation he did in Göteborg last year).
I am scheduled to talk at the SiCS multicore day 2012 (like I did back in 2009 and 2008). The event takes palce on September 13, at SiCS in Kista. My topic will be on System-Level Debug – how we can make debuggers that work for big systems.
This year, the multicore day is part of a bigger Software Week event, which also covers cloud and internet of things. See you there!
I am not in the computer security business really, but I find the topic very interesting. The recent wide coverage and analysis of the Flame malware has been fascinating to follow. It is incredibly scary to see a “well-resourced (probably Western) nation-state” develop this kind of spyware, following on the confirmation that Stuxnet was made in the US (and Israel).
In any case, regardless of the resources behind the creation of such malware, one wonders if it could not be a bit more contained with a different way to structure our operating systems. In particular, Flame’s use of microphones, webcams, bluetooth, and screenshots to spy on users should be containable. Basically, wouldn’t cell-phone style sandboxing and capabilities settings make sense for a desktop OS too?