Debugging – the 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems by David Agans was published in 2002, based on several decades of practical experience in debugging embedded systems. Compared to the other debugging book I read this Summer, Debugging is much more a book for the active professional currently working on embedded products. It is more of a guidebook for the practitioner than a textbook for students that need to learn the basics.
Via the EETimes, I found a very interesting talk by Bristol professor David May, presented at the 4th Annual Bristol Multicore Challenge, in June of 2013. The talk can be found as a Youtube movie here, and the slides are available here. The EETimes focused on the idea to cut down ARM to be really RISC, but I think the more interesting part is Professor May’s observations on multicore computing in general, and the case for and against heterogeneity in (parallel) computers.
There is a new post at my Wind River blog, about how Simics sessions are started and the mechanics of system setups in Simics. It also has a link to a Youtube video demonstrating various ways of starting Simics simulation sessions.
Nudge theory is an intriguing fairly new take on how to manage societies and modulate the behavior of people to make them behave in a “better” way. My layman’s interpretation of the idea is that instead of threating punishment and setting up rules, you try to encourage desired behavior with rewards and discourage undesired behavior by making it require an effort. When working with the new Test Runner view in the Simics 4.8 GUI, I realized that we had invented exactly such a nudge.
Last year, I did a Simics webinar which included a two-part demo of how to use Simics to debug an endianness bug in a networked system as it migrates from big-endian to a little-endian system. Along the way, I also showed off various Simics features like reverse execution and checkpointing and scripted execution.
The demo is now online at the Wind River Youtube channel, and the setup is explained in a blog post at the Wind River company blog which is worth reading before watching the video.
LEGOs seem to be a favorite analogy for people bemoaning the state of software development today. “If only it would be as simple as putting Legos together” is a common enough statement, along with various proposals to make software that is Lego-like. Sometime, I wonder if people making these statements have actually tried to build anything non-trivial from Lego recently. Here, I will look a bit closer at the Lego-programming analogy. There is indeed quite a lot to it, but it is not all about child-level simplicity. I think there are some good lessons that can be learnt from analogizing Lego and programming.
There is a new post at my Wind River blog, telling the story of how some of the Simics developers used Simics itself to debug an intermittent Simics program crash caused by a timing-sensitive race condition.
Running Simics on itself is pretty cool, and shows the power of the simulator and its applicability even to really complex software.
Logging as as debug method is not new, and I have been writing about it to and from over the past few years myself. At the S4D conference, tracing and logging keeps coming up as a topic (see my reports from 2009, 2010 and 2012 ). I recently found an interesting piece on logging from the IT world in the ACM Queue (“Advances and Challenges in Log Analysis“, Adam Oliner, ACM Queue December 2011). Here, I want to address some of my current thoughts on the topic.
After some discussions at the S4D conference last week, I have some additional updates to the history and technologies of reverse execution. I have found one new commercial product at a much earlier point in time, and an interesting note on memory consistency.
There is a new blog post on my Wind River blog, about the Landslide system from CMU. It is a pretty impressive Master’s Thesis project that used the control that Simics has over interrupts to systematically try different OS kernel thread interleavings to find concurrency bugs. The blog is an interview with Ben Blum, the student who did the work. Ben is now a PhD student, and I bet that he will continue to generate cool stuff in the future.
The 2012 edition of the SiCS Multicore Day was fun, like they have always been in the past. I missed it in 2010 and 2011, but could make it back this year. It was interesting to see that the points where keynote speakers disagreed was similar to previous years, albeit with some new twists. There was also a trend in architecture, moving crypto operations into the core processor ISA, that indicates another angle on the hardware accelerator space.
I am going to the S4D conference for the third year in a row. This year, I have a paper on reverse debugging, reviewing the technology, products, and history of the idea. I will probably write a longer blog post after the conference, interesting things tend to come up.
In the past year, I have started listening to various podcast from the “Skeptic” community. Although much of the discussion tends to center on medicine (because of the sadly enormous market for quackery) and natural science (because the sad fight over evolution), it has made me think and reflect more about the nature of science and publishing. Indeed, it would have been great if this kind of material would have been easily found back when I was doing my PhD.
I am going to be talking about how to transport bugs with virtual platform checkpoints, in the Software Tools track at the Embedded Conference Scandinavia, on October 3, 2012, in Stockholm (Sweden). The ECS is a nice event, and there are several tracks to choose from both on October 2 and October 3. In addition to the tracks, Jan Bosch from Chalmers is going to present a keynote that I am sure will be very entertaining (see my notes from a presentation he did in Göteborg last year).
I am scheduled to talk at the SiCS multicore day 2012 (like I did back in 2009 and 2008). The event takes palce on September 13, at SiCS in Kista. My topic will be on System-Level Debug – how we can make debuggers that work for big systems.
This year, the multicore day is part of a bigger Software Week event, which also covers cloud and internet of things. See you there!
I am not in the computer security business really, but I find the topic very interesting. The recent wide coverage and analysis of the Flame malware has been fascinating to follow. It is incredibly scary to see a “well-resourced (probably Western) nation-state” develop this kind of spyware, following on the confirmation that Stuxnet was made in the US (and Israel).
In any case, regardless of the resources behind the creation of such malware, one wonders if it could not be a bit more contained with a different way to structure our operating systems. In particular, Flame’s use of microphones, webcams, bluetooth, and screenshots to spy on users should be containable. Basically, wouldn’t cell-phone style sandboxing and capabilities settings make sense for a desktop OS too?
There is a new post at my Wind River blog, about how the LDRA code coverage tools have been brought to work on Simics using a simulation-only “back door “.
The most interesting part of this is how a simulator can provide an easy way to get information out of target software, without all the software and driver overhead associated with doing the same on a real target. In this case, all that is needed is a single memory-mapped location that can written to be software – which can be put into user-mode-accessible locations if necessary.
Once upon a time, all programming was bare metal programming. You coded to the processor core, you took care of memory, and no operating system got in your way. Over time, as computer programmers, users, and designers got more sophisticated and as more clock cycles and memory bytes became available, more and more layers were added between the programmer and the computer. However, I have recently spotted what might seem like a trend away from ever-thicker software stacks, in the interest of performance and, in particular, latency.
I was recently pointed to a 2011 SPLASH presentation by David Ungar, an IBM researcher working on parallel programming for manycore systems. In particular, in a project called Renaissance, run together with the Vrije Universiteit Brussels in Belgium (VUB) and Portland State University in the US. The title of the presentation is “Everything You Know (about Parallel Programming) Is Wrong! A Wild Screed about the Future“, and it has provoked some discussion among people I know about just how wrong is wrong.
In this final part of my series on the history of reverse debugging I will look at the products that launched around the mid-2000s and that finally made reverse debugging available in a commercially packaged product and not just research prototypes. Part one of this series provided a background on the technology and part two discussed various research papers on the topic going back to the early 1970s. The first commercial product featuring reverse debugging was launched in 2003, and then there have been a steady trickle of new products up until today.
This is the second post in my series on the history of reverse execution, covering various early research papers. It is clear that reverse debugging has been considered a good idea for a very long time. Sadly though, not a practical one (at the time). The idea is too obvious to be considered new. Here are some papers that I have found dating from the time before “practical” reverse debug which for me starts in 2003 (as well as a couple of later entrants).
For some reason, when I think of reverse execution and debugging, the sound track that goes through my head is a UK novelty hit from 1987, “Star Trekkin” by the Firm. It contains the memorable line “we’re only going forward ’cause we can’t find reverse“. To me, that sums up the history of reverse debugging nicely. The only reason we are not all using it every day is that practical reverse debugging has not been available until quite recently. However, in the past ten years, I think we can say that software development has indeed found reverse. It took a while to get there, however. This is the first of a series of blog posts that will try to cover some of the history of reverse debugging. The text turned out to be so long that I had to break it up to make each post usefully short. Part two is about research, and part three about products.
Continue reading “Reverse History Part One”
It used to be that Microsoft was the big, boring, evil company that nobody felt was very inspiring. Today, with competition from Google and Apple as well as a strong internal research department, Microsoft feels very different. There are really interesting and innovative ideas and paper coming out of Microsoft today. It seems that their investments in research and software engineering are generating very sophisticated software tools (and good software products).
I have recently seen a number of examples of what Microsoft does with the user feedback data they collect from their massive installed base. I am not talking about Google-style personal information collection, but rather anonymous collection of user interface and error data in a way that is more designed to built better products than targeting ads.
On the very binary date of 11-11-11, my alma mater, the computer science (DV, for datavetenskap) education at Uppsala University celebrated its thirty years’ anniversary. It was a great classic student party in the evening with a nice mix of old alumni and fresh-faced students. Lots of singing and some nice skits on stage. Great fun, and my voice has still not recovered. It also got me thinking about it is that we really do as computer scientists.
Last week, I had the honor of presenting at and attending the talks of the Lindholmen Software Development Day. The first keynote speaker was Professor Jan Bosch from Chalmers, who did his best to provoke, prod, and shock the audience into action to change how they do software. While I might not agree with everything he said, overall it was very enjoyable and insightful talk.
I just read a quite interesting article by Christian Pinto et al, “GPGPU-Accelerated Parallel and Fast Simulation of Thousand-core Platforms“, published at the CCGRID 2011 conference. It discusses some work in using a GPGPU to run simulations of massively parallel computers, using the parallelism of the GPU to speed the simulation. Intriguing concept, but the execution is not without its flaws and it is unclear at least from the paper just how well this generalizes, scales, or compares to parallel simulation on a general-purpose multicore machine.
Paul Henning-Kamp has written a series of columns for the ACM Queue and Communications of the ACM. He is pointed, always controversial, and often quite funny. One recent column was called “The Most Expensive One-Byte Mistake“, which discusses the bad design decision of using null-terminated strings (with the associated buffer overrun risks that would have been easily avoided with a length+data-style string format). Well worth a read. A key part of the article is the dual observation that compilers are starting to try to solve the efficiency problems of null-terminated strings – and that such heavily optimizing compilers quite often very hard to use.
I just had two articles published the Embedded Design part of the EETimes.
First, “Rethink your project planning with a virtual platform“, which talks about how virtual platforms can change your entire project planning. Essentially, by reducing project friction and risks related to hardware availability, software integration, and show-stopper bugs, you can make projects work much better.
Then we have “Transporting bugs with virtual checkpoints“, which is a shorter, popular science, version of the paper I published last year at S4D. This describes how you can use checkpointing in a virtual platform to communicate bugs across time, space, and teams.
There is a new post at my Wind River blog, which could seem to be about shoes but which is really about process improvement. In particular, the need for companies to let their employees take a step or two back and look at what they are doing and what they could do better.
It is way too common to be so busy running around being inefficient that there is no time to think about how to become more efficient. Change also requires some discipline to actually keep pushing at habits until they change for the better.