I just added a new blog post on the Wind River blog, about how you do fault injection with Simics. This blog post covers the new fault injection framework we added in Simics 5, and the interesting things you can do when you add record and replay capabilities to spontaneous interactive work with Simics. There is also a Youtube demo video of the system in action.
There is a new post at my Wind River blog, about the Simics Agent feature that we included in Simics last year. Took a while to get a blog out, as I had so many other things to write about. It was also nice to get a video demo out to accompany the post. The most interesting part about the Simics Agent to me is how much more convenient it is to script a target with an agent on the inside. Too bad that also changes the target software stack a bit — but I do think that that is OK most of the time. As always, the solution has to be designed with the end goal in mind, and there is no absolute right or wrong here. Read the blog post for more details!
Last year, I did a Simics webinar which included a two-part demo of how to use Simics to debug an endianness bug in a networked system as it migrates from big-endian to a little-endian system. Along the way, I also showed off various Simics features like reverse execution and checkpointing and scripted execution.
The demo is now online at the Wind River Youtube channel, and the setup is explained in a blog post at the Wind River company blog which is worth reading before watching the video.
Every once in a while I need to build demo setups to show debugging in action. As I have blogged before, finding a good bug when you need one isn’t always easy. The solution is to try to invent artificial bugs, and I was very happy when I managed to stage a buffer overrun in a VxWorks program.
It is pretty very nice demo in which you first start a period program A, which prints the value of an incrementing counter every target second. You then run a supposedly unrelated program B, resulting in the values that program A prints to become corrupted. Perfect to show off reverse execution and data breakpoints in reverse as you go from the point where the corrupted value is printed to the piece of code that overwrote the variable.
But then I ported the demo to a new platform… and the bug didn’t work anymore. My bug had caught a bug and was now not working, or at least not they way I expected it to. What had happened?
Continue reading “My Bug Doesn’t Work!”
Part of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like “let’s show how we unravel a randomly occurring hard-to-reproduce bug using Simics“. This then turns into a hard hunt for a program with a suitable bug in it… not the Simics tooling to resolve the bug. For some reason, when I best need bugs, I have hard time getting them into my code.
I guess it is Murphy’s law — if you really set out to want a bug to show up in your code, your code will stubbornly be perfect and refuse to break. If you set out to build a perfect piece of software, it will never work…
So I was actually quite happy a few weeks ago when I started to get random freezes in a test program I wrote to show multicore scaling. It was the perfect bug! It broke some demos that I wanted to have working, but fixing the code to make the other demos work was a very instructive lesson in multicore debug that would make for a nice demo in its own right. In the end, it managed to nicely illustrate some common wisdom about multicore software. It was not a trivial problem, fortunately.