<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala &#187; debugging</title>
	<atom:link href="http://jakob.engbloms.se/archives/tag/debugging/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Technology: Simulation, Virtualization, Virtual Platforms, Embedded, Multicore and Multiprocessing (by Jakob Engblom)</description>
	<lastBuildDate>Sun, 29 Jan 2012 19:45:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>Reverse History Part Three &#8211; Products</title>
		<link>http://jakob.engbloms.se/archives/1564?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1564#comments</comments>
		<pubDate>Sun, 08 Jan 2012 19:51:57 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[history of computing]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[gdb]]></category>
		<category><![CDATA[Green Hills]]></category>
		<category><![CDATA[Lauterbach]]></category>
		<category><![CDATA[Multi]]></category>
		<category><![CDATA[reverse debug]]></category>
		<category><![CDATA[reverse execution]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[TotalView]]></category>
		<category><![CDATA[UndoDB]]></category>
		<category><![CDATA[VMWare]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1564</guid>
		<description><![CDATA[In this final part of my series on the history of reverse debugging I will look at the products that launched around the mid-2000s and that finally made reverse debugging available in a commercially packaged product and not just research prototypes. Part one of this series provided a background on the technology and part two [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2011/12/reverse-icon.png"><img class="alignleft size-full wp-image-1550" title="reverse icon" src="http://jakob.engbloms.se/wp-content/uploads/2011/12/reverse-icon.png" alt="" width="62" height="62" /></a>In this final part of my series on the history of reverse debugging I will look at the products that launched around the mid-2000s and that finally made reverse debugging available in commercially packaged products and not just research prototypes. <a href="http://jakob.engbloms.se/archives/1547">Part one</a> of this series provided a background on the technology and <a href="http://jakob.engbloms.se/archives/1554">part two</a> discussed various research papers on the topic going back to the early 1970s. The first commercial product featuring reverse debugging was launched in 2003, and since then there has been a steady trickle of new products up until today.</p>
<p><span id="more-1564"></span></p>
<p><strong>2003</strong>. The embedded tools company Green Hills launched the <a href="http://www.ghs.com/news/20030930_best_of_show.html">Time Machine</a> feature in their well-known MULTI debugger. I consider this the start of commercial reverse debugging, as it was the first commercial-grade product to include it. The implementation was based on tracing the execution of a program on actual hardware, using a debug probe and a &#8220;JTAG&#8221; debug interface. The trace box would capture several gigabytes of execution data, and the debugger then performed its operations based on this trace. To check a backwards breakpoint, you scan back over the trace until you find a matching state or operation (such as a memory access or an instruction address being executed). The main limitation of the method is that the trace buffer can only capture a few seconds of execution on a typical embedded processor running at hundreds of MHz. It only works for a single processor, and it does not capture IO actions (except as memory-mapped IO). It is system-level and cross-target.</p>
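<p>To make the backwards scan concrete, here is a minimal sketch in Python. It is my own illustration and not how MULTI is implemented: the <code>TraceEntry</code> record, its fields, and the <code>reverse_find</code> helper are all hypothetical names, and a real trace would be decoded from hardware trace data rather than held as a simple list.</p>

```python
from collections import namedtuple

# Hypothetical trace record: one entry per executed instruction.
TraceEntry = namedtuple("TraceEntry", ["pc", "mem_addr", "is_write"])


def reverse_find(trace, position, predicate):
    """Scan backwards from just before `position`.

    Returns the index of the most recent entry matching `predicate`,
    or None if the start of the trace buffer is reached first.
    """
    for index in range(position - 1, -1, -1):
        if predicate(trace[index]):
            return index
    return None


# A tiny made-up trace; the addresses are arbitrary.
trace = [
    TraceEntry(pc=0x1000, mem_addr=None, is_write=False),
    TraceEntry(pc=0x1004, mem_addr=0x2000, is_write=True),
    TraceEntry(pc=0x1008, mem_addr=None, is_write=False),
    TraceEntry(pc=0x100C, mem_addr=0x2000, is_write=False),
    TraceEntry(pc=0x1010, mem_addr=None, is_write=False),
]

# Reverse data breakpoint: the last write to 0x2000 before the end of the trace.
last_write = reverse_find(
    trace, len(trace), lambda e: e.mem_addr == 0x2000 and e.is_write
)

# Reverse code breakpoint: the last time the instruction at 0x1008 executed.
last_exec = reverse_find(trace, len(trace), lambda e: e.pc == 0x1008)
```

<p>The sketch also shows the limitation mentioned above: once the scan runs off the start of the buffer, there is nothing more to find, so the reach of a backwards breakpoint is bounded by how much trace was captured.</p>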
<p>Extending this kind of trace to multicore has proven hard, since getting a synchronized trace out of several processors is very hard. There might be debug hardware coming out in the next few years that can indeed support a time-stamped consistent trace of multiple cores, and with such hardware, the Time Machine approach could well be extended into multicore.</p>
<p><strong>2005</strong>. <a href="http://www.windriver.com/products/simics/">Simics </a>3.0 was launched by Virtutech (later acquired by Wind River and Intel) with full-system reverse execution and reverse debugging. The Simics approach was also unique, being based on a full-system simulator. By simulating the entire target, it is trivial to reverse (and put reverse breakpoints on) changes to memory, persistent storage like disks, and hardware devices. Since all device models in the simulator are deterministic in their implementation, re-executing hardware events like interrupts and IO outputs is just as easy as re-executing code on the main processor, something that had eluded all previous approaches. Recording is used at the interface between the simulator and the outside world, such as user interaction over graphics displays and serial ports and connections to the real-world network. The software stack is unmodified and system-level, and the simulator can handle multiple processors and even multiple machines in a network as a unit. The use case is normally cross-target (even if a system identical to the host can be simulated, it would work like a cross target logically). Time is handled by counting clock cycles on all processors in the system, and reverse debugging can position the simulation at any point in time based on the virtual time.</p>
<p>There is a cost in execution speed from simulation rather than direct execution, and an intrusion effect from running on a simulator rather than on a physical machine. This affects the <a href="http://jakob.engbloms.se/archives/97">timing of events</a>, even with a software stack that is not modified. Still, the fact that you can run a complete real software stack with no modifications needed before starting to run the target system is fairly rare in the world of reverse debuggers.</p>
<p>Simics shipped with a modified gdb that talked gdb serial to Simics and accessed reverse execution with some new debugger commands as well as extensions to the gdb serial protocol. This was offered to the gdb community, but not accepted. However, prompted by this, the gdb community started to discuss reverse execution. Some interesting old threads can still be found, such as <a href="http://sourceware.org/ml/gdb/2005-05/msg00225.html">http://sourceware.org/ml/gdb/2005-05/msg00225.html</a>. Clearly, at that point in time Virtutech did not really explain how Simics worked, and some pretty bad proposals for how to do reverse execution were floated in the community. In the end, the gdb serial design did turn out the right way: it assumes that the remote debugger reverses itself and that <a href="http://sourceware.org/ml/gdb/2005-05/msg00235.html">gdb just asks it to do so</a>. This separation of concerns is important for creating practical reverse debugging solutions that can use any debugger backend.</p>
<p><strong>2005</strong>. Also in 2005, Lauterbach launched the <a href="http://www.lauterbach.com/cts.html">Context Tracking System, CTS</a>. Lauterbach is a big player in the embedded debug market, with their TRACE32 debugger. CTS can be seen as their reply to the Time Machine debugger. CTS is also based on a trace from a hardware unit or from an instruction-set simulator. However, from the available information it also appears to be more limited &#8211; you can step back and go back in time and replay forward, but there is no mention of actual backwards breakpoints (even today, six years later). Thus, I count this as record-replay rather than reverse debug. It is cross-target, system-level, and uniprocessor, like Time Machine.</p>
<p><strong>2006</strong>. <a href="http://undo-software.com/undodb_about.html">Undo Software</a> launched the first Linux-targeting host-based reverse debugger, <a href="http://undo-software.com/pressrelease-1.html">UndoDB</a>. It is described as a <em>bidirectional</em> debugger (the same terminology as the Boothe 2000 PLDI paper). It is user-level and supports reverse breakpoints (as well as data breakpoints, also known as watchpoints, which is really useful). It handles multiple threads (at least in 3.0), but from the description of the recording technology used I believe they have to serialize their execution. The implementation is based on checkpoint and re-execution, with recording of all non-deterministic events like IO. There is a feature to move to a certain point in time, based on &#8220;simulated nanoseconds&#8221;. These are not really nanoseconds, but values which are guaranteed to increase even between two instructions (which probably means that they are sub-nanosecond values, since on a &gt; 1GHz CPU single-cycle instructions will indeed take less than one nanosecond).</p>
<p>There is a nice description of how it works on their <a href="http://www.undo-software.com/undodb-gdb.1.html">online man page</a>. It is worth noting that they call it &#8220;gdb&#8221;, but the command set is distinct from what gdb introduced with its reverse execution in 2009. They use the &#8220;b&#8221; prefix for backwards commands rather than &#8220;r&#8221; for reverse. In some ways, UndoDB is in direct competition with the gdb reverse target, but it is much faster and has more features.</p>
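<p>The checkpoint-and-re-execution idea can be sketched in a few lines of Python. This is a toy model under my own assumptions, not UndoDB's implementation: the &#8220;program&#8221; is a single deterministic <code>step</code> function, a seeded random generator stands in for non-deterministic inputs such as IO, and <code>CHECKPOINT_INTERVAL</code>, <code>record_run</code>, and <code>goto</code> are hypothetical names.</p>

```python
import copy
import random

CHECKPOINT_INTERVAL = 4  # hypothetical value; a real tool would tune this


def step(state, input_value):
    """One deterministic step: all non-determinism enters via input_value."""
    state["total"] += input_value
    state["steps"] += 1


def record_run(num_steps, seed):
    """Run forward once, logging every input and taking periodic checkpoints."""
    rng = random.Random(seed)  # stands in for IO, signals, and similar events
    state = {"total": 0, "steps": 0}
    input_log = []
    checkpoints = {}
    for i in range(num_steps):
        if i % CHECKPOINT_INTERVAL == 0:
            checkpoints[i] = copy.deepcopy(state)
        value = rng.randint(0, 9)
        input_log.append(value)  # record the non-deterministic event
        step(state, value)
    return state, input_log, checkpoints


def goto(target_step, input_log, checkpoints):
    """Rebuild the state after `target_step` steps.

    Restores the nearest earlier checkpoint and re-executes forward,
    feeding the recorded inputs back in instead of reading new ones.
    """
    base = (target_step // CHECKPOINT_INTERVAL) * CHECKPOINT_INTERVAL
    base = min(base, max(checkpoints))
    state = copy.deepcopy(checkpoints[base])
    for i in range(base, target_step):
        step(state, input_log[i])
    return state


final_state, input_log, checkpoints = record_run(10, seed=42)
earlier_state = goto(7, input_log, checkpoints)  # "go back" three steps
```

<p>Going backwards is thus really going forwards from an earlier point. The checkpoint interval is the design trade-off: frequent checkpoints cost memory and recording time, while sparse ones make each backwards move re-execute more steps.</p>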
<p><strong>2008</strong>. The TotalView debugger, now a Rogue Wave product (at the time, it came from an independent company), gained support for reverse debugging with the <a href="http://www.roguewave.com/products/totalview-family/replayengine.aspx">ReplayEngine</a> add-on. TotalView is an old mainstay in the HPC market, having been around since <a href="https://computing.llnl.gov/tutorials/totalview/#Overview">at least 1993</a>. Indeed, it was developed initially for the <a href="http://en.wikipedia.org/wiki/BBN_Butterfly">BBN Butterfly computer</a>, and thus it might have had some contact with reverse execution as far back as the 1987 paper cited in my <a href="http://jakob.engbloms.se/archives/1554">previous blog post</a>.</p>
<p>Judging from <a href="http://www.roguewave.com/documents.aspx?Command=Core_Download&amp;EntryId=739">the available materials</a>, TotalView can clearly step back in various ways. However, it is not clear that it triggers breakpoints when going backwards. Thus, it has to count as record-replay debugging rather than reverse debugging. The base of the implementation is extensive instrumentation of the runtime system of the target computer. The implementation builds on the fact that the target programs tend to be clustered programs that use MPI to communicate &#8211; and thus a large part of the communication between threads is explicit and easily intercepted and recorded. There is also an existing infrastructure of checkpoint and restart for parallel programs using MPI to support fault tolerance, and this was used as the base of the implementation. Finally, in a slightly ugly hack, they make each multi-threaded program run on a single processor by using a big lock. In this way, all that needs to be replayed is the interleaving of threads on a single processor, a far more tractable problem than trying to replicate a true parallel execution in a new session.</p>
<p><strong>2008</strong>. VMware officially launched a record-replay debugger based on their virtual machine technology with <a href="http://www.replaydebugging.com/2008/08/vmware-workstation-65-reverse-and.html">VMware Workstation 6.5</a>. It is single-processor, system-level (but really only supported for user-level debugging), and cross-target (since the VM is not exactly the same hardware as the host), with a time model based on the virtual machine, which I believe is cycle-based. It was mostly used for record-replay debugging of non-deterministic software bugs, but could also do reverse debugging, including reverse data breakpoints. The implementation is based on snapshots and deterministic re-execution, plus recording of all non-deterministic device accesses (not all devices in the VMware hardware emulation layer are deterministic). Going back to a snapshot was a very heavy operation (I tried it), since you had to restore the entire target memory, which quickly got into gigabytes. The hardware supported in the VM was quite limited, and things like CD-ROMs and floppies could not be part of a record/replay session. Replay logs could be moved between hosts.</p>
<p>The VMware reverse debug functionality was removed from VMware Workstation version 8 in 2011, since it required a large investment and apparently was not used by very many VMware users. This indicates that trying to build developer-oriented functionality into a technology base that is fundamentally driven by the needs of deployed virtual machines is hard. There are contradictions between these two goals, as the determinism and control needed for a good reverse debugger are not necessarily consistent with maximum performance for running virtual machines in a production setting.</p>
<p><strong>2009</strong>. gdb 7.0 added support for reverse execution (work on this began in 2006). The built-in &#8220;record&#8221; target supports reverse debugging of user-level single-threaded programs on the same host. The command set for reverse debugging is fairly full-featured, but a bit quirky, with a &#8220;<a href="http://sourceware.org/gdb/news/reversible.html">set exec-direction</a>&#8221; command that makes regular run-control commands work in reverse. The record technology is quite slow, since it basically records the effect of each and every instruction run in the program.</p>
<p>In addition to its built-in target, gdb can also control external reversible debug systems over the gdb serial protocol. This made the changes to gdb-serial created by Virtutech for Simics in 2005 part of the mainline gdb release. <a href="http://sourceware.org/gdb/news/reversible.html">Several tools support the command set</a>, including VMware, UndoDB, and Simics. A set of MI commands was also added, basically to let Eclipse use gdb as a backend for reverse debug, including using it to control external tools via gdb-serial. How this happened is quite a long story, and I made a small contribution to the gdb code base myself in the process. Read about this <a href="http://jakob.engbloms.se/archives/1065">here</a>.</p>
<p><strong>2009</strong>. Eclipse CDT added support for <a href="http://www.eclipse.org/community/training/webinars/090526_CDT_Webinar.pdf">reverse execution</a>, using gdb 7.0 reverse as the initial backend. As noted above, this lets Eclipse also use other reverse debugging backends (Eclipse uses the gdb-MI interface to gdb to control the debug session). This is noteworthy since it means that the buttons to control reverse execution are now part of the CDT, making it much easier to use Eclipse to build a frontend to any reversible backend. Eclipse is not really a debugger, just an interface to a debugger.</p>
<p><strong>2009</strong>. Microsoft Visual Studio <a href="http://blogs.msdn.com/b/ianhu/archive/2009/05/13/historical-debugging-in-visual-studio-team-system-2010.aspx">got record-replay debugging with IntelliTrace</a>. It is strictly about replay debugging, including the nice ability to send traces around between developers. There are no backwards breakpoints. The support is limited to programs running on top of the .NET runtime system, meaning that <a href="http://msdn.microsoft.com/en-us/library/dd264915.aspx">it does not apply to classic Windows software</a>. Using the CLR virtual machine as the implementation basis should make the implementation easier, cleaner, and faster compared to a machine-level native solution. It is user-level, single-threaded, and host-based. The time concept is unknown.</p>
<p><strong>2011</strong>. Adobe demonstrated (not launched) reverse debugging in their Flash Builder programming environment. A <a href="http://tv.adobe.com/watch/max-2011-sneak-peeks/max-2011-sneak-peek-reverse-debugging-in-flash-builder/">nice video is posted on the Adobe website</a>. It seems to be based on the virtual machine that Flash runs on, and includes what looks like pretty powerful backwards data analysis tools. In a <a href="http://anirudhsasikumar.net/blog/2011.12.15.html">blog post</a>, the developer describes some of the features, which to me seem to indicate some pretty heavy recording.</p>
<p><strong>Final notes.</strong> In researching these commercial tools, I also came across what seems to be a lost one. A company called Visicomp launched a Java debugger called RetroVue in 2002 which supposedly did allow backwards debugging in some way. However, it seems that this tool was not really practical, being too slow for actual use. It seems to have disappeared since, without anyone picking up its legacy. The technology was apparently pretty much like the Omniscient Debugger presented in 2003, which I described in the <a href="http://jakob.engbloms.se/archives/1554">blog post on reverse execution research</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1564/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Debug, Design, and Microsoft Data</title>
		<link>http://jakob.engbloms.se/archives/1527?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1527#comments</comments>
		<pubDate>Sat, 19 Nov 2011 15:38:23 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[Communications of the ACM]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Kinshuman Kinshumann]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Steven Sinofsky]]></category>
		<category><![CDATA[Windows]]></category>
		<category><![CDATA[Windows 8]]></category>
		<category><![CDATA[Windows Explorer]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1527</guid>
		<description><![CDATA[It used to be that Microsoft was the big, boring, evil company that nobody felt was very inspiring. Today, with competition from Google and Apple as well as a strong internal research department, Microsoft feels very different. There are really interesting and innovative ideas and papers coming out of Microsoft today.  It seems that their [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2010/07/windows-phone-logo.png"><img class="alignleft size-full wp-image-1205" title="windows phone logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/07/windows-phone-logo.png" alt="" width="66" height="58" /></a>It used to be that Microsoft was the big, boring, evil company that nobody felt was very inspiring. Today, with competition from Google and Apple as well as a strong internal research department, Microsoft feels very different. There are really interesting and innovative ideas and papers coming out of Microsoft today. It seems that their investments in research and software engineering are generating very sophisticated software tools (and good software products).</p>
<p>I have recently seen a number of examples of what Microsoft does with the user feedback data they collect from their massive installed base. I am not talking about Google-style personal information collection, but rather anonymous collection of user interface and error data in a way that is designed more to build better products than to target ads.</p>
<p><span id="more-1527"></span>The first paper is &#8220;<a href="http://dl.acm.org/citation.cfm?id=1965749&amp;bnc=1">Debugging in the (very) large: ten years of implementation and experience</a>&#8221; by Kinshumann et al, Communications of the ACM, July 2011. This paper describes how Microsoft uses the data they collect from Windows Error Reporting (you know, the little dialog boxes that appear every once in a while on Windows when a program has crashed or frozen, or when Windows has recovered from a crash).</p>
<p>Microsoft has a number of heuristics that look at the data collected, grouping the bug reports into buckets. Ideally, each bucket corresponds to a single root cause for possibly quite different errors. They automatically analyze the errors and generate metadata about the error reports that can be used to generate statistics and allow database queries to be performed over all collected error reports. Heuristics include walking through chains of threads blocked on synchronization objects to determine which one is the actual cause of a hang, and finding the thread and stack frame most likely to contain the root cause of an error. Heuristics are applied both on the client and the server, but mostly on the server. This is technically very hard to do right, and I can appreciate the huge amount of work that has gone into engineering it.</p>
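<p>The hang heuristic can be illustrated with a small Python sketch. This is my own simplified reading of the idea, not code from the paper or from WER: the <code>waits_for</code> mapping, the thread names, and the <code>find_hang_culprit</code> function are hypothetical.</p>

```python
def find_hang_culprit(waits_for, start_thread):
    """Follow "blocked on" edges starting from start_thread.

    waits_for maps a blocked thread to the thread that owns the
    synchronization object it is waiting on. Returns a pair
    (thread, is_deadlock): the thread at the end of the chain, and
    whether the chain loops back on itself.
    """
    seen = []
    current = start_thread
    while current in waits_for:
        if current in seen:
            return current, True   # the chain is a cycle: a deadlock
        seen.append(current)
        current = waits_for[current]
    return current, False          # this thread is not blocked itself


# The frozen UI thread waits on a worker, which in turn waits on an IO thread.
culprit, deadlocked = find_hang_culprit({"ui": "worker", "worker": "io"}, "ui")
```

<p>In this example the blame lands on the IO thread rather than on the UI thread where the user sees the hang, which is the point of walking the chain before assigning a report to a bucket.</p>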
<p>With this huge pile of information, a new debugging method becomes available: statistics-driven bug finding and prioritization at large scale.  The introduction to the paper puts it very well:</p>
<blockquote><p>Beyond mere debugging from error reports, WER enables a new form of statistics-based debugging. WER gathers all error reports to a central database. In the large, programmers can mine the error report database to prioritize work, spot trends, and test hypotheses. Programmers use data from WER to prioritize debugging so that they fix the bugs that affect the most users, not just the bugs hit by the loudest customers. WER data also aids in correlating failures to co-located components. For example, WER can identify that a collection of seemingly unrelated crashes all contain the same likely culprit—say a device driver—even though its code was not running at the time of failure.</p></blockquote>
<p>For a product manager like me, used to working with individual bug reports in bug reporting systems and trying to manually assess the importance of each error, this is nothing short of a dream. Instead of trying to guess how many users can be impacted by a bug, Microsoft can run queries against the error report database and get a fairly accurate idea of how common a certain error is in the user base. This has allowed them to address the most common errors first, leading to Windows and Office becoming more stable for more users in recent generations. They can also pinpoint which device drivers are causing the most issues, and put pressure on vendors to clean up their act.</p>
<p>I wonder where else you can really apply this idea of statistical debugging. You need a large user base, on systems which are connected to the Internet so you can collect data, and users who are comfortable with providing direct feedback to you as a vendor. Apparently, Apple has the same kind of feature built into iOS, with more than 100 million users who seem not to be too interested in strong privacy. Presumably, Google can do the same thing with Android, at least for its use in phones. Mozilla has a crash reporter, so I guess it makes sense in the consumer space.</p>
<p>But when your user base counts in thousands of seats and half of these are in the defense sector behind air gaps, it is harder to apply. Products that call home are not taken too kindly to in the professional field, as secrecy and confidentiality are very important to big companies. Industrial embedded products like telecom infrastructure might have a sufficient volume of code and computer hardware to form a basis for statistical reporting &#8211; as long as operators agree to provide the information to the hardware vendors.</p>
<p>Another example of how Microsoft makes use of their collected data is in UI design. The blog post &#8220;<a href="http://blogs.msdn.com/b/b8/archive/2011/08/29/improvements-in-windows-explorer.aspx">Improvements in Windows Explorer</a>&#8221;, by Steven Sinofsky, from the Windows 8 blog discusses how Windows Explorer has evolved over the years, and how it is now getting a radical redesign based on usage data. Microsoft is in an enviable position here, having collected information about what millions of users are doing. This definitely beats inspiration or trying things out with a few users in a classic user interface lab.</p>
<p>I have seen quite a few people criticize this blog post from a variety of angles &#8211; from the fact that they are not data-driven enough and keep rarely-used buttons in the ribbon to the fact that they remove somebody&#8217;s favorite function.  It is also the case that the measurements can only tell you which functions people are using from what is available today &#8211; if you want to invent new things, data like this might not be very helpful.</p>
<p>Fortunately, Microsoft also seems to have taken a cue from Linux and is allowing much more user customization than before. For me, this is great news, as I seem to have a user profile quite far from the mainstream. We have not seen Windows 8 in its final form just yet, but hopefully this approach will be applied to other parts of that GUI overhaul too. There are professional Windows users who need an OS that makes even very esoteric operations easy to access, and customizations of things like the start menu possible. Hopefully, we do not get washed away by the flood of data from regular users.</p>
<p>For some reason, I feel that bug reporting is not as sensitive to the user style as GUI design &#8211; Windows and driver bugs would seem to be more evenly distributed as they depend more on hardware than on software. At least it seems to me that Windows is more stable today than it was a couple of years ago.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1527/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Disappointing SystemC Debugger Integration Paper</title>
		<link>http://jakob.engbloms.se/archives/1419?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1419#comments</comments>
		<pubDate>Wed, 25 May 2011 19:35:21 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[SystemC]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1419</guid>
		<description><![CDATA[Since I have a certain interest in debugging, I was happy to find the article &#8220;Guidelines for SystemC &#8211; Debugger Integration&#8221; at the usually interesting Design and Reuse website. However, I must say that it was pretty disappointing. The key idea of the article is to put the debug service in a thread and the debugged [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2011/05/debug-small.png"><img class="alignleft size-full wp-image-1421" title="debug small" src="http://jakob.engbloms.se/wp-content/uploads/2011/05/debug-small.png" alt="" width="81" height="73" /></a>Since I have a certain interest in debugging, I was happy to find the article <a href="http://www.design-reuse.com/articles/26457/guidelines-for-systemc-debugger-integration.html">&#8220;Guidelines for SystemC &#8211; Debugger Integration&#8221;</a> at the usually interesting Design and Reuse website. However, I must say that it was pretty disappointing.</p>
<p><span id="more-1419"></span>The key idea of the article is to put the debug service in a thread and the debugged SystemC system in another thread, and stop SystemC using a mutex. Yes, you have to do that.</p>
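<p>As a rough illustration of that two-thread arrangement, here is a minimal sketch in Python rather than SystemC. Everything in it is an assumption of mine and not taken from the article: the <code>Simulation</code> class is a toy stand-in for a simulation kernel loop, and <code>run_lock</code> and <code>inspect_while_stopped</code> are hypothetical names.</p>

```python
import threading
import time


class Simulation:
    """Toy stand-in for a simulation kernel loop; not the SystemC API."""

    def __init__(self):
        self.run_lock = threading.Lock()  # held by the debugger to stop the target
        self.cycles = 0

    def run(self, max_cycles):
        while self.cycles != max_cycles:
            # The simulation may only advance while it holds the lock.
            with self.run_lock:
                self.cycles += 1


def inspect_while_stopped(sim):
    """Debugger side: hold the lock so the simulation cannot advance."""
    with sim.run_lock:
        first = sim.cycles
        time.sleep(0.01)   # pretend to service some debugger requests
        second = sim.cycles
    return first, second


sim = Simulation()
worker = threading.Thread(target=sim.run, args=(50000,))
worker.start()
first, second = inspect_while_stopped(sim)  # the two readings are identical
worker.join()
```

<p>The simulation only stops at the points where it releases the lock, so the granularity of stopping is whatever unit of work sits inside the locked region.</p>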
<p>But the really interesting part is how to connect the debugger into the virtual platform, and what that requires from the models and processors and the infrastructure. Unfortunately, the article is pretty silent on that. There is some talk of the breakpoint handling required in the ISS and of how to update target memory, which in scope mostly corresponds to the debug interface of SystemC TLM-2.0.</p>
<p>Also, nothing about multicore debug and how to deal with temporal decoupling and debugging, or the need for repeatability across runs. Or breakpoints on things like hardware accesses and internal actions in the simulator.</p>
<p>Too bad.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1419/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: 20, 30, 60 years ago</title>
		<link>http://jakob.engbloms.se/archives/1408?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1408#comments</comments>
		<pubDate>Fri, 06 May 2011 12:27:37 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[history of computing]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[Wind River]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1408</guid>
		<description><![CDATA[There is a new post at my Wind River blog, about some computing history. Wind River turns thirty this year, Simics twenty, and simulation for debug (and probably debug in general) turns sixty. Computing has come a long way.]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" />There is a new post at my Wind River blog, about <a href="http://blogs.windriver.com/engblom/2011/05/twenty-thirty-and-sixty-years-ago.html">some computing history</a>. Wind River turns thirty this year, Simics twenty, and simulation for debug (and probably debug in general) turns sixty. Computing has come a long way.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1408/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>S4D 2010 Part 2</title>
		<link>http://jakob.engbloms.se/archives/1280?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1280#comments</comments>
		<pubDate>Sun, 19 Sep 2010 20:10:55 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Barry Lock]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Lauterbach]]></category>
		<category><![CDATA[S4D]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1280</guid>
		<description><![CDATA[My previous post on S4D did omit some of my notes from the conference. In particular, the very entertaining and serious keynote of Barry Lock from Lauterbach and some more philosophical observations on the nature of debugging. Barry Lock Barry lock gave a very entertaining keynote, from his viewpoint as essentially the champion of physical [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg"><img class="alignleft size-full wp-image-941" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="" width="143" height="62" /></a>My <a href="http://jakob.engbloms.se/archives/1251">previous post on S4D</a> omitted some of my notes from the conference. In particular, the very entertaining and serious keynote of Barry Lock from Lauterbach and some more philosophical observations on the nature of debugging.</p>
<h3><span id="more-1280"></span>Barry Lock</h3>
<p>Barry Lock gave a very entertaining keynote, from his viewpoint as essentially the champion of physical hardware debug. Lauterbach is clearly focused on debugging using hardware assists in real systems, with not much to do with high-level programming or virtual platforms. Barry has been working with computers longer than I have lived, and has seen both the semiconductor and software side of things.</p>
<p>His main message was that you have to take debuggability into account when buying chips for your embedded project. Saving a few cents by buying a chip with no or limited debug features will come back to haunt you, many times, in many nightmares. He had had grown men crying over the phone, asking for a miracle to save their projects after debugging had utterly failed for many months. He had seen startup companies go under, burning all their money chasing the last bug&#8230; and claimed that 75% of all product starts never got to market, blaming debug problems for a large proportion of that.</p>
<p>The most important debug feature is <em>trace</em> &#8211; which follows the theme of this being the S4D of trace. After trace, you want <em>hardware breakpoints</em>. Apparently, you need at least two breakpoints to debug a system with a virtual memory RTOS: one to keep an eye on the MMU, and one to actually debug code. More are better, but it is rare to see silicon vendors include many more breakpoints.</p>
<p>Barry gave a number of examples of projects which had failed by not buying the right hardware. He put the blame both on the buyers chasing a few cents of cost in the end product and on the poor quality of silicon salespeople, who would rather take the price route than the quality route when selling chips.</p>
<p>He also noted that there seemed to be a positive correlation between industry leaders and buying debuggable hardware. Companies like Bosch, Ericsson, and Nokia always spend the extra money to get hardware that can be debugged, and have results to show for it.</p>
<h3>Philosophy of Debug</h3>
<p>During our two panel discussions on debug, there were two ways to look at debugging that stood out from the crowd.</p>
<p>The first was the observation that debugging today is very much a <em>craft</em>. When things go really bad, you go to the proven expert. Debugging is a craft you learn by apprenticeship with a master, and master debuggers are incredibly valuable for their organizations. This reliance on masters indicates that general programming education to a large extent overlooks debugging as a crucial skill for programmers. It also means that, in the words of one member of the audience, debug cannot scale. As problems become more complex, we still rely on single individuals, which reduces our ability to tackle problems.</p>
<p>The second observation was to liken debugging to the diagnosis of human diseases. As systems become more complex, their behavior gets so complicated and rich that it is hard to even precisely identify what a bug is. A simple crash or illegal operation is clear-cut &#8211; but when results of programs are just a bit off? When control loops don&#8217;t quite do the right thing, but almost? When the quality of a picture of a television just feels wrong? In those cases, we might be looking at composite measurements of many different parameters and factors in a system, and making a diagnosis of error based on the whole picture rather than each factor in isolation.</p>
<p>Based on these observations, I can envision a somewhat weird future where we train computer doctors (as in medical doctor, not PhD) to diagnose computer problems based on holistic, systematic approaches. Such education could be separate from the training of programmers and testers, as their specialty would be the diagnosis of system outputs against the expected outcomes at a high level, rather than the details of code.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1280/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Additional Notes on Transporting Bugs with Checkpoints</title>
		<link>http://jakob.engbloms.se/archives/1231?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1231#comments</comments>
		<pubDate>Wed, 15 Sep 2010 05:38:42 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[S4D]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1231</guid>
		<description><![CDATA[This post features some additional notes on the topic of transporting bugs with checkpoints, which is the subject of a paper at the S4D 2010 conference. The idea of transporting bugs with checkpoints is in some ways obvious. If you have a checkpoint of a state, of course you move it. Right? However, changing how you [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg"><img class="alignleft size-full wp-image-941" style="margin: 5px 10px;" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="" width="143" height="62" /></a>This post features some additional notes on the topic of transporting bugs with checkpoints, which is the subject of a paper at the <a href="http://www.ecsi.me/s4d">S4D </a>2010 conference.</p>
<p>The idea of transporting bugs with checkpoints is in some ways obvious. If you have a checkpoint of a state, of course you move it. Right? However, changing how you think about reporting bugs takes time. There are also some practical issues to be resolved. The S4D paper goes into some of the aspects of making checkpointing practical.</p>
<p><span id="more-1231"></span>In particular, we need the checkpoints to be:</p>
<ul>
<li>Portable &#8211; so that checkpoints can be copied around between computers</li>
<li>Deterministic &#8211; so that everyone opening a checkpoint sees the same behavior</li>
<li>Compact &#8211; so that they can actually be moved around without incurring undue pain</li>
<li>Differential &#8211; so that a checkpoint can build on previous state and just contain a set of changes, not the entire state of the target system</li>
</ul>
<p>Most of my paper is spent on how to make checkpoints small enough to be easily transported, and how it fits with development workflows. The requirements above would seem to be common sense, but there are checkpointing systems out there that do not fulfill them. In particular, the portability aspect is hard to get right.</p>
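<p>To make the &#8220;differential&#8221; requirement from the list above concrete, here is a small C sketch. It is not how Simics stores checkpoints; the page size, the <code>checkpoint_t</code> layout, and the function names are all assumptions made for illustration. Each checkpoint records only the pages that differ from what its parent chain would restore, and restoring walks the chain from the base checkpoint to the newest one.</p>

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 16                     /* tiny pages, for illustration only */
#define NUM_PAGES 8
#define MEM_SIZE  (PAGE_SIZE * NUM_PAGES)

/* Hypothetical checkpoint layout: a parent link plus the changed pages. */
typedef struct checkpoint {
    const struct checkpoint *parent;     /* NULL for a full (base) checkpoint */
    unsigned char present[NUM_PAGES];    /* 1 if this checkpoint stores the page */
    unsigned char data[MEM_SIZE];        /* page contents; valid where present */
} checkpoint_t;

/* Rebuild target memory: apply the parent chain first, then our own pages. */
static void checkpoint_restore(const checkpoint_t *cp, unsigned char *mem) {
    if (cp->parent != NULL)
        checkpoint_restore(cp->parent, mem);
    for (size_t p = 0; p < NUM_PAGES; p++)
        if (cp->present[p])
            memcpy(mem + p * PAGE_SIZE, cp->data + p * PAGE_SIZE, PAGE_SIZE);
}

/* Save mem relative to parent; returns the number of pages actually stored.
 * With parent == NULL every page is stored (a full checkpoint). */
static size_t checkpoint_save(checkpoint_t *cp, const checkpoint_t *parent,
                              const unsigned char *mem) {
    unsigned char base[MEM_SIZE];
    size_t stored = 0;

    memset(base, 0, sizeof base);
    cp->parent = parent;
    if (parent != NULL)
        checkpoint_restore(parent, base);

    for (size_t p = 0; p < NUM_PAGES; p++) {
        const unsigned char *page = mem + p * PAGE_SIZE;
        cp->present[p] = (parent == NULL) ||
                         memcmp(page, base + p * PAGE_SIZE, PAGE_SIZE) != 0;
        if (cp->present[p]) {
            memcpy(cp->data + p * PAGE_SIZE, page, PAGE_SIZE);
            stored++;
        }
    }
    return stored;
}
```

<p>The in-memory struct above still reserves room for every page; a file format built on the same idea would presumably write out only the pages marked present, which is where the size savings come from.</p>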
<p>There are other ways to achieve transportation of bugs, and this blog post will fill in on some related work that I could not fit into the paper or which I discovered only after the final version of the paper was submitted.</p>
<h3>Record-Replay Systems</h3>
<p>There seems to be boundless creativity in creating methods to record live systems and replay their inputs/outputs/internal behavior/other interesting behavior on another system to support debug or analysis or other tasks. It shows just how important the replication of bugs is to the development of systems, and just how hard it is to accurately capture a bug in practice.</p>
<p>The company called <strong>Zealcore</strong> was doing some interesting work in software-based recording of &#8220;only the relevant events&#8221;, and then replaying this on a lab machine. Their angle on the problem was to have software record a minimal trace of important events on a live system, and then control the runtime system in a lab to replicate the event trace. Making this efficient and precise was the subject of a sequence of research papers in the early 2000s. Zealcore was acquired by Enea in 2008, and I have not seen much from them since. From what I can tell, the Zealcore fundamental technology for recording on a live system (or at least the ideas) has been continued in a new company called <strong><a href="http://www.percepio.se/">Percepio</a></strong>.</p>
<p>A fundamental difference between these recording systems and checkpointing systems is that the recording systems do not capture the complete target system state in the way a checkpoint does. The recording is much more compact, but it does not really solve the same problem. It is not based on running the target inside a simulator (other than at the replay end). What the relative success of such recording systems indicates, however, is that in many systems, there are &#8220;important&#8221; and &#8220;irrelevant&#8221; aspects of inputs and events and behaviors, and that recording and replaying only &#8220;important&#8221; aspects is often sufficient to trigger bugs.</p>
<p>You can also throw hardware at the problem.</p>
<p>Completely unexpectedly, I also found a reference to a hardware-based record/replay system in a <a href="http://cacm.acm.org/magazines/2010/8/96632-an-interview-with-edsger-w-dijkstra/fulltext">Communications of the ACM interview with Edsger Dijkstra</a> (a rerun of an <a href="http://www.cbi.umn.edu/oh/pdf.phtml?id=296">interview from 2002</a>). Apparently, during the early programming of the <strong>IBM 360</strong>, IBM realized that debugging interrupts was hard. The solution was to create a piece of special hardware which would record interrupts, and later replay them with precise timing. In this way, you achieved repeatable executions of the most difficult code there was. I must quote what Dijkstra says on this &#8220;throw money at the problem&#8221; approach:</p>
<blockquote><p>When IBM had to develop the software for the 360, they built one or two machines especially equipped with a monitor. That is an extra piece of machinery that would exactly record when interrupts took place and from where to where. And if something went wrong, it could replay it again and use the recorded history to control when interrupts would occur. So they made it reproducible, yes, but at the expense of much more hardware than we could afford. Needless to say, they never got the OS/360 right.</p></blockquote>
<p>The final comment is typical for Dijkstra&#8217;s thinking that debugging is just an indication that you did not get the program and design right from the start. That&#8217;s certainly true, and he would likely have considered my little S4D paper as an unnecessarily complicated solution to a problem that should not have existed in the first place.</p>
<p>I, however, find the idea of the monitor interesting. I think that building something like that today would be much more difficult, as chips are very highly integrated and the support for replaying interrupts would have to go right into the heart of an SoC. But it would be interesting if it could be done.</p>
<p>There is also a <a href="http://jakob.engbloms.se/archives/130">paper from 1969 that I wrote about a few years ago </a>that does include the idea of recording and replaying asynchronous external inputs to a simulator.</p>
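<p>The core of recording and replaying asynchronous inputs, whether in a hardware monitor or in a simulator, can be sketched in a few lines of C. This is only an illustration under stated assumptions: the <code>irq_log_t</code> type, the function names, and the use of a cycle count as the time base are all hypothetical, and events are assumed to be recorded in increasing time order.</p>

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENTS 64

/* One recorded asynchronous event: when it happened and which interrupt. */
typedef struct {
    uint64_t cycle;
    int      irq;
} irq_event_t;

typedef struct {
    irq_event_t events[MAX_EVENTS];
    size_t      count;   /* number of recorded events */
    size_t      next;    /* replay cursor */
} irq_log_t;

/* Recording run: note each interrupt together with its exact time.
 * Returns 0 on success, -1 if the log is full. */
static int irq_log_record(irq_log_t *log, uint64_t cycle, int irq) {
    if (log->count == MAX_EVENTS)
        return -1;
    log->events[log->count].cycle = cycle;
    log->events[log->count].irq   = irq;
    log->count++;
    return 0;
}

/* Replay run: ask at every cycle; returns the interrupt to inject now,
 * or -1 if nothing was recorded for this cycle. Call repeatedly for the
 * same cycle until it returns -1 to drain simultaneous events. */
static int irq_log_replay(irq_log_t *log, uint64_t cycle) {
    if (log->next < log->count && log->events[log->next].cycle == cycle)
        return log->events[log->next++].irq;
    return -1;
}
```

<p>Replay is only repeatable if everything else in the system is deterministic given the same time base, which is exactly why this is easier inside a simulator than on physical hardware.</p>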
<h3>Other Checkpointing Systems</h3>
<p>There might be some related use of checkpoints (or snapshots as they are more commonly known) in the development of game emulators.  There is clearly the ability to save game state in a portable way in emulators like MAME.  Such states can be useful to help debug the emulator, but in a different way from the approach that I presented.  In the emulator case, the state is really the state of the emulated target.  It is not the state of the emulator program itself. If game emulator snapshots were used to debug the game code, it would be the same situation as what I describe in the S4D paper.</p>
<p>As I understand it, this is more like attaching an example document that makes a program crash to a bug report, rather than transporting the state of the emulator itself.</p>
<p>Going down in the level of abstraction, I have also been told that RTL simulators offer a similar ability and that it has been used in a similar way. Since I am not at all familiar with that field, I did not comment on this in the paper.</p>
<p>Transporting RTL bugs using checkpoints makes perfect sense. In an RTL simulator, the target state is very clearly described in an unambiguous way with no relationship to host state. Checkpointing should be easy to implement and checkpoints should be portable; anything else would be a poor implementation. The simulation is also deterministic, assuming a reasonable implementation of the simulator. The simulated world is also encapsulated with a set of test cases, since RTL simulations are too slow to be interfaced to the real world. If an RTL simulator is interfaced to something else, recording the incoming signals should be straightforward since they are at a very low level (bits, clocks, pin states).</p>
<p>The use of checkpointing with RTL also fits with a conversation I had in 2005 when Virtutech introduced reverse execution in Simics. At one of the tradeshows where we showed the technology, an older gentleman approached me and told me that he had done similar things with hardware simulators back in the 1980s. He immediately understood the implementation idea (checkpoints with deterministic replay), and sounded like he felt it was nothing much new.</p>
<p>Finally, at some other event last year, I saw a demonstration of an RTL-level tool where the trace of the execution was generated on one machine, but inspected on a different machine. That amounts to a portable trace, even if the data volumes were rather large (many GB) and essentially required the RTL simulator (or hardware accelerated emulator) to be sharing disks with the investigation machine. Still, nothing prevents such a solution from being remotely used. The main difference from what I describe is that here only the result of the execution (trace of signals) is transported, not an actual state snapshot that can be brought up to continue the execution in a different place.</p>
<p>If you have any other notes on this, please comment!</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1231/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>S4D Paper on Transporting Bugs with Checkpoints</title>
		<link>http://jakob.engbloms.se/archives/1235?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1235#comments</comments>
		<pubDate>Tue, 31 Aug 2010 18:40:55 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[S4D]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1235</guid>
		<description><![CDATA[I have a paper about &#8220;Transporting Bugs with Checkpoints&#8221; to be presented at the S4D (System, Software, SoC and Silicon Debug) conference in Southampton, UK, on September 15 and 16, 2010. The core concept presented is to leverage Simics checkpointing to capture and move a bug from the bug reporter to the responsible developer. It [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg"><img class="alignleft size-full wp-image-941" style="margin: 5px 10px;" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="" width="143" height="62" /></a>I have a paper about &#8220;Transporting Bugs with Checkpoints&#8221; to be presented at the <a href="http://www.ecsi.me/s4d">S4D (System, Software, SoC and Silicon Debug) conference </a>in Southampton, UK, on September 15 and 16, 2010. The core concept presented is to leverage <a href="http://www.windriver.com/products/simics/">Simics </a>checkpointing  to capture and move a bug from the bug reporter to the responsible  developer. It is a fairly simple idea, but getting it to work  efficiently does require that some things are done right. See the longer <a href="http://blogs.windriver.com/engblom/2010/08/transporting-bugs-with-checkpoints.html#more">Wind River blog posting </a>about this topic for a few more details.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1235/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Finally, a Bug!</title>
		<link>http://jakob.engbloms.se/archives/975?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/975#comments</comments>
		<pubDate>Sun, 25 Oct 2009 20:41:20 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded software]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[Linux kernel]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=975</guid>
		<description><![CDATA[Part of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like &#8220;let&#8217;s show how we unravel a randomly occurring hard-to-reproduce bug using Simics&#8221;. This then turns into a hard hunt for a program with a suitable bug [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/10/butterfly.png"><img class="alignleft size-full wp-image-982" title="butterfly" src="http://jakob.engbloms.se/wp-content/uploads/2009/10/butterfly.png" alt="butterfly" width="90" height="91" /></a>Part of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like &#8220;let&#8217;s show how we unravel a randomly occurring hard-to-reproduce bug using <a href="http://www.virtutech.com/products/simics_hindsight.html">Simics</a>&#8221;. The hard part then turns out to be the hunt for a program with a suitable bug in it&#8230; not the Simics tooling to resolve the bug. For some reason, when I most need bugs, I have a hard time getting them into my code.</p>
<p>I guess it is Murphy&#8217;s law &#8212; if you really set out to want a bug to show up in your code,  your code will stubbornly be perfect and refuse to break. If you set out to build a perfect piece of software, it will never work&#8230;</p>
<p>So I was actually quite happy a few weeks ago when I started to get random freezes in a test program I wrote to show multicore scaling. It was the perfect bug! It broke some demos that I wanted to have working, but fixing the code to make the other demos work was a very instructive lesson in multicore debug that would make for a nice demo in its own right. In the end, it managed to nicely illustrate some common wisdom about multicore software. It was not a trivial problem, fortunately.</p>
<p><span id="more-975"></span>First, some notes about the program. It is a producer-consumer system using pthreads, with a single producer thread feeding a variable number of compute threads with data, over a shared queue structure (a simple one that uses a single lock to protect it, making it not very scalable for small data messages and lots of workers).</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure-2.png"><img class="aligncenter size-full wp-image-980" title="program structure 2" src="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure-2.png" alt="program structure 2" width="411" height="237" /></a></p>
<p>The queue contains a circular buffer, managed using a standard set of full/empty/tail/head kinds of variables. There is also a flag &#8220;done&#8221; which is set once we are out of data, to tell the compute threads to shut down and terminate the program. As this program is used to demonstrate and test scaling, it is actually something that terminates. The main program spawns off all the threads, and then waits for all threads to finish before it terminates itself.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure.png"><img class="aligncenter size-full wp-image-981" title="program structure" src="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure.png" alt="program structure" width="300" height="458" /></a></p>
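<p>For readers who want something concrete to picture, a queue of this general shape could look like the following C sketch. It is not the actual code from my program: the struct layout, the fixed <code>QUEUE_SIZE</code>, the <code>int</code> payload, and the non-blocking put/get (which return <code>false</code> instead of waiting) are simplifying assumptions. It does use the same kinds of full/empty/tail/head/done variables, all protected by a single lock.</p>

```c
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_SIZE 4   /* small on purpose, for illustration */

/* Hypothetical layout; field names follow the description in the text. */
typedef struct {
    int             buf[QUEUE_SIZE];
    int             head;   /* index of the next element to get */
    int             tail;   /* index of the next free slot */
    bool            full;
    bool            empty;
    bool            done;   /* set when the producer is out of data */
    pthread_mutex_t lock;   /* the single lock protecting everything above */
} queue_t;

static void queue_init(queue_t *q) {
    q->head = q->tail = 0;
    q->full = false;
    q->empty = true;
    q->done = false;
    pthread_mutex_init(&q->lock, NULL);
}

/* Returns false (and stores nothing) if the queue is full. */
static bool queue_put(queue_t *q, int item) {
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (!q->full) {
        q->buf[q->tail] = item;
        q->tail = (q->tail + 1) % QUEUE_SIZE;
        q->full = (q->tail == q->head);
        q->empty = false;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Returns false (and leaves *item alone) if the queue is empty. */
static bool queue_get(queue_t *q, int *item) {
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (!q->empty) {
        *item = q->buf[q->head];
        q->head = (q->head + 1) % QUEUE_SIZE;
        q->empty = (q->head == q->tail);
        q->full = false;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Producer calls this once it has put the last item. */
static void queue_set_done(queue_t *q) {
    pthread_mutex_lock(&q->lock);
    q->done = true;
    pthread_mutex_unlock(&q->lock);
}

/* Workers check this when the queue is empty to decide whether to exit. */
static bool queue_is_done(queue_t *q) {
    pthread_mutex_lock(&q->lock);
    bool done = q->done;
    pthread_mutex_unlock(&q->lock);
    return done;
}
```

<p>Because every operation takes the same lock, producers and consumers serialize on it, which is the scalability limit mentioned above for small messages and many workers.</p>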
<p>This program and the queue subsystem had worked perfectly for a long time for me, running on an MPC8641 machine with a Linux 2.6.23 kernel, with 1 to 8 cores and 1 to 16 threads. Regardless of settings like thread counts, data sizes, number of packets to compute, it always ran smoothly and terminated.</p>
<p>However, the other week, I moved the program, the exact same binary even, over to a new software stack built on a Linux 2.6.27 kernel. Still on the same MPC8641 machine. Suddenly, I started to see occasional freezes where the program would never terminate. I added some more diagnostic printouts to the program, and saw that the main program would simply freeze waiting for the other threads to terminate and report in. The freezes had no real relationship to input variables. Maybe they were a bit more common with short packets, but no real pattern emerged. They also happened randomly: running the program with the same parameters a few times in a row would sometimes result in a freeze. After using control-C to quit and restarting, the new instance of the program would run fine. Doing some other demo work, I found the same effect on a P4080 machine with 8 cores and a 2.6.30 Linux kernel.</p>
<p>This is a common pattern for parallelism bugs: they only manifest themselves as actual visible crashes or freezes or bad computation results once something in the software stack has changed, even though the fundamental issues have been there all the time. In this case, I think it was the Linux scheduler, but it is really hard to tell. Just because a program runs fine today does not mean it will run fine tomorrow.</p>
<p>After deciding to finally sit down and turn this lemon into lemonade, I had to reproduce the error. Thankfully, that is easy when you have a simulator. The first few times I had to run the target program 20 times or so before hitting the issue, but with some parameter and timing variations I managed to create a script that would open a <a href="http://jakob.engbloms.se/archives/714">checkpoint</a>, and run the program a few times under script control, triggering the bug on the fourth run (every time, thanks to determinism).</p>
<p>To diagnose the problem I wrote some Simics script code that I actually felt was fairly cool. I guessed that the problem had something to do with the queue and its handling of &#8220;done&#8221;, since that is what told the threads to terminate.</p>
<p>The first problem was that the queue was not a global variable. Instead, it was dynamically allocated on the heap by a function, and a pointer to it was passed around but never stored in a global variable (a good computer science graduate never uses a global variable other than as the means of last resort). In the end, my script set a breakpoint on the line in the setup function that came after the allocation. With the program stopped at that point, I could read the local variable pointing to the queue, and find and store the addresses of all the interesting members of the structure.</p>
<p>The code looked like this (Simics CLI), for the record:</p>
<pre> $mbp = ($ctx.break ($st.pos (rule30_threaded.c:222)))
 $cpu = (wait-for-breakpoint $mbp)
 $pq_addr  = ($cpu.sym "pq")
 $pq_tail  = ($cpu.sym "&amp;(pq-&gt;tail)")
 $pq_empty = ($cpu.sym "&amp;(pq-&gt;empty)")
 $pq_full  = ($cpu.sym "&amp;(pq-&gt;full)")
 $pq_head  = ($cpu.sym "&amp;(pq-&gt;head)")
 $pq_done  = ($cpu.sym "&amp;(pq-&gt;done)")</pre>
<p>Next, I set breakpoints on all writes to empty, full, and done. This was the most expedient route to catch actual puts and gets to the queue. Breakpoints on the queue_put() and queue_get() functions do not really show the true flow, as these functions start by contending for the lock. Looking at writes to the actual queue members gave me the point where the tasks had grabbed the lock.</p>
<p>The script caught all writes to done, full, and empty, and on each write it dumped the state of the queue, including computing the number of elements in the circular buffer (without having to run any code on the target). To get an idea of who was active, it also used OS awareness to find the currently executing thread ID, and scripted debugging to convert the current program counter into a position in the program source code (actually, the important issue was the name of the function we were executing in).</p>
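<p>The element count can be derived purely from the queue members the script already reads. The following is a minimal sketch in C rather than the actual Simics script. It assumes that <code>head</code> is the index of the next element to get, that <code>tail</code> is the index of the next free slot, and that the buffer capacity is known; the type and function names are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the queue members, as read out from target memory. */
typedef struct {
    size_t head;      /* assumed: index of the next element to get */
    size_t tail;      /* assumed: index of the next free slot to put into */
    int    empty;     /* flag set by the queue code when it holds no elements */
    int    full;      /* flag set by the queue code when every slot is used */
    size_t capacity;  /* assumed: number of slots in the circular buffer */
} queue_snapshot_t;

/* Compute the number of queued elements without running any target code. */
size_t queue_snapshot_count(const queue_snapshot_t *s) {
    assert(s->capacity > 0);
    /* head == tail is ambiguous on its own, so the flags decide that case. */
    if (s->full) {
        return s->capacity;
    }
    if (s->empty) {
        return 0;
    }
    /* Adding capacity before subtracting keeps the unsigned math from wrapping. */
    return (s->tail + s->capacity - s->head) % s->capacity;
}
```

<p>With eight slots, a head of 6 and a tail of 1 gives three elements, since the buffer has wrapped around once.</p>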
<p>This trace of activity showed quite an interesting pair of patterns. When the program ran well, the queue was mostly full, and it looked like the producer task always got some kind of priority to fill it before consumers could get in and drain it. When the program froze, the queue was seldom more than a few elements deep. This was the same program, on the same kernel, just run a few milliseconds later.</p>
<p>Clearly, the Linux kernel can exhibit quite variable behavior even for a program this simple. I guess that&#8217;s why this is called &#8220;soft real time&#8221;&#8230; Another parallelism lesson here: the scheduler is very important, and a smart adaptive scheduler can wreak havoc with software that was accidentally tuned for a different scheduler.</p>
<p>In the end, the crucial hint was that whenever the program froze, the &#8220;done&#8221; flag was set with a queue that was empty or contained just a few elements. I was sure that I had handled this case in my code, checking specifically for that and making sure to wake up the other threads with a signal that &#8220;the queue is not empty any more, please come check for more work&#8221;&#8230; but looking closely at the code, it turned out the code only woke up a single thread. Thus, the freeze resulted from the producer setting &#8220;done&#8221; with an empty queue, waking up a single compute thread, and then having the other threads wait forever for more data to be put into the queue. The fix was easy: use a broadcast signal rather than a single signal.</p>
<p>In retrospect, it seems really strange that this ever worked reliably&#8230; I almost suspect the old Linux kernel of having a flawed pthreads implementation where signals always wake up all waiting threads, and not just a single one like the documentation says. But that will wait for another day to be investigated.</p>
<p>Here is the code, for reference:</p>
<pre>void rule30_packet_queue_signal_done(rule30_packet_queue_t *q) {
 //
 // Grab lock, set the done signal atomically
 //
 pthread_mutex_lock (&amp;(q-&gt;mutex));
 q-&gt;done = 1;
 pthread_mutex_unlock (&amp;(q-&gt;mutex));
 // Signal any threads waiting for data to wake up
 // and discover that we are indeed done
 //
 // This is the bug:
 // - It only wakes up one thread...
 pthread_cond_signal (&amp;(q-&gt;notEmpty));
 // To be correct:
 // pthread_cond_broadcast (&amp;(q-&gt;notEmpty));
}</pre>
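<p>To see the corrected behavior in isolation, here is a minimal self-contained sketch. It is not the rule30 program: <code>demo_queue_t</code>, <code>waiter</code>, <code>signal_done</code>, <code>run_done_demo</code> and the thread count are all made up for illustration. Several threads wait on an empty queue, and a single broadcast after setting &#8220;done&#8221; lets every one of them leave.</p>

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WAITERS 4  /* hypothetical number of compute threads */

/* Hypothetical stripped-down queue: only the members the shutdown path needs. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  notEmpty;
    int empty;
    int done;
    int exited;  /* how many waiters saw "done" and left */
} demo_queue_t;

/* Same wait loop as the compute threads: sleep while empty and not done. */
static void *waiter(void *arg) {
    demo_queue_t *q = (demo_queue_t *)arg;
    pthread_mutex_lock(&q->mutex);
    while (q->empty && !q->done) {
        pthread_cond_wait(&q->notEmpty, &q->mutex);
    }
    q->exited++;
    pthread_mutex_unlock(&q->mutex);
    return NULL;
}

/* Corrected shutdown: set the flag under the lock, then wake ALL waiters. */
static void signal_done(demo_queue_t *q) {
    pthread_mutex_lock(&q->mutex);
    q->done = 1;
    pthread_mutex_unlock(&q->mutex);
    pthread_cond_broadcast(&q->notEmpty);
}

/* Start the waiters on an empty queue, signal done, and count the exits. */
int run_done_demo(void) {
    demo_queue_t q;
    pthread_t threads[NUM_WAITERS];
    int i;

    pthread_mutex_init(&q.mutex, NULL);
    pthread_cond_init(&q.notEmpty, NULL);
    q.empty = 1;
    q.done = 0;
    q.exited = 0;

    for (i = 0; i < NUM_WAITERS; i++) {
        int rc = pthread_create(&threads[i], NULL, waiter, &q);
        assert(rc == 0);
        (void)rc;
    }
    signal_done(&q);
    for (i = 0; i < NUM_WAITERS; i++) {
        pthread_join(threads[i], NULL);
    }
    pthread_cond_destroy(&q.notEmpty);
    pthread_mutex_destroy(&q.mutex);
    return q.exited;
}
```

<p>Calling <code>run_done_demo()</code> returns once all waiters have been joined. Swapping the broadcast for <code>pthread_cond_signal</code> makes the outcome depend on how many threads had already reached the wait, which is exactly the timing sensitivity described above.</p>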
<p><em>Updated analysis:</em></p>
<p>My initial analysis was that when things worked, the &#8220;done&#8221; flag was set with enough data left in the queue that all threads had a chance to pull in data and come in and see the done flag being set.</p>
<p>However, today I went back and wrote a deeper analysis script that also checked for reads from the done flag (turning this check on only after the write to &#8216;done&#8217; to reduce the noise). I expected there to be a single reader when the freeze happened&#8230; but that was not the case. In my current test case, three out of five threads actually got in to read the done flag and terminate.  The crucial code for the compute threads looks like this:</p>
<pre> // Grab mutex,
 //   Check if the queue is empty, if so wait for someone
 //   to push something onto the queue, or signal done.
 //   both of which are done by setting the not_empty conditional variable
 pthread_mutex_lock (&amp;(queue-&gt;mutex));
 while ((queue-&gt;empty) &amp;&amp; !(queue-&gt;done)) {
   pthread_cond_wait (&amp;(queue-&gt;notEmpty), &amp;(queue-&gt;mutex));
 }</pre>
<p>To freeze, a thread actually has to be doing the conditional wait here. There are plenty of other places threads can be as the program is finishing. For example, they can be waiting to grab the initial mutex lock, or actually doing compute work. That explains why some threads actually still terminate even with the buggy version. It certainly also illustrates just how chaotic concurrent programs can be. More so than you can ever imagine, really.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/975/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The S4D Debug Conference</title>
		<link>http://jakob.engbloms.se/archives/942?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/942#comments</comments>
		<pubDate>Sun, 27 Sep 2009 19:38:27 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[FDL]]></category>
		<category><![CDATA[gdb]]></category>
		<category><![CDATA[Hardware debug support]]></category>
		<category><![CDATA[p4080]]></category>
		<category><![CDATA[S4D]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=942</guid>
		<description><![CDATA[An unplanned and unexpected bonus with my trip to the FDL 2009 conference was the co-located S4D conference. S4D means System, Software, SoC and Silicon Debug, and is a conference that has grown out of some recent workshops on the topic of debugging, as seen from the perspective of hardware designers (mostly). S4D was part [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-941" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="S4D" width="143" height="62" />An unplanned and unexpected bonus with my trip to the FDL 2009 conference was the co-located <a href="http://www.ecsi-association.org/ecsi/s4d/s4d09/mainpage.asp">S4D conference</a>. S4D means <em>System, Software, SoC and Silicon Debug</em>, and is a conference that has grown out of some recent workshops on the topic of debugging, as seen from the perspective of hardware designers (mostly). S4D was part of the same package as FDL and DASIP, entrance to one conference got you into the other two too. As I did not know about S4D until quite late in the process, this was a great opportunity for me to look at what they were doing.</p>
<p><span id="more-942"></span></p>
<p>It was sufficiently interesting that I spent all of Thursday in S4D rather than in FDL. It was really the first time that I have seen so many people working with practical embedded systems debug in the same room. Debug tends to be a topic at embedded systems conferences of various kinds, but then mostly from a fairly superficial technical perspective: assuming fairly simple software tools. Here, there were presentations on how current hardware debug is being extended to incorporate powerful trace and debug and synchronous stop facilities.</p>
<p>It was very interesting to see Infineon, ST, and ARM present their work in on-chip debug. Users at ST, Nokia and Continental presented their view of debug requirements, uses, and current home-grown tools. There were presentations from EDA vendors showing off debuggers for hardware designs and some virtual platforms tools for software debug. Freescale presented how their HyperTRK debug agent works with their P4080 hypervisor, covering the software-instrumentation approach. Debug tends to be a field neglected by academia, but there were some academic papers presented as well. <a href="http://sourceware.org/gdb/wiki/GDB_7.0_Release">gdb7</a>&#8217;s multi-threaded debug abilities were mentioned. Pretty much the only topic missing in action was reverse execution.</p>
<p>This mixed audience gave rise to quite a few interesting discussions during the day. It was simply fun, as far as I am concerned.</p>
<p>The following were the main themes addressed and discussed:</p>
<ul>
<li>How to make customers of silicon chips appreciate the on-chip debug and not just consider it an unnecessary cost that could be avoided if only their software engineers did not make any mistakes. Answer: sell it as a performance optimization tool instead.</li>
<li>Multicore debug, including hardware-supported tracing and synchronized stop of multiple cores on a single SoC.</li>
<li>Given that we have massive traces from hardware and software debug and trace facilities, how can we actually find errors? Processing of trace information to detect anomalies is going to be an important issue in the future.</li>
<li>Performance bugs are the next frontier, after current concerns with functionality bugs.</li>
</ul>
<p>If I were to take a critical look at the conference and its scope, there were some things that were not covered.</p>
<ul>
<li>System-level debug, outside the scope of a single SoC, was not in any talk.</li>
<li>Almost all the speakers and attendees came from the world of consumer electronics and automotive systems. It would have been nice to have some input from the long-established parallel world of servers and operating systems, such as Microsoft&#8217;s debugger teams. In a sense, this is the inverse of my complaint about the <a href="http://jakob.engbloms.se/archives/905">SiCS Multicore Day 2009</a>.</li>
<li>It would also have been nice to hear from compiler people involved in creating debug information, and how they deal with parallel programs.</li>
<li>Security vs debuggability, a <a href="http://jakob.engbloms.se/archives/799">favorite topic</a> of <a href="http://www.strombergson.com/kryptoblog/">Joachim Strömbergsson</a>. It would have been fun if Joachim had been there. I asked Rolf Kühnis from Nokia about <a href="http://www.mipi.org/">security in MIPI</a>, and he said that it simply was not in scope for MIPI: each manufacturer deals with it in their own way.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/942/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Simulation Determinism: Necessary or Evil?</title>
		<link>http://jakob.engbloms.se/archives/734?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/734#comments</comments>
		<pubDate>Sun, 19 Apr 2009 20:36:02 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[determinism]]></category>
		<category><![CDATA[multicore]]></category>
		<category><![CDATA[repeatability]]></category>
		<category><![CDATA[reverse execution]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[VMWare]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=734</guid>
		<description><![CDATA[In my series (well, I have one previous post about checkpointing) about misunderstood simulation technology items, the turn has come to the most difficult of all it seems: determinism. Determinism is often misunderstood as meaning &#8220;unchanging&#8221; or &#8220;constant&#8221; behavior of the simulation. People tend to assume that a deterministic simulation will not reveal errors due [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-735" style="margin-left: 10px; margin-right: 10px;" title="gears" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears.png" alt="gears" width="56" height="57" />In my series (well, I have one previous post about <a href="http://jakob.engbloms.se/archives/714"><em>checkpointing</em></a>) about misunderstood simulation technology items, the turn has come to the most difficult of all it seems: <em>determinism.</em> Determinism is often misunderstood as meaning &#8220;unchanging&#8221; or &#8220;constant&#8221; behavior of the simulation. People tend to assume that a deterministic simulation will not reveal errors due to nondeterministic behavior or races in the modeled system, which is a complete misunderstanding. Determinism is a necessary feature of any simulation system that wants to be really helpful to its users, not an evil that hides errors.</p>
<p><span id="more-734"></span></p>
<h2>What?</h2>
<p>Determinism really means this:</p>
<ul>
<li>Given a certain initial state</li>
<li>And a certain sequence of external inputs</li>
<li>The end result and state of the simulation will always be the same</li>
</ul>
<p>The key thing to note is that you need to require both the starting state and the sequence of external inputs to be the same in order to get the same result. If either of these changes, you may well get a different result. Implementing a deterministic simulator requires all internal events and activities in the simulator to be performed in the same order and at the same time in each simulation run. It means that the host computer environment state cannot be allowed to affect the simulator execution, and that in turn means that all sorting of internal events has to be done in defined orders in all instances.</p>
<p>I have a story about how hard that can be in practice. I once talked to some compiler developers who had the issue that when recompiling the same program with the same set of compiler options, the results might come out different, even on the same machine. The problem was that each run of the compiler was done in a different overall system state, and this might affect how the OS memory allocation functions allocated items in memory. It turned out that in some cases, the precise values of the <em>pointers </em>to the items in a complex data structure were used by standard libraries to handle iteration over nodes in the data structures. Thus, a different memory allocation pattern gave a different iteration order and a different traversal order of nodes, and in the end an almost arbitrarily different result. The correct solution they had to implement was to use a defined lexical ordering to traverse and iterate, not anything dependent on the state of the host machine. It is no different in a simulator: define the order of <em>everything</em>, in order to be deterministic.</p>
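<p>As an illustration of the principle (not the compiler developers&#8217; actual code), the sketch below orders a set of heap-allocated nodes by a lexical key instead of by their addresses. The <code>node_t</code> type and both function names are hypothetical, and names are assumed to be unique, since <code>qsort</code> is not guaranteed to be stable.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: the name is the only thing used for ordering. */
typedef struct {
    char name[16];
} node_t;

/* Compare two node pointers by name. The addresses themselves are never
 * compared, so the result does not depend on where the allocator put them. */
static int compare_by_name(const void *a, const void *b) {
    const node_t *na = *(const node_t *const *)a;
    const node_t *nb = *(const node_t *const *)b;
    return strcmp(na->name, nb->name);
}

/* Put an array of node pointers into a defined lexical order before
 * iterating over it. */
void sort_nodes_lexically(node_t **nodes, size_t count) {
    qsort(nodes, count, sizeof nodes[0], compare_by_name);
}
```

<p>Iterating over the array after <code>sort_nodes_lexically()</code> visits the nodes in the same order in every run, no matter what the host allocator did.</p>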
<h2>Why?</h2>
<p>The crucial benefit that determinism brings to a simulation in general and a virtual platform in particular is <em>repeatable debugging</em>. With determinism and an appropriate recording mechanism (and most practically <a href="http://jakob.engbloms.se/archives/714">checkpointing</a>) you can rely on being able to repeat a run resulting in a bug any number of times with the precise same sequence of events in the simulation. In particular, events visible to and relevant for the software running on the virtual platform occur in the same sequence and with the same timing, including timing relative to the instructions executed. Especially for multicore and parallel computing systems this is incredibly powerful, and something that just cannot be achieved on physical hardware (due to its inherent randomness and chaotic behavior, see my 2006 and 2007 ESC Silicon Valley talks for more on this, at my <a href="http://www.engbloms.se/jakob_publications.html">publications </a>and <a href="http://www.engbloms.se/jakob_presentations.html">presentations </a>pages).</p>
<p>If you assume stability of the simulation infrastructure and the simulation platform, determinism also makes debugging the simulation itself easier. Often, a bug in a simulation model is repeatable, and with determinism, it is easy to repeat the same external stimulus sequence to the module and debug it repeatably.</p>
<p>Determinism also makes it easy to detect change in the behavior of a simulation: if the same simulation setup results in a different result or final simulation state, you know something in the setup (model, model parameters, or software) changed. There is no randomness that causes changes without some fundamental parameter being changed. Such boring reliable behavior is generally exactly what you want when testing and debugging large, complex systems.</p>
<p>Obviously, once determinism becomes a requirement, missing determinism in a model is a bug in itself &#8212; and finding such bugs can certainly be interesting exercises.</p>
<h2>Why Not?</h2>
<p>Just like for checkpointing, one reason not to do determinism is that it is hard, as discussed above.</p>
<p>The most common reason that people claim to want to avoid determinism is that they want to explore alternatives within their simulation. Basically, there is a need for <em>variability </em>that would seem to be at odds with determinism. The typical argument is that &#8220;if my simulation model contains a non-deterministic choice, I want the simulation to expose that and not just make the same decision every time&#8221;. This is where determinism tends to be considered <em>evil</em>. However, this argument is not correct.</p>
<p>If we take the case that at some point P in a simulation run there are two different events <em>E</em> and <em>F</em> that can fire (since they are both posted to the same point in virtual time), a deterministic simulator will always select one and the same. This is necessary to reap the system-level benefits discussed above. However, nothing prevents us from programming a change from this behavior into our system explicitly, <em>introducing controlled and repeatable variation. </em>In such a setup, we will have a random decision being made in each simulation run, but one where the outcome in any particular run can be repeated by setting the same random seed parameter.</p>
<p>This brings the best of both worlds: variation to expose issues where there is potential non-determinism or lack of synchronization in the model, and perfect repeatability of the issues this poses in terms of target software and simulation system behavior. In general, two events being ready at the same point in time can be considered a sign of lacking synchronization in the model, and such a randomizer of behavior will expose that at several different levels. But uncontrolled randomness is not the answer.</p>
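<p>A minimal sketch of such controlled variation: a tiny seeded generator decides which of several simultaneously ready events fires first. The <code>tie_breaker_*</code> names are hypothetical and this is not how any particular simulator does it. The generator is a plain linear congruential one using the Numerical Recipes constants, picked here only because it behaves identically on every host.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tie-breaker state: a single 32-bit generator state. */
typedef struct {
    uint32_t state;
} tie_breaker_t;

/* The seed is an explicit simulation parameter: same seed, same choices. */
void tie_breaker_init(tie_breaker_t *tb, uint32_t seed) {
    tb->state = seed;
}

/* Linear congruential step; uint32_t arithmetic wraps modulo 2^32. */
static uint32_t tie_breaker_next(tie_breaker_t *tb) {
    tb->state = tb->state * 1664525u + 1013904223u;
    return tb->state;
}

/* Choose which of `count` events posted to the same point in virtual time
 * fires first. Returns an index in the range [0, count). */
unsigned tie_breaker_pick(tie_breaker_t *tb, unsigned count) {
    assert(count > 0);
    /* Use the upper bits; the low bits of an LCG have short periods. */
    return (unsigned)((tie_breaker_next(tb) >> 16) % count);
}
```

<p>Running the simulation with a different seed explores a different interleaving of <em>E</em> and <em>F</em>; running it again with the same seed reproduces exactly the same choices.</p>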
<p>Another common misconception is that at a higher level, determinism in a virtual platform means that target software will always run in the same way. That is not true, and misses the importance of state in the deterministic behavior equation. If the initial state when a program starts is different, a different execution will result. If software is run on top of any non-trivial operating system, there is plenty of such variation. In one of our simplest Simics demos, we show this by running an intentionally buggy race-condition-ridden program. Each time it is run, it hits a different number of race conditions. But thanks to determinism (best demoed using reverse execution), we can repeat each run perfectly.</p>
<p>Thus, determinism is not equal to constant behavior or lack of variation.</p>
<h2>The reverse argument</h2>
<p>Finally, determinism is the simplest way to implement reverse execution: if you have recording, determinism, and checkpointing, you can easily virtually reverse the execution by going back to a checkpoint and replaying the execution from that point. If you stop one instruction before the current instruction, you have in essence stepped backwards one step in time. This is how both VMware and Simics implement reverse execution and debugging. And it could not happen without determinism.</p>
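<p>The idea can be sketched with a toy simulator whose whole state fits in a small struct. Everything here is hypothetical (<code>sim_state_t</code>, <code>sim_step</code>, <code>sim_run_to</code>, <code>sim_reverse_step</code>), and the toy has no external inputs, so no recording is needed. A real system would also have to replay the recorded inputs at the right points.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy simulator state; a checkpoint is simply a copy of it. */
typedef struct {
    uint64_t steps;  /* number of steps ("instructions") executed so far */
    uint32_t value;  /* stand-in for the entire target machine state */
} sim_state_t;

/* One deterministic step: the new state depends only on the old state. */
void sim_step(sim_state_t *s) {
    s->value = s->value * 1664525u + 1013904223u;
    s->steps++;
}

/* Run forward until the given step count has been reached. */
void sim_run_to(sim_state_t *s, uint64_t target_steps) {
    while (s->steps < target_steps) {
        sim_step(s);
    }
}

/* Step backwards once: restore an earlier checkpoint, then replay forward
 * and stop one step short of where we were. */
void sim_reverse_step(sim_state_t *s, const sim_state_t *checkpoint) {
    uint64_t target;
    assert(checkpoint->steps < s->steps);
    target = s->steps - 1;
    *s = *checkpoint;
    sim_run_to(s, target);
}
```

<p>Because <code>sim_step()</code> is deterministic, the state reached by replaying is the very state the simulation passed through the first time, which is what makes the &#8220;reverse&#8221; step trustworthy.</p>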
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/734/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pulling the Virtual Ethernet Plug</title>
		<link>http://jakob.engbloms.se/archives/248?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/248#comments</comments>
		<pubDate>Wed, 27 Aug 2008 09:25:00 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[ACM Queue]]></category>
		<category><![CDATA[bryan cantrill]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[ethernet]]></category>
		<category><![CDATA[fault injection]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=248</guid>
		<description><![CDATA[I just read the panel interview at the start of the latest issue (Number 4, 2008) of ACM Queue. Here, you have Bryan Cantrill of Sun (the man behind dTrace) bemoan the difficulty of testing faults. In particular: Part of the reason I&#8217;m interested in virtualization is as a development methodology. It has not delivered [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-249" style="margin: 5px 10px;" title="acm-queue-logo" src="http://jakob.engbloms.se/wp-content/uploads/2008/08/acm-queue-logo.gif" alt="" width="87" height="51" />I just read the <a href="http://mags.acm.org/queue/20080708/?pg=10&amp;pm=2">panel interview </a>at the start of <a href="http://mags.acm.org/queue/20080708/">the latest issue (Number 4, 2008) of ACM Queue</a>. Here, you have <a href="http://mags.acm.org/queue/20080708/?pg=14&amp;pm=2">Bryan Cantrill of Sun </a>(the man behind dTrace) bemoan the difficulty of testing faults. In particular:</p>
<blockquote><p>Part of the reason I&#8217;m interested in virtualization is as a development methodology. It has not delivered on this, but one of the things that I ask is can I use virtualization to automate someone pulling the Ethernet cable out of the jack? I can get a lot closer to simulating it if you let me create a toy virtual machine than I can running on the live machine.</p></blockquote>
<p>Well, <a href="http://www.virtutech.com/solutions/virtual_platform">this already exists</a>. It is a common feature of any virtual platform that is not a datacenter-oriented runtime engine like VMware, Xen, LPAR, and their ilk. Doing fault injection is a primary use case for virtual platforms, especially for larger servers and systems featuring redundancy and fault tolerance.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/248/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>EETimes Article on Multicore Debug</title>
		<link>http://jakob.engbloms.se/archives/154?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/154#comments</comments>
		<pubDate>Sun, 20 Jul 2008 08:35:05 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[articles]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[EETimes]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[multicore]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=154</guid>
		<description><![CDATA[I have another short technical piece published about Multicore Debug at the EETimes (and their network of related publications, like Embedded.com). Pretty short piece, and they cut out some bits to make it fit their format. Nothing new to fans of virtual platforms for software development, basically we can use virtual platforms to reintroduce control [...]]]></description>
			<content:encoded><![CDATA[<p><img class="size-medium wp-image-155 alignleft" style="margin: 10px;" title="eetimes logo" src="http://jakob.engbloms.se/wp-content/uploads/2008/07/eetimes.png" alt="" width="127" height="56" />I have another short technical piece published about <a href="http://www.eetimes.com/news/design/showArticle.jhtml?articleID=209100262">Multicore Debug at the EETimes </a>(and their network of related publications, like <a href="http://www.embedded.com/design/209101250">Embedded.com</a>). It is a pretty short piece, and they cut out some bits to make it fit their format. There is nothing new here for fans of virtual platforms for software development: basically, we can use virtual platforms to reintroduce control over parallel and, for all practical purposes, chaotic hardware/software systems.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/154/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Simulation is Better than Barr &amp; Massa Says</title>
		<link>http://jakob.engbloms.se/archives/94?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/94#comments</comments>
		<pubDate>Wed, 02 Apr 2008 18:51:00 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[books]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[software tools]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/archives/94</guid>
		<description><![CDATA[In the book &#8220;Programming Embedded Systems: with C and GNU Development Tools&#8221;, authors Michael Barr and Anthony Massa make some statements on simulation that I just have to disagree with on principle. Read on for the details. Note that overall this is a good book, and I am not claiming otherwise. The Amazon [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://jakob.engbloms.se/wp-content/uploads/2008/04/citation-cover.thumbnail.jpeg" alt="Cover of Programming Embedded Systems by Barr and Massa" hspace="10" vspace="10" align="left" />In the book &#8220;<a href="http://www.amazon.com/Programming-Embedded-Systems-Development-Tools/dp/0596009836">Programming Embedded Systems: with C and GNU Development Tools</a>&#8221;, authors Michael Barr and Anthony Massa make some statements on simulation that I just have to disagree with on principle. Read on for the details. Note that overall this is a good book, and I am not claiming otherwise. The Amazon reviews are pretty good, and having a foreword by <a href="http://www.ganssle.com">Jack Ganssle</a> is always a sign of quality. But I just have to correct them on one little fact&#8230;</p>
<p><span id="more-94"></span><br />
In a section on &#8220;other development tools&#8221;, there is the following piece on simulation:</p>
<p style="text-align: center"><a title="Cite about simulation not full and useful" href="http://jakob.engbloms.se/wp-content/uploads/2008/04/citation-1.jpeg"><img src="http://jakob.engbloms.se/wp-content/uploads/2008/04/citation-1.jpeg" alt="Cite about simulation not full and useful" /></a></p>
<p>I just had to scan the page to make it look more real, right?</p>
<p>Anyhow, the critical bit is at the end:</p>
<blockquote><p>By far, the biggest disadvantage of a simulator is that it simulates only the processor.</p></blockquote>
<p>That is simply not true. Not today. Note that this is the second edition of the book, and that when the first edition was written the marketplace looked different. Today, vendors like <a href="http://www.virtutech.com">Virtutech</a>, CoWare, Synopsys, VaST, and ARM routinely provide <a href="http://en.wikipedia.org/wiki/Full_system_simulator">full-system simulators</a> that simulate much more than just the core processor. These virtual platforms contain memory, peripherals, and often connections to the outside world or simulations of the environment in which an embedded system operates. For more on this, please read <a href="http://www.embedded.com/columns/technicalinsights/199702616">the piece I wrote for embedded.com last year</a>, or come to my talk at the <a href="http://jakob.engbloms.se/archives/75">ESC SV this year</a> (due in two weeks).</p>
<p>And next we have:</p>
<blockquote><p>So you probably won&#8217;t do too much with the simulator  once the actual embedded hardware is available.</p></blockquote>
<p>This is not generally true either. Very often, a virtual hardware platform is used to develop software for a system even after hardware appears &#8212; simply because it is more convenient, available, and capable. It really is.</p>
<p>But finally, here is a quote that I like, finishing off this short section in their book:</p>
<p align="center"><a title="Second quote from the book" href="http://jakob.engbloms.se/wp-content/uploads/2008/04/citation-2.jpeg"><img src="http://jakob.engbloms.se/wp-content/uploads/2008/04/citation-2.jpeg" alt="Second quote from the book" /></a></p>
<p>That is a very good tip: use the simulator and compare notes with reality. It is a very powerful technique, and it works even better with a full-system simulator: there you can trace the actual interaction between the virtual hardware and the real software, and see whether either party is not following the data book. On hardware, you would have no chance to see that, as you usually cannot attach a trace unit to the device bus of an integrated chip.</p>
<p>I guess that was all I had on this topic. Barr and Massa are a bit wrong, but I guess it is mostly because they have not seen what can be done in virtualization and simulation today. Which means we in the business have more education and marketing to do to raise awareness of what tools can do today.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/94/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hardware Debug Support &amp; LinuxLink PodCast</title>
		<link>http://jakob.engbloms.se/archives/39?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/39#comments</comments>
		<pubDate>Sun, 14 Oct 2007 19:56:31 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer architecture]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[podcast commentary]]></category>
		<category><![CDATA[software tools]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/archives/39</guid>
		<description><![CDATA[The TimeSys Embedded Linux Podcast (also called LinuxLink Radio) is a nice listen about embedded computing using Linux. Sometimes they are a bit too open-source centric, though, and ignore very good tools that live in the classic commercial world. One such example is the recent episode 20 on debugging tools, where they totally ignore modern [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.timesys.com/services/podcast.htm">TimeSys Embedded Linux Podcast (also called LinuxLink Radio)</a> is a nice listen about embedded computing using Linux. Sometimes they are a bit too open-source centric, though, and ignore very good tools that live in the classic commercial world. One such example is the recent episode 20 on debugging tools, where they totally ignore modern high-powered hardware-based debugging.</p>
<p><span id="more-39"></span><br />
They do talk about the use of <a href="http://en.wikipedia.org/wiki/Joint_Test_Action_Group">JTAG for debugging</a> and the old <a href="http://en.wikipedia.org/wiki/In-circuit_emulator">ICE systems</a>, but miss the modern trend towards much more powerful on-chip debug hardware. Especially interesting today is the use of three technologies:</p>
<ul>
<li>Better on-chip support, like ARM&#8217;s Embedded Trace Macrocell and the recent and quite advanced <a href="http://www.arm.com/products/solutions/CoreSight.html">CoreSight</a> system, gives much better insight into system execution thanks to specialized buses and buffers for debug information.</li>
<li>On-chip debug logic that makes it possible for the processors and logic on a chip to break on complex conditions and across different processor cores without involving the host debugger in the decision loop.</li>
<li>Huge trace buffers that can capture several seconds&#8217; worth of execution traces, and smart tools that use the data offline for performance analysis, debugging, and reverse debugging.</li>
</ul>
<p>All of these are available from commercial vendors like ARM, Green Hills, and Wind River Systems, but there is no really good open-source support. That is probably because the systems in question are fairly rare, and open-source tends to provide good support for the mainstream use case and technology and very poor support for everything else. Also, the reverse debuggers are usually tied to a particular trace system or debug agent, since reversibility is not part of any standard debug protocol (yet; there are several different attempts to introduce it to gdb for various backends). Finally, if you buy a very expensive piece of debug hardware, the cost of the software to use with it does not really matter.</p>
<p>So thanks to the great power of hardware trace and large trace buffers, and contrary to the opinions in the podcast, I actually believe that cross-debugging using hardware support (or, even better, virtual hardware) is a very good tool for application-level debug once the application and the hardware platform get sufficiently complex. You really want that nice non-intrusive debug experience, rather than affecting your target machine with a debug agent or, even worse, running the debugger on the same machine as the code you are debugging.</p>
<p>You do need a debugger that is aware of virtual memory and the tasks running on the target system, but that is not that hard to do. Freescale&#8217;s CodeWarrior and Wind River Workbench both do this for hardware-assisted debug of Linux and VxWorks targets. We at Virtutech have also done it using virtual hardware.</p>
<p>Just to clarify: as I have noted earlier, I still think that even the most ambitious hardware-debug approaches in the market today do not go far enough for multicore processors. For quickly getting to reasonable performance on a multicore platform, I think reducing peak performance by replacing performance-enhancing hardware with debug- and tuning-enhancing hardware makes perfect sense. But that is a tangent.</p>
<p>So what is the final take here?</p>
<ul>
<li>Hardware debug rocks.</li>
<li>Virtual hardware debug also rocks, often much more.</li>
<li>Remote/cross-debug tools should be used more, not less.</li>
<li>Someone needs to package remote/cross-debug so that even PC types want to use it for their &#8220;native&#8221; applications.</li>
<li>Commercial software development tools are often ahead of the open-source tools (but not always; Valgrind is a good example of an outstanding open-source solution).</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/39/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

