<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala &#187; virtual platforms</title>
	<atom:link href="http://jakob.engbloms.se/archives/category/virtual/virtual-platforms/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Technology: Simulation, Virtualization, Virtual Platforms, Embedded, Multicore and Multiprocessing (by Jakob Engblom)</description>
	<lastBuildDate>Tue, 27 Jul 2010 19:57:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>Wind River Blog: Simics Analyzer</title>
		<link>http://jakob.engbloms.se/archives/1137?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1137#comments</comments>
		<pubDate>Wed, 26 May 2010 19:40:42 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded software]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Wind River]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1137</guid>
		<description><![CDATA[I have a new blog post up at the Wind River blog network, about the new target analysis tools in Simics 4.4. It is a very fun piece of technology to play with, and you learn a lot just by poking around at existing software systems&#8230;]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png"><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="button-quicklink-blogs" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" /></a>I have a <a href="http://blogs.windriver.com/engblom/2010/05/analyzed.html">new blog post </a>up at the Wind River blog network, about the new target analysis tools in Simics 4.4. It is a very fun piece of technology to play with, and you learn a lot just by poking around at existing software systems&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1137/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>First Blog at Wind River!</title>
		<link>http://jakob.engbloms.se/archives/1121?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1121#comments</comments>
		<pubDate>Thu, 29 Apr 2010 19:14:03 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[articles]]></category>
		<category><![CDATA[blogging]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[Wind River]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1121</guid>
		<description><![CDATA[One of the many nice effects of the Wind River acquisition of Simics is that I will be blogging as part of the Wind River Blog network. My first post there is up now, and it is a short (at least compared to a textbook, I admit it looks terribly long for a blog post) [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png"><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="button-quicklink-blogs" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" /></a>One of the many nice effects of the Wind River acquisition of Simics is that I will be blogging as part of the Wind River Blog network. <a href="http://blogs.windriver.com/engblom/2010/04/what_is_simics_really.html">My first post there is up now</a>, and it is a short (at least compared to a textbook, I admit it looks terribly long for a blog post) overview of how Simics works inside.</p>
<p>I think it is important for users of technologically advanced tools to know a bit of how they work. A classic example of this is compilers, where I taught an ESC class almost a decade ago which is my most <a href="http://jakob.engbloms.se/archives/750">popular piece of writing to date</a>&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1121/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FFast: Good Idea, Too Bad About the Implementation</title>
		<link>http://jakob.engbloms.se/archives/1114?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1114#comments</comments>
		<pubDate>Sun, 11 Apr 2010 19:23:29 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[Antoine Trouvé]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[CLR]]></category>
		<category><![CDATA[FFast]]></category>
		<category><![CDATA[Kazuaki Murakami]]></category>
		<category><![CDATA[RAPIDO]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1114</guid>
		<description><![CDATA[I just read a short paper by Antoine Trouvé and Kazuaki Murakami from the RAPIDO 2010 workshop on &#8220;rapid simulation and performance evaluation&#8221;. The paper is &#8220;FFast: Efficient Application of Compiled Simulation Techniques To A Fast ISS Over a Virtual Machine&#8221;. It explores the interesting idea of how an existing virtual machine infrastructure can be [...]]]></description>
			<content:encoded><![CDATA[<p>I just read a short paper by Antoine Trouvé and Kazuaki Murakami from the <a href="http://www2.lifl.fr/rapido/Rapido/Program.html">RAPIDO 2010</a> workshop on &#8220;rapid simulation and performance evaluation&#8221;. The paper is &#8220;FFast: Efficient Application of Compiled Simulation Techniques To A Fast ISS Over a Virtual Machine&#8221;. It explores the interesting idea of how an existing virtual machine infrastructure can be used to build a fast instruction-set simulator, and in the extension, a full system simulator.</p>
<p>To me, this idea is worth exploring, since using a mature VM like the .net CLR (used in this paper) or a JVM would offer a shortcut to get high-quality code generation for a JIT compiler. It could also offer other benefits, as these environments support many advanced configuration and management features. I have touched on this topic before, in the posts &#8220;<a href="http://jakob.engbloms.se/archives/1008">Dream ESL Language</a>&#8221; (VM as the basis for a simulator) and &#8220;<a href="http://jakob.engbloms.se/archives/264">The JVM as Universal Parallel Glue</a>&#8221; (that a common VM can  offer huge benefits for an ecosystem).</p>
<p><span id="more-1114"></span></p>
<p>In the paper, the authors show how they have built an ISS for MIPS which runs at 1 MIPS in basic interpretive mode, but at up to 225 MIPS in the most optimized mode. Decent performance on a 2.6 GHz Core 2, but still an order of magnitude compared to the fastest commercial offerings available. However, I think these numbers are not particularly interesting or relevant.</p>
<p>First of all, they only check the performance on basic user-level programs with no I/O, since there is nothing but the CPU present and they thus cannot run an operating system. This makes the numbers essentially &#8220;peak&#8221; numbers, for small programs, which is not particularly realistic. Second, their implementation does not go straight to bytecodes, but rather to C# code. This is not how a high-performance solution would work, as it is obvious that they struggle to get performance even on these small benchmarks using that approach. Too much effort seems to be spent on gaming the C# runtime system and compiler, in my opinion.</p>
<p>Thus, the paper does not really reveal anything useful in terms of &#8220;is building a JIT ISS using a VM a viable idea&#8221;? It probably tells us that it is not necessarily a broken idea, but there is a lot of work to bring the solution up to the level of native C-based solutions.</p>
<p>It is clear that the implementation effort in this case is lower, and that the porting cost to new hosts is also very low, compared to the native C-based approaches used in current industrial solutions. C# should also be more productive than using something like C++ for building a software system.</p>
<p>The most interesting aspect of the idea, and one which the authors do not explore at all, is using the power of the .net CLR to build a dynamic full-system simulator. Using the CLR, it should be trivial to build a solution where hardware models can be separately compiled and loaded dynamically at runtime. Using .net &#8220;properties&#8221;, it might be possible to support user inspection of a running system. Maybe the .net programming tools offer some really interesting possibilities for the debugging of full-system simulators. However, none of this is currently explored, which is a real shame. I guess I could hope that the authors read this short critique  and get some more ideas for future work, as I really think that virtual platforms could be built on top of virtual machines.  That idea is worth exploring in research.</p>
<p>On the nitpicking side, as always when reviewing academic papers in this blog, the authors seem to be unaware of some very relevant previous work. In particular, they should have mentioned Qemu and Simics. They could have used Qemu for MIPS as a point of comparison to compare speeds between their approach and a native C approach. As it is right now, the reference list looks like a fairly random walk around the DAC and DATE communities, but with little insight into the actual virtual platform or full-system simulation tools available today.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1114/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Describe is not the same as Design</title>
		<link>http://jakob.engbloms.se/archives/1083?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1083#comments</comments>
		<pubDate>Mon, 15 Feb 2010 20:56:41 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESL]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[DML]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1083</guid>
		<description><![CDATA[The discussion on my previous blog post about &#8220;the ideal ESL language&#8221; made me think some more about the purpose of a hardware modeling or description language. If you look closely, you realize that there are two quite different goals being pursued by the tools and languages discussed there. On one hand, we have the [...]]]></description>
			<content:encoded><![CDATA[<p>The discussion on my previous blog post about &#8220;<a href="http://jakob.engbloms.se/archives/1008">the ideal ESL language</a>&#8221; made me think some more about the purpose of a hardware modeling or description language. If you look closely, you realize that there are two quite different goals being pursued by the tools and languages discussed there.</p>
<p>On one hand, we have the task of supporting the design of new hardware bits, for the purpose of creating it. On the other hand, we have the task of describing a particular design for the purpose of simulating it. These two are not necessarily the same.</p>
<p><span id="more-1083"></span>To use an <a href="http://jakob.engbloms.se/archives/1035">analogy with building a house</a>, a design language helps the architect create the house (piece of hardware). Since the architect relies on craftsmen and experts (compilers) to do detailed design (how to put in windows, where to put light switches, etc.), the high-level description does not contain all the details of the house. However, if you are trying to simulate the house (piece of hardware) so that its inhabitants (software) don&#8217;t see the difference to the real thing, the details are sometimes what matters most. For example, the precise way to operate the stove in the house is very important for familiarity, but is a detail most likely left out of the architect&#8217;s initial drawings.</p>
<p>A design language can leave many things unspecified to be filled in by a compiler, but these things can be absolutely core to a description language. In particular, programming register maps tend to be created as a not-too-important side activity in hardware design. They do not really need to be visible in higher-level ESL languages, as they can obviously be filled in later by a tool or a human. But for a description language, they are absolutely core.</p>
<p>A description language can also leave out many parts of the hardware. If the software being used or written does not use certain modes or functions of a piece of hardware, those pieces can be ignored and implemented as dummies. That means that support for dummies is very important in description languages. But dummies make little sense in a design language, as you are unlikely to design a chip with lots of area spent on dummy functions that do nothing.</p>
<p>A description language can also ignore crucial aspects like power constraints and synthesis constraints. These are guidelines for a compilation step that has no bearing on the description of the hardware &#8212; the description language should describe what ended up happening, not the if, please, what, and buts that guided how we got there.</p>
<p>For virtual platform creation, you seem to need a bit of both. I maintain that most of a VP is based on old hardware that exists, which calls for languages with strong description abilities. That&#8217;s the space that <a href="http://jakob.engbloms.se/archives/99">Simics DML </a>was designed for. For the small part of the hardware that is novel would be nice to have some way to convert from a design language to a virtual platform. Here, I don&#8217;t really see any usable current tools or languages &#8212; SystemC is really more a design language, but if you want a virtual platform model, you have to use it as a description language. There is no automagic getting to a fast abstract model from a design-oriented description. That&#8217;s why we need new, higher level systems, that can push out decent descriptions from a design.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1083/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The System, Not the Parts</title>
		<link>http://jakob.engbloms.se/archives/1035?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1035#comments</comments>
		<pubDate>Sat, 19 Dec 2009 19:38:22 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[ESL]]></category>
		<category><![CDATA[business issues]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Peter Day]]></category>
		<category><![CDATA[podcast commentary]]></category>
		<category><![CDATA[Russel Ackoff]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1035</guid>
		<description><![CDATA[I just listened to the November 16, 2009, issue of the BBC podcast called &#8220;Peter Day&#8217;s World of Business&#8220;. It is a rerun (in memoriam) of an interview with business professor Russell Ackoff, which was originally published in 2007. The main theme of the interview is the need to shift business thinking from small details [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/12/300x300.jpg"><img class="alignleft size-full wp-image-1036" style="margin: 10px;" title="Peter days world of business" src="http://jakob.engbloms.se/wp-content/uploads/2009/12/300x300.jpg" alt="Peter days world of business" width="100" height="100" /></a>I just listened to the November 16, 2009, issue of the BBC podcast called &#8220;<a href="http://www.bbc.co.uk/podcasts/series/worldbiz/">Peter Day&#8217;s World of Business</a>&#8220;. It is a rerun (in memoriam) of an interview with business professor Russell Ackoff, which was <a href="http://news.bbc.co.uk/2/hi/business/6338527.stm ">originally published in 2007</a>.</p>
<p><span id="more-1035"></span></p>
<p>The main theme of the interview is the need to shift business thinking from small details to entire systems. From operations research where you spend lots of time understanding some process or department in great detail, to a system-level thinking where you focus on what an entire enterprise is doing.</p>
<p>For me, this struck a chord in my system-level heart&#8230; in my world of computer systems and virtual platforms, system-level is what it is so hard to get engineers to. Far too much time is spent (in my opinion) understanding, modeling, and tweaking subsystems. Far too little effort is spent on understanding the whole, how things fit together in practice, taking software, hardware, and software system evolution over time into account. The analogy is not perfect, but there are more things that are alike than are not.</p>
<p>The most interesting analysis that Russell Ackoff fires off from his perspective is that of comparing companies and architecture. An architect knows the whole of a building, but does not entirely go into details on just how it is to be built. He/she trusts the carpenters, bricklayers, and other workers to know how best to solve their local problems. Basically, applying hierarchical abstraction to the task of constructing an actual building.</p>
<p>This got me thinking some of why this is the case. I think it could be because building things (castles, cathedrals, houses, walls, pyramids, canals, &#8230;) must have been among the most complex tasks undertaken for a very long time in human history. Thanks to this long history, we have perfected the abstraction and division of labor in construction. Buildings are built in a certain way, by a certain set of crafts, since that method has been proven to work well for a very long time. So just like in the case of the design patterns craze in the late 1990&#8242;s, architecture might have something to teach us about how to build hardware/software systems too.</p>
<p>Note that for some reason, I cannot find a link to the podcast on the BBC homepage. But if you subscribe in iTunes or similar, I think you will find it. Something is not as user-friendly as it could be.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1035/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Dream ESL Language</title>
		<link>http://jakob.engbloms.se/archives/1008?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1008#comments</comments>
		<pubDate>Fri, 27 Nov 2009 19:51:27 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[FDL]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1008</guid>
		<description><![CDATA[This post is a belated comment on the FDL 2009 conference that I attended some months ago. I have had some things in mind for a while, but some recent podcast listening has brought the issues to front again. What has been striking is the extent to which FDL was about languages only to a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/08/fdllogosmall.jpg"><img class="alignleft size-full wp-image-881" style="margin: 5px 10px;" title="fdllogosmall" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/fdllogosmall.jpg" alt="fdllogosmall" width="80" height="79" /></a> This post is a belated comment on the FDL 2009 conference that I attended some months ago. I have had some things in mind for a while, but some recent podcast listening has brought the issues to front again. What has been striking is the extent to which FDL was about <em>languages </em>only to a very small degree. Compared to programming-language conferences like PLDI, there was precious little innovation going on in input languages, and very little concern for the <em>programming </em>aspects of virtual platform design and hardware modeling.</p>
<p><span id="more-1008"></span>Walking to and from the conference from my hotel, I listened through a <a href="http://www.twit.tv/floss79">FLOSS Weekly interview </a>with David Heinemeier Hanson, the creator of Ruby on Rails. His approach to programming and languages was quite unlike that exposed at FDL. In his world, anything that is repeated in code should be put into the language or library. In Ruby, that is easier than in many other cases, as the language can be extended arbitrarily without recompiling the VM. His focus on programmer productivity and convenience is in stark contrast to the FDL discussions which mostly dealt with how to simulate things in a single language, SystemC. Quite boring from a programming language perspective.</p>
<p>Another podcast that triggered thoughts on programming and how to improve it using languages was <a href="http://itc.conversationsnetwork.org/shows/detail4291.html">Stackoverflow Episode 73. </a>In the listener questions section, the topic of language evolution came up. Joel and Jeff pointed out that C# is a glowing example of a language that quickly evolves and adds useful features, including things from the field of dynamic languages. Quite interesting. They made the crucial point that backwards compatibility in a language is not really needed, as long as you can link code compiled from the old and the new languages together. So, if C# 3.0 won&#8217;t compile all C# 2.0 code, it is no big deal, as you can still have the old C# 2.0 compiler around, and then link with the new C# 3.0 code.</p>
<p>The key is linkability between modules, not the standard of the input language. Here, Microsoft&#8217;s .net system is starting to make a very impressive showing, I think. C#, VB, F#, Python, Ruby &#8212; a ton of languages all share the same common language runtime and the basic libraries of .net. After hearing a talk by Tim Harris of Microsoft UK at <a href="http://www.it.uu.se/research/upmarc/MCC09/prog">MCC 2009</a>, I am even more impressed by what .net can do.</p>
<p>.net was also the topic of the <a href="http://www.twit.tv/floss82">FLOSS Weekly interview with the team behind IronPython</a>. IronPython is Python on top of the .net framework, and the interview went into a lot of interesting details on how that has played out. The short answer is: very impressive, very smart, and very much the way things should be.</p>
<p>Note that even if the perspective is that &#8220;ESL languages describe a single hardware chip configuration, which is fixed&#8221;, having a language which is more dynamic still helps.Remember that modeling is programming, and anything that makes programming more efficient is a good. All you need to do is to have a &#8220;freeze&#8221; operation that says that &#8220;this particular set of things is my design&#8221;. But you might get there by interactively adding and removing things at a command-line interface.</p>
<p>Working in OSCI CCI WG, I have come to realize just how useful reflection in languages like Python is (or as we implement it in Simics). When all you have is a static C++ compiled binary, you cannot easily do things like ask objects for their type and other metadata like documentation. Since it just is not there. While in Python, you can do such inspection, and also extend things at run-time, which is very useful. If you want to add configuration hooks to a class, Python makes it dead easy, while C++ makes it major painful.</p>
<h2>The Dream ESL Language</h2>
<p>Overall, what that I get from all of this that a sound design for an &#8220;ESL&#8221; language, had we started today, would be:</p>
<ul>
<li>Basic semantics given by a virtual machine, not an input language.</li>
<li>Opportunities for several different input languages of potentially very different styles to be used, all compiling and linking into the same VM. That would open up for real innovation.</li>
<li>Extensive reflection and introspection features.</li>
<li>Dynamic reconfiguration during run-time, optionally frozen if the goal is to actually describe some hardware design for synthesis. But such synthesis would be from VM code, not some input language.</li>
</ul>
<p>Essentially, taking the approach of providing a stable interoperability layer between languages in the form of a VM, and allowing languages to be anything anyone could care to invent.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1008/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Finally, a Bug!</title>
		<link>http://jakob.engbloms.se/archives/975?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/975#comments</comments>
		<pubDate>Sun, 25 Oct 2009 20:41:20 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded software]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[Linux kernel]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=975</guid>
		<description><![CDATA[Part of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like &#8220;let&#8217;s show how we unravel a randomly occurring hard-to-reproduce bug using Simics&#8220;. This then turns into a hard hunt for a program with a suitable bug [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/10/butterfly.png"><img class="alignleft size-full wp-image-982" title="butterfly" src="http://jakob.engbloms.se/wp-content/uploads/2009/10/butterfly.png" alt="butterfly" width="90" height="91" /></a>Part of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like &#8220;let&#8217;s show how we unravel a randomly occurring hard-to-reproduce bug using <a href="http://www.virtutech.com/products/simics_hindsight.html">Simics</a>&#8220;. This then turns into a hard hunt for a program with a suitable bug in it&#8230; not the Simics tooling to resolve the bug. For some reason, when I best need bugs, I have hard time getting them into my code.</p>
<p>I guess it is Murphy&#8217;s law &#8212; if you really set out to want a bug to show up in your code,  your code will stubbornly be perfect and refuse to break. If you set out to build a perfect piece of software, it will never work&#8230;</p>
<p>So I was actually quite happy a few weeks ago when I started to get random freezes in a test program I wrote to show multicore scaling. It was the perfect bug! It broke some demos that I wanted to have working, but fixing the code to make the other demos work was a very instructive lesson in multicore debug that would make for a nice demo in its own right. In the end, it managed to nicely illustrate some common wisdom about multicore software. It was not a trivial problem, fortunately.</p>
<p><span id="more-975"></span>First, some notes about the program. It is a producer-consumer system using pthreads, with a single producer thread feeding a variable number of compute threads with data, over a shared queue structure (a simple one that uses a single lock to protect it, making it not very scalable for small data messages and lots of workers).</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure-2.png"><img class="aligncenter size-full wp-image-980" title="program structure 2" src="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure-2.png" alt="program structure 2" width="411" height="237" /></a></p>
<p>The queue contains a circular buffer, managed using a standard set of full/empty/tail/head kinds of variables. There is also a flag &#8220;done&#8221; which is set once we are out of data, to tell the compute threads to shut down and terminate the program. As this program is used to demonstrate and test scaling, it is actually something that terminates. The main program spawns off all the threads, and then waits for all threads to finish before it terminates itself.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure.png"><img class="aligncenter size-full wp-image-981" title="program structure" src="http://jakob.engbloms.se/wp-content/uploads/2009/10/program-structure.png" alt="program structure" width="300" height="458" /></a></p>
<p>This program and the queue subsystem had worked perfectly for a long time for me, running on an MPC8641 machine with a Linux 2.6.23 kernel, with 1 to 8 cores and 1 to 16 threads. Regardless of settings like thread counts, data sizes, number of packets to compute, it always ran smoothly and terminated.</p>
<p>However, the other week, I moved the program, the exact same binary even, over to a new software stack built on a Linux 2.6.27 kernel. Still on the same MPC8641 machine. Suddenly, I started to see occasional freezes where the program would never terminate. I added some more diagnostic printouts to the program, and saw that the main program would simply freeze waiting for the other threads to terminate and report in. The freezes had no real relationship to input variables. Maybe they were a bit more common with short packets, but no real pattern emerged. They also happened randomly, running the program with the same parameters for a few times in a row would sometimes result in a freeze. Using control-C to quit it and restart would keep the new instance of program running well. Doing some other demo work, I found the same effect on a P4080 machine with 8 cores and a 2.6.30 Linux kernel.</p>
<p>This is a common pattern for parallelism bugs: they only manifest themselves as actual visible crashes or freezes or bad computation results once something in the software stack has changed, even though the fundamental issues have been there all the time. In this case, I think it was the Linux scheduler, but it is really hard to tell. Just because a program runs fine today it does not have to run fine tomorrow.</p>
<p>After deciding to finally sit down and turn this lemon into lemonade, I had to reproduce the error. Thankfully, that is easy when you have a simulator. The first few times I had to run the target program 20 times or so before hitting the issue, but with some parameter and timing variations I managed to create a script that would open a <a href="http://jakob.engbloms.se/archives/714">checkpoint</a>, and run the program a few times under script control, triggering the bug on the fourth run (every time, thanks to determinism).</p>
<p>To diagnose the problem I wrote some Simics script code that I actually felt was fairly cool. I guessed that the problem had something to do with the queue and its handling of &#8220;done&#8221;, since that is what told the threads to terminate.</p>
<p>The first problem was that the queue was not a global variable. Instead, it was dynamically allocated on the heap by a function, and a pointer passed around, but never stored in a global variable (a good computer science graduate never uses a global variable other than as the means of last resort). Finally, my script set a breakpoint on the line in the setup function that came after the allocation. With the program stopped at that point, I could read the local variable pointing to the queue, and find and store the addresses of all the interesting members of the structure.</p>
<p>The code looked like this (Simics CLI), for the record:</p>
<pre> $mbp = ($ctx.break ($st.pos (rule30_threaded.c:222)))
 $cpu = (wait-for-breakpoint $mbp)
 $pq_addr  = ($cpu.sym "pq")
 $pq_tail  = ($cpu.sym "&amp;(pq-&gt;tail)")
 $pq_empty = ($cpu.sym "&amp;(pq-&gt;empty)")
 $pq_full  = ($cpu.sym "&amp;(pq-&gt;full)")
 $pq_head  = ($cpu.sym "&amp;(pq-&gt;head)")
 $pq_done  = ($cpu.sym "&amp;(pq-&gt;done)")</pre>
<p>Next, I set breakpoints on all writes to empty, full, and done. This was the most expedient route to catch actual puts and gets to the queue. Breakpoints on the queue_put() and queue_get() functions are not really showing the true flow, as these functions start by contending for the lock. Looking at writes to the actual queue members gave me the point where the tasks had grabbed the lock.</p>
<p>The script that caught all writes to done, full, and empty, and on each write, it dumped the state of the queue including computing out the number of elements in the circular buffer (without having to run any code on the target). To get an idea for who was active, it also used OS awareness to find the currently executing thread ID, and scripted debugging to convert the current program counter into a position in the program source code (actually, the important issue was the name of the function we were executing in).</p>
<p>This trace of activity showed quite an interesting pair of patterns. When the program ran well, the queue was mostly full, and it looked like the producer task always got some kind of priority to fill it before consumers could get in and drain it. When the program froze, the queue was seldom more than a few elements deep. This was the same program, on the same kernel, just run a few milliseconds later.</p>
<p>Clearly, the Linux kernel can exhibit quite variable behavior even for a program this simple. I guess that&#8217;s why this is called &#8220;soft real time&#8221;&#8230; Another parallelism lesson here: the scheduler is very important, and a smart adaptive scheduler can wreak havoc with software that was accidentally tuned for a different scheduler.</p>
<p>In the end, the crucial hint was that whenever the program froze, the &#8220;done&#8221; flag was set with a queue that was empty or contained just a few elements. I was sure that I had handled this case in my code, checking specifically for that and making sure to wake up the other threads with a signal that &#8220;the queue is not empty any more, please come check for more work&#8221;&#8230; but looking closely at the code, it turned out the code only woke up a single thread. Thus, the froze resulted from the producer setting &#8220;done&#8221; with an empty queue, waking up a single compute thread, and then having the other threads wait forever for more data to be put into the queue. The fix was easy: use a broadcast signal rather than a single signal.</p>
<p>In retrospect, it seems really strange that this ever worked reliably&#8230; it almost that I suspect the old Linux kernel of having a flawed pthreads implementation where signals always wake up all waiting threads, and not just a single one like the documentation says. But that will wait for another day to be investigated.</p>
<p>Here is the code, for reference:</p>
<pre>void rule30_packet_queue_signal_done(rule30_packet_queue_t *q) {
 //
 // Grab lock, set the done signal atomically
 //
 pthread_mutex_lock (&amp;(q-&gt;mutex));
 q-&gt;done = 1;
 pthread_mutex_unlock (&amp;(q-&gt;mutex));
 // Signal any threads waiting for data to wake up
 // and discover that we are indeed done
 //
 // This is the bug:
 // - It only wakes up one thread...
 pthread_cond_signal (&amp;(q-&gt;notEmpty));
 // To be correct:
 // pthread_cond_broadcast (&amp;(q-&gt;notEmpty));
}</pre>
<p><em>Updated analysis:</em></p>
<p>My initial analysis was that when things worked, the &#8220;done&#8221; flag was set with enough data left in the queue that all threads had a chance to pull in data and come in and see the done flag being set.</p>
<p>However, today I went back and wrote a deeper analysis script that also checked for reads from the done flag (turning this check on only after the write to &#8216;done&#8217; to reduce the noise). I expected there to be a single reader when the freeze happened&#8230; but that was not the case. In my current test case, three out of five threads actually got in to read the done flag and terminate.  The crucial code for the compute threads looks like this:</p>
<pre> // Grab mutex,
 //   Check if the queue is empty, if so wait for someone
 //   to push something onto the queue, or signal done.
 //   both of which are done by setting the not_empty conditional variable
 pthread_mutex_lock (&amp;(queue-&gt;mutex));
 while ((queue-&gt;empty) &amp;&amp; !(queue-&gt;done)) {
   pthread_cond_wait (&amp;(queue-&gt;notEmpty), &amp;(queue-&gt;mutex));
 }</pre>
<p>To freeze, a thread actually has to be doing the conditional wait here. There are plenty of other places threads can be as the program is finishing. For example, they can be waiting to grab the initial mutex lock, or actually doing compute work. That explains why some threads actually still terminate even with the buggy version. It certainly also illustrates just how chaotic concurrent programs can be. More so that you can ever imagine, really.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/975/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The S4D Debug Conference</title>
		<link>http://jakob.engbloms.se/archives/942?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/942#comments</comments>
		<pubDate>Sun, 27 Sep 2009 19:38:27 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[appearances]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[FDL]]></category>
		<category><![CDATA[gdb]]></category>
		<category><![CDATA[Hardware debug support]]></category>
		<category><![CDATA[p4080]]></category>
		<category><![CDATA[S4D]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=942</guid>
		<description><![CDATA[An unplanned and unexpected bonus with my trip to the FDL 2009 conference was the co-located S4D conference. S4D means System, Software, SoC and Silicon Debug, and is a conference that has grown out of some recent workshops on the topic of debugging, as seen from the perspective of hardware designers (mostly). S4D was part [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-941" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="S4D" width="143" height="62" />An unplanned and unexpected bonus with my trip to the FDL 2009 conference was the co-located <a href="http://www.ecsi-association.org/ecsi/s4d/s4d09/mainpage.asp">S4D conference</a>. S4D means <em>System, Software, SoC and Silicon Debug</em>, and is a conference that has grown out of some recent workshops on the topic of debugging, as seen from the perspective of hardware designers (mostly). S4D was part of the same package as FDL and DASIP, entrance to one conference got you into the other two too. As I did not know about S4D until quite late in the process, this was a great opportunity for me to look at what they were doing.</p>
<p><span id="more-942"></span></p>
<p>It was sufficiently interesting that I spent all of Thursday in S4D rather than in  FDL. It was really the first time that I have seen so many people working with practical embedded systems debug in the same room. Debug tends to be a topic at embedded systems conferences of various kinds, but then mostly from a fairly superficial technical perspective: assuming fairly simple software tools. Here,  there were presentations on how current hardware debug is being extended to incorporate powerful trace and debug and synchronous stop facilities.</p>
<p>It was very interesting to see Infineon, ST, and ARM present their work in on-chip debug. Users at ST, Nokia and Continental presented their view of debug requirements, uses, and current home-grown tools. There were presentations from EDA vendors showing off debuggers for hardware designs and some virtual platforms tools for software debug. Freescale presented how their HyperTRK debug agent works with their P4080 hypervisor, covering the software-instrumentation approach. Debug tends to be a field neglected by academia, but there were some academic papers presented as well. <a href="http://sourceware.org/gdb/wiki/GDB_7.0_Release">gdb7</a>&#8216;s multi-threaded debug abilities were mentioned. Pretty much the only topic missing in action was reverse execution.</p>
<p>This mixed audience gave rise to quite a few interesting discussions during the day. It was simple fun, as far as I am concerned.</p>
<p>The following were the main themes addressed and discussed:</p>
<ul>
<li>How to make customers of silicon chips appreciate the on-chip debug and not just consider it an unnecessary cost that could be avoided if only their software engineers did not make any mistakes. Answer: sell it as a performance optimization tool instead.</li>
<li>Multicore debug, including hardware-supported tracing and synchronized stop of multiple cores on a single SoC.</li>
<li>Given that we have massive traces from hardware and software debug and trace facilities, how can we actually find errors? Processing of trace information to detect anomalies is going to be an important issue in the future.</li>
<li>Performance bugs are the next frontier, after current concerns with functionality bugs.</li>
</ul>
<p>If I were to take a critical look at the conference and its scope, there were some things that were not covered.</p>
<ul>
<li>System-level debug, outside the scope of a single SoC, was not in any talk.</li>
<li>Almost all the speakers and attendees came from the world of consumer electronics and automotive systems. It would have been nice with some input from long-time parallel world of servers and operating systems, such as Microsoft&#8217;s debugger teams.  In a sense, this is the inverse of my complaint about the <a href="http://jakob.engbloms.se/archives/905">SiCS Multicore Day 2009</a>.</li>
<li>As well as compiler people involved in creating debug information and how they deal with parallel programs.</li>
<li>Security vs debuggability, a <a href="http://jakob.engbloms.se/archives/799">favorite topic </a>of <a href="http://www.strombergson.com/kryptoblog/">Joachim Strömbergsson</a>. It would have been fun if Joachim would have been there. I asked Rolf Kühnis from Nokia about <a href="http://www.mipi.org/">security in MIPI</a>, and he said that it simply was not in scope for MIPI: each manufacturer deals with it in their own way.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/942/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>FDL Impressions</title>
		<link>http://jakob.engbloms.se/archives/936?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/936#comments</comments>
		<pubDate>Thu, 24 Sep 2009 07:24:42 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[appearances]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[FDL]]></category>
		<category><![CDATA[Peter Flake]]></category>
		<category><![CDATA[SystemC]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=936</guid>
		<description><![CDATA[This is end of the second day of FDL 2009, and it is proving to be quite an interesting experience. The location is very bad, apart from the weather (coming from a Swedish Fall where temperatures are dropping towards 10 C, to a sunny 27 C is quite nice). But Sophia Antipolis is just a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/08/fdllogosmall.jpg"><img class="alignleft size-full wp-image-881" style="margin: 5px 10px;" title="fdllogosmall" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/fdllogosmall.jpg" alt="fdllogosmall" width="80" height="79" /></a>This is end of the second day of <a href="http://www.ecsi-association.org/ecsi/fdl/fdl09/mainpage.asp">FDL 2009</a>, and it is proving to be quite an interesting experience. The location is very bad, apart from the weather (coming from a Swedish Fall where temperatures are dropping towards 10 C, to a sunny 27 C is quite nice). But Sophia Antipolis is just a tech park with some hotels, and you cannot get anywhere interesting or civilized without a car. No shops, no restaurants except for hotels, and so sidewalks in parts.</p>
<p>But the conference is good enough to be worth the bodily discomforts. And I did find a nice Parcours Sportif for the morning run, as well as a nice breakfast buffet at the Mercure Hotel.</p>
<p><span id="more-936"></span>So what were the highlights and themes of FDL?</p>
<ul>
<li>SystemC is literally everywhere, it is really the only simulation kernel that researchers are using. Often not so much for hardware simulation, as rather for general simulation of timed concurrent processes. Not exactly what it was designed for&#8230;</li>
<li>There is a lot of work on bridging abstraction levels and using multiple levels of timing detail for different purposes. That is a nice change from a tradition of &#8220;everything has to be cycle accurate&#8221; that tended to come out of hardware design in previous years.</li>
<li>Architecture exploration is big, as always.</li>
<li>Validity of virtual platforms and models keep coming up, some people are really too concerned about precise agreement with hardware. In practice, it does not matter than much if it is only 95% correct and 90% complete, as the software will work well enough anyway for the platform to be useful&#8230; but that is a hard message for hardware people to accept.</li>
<li>ST-Ericsson&#8217;s ex-NXP local office gave a couple of interesting presentation of how they were using SystemC. For one of the groups, they had an interesting confusion between &#8220;SystemC&#8221; and &#8220;Virtual Platforms&#8221;. They could not quite keep the language and application of it apart, which is indicative of the language-centricity of hardware designers in general. They did not even equate it with their tool, which would have been logical (they are using CoWare).</li>
<li>Peter Flake made some really good points and asked good questions in almost every presentation session. I definitely respect his deep understanding .</li>
</ul>
<p>I presented my talk on SystemC and Checkpointing, and it was quite interesting to hear the questions. The two main themes of my presentation was the explicit conversion from internal state to the external state held in a checkpoint, and the necessity to not use threads to enable decent checkpointing. The threading discussion continued for a quite a while&#8230; and led to some interesting observations.</p>
<p>It seems that most people can accept the idea of abandoning threads in SystemC for hardware modeling&#8211; but not for software modeling <img src='http://jakob.engbloms.se/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . Essentially, considering a hardware unit as an event-driven state machine is fairly natural and easy to understand. But when people try to model software behavior (directly in a SystemC model, not using an ISS to run the real code), they tend to think that threads are more natural and easy. However, for a typical software development use-case for a virtual platform you will run software on an ISS. I think we might have a useful generally acceptable design point for modeling coming up here, with hardware modeled as event-driven blocks that can be checkpointed and controlled by the simulator.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/936/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Freescale P4080, in Physical Form</title>
		<link>http://jakob.engbloms.se/archives/933?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/933#comments</comments>
		<pubDate>Thu, 17 Sep 2009 10:16:37 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[DWF]]></category>
		<category><![CDATA[freescale]]></category>
		<category><![CDATA[heterogeneous]]></category>
		<category><![CDATA[homogeneous]]></category>
		<category><![CDATA[Jonas Svennebring]]></category>
		<category><![CDATA[MPC5606]]></category>
		<category><![CDATA[p4080]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=933</guid>
		<description><![CDATA[Past Tuesday, I attended the Freescale Design With Freescale (DWF) one-day technology event in Kista, Stockholm. This is a small-scale version of the big Freescale Technology Forum, and featured four tracks of talks running from the morning into the afternoon. All very technical, aimed at designing engineers. There were several topic areas, such as automotive, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/08/freescale-logo-icon.png"><img class="alignleft size-full wp-image-878" style="margin: 5px 10px;" title="freescale-logo-icon" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/freescale-logo-icon.png" alt="freescale-logo-icon" width="80" height="80" /></a>Past Tuesday, I attended the Freescale Design With Freescale (DWF) one-day technology event in Kista, Stockholm. This is a small-scale version of the big Freescale Technology Forum, and featured four tracks of talks running from the morning into the afternoon. All very technical, aimed at designing engineers.</p>
<p><span id="more-933"></span>There were several topic areas, such as automotive, consumer, and networking. Networking was mostly focused on the issues of multicore hardware and software.</p>
<p>Of particular interest to me was to see a <a href="http://www.freescale.com/webapp/sps/site/overview.jsp?nodeId=0162468rH3bTdG25E4">Freescale QorIQ P4080 </a>8-core networking/control-plane processor live for the first time. This chip was <a href="http://jakob.engbloms.se/archives/137">announced in the Summer of 2008</a>, with a full ecosystem of software support thanks to <a href="http://www.virtutech.com/qoriq">Virtutech Simics</a>. Now, when the silicon is here, software is indeed running on it thanks to the long headstart development got with the virtual platform. Note that several demos at the event used the Simics simulator to show the software support for the P4080, as there was only a single chip to go around.</p>
<p>I would have loved to have a meaningful picture of the first P4080 in Europe, but  a chip is not really very photogenic &#8211; the P4080 processor was in an open computer case, but covered with a 10 cm-high heat sink which made it fairly hard to actually see. That&#8217;s the challenge with infrastructure things: they are not designed to be seen&#8230; just to do their job well. If you have a new consumer electronics processor, you can at least drive a screen quickly or something. But watching 28 Gbps of Ethernet traffic is not as easy <img src='http://jakob.engbloms.se/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Jonas Svennebring of Freescale gave a good talk about how the process of bringup on the P4080 had worked out. It was a total validation of the methodology of using virtual platforms, at different levels of abstraction, and slipping in a bit of hardware emulation as well.</p>
<p>Freescale started software development on the functional fast model, and when clock-cycle-level detailed models of subsystems became available, they started using them as well for performance validation for small pieces of code. Any discrepancies in behavior between the two models was then used to correct the models and documentation. Finally, as the RTL for the silicon began to become available, they used a few emulation setups to run parts of the actual RTL (the emulator could only handle a subset of the entire chip), and validate the performance numbers in the detailed model and the behavior of both models. In the end, when the first silicon became available, Linux was up in a very short time (I cannot give the exact number, but it was a matter of days rather than weeks).</p>
<p>This is the typical iterative process that all chip designers are implementing today: using virtual platforms you can get a head start on development of software, and then as more details become available, you tune models and update both designs, models, and software, iterating towards a hardware/software combination that just works once the silicon realization of the hardware comes around.</p>
<p>So that was all cool.</p>
<p>Jonas also showed a die photo of the QorIQ, and that confirmed by opinion from the <a href="http://jakob.engbloms.se/archives/905">SiCS Multicore Day</a>: embedded multicore is not just about processor cores and cache, it is very much about accelerators to help offload repetitive work from the processing cores. More than half the chip was such acceleration logic! To me, this is a clear confirmation that heterogeneity is the future of hardware design, and a useful way to spend hundreds of millions of transistors to boost SoC performance.</p>
<p>The same was true for most other Freescale hardware showcased at the event. For example, there was the <a href="http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC560xS">MPC5606S dashboard processor</a>, running an LCD display with lots of dynamic graphics with 0.2% CPU load on a 60 MHz e200 Power Architecture processor. All the work was done by its display driver and accelerator. It is hard to argue with that kind of efficiency. That chip did not need a heatsink, either. It was just mounted on the back of an example board with no need for any external logic chips. Apparently, it could also have moved some physical gauges and blinked LEDs, but that demo was considered too distracting for this particular setting.</p>
<p>I also gave a talk at the DWF, about debugging software on multicore using virtual platforms. That was fun, as always. Need to get out more on the road and talk in conferences, I think <img src='http://jakob.engbloms.se/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/933/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Another Layer of Virtual Indirection</title>
		<link>http://jakob.engbloms.se/archives/893?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/893#comments</comments>
		<pubDate>Sun, 23 Aug 2009 19:41:06 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[ethernet]]></category>
		<category><![CDATA[indirection]]></category>
		<category><![CDATA[networking]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=893</guid>
		<description><![CDATA[After a long break, this is another blog post in the series of &#8220;how to do modeling for virtual platforms&#8221;. The previous installments dealt with checkpointing and determinism. This post is about the use of indirection in a model to increase its flexibility and ease of use, at the cost of a bit more work [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-486" title="gears-modeling" src="http://jakob.engbloms.se/wp-content/uploads/2008/12/gears-modeling.png" alt="gears-modeling" width="62" height="65" />After a long break, this is another blog post in the series of &#8220;how to do modeling for virtual platforms&#8221;. The previous installments dealt with <a href="http://jakob.engbloms.se/archives/714">checkpointing </a>and <a href="http://jakob.engbloms.se/archives/734">determinism</a>.</p>
<p>This post is about the use of <strong>indirection </strong>in a model to increase its flexibility and ease of use, at the cost of a bit more work for the first model to be created.In particular, indirection in the sense of having explicit objects in a simulation to represent things like networks and cables connecting virtual machines.</p>
<p><span id="more-893"></span>There is a well-known saying (by <a href="http://en.wikipedia.org/wiki/David_Wheeler_%28computer_scientist%29">David Wheeler</a>) that &#8220;any problem in computer science can be solved with another                                 layer of indirection&#8221;. Among computer architects, this is often used with addition &#8220;&#8230;or a cache&#8221;. I think this is true, most of the time. The number of times that adding some indirection to an architecture for a program has simplified it &#8212; or made it feasible at all &#8212; are too many to count. It is at the very core of object-oriented programming, and the number of times you end up passing around function pointers is innumerable.</p>
<p>In the world of virtual platforms, there is one particular area where I see a pretty useful layer of indirection missing. Networks. Many virtual platform solutions offer various ways of connecting a virtual platform to a physical world, for interfaces like USB, Ethernet, or serial. Most virtual platforms achieve this by making the virtual hardware directly connect to the outside world.</p>
<p>Here is an illustration for Ethernet, where I have included a PHY in the picture. Quite often, you don&#8217;t even get that, just an Ethernet device that includes its PHY and connects out to a physical network. That&#8217;s what Qemu tends to do, for example.</p>
<p><img class="aligncenter size-full wp-image-895" title="No indirection" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/No-indirection.png" alt="No indirection" width="248" height="311" />This approach is the simplest when your thinking is that you will model a single device, and simulate one virtual machine at a time, connecting to the physical network to receive stimulus. For USB, this means the useful feature of connecting a camera or USB disk on your PC to the virtual machine. And as a bonus, you can connect multiple machines together using some form of cross-connection on the PC (such as TAP network interface).</p>
<p>However, there is a much better structure that is employed in some simulators. It is based on making each network an explicit object in the simulation, and have all virtual devices talk to the virtual network. Connections to the physical world are then handled by the virtual network, or, even better, by another device attached to the same virtual network.What you also get is the ability to connect multiple virtual devices to each other over the virtual network, and to easily write simulation modules that inspect or do fault-injection on the network traffic.</p>
<p>The picture below illustrates the idea for Ethernet:</p>
<p><img class="aligncenter size-full wp-image-896" title="indirection" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/indirection.png" alt="indirection" width="369" height="356" />The cost of this architecture is that you have to create the virtual network object, and invent the interface between devices and the network. This increases the cost for the first network device you create, and if all  you are tasked with is that single device, I can see why some simulation designers took the direct route. However, if you think about the task of creating tens of devices connecting to the same type of network, the &#8220;cost&#8221; of creating a virtual network is actually negative. Using an indirect approach like this makes creating each device simpler, and each device immediately gets the benefit of all the services that have been added to the virtual network. As long as a device can connect to the virtual network, it can connect to the physical network without any extra coding or cost.</p>
<p>Encapsulating entire networks with multiple virtual machines within a single simulation session <a href="http://www.virtutech.com/whitepapers/networking.html">is also very beneficial for control, inspection, and determinism. </a>Relying on a physical connection between virtual machines makes all packets pass the unreliable and random real world on their way between machines, destroying any determinism or control you might have hoped to incur.</p>
<p>In the world of SystemC simulation, an indirect approach like this is also a way to overcome some silly language limitations. Unbelievable as it might sound to the uninitiated, in SystemC you set up a simulation once into a single static setup (in something called the elaboration phase), and then that is what you simulate. There is no option to setup connection between modules or even add new modules to the simulation after the initial setup. Here, you can use a layer of indirection as a work-around. At the  start of simulation, connect all devices that might at some point in time be connected to a particular network to that network. During simulation, configure and reconfigure the network module to only allow traffic from and to certain modules, essentially creating a useful illusion that they are connected and disconnected from the network.</p>
<p>I hope I have convinced you: if you ever build a virtual platform, make sure to make all connections indirect.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/893/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Toast to Abstraction Layers</title>
		<link>http://jakob.engbloms.se/archives/888?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/888#comments</comments>
		<pubDate>Thu, 13 Aug 2009 19:41:47 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[gadgets]]></category>
		<category><![CDATA[general research]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[abstraction]]></category>
		<category><![CDATA[abstraction levels]]></category>
		<category><![CDATA[DAC 2009]]></category>
		<category><![CDATA[information hiding]]></category>
		<category><![CDATA[TheToasterProject]]></category>
		<category><![CDATA[Thomas Thwaites]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=888</guid>
		<description><![CDATA[I just found &#8220;The Toaster Project&#8220;, a Royal College of Art project where Thomas Twaites built a simple toaster from scratch. Really from scratch, going all they way back to iron ore and raw petroleum. In the process, he had to smelt ore, create plastic from petroleum, etc. It is a very interesting observation about [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-890" style="margin: 10px 5px;" title="toaster" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/toaster.png" alt="toaster" width="81" height="87" />I just found &#8220;<a href="http://www.thomasthwaites.com/thomas/toaster/page2.htm">The Toaster Project</a>&#8220;, a Royal College of Art project where <a href="http://www.thomasthwaites.com/">Thomas Twaites </a>built a simple toaster from scratch. Really from scratch, going all they way back to iron ore and raw petroleum. In the process, he had to smelt ore, create plastic from petroleum, etc. It is a very interesting observation about the immense industrial complexity behind the very simple everyday items of our lives. I also think it has something to tell us computer scientists about abstraction.</p>
<p><span id="more-888"></span>What Thomas is showing is just how efficiently today&#8217;s economy manages to hide complexity from consumers (users). That toaster is just there on the shelf at a very low cost. If you take it apart, you will note that it is made from plastic which has been moulded to shape, and various bits and pieces of steel and copper wires. At that level, you feel that you could almost build it yourself. However, that is just the tip of the iceberg. What the toaster project reveals is the next level of abstraction and information hiding going on: that copper wire contains an enormously complex process in its making. From ore extraction, energy production to fuel the process, copper foundries, factories converting raw copper into wires, and a huge logistical machine to move things around.</p>
<p>In essence, we have a very nice example of information hiding and abstraction. As a user of the toaster, I do not need to understand how it works, and I do definitely not have any idea of the huge chain of suppliers leading up to its presence on my breakfast table.</p>
<p>That&#8217;s where are going with computers, but it is going to take time. Today, most users are fairly well shielded from how computers really work. Until they break down, at least. As programmers, we are less lucky. In practice, most good programmers end up understanding at least the basics of assembly language and the memory hierarchy of the machine.</p>
<p>What is hidden today is mostly the innards of the silicon. I have no real idea of how a processor works at the level of transistors and electrons. I don&#8217;t have to care about that, while any computer user fifty years ago probably had a decent understanding of the electronics. If nothing else, that was how you investigated hardware faults and actually built computers in a factory. Before integrated circuits, the electronic bits were much more exposed.</p>
<p>I think the current trend towards virtualization in the IT space and virtual platforms in the system design space is showing that the abstraction stack we are using in computing is getting deeper and more opaque. It takes some getting used to, but in the end, we have to realize that most computer programmers will be like the toaster user. All they want is a virtual toaster that toasts virtual bread in a way that lets them do their job.That is: write software that really does not care that much about the particulars of the hardware it is running on.</p>
<p>For the designer of a toaster (or even worse, the manufacturer of copper wire or the oil producer for the raw materials for the plastics), this takes some getting used to. We have to accept that in many cases, a simple abstraction is sufficient to help programmers get moving. There is no need for perfect timing accuracy or all the details of bus transactions. As long as what comes out is sufficiently similar to toast (a virtual toaster spitting out candy would be a bad abstraction), most users are happy.</p>
<p>Brian Bailey touched on this in a blog post following DAC, called &#8220;<a href="http://www.chipdesignmag.com/bailey/2009/07/30/accuracy-does-not-imply-accuracy/">Accuracy does not imply accuracy</a>&#8220;. Same idea as the toaster: you have to accept less detail, more abstraction, to get somewhere useful. Not everyone needs to go back to basics&#8230; and doing so tends to be counter productive in the end.</p>
<p>It is late now, but I think I will have toast and jam for breakfast tomorrow. Writing this got me hungry.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/888/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Checkpointing in SystemC @ FDL</title>
		<link>http://jakob.engbloms.se/archives/880?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/880#comments</comments>
		<pubDate>Sat, 08 Aug 2009 19:48:26 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[ESL]]></category>
		<category><![CDATA[appearances]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[FDL]]></category>
		<category><![CDATA[GreenSocs]]></category>
		<category><![CDATA[Marius Monton]]></category>
		<category><![CDATA[Mark Burton]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[SystemC]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=880</guid>
		<description><![CDATA[Along with Marius Monton and Mark Burton of GreenSocs, I will be presenting a paper on checkpointing and SystemC at the FDL, Forum on Specification and Design Languages, in late September 2009. The paper will explain how we did Simics-style checkpointing in SystemC, using the GreenSocs GreenConfig mechanisms to obtain an approximation for the Simics [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-881" style="margin: 5px;" title="fdllogosmall" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/fdllogosmall.jpg" alt="fdllogosmall" width="80" height="79" />Along with Marius Monton and Mark Burton of <a href="http://www.greensocs.com">GreenSocs</a>, I will be presenting a paper on <a href="http://jakob.engbloms.se/archives/714">checkpointing </a>and <a href="http://www.systemc.org">SystemC </a>at the FDL, <a href="http://www.ecsi-association.org/ecsi/fdl/fdl09/mainpage.asp?fn=advance">Forum on Specification and Design Languages</a>, in late September 2009.</p>
<p>The paper will explain how we did <a href="http://www.virtutech.com/whitepapers/simics_checkpointing.html">Simics-style checkpointing </a>in SystemC, using the GreenSocs GreenConfig mechanisms to obtain an approximation for the Simics attribute system.</p>
<p><span id="more-880"></span>It is an approach that does not have the limitations of the &#8220;save the entire simulation process&#8221; method employed by Cadence (and I think also CoWare) in their <a href="http://jakob.engbloms.se/archives/817">SystemC checkpointing solution</a>. It does require you to mark all relevant state in your models, but the benefit from doing so is that regardless of how you change the code of a model, you can still use the same old checkpoints. It is also portable across hosts. We did have to do some patching to the OSCI SystemC kernel to draw out and reset all relevant state from the kernel. The OSCI kernel does not provide sufficient interfaces to checkpoint its state in its vanilla form.</p>
<p>The conference takes place on September 22 to 24, in Sophia Antipolis in France. Now all I have to do is figure out how to get there in the most convenient way. I expect this to be as much fun as the other EDA conferences I have been to recently (I seem to only go to such events nowadays, nothing left on the old embedded circuit for me it seems).</p>
<p>By the way, the FDL logo is really pretty. I think all long-running events should spend the time to create a recognizable logo. My old real-time conferences used to just have plain text and the <a href="http://www.ieee.org">IEEE </a>and <a href="http://www.acm.org">ACM </a>logos.</p>
<p><img class="aligncenter size-full wp-image-882" title="fdl_logo_new" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/fdl_logo_new.jpg" alt="fdl_logo_new" width="435" height="159" /></p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/880/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Downloadable Book about Embedded Multicore</title>
		<link>http://jakob.engbloms.se/archives/877?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/877#comments</comments>
		<pubDate>Sat, 08 Aug 2009 19:27:08 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[books]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[freescale]]></category>
		<category><![CDATA[John Logan]]></category>
		<category><![CDATA[Jonas Svennebring]]></category>
		<category><![CDATA[Patrik Strömblad]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=877</guid>
		<description><![CDATA[Freescale has now released the collected, updated, and restyled book version of the article series on embedded multicore that I wrote last year together with Patrik Strömblad of Enea, and Jonas Svennebring, and John Logan of Freescale. The book covers the basics of multicore software and hardware, as well as operating systems issues and virtual [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.freescale.com"><img class="alignleft size-full wp-image-878" style="margin-left: 5px; margin-right: 5px;" title="freescale-logo-icon" src="http://jakob.engbloms.se/wp-content/uploads/2009/08/freescale-logo-icon.png" alt="freescale-logo-icon" width="80" height="80" /></a>Freescale has now released the collected, updated, and restyled <a href="http://www.freescale.com/files/32bit/doc/ref_manual/EMBMCRM.pdf">book version </a>of the article series on embedded multicore that I <a href="http://jakob.engbloms.se/archives/423">wrote last year </a>together with Patrik Strömblad of <a href="http://www.enea.com">Enea</a>, and Jonas Svennebring, and John Logan of <a href="http://www.freescale.com">Freescale</a>. The book covers the basics of multicore software and hardware, as well as operating systems issues and virtual platforms. Obviously, the virtual platform part was my contribution.</p>
<p><span id="more-877"></span></p>
<p>It is one of the more comprehensive introductions to how to think about and use multicore architectures in the high-end embedded space. It is free to download and print, but if you want a printed copy, such can be ordered at a price of (I am told) 15 USD (did not try it myself).</p>
<p>The PDF is at <a href="http://www.freescale.com/files/32bit/doc/ref_manual/EMBMCRM.pdf">http://www.freescale.com/files/32bit/doc/ref_manual/EMBMCRM.pdf </a>.</p>
<p>It will also be linked from the &#8220;Documentation&#8221; section for most Freescale multicore chips&#8217; information pages.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/877/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The TLM DAC</title>
		<link>http://jakob.engbloms.se/archives/865?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/865#comments</comments>
		<pubDate>Thu, 30 Jul 2009 22:47:23 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[ESL]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[DAC]]></category>
		<category><![CDATA[GreenSocs]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[SystemC]]></category>
		<category><![CDATA[tlm]]></category>
		<category><![CDATA[TLM-2.0]]></category>
		<category><![CDATA[Virtutech]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=865</guid>
		<description><![CDATA[The past few days here at DAC, a big theme has been transaction level modeling (TLM). TLM is often considered to be SystemC TLM-2.0. Most of the statements from the EDA companies are to the effect that SystemC TLM-2.0 solves the problem of combining models from different sources. Scratching the surface of this happy picture, [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-824" style="margin: 5px;" title="46daclogo" src="http://jakob.engbloms.se/wp-content/uploads/2009/07/46daclogo.gif" alt="46daclogo" width="81" height="73" />The past few days here at <a href="http://www.dac.com/46th/index.aspx">DAC</a>, a big theme has been transaction level modeling (TLM).</p>
<p>TLM is often considered to be <a href="http://www.systemc.org/apps/group_public/workgroup.php?wg_abbrev=tlmwg">SystemC TLM-2.0</a>. Most of the statements from the EDA companies are to the effect that SystemC TLM-2.0 solves the problem of combining models from different sources. Scratching the surface of this happy picture, it is clear that it is not that simple&#8230;</p>
<p><span id="more-865"></span>The issue is that even if all agree on using the TLM-2.0 standard and its default standard generic memory-mapped bus protocol and payload for the memory-map part of their device models, there are other interfaces which are not standard at this point in time.</p>
<p>For example, there is no standard way to model interrupts between devices. So any time you have interrupts in a system (which tends to be always), you need to write custom wrappers between modules to convert different ways of modeling interrupts. Even worse, the standard way to do it is to use SystemC signals, which are definitely not TLM abstractions. They take a detour through the SystemC kernel, which is quite costly.</p>
<p>The defining property (from a simulation execution perspective) of TLM is that your simulation modules talk directly to each other through <em>direct function calls</em>, rather than passing over whatever simulation kernel you happen to be using. Essentially, TLM tends to convert simulators into being much more like &#8220;regular programs&#8221;, with fewer references to the simulation kernel and its event and time handling. In my world, unless you are doing direct function calls, you are not doing TLM.</p>
<p>Note that this state of things in the SystemC world is likely to change for the better over time. <a href="http://www.greensocs.com">GreenSocs</a> announced at DAC that they are working with <a href="http://www.virtutech.com">Virtutech </a>and an unnamed other partner to create a set of TLM interfaces for other interconnects, such as signals (interrupts under another name), serial, and Ethernet.</p>
<p>But apart from all the technicalities of SystemC TLM-2.0 and how it works, the big question is just what to use TLM for, and how. Here, everyone seems to try to turn TLM into their own use cases. The most obvious application is doing fast virtual platforms, but you also have TLM use as the basis for hardware synthesis, validation, golden reference models, architectural exploration, and pretty much all other EDA design tasks.</p>
<p>Even so, the most important message for me is that the EDA industry is actually starting to get interesting in TLM. It is no longer a quaint odd thing done by some peripheral start-up companies, but rather a mainstream technology that everyone has to pay attention to.</p>
<p>Finally, I want to point out that TLM is not just SystemC. TLM is a general idea that has been in <a href="http://jakob.engbloms.se/archives/130">active use since the late 1960s</a>. It is the obvious way to model a computer, if all you are concerned about is how it looks to the software. Another current example is the <a href="http://www.virtutech.com/whitepapers/simics-tlm.html">Simics style of TLM</a> (<a href="http://www.virtutech.com/whitepapers/modeling.html">and here</a>), which is similar to but different in details from the SystemC implementation.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/865/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>DAC 2009 Panel and Paper</title>
		<link>http://jakob.engbloms.se/archives/823?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/823#comments</comments>
		<pubDate>Wed, 01 Jul 2009 12:38:58 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[EDA]]></category>
		<category><![CDATA[appearances]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[DAC]]></category>
		<category><![CDATA[hardware-software interface]]></category>
		<category><![CDATA[Jason Andrews]]></category>
		<category><![CDATA[Ross Dickson]]></category>
		<category><![CDATA[Wild West panel]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=823</guid>
		<description><![CDATA[The 46th Design Automation Conference (DAC) is coming up in San Francisco in the US, last week of July. For me, this will be the first time I ever go to DAC. I have been to a couple of Design Automation and Test Europe  (DATE) conferences before, but DAC is supposedly even bigger as an [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-824" style="margin: 5px;" title="46daclogo" src="http://jakob.engbloms.se/wp-content/uploads/2009/07/46daclogo.gif" alt="46daclogo" width="81" height="73" />The <a href="http://www.dac.com/46th/index.aspx">46th Design Automation Conference (DAC) </a>is coming up in San Francisco in the US, last week of July. For me, this will be the first time I ever go to DAC. I have been to a couple of <a href="http://www.date-conference.com/">Design Automation and Test Europe  (DATE) </a>conferences before, but DAC is supposedly even bigger as an event for the EDA and related communities. I have the honor to be on a panel this year, as well as co-authoring a paper on software validation.</p>
<p><span id="more-823"></span>The panel is called &#8220;<a href="http://www.dac.com/events/eventdetails.aspx?id=95-49">The Wild West: Conquest of Complex Hardware-Dependent Software Design</a>&#8220;, and takes place on Thursday, July 30, at 16.30, in room 131. We will be discussing hardware/software integration, multicore software, and other topics that I like. We will have a good mix of tool providers and tool users.</p>
<p>The paper is called &#8220;Design Flow for Embedded System Device Driver Development and Verification&#8221;, and is co-authored by me, Jason Andrews of Cadence, and my colleague Ross Dickson. It is presented in the user track session called &#8220;<span id="ctl00_Center_Content_Placeholder__lblEventTitle" class="sestitle"><a href="http://www.dac.com/events/eventdetails.aspx?id=95-3-U">Verification: A Front-End Perspective</a>&#8220;, on Tuesday, at 16.30. It deals with how you can use directed random testing to verify software drivers for custom hardware, using a virtual platform.<br />
</span></p>
<p>I will at the DAC all week, sounds like a great fun event!<span class="sestitle"><br />
</span></p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/823/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cadence SystemC Checkpointing</title>
		<link>http://jakob.engbloms.se/archives/817?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/817#comments</comments>
		<pubDate>Sat, 13 Jun 2009 20:29:35 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[George Frazier]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[SystemC]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=817</guid>
		<description><![CDATA[I while ago I wrote a blog post on checkpointing in virtual platforms, and what it is good for. Checkpointing has been a fairly rare feature in virtual platform tools for some reason, but it seems to be picking up some implementations. In particular, I recently noticed that Cadence added it to their simulator solutions [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-737" style="margin: 5px 10px;" title="gears1" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears1.png" alt="gears1" width="56" height="57" />I while ago I wrote a blog post on <a href="http://jakob.engbloms.se/archives/714">checkpointing </a>in virtual platforms, and what it is good for. Checkpointing has been a fairly rare feature in virtual platform tools for some reason, but it seems to be picking up some implementations. In particular, I recently noticed that Cadence added it to their simulator solutions a while ago (2007 according to their blog posts). There are a two blog posts  by <a href="http://www.cadence.com/community/posts/georgef.aspx">George Frazier </a>of Cadence (&#8220;<a href="http://www.cadence.com/Community/blogs/sd/archive/2009/02/18/how-to-save-os-boot-time-in-your-systemc-virtual-platform-with-save-and-restore.aspx">saving boot time</a>&#8221; and &#8220;<a href="http://www.cadence.com/Community/blogs/sd/archive/2009/03/09/systemc-save-and-restore-part-2-advanced-usage.aspx">advanced usage</a>&#8220;) that offer some insight into what is going on.</p>
<p><span id="more-817"></span>Note that checkpointing is nothing new to RTL-level HDL simulators, since that is a much more controlled environment than a general virtual platform. I think the Cadence blog put it quite well:</p>
<blockquote><p><span id="anormal_12" class="Cadence_CS_BlogDetail_BlogText">Save and restore (or restart) has existed in HDL simulators for years, but things are trickier if SystemC is involved. For one thing, SystemC simulators use external tools for compilation and linking: i.e. gcc. They have more or less a “black box” understanding of global variables, local variables, file descriptors and heap values that make up the simulation state at any point in time. When you throw in multiple threads implemented with application-level threading packages and the fact that C++ heap objects are impractical to save programmatically, it’s easy to see why save and restore tools for HDL simulators can’t be easily extended for SystemC. </span></p></blockquote>
<p><span class="Cadence_CS_BlogDetail_BlogText">I could not say it better myself. What is interesting is that the Cadence solution does solve this problem, in a limitied way, for a limited use case. I have not looked at their solution in detail (such as using it myself), but this paragraph indicates that the solution is essentially a complete memory contents dump:</span></p>
<blockquote><p><span id="anormal_12" class="Cadence_CS_BlogDetail_BlogText">During restart, all internal variables inherit the same values from the process as it existed at the time of save (for example, C variables declared static). While this behavior helps assure that SystemC state information is properly saved and restored, it can also leave variables that reference the process environment (like file descriptor and sockets to other processes) in limbo. </span></p></blockquote>
<p><span class="Cadence_CS_BlogDetail_BlogText">Doing it this way is heroic in effort but also quite limited in scope. If I look at the four operations for restoring from a checkpoint that I outlined in my previous blog post on checkpointing:</span></p>
<ul>
<li><span class="Cadence_CS_BlogDetail_BlogText">Restore to same machine, same model</span></li>
<li><span class="Cadence_CS_BlogDetail_BlogText">Restore to different machine, same model</span></li>
<li><span class="Cadence_CS_BlogDetail_BlogText">Restore to same or different machine, updated model</span></li>
<li><span class="Cadence_CS_BlogDetail_BlogText">Restore to same or different machine, completely different model</span></li>
</ul>
<p>It is clear that you can only do the first, as the solution will restore the state of an implementation of a model, not just its relevant state as is done in Simics checkpointing. The sole advantage of this approach is that it does work with arbitrary code. But it does not support any of the more powerful uses of checkpoints beyond simply not repeating work for a single user on a particular machine.</p>
<p>I don&#8217;t think a memory dump can travel even to a second machine of similar make and setup, since it will depend on the precise memory layout of a process that starts. And that is affected by DLL and shared objects load order, which is hard to control. The versions of all libraries have to be exactly the same too. It is not even clear that a checkpoint survives the upgrading of the OS on the machine being used, as that will surely change things in terms of precise memory allocation.Would be happy for the Cadence users to be proven wrong, but in principle I think checkpointing done right requires models to be written explicitly to support it. <a href="http://stackoverflow.com/questions/184027/serialization-of-objects-no-thread-state-can-be-involved-right">Just like any serialization solution in any programming language</a>.</p>
<p>I must admit that George Frazier does mention using checkpoints &#8220;weeks or even months&#8221; after initial save, but there is no mention of changing the code of the model in that time frame. For me checkpoints tend to live for years, I have some nice Simics demo checkpoints that have been with us for some five years at this point in time, surviving from Simics 2.0 to 2.2 to 3.0 to 3.2 to 4.0 to 4.2&#8230; thanks to the power of the Simics &#8220;save only explicitly defined state&#8221; principle, and checkpoint updater functions that essentially rewrite old checkpoints to make them compatible with new and updated machine models.</p>
<p>The other bit that I find interesting is what is considered the biggest headache: not the actual saving of the memory state, but how to handle open files and similar host operating-system connections:</p>
<p><span id="anormal_12" class="Cadence_CS_BlogDetail_BlogText"></p>
<blockquote><p>If a save operation is performed when the file is open then problems can arise if the program attempts to write to the same file after restore (because the file descriptor associated with the open file will be in a different state after restore).</p></blockquote>
<p>It is nice to have a way to solve this, but it is also pretty shocking that you have to solve it! It kicks of a mini-rant&#8230; A virtual platform model should <em>not </em>read or write or access other host resources directly in any way, in my rulebook for sound programming practice. All host dependencies should be handled via the simulation core and framework, in a manner that is checkpoint-safe, portable, and does not rely on any information from the host directly in the models. It is crucial to localize all such host interactions in specially written host connection modules that make sure all regular simulation modules run in a completely encapsulated and virtual world.</p>
<p>So, overall, hats off to Cadence for actually doing something, but keep in mind that it will be very limited until some discipline is exercised in modeling and state considered as something separate from the implementation.</p>
<p></span></p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/817/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Article in ECNmag about Multicore and Virtual Platforms</title>
		<link>http://jakob.engbloms.se/archives/807?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/807#comments</comments>
		<pubDate>Tue, 09 Jun 2009 06:46:49 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[articles]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[ECNmag]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=807</guid>
		<description><![CDATA[I have a short article on multicore systems development and virtual platforms in the May 2009 issue of ECN magazine, over at www.ecnmag.com.]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-808" style="margin: 5px;" title="ecn_logos" src="http://jakob.engbloms.se/wp-content/uploads/2009/06/ecn_logos.gif" alt="ecn_logos" width="84" height="52" />I have a short article on <a href="http://www.ecnmag.com/article-cover-story-Virtual-Platforms-051509.aspx">multicore systems development and virtual platforms </a>in the May 2009 issue of ECN magazine, over at <a href="http://www.ecnmag.com">www.ecnmag.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/807/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cadence Industry Insight: &#8220;Virtual Platforms Unite HW and SW&#8221;</title>
		<link>http://jakob.engbloms.se/archives/784?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/784#comments</comments>
		<pubDate>Fri, 22 May 2009 06:41:09 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[ESL]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[Domain-specific languages]]></category>
		<category><![CDATA[ISX]]></category>
		<category><![CDATA[Richard Goering]]></category>
		<category><![CDATA[scdsource]]></category>
		<category><![CDATA[software testing]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=784</guid>
		<description><![CDATA[Another Cadence guest blog entry, about the overall impact of virtual platforms on the interaction between hardware and software designers. Essentially, virtual platforms are a great tool to make software and hardware people talk to each other more, since it provides a common basis for understanding. My entry is called &#8220;Virtual Platforms unites Hardware, Software [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-654" style="margin-left: 5px; margin-right: 5px;" title="opinion" src="http://jakob.engbloms.se/wp-content/uploads/2009/02/opinion.png" alt="opinion" width="91" height="69" />Another Cadence guest blog entry, about the overall impact of virtual platforms on the interaction between hardware and software designers. Essentially, virtual platforms are a great tool to make software and hardware people talk to each other more, since it provides a common basis for understanding.</p>
<p><span id="more-784"></span>My entry is called &#8220;<a href="http://www.cadence.com/Community/blogs/ii/archive/2009/05/21/guest-blog-virtual-platforms-unite-hardware-software-designers.aspx">Virtual Platforms unites Hardware, Software Engineers</a>&#8220;, and is presented by <a href="http://www.cadence.com/community/posts/rgoering.aspx">Richard Goering </a>(who used to be with <a href="http://www.scdsource.com">SCDSource</a>), in his &#8220;<a href="http://www.cadence.com/Community/blogs/ii/default.aspx">Industry Insights</a>&#8221; section of the Cadence community of blogs. Richard Goering has a personal post pointing in the same direction, about <a href="http://www.cadence.com/Community/blogs/ii/archive/2009/05/13/meeting-the-embedded-software-challenge.aspx?postID=17593">EDA tackling embedded software</a>. Worth reading.</p>
<p>Note that I do <em>not </em>say that hardware and software engineers should use the same <em>programming languages</em> as a result of using virtual platforms. Programming languages efficient for hardware design are quite different from those efficient for virtual platform creation, which are in turn different from good software engineering languages.  In some cases, some of them coincide, but in general, I believe in using the best tool for each job, and a programming language is just a tool. And the more designed it is for its task, the better. Some older posts of mine on this topic:</p>
<ul>
<li><a href="http://jakob.engbloms.se/archives/747">DSL: Purpose-built languages</a></li>
<li><a href="http://jakob.engbloms.se/archives/681">DSL: The tyranny of syntax</a></li>
<li><a href="http://jakob.engbloms.se/archives/283">Multicore programming and DSLs</a></li>
<li><a href="http://jakob.engbloms.se/archives/165">What is the obsession with C in EDA?</a></li>
<li><a href="http://jakob.engbloms.se/archives/157">Kunle Olukotun on DSLs</a></li>
<li><a href="http://jakob.engbloms.se/archives/709">Modeling hardware at a high level for software development</a></li>
</ul>
<p>And there is the <a href="http://jakob.engbloms.se/archives/306">ChipDesign article </a>from last year about using virtual platforms in the hardware design process all the way out to customers.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/784/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Guest Blog at Cadence: &#8220;Way Worse than the Real Thing&#8221;</title>
		<link>http://jakob.engbloms.se/archives/781?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/781#comments</comments>
		<pubDate>Wed, 20 May 2009 10:45:21 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[ESL]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[ISX]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[software testing]]></category>
		<category><![CDATA[Virtutech]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=781</guid>
		<description><![CDATA[Virtutech and Cadence yesterday announced the integration of Virtutech Simics and Cadence ISX (Incisive Software Extensions), which is essentially a directed random test framework for software. With this tool integration, you can systematically test low-level software and the hardware-software (device driver) interface of a system, leveraging a virtual platform. As part of explaining why this [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-782" style="margin-left: 5px; margin-right: 5px;" title="avataraspx" src="http://jakob.engbloms.se/wp-content/uploads/2009/05/avataraspx.jpg" alt="avataraspx" width="72" height="72" />Virtutech and Cadence yesterday announced the integration of Virtutech Simics and Cadence ISX (Incisive Software Extensions), which is essentially a directed random test framework for software. With this tool integration, you can systematically test low-level software and the hardware-software (device driver) interface of a system, leveraging a virtual platform.</p>
<p><span id="more-781"></span></p>
<p>As part of explaining why this is cool and what it means, I have a <a href="http://www.cadence.com/Community/blogs/sd/archive/2009/05/18/way-worse-than-the-real-thing.aspx">guest blog posting over at Cadence&#8217;s blog site</a>, called &#8220;Way Worse than the Real Thing&#8221;. The blog is posted under the general &#8220;TeamESL&#8221; &#8220;personality&#8221; on the blog site, which is used for people external to Cadence in the ESL space.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/781/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
