<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala</title>
	<atom:link href="http://jakob.engbloms.se/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)</description>
	<lastBuildDate>Tue, 21 May 2013 17:05:03 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>Wind River Blog: Simics 4.8 is Here</title>
		<link>http://jakob.engbloms.se/archives/1874?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1874#comments</comments>
		<pubDate>Tue, 21 May 2013 17:05:03 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[4.8]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1874</guid>
		<description><![CDATA[Simics 4.8 is finally released, and I put up a blog post explaining the most important news in this release. It is two years since we released Simics 4.6, so there is quite a bit of news in Simics 4.8 &#8211; even though lots of functionality has been released continuously into 4.6 over the past <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1874" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" width="46" height="46" />Simics 4.8 is finally released, and I put up <a href="http://blogs.windriver.com/wind_river_blog/2013/05/simics-48-is-here.html">a blog post explaining the most important news</a> in this release. It is two years since we released Simics 4.6, so there is quite a bit of news in Simics 4.8 &#8211; even though lots of functionality has been released continuously into 4.6 over the past twenty-four months. My personal favorites are the comments you can put on an execution and the stop log, but then again, that might be because they have been a couple of pet ideas of mine, so I am hardly an impartial judge. Everything else is also really good, and the engineering and marketing teams involved have put a lot of effort into this release (as we do in all releases).</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1874/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Few Electrons too Many</title>
		<link>http://jakob.engbloms.se/archives/1865?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1865#comments</comments>
		<pubDate>Thu, 09 May 2013 20:16:15 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded systems]]></category>
		<category><![CDATA[trains]]></category>
		<category><![CDATA[transportation]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1865</guid>
		<description><![CDATA[Adding electronics to systems that used to be mechanical has been the great wave of innovation for quite a while now. Modern transportation just would not work without all the electronics and computers inside (someone once quipped that a modern fighter is just a plastic airplane full of software), and so much convenience has <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1865" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/05/light-bulb.png" /></p>
<p>Adding electronics to systems that used to be mechanical has been the great wave of innovation for quite a while now. Modern transportation just would not work without all the electronics and computers inside (someone once quipped that a modern fighter is just a plastic airplane full of software), and so much convenience has been provided by automation and smarts driven by electronics. However, this also introduces brand new ways that things can break, and sometimes I wonder if we are not setting ourselves up for major problems when the electrons stop flowing.</p>
<p><span id="more-1865"></span></p>
<p><img class="alignleft" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/05/oras-batteridriven.jpg" width="150" height="136" /> I was confronted with a particularly silly example the other day at the gym. There was a handwritten note taped to the wall next to the tap in the locker room, saying essentially that &#8220;This tap does not work since the battery is out. Facilities services have been notified and should fix this shortly.&#8221;</p>
<p>What? A water tap that does not work since its battery got depleted? What about attaching it to mains power? Or even just making it mechanical, using some kind of directly acting mechanical actuator to turn the water on and off? Or just having a manual override or emergency actuator? This was not a life-or-death situation, but it did showcase the vulnerability of electrically powered machinery when the power goes out. We have far too many of these unnecessarily electronic systems in society today. I think we need to think more about robustness and less about convenience.</p>
<p><img class="alignright" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/05/littman-electronic.png" width="300" height="190" /></p>
<p>A few days later, I was pointed at an example that could actually be life-or-death: the <a href="http://www.medisave.co.uk/littmann-3200-electronic-stethoscope-burgundy-p-100448.html">Littmann Electronic stethoscope</a>. As you can see on the right, the sensor part of this device has an on/off switch, a small display, and some buttons to control its many functions. It does add some really useful features to the device. With electronics, you can do noise reduction and bring out the essential sounds from the background. Collected sounds can be transmitted to a computer for recording, storing, and analysis. It is a very good example of how electronics can enhance and in many ways transform a traditional instrument by making it so much smarter. A completely brilliant product development that I could see myself being part of. But.</p>
<p style="text-align: center;"><img class="aligncenter" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/05/battery-out-littmann.png" width="637" height="247" /></p>
<p>Battery dead, device inoperable.</p>
<p>A classic stethoscope is a nicely mechanical device that a doctor can use anywhere, anytime, with no need for a hospital, electricity, or any other facility. I find that very reassuring (probably having read too much disaster recovery and war history literature). Imagine the situation in the emergency room when a doctor has to stop looking after a patient to chase a new battery for their stethoscope (please hold it right there&#8230;). Maybe it should have had a mechanical backup. Or maybe it is simply a matter of thinking about the tools of the trade before going out into the bush where spare batteries will not be found.</p>
<p style="text-align: left;">Speaking about batteries, the most obvious example of big problems due to an empty battery is the modern smartphone. A friend of mine pointed out that unless you keep your phone charged during the day in the office, it might not have enough battery left to get you home. Why? Because the way to buy tickets for the commuter train is to use an <a href="https://play.google.com/store/apps/details?id=dk.unwire.projects.samtrafiken&amp;hl=en">app on the smartphone</a>. It seems convenient, and is convenient, but the number of ways that this can break is scary when you think about it. Apart from local failure on the phone (in particular, finding a modern phone with a battery that can survive a day of use is hard), you also rely on the phone network and the sales servers all being online at the same time. If any link fails, it will be a bit harder to get home (or to work, for that matter).</p>
<p style="text-align: left;">Overall, I guess my feeling is that we need to think a bit more about how to build systems that are robust in the face of power failures and network failures. A bit of mechanical backup does not hurt, and avoiding some features to preserve robustness would seem prudent.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1865/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: Visuality NQ CIFS Server on Simics</title>
		<link>http://jakob.engbloms.se/archives/1862?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1862#comments</comments>
		<pubDate>Tue, 23 Apr 2013 09:05:17 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[Visuality NQ]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1862</guid>
		<description><![CDATA[There is a new post at my Wind River blog, about how I ran a Windows file share server (CIFS) on a Simics-simulated VxWorks big-endian Power Architecture target. Something that just should work, given that the software in question is known to work in the real world. But still, pretty cool, and a bit eerie. <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1862" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" width="46" height="46" />There is a new post at my Wind River blog, about how I<a href="http://blogs.windriver.com/tools/2013/04/serving-windows-files-from-a-simics-quick-start-platform.html"> ran a Windows file share server (CIFS) on a Simics-simulated VxWorks big-endian Power Architecture target</a>. Something that just should work, given that the software in question is known to work in the real world. But still, pretty cool, and a bit eerie.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1862/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Everything in the Cloud?</title>
		<link>http://jakob.engbloms.se/archives/1855?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1855#comments</comments>
		<pubDate>Sat, 30 Mar 2013 19:12:33 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[business issues]]></category>
		<category><![CDATA[off-topic]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[hype]]></category>
		<category><![CDATA[salesforce]]></category>
		<category><![CDATA[Waze]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1855</guid>
		<description><![CDATA[Cloud&#8230; I tend to dislike hype and I am honestly quite sick of all the talk about cloud computing and &#8220;anything as a service&#8221;. Still, it is an intriguing area. Last week, I attended Produktledardagen, a very inspiring product management and product leadership seminar, innovation lab, and social event for the profession of product management.  <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1855" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/03/IMG.jpg" width="67" height="49" /> Cloud&#8230; I tend to dislike hype and I am honestly quite sick of all the talk about cloud computing and &#8220;anything as a service&#8221;. Still, it is an intriguing area. Last week, I attended <a href="http://produktledardagen.se/">Produktledardagen</a>, a very inspiring product management and product leadership seminar, innovation lab, and social event for the profession of product management.  A significant part of the discussion was about the Cloud, and how to think about it from a product perspective.  Suddenly, with this perspective, it actually got quite interesting. In particular, trying to define to myself just what a cloud service is.</p>
<p><span id="more-1855"></span></p>
<p>Pretty much everything today is being billed as &#8220;cloud&#8221; as soon as something is connected to the Internet. That does not seem right to me. We need a stricter definition that actually helps focus the mind and reduce the space to something essential. To me, cloud means: centralization, specialization, and data transfer.</p>
<p><strong>Centralized</strong>. There has to be something running in a central location (and a set of data centers like what Facebook is using does count as a central location, even if it is actually a whole host of different sites in practice). Cloud-based services are a total reversal of the trend towards pushing computing out into the end points and people&#8217;s own machines that was dominant until quite recently. Minicomputers, microprocessors, home computers, PCs, &#8230; all of the past trends in hardware have pushed computing out closer to people. Now, with the rise of cloud services, computing is being centralized again.</p>
<p>The most widespread and successful cloud services tend to have custom local clients on connected machines, but this does not mean that they are not centralized. In practice, a web page is not a sufficient interface for most interesting services, especially not on mobile devices. In a way, cloud services are sufficiently complex that it makes sense to embed them in custom GUIs, and they benefit greatly from a deep integration with the local file system and data store. A web browser, for all its power, does not offer the same ability to access the local machine and integrate into the functions of the OS as a real application.</p>
<p>The centralization aspect means that I would not count peer-to-peer file sharing services as cloud.</p>
<p><strong>Specialized</strong>. From an economics perspective, it seems to me that many of the best cloud-based services are examples of classic Smithian specialization by division of labor. Using the Internet, it is possible to aggregate many but minor demands for a particular specialized service into a total demand that makes it possible to create a business focused on doing just one thing and doing it really well. The prior state was many players doing the same thing as part of a broader business, with less skills specialization. The companies providing the services become specialists in a particular function, and can thus do it more efficiently.</p>
<p>This is nothing new in itself; the only difference is the delivery mechanism. For example, consider the traditional highly skilled consultant who does the same occasional but valuable work for several clients. Another example is the creation of standard software that is used by many different customers (instead of each customer building their own). In the past, things like specialized operating-system vendors, commercial software vendors (and standard software packages), and open-source software have all been the result of the same kind of aggregation of demand and specialization of provisioning. Indeed, many of the current business-process-as-a-service or software-as-a-service vendors could have been sellers of well-written standard software a decade ago. The main difference is that with cloud services, the software runs on the vendor premises instead of on the customer premises.</p>
<p><strong>Data-based</strong>. Cloud-based services have to move some kind of data to the servers for processing, and that data processing on the server has to achieve something that just cannot be achieved locally. For mobile applications in particular, this is becoming a very common design pattern (in part to achieve <a href="http://arstechnica.com/security/2013/02/mobile-app-security-always-keep-the-back-door-locked/">better security</a>). Which really is back to terminals and mainframes, with a fancier terminal and a less imposing mainframe, but still the same idea. Proven in use for a very long time. If there is no data transfer off of the local device, it is not a cloud service.</p>
<p>An interesting modern spin on centralized data processing that is very common in cloud systems is that the services correlate and aggregate the data from many individual users and then feed the result of the aggregation back to the individuals. This creates a network effect where the service provides a collective value to its user base. Social networks definitely belong here, and even more services like <a href="http://www.waze.com/">Waze</a>. The new thing here is really the individual getting direct personal benefit from the service, while traditional centralized data processing tended to be done for the benefit of the organizing organization. Giving back to the individual users is thus a crucial component to a successful cloud service.</p>
<p>Using the cloud as a <strong>mirror and replication service </strong>is a common variety of data processing. Cloud storage like Dropbox or the Google and Apple sync between mobile devices and computers are examples of this. Here, a server is used instead of more classic point-to-point peer-to-peer data movement, based on the fact that it is often easier to talk to an Internet server than to the computer next to you (even within a home network, Dropbox usually syncs things more transparently than trying to make the devices copy data between them, in my experience).</p>
<p><strong>Price</strong> is not an interesting factor to me. When people present cloud services, they often describe &#8220;low price&#8221; or &#8220;cheaper&#8221; as a key part of their value proposition. To take an utterly unoriginal example, <a href="http://salesforce.com">Salesforce.com</a> provides even the smallest firm with access to world-class IT support for their sales force, at a very low price. But that low price really results from the division of labor and specialization. The reason is that the provider is more efficient than older providers, not that it is on the Internet per se. The Internet is an important enabler, for sure, but not in itself a cause of low prices. I can certainly see room for cloud services that are very high priced, as long as they provide a corresponding value. Business as usual, from a product management perspective.</p>
<p>So, in summary, there is something really interesting with Cloud, and I want to pursue it. But I find it more interesting to see how it can be used in high-value business-to-business transactions, rather than your typical consumer-oriented play. I just like hard problems, I guess.</p>
<p><strong>Update, after some more reading and thinking</strong></p>
<p><strong>Standardization</strong>. One aspect that seems very common when people describe cloud plays is the idea that you sell a very standard product. Compared to the amount of customization and variation (in which hosts to run on, for example) seen in standard PC software, a centralized server park does allow a new level of standardization and simplification of the software. A <a href="http://www.bvp.com/blog/bessemer-cloud-computing-law-2-build-doer-build-employee-software">note in the Bessemer Ventures rules for cloud businesses</a> really brings home this point: only build a single variant of your software, as that lets you innovate faster. This makes sense, since the server end of a cloud system is, at least theoretically, totally under your control and can be made very homogeneous in terms of hardware, operating systems, and middleware. Thus, one possible additional criterion for a cloud play is that it is highly standardized, offering the same thing to many users and not making allowances for variations. This would seem to lead to higher efficiency in development and operations, and is an economic argument for specialization. Why would the user need to care about this? Leave it up to the experts!</p>
<p>I am less sure about the Bessemer statement that you never should allow on-premise deployment for customers &#8212; there are cases where that makes eminent sense, in particular for very secure systems. Sure, banks have standards for their IT providers and similar, but there are cases where physical control is necessary and sane. Hopefully, you can then make the customer use your favorite software stack to run the on-premise version.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1855/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: TCF and Simics</title>
		<link>http://jakob.engbloms.se/archives/1850?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1850#comments</comments>
		<pubDate>Tue, 26 Mar 2013 09:00:48 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[reverse debugging]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[TCF]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1850</guid>
		<description><![CDATA[On my Wind River blog, you can now find a description of how we have used the Eclipse TCF (target connection framework) to build the Simics GUI. Or rather, the connection between the Simics GUI and the Simics simulation process. It is actually quite revolutionary what you can do with the TCF, compared to older <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1850" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" width="46" height="46" />On my Wind River blog, you can now find a <a href="http://blogs.windriver.com/wind_river_blog/2013/03/a-better-way-to-connect-tcf-and-simics.html">description of how we have used the Eclipse TCF</a> (target connection framework) to build the Simics GUI. Or rather, the connection between the Simics GUI and the Simics simulation process. It is actually quite revolutionary what you can do with the <a href="http://eclipse.org/tcf/">TCF</a>, compared to older debug protocols. In particular, TCF lets you combine many different services across a single connection.</p>
<p><span id="more-1850"></span><br />
For Simics, the result looks like this:</p>
<p><img src="http://jakob.engbloms.se/wp-content/uploads/2013/03/tcf-diagram.png" alt="" /></p>
<p>Note that all services, and all GUI views, are driven across a single connection to the Simics process. We do not need a special debug connection for each target system, and we do not need to separate debug and analysis and general Simics things. They all travel across the same single connection, thanks to the service orientation of TCF. </p>
<p>It is important to note that <a href="http://eclipse.org/tcf/">TCF is a completely open-source project under Eclipse</a>, and it is available for any and all debugger and tools vendors to use.  </p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1850/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Off-Topic: Moving Bad Piggies Save Games</title>
		<link>http://jakob.engbloms.se/archives/1844?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1844#comments</comments>
		<pubDate>Fri, 15 Mar 2013 21:00:17 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer games]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[Bad Piggies]]></category>
		<category><![CDATA[iOS]]></category>
		<category><![CDATA[Nexus]]></category>
		<category><![CDATA[Rovio]]></category>
		<category><![CDATA[save games]]></category>
		<category><![CDATA[Xperia]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1844</guid>
		<description><![CDATA[I am really a great fan of the Rovio games from Angry Birds and on. One of these games is the tricky puzzler Bad Piggies, which I have spent a great deal of time playing to unlock new levels (and as an illustration of deterministic simulation). Playing on my Nexus 7 I have solved level <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1844" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 5px;" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/03/bad-piggies-icon.png" width="62" height="62" /> I am really a great fan of the Rovio games, from Angry Birds onwards. One of these games is the tricky puzzler Bad Piggies, which I have spent a great deal of time playing to unlock new levels (and as an <a href="http://blogs.windriver.com/engblom/2012/11/determinism-simics-and-flying-piggies.html">illustration of deterministic simulation</a>). Playing on my Nexus 7 I have solved level after level&#8230; and then I got myself a new <a href="http://www.sonymobile.com/se/products/phones/xperia-v/">Xperia phone</a>. Not that playing on the go is that big an attraction, but if I happened to want to do that, starting over just felt wrong.</p>
<p><span id="more-1844"></span></p>
<p>I read a number of articles and blog posts discussing how to move Angry Birds save games, but nothing that was specific to Bad Piggies. However, the pattern that emerged was pretty clear. The save game state is just a bunch of files, and if you can simply get them from one device to another you should be fine. There were some indications that with Android 4.0 or later you might need to root your device in order to get at the files, which was a bit worrying.</p>
<p>It turned out to be really easy, even though I had to take a small detour.</p>
<p>First, I attached the Nexus 7 to my PC. It was no problem to navigate to <strong>Internal storage/Android/data/com.rovio.BadPiggiesHD/</strong> and just copy the whole <strong>files</strong> directory to my PC&#8217;s disk. I then attached the Xperia to the same machine, and tried to locate the same directory. It was not there. Instead, I copied the files folder to another location (on an SD card, but I don&#8217;t think that matters). On the phone I then used a file manager program to copy the folder from the SD card to the internal memory. On the phone, I could see the target location and copy the files folder into it (image shrunk to 50% to fit the blog format):</p>
<p><img alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/03/Screenshot_2013-03-11-11-03-37.png" /></p>
<p>And then it just worked.</p>
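<p>For the scripting-minded, the manual copy above can be sketched in a few lines of Python. This is only a minimal sketch, assuming that both devices' storage shows up as ordinary directories on the PC; the function name and the mount points in the usage note are hypothetical, and only the package directory comes from the steps above.</p>

```python
import shutil
from pathlib import Path

# Package directory as named in the post; everything else is hypothetical.
PACKAGE_DIR = "Android/data/com.rovio.BadPiggiesHD"


def copy_save_games(source_root, target_root):
    """Copy the game's 'files' directory from one storage root to another."""
    source = Path(source_root) / PACKAGE_DIR / "files"
    target = Path(target_root) / PACKAGE_DIR / "files"
    if not source.is_dir():
        raise FileNotFoundError(f"no save games found at {source}")
    # dirs_exist_ok (Python 3.8+) merges into a folder that already exists.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

<p>Calling <code>copy_save_games("/mnt/nexus7", "/mnt/xperia")</code> (both paths made up) would then do the same thing as the drag-and-drop copy. On a PC where the devices are not exposed as plain directories, this sketch does not apply as written.</p>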
<p>Thanks to Rovio for making it that easy. And thanks to Android for being a mobile OS with a file system that is visible. Having used Apple iOS devices for quite a few years, I did not realize just how useful it was to have an exposed file system until I started using Android. Back on the old Symbian phones I used to use, there just was not that much to do with a file system, and they were never mountable as file systems to a PC anyway. Now, with high-resolution screens and the ability to watch movies and read PDFs in a way that makes sense, having a file system is so much more useful.</p>
<p>There is probably some way to extract save games from the older Angry Birds games from my iPods to my Android devices; I need to look into that. It should just end up being the same types of files, I assume.</p>
<p>I wonder why Rovio does not offer a service to sync games across devices. The only way I can think of that makes sense is running some kind of server that would sync users&#8217; save games across devices. A hassle to maintain, and users would need to establish accounts just for this purpose. Not sure that I would. Considering the average cost per game sold of only a few dollars, the cost of running servers might well be quite high compared to the average income per user. On the other hand, they might even be able to charge a dollar or so for the service. It could also interact in strange ways with in-game purchases. For games that are guaranteed identical on all devices it is easy to sync progress. But what if I have purchased a set of levels on one device, but not on another, how would you deal with that in a robust way?</p>
<p>Offering a simple way to copy progress to a file would spoil the game a bit (users sharing saved games to avoid the effort to get three stars on all levels and similar thresholds). It is also something you cannot really do on an iOS device in a natural way, as there is no exposed file system.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1844/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Two Cores, Four Cores, Eight Cores &#8211; Mobile Variety</title>
		<link>http://jakob.engbloms.se/archives/1841?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1841#comments</comments>
		<pubDate>Sun, 03 Mar 2013 20:26:54 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer architecture]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[bigLITTLE]]></category>
		<category><![CDATA[Cortex-A15]]></category>
		<category><![CDATA[Cortex-A7]]></category>
		<category><![CDATA[Cortex-A9]]></category>
		<category><![CDATA[eQuad]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[MP6530]]></category>
		<category><![CDATA[Renesas]]></category>
		<category><![CDATA[ST Ericsson]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1841</guid>
		<description><![CDATA[Probably thanks to the yearly Mobile World Congress, there has been a slew of announcements of mobile application processors recently. Everything is ARM-based, but there is quite some variety in the CPU core configurations used. Indeed, I think this variety has something to say on the general state of multicore. My starting point is the <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1841" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2008/09/onoff.png" width="72" height="70" align="left" /> Probably thanks to the yearly Mobile World Congress, there has been a slew of announcements of mobile application processors recently. Everything is ARM-based, but there is quite some variety in the CPU core configurations used. Indeed, I think this variety has something to say on the general state of multicore.</p>
<p><span id="more-1841"></span></p>
<p>My starting point is the marketing hype surrounding core counts (something we saw in PCs a few years ago). It sure sounds powerful with an &#8220;octacore&#8221; device, right? Beats an old dualcore any day, right? To some extent this is marketing hype, but it also does show that there is a need for more performance in phones and tablets. I have also been looking for a new TV lately, and was quite flabbergasted to see several brands touting dual-core processors as a key feature of their offerings. Processor specs for selling a dumb thing like a TV? Things are weird. Sure, <a href="http://arstechnica.com/apple/2013/02/apple-ceo-specs-touted-by-tech-giants-because-they-cant-have-a-great-experience/">Tim Cook of Apple blasted </a>this as being stupid marketing &#8220;because they cannot provide a great experience&#8221; &#8212; but the truth of the matter is that when vendors compete within a single software ecosystem, this is what you compete on. The experience of Android or Windows is mostly the same across devices, so differentiation has to come from the hardware, and core counts are easy to understand. Just like horsepower in cars.</p>
<p>But the key question is: can we really use more cores than two or four in any meaningful way?</p>
<p>Looked at objectively, there are four ways to improve performance of the processor portion of an SoC:</p>
<ul>
<li>More cores</li>
<li>Better microarchitecture</li>
<li>Higher clock frequency</li>
<li>Allow the device to go hotter and use more power</li>
</ul>
<p>In ARM land, there is clearly room for all four, while in Intel PCs, it is hard to do more than minor improvements in anything except core count. So we are left with a more interesting competitive space than most. The better microarchitecture arena is a place where <a href="http://arstechnica.com/gadgets/2013/02/future-of-mobile-cpus-part-2-whats-ahead-for-the-major-players/">Qualcomm and Apple have shown that you can do better than ARM&#8217;s standard cores</a>.</p>
<p>Some smart and brave souls at ST Ericsson (one of whom I met many years ago and have great respect for) put out an <a href="http://www.stericsson.com/technologies/FD-SOI-eQuad-white-paper.pdf">interesting whitepaper on the quad-core mania</a>, making the case that their new very-high-clock-frequency dual Cortex-A9 is more useful than a lower-clocked quad-core Cortex-A9. Basically, the software just does not typically make use of multiple cores in a good way, while driving the clock frequency higher will immediately accelerate the software that people really care about. When it comes to reducing the latency of operations, higher clocks tend to be hard to beat. Still, some marketing person at ST Ericsson decided that quad is the word of the day, and dubbed their <a href="http://www.stericsson.com/products/L8580.jsp">Novathor L8580 2.5 GHz dual-core</a> an &#8220;eQuad&#8221;, for something equivalent to a quad. Silly. Had the core been a bit more modern, I think this could have been a real winner. For most real applications, this should be much faster than the last-generation 1.0 to 1.5 GHz Cortex-A9 dual-cores or quad-cores.</p>
<p>Another interesting new release was the <a href="http://renesasmobile.com/news-events/news/2013/mp6530-quad-core-arm-cortex-a15-a7-lte">Renesas Mobile MP6530</a>. It is a chip targeting the midrange of smartphones, using an ARM bigLITTLE setup with two Cortex-A15 and two Cortex-A7 cores in a single core complex. What I found interesting here was that for the first time, I saw a bigLITTLE setup described as activating both types of cores at once (look at the <a href="http://renesasmobile.com/news-events/news/2013/mp6530-quad-core-arm-cortex-a15-a7-lte">diagram on the press release page</a>). Until now, I have only seen designs that switch between only using A7 cores and only using A15s. It was just a matter of time before software matured to this point (the OS scheduler needs to understand that it is dealing with different types of cores, and behave accordingly), but still good to see it happening. I expect the same idea to be used on other mixed-core setups; it would be strange if it was Renesas-specific.</p>
<p>This setup makes eminent sense to me. Living with multicore PCs for several years now, it is clear that having more than one core is useful for any multitasking environment. There is enough work simmering around even in a lightly-used system that using two cores clearly reduces latencies and provides a smoother experience with less locking-up of the UI. As long as nothing heavy is running, you can use the two power-efficient A7s and get a nice long battery life. What is interesting is what happens when a heavy task with high demands on latency appears. I think a good example is a web browser doing a rendering pass or running a JavaScript application, or a game with heavy AI (which is pretty serial in general it seems). In this case, to get a snappy user experience you can kick in a powerful core at high frequency and get the job done quickly. Perfect, just activate one of the A15 cores and keep the rest of the background noise running on the A7s. If something really heavy comes along, you activate the second A15 core too.</p>
<p>So far, so good. But what about the new <a href="http://www.engadget.com/2013/01/09/samsung-announces-exynos-5-octa-chip-at-ces/">octacores</a> sporting four Cortex-A15 alongside four Cortex-A7? To me, this seems strange. Assuming that we are able to use any combination of cores, this setup would only make sense if four A7s could somehow do a better job than a single A15 while using less power. I heard some numbers indicating that the magic difference is about 3 in performance&#8230; so in that case, just what are all these A7s there for? It would make more sense with a hexacore device &#8211; 2 A7s for background noise, and then you turn on 1 to 4 A15 cores to process progressively heavier software tasks. I have no doubt that we can imagine and invent software that can make good use of any number of powerful cores &#8211; but where is the case for a large number of weak cores in a client machine (it makes perfect sense on a server)?</p>
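<p>To put a rough number on the four-A7s-versus-one-A15 question, here is a small back-of-the-envelope sketch using Amdahl&#8217;s law. It only looks at performance and ignores power entirely. The factor of 3 is the hearsay number mentioned above; the function names and the simple model (serial part on one small core, parallel part spread evenly) are my own assumptions for illustration.</p>

```python
def relative_throughput(parallel_fraction, little_cores, big_to_little_ratio):
    """Throughput of a cluster of little cores, in units of one big core.

    Amdahl's law: the serial part runs on one little core, the parallel
    part is spread evenly over all little cores.
    """
    serial = 1.0 - parallel_fraction
    speedup_over_one_little = 1.0 / (serial + parallel_fraction / little_cores)
    return speedup_over_one_little / big_to_little_ratio


def break_even_parallel_fraction(little_cores, big_to_little_ratio):
    """Parallel fraction at which the little cluster matches one big core.

    Solves 1 / ((1 - p) + p / n) == r for p. A result above 1.0 means
    the little cluster can never catch up under this model.
    """
    n, r = little_cores, big_to_little_ratio
    return (1.0 - 1.0 / r) * n / (n - 1.0)
```

<p>With four small cores and a big core that is 3 times faster, <code>break_even_parallel_fraction(4, 3.0)</code> works out to 8/9, so the workload has to be about 89% parallel before four A7s even match a single A15. Even a perfectly parallel workload only gives 4/3 of one A15 in this model.</p>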
<p>Considering the existence of a <a href="http://arm.com/products/tools/development-boards/versatile-express/coretile-express.php?tab=Specifications+CPU+Options">reference platform from ARM with two A15s and three A7s</a>, you can clearly mix the numbers. No need for the same number of A7s and A15s. Maybe this will change as the use of truly heterogeneous multiprocessing spreads within the ARM ecosystem, while right now the safe bet is a quad + quad setup where you switch from one cluster to another. I have no real data and can only speculate.</p>
<p>What I would like to see done is to run long-term CPU profiling on an all-cores-active-all-the-time bigLITTLE platform and see just how much use you get out of each type of core. Unfortunately, I do not have that right now, and perhaps nothing like it will be on the market for quite some time.</p>
<p>In conclusion, I think there is some merit to bumping up mobile processors to at least quad, and there is definitely potential in mixing fast and slow cores. Making good use of eight cores seems a bit unlikely though (unless you switch between two separate clusters). One wonders if that silicon area could not have been used for something else instead.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1841/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bliss: Failing to Pivot for Ideology</title>
		<link>http://jakob.engbloms.se/archives/1835?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1835#comments</comments>
		<pubDate>Sun, 17 Feb 2013 20:05:47 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[business issues]]></category>
		<category><![CDATA[general history]]></category>
		<category><![CDATA[general research]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[Blissymbolics]]></category>
		<category><![CDATA[Charles Bliss]]></category>
		<category><![CDATA[pivot]]></category>
		<category><![CDATA[startup]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1835</guid>
		<description><![CDATA[Note: This post was caused by listening to an interesting science podcast while thinking about the theories of startups, and the connection might seem a bit odd. Still, I think there is something to be learnt here. End note. I recently listened to the episode on Bliss, by the Radiolab podcast. As always, Radiolab manages <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1835" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 5px;" alt="" src="http://jakob.engbloms.se/wp-content/uploads/2013/02/radiolab-logo-small.png" width="44" height="38" align="left" /><em>Note:</em> This post was caused by listening to an interesting science podcast while thinking about the theories of startups, and the connection might seem a bit odd. Still, I think there is something to be learnt here.<em> End note</em>.</p>
<p>I recently listened to the episode on <a href="http://www.radiolab.org/2012/dec/17/man-became-bliss/">Bliss</a>, by the <a href="http://www.radiolab.org/">Radiolab</a> podcast. As always, Radiolab manages to take a theme and connect all kinds of things to it. In this case, bliss as in happiness turned into <a href="http://en.wikipedia.org/wiki/Charles_K._Bliss">Bliss</a>, the man, and his invention of Blissymbolics. Blissymbolics was an attempt to create a rational language based on symbols that would not allow the manipulation of human opinion or feeling like regular languages do. It was an attempt to create an antidote to the manipulations of dictators, tricksters, and populists (Bliss himself had been briefly interned in a pre-war German concentration camp, so he definitely knew what words could do). He designed a symbolic writing scheme that was intended to only communicate ideas clearly and unambiguously and with no room for demagoguery and oratory. In the end, nobody wanted to use the language for its original purpose.</p>
<p><span id="more-1835"></span></p>
<p>However&#8230; it turned out to be very useful for communicating with children with certain types of handicaps. This use was pioneered by a nurse in Canada, and she reached out to Bliss to thank him for his great invention and the joy it brought to the children. She also found that it was a great way to help the children learn regular English or French. The Blissymbolics were being used to bridge into a regular language, not as a permanent communication method in their own right.</p>
<p>At this point, we have a classic &#8220;pivot&#8221; moment for Blissymbolics. Think of this as a startup. Bliss has discovered that the original idea for what to do with the symbolic language does not seem to work. But there is a new adjacent area where there is real utility and a great chance of success. Any startup theorist knows what to do: Pivot &#8211; stay grounded where you are, but change direction.</p>
<p>Charles Bliss did not pivot.</p>
<p>Instead he hated the idea, since it went counter to all that he wanted to do. He did not want people to use regular language, he wanted them to use his superior code. Thus, acrimony, lawsuits, and pain ensued. In the end, the conflict pretty much killed his idea as a practical tool. It is a sad story. We can either view Bliss as a stubborn man in love with his original idea who did not understand the need to change direction to be successful. Or we can view him as an idealist who did not sell out despite the lure of success, where success required violating his original idea. When looked at as a business, he clearly did the wrong thing. When looked at as an expression of art or ideology, it is less clear. Still, I think he should have been happy to help people, even if it was not quite in the way he envisioned.</p>
<p>For a startup, I think the story of Bliss illustrates the need to balance the love of the original idea and application with the need to accept that it might be used in unintended ways. Whether one embraces the pivot will depend on what the situation is. If it is just a different way to earn money and make people happy, do it. If it ends up being something morally challenging, maybe the pivot should not be done. In the most extreme, turning plowshares into swords might earn you money, but is it right?</p>
<p>Sorry for the rambling, but I think it was an interesting story.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1835/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wind River Blog and Movie: Demo of Simics Debugging</title>
		<link>http://jakob.engbloms.se/archives/1830?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1830#comments</comments>
		<pubDate>Wed, 09 Jan 2013 20:40:15 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[endianness]]></category>
		<category><![CDATA[networking]]></category>
		<category><![CDATA[reverse debugging]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1830</guid>
		<description><![CDATA[Last year, I did a Simics webinar which included a two-part demo of how to use Simics to debug an endianness bug in a networked system as it migrates from a big-endian to a little-endian system. Along the way, I also showed off various Simics features like reverse execution and checkpointing and scripted execution. The demo <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1830" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" />Last year, I did a Simics webinar which included a two-part demo of how to use Simics to debug an endianness bug in a networked system as it migrates from a big-endian to a little-endian system. Along the way, I also showed off various Simics features like reverse execution and checkpointing and scripted execution.</p>
<p>The <a href="http://www.youtube.com/watch?v=3CTvtpMptlg">demo is now online at the Wind River YouTube channel</a>, and the setup is explained in a <a href="http://blogs.windriver.com/tools/2013/01/debug-quicker-with-simics-video-demo.html">blog post at the Wind River company blog</a>, which is worth reading before watching the video.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1830/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Simulation vs Reality in Schlock Mercenary</title>
		<link>http://jakob.engbloms.se/archives/1823?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1823#comments</comments>
		<pubDate>Mon, 07 Jan 2013 09:25:06 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[funny]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1823</guid>
		<description><![CDATA[Schlock Mercenary is a very funny web (and print) comic that I discovered earlier this year via a list at ArsTechnica. In reading up on back issues and back stories, I came across a nice little gem about simulation. From 2008-06-23: I do simulation tools for a living, but I still agree that there is <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1823" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.schlockmercenary.com/"><img class="alignleft size-full wp-image-1824" title="schlock logo" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/schlock-logo.jpg" alt="" width="64" height="61" />Schlock Mercenary </a>is a very funny web (and print) comic that I discovered earlier this year <a href="http://arstechnica.com/gaming/2013/01/ars-readers-pick-the-12-most-incredible-webcomics/">via a list at ArsTechnica</a>. In reading up on back issues and back stories, I came across a nice little gem about simulation.</p>
<p><span id="more-1823"></span></p>
<p><a href="http://www.schlockmercenary.com/2008-06-23">From 2008-06-23:</a></p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/schlock-simulation-1.png"><img class="aligncenter size-full wp-image-1825" title="schlock simulation 1" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/schlock-simulation-1.png" alt="" width="725" height="210" /></a></p>
<p>I do simulation tools for a living, but I still agree that simulation is no replacement for testing in reality. Doing all those simulation runs probably indicates that nothing is totally broken, but you never know what happens when reality confronts the system with a situation you did not think about including in the simulation. I just love the line &#8220;create indistinguishably perfect virtual world&#8221;&#8230;</p>
<p>And indeed, things do turn out to be a bit different in reality.</p>
<p><a href="http://www.schlockmercenary.com/2008-11-21">From 2008-11-21</a>:</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/schlock-simulation-2.png"><img class="aligncenter size-full wp-image-1826" title="schlock simulation 2" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/schlock-simulation-2.png" alt="" width="725" height="210" /></a></p>
<p>Go and read the entire story &#8211; it is fun all along.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1823/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Programming like Lego</title>
		<link>http://jakob.engbloms.se/archives/1806?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1806#comments</comments>
		<pubDate>Sun, 06 Jan 2013 20:58:40 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[lego]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1806</guid>
		<description><![CDATA[LEGOs seem to be a favorite analogy for people bemoaning the state of software development today. “If only it would be as simple as putting Legos together&#8221; is a common enough statement, along with various proposals to make software that is Lego-like. Sometimes, I wonder if people making these statements have actually tried to build <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1806" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/blue-lego-brick1.jpg"><img class="alignleft size-full wp-image-1808" title="blue lego brick" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/blue-lego-brick1.jpg" alt="" width="111" height="96" /></a></p>
<p>LEGOs seem to be a favorite analogy for people bemoaning the state of software development today. “If only it would be as simple as putting Legos together&#8221; is a common enough statement, along with various proposals to make software that is Lego-like. Sometimes, I wonder if people making these statements have actually tried to build anything non-trivial from Lego recently. Here, I will look a bit closer at the Lego-programming analogy. There is indeed quite a lot to it, but it is not all about child-level simplicity. I think there are some good lessons that can be learnt from analogizing Lego and programming.</p>
<p><span id="more-1806"></span></p>
<p>The naïvely attractive idea of Lego is, I think, that “any piece can be combined with any other piece”. Technically, this is <em>mostly </em>true in my experience, but with a significant number of exceptions that might surprise someone who does not have kids of Lego age. Even if we ignore the modern Technic beams which are really a separate system with its own logic, there are many pieces in Lego that can only be attached to certain other types of pieces due to their geometries (<a href="http://www.bricklink.com/catalogItem.asp?P=94161">such as this</a>).</p>
<p>Still – the basic unit of interface is <span style="text-decoration: underline;"><em>almost</em> </span>universal, the Lego stud. There is the first lesson to be drawn from Lego – <em>design for connection from the start</em>. The reason all the pieces fit is that they all use the stud, and the reason that they all use the stud is a conscious, explicit design decision made at the very start. It is not something that was retrofitted into an existing set of random blocks – Lego pieces were designed starting with the stud and its geometry, and then designed so that each new piece uses the same-size stud and the same geometrical logic. For software, the lesson is clear: if you want Lego-style assembly, you need to start with very strong standard interfaces and build the software for these interfaces. You cannot really hope to componentize and “Legoize” an existing body of code. The connectivity is more important than the functionality.</p>
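<p>To make the "stud" idea concrete, here is a minimal sketch in Python. The <code>Stud</code> interface and the piece classes are hypothetical names invented for this illustration; the only point is that every piece is written against the same connection interface from the start, so any piece can be stacked on any other.</p>

```python
from typing import Protocol


class Stud(Protocol):
    """The single connection interface every piece must offer (hypothetical)."""

    def process(self, data: str) -> str: ...


class Upper:
    """A piece that upper-cases whatever passes through it."""

    def process(self, data: str) -> str:
        return data.upper()


class Exclaim:
    """A piece that appends an exclamation mark."""

    def process(self, data: str) -> str:
        return data + "!"


class Reverse:
    """A piece that reverses the text."""

    def process(self, data: str) -> str:
        return data[::-1]


def stack(pieces: list[Stud], data: str) -> str:
    """Snap pieces together in order; works for any piece with the stud."""
    for piece in pieces:
        data = piece.process(data)
    return data


print(stack([Upper(), Exclaim()], "lego"))   # LEGO!
print(stack([Upper(), Reverse()], "lego"))   # OGEL
```

<p>None of the pieces knows about the others; they only agree on <code>process</code>. Retrofitting such an interface onto classes that were written without it is the hard part, which is the point of the lesson.</p>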
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-helicopter.jpg"><img class="alignleft" title="lego helicopter" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-helicopter-300x174.jpg" alt="" width="300" height="174" /></a><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-tree.jpg"><img class="alignright" style="margin: 5px;" title="lego tree" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-tree.jpg" alt="" width="167" height="202" /></a>So, any two Lego pieces are easy to fit together. However, that does not mean that it is easy to build something that works in an interesting way or looks like what you had in mind. Indeed, if you look at the way modern Lego kits are designed, it is amazing how many smart solutions are employed to attach pieces at various angles.</p>
<p>The helicopter shown on the left is a typical modern Lego design: plenty of pieces that fit at 90 degrees to the vertical, a far cry from the old-style design shown on the right. The sophistication of the design and the variety of pieces needed are much greater for the helicopter. I think what people mostly want out of their software is more like the helicopter than the tree.</p>
<p>The basic composition of pieces is easy – but advanced construction is a skilled job. Your average seven-year-old will not be able to design the kits they buy, even though it is easy to follow the instructions and build them. Ease of assembly does not necessarily mean ease of actual design. If you have something particular in mind, phrasing it into Lego bricks can take a long time and a lot of trial and error. That trial and error is certainly usually quick and easy, but it never replaces some planning and forethought so that you do not have to tear down an entire building because you forgot to put some crucial part in the middle of a now-finished wall…</p>
<p>There is the second lesson: even for a software system where we can build like Legos, I would not expect to see regular users create superb software for sophisticated work – that would still be a rare skill. <em>Ease of construction removes some of the accidental difficulties of software construction, but it still leaves the essential problems to be solved</em>.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-piano.jpg"><img class="alignleft" style="margin: 5px;" title="lego piano" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-piano.jpg" alt="" width="268" height="232" /></a>Another Lego lesson is that to build something that truly works (or looks like something real), <em>you need a rich palette of pieces</em>. In theory, you could get away with just the simplest old square pieces and basically build a 3D model by using voxels like most of the older models you find in <a href="http://jakob.engbloms.se/archives/1169">Legoland</a>. But that requires both very large models and very fiddly work, and honestly is not something within the reach of most users. Using tens of thousands of pieces is time-consuming, even when they are Legos.</p>
<p>Thus, modern Lego is awash in special-purpose pieces. To get an idea, look at the list of pieces introduced each year at <a href="http://www.brickset.com/parts/browse/years/">Brickset</a>. Thousands of new pieces every year. They still follow the Lego system, but it is not just square bricks anymore. Look at the piano on the left: the little microphone and the glass are special pieces (following the size rules of the Lego minifig hands). To build the rest of it, existing parts are reused, but it is built from 11 different types of bricks, only 5 of which are used more than once (see <a href="http://cache.lego.com/bigdownloads/buildinginstructions/6001991.pdf">instructions</a>).</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-bike1.jpg"><img class="alignright" title="lego bike" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-bike1.jpg" alt="" width="140" height="127" /></a>Sometimes when you really need something special, you just have to craft it with no heed given to the idea of composability and flexibility. Like the Lego bicycle seen on the right, which is basically three pieces: a <a href="http://www.brickset.com/parts/?part=4592277">frame</a> and two <a href="http://www.brickset.com/parts/?part=4622574">wheels</a> (front light and helmet are optional). It makes for a perfect model of a bike, but will not help you build anything else.</p>
<p>In summary, the third lesson seems to be that in order to compensate for a rigid way to connect things, you need to introduce an incredible host of special-purpose components to do just <em>that</em> particular thing that cannot be done from heavy standard bricks.</p>
<p>An attractive property of Lego that might or might not be desirable in software is that even in a finished model, you have plenty of places where you can attach new things. Unless you carefully cover all studs with little flat plates, anyone can add things to the model in almost any place. Indeed, extensibility has been a common theme for long-term successful software: by allowing users to extend and build on the basic software, the power and staying power of the software is typically greatly extended. Look at tools like Emacs, gdb, good old Microsoft Word, or Eclipse: by having ways for users (and OEMs and integrators and VARs and consultants) to hook in their own scripts and extensions, you get a software ecosystem that keeps users coming back. Trying to keep users out with a shiny polished surface where things just slide off tends to result in shorter-lived software.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-duplo.jpg"><img class="wp-image-1811 alignright" style="margin: 5px;" title="lego duplo" src="http://jakob.engbloms.se/wp-content/uploads/2013/01/lego-duplo.jpg" alt="" width="212" height="378" /></a>Finally, I have to recount a story from many years ago, when I was studying in Germany. My professor in software engineering made a remark that RAD (rapid application development) tools were like building with Lego Duplo: you quickly get something built, but you do not have much control over how it looks. I agree. You can build big things very quickly with Duplos, but they do require a lot of imagination to interpret the creation. Like the rocket on the right. It took no time at all to build, and it is very large compared to regular Lego (see the minifig mechanic to get the scale).</p>
<p>So, to conclude, I think the Lego analogy has much going for it. But it is not &#8220;anyone can build anything&#8221;. The lessons that I think can be drawn from Lego for software designers:</p>
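<p>The "open studs" of a finished program are its extension points. As a rough sketch only (the registry, the hook name <code>before_save</code>, and <code>save_document</code> are all made up for this example and do not mirror how Emacs, gdb, Word, or Eclipse actually do it), a hook mechanism in Python can be as small as this:</p>

```python
from collections import defaultdict
from typing import Callable

# Hypothetical hook registry: named attachment points that stay open
# after the application itself is "finished".
_hooks: dict[str, list[Callable[[str], str]]] = defaultdict(list)


def register(hook_name: str, func: Callable[[str], str]) -> None:
    """Attach a user-supplied function to a named hook."""
    _hooks[hook_name].append(func)


def run_hooks(hook_name: str, text: str) -> str:
    """Run every function attached to the hook, in registration order."""
    for func in _hooks[hook_name]:
        text = func(text)
    return text


def save_document(text: str) -> str:
    """Core feature; returns what would be written to disk."""
    return run_hooks("before_save", text)


# Users extend the behavior without touching save_document itself.
register("before_save", str.strip)
register("before_save", lambda t: t + "\n")

print(repr(save_document("  hello  ")))  # 'hello\n'
```

<p>The core function stays small, and each registered extension is one more brick snapped onto an exposed stud.</p>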
<ul>
<li>Design for modularity and connectivity from the very start</li>
<li>Force all modules to be connectable, even if that contorts the shape of the module a little (global goals override local optimizations)</li>
<li>Ease of assembly does not make design and architecture go away</li>
<li>Sophisticated software design will still be a skill</li>
<li>Expect to see many special-purpose modules created to solve very particular tasks</li>
<li>Expect modularity to show through in the end product &#8211; it will not look quite the same as something monolithic</li>
<li>Allow extensions throughout the life of a piece of software &#8211; embrace user extensibility</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1806/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Does ISA Matter for Performance?</title>
		<link>http://jakob.engbloms.se/archives/1801?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1801#comments</comments>
		<pubDate>Sun, 23 Dec 2012 20:54:41 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer architecture]]></category>
		<category><![CDATA[32-bit]]></category>
		<category><![CDATA[64-bit]]></category>
		<category><![CDATA[68000]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[CISC]]></category>
		<category><![CDATA[instruction set architecture]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[ISA]]></category>
		<category><![CDATA[MIPS]]></category>
		<category><![CDATA[performance optimization]]></category>
		<category><![CDATA[power architecture]]></category>
		<category><![CDATA[RISC]]></category>
		<category><![CDATA[SPARC]]></category>
		<category><![CDATA[x86]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1801</guid>
		<description><![CDATA[When I grew up with computers, the big RISC vs CISC debate was raging. At the time, in the late 1980s, it did indeed seem that RISC was inherently superior to CISC. SPARCs, MIPS, and Alpha all outpaced boring old x86, VAX and 68000 processors. This turned out to be a historical parenthesis, as the <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1801" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/12/binary-code1.jpg"><img class="alignleft size-full wp-image-1803" style="margin: 5px 10px;" title="binary code" src="http://jakob.engbloms.se/wp-content/uploads/2012/12/binary-code1.jpg" alt="" width="89" height="85" /></a>When I grew up with computers, the big RISC vs CISC debate was raging. At the time, in the late 1980s, it did indeed seem that RISC was inherently superior to CISC. SPARCs, MIPS, and Alpha all outpaced boring old x86, VAX and 68000 processors. This turned out to be a historical parenthesis, as the Pentium Pro from Intel showed how RISC-style performance could be mated to a CISC ISA. However, maybe ISAs still do matter.</p>
<p><span id="more-1801"></span></p>
<p>The conventional wisdom (see for example <a href="http://archive.arstechnica.com/cpu/4q99/risc-cisc/rvc-1.html">http://archive.arstechnica.com/cpu/4q99/risc-cisc/rvc-1.html</a>) since the Pentium Pro has been that for a mainstream processor, the decoder logic is such a small part that it does not really matter. Behind that decoder, both a high-end RISC and a high-end CISC do pretty much the same thing, and the sheer weight of design manpower and manufacturing advantages that Intel has had has ensured that their processors have been the fastest or at least very competitive for the past decade and a half. CISC vs RISC also turned out to be a blurry boundary – some PowerPC processors did the same trick as Intel processors, breaking up their instructions into microoperations before sending them down the pipeline. Indeed, Power Architecture is sometimes not even considered RISC, but rather something of a hybrid.</p>
<p>It seems that market share is just as important for processor performance as technical elegance and ingenuity – a bit sad, but pretty true. At least, this holds true in the “old mainstream” desktop and server markets.</p>
<p>Still, ISA matters. And even within a family, ISA changes can have a huge impact on performance.</p>
<p>Obviously, we have specialized ISAs like DSPs and network processors and GPUs. In these cases, the ISAs let us express computations in ways that they cannot be on a general-purpose processor, often trading ease of writing general code against performance on a certain class of computation. Efficiency is maybe an order of magnitude better compared to a GPP for these processors, doing the same work. A high-end GPP can often match the absolute performance of a specialized processor, by the brute force of high clock speeds, large caches, and aggressive out-of-order pipelines. But in doing so, it uses 10-100 times more energy.</p>
<p>Inside a mainstream ISA, the trend of the last decade has been the successive addition of small and large sets of specialized instructions for doing various important computations. Floating point units have been essentially replaced by vector processing instruction sets such as MMX, SSE, Altivec, VFP, and Neon. Crypto instructions have entered the mainstream on all major architectures. Power Architecture has included binary-coded-decimal instructions, and IBM mainframes have instructions that do string copies in the L3 cache. Configurable architectures like Tensilica show the value in adding a few well-chosen application-specific instructions. Adding features to existing ISAs has been proven to have very good bang for the buck.</p>
<p>Another, more interesting, aspect is the modernization of old ISAs. The actual cause for this blog post was that the Microprocessor Report estimated that an ARM Cortex-A57 (and A53) core would gain 10% performance when running in ARM AArch64 rather than AArch32 mode. That is pretty significant – the same processor core, the same program, just compiled using a different set of instructions. It is especially interesting for ARM, since the ARMv8 AArch64 64-bit instruction set is pretty much totally different from the old ARM 32-bit instruction set. ARM took the chance to modernize and remove old “good ideas at the time” aspects that are hard to implement efficiently today, such as predication on most instructions, the wonderfully complex shifting operands, and pc-relative constant pools swimming around in the instruction stream (see <a href="http://www.realworldtech.com/arm64/">http://www.realworldtech.com/arm64/</a> for more details). AArch64 also finally makes ARM an architecture with 32 registers, rather than the old 16, which makes compiler optimizations much easier.</p>
<p>This means that AArch64 improves performance in two dimensions: it allows better, more streamlined silicon implementations compared to the old ARM, and it makes the life of a compiler a bit easier with fewer spills and more data kept in registers. I think it is mostly this second effect that is seen for the Cortex-A57 and A53.</p>
<p>I think this is good proof that ISA design does matter. Another similar data point is the effect of going to x86-64 from x86-32. Typically, you can see a five to ten percent improvement in the speed of compute-intense code from the better register allocation allowed by having eight extra registers and the overall somewhat cleaner instruction set. This is a different approach compared to ARM, since x86 maintains the 32-bit instruction set and simply extends it, where ARM basically changed the ISA completely.</p>
<p>Thus, my conclusion is that the design of an ISA can have a significant effect on the performance of a processor, even when all other factors are held constant. What this “best” design is seems to change over time – in the 1970s, CISC designs like the 8008 and 8086 were dominant as they let you do more with each slow fetch from memory. In the 1980s, with faster memory, RISC used a bit more program space to create an instruction set that could map directly to a very efficient pipelined processor. In the 1990s, with more complex processors, RISC or CISC turned out to be equivalent. In the 2010s, it is clear that efficiencies can be achieved by designing a clean ISA that is easy to implement in many different ways in out-of-order speculative high-frequency designs. The key today is hardware-independence, where it used to be the close co-design of instructions and hardware.</p>
<p>On the ultra-low-end, I think the classic RISC idea of “simple to implement” instructions still has clear value. Something like the ARM Cortex-M0 can be implemented in 12000 transistors, on the same level as a 1970s 8-bit design – while being fully 32-bit and clocking in at many times the performance of these old machines. It is even half the size of an Intel 8086. It is hard to see how an x86 processor could ever be cut down to this kind of size – unless of course Intel does the same as ARM did, and cuts most of the instruction set down into an ultrareduced version. Note that a classic simple RISC like the MIPS R2000 was still about twice the size of a 68000 processor, so it is not necessarily the case that RISC means small. You need to design for smallness too.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1801/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: Debugging Simics using Simics</title>
		<link>http://jakob.engbloms.se/archives/1797?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1797#comments</comments>
		<pubDate>Thu, 06 Dec 2012 11:04:32 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Debug]]></category>
		<category><![CDATA[race condition]]></category>
		<category><![CDATA[reverse debugging]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1797</guid>
		<description><![CDATA[There is a new post at my Wind River blog, telling the story of how some of the Simics developers used Simics itself to debug an intermittent Simics program crash caused by a timing-sensitive race condition. Running Simics on itself is pretty cool, and shows the power of the simulator and its applicability even to <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1797" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" />There is a new post at my Wind River blog, telling the story of how some of the Simics developers <a href="http://blogs.windriver.com/tools/2012/12/debugging-simics-on-simics.html">used Simics itself to debug an intermittent Simics program crash </a>caused by a timing-sensitive race condition.</p>
<p>Running Simics on itself is pretty cool, and shows the power of the simulator and its applicability even to really complex software.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1797/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: Simics and Flying Piggies</title>
		<link>http://jakob.engbloms.se/archives/1787?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1787#comments</comments>
		<pubDate>Wed, 28 Nov 2012 09:59:06 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer games]]></category>
		<category><![CDATA[off-topic]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[Bad Piggies]]></category>
		<category><![CDATA[determinism]]></category>
		<category><![CDATA[Rovio]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[variability]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1787</guid>
		<description><![CDATA[I just added a new blog post at the Wind River blog, about determinism, illustrating Simics-style determinism by looking at the game Bad Piggies. Games and simulators have quite a lot in common, actually. In particular, Bad Piggies is perfectly deterministic. The details are all in the Wind River blog post, but I felt <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1787" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" />I just <a href="http://blogs.windriver.com/tools/2012/11/determinism-simics-and-flying-piggies.html">added a new blog post at the Wind River blog</a>, about determinism, illustrating Simics-style determinism by looking at the game Bad Piggies. Games and simulators have quite a lot in common, actually.</p>
<p><span id="more-1787"></span></p>
<p>In particular, Bad Piggies is perfectly deterministic.</p>
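<p>What deterministic means here is that the same starting state and the same inputs always give exactly the same run. The toy Python sketch below illustrates the property only; it is not how Bad Piggies or Simics is implemented, and the function name, constants, and bounce rule are made up. The two ingredients are a fixed time step and a privately seeded random generator, so nothing depends on wall-clock time or global state.</p>

```python
import random


def simulate(seed: int, steps: int = 200, dt: float = 0.01) -> list[float]:
    """Drop a bouncing object from 10 m with pseudo-random gusts (toy model)."""
    rng = random.Random(seed)          # private generator, no global state
    height, velocity = 10.0, 0.0
    trace = []
    for _ in range(steps):             # fixed time step, never wall-clock time
        gust = rng.uniform(-0.5, 0.5)  # pseudo-random disturbance
        velocity += (-9.81 + gust) * dt
        height += velocity * dt
        if height < 0.0:               # bounce off the ground, losing energy
            height, velocity = -height, -0.8 * velocity
        trace.append(height)
    return trace


run_a = simulate(seed=42)
run_b = simulate(seed=42)
print(run_a == run_b)   # True
```

<p>Even with "random" gusts, repeating the run with the same seed reproduces every value exactly, which is what makes a failing run repeatable for debugging.</p>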
<p>The details are all in the Wind River blog post, but I felt I wanted to share some cute screenshots from the game here.</p>
<p>First, we see how levels most usually end, with a broken vehicle and a bouncing pig. The main departure from reality in the physics engine in Bad Piggies is the fact that the pig is made from some indestructible material and can take any amount of pounding without getting more than a few cartoon stars around the head. Perfectly OK, as that makes for a compelling game.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/11/bad-piggies-cart-falling-apart.png"><img class="aligncenter size-full wp-image-1788" title="bad-piggies-cart-falling-apart" src="http://jakob.engbloms.se/wp-content/uploads/2012/11/bad-piggies-cart-falling-apart.png" alt="" width="503" height="435" /></a></p>
<p>Second, the animations are also just wonderful: the pig (and the queen that sometimes tags along) look genuinely scared as they jump, fly, and bounce around.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/11/bad-piggies-with-queen-driving.png"><img class="aligncenter size-full wp-image-1789" title="bad-piggies-with-queen-driving" src="http://jakob.engbloms.se/wp-content/uploads/2012/11/bad-piggies-with-queen-driving.png" alt="" width="640" height="400" /></a></p>
<p>The game is highly recommended, and I have seen people just watching a run laugh out loud as things do not work out too well&#8230; it is like a classic Roadrunner cartoon come to life, or silent movie physical humor.</p>
<p>(Screenshots taken from the HD version on a Google Nexus 7, and then edited down to size).</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1787/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dragons can be Useful &#8211; when AT Models Make Sense</title>
		<link>http://jakob.engbloms.se/archives/1779?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1779#comments</comments>
		<pubDate>Mon, 12 Nov 2012 13:51:22 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Bill Neifert]]></category>
		<category><![CDATA[Carbon]]></category>
		<category><![CDATA[clock-cycle models]]></category>
		<category><![CDATA[cycle accuracy]]></category>
		<category><![CDATA[dragons]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1779</guid>
		<description><![CDATA[Carbon Design Systems keeps putting out interesting blog posts at a good pace. Bill Neifert recently put up a blog post about the various speed/accuracy tradeoffs you can make when building virtual platforms. The main message of the blog is that you should use a mix of fast models (TLM + JIT, like <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1779" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.carbondesignsystems.com">Carbon Design Systems </a>keeps putting out interesting blog posts at a good pace. <a href="http://jakob.engbloms.se/wp-content/uploads/2012/05/carbonlogo.jpg"><img class="alignleft size-full wp-image-1654" title="carbonlogo" src="http://jakob.engbloms.se/wp-content/uploads/2012/05/carbonlogo.jpg" alt="" width="66" height="66" /></a>Bill Neifert recently put up a blog post about the various <a href="http://www.carbondesignsystems.com/virtual-prototype-blog/bid/163520/achieving-speed-and-accuracy-with-an-arm-virtual-prototype">speed/accuracy tradeoffs you can make when building virtual platforms</a>. The main message of the blog is that you should use a mix of fast models (TLM + JIT, like the ARM Fast Models) and cycle-accurate generated-from-RTL models (like the models generated by Carbon&#8217;s tools). By switching between the levels of abstraction when you need to go fast or go deep, you get something that is pretty much the best of both worlds (I have blogged about <a href="http://jakob.engbloms.se/archives/1653">changing between abstraction levels</a> before). It makes perfect sense, and I am with him on that. There are dragons in the middle land.</p>
<p>However, I do not quite agree with Bill about the absolute uselessness of the intermediate types of models, like SystemC TLM-2.0 AT, that is, what is traditionally called &#8220;cycle accurate modeling&#8221; when it is not derived from RTL.</p>
<p><span id="more-1779"></span></p>
<p>I love the illustration on their blog post:</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/11/carbon-here-be-dragons.png"><img class="aligncenter size-full wp-image-1781" title="carbon-here-be-dragons" src="http://jakob.engbloms.se/wp-content/uploads/2012/11/carbon-here-be-dragons.png" alt="" width="537" height="464" /></a></p>
<p>But that land of dragons sometimes has to be visited, for those daring knights of design that need to peek into the future.</p>
<p>If the goal is to build a model that <a href="http://jakob.engbloms.se/archives/1083">describes </a>a piece of hardware at cycle accuracy, manually coded AT models are the wrong way to go (see this blog post about the <a href="http://jakob.engbloms.se/archives/153">futility of  cycle accurate model building</a>). You will never be quite right, and the resulting model won&#8217;t be that much faster than a model derived from RTL.</p>
<p>However, if we look at the case of designing new hardware, the dragon area is the right place to be. Before you have RTL, you cannot be cycle-accurate to the final implementation, since that does not yet exist. Rather, you want a model that lets you try various latencies and pipelines and parallelization strategies and estimate the eventual performance. This has been the standard way to develop new processors <a href="http://jakob.engbloms.se/archives/1126">since the late 1950s</a>, and still is. James Aldis of TI had a nice description of the process in an <a href="http://jakob.engbloms.se/archives/1387">article from last year</a>.</p>
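<p>To make the idea of trying various latencies concrete, here is a minimal sketch of that kind of exploration. It is not an AT model and not any vendor&#8217;s tool: the function names, the instruction mix, and the miss rate are all assumptions picked for illustration, and the formula is a plain first-order stall estimate.</p>
<pre>
```python
# Hypothetical workload numbers, chosen only for illustration.
def estimate_cpi(load_fraction, miss_rate, miss_latency, base_cpi=1.0):
    """Average cycles per instruction for a very simple in-order core."""
    # Every instruction costs base_cpi. A fraction of instructions are loads,
    # and a fraction of those miss the cache and stall for miss_latency cycles.
    return base_cpi + load_fraction * miss_rate * miss_latency


def sweep(latencies, load_fraction=0.25, miss_rate=0.04):
    """Return {latency: estimated CPI} for each candidate memory latency."""
    return {lat: estimate_cpi(load_fraction, miss_rate, lat) for lat in latencies}


if __name__ == "__main__":
    for latency, cpi in sweep([20, 50, 100]).items():
        print(f"miss latency {latency:3d} cycles: estimated CPI {cpi:.2f}")
```
</pre>
<p>The point is only that the latency is a knob you turn before any RTL exists; a real architectural model would replace the one-line formula with pipelines, queues, and contention.</p>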
<p>The key take-away for me is that anyone who does not either have access to the RTL of a processor or a direct connection to the design team pretty much has to give up on hoping for &#8220;cycle accurate&#8221; models. In any case, once we start moving away from the processor, out into peripherals and then off-chip, we soon run out of precise models no matter what. A virtual platform can be perfectly useful for most use cases, even when it is only a fast platform with no attempt at cycle accuracy.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1779/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Logging (Some More Thoughts)</title>
		<link>http://jakob.engbloms.se/archives/1771?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1771#comments</comments>
		<pubDate>Sun, 28 Oct 2012 21:38:18 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[ACM Queue]]></category>
		<category><![CDATA[Adam Oliner]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[logging]]></category>
		<category><![CDATA[tracing]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1771</guid>
		<description><![CDATA[Logging as a debug method is not new, and I have been writing about it on and off over the past few years myself.  At the S4D conference, tracing and logging keep coming up as a topic (see my reports from 2009, 2010  and 2012 ).  I recently found an interesting piece on logging from <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1771" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p style="text-align: left;"><a href="http://jakob.engbloms.se/wp-content/uploads/2011/05/debug-small.png"><img class="alignleft size-full wp-image-1421" title="debug small" src="http://jakob.engbloms.se/wp-content/uploads/2011/05/debug-small.png" alt="" width="81" height="73" /></a>Logging as a debug method is not new, and I have been writing about it on and off over the past few years myself.  At the S4D conference, tracing and logging keep coming up as a topic (see my reports from <a href="http://jakob.engbloms.se/archives/942">2009</a><a href="http://jakob.engbloms.se/archives/1251">, 2010  </a>and <a href="http://jakob.engbloms.se/archives/1758">2012 </a>).  I recently found an interesting piece on logging from the IT world in the ACM Queue (&#8220;<a href="http://queue.acm.org/detail.cfm?id=2082137">Advances and Challenges in Log Analysis</a>&#8220;, Adam Oliner, ACM Queue December 2011).  Here, I want to address some of my current thoughts on the topic.</p>
<p style="text-align: left;"><span id="more-1771"></span></p>
<h2 style="text-align: left;">Printf debug</h2>
<p style="text-align: left;">For some people, log printouts from a program are dirty and primitive. Very often when discussing debug methodologies, everyone starts with &#8220;printf debug&#8221; and implicitly considers everything else superior. In many cases this is true. When solving comparatively simple linear problems, proper use of a typical debugger ought to solve a problem in much less time and with much less intrusion into the target system. However, once you get to a system composed of many quite independent parts, with unpredictable sequencing of actions and concurrent activity, logging is actually a pretty practical and powerful way to diagnose a system. In such a case, the key is to be systematic about how logging is done and to make sure it is much more than just a sprinkling of print statements in random places in the program. What <em>good</em> logging brings you is the executable embodiment of the knowledge about the most <em>interesting and important events</em> in the flow of a program, put there by the programmer writing the code. It is high-level semantic information, much easier to digest than a line by line trace or stepping of the code of the program. I find it hard to see a better way to put that in than by annotating the source code with log statements &#8211; but they had better be statements that are clearly about logging, and that direct the logs to some kind of processing system that allows selection and sorting of the output.</p>
<p style="text-align: left;">Of course, logging can also be done badly, in which case nothing important is ever logged, or millions of uninteresting things are logged, hiding the few relevant events. I have seen this many times. But when done right, logs help you zero in on a problem at a very high pace, without having to really sit down and inspect the code. Often, good logs let you pinpoint a problem and report it to the programmer without needing more than a trace of the log messages that show that something bad is going on in the code modules executing.</p>
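<p style="text-align: left;">As a rough illustration of what I mean by log statements that feed a processing system allowing selection and sorting, here is a minimal sketch. All the names (<code>LogEvent</code>, <code>LogCollector</code>, the module and level fields) are made up for this example and do not refer to any particular logging framework.</p>
<pre>
```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogEvent:
    time: int      # timestamp in arbitrary ticks
    module: str    # which part of the system emitted the event
    level: int     # 1 = important, larger numbers = more detail
    message: str


@dataclass
class LogCollector:
    events: List[LogEvent] = field(default_factory=list)

    def log(self, time, module, level, message):
        """The only call that appears in application code."""
        self.events.append(LogEvent(time, module, level, message))

    def select(self, modules=None, max_level=1):
        """Return the events that pass the filter, sorted by time."""
        chosen = [e for e in self.events
                  if max_level >= e.level
                  and (modules is None or e.module in modules)]
        return sorted(chosen, key=lambda e: e.time)


if __name__ == "__main__":
    collector = LogCollector()
    collector.log(30, "driver", 1, "DMA transfer started")
    collector.log(10, "app", 1, "request received")
    collector.log(20, "app", 3, "parsing header byte 0")
    for event in collector.select(max_level=1):
        print(event.time, event.module, event.message)
```
</pre>
<p style="text-align: left;">The detailed level-3 event is still recorded, but it stays out of the way until somebody asks for it, which is the difference between a log and a sprinkling of print statements.</p>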
<h2 style="text-align: left;">Single sequence</h2>
<p style="text-align: left;">One problem with logging that almost always comes up is that you have many different log sources and many different log systems (the ACM Queue article has several examples of this), and thus getting a consistent, correctly ordered view of events can be very difficult or impossible.  Obviously, this is something that can be solved for a system over which you have some amount of control. It would seem to me that logging all systems through a single conduit makes the most sense &#8211; but also requires some changes to existing mechanisms.  In many cases, it would seem possible to just change existing log collectors to push the final results through to some unified infrastructure.</p>
<p style="text-align: left;">The typical real-world case is shown below, contrasted with an ideal case of a single log conduit that creates a single consistent view of what is going on.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/10/log1.png"><img class="aligncenter size-full wp-image-1772" title="log1" src="http://jakob.engbloms.se/wp-content/uploads/2012/10/log1.png" alt="" width="610" height="302" /></a>Back at the S4D in 2010, Pat Brouillette presented a paper on Intel&#8217;s internal SVEN system that still presents the best logging design I have seen (but I have not seen every system on the planet so there are probably more that are as sophisticated).  What I found particularly attractive and useful in their approach was that they logged in user-level software, drivers, operating systems, and the hardware itself. Thus, you could follow a chain of events up and down through the stack.  As illustrated above, most existing log systems only apply at one of these levels, making it hard to really see what is happening. This system pretty much achieves the unified log illustrated above.</p>
<p>The result of a unified log is that you can follow the execution through the software stack, even across changes from user-level to kernel-level code and the actions of drivers on the hardware. This is incredibly useful for systems where software and hardware need to interact closely. This is something really hard to do using other debug methods, as most debuggers lose the trail when changing execution modes. Even when they can indeed step along, the volume of code that needs to be followed is so big that it is not very likely that debug information and source code is available for all of it.</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/10/log-unified-via-hw-output.png"><img class="aligncenter size-full wp-image-1774" title="log unified via hw output" src="http://jakob.engbloms.se/wp-content/uploads/2012/10/log-unified-via-hw-output.png" alt="" width="649" height="271" /></a></p>
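<p>A minimal sketch of the single-conduit idea, assuming the hard part is already solved: every source stamps its events with a shared timebase, and each per-source log is already in time order. The log contents and the function name <code>unify</code> are invented for the example.</p>
<pre>
```python
import heapq


def unify(*sources):
    """Merge per-source logs, each sorted by timestamp, into one sequence."""
    # Each entry is (timestamp, source_name, message). heapq.merge streams
    # the inputs in order without re-sorting the whole collection.
    return list(heapq.merge(*sources, key=lambda entry: entry[0]))


app_log = [(5, "app", "open device"), (40, "app", "read returned")]
kernel_log = [(8, "kernel", "syscall open"), (35, "kernel", "wake reader")]
hw_log = [(12, "hw", "register write 0x10"), (30, "hw", "interrupt raised")]

for timestamp, source, message in unify(app_log, kernel_log, hw_log):
    print(f"{timestamp:4d} {source:7s} {message}")
```
</pre>
<p>In a real system the shared timebase is exactly what separate log systems lack, which is why pushing everything through one conduit (or one piece of hardware) is so attractive.</p>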
<h2>Hardware support</h2>
<p>A key to making the unified log approach work is to have hardware support for logging. This gets around the common complaint about logging that the overhead affects the system and swamps IO. In particular for resource-constrained systems, logging can quickly start to impact performance and in particular IO performance (it seems to be less of an issue in the server space). But there are ways around that, if we allow the hardware to help us. Thus, we want to get something like this into our embedded system:</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/10/log-via-hardware.png"><img class="aligncenter size-full wp-image-1773" title="log via hardware" src="http://jakob.engbloms.se/wp-content/uploads/2012/10/log-via-hardware.png" alt="" width="447" height="247" /></a>The simplest approach is to have some special memory-mapped locations where log information is dumped, and then passed on via a debug interface to the outside world.  This creates a side channel which avoids mixing log messages into the useful data streams being sent over serial lines, networks, or other application communications channels. ARM has support for this in their trace facilities for hardware debug today, and I have seen it used to implement both efficient logging and a general printf (using the output channel instead of serial ports to get data back to the debugger).  We also did something similar with <a href="http://blogs.windriver.com/tools/2012/04/getting-code-coverage-with-ldra-tools-on-simics.html">Simics and the LDRA code-coverage tools</a>.</p>
<p>In a panel at the S4D in 2012 I proposed using a special instruction in a processor&#8217;s ISA to do logging (of a single integer or a few integer values stored in registers).  I think this could work very well, but it would require some kind of discipline to allocate log ID numbers to various parts of the stack. This kind of approach would work best when the goal is to know where you have been executing with a small amount of context data. If there is a need to dump large amounts of data (like typical IT logs), it probably does not work that well. The advantage is that it can be used within any code, as there is no need for a call into a system library (print statements are typically verboten inside interrupt code, for example). Today, we can do this kind of logging inside virtual platforms by using &#8220;magic instructions&#8221; (no-ops that the simulator recognizes as special and which can be handed off to instrumentation modules or scripts), but it would be a very handy thing to have in hardware as well.</p>
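<p>To show what the log ID discipline could look like, here is a small sketch that models the receiving end of such a log instruction. Everything in it is an assumption made for illustration: the split of a 32-bit ID into an 8-bit layer and a 24-bit event number, the layer table, and the fixed record of one ID plus one 32-bit value. It does not describe any existing ISA or any simulator&#8217;s magic instruction interface.</p>
<pre>
```python
import struct

# Hypothetical allocation: top 8 bits name the layer, low 24 bits the event.
LAYERS = {"app": 0x01, "os": 0x02, "driver": 0x03}
LAYER_SHIFT = 0x1000000                  # 2**24
RECORD = struct.Struct("!II")            # 32-bit log ID, 32-bit value


def make_id(layer, event):
    """Build a log ID; event must fit in 24 bits."""
    return LAYERS[layer] * LAYER_SHIFT + event


def emit(buffer, log_id, value):
    """Stand-in for the log instruction: append one fixed-size record."""
    buffer.extend(RECORD.pack(log_id, value))


def decode(buffer):
    """Host side: turn raw records back into (layer, event, value)."""
    names = {code: name for name, code in LAYERS.items()}
    for log_id, value in RECORD.iter_unpack(bytes(buffer)):
        layer_code, event = divmod(log_id, LAYER_SHIFT)
        yield names[layer_code], event, value


if __name__ == "__main__":
    trace = bytearray()
    emit(trace, make_id("driver", 7), 42)    # e.g. inside an interrupt handler
    emit(trace, make_id("app", 1), 1000)
    print(list(decode(trace)))
```
</pre>
<p>Each record is eight bytes and needs no formatting or library call on the target, which is what makes the approach usable inside interrupt code; the meaning of the numbers lives in a table on the host.</p>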
<h2>Analyzing the results</h2>
<p>Given some kind of log mechanism, the final step is to analyze the logs to figure out what is going on and whether something is going on that should not. This is the part that I am currently looking at ways to address &#8211; it is much easier to get things out of a system than to really understand them and draw the right conclusions. There is a huge body of research that should apply here, but I am just not sure about how the pieces should fit together for my target of &#8220;general embedded systems&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1771/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Reverse Execution History Updates</title>
		<link>http://jakob.engbloms.se/archives/1768?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1768#comments</comments>
		<pubDate>Sat, 29 Sep 2012 19:33:59 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[history of computing]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[CTS]]></category>
		<category><![CDATA[Lauterbach]]></category>
		<category><![CDATA[reverse debugging]]></category>
		<category><![CDATA[reverse execution]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1768</guid>
		<description><![CDATA[After some discussions at the S4D conference last week, I have some additional updates to the history and technologies of reverse execution. I have found one new commercial product at a much earlier point in time, and an interesting note on memory consistency. First and most importantly, I must revise my previously published history of <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1768" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2011/12/reverse-icon.png"><img class="alignleft size-full wp-image-1550" title="reverse icon" src="http://jakob.engbloms.se/wp-content/uploads/2011/12/reverse-icon.png" alt="" width="62" height="62" /></a>After some discussions at the <a href="http://jakob.engbloms.se/archives/1758">S4D conference last week</a>, I have some additional updates to the history and technologies of reverse execution. I have found one new commercial product at a much earlier point in time, and an interesting note on memory consistency.</p>
<p><span id="more-1768"></span></p>
<p>First and most importantly, I must revise my <a href="http://jakob.engbloms.se/archives/1564">previously published history of reverse execution</a>. It turns out I was wrong about <a href="http://www.lauterbach.com">Lauterbach</a>. Rather than having something that was record-replay debug as I thought, it turns out that their CTS, Context Tracking System, actually is a working reverse debugger. And it has been that way since 1999, beating Green Hills Time Machine to market by quite a few years. Thus, the award for &#8220;first reverse debugger based on hardware trace recording&#8221; has to be reassigned to Lauterbach. I was presented with a Trace32 newsletter from 1999 where it was very clear that the CTS allows a user to move backwards in time in the trace, tying the point in time to source code and registers. The CTS can also be given commands to step backwards in time until some condition is fulfilled, which is in essence reverse breakpointing.</p>
<p>I have updated my post from early 2012 to reflect this new understanding. That is the charm of blogging &#8211; you are at liberty to go back and rewrite what you wrote when new facts appear. Still, I keep the old text around but with strike-through to show what I originally wrote, even if it was indeed wrong. Better to be clear on what has been revised than to silently change things.</p>
<p>Second, <a href="http://jakob.engbloms.se/archives/1758">as noted before</a>, weak memory models complicate replay for reverse execution. In a typical host-based approach to reconstruction-based reverse debug, the replay of a parallel execution is done in serial (with interleaving at all points where communication was detected). This allows race-condition debug for the case that races are caused by accessing shared memory without proper locking. However, if races arise not from the program design but from the hardware behavior, these will not be reproduced. In a weak memory model like the one ARM employs, it is theoretically possible for certain memory operations to take noticeable time to propagate from one processor to another. Such conditions will not be replayable when using a single processor to reproduce a parallel scenario. If nothing else, the replay on a single processor will ensure perfect memory consistency between threads. There is no way that hardware can allow such scenarios to be replayed without very special support being designed into the hardware. The only method I can imagine that can handle this correctly is to use cycle-accurate simulators that are still deterministic and applicable to reconstruction, running with a perfect model of the memory system &#8211; which means it will be painfully slow, but when debugging complex hardware-software memory consistency bugs, that might be the only tool that works.</p>
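<p>The serial replay idea can be sketched in a few lines. This is a toy, not how any real reverse debugger is built: the two &#8220;threads&#8221; are Python generators, the only switch point is the gap inside an unlocked read-modify-write, and the recorded schedule is just a list of thread names. All names are invented for the example.</p>
<pre>
```python
def incrementer(shared, name):
    """One 'thread': an unlocked read-modify-write, split at the racy point."""
    local = shared["counter"]        # read the shared value
    yield name                       # point where the other thread may run
    shared["counter"] = local + 1    # write back, possibly losing an update


def replay(schedule):
    """Run two threads serially, switching exactly where the recording says."""
    shared = {"counter": 0}
    threads = {"A": incrementer(shared, "A"), "B": incrementer(shared, "B")}
    for name in schedule:
        next(threads[name], None)    # run that thread up to its next switch point
    return shared["counter"]


if __name__ == "__main__":
    print(replay(["A", "A", "B", "B"]))   # no overlap between the threads
    print(replay(["A", "B", "A", "B"]))   # overlapping read-modify-write
```
</pre>
<p>The second schedule loses an update every time it is replayed, so a software race is reproducible. What the sketch cannot show is a write that becomes visible late on another core: with one dictionary and one executing thread, every read sees the latest write, which is the perfect memory consistency mentioned above.</p>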
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1768/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: Exposing OS Kernel Races with Landslide</title>
		<link>http://jakob.engbloms.se/archives/1766?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1766#comments</comments>
		<pubDate>Wed, 26 Sep 2012 19:18:08 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[parallel computing]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[Ben Blum]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Landslide]]></category>
		<category><![CDATA[operating systems]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1766</guid>
		<description><![CDATA[There is a new blog post on my Wind River blog, about the Landslide system from CMU. It is a pretty impressive Master&#8217;s Thesis project that used the control that Simics has over interrupts to systematically try different OS kernel thread interleavings to find concurrency bugs. The blog is an interview with Ben Blum, the <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1766" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" />There is a <a href="http://blogs.windriver.com/tools/2012/09/systematically-exposing-os-kernel-races-an-interview-with-ben-blum.html">new blog post on my Wind River blog</a>, about the <a href="http://www.pdl.cmu.edu/PDL-FTP/associated/CMU-CS-12-118_abs.shtml">Landslide </a>system from CMU. It is a pretty impressive Master&#8217;s Thesis project that used the control that Simics has over interrupts to systematically try different OS kernel thread interleavings to find concurrency bugs. The blog is an interview with <a href="http://bblum.net/">Ben Blum</a>, the student who did the work. Ben is now a PhD student, and I bet that he will continue to generate cool stuff in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1766/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>S4D 2012 &#8211; Notes</title>
		<link>http://jakob.engbloms.se/archives/1758?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1758#comments</comments>
		<pubDate>Mon, 24 Sep 2012 20:11:20 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[Hardware debug support]]></category>
		<category><![CDATA[S4D]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1758</guid>
		<description><![CDATA[Last week, I attended my fourth System, Software, SoC and Silicon Debug conference (S4D) in a row. I think the silicon part is getting less attention these days; most of the papers were on how to debug software, often with the help of hardware, and with an angle to how software runs in SoCs and <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1758" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg"><img class="alignleft size-full wp-image-941" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="" width="143" height="62" /></a>Last week, I attended my fourth <a href="http://www.ecsi.org/s4d">System, Software, SoC and Silicon Debug conference (S4D) </a>in a row. I think the silicon part is getting less attention these days; most of the papers were on how to debug software, often with the help of hardware, and with an angle to how software runs in SoCs and systems. I presented a paper reviewing the technology and history of reverse debugging, which went down pretty well.</p>
<p><span id="more-1758"></span></p>
<p>This year, S4D took place in Wien, colocated with FDL at TU Wien. The main building of the TU is really nice (but we were in the more modern microelectronics building). Here is a night shot of the newly renovated main building:</p>
<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/09/tu-wien-main-building_1.jpg"><img class="aligncenter size-full wp-image-1759" title="tu wien main building_1" src="http://jakob.engbloms.se/wp-content/uploads/2012/09/tu-wien-main-building_1.jpg" alt="" width="600" height="519" /></a></p>
<p>S4D is a small workshop, and the papers presented only provide part of the picture as to what the hot topics under discussion were. This year, the main themes that I picked up were:</p>
<ul>
<li>The bandwidth limitation for hardware-based debug.</li>
<li>Log-based debugging, automatically looking for errors in logs.</li>
</ul>
<p>The bandwidth limitation is very important. If you want to use hardware debug circuitry inside an SoC, you also need to be able to talk to it. The complexity of the chip and the capability of on-chip debug grow with Moore&#8217;s law &#8211; but off-chip bandwidth does not. The number of pins is also growing slowly, if at all, and thus there is strong pressure to use the pins and bandwidth for &#8220;real work&#8221; and not for debug. The result is a few trends in technology for debug:</p>
<ul>
<li>Doing more debug processing on the chip, without having to turn around in an off-chip interface box or debugger host (several research papers describe approaches that run checkers and inspection code on the chip, in a coprocessor or even on one of the regular cores).</li>
<li>Aggressive compression of data sent off-chip (the ARMv8 debug architecture presented by Michael Williams of ARM only traced mispredicted branches off-chip &#8212; expecting the debugger to reconstruct the flow from a minimal amount of information).</li>
<li>Software debug agents and the software interface of on-chip debug hardware are becoming more important, in particular for devices such as smartphones, where there is no dedicated hardware debug port and debug might be done over USB, Bluetooth, or Wi-Fi. Exposing hardware breakpoints and similar functions to software is therefore needed to let users actually take advantage of the debug power of a modern SoC. Hopefully, all other silicon vendors will follow the lead of ARM and expose really powerful features in the hardware to software agents so we can get away from silly things like rewriting code to plant breakpoints (and allow full data write and read breakpoints in software agents).</li>
<li>Simulator-based debug offers a way to get around the issue by having virtually infinite bandwidth (potentially at the cost of slowing down the target, obviously).</li>
</ul>
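<p>To make the compression idea a bit more concrete, here is a minimal sketch of how a debugger can rebuild the full branch flow from mispredictions alone. It is not the actual ARMv8 trace format: the two-branch <code>PROGRAM</code>, the 1-bit <code>LastOutcomePredictor</code>, and the <code>encode</code>/<code>decode</code> functions are all made up for illustration. The only assumption carried over is that the target and the debugger agree on the predictor and that the debugger has the program image.</p>

```python
from collections import defaultdict

# Hypothetical toy program: each branch maps to
# (next branch if taken, next branch if not taken); None ends the run.
PROGRAM = {
    "loop": ("loop", "check"),
    "check": ("loop", None),
}


class LastOutcomePredictor:
    """1-bit predictor: guess that a branch repeats its previous outcome."""

    def __init__(self):
        self.last = defaultdict(bool)  # unseen branches predict "not taken"

    def predict(self, pc):
        return self.last[pc]

    def update(self, pc, taken):
        self.last[pc] = taken


def encode(flow):
    """Target side: for each misprediction, emit the number of correctly
    predicted branches that preceded it. Returns (trace, trailing hits)."""
    predictor, hits, trace = LastOutcomePredictor(), 0, []
    for pc, taken in flow:
        if predictor.predict(pc) == taken:
            hits += 1
        else:
            trace.append(hits)
            hits = 0
        predictor.update(pc, taken)
    return trace, hits


def decode(trace, tail, start="loop"):
    """Debugger side: replay the same predictor over the program image and
    flip its guess wherever the trace reports a misprediction."""
    predictor, flow, pc = LastOutcomePredictor(), [], start

    def step(taken):
        nonlocal pc
        flow.append((pc, taken))
        predictor.update(pc, taken)
        pc = PROGRAM[pc][0 if taken else 1]

    for hits in trace:
        for _ in range(hits):
            step(predictor.predict(pc))   # predictor was right
        step(not predictor.predict(pc))   # predictor was wrong
    for _ in range(tail):
        step(predictor.predict(pc))
    return flow


# 100 loop iterations, loop exit, then program exit: 102 branches in total.
flow = [("loop", True)] * 100 + [("loop", False), ("check", False)]
trace, tail = encode(flow)
assert decode(trace, tail) == flow
print(len(flow), "branches ->", trace, "plus", tail, "trailing hit(s)")
```

<p>In this toy case, the better the predictor, the fewer records leave the chip, at the price of more reconstruction work in the debugger.</p>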
<p>Log-based debug is a favorite topic of mine, and it has been on the agenda for S4D since it started (see reports from <a href="http://jakob.engbloms.se/archives/1251">2010 </a>and <a href="http://jakob.engbloms.se/archives/942">2009</a>).</p>
<ul>
<li>This year, the most interesting idea was a hardware unit (generated into an FPGA) that watched as a target was executing, looking for traces that satisfied properties expressed using <a href="http://fsl.cs.uiuc.edu/index.php/Past_Time_Linear_Temporal_Logic">past-time Linear Temporal Logic </a>(ptLTL). ptLTL seems quite well-suited for the task of watching traces of events fly by &#8211; it allows looking backwards just a bit, which makes it much more powerful than just looking at the current state, but still it can be implemented quite efficiently.</li>
<li>Users are clearly using logs to diagnose issues in running systems, and a key problem there is finding issues in huge logs. This is nothing new.</li>
<li>There was a discussion over how to handle explicit log and instrumentation calls in software. Should they remain in the target software as it ships, or be removed? How does that affect certification and validation?</li>
<li>If we introduce hardware-supported log instructions in the ISA, couldn&#8217;t they also be used as a timing-fault-injection mechanism? Basically, with a settable pipeline stall? Such single-cycle overhead instructions should be possible to keep in the shipping software, as they do not lower performance too much. And if single-cycle disturbances kill your real-time system, it is too close to the edge anyway.</li>
</ul>
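<p>As a rough illustration of why ptLTL is cheap to monitor, here is a minimal software sketch (not the FPGA design from the paper). The property, the event names, and the <code>ReleaseMonitor</code> class are invented for the example: every <code>release</code> must have an <code>acquire</code> before it with no other <code>release</code> in between. The whole history is summarized in one stored bit, updated once per event.</p>

```python
class ReleaseMonitor:
    """Checks, at every step: release -> previously((not release) since acquire).

    Initial-state convention (an assumption of this sketch): before the
    first event, every temporal subformula is false.
    """

    def __init__(self):
        # One bit of state: value of "(not release) since acquire" after the
        # previous event. In hardware this would be a single flip-flop.
        self.since = False

    def step(self, event):
        acquire = event == "acquire"
        release = event == "release"
        # "previously(X)" is just X's stored value from the last step.
        ok = (not release) or self.since
        # p since q  ==  q or (p and previous value of "p since q")
        self.since = acquire or ((not release) and self.since)
        return ok


def first_violation(events):
    """Return the index of the first event that breaks the property, or None."""
    monitor = ReleaseMonitor()
    for index, event in enumerate(events):
        if not monitor.step(event):
            return index
    return None


print(first_violation(["acquire", "work", "release", "work", "release"]))
```

<p>The second <code>release</code> is flagged because nothing was acquired since the first one. Other past-time operators can be handled the same way, with one stored bit per temporal subformula.</p>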
<p>One idea that I threw out but that met very little agreement from the S4D and FDL participants was the notion that we should build systems that accept and tolerate and recover from errors, rather than hoping to make them bug-free and with timing under perfect control. I instinctively find the idea a bit repugnant, being schooled in the precise tradition of computer science where we expect programmers to fix bugs, not just work around them. But in practice, I realize that this might be the right thing to do as our systems get so complex we cannot hope to precisely understand them or diagnose issues in the lab. A typical <a href="http://queue.acm.org/detail.cfm?id=2333133">example of this approach was published in the ACM Queue </a>last year &#8211; basically, using a malloc system that minimizes the effects of buffer overruns, double-free, and similar common causes of crashes. A variant of this is actually shipping in Windows 7 already. People building safety-critical systems do not want to have to do this, but at some point we probably need to go statistical rather than showing that our software is correct. There is some interesting work in making software become more continuous than discrete in behavior, paving the way for statistical analysis of errors. But that was not a topic of S4D, at least not this year.</p>
<p>I presented a paper on the history and techniques of reverse debugging, and received some good feedback from the audience. Someone pointed out that with a weak memory model, record-replay on hardware is not guaranteed to reproduce all bugs, since concurrency bugs related to the memory model are not inside the controlled area. On x86, where most work has been done, the use of a <a href="http://jakob.engbloms.se/archives/1435">TSO-like memory model </a>makes this point fairly unimportant. But on ARM and Power architecture, it is indeed relevant. Another member of the audience found it funny that he believed that he had indeed invented record-based debug back in 2001 &#8211; but my presentation of history showed that there was ample work before that. It just goes to show how hard it is to know the history of computer science; many ideas are never widely circulated. Finally, there is a lead indicating that there was some kind of reverse-breakpoint trace-based debugger on the market around 1999. I hope to learn more and do a full blog post on this once more data emerges.</p>
<p>Once again a good workshop, and my only wish for next year is that many more people show up so we can get a broader discussion!</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1758/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SiCS Multicore Day 2012</title>
		<link>http://jakob.engbloms.se/archives/1751?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1751#comments</comments>
		<pubDate>Sun, 16 Sep 2012 20:12:01 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer architecture]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[parallel computing]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Erik Hagersten]]></category>
		<category><![CDATA[heterogeneous]]></category>
		<category><![CDATA[homogeneous]]></category>
		<category><![CDATA[James Larus]]></category>
		<category><![CDATA[Rich Hetherington]]></category>
		<category><![CDATA[SiCS Multicore days]]></category>
		<category><![CDATA[Stephen Hill]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1751</guid>
		<description><![CDATA[The 2012 edition of the SiCS Multicore Day was fun, as it has always been in the past. I missed it in 2010 and 2011, but could make it back this year. It was interesting to see that the points where keynote speakers disagreed were similar to previous years, albeit with some new twists. There <span class="ellipsis">&#8230;</span> <span class="more-link-wrap"><a href="http://jakob.engbloms.se/archives/1751" class="more-link"><span>Read More &#8594;</span></a></span>]]></description>
				<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2012/08/sics-logo.png"><img class="alignleft size-full wp-image-1725" style="margin: 5px;" title="sics logo" src="http://jakob.engbloms.se/wp-content/uploads/2012/08/sics-logo.png" alt="" width="66" height="35" /></a>The 2012 edition of the SiCS Multicore Day was fun, as it has always been in the past. I missed it in 2010 and 2011, but could make it back this year. It was interesting to see that the points where keynote speakers disagreed were similar to previous years, albeit with some new twists. There was also a trend in architecture, moving crypto operations into the core processor ISA, that indicates another angle on the hardware accelerator space.</p>
<p><span id="more-1751"></span></p>
<h3>Many-Core Missing</h3>
<p>Five years have passed since the first SiCS Multicore Day in <a href="http://jakob.engbloms.se/archives/17">2007 </a>(making this the sixth event), and in his introduction, Erik Hagersten looked back at some of the predictions made back then. One missed prediction stood out clearly: the idea that by now, 128 cores would be mainstream in personal computers. My theory of why this has not happened is simple: GPGPU. GPUs have eaten up the easy parallelism. Instead of running on massively multicore regular processors, heavy-duty personal computing has shifted onto GPUs. With these workloads gone, there has been little pressure on main processors to become more parallel, as there would not be much to gain from that, performance-wise. GPUs have turned out to be perfect for massively data-parallel work in media and other areas (including tasks like <a href="http://hashcat.net/oclhashcat-plus/">cracking password hashes</a> and mining for bitcoins), achieving performance orders of magnitude higher than what could be hoped for with a multicore main processor &#8211; while costing less and using comparatively little power.</p>
<p>The prevalence of GPGPU on the desktop is not mirrored in the top supercomputers, however. According to Erik Hagersten, there is no real GPGPU machine in the top-500 supercomputer list at the moment. Maybe 5% of the performance and 3% of the chips are GPUs. I suspect part of this might have to do with the kinds of tasks being done. HPC at the high-end probably requires more flexibility and programmer control than GPUs can offer.</p>
<p>Programmability might be more important in architectural design for HPC, as HPC users tend to be programmers. Most regular computer users, on the other hand, just use software written by someone else. Thus, it is enough that a few people go through the hard work of coding in CUDA or OpenCL or similar toolkits, and the results of their work can be spread across a very large user base. GPGPUs are perfect to provide &#8220;performance for the rest of us&#8221;, for common tasks coded by a few expert programmers.</p>
<h3>Homogeneous, Heterogeneous</h3>
<p>The debate over GPGPU is part of a bigger debate about homogeneous vs heterogeneous compute systems (see previous blog posts like <a href="http://jakob.engbloms.se/archives/90">this</a>, <a href="http://jakob.engbloms.se/archives/283">this</a>, <a href="http://jakob.engbloms.se/archives/157">this</a>, and <a href="http://jakob.engbloms.se/archives/1496">this</a>). The debate is still going on, with the same intensity as it always has. To me, that would seem to indicate that hardware accelerators are here to stay, even if some people do not really like them.</p>
<p>This year, the primary example of the drive to homogeneity was Intel&#8217;s recently announced &#8220;more than 50 x86 cores on a chip&#8221; Knights Corner (Xeon Phi). The argument for the chip is very much programmability: &#8220;just a large x86 box that runs Linux&#8221;. But I guess you do need special compilers or libraries to make use of the big, somewhat Cray-like vector unit (512-bit SIMD unit) each core has been equipped with. At least special optimization will be needed to make the best use of the chip, just like you always need to do when performance matters.</p>
<p>The SPARC T5 presented by Rich Hetherington from Oracle fell somewhere in between. It has 16 identical cores, but can tweak how it uses the SMT threading to make a core run a certain serial task faster than it otherwise would. This is a step towards the kind of heterogeneous performance in a single ISA that ARM is going after with their Cortex-A15/A7 big.LITTLE approach &#8211; but without the same span in performance, and also with less impact on the overall flexibility of the chip. The T5 also removed the special crypto accelerator hardware that used to be there, instead adding a few crypto instructions to the ISA.</p>
<p>The reason they moved crypto from an accelerator into the ISA was that it turned out to be costly to use a separate hardware unit for small pieces of data. There is OS overhead in invoking an accelerator, and that requires a decent-size buffer of data to work on. With instructions in the ISA, you can work on a single word and still get performance gains. User-level software also has a far easier time accessing it, as the instructions are just part of the regular instruction stream. Interestingly, ARM (as presented by Stephen Hill) had done the exact same thing for crypto, for the same reason. This is an important point for hardware accelerators in general: the driver overhead has to be managed, sometimes by mapping hardware straight to individual programs (I made a <a href="http://jakob.engbloms.se/archives/709">simple experiment a few </a>years ago that showed this nicely). On the other hand, everything put into the ISA risks making the entire processor a bit slower and power hungrier, making general ISA extensions something done with great care. Hardware accelerators can be removed from a certain SoC if they turn out not to be needed, which is not so easy with ISA components.</p>
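<p>The trade-off can be put into a back-of-the-envelope model. The sketch below assumes a fixed per-call overhead for the accelerator and a constant per-byte cost for both paths; the function name <code>break_even_bytes</code> and all the numbers are made up for illustration and do not come from the talks.</p>

```python
import math


def break_even_bytes(call_overhead_ns, accel_ns_per_byte, inline_ns_per_byte):
    """Smallest buffer size in bytes at which offloading is strictly faster
    than in-line instructions, or None if offloading never wins."""
    if accel_ns_per_byte >= inline_ns_per_byte:
        return None
    gain_per_byte = inline_ns_per_byte - accel_ns_per_byte
    # overhead + n * accel  is below  n * inline  once  n exceeds overhead / gain
    return math.floor(call_overhead_ns / gain_per_byte) + 1


# Made-up numbers: 2 microseconds of driver/OS overhead per call,
# 0.25 ns/byte in the accelerator, 1.25 ns/byte with ISA instructions.
print(break_even_bytes(2000, 0.25, 1.25))
```

<p>With these invented numbers, any buffer of 2000 bytes or less is better handled by instructions in the regular instruction stream, which is the small-data case described above.</p>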
<p>Stephen Hill from ARM clearly believed in heterogeneity, with four types of processing on a typical chip:</p>
<ul>
<li>Big core (ARM Cortex-A15 today)</li>
<li>Little core (ARM Cortex-A7) &#8211; to create the kind of big.LITTLE setup that allows for a bigger span of power-performance settings.</li>
<li>GPU (from ARM, that means Mali-T604 today) &#8211; they clearly see that GPGPU is moving into the mobile space very quickly, doing the same kind of work that it has done on the desktop, and with the same effect of reducing the need for general processor cores.</li>
<li>Special-purpose accelerators &#8211; except when merged into the ISA, as noted above.</li>
</ul>
<p>In researching some of the material from James Larus&#8217; talk, I also came across an interesting <a href="http://www.youtube.com/watch?v=P9NqzPQWzSk">talk from Surge 2011 where Artur Bergman from fastly.com </a>tells how they have optimized their content delivery network by only relying on plain processors and not using any network processing offload, router ASICs, etc. Too hard to use, too easy to make errors and have the software crash, and &#8220;Xeons are simply faster&#8221;. Note that the word &#8220;energy&#8221; is never mentioned in his talk.</p>
<h3>Software Needs to be 100x Better</h3>
<p>The software perspective was presented by James Larus from Microsoft Research. His talk made many interesting points, but I think the main ones were:</p>
<p><em>We are not even trying to make efficient systems today</em>, throwing away billions of clock cycles on plain pure overhead. Example: IBM had investigated the conversion of a SOAP (text) date to a Java date object in the IBM Trade benchmark: 268 function calls and 70 object allocations. There is great modularity and nothing obviously wrong in the code. About 20% of memory is used to hold actual data; the rest is hash tables, object management overhead, etc. In general, objects are small and waste is large. Great for programmers, bad for machines. We could and should find ways to do better in programming than this; we need to find a way to make performance an abstraction we can work with.</p>
<p><em>Languages should be as efficient as they can.</em> Today, common runtimes like PHP, Python, and Ruby are very far from optimal. They work &#8220;well enough&#8221;, but it should be pretty easy to make them 10 to 100 times more efficient with known compiler techniques. This should be done in addition to parallelism and distribution; it is almost criminal to leave that much performance on the table when it is so easy to get. Positive example: the 100x performance improvement for JavaScript in recent years shows what can be done once it becomes important enough.</p>
<p>Note that Larus is not advocating going back to assembly language &#8211; there is far too much value in programmer productivity &#8211; but just that we remove unnecessary waste from our systems while advancing the state of the art in programming languages.</p>
<p><em>Distributed systems are the new norm</em>. Why don&#8217;t we teach it? All programmers need to understand how to build systems from many separate parts, and in particular the impact of IO and network traffic on software performance. Distribution is not free either.</p>
<p>For an example of how bad things can be, he brought up a nice introspective talk from Surge 2011, about the Etsy website:</p>
<ul>
<li><a href="http://www.youtube.com/watch?v=eenrfm50mXw&amp;feature=player_embedded">The original talk on YouTube</a></li>
<li><a href="http://arstechnica.com/business/2011/10/when-clever-goes-wrong-how-etsy-overcame-poor-architectural-choices/">ArsTechnica coverage </a></li>
</ul>
<p>So, that&#8217;s my summary of an interesting day.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1751/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
