<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala &#187; virtualization</title>
	<atom:link href="http://jakob.engbloms.se/archives/category/virtual/virtualization-virtual/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Technology: Simulation, Virtualization, Virtual Platforms, Embedded, Multicore and Multiprocessing (by Jakob Engblom)</description>
	<lastBuildDate>Sun, 29 Jan 2012 19:45:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>S4D Paper on Transporting Bugs with Checkpoints</title>
		<link>http://jakob.engbloms.se/archives/1235?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1235#comments</comments>
		<pubDate>Tue, 31 Aug 2010 18:40:55 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Wind River Blog]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[S4D]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1235</guid>
		<description><![CDATA[I have a paper about &#8220;Transporting Bugs with Checkpoints&#8221; to be presented at the S4D (System, Software, SoC and Silicon Debug) conference in Southampton, UK, on September 15 and 16, 2010. The core concept presented is to leverage Simics checkpointing to capture and move a bug from the bug reporter to the responsible developer. It [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg"><img class="alignleft size-full wp-image-941" style="margin: 5px 10px;" title="S4D" src="http://jakob.engbloms.se/wp-content/uploads/2009/09/S4D1.jpg" alt="" width="143" height="62" /></a>I have a paper about &#8220;Transporting Bugs with Checkpoints&#8221; to be presented at the <a href="http://www.ecsi.me/s4d">S4D (System, Software, SoC and Silicon Debug) conference </a>in Southampton, UK, on September 15 and 16, 2010. The core concept presented is to leverage <a href="http://www.windriver.com/products/simics/">Simics </a>checkpointing  to capture and move a bug from the bug reporter to the responsible  developer. It is a fairly simple idea, but getting it to work  efficiently does require that some things are done right. See the longer <a href="http://blogs.windriver.com/engblom/2010/08/transporting-bugs-with-checkpoints.html#more">Wind River blog posting </a>about this topic for a few more details.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1235/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Wind River Blog: Interview with a Virtualization Researcher</title>
		<link>http://jakob.engbloms.se/archives/1223?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1223#comments</comments>
		<pubDate>Sun, 29 Aug 2010 07:44:15 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer architecture]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Wind River Blog]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1223</guid>
		<description><![CDATA[Last Friday, I published a new post on my Wind River blog. It is an interview with PhD student Girish Venkatasubramanian from the University of Florida. He is doing research on virtual machines/hypervisors and how they can be implemented more efficiently by making fairly small changes to the architecture of memory management units. The [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png"><img class="alignleft size-full wp-image-1122" style="margin: 5px 10px;" title="Wind River Logo" src="http://jakob.engbloms.se/wp-content/uploads/2010/04/button-quicklink-blogs.png" alt="" width="46" height="46" /></a>Last Friday, I published a new post on my Wind River blog. It is an <a href="http://blogs.windriver.com/engblom/2010/08/interview-with-girish-venkatasubramanian.html">interview with PhD student Girish Venkatasubramanian </a>from the University of Florida. He is doing research on virtual machines/hypervisors and how they can be implemented more efficiently by making fairly small changes to the architecture of memory management units.</p>
<p><span id="more-1223"></span></p>
<p>The area of virtualization is one that I would definitely have looked  at as an opportunity had I started out as a PhD student today. The work  of his group is a good example of how Simics is being used for <a href="http://blogs.windriver.com/engblom/2010/07/academic-simics.html">research and teaching in universities</a> around the world.</p>
<p>Going one level up in abstraction, I note that this is probably the first time I have published an actual interview. I have been writing things since high school, but it has pretty much always been direct writing, not interviewing. However, I really hope that this is not the last. Having a series of user interviews on the Wind River blog could be really neat, as a way to dive deeply into some particular areas of technology. It will be interesting to see if any other university user is interested in being featured.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1223/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FLOSS Weekly on Xen: Some Background Missing</title>
		<link>http://jakob.engbloms.se/archives/775?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/775#comments</comments>
		<pubDate>Sun, 17 May 2009 19:41:05 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[review]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[FLOSS Weekly]]></category>
		<category><![CDATA[Ian Pratt]]></category>
		<category><![CDATA[Xen]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=775</guid>
		<description><![CDATA[FLOSS Weekly recently ran an interview with the creator of the Xen project, Ian Pratt from the University of Cambridge (and now working for Citrix since they bought XenSource). Since I happen to like virtual things, even the so-much-talked-about-it-hurts IT/server/desktop virtualization world, this was a must-listen. It was a good show, but lacking some in [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.twit.tv/floss"><img class="alignleft size-full wp-image-214" style="margin: 5px 10px;" title="flossweekly" src="http://jakob.engbloms.se/wp-content/uploads/2008/08/flossweekly.jpg" alt="flossweekly" width="70" height="70" />FLOSS Weekly </a>recently ran <a href="http://www.twit.tv/floss67">an interview </a>with the creator of the <a href="http://www.xen.org/">Xen </a>project, Ian Pratt from the <a href="http://en.wikipedia.org/wiki/University_of_Cambridge_Computer_Laboratory">University of Cambridge</a> (and now working for <a href="http://www.citrix.com">Citrix </a>since they bought XenSource). Since I happen to like virtual things, even the so-much-talked-about-it-hurts IT/server/desktop virtualization world, this was a must-listen. It was a good show, but lacking some in the humility and background departments.</p>
<p><span id="more-775"></span>I was a bit disappointed that they did not discuss paravirtualization, the concept that I, at least, consider the main contribution of Xen. Paravirtualization is the idea of slightly modifying the hardware interface that an operating system sees, and adapting the operating system to match, to make it much more convenient and efficient to run on top of a hypervisor. This is really not that different from the architecture in the classic heavy-duty server world, where the programming interface of the Sun Niagara, IBM POWER, and IBM zSeries machines is not bare metal but rather a more structured API to a layer partially implemented in software.</p>
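<p>As a rough illustration of the difference, here is a toy Python model, not actual Xen code. The class names, the trap_privileged_write method, and the "set_page_table" hypercall are all hypothetical, made up for this sketch. An unmodified guest performs what it believes is a privileged hardware operation, which the hypervisor has to trap and emulate, while a paravirtualized guest has been changed to ask the hypervisor for the same service directly.</p>

```python
class Hypervisor:
    """Toy hypervisor that owns the privileged state of each guest."""

    def __init__(self):
        self.page_tables = {}  # guest id -> page table base chosen by the guest
        self.traps = 0         # privileged operations that had to be trapped
        self.hypercalls = 0    # explicit requests made by modified guests

    def trap_privileged_write(self, guest_id, value):
        # Full virtualization: the guest ran a privileged instruction, the
        # hardware trapped, and the hypervisor emulates the effect.
        self.traps += 1
        self.page_tables[guest_id] = value

    def hypercall(self, guest_id, name, value):
        # Paravirtualization: the guest asks for the service by name.
        self.hypercalls += 1
        if name == "set_page_table":
            self.page_tables[guest_id] = value
        else:
            raise ValueError("unknown hypercall: " + name)


class UnmodifiedGuest:
    """Guest OS that believes it is talking to bare hardware."""

    def __init__(self, guest_id, hv):
        self.guest_id = guest_id
        self.hv = hv

    def switch_address_space(self, base):
        self.hv.trap_privileged_write(self.guest_id, base)


class ParavirtualGuest:
    """Guest OS whose low-level code was changed to call the hypervisor."""

    def __init__(self, guest_id, hv):
        self.guest_id = guest_id
        self.hv = hv

    def switch_address_space(self, base):
        self.hv.hypercall(self.guest_id, "set_page_table", base)


hv = Hypervisor()
UnmodifiedGuest("a", hv).switch_address_space(0x1000)
ParavirtualGuest("b", hv).switch_address_space(0x2000)
print(hv.page_tables, hv.traps, hv.hypercalls)
```

<p>Both guests end up with the same result; the difference in a real system is how much work the hypervisor has to do to get there, which this toy model only counts rather than measures.</p>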
<p>Xen is shaping up nicely to be a lot of things to a lot of people. I liked their work on live migration (done, I believe, in the same way that IBM does it on POWER). The discussion on snapshotting was also good, even if I felt he introduced it as something completely new, which it is not. VMware has had it for ages. And my favorite simulator, <a href="http://www.virtutech.com">Simics</a>, also has it in an even more powerful way. Another interesting aspect is the use of Xen on ARM-based mobile devices, even where no real hardware support exists for virtualization (incidentally, this was the starting point for Xen on the x86 back in 2001 and, as I understand it, the reason for going with a paravirtual solution rather than a VMware-style binary rewrite solution). Busy field, with lots of contenders.</p>
<p>The open-source nature of Xen is also very important to the project (and the reason for Ian Pratt being interviewed by FLOSS Weekly, as that show is all about open-source projects). In addition to enabling the exploration of new additions to the core &#8220;product&#8221;, it also means that Xen is a great research tool. With Xen, researchers can implement virtualization-related ideas in a real industrial-strength product, which makes it much easier to do applicable and realistic research. Indeed, one of the greatest things about the current strong open-source trend is that research institutions have a complete system and application software stack that can be modified at any point to try new ideas: from the hypervisor to operating systems, compilers, middleware, language virtual machines, and lots of server and desktop application software.</p>
<p>Pratt was not exactly overly humble, but I think he is correct in taking some credit for having put virtualization front-and-center in the x86 market. Before Xen, it was not nearly as hot a topic. Or it might just be coincidence. I was a bit annoyed that he did not paint a bigger picture: I would have liked more credit given and more comparisons made to previous work, especially the classic IBM architectures and what VMware and others have been doing on the PC. If you had no context and listened to this show, you could well come away with the impression that Xen invented virtualization, which is not really the case at all.</p>
<p>Note that FLOSS Weekly is still one of my regular podcasts, despite the <a href="http://jakob.engbloms.se/archives/586">occasional episode </a>that I take issue with.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/775/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cadence on Virtual Prototypes instead of Host Execution</title>
		<link>http://jakob.engbloms.se/archives/308?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/308#comments</comments>
		<pubDate>Sun, 19 Oct 2008 21:40:37 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[ESL]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[blog commentary]]></category>
		<category><![CDATA[Cadence]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=308</guid>
		<description><![CDATA[Cadence technical blogger Jason Andrews wrote a short piece a couple of days ago on his perception that host-based execution is becoming unnecessary thanks to fast virtual platforms. In &#8220;Is Host-Code Execution History&#8221;, he tells the story of a technique from a long time ago where a target program was executed directly on the host, and [...]]]></description>
			<content:encoded><![CDATA[<p>Cadence technical blogger <a href="http://www.cadence.com/community/posts/jasona.aspx">Jason Andrews </a>wrote a short piece a couple of days ago on his perception that host-based execution is becoming unnecessary thanks to fast virtual platforms. In &#8220;<a href="http://www.cadence.com/Community/blogs/sd/archive/2008/10/17/is-host-code-execution-history.aspx">Is Host-Code Execution History</a>&#8221;, he tells the story of a technique from a long time ago where a target program was executed directly on the host, and memory accesses were captured and passed to a Verilog simulator. The problem being solved was the lack of a simulator for the MIPS processor in use, and the solution was pretty fast and easy to use. Quite interesting, and well worth a read.</p>
<p>However, like all host-compiled execution (which I also like to call API-level simulation), it suffered from some problems, and virtual platforms today might offer the speed of host-compiled simulation without all the problems.</p>
<p><span id="more-308"></span></p>
<p>The problems are these:</p>
<blockquote><p><span id="anormal_12" class="Cadence_CS_BlogDetail_BlogText">Most companies that are using host-code execution today use &#8220;explicit access&#8221;.  This means they require all places in the code that access the hardware to call read() and write() functions so every hardware access goes through a common set of functions and then they use #ifdef to change the hardware accesses to call the simulator if they are doing verification with host-code execution. If they are running on the target system, then pointer dereferences are used. </span></p>
<p>&#8230;</p>
<p><span id="anormal_12" class="Cadence_CS_BlogDetail_BlogText">This is where implicit access came in. It provided a way to automatically trap pointer dereferences that were reading and writing to hardware locations and convert the load or store instruction into a simulated read or write. For reads it would put the result into the proper host CPU register and the user had no idea that a line of C code would magically turn into a bus transaction on a Verilog BFM</span></p></blockquote>
<p>Yes, that is a right pain, and I have seen lots of solutions for it, none of which have the elegant simplicity of a processor simulation. The &#8220;implicit access&#8221; system is basically trying to trap memory accesses without overtly changing the source code of a program. I guess the best way to do this is binary instrumentation, but it is still very hard to get to work right and robustly. A simulator is simply much simpler in principle here.</p>
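<p>The &#8220;explicit access&#8221; style from the quote above can be sketched roughly as follows. The original technique is C code that switches between pointer dereferences and simulator calls with #ifdef; this Python sketch only models the same structure by passing in one of two backends. The class names, the send_byte driver function, and the UART_DATA address are hypothetical.</p>

```python
class DirectMemory:
    """Stands in for real target hardware: registers are plain storage."""

    def __init__(self):
        self.cells = {}

    def read(self, addr):
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        self.cells[addr] = value


class SimulatorBackend:
    """Stands in for a hardware simulator: every access is a logged transaction."""

    def __init__(self):
        self.cells = {}
        self.transactions = []

    def read(self, addr):
        self.transactions.append(("read", addr))
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        self.transactions.append(("write", addr, value))
        self.cells[addr] = value


UART_DATA = 0x1000  # hypothetical device register address


def send_byte(hw, value):
    """Driver code: every hardware access goes through hw.read()/hw.write()."""
    hw.write(UART_DATA, value)
    return hw.read(UART_DATA)


sim = SimulatorBackend()
send_byte(sim, 0x41)
print(sim.transactions)  # [('write', 4096, 65), ('read', 4096)]
```

<p>The cost is visible even in this tiny example: the driver cannot touch hardware in the natural way, and every access site has to be written against the common functions.</p>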
<p>Jason continues later on:</p>
<blockquote><p><span id="anormal_12" class="Cadence_CS_BlogDetail_BlogText">Given the hassle of host-code execution I would prefer to cross compile the software and run the target instruction set. Beyond the implicit or explicit access issue, this also eliminates issues with differences in data type sizes, data structure layout, byte order (endianess) and other differences between the host and target processor. </span></p></blockquote>
<p>That is absolutely true! Jason does not mention the additional fun of what happens when the target is running an OS that is happily fielding interrupts, scheduling software tasks, etc. Also, having to maintain a separate build target and maybe a code variant is very expensive, process-wise. The expense that a good virtual platform incurs can be paid for pretty quickly once such reduced friction costs are factored in.</p>
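<p>The byte order issue Jason mentions is easy to demonstrate. In this small Python sketch (the byte values are arbitrary), the same four bytes are read once the way a little-endian host would read them and once the way a big-endian target would. The last line shows the data type size issue: the size of a C long depends on the host platform.</p>

```python
import ctypes

# Four bytes as they might sit in a memory image shared by host and target.
raw = bytes([0x12, 0x34, 0x56, 0x78])

as_little_endian = int.from_bytes(raw, "little")  # typical x86 host view
as_big_endian = int.from_bytes(raw, "big")        # big-endian target view

print(hex(as_little_endian))  # 0x78563412
print(hex(as_big_endian))     # 0x12345678

# Size of a C "long" on the machine running this script: 4 or 8 bytes,
# depending on the host platform and ABI.
print(ctypes.sizeof(ctypes.c_long))
```

<p>Code that is compiled for the host silently picks up the host's answers to both questions, which is exactly what running the real target instruction set avoids.</p>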
<p>So I guess I pretty much agree with all that Jason is saying, and thank him for mentioning <a href="http://www.virtutech.com/products">Simics</a>. Thanks also for the insights into what was done in the 1990s; it is always interesting to get pointers to old fundamental and interesting work.</p>
<p>About how the virtual platforms actually work inside: it is not that complicated in principle (but pretty hairy to get it quite right and fast in practice). You have to simplify the timing of the target processor, you have to convert from target processor binaries to host binary format using some kind of just-in-time compilation technique (also called dynamic binary translation or code morphing), and you have to provide some kind of direct access to target memory for the target processor simulation (like the DMI feature in <a href="http://systemc.org">SystemC TLM-2.0</a>, but usually the difficult bits are on the CPU side of that, not the memory side). The most interesting bit is how to build the surrounding system model so that it does not slow the CPU model down, and for this I can recommend a couple of pieces of writing:</p>
<ul>
<li>My ESC 2008 general intro to the subject of virtual prototypes (<a href="http://www.engbloms.se/presentations/engblom-ESC2008-class410-simulation-slides.pdf">slides</a>, <a href="http://www.engbloms.se/publications/engblom-ESC2008-class410-simulation-paper.pdf">paper</a>)</li>
<li>Virtutech white paper on <a href="http://www.virtutech.com/whitepapers/modeling.html">system modeling </a></li>
</ul>
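<p>To give a feel for the just-in-time translation step mentioned above, here is a toy sketch in Python. It is not how Simics or any real simulator is implemented: the three-instruction target ISA is invented for the example, and generated Python source stands in for host machine code. The point it shows is that each block of target code is translated once, cached by its start address, and then reused every time execution returns to it.</p>

```python
def translate_block(program, start_pc):
    """Translate one basic block of the toy target ISA into a Python function.

    The generated function updates the register list and returns the next
    target pc, or None when the program halts.
    """
    lines = ["def block(regs):"]
    pc = start_pc
    while True:
        op = program[pc]
        if op[0] == "addi":    # regs[r] += immediate
            lines.append("    regs[%d] += %d" % (op[1], op[2]))
            pc += 1
        elif op[0] == "jnz":   # branch if regs[r] != 0; this ends the block
            lines.append("    return %d if regs[%d] != 0 else %d"
                         % (op[2], op[1], pc + 1))
            break
        elif op[0] == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError("unknown opcode: %r" % (op[0],))
    namespace = {}
    exec("\n".join(lines), namespace)  # stand-in for emitting host code
    return namespace["block"]


def run(program, regs):
    """Run the program, translating each block only on first use."""
    cache = {}  # start pc -> translated function
    translations = 0
    pc = 0
    while pc is not None:
        if pc not in cache:
            cache[pc] = translate_block(program, pc)
            translations += 1
        pc = cache[pc](regs)
    return translations


# Count r0 down from 5 to 0, adding 2 to r1 on every iteration.
program = [
    ("addi", 0, -1),  # pc 0: r0 -= 1
    ("addi", 1, 2),   # pc 1: r1 += 2
    ("jnz", 0, 0),    # pc 2: loop back to pc 0 while r0 != 0
    ("halt",),        # pc 3
]
regs = [5, 0]
translations = run(program, regs)
print(regs, translations)  # [0, 10] 2
```

<p>The loop body runs five times but is translated only once, which is where the speed of this approach comes from.</p>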
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/308/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is Efficiency when Cores are Free?</title>
		<link>http://jakob.engbloms.se/archives/269?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/269#comments</comments>
		<pubDate>Sat, 13 Sep 2008 16:48:19 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[conferences]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[efficiency]]></category>
		<category><![CDATA[manycore]]></category>
		<category><![CDATA[operating systems]]></category>
		<category><![CDATA[SiCS Multicore days]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=269</guid>
		<description><![CDATA[More from the SiCS multicore days 2008. There were some interesting comments on how to define efficiency in a world of plentiful cores. The theme from my previous blog post called &#8220;Real-Time Control when Cores Become Free&#8221; came up several times during the talks, panels, and discussions. It seems that this year, everybody agreed that [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-270" style="margin-left: 5px; margin-right: 5px;" title="onoff" src="http://jakob.engbloms.se/wp-content/uploads/2008/09/onoff.png" alt="" width="72" height="70" />More from the SiCS multicore days 2008.</p>
<p>There were some interesting comments on how to define efficiency in a world of plentiful cores. The theme from my previous blog post called &#8220;<a href="http://jakob.engbloms.se/archives/123">Real-Time Control when Cores Become Free</a>&#8221; came up several times during the talks, panels, and discussions. It seems that this year, everybody agreed that we are heading towards 100s or 1000s of &#8220;self-respecting&#8221; cores on a single chip, and that with that kind of core count, it is not too important to keep them all busy at all times at any cost. As I stated earlier, cores and instructions are now free, while other aspects are limiting, turning the classic optimization imperatives of computing on their head. Operating systems will become more about space-sharing than time-sharing, and it might make sense to dedicate processing cores to the sole job of impersonating peripheral units or doing polling work. Operating systems can also be simplified when the job of time-sharing is taken away, even if communications and resource management might well bring in some new interesting issues.</p>
<p>So, what is efficiency in this kind of environment?</p>
<p><span id="more-269"></span></p>
<p>It was clear from both the panel discussion and discussions over lunch that absolute 100% load on all cores can be traded away for programmer productivity and predictability. Just like making 100% use of main memory is not usually a design goal today, making 100% use of all processor cores is not a reasonable goal tomorrow. Some resources are so plentiful that it makes sense not to try to push usage to the limit.</p>
<p>With 100s of cores, it is quite likely that even for the most performance-demanding loads like doing LTE decoding, it is not worth the herculean effort to get all cores running at full speed all the time. Getting 80% to 90% of the cores working on a workload is probably a good tradeoff.</p>
<p>Another tradeoff you can make is to increase determinism and debuggability by assigning tasks and schedules in a more static and predictable way. Instead of trying to balance loads across the cores, tasks could be assigned in some static or semi-static manner, so that the execution of a system can be repeated with some chance of success. That should not be too hard if all cores run a static cyclic scheduler, for example, or even a single task on each core. Dynamic scheduling might well be a global suboptimization in a world with plenty of cores, as it just makes things more complex for a fairly small increase in actual efficiency. You could also imagine putting debug agents and code on certain cores just to help you get better insight into what the system is doing. This is a bit like what <a href="http://jakob.engbloms.se/archives/17">I blogged about after last year&#8217;s Multicore Day</a>, when I asked designers to put more silicon into debug functionality. Maybe in a device with 100s of cores, we allocate cores to debug as well (I do not think we can do without dedicated debug circuitry, as that is needed to effect things like stopping cores quickly and similar).</p>
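<p>A static cyclic schedule of the kind mentioned above can be sketched in a few lines of Python. The core names, task names, and slot layout are hypothetical; the sketch only shows why such a system is repeatable: the whole execution order is fixed by a table, so two runs produce the same trace.</p>

```python
# Hypothetical static schedule: for each core, the task that runs in each
# slot of a repeating major cycle. None means the core idles in that slot.
SCHEDULE = {
    "core0": ["sample_adc", "filter", None],
    "core1": ["decode", "decode", "log"],
}


def run_cycles(schedule, cycles):
    """Return the execution trace as a list of (cycle, slot, core, task)."""
    slots = len(next(iter(schedule.values())))
    trace = []
    for cycle in range(cycles):
        for slot in range(slots):
            for core in sorted(schedule):
                task = schedule[core][slot]
                if task is not None:
                    trace.append((cycle, slot, core, task))
    return trace


first = run_cycles(SCHEDULE, 2)
second = run_cycles(SCHEDULE, 2)
print(first == second)  # True: the same table always yields the same trace
```

<p>Note that core0 deliberately idles in one slot out of three. That idle slot is the kind of unused capacity that stops mattering when cores are plentiful.</p>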
<p>When I heard this, my gut reaction was that &#8220;hey, that is not particularly environmental&#8221;. Any kind of waste of resources is really anathema to the ecologically friendly society we need to build over the next 10-20 years. But then someone pointed out that a key part of the efficiency equation is that you turn off the unused cores and accelerators so they do not use any power. And since the cores are a resource that keeps increasing in count from basically the same use of resources (manufacturing a chip will cost about the same amount of energy and materials for each chip, but with finer geometries you pack double the number of cores in it), it should be fine. It should also be noted that multicore computing by itself allows for more efficient processing units, for a variety of reasons.</p>
<p>Robustness also tends to increase if you have some slack in your system. For example, most hard real-time systems insist on not being more than 80% loaded or so (on a single CPU) even at the worst of tested times, to have some margin for the inevitable unexpected situations. For a device with 100s of cores, you might also want to spare some cores for the case that hardware faults crop up in certain parts of the chip. Then you can shift loads to other cores (which obviously requires a pretty resilient interconnect to make any sense).</p>
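<p>The idea of keeping spare cores around for hardware faults can be sketched as a simple remapping step. This is only an illustration under simple assumptions: the function name, the core names, and the policy of moving all tasks of a failed core to the first available spare are made up for the example.</p>

```python
def remap_on_fault(assignment, spares, failed_core):
    """Move every task from failed_core to a spare core, if one is left.

    assignment maps core name -> list of task names. Returns a new mapping
    and the list of remaining spares; the inputs are not modified.
    """
    if failed_core not in assignment:
        raise KeyError(failed_core)
    if not spares:
        raise RuntimeError("no spare core left for " + failed_core)
    replacement, remaining = spares[0], spares[1:]
    new_assignment = {
        core: list(tasks)
        for core, tasks in assignment.items()
        if core != failed_core
    }
    new_assignment[replacement] = list(assignment[failed_core])
    return new_assignment, remaining


assignment = {"core0": ["decode"], "core1": ["filter", "log"]}
new_assignment, spares_left = remap_on_fault(assignment, ["core7"], "core1")
print(new_assignment)  # {'core0': ['decode'], 'core7': ['filter', 'log']}
print(spares_left)     # []
```

<p>With a static assignment like this, recovering from a fault is a table update rather than a rebalancing problem, which is only affordable because the spare core was left unused in the first place.</p>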
<p>This final point brings me to my final thought on this way of building computing systems: in some way, we get closer to physical engineering habits when cores are free. We do not build bridges with the minimum amount of concrete and steel to handle the load we expect. Instead, there is a margin of error of a factor of three or five or so, to make sure that even in the most unexpected of unknown circumstances, that bridge will still stand. In a similar way, we might be able to use lots of free cores to engineer software systems that have far more resilience in them than today&#8217;s systems, which keep trying to make maximum use of the resource of clock cycles and instruction processing count. I do not quite know how that kind of system would look, but the analogy is very interesting.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/269/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Chrome and Parallel Browsing</title>
		<link>http://jakob.engbloms.se/archives/258?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/258#comments</comments>
		<pubDate>Fri, 12 Sep 2008 07:54:54 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[desktop software]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Google chrome]]></category>
		<category><![CDATA[Internet explorer]]></category>
		<category><![CDATA[web browsing]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=258</guid>
		<description><![CDATA[Everybody seems to think the launch of the Google Chrome browser is very important and cool. Probably because Google itself is considered important and cool. I am a bit more skeptical about the whole Google thing, they seem to be building themselves into a pretty dangerous monopoly company&#8230; but there are some interesting architectural and parallel [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-259" style="margin: 10px 5px;" title="gglchrome" src="http://jakob.engbloms.se/wp-content/uploads/2008/09/gglchrome.jpg" alt="" width="103" height="98" /> Everybody seems to think the launch of the <a href="http://www.google.com/googlebooks/chrome/">Google Chrome browser </a>is very important and cool. Probably because Google itself is considered important and cool. I am a bit more skeptical about the whole Google thing, they seem to be building themselves into a pretty dangerous monopoly company&#8230; but there are some interesting architectural and parallel computing aspects to Chrome, and to Internet Explorer 8, it turns out.</p>
<p><span id="more-258"></span></p>
<p>Both IE8 and Chrome have taken to running each tab of a multi-tab browser as its own protected process, both to enable parallel processing and to increase robustness. I think that is a very good idea, and I am waiting for Firefox to catch up.</p>
<p>Why does running a browser as a parallel program make sense? Traditionally, when the web started, you would load a page, render it, and read it for a long time. With multiple tabs and windows, each such display was really also just a set of static prints of pages that you flipped between. No point in being parallel there. However, in recent years, the web page model has been changing. Pages are becoming far more active, starting a long time ago with Java applets, ActiveX controls, and similar, and today the main drivers seem to be JavaScript/AJAX/Web 2.0 pages and media players like Flash and Silverlight.</p>
<p>Basically, we see another example of a domain change enabling parallel processing to be applied. The domain of web pages has changed from single-shot renderings of single pages at a time, which is essentially serial, to lots of active programs running at the same time.</p>
<p>I think we are going to see more parallel processing being used to enable a richer user experience. This is one way that the world is making use of the increase in computing power and communications bandwidth, just because it is there. It gives us a nice sea of threads to run in parallel; the only issue is probably the IO bandwidth and cache restrictions of single chips.</p>
<p>The use of processes for robustness is a kind of application-level virtualization. The OS provides isolation between processes, just like virtualization provides isolation between operating systems.</p>
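<p>To make the isolation argument concrete, here is a minimal sketch in Python. It is not how Chrome or IE8 are implemented; the tab names and the tiny scripts standing in for page content are made up for illustration. Each "tab" runs as its own OS process, all of them in parallel, and one of them dies abruptly without taking the others or the parent "browser" with it.</p>

```python
import subprocess
import sys

# Hypothetical "tabs": each is a tiny script standing in for a page's active content.
TABS = {
    "news": "print('news rendered')",
    "buggy": "import os; os._exit(3)",   # simulates a renderer dying without cleanup
    "mail": "print('mail rendered')",
}


def run_tabs(tabs):
    """Start every tab as its own OS process, then collect the exit codes."""
    procs = {
        name: subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.DEVNULL)
        for name, code in tabs.items()
    }
    # All processes are already running in parallel; wait() only collects the results.
    return {name: proc.wait() for name, proc in procs.items()}


if __name__ == "__main__":
    for name, exit_code in run_tabs(TABS).items():
        status = "ok" if exit_code == 0 else "crashed, other tabs unaffected"
        print(name, status)
```

<p>The parent only ever sees an exit code from the failed tab. That is the whole point: the OS process boundary contains the damage, at the price of a bit more memory and some inter-process communication.</p>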
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/258/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Virtual Platforms for Late Hardware and the Winds of History</title>
		<link>http://jakob.engbloms.se/archives/180?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/180#comments</comments>
		<pubDate>Wed, 30 Jul 2008 20:56:32 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[business issues]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[history of computing]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[archive]]></category>
		<category><![CDATA[digital archive]]></category>
		<category><![CDATA[document retention]]></category>
		<category><![CDATA[Forskning och Framsteg]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=180</guid>
		<description><![CDATA[As might be evident from this blog, I do have a certain interest in history and the history of computing in particular. One aspect where computing and history collide in a not-so-nice way today is in the archiving of digital data for the long term. I just read an article at Forskning och Framsteg where [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2008/07/d21_2t.jpg"><img class="size-medium wp-image-181 alignleft" style="margin: 10px;" title="DataSaab D21" src="http://jakob.engbloms.se/wp-content/uploads/2008/07/d21_2t.jpg" alt="" width="150" height="112" /></a>As might be evident from this blog, I do have a certain interest in history and the history of computing in particular. One aspect where computing and history collide in a not-so-nice way today is in the archiving of digital data for the long term. I just read an article at <a href="http://www.fof.se/?id=08550">Forskning och Framsteg </a>where they discuss some of the issues that the use of digital computer systems and non-physical digital documents creates for the long-term archival of our intellectual world of today. Basically, digital archives tend to rot in a variety of ways. I think virtual platform technology could play a role in preserving our digital heritage for the future.</p>
<p><span id="more-180"></span></p>
<p>Living in a decently old city does provoke a certain interest in history and how to access the past. Note that Uppsala, like everything in Sweden, is fairly recent on the grand scale of things. We have a cathedral from the 1200s, and some older churches going back to the 1000s, but before that there is not much to boast about. It is not like Rome or Egypt or China with thousands of years of civilization and with impressive buildings that still stand. But I digress&#8230;</p>
<p>The real problem with archives and digital technology is that it is really hard to preserve a large portion of the intellectual data that we produce today. When I was an archivist at <a href="http://www.student.uu.se/nation/smalands/">Smålands Nation </a>here at the university in the 1990s, it was really fun to browse through the old papers we had in the archive (only stuff from the past century, all the really old and valuable stuff was in professional hands at the national archives). Reading a menu from a party held in 1945, or the time plan for a large formal celebration in the 1950s, or a guest list from such an event was actually quite intriguing. You could see how tastes and traditions change, even as students all believe they are maintaining ages-old traditions faithfully&#8230; Browsing the member list in old handwritten ledgers was also great fun.</p>
<p>In another fifty years&#8217; time, what will we have left from today&#8217;s activities? The party plans are now all just printouts from a throwaway Word document, the member list a database on a PC, and unless you make a conscious effort, memories and printed papers will quickly fade (you do need to use the appropriate laser toner on low-acid archive-quality paper if you want the things to last even a decade, let alone a hundred years) and the students fifty years from now will have no idea how we ran our parties, who was in the nation, and other such facts of daily life. &#8220;Print it out&#8221; is easy to say but hard to do. Much of today&#8217;s digital data does not really print out in a useful way. A wiki, for example, can be printed as a snapshot. But the links and the edit history are gone.</p>
<p>But imagine if we could keep these digital documents accessible, in their living digital form. Having a proper database available for use and query is so much more interesting and powerful than a stack of papers. Being able to look at the audit history of a wiki, or the commit history of a program&#8217;s source code would offer a much deeper insight into the people and processes that produced the end results compared to just looking at a static snapshot.</p>
<p>What does it take to do that? A huge and well-known problem in digital archiving is physical media access. All great archival institutions in the world have scores of different tape readers, disk readers, some new, some antique, and try to use these to pry the bits off equally antiquated storage media. This is bad, just like we have problems looking at old movies as the reels of film are literally falling apart. Clay tablets start to look positively sophisticated and a smart choice compared to our very brittle technologies.</p>
<p>Second, once we have the bits, what can we do with them? Now we have to tackle the file formats, and having pried them open, to find the fonts and character sets which are suitable for displaying them. Rendering a MacWrite document from 1985 on a modern machine will likely not quite give the same wonderful black-and-white 72-dpi view that you got back then&#8230; Or figuring out that for a while, we used 7-bit ASCII to write Swedish texts by replacing [, ], |, {, }, \ with å, ä, ö, Å, Ä, Ö (not in exactly that order &#8212; but I used to be fluent in reading and writing that!). Even worse, decoding a database file or purely binary CAD/CAM file is going to require some serious reverse engineering of the old programs that created them&#8230; unless you could run these same old programs in some way.</p>
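<p>As a small illustration of the character set problem, here is a sketch in Python of decoding such a 7-bit Swedish text. The mapping is the standard Swedish national variant of 7-bit ASCII (SEN 85 02 00, also known as ISO 646-SE) as I know it, not something taken from a particular old system, and the function name and sample bytes are made up for the example.</p>

```python
# Swedish 7-bit national variant: six ASCII punctuation positions
# carry Swedish letters instead of brackets, backslash, braces, and bar.
SWEDISH_7BIT = str.maketrans("[\\]{|}", "ÄÖÅäöå")


def decode_swedish_7bit(raw: bytes) -> str:
    """Decode bytes that were written using the Swedish 7-bit character set."""
    return raw.decode("ascii").translate(SWEDISH_7BIT)


if __name__ == "__main__":
    print(decode_swedish_7bit(b"R{ksm|rg}s"))  # → Räksmörgås
```

<p>The decoding itself is trivial. The hard part for an archive is knowing, decades later, that this particular file needs it and that the braces are not actually braces.</p>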
<p>At least to me, an &#8220;obvious&#8221; solution is to use virtual platform technology to keep the old software stacks running. Some archivists are apparently working on OS emulation of various kinds to keep the old software running, but I do think that it is much simpler and more general to just simulate the entire machine hardware. This will result in the smallest risk of error, and the greatest fidelity to the original software as oddities such as word lengths and character sets can be faithfully reproduced. This is simply because the SW-HW interface is the best-documented and narrowest interface in the entire machine. Old tapes and disks should be turned into files stored on the mass storage of current machines, and these files can then be migrated to new machines as they come online. As long as you maintain the copies of the materials by reproducing them on new hard drives, they will not rot.</p>
<p>It will also have the property that the entire look and feel of the old systems remains accessible. Which is a blessing in that it maintains our digital heritage and lets computer historians also go back and see how old software looked and worked. It is also considered a curse by practising archivists who do not particularly like the idea of learning how to operate strange old operating systems with arcane command lines. I think this is not necessarily a big problem. Just like people today study and learn ancient languages to learn more about ancient cultures as embodied in their written records, I can see historians of the future learning to use old operating systems and programming languages to study the ancient culture of our time as embodied in our computerized records.</p>
<p>Creating these kinds of virtual platforms is quite different from the dominant virtual platform thinking in the commercial marketplace today, which is focused on modeling new and future hardware. Rather, we need to study an existing artefact and make a very good model of it. It is a different type of task, requiring slightly different types of tools.</p>
<p>Of particular importance is that the simulation platform totally insulates simulation models from the underlying machine. You need to be able to port the simulation to new operating systems, machines, and architectures without changing a line in the source code of the models. This means making sure to locate all host dependencies inside the simulation framework, and making sure that models are completely portable regardless of word lengths, endianness, and other aspects. You want to be able to take a model that you run today on an x86-64 on Linux and run it on some future 128-bit middle-endian platform with a completely different type of operating system. Basically, the simulation platform has to be a complete virtual machine in its own right. Probably, over time, we are going to add more layers of virtual platforms to get around really major shifts in computer system architecture. There is nothing saying that you can only virtualize once; the IBM S/370 already showed that you could virtualize recursively and to an arbitrary depth.</p>
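<p>A minimal sketch in Python of what keeping host dependencies out of the models can look like. The class and function names are hypothetical and not taken from any real simulator. The point is that the byte order and word length are stated as properties of the target, so the model gives the same answer on any host.</p>

```python
class TargetMemory:
    """Byte-addressed target memory with the target's byte order stated explicitly."""

    def __init__(self, size, byteorder):
        self.data = bytearray(size)
        self.byteorder = byteorder  # "big" or "little": a property of the target, never the host

    def write(self, addr, value, width):
        """Store an unsigned value of `width` bytes at target address `addr`."""
        self.data[addr:addr + width] = value.to_bytes(width, self.byteorder)

    def read(self, addr, width):
        """Load an unsigned value of `width` bytes from target address `addr`."""
        return int.from_bytes(self.data[addr:addr + width], self.byteorder)


def wrap(value, bits):
    """Truncate an arithmetic result to the target's word length."""
    return value % (2 ** bits)


if __name__ == "__main__":
    big = TargetMemory(16, "big")
    big.write(0, 0x12345678, 4)
    print(hex(big.read(0, 1)))            # first byte in target memory, big-endian target
    print(wrap(0xFFFFFFFF + 1, 32))       # a 32-bit target register overflows to zero
```

<p>Nothing here asks the host what its native integer size or byte order is, which is exactly the property you want when the host of fifty years from now is unknown.</p>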
<p>I do think that some tools that we have today are perfectly useful for this kind of undertaking.</p>
<p>The business aspects are going to be interesting, though. We would like to use the technologies already available in the sophisticated commercial virtual platform tools, that&#8217;s kind of given. But the archival marketplace is not the most lucrative&#8230; and archives would want very good assurance that the tools remain available, including source-code escrow that would make the requirements of 25-year military projects look positively light.</p>
<p>Developing a completely new totally open-source solution sounds like a nice idea in theory, but is probably a bit too much work. It also takes no advantage of all the commercial technology available today. Maybe as the virtual platform market matures, the industry can come up with some way to provide this technology for the greater benefit of society. Basically, doing a bit of social work where we really have the tools to help.</p>
<p>I have no good solutions to that to offer right now.</p>
<p>But imagine how cool it would be to, in fifty years from now, gather the old team from Virtutech around some kind of display device and watch an ancient <a href="http://www.virtutech.com/simics4.html">Simics 4.0 </a>run on top of a virtual quadcore x86 with Linux 2.6&#8230; all running on top of Simics 29.0 (estimating one major version every two years) on some future I-have-no-idea-what-it-will-be hardware and software system. That really is something that would be cool to see happen.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/180/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Freescale QorIQ P4080 is out &#8212; with Simics support</title>
		<link>http://jakob.engbloms.se/archives/137?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/137#comments</comments>
		<pubDate>Mon, 16 Jun 2008 11:36:27 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[business issues]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[p4080]]></category>
		<category><![CDATA[qoriq]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=137</guid>
		<description><![CDATA[Only half an hour ago, the embargoes lifted. Freescale announced its new QorIQ series of multicore (and some single- and dual-core) processors. For the top-end of that line, the P4080, Freescale and Virtutech (where I work, remember) have developed a virtual platform solution to help Freescale customers get to working products faster. The virtual platform [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft alignnone size-medium wp-image-138" style="margin: 0px 10px;" title="fsl-qoriq" src="http://jakob.engbloms.se/wp-content/uploads/2008/06/fsl-qoriq.png" alt="" width="132" height="62" />Only half an hour ago, the embargoes lifted. Freescale announced its new <a href="http://freescale.com/multicore">QorIQ series</a> of multicore (and some single- and dual-core) processors. For the top-end of that line, <a href="http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=P4080&amp;fastpreview=1">the P4080</a>, Freescale and Virtutech (where I work, remember) have developed a <a href="http://www.virtutech.com/QorIQ">virtual platform solution</a> to help Freescale customers get to working products faster. The virtual platform is available now, and is already running several operating systems including VxWorks, QNX, and a variety of Linuxes. Apart from the fairly large scale of this SoC, the really new part of the virtual platform is the so-called <a href="http://www.virtutech.com/QorIQ/hybrid_simulation.html">Hybrid </a>solution, where the fast models are combined with detailed models from Freescale themselves. This creates a cycle-level detailed model with validated timing, &#8220;from the source&#8221;, but without the performance issues of having to run everything at a great level of detail. Rather, you use the fast model to steer the simulation of a workload to an interesting spot, and then turn up the level of detail then and there. You can also select which components of the chip are actually detailed and which parts are modeled with the fast functional models, avoiding the incredible slow-down of running an entire virtual platform at a great level of detail.</p>
<p>If you happen to be at the FTF in Orlando, do come by and look at the demos!</p>
<p>I have been involved in this work for the past year, and it is wonderful to finally see it coming out and be able to talk about it.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/137/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The 1970 rule strikes again: Virtual Platform Principles in 1967</title>
		<link>http://jakob.engbloms.se/archives/130?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/130#comments</comments>
		<pubDate>Fri, 30 May 2008 20:37:31 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[history of computing]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[1969]]></category>
		<category><![CDATA[HITAC-8400]]></category>
		<category><![CDATA[Hitachi]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[operating systems]]></category>
		<category><![CDATA[race condition]]></category>
		<category><![CDATA[Temporal decoupling]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=130</guid>
		<description><![CDATA[Being a bit of a computer history buff, I am often struck by how most key concepts and ideas in computer science and computer architecture were all invented in some form or the other before 1970. And commonly by IBM. This goes for caches, virtual memory, pipelining, out-of-order execution, virtual machines, operating systems, multitasking, byte-code [...]]]></description>
			<content:encoded><![CDATA[<p>Being a bit of a computer history buff, I am often struck by how most key concepts and ideas in computer science and computer architecture were all invented in some form or the other before 1970. And commonly by IBM. This goes for caches, virtual memory, pipelining, out-of-order execution, virtual machines, operating systems, multitasking, byte-code machines, etc. Even so, I have found a quite extraordinary example of this that actually surprised me in its range of modern techniques employed. This is a follow-up to a previous post, after having actually digested <a href="http://jakob.engbloms.se/archives/121">the paper I talked about earlier</a>.</p>
<p><span id="more-130"></span></p>
<p>The paper in question was published in 1969, and is titled &#8220;<a href="http://portal.acm.org/citation.cfm?id=961053.961092&amp;coll=ACM&amp;dl=ACM&amp;CFID=67556471&amp;CFTOKEN=25257537">A program simulator by partial interpretation</a>&#8221;. In the previous post, I took note of its use of direct execution of software plus trapping of privileged instructions, but that was not really the most interesting part of it.</p>
<p>They lay out in quite simple terms most of the key ideas behind today&#8217;s fast virtual platforms. Here are the best parts:</p>
<ul>
<li>They note that simulation of a computer is often used to overcome debugging difficulties, in particular repeating failed runs and tracing all that is going on in the target machine.</li>
<li>They are hunting down race conditions using the simulator.</li>
<li>They use recorded input and output to drive a deterministic simulation even of workloads involving communication with the external world.</li>
<li>They simulate multiple processors on top of a single physical processor by means of giving each processor a certain time slice to do its work before switching to the next processor. This is known as temporal decoupling or quantized simulation today, and is a key to the high speed of solutions such as Simics. They note the same tradeoffs as we see today, 40 years later, for doing this: shorter slices more accurately depict the parallelism, but also cost performance.</li>
<li>The temporally decoupled simulation also includes timers and similar non-CPU-hardware. Just like we do it today for virtual platforms.</li>
<li>In a temporally decoupled simulation, they optimize the simulation of the IDL, Idle, instruction. When it is encountered, they skip immediately to the end of the time slice. This is what we today call idle-loop optimization or hypersimulation, which is absolutely key to achieving scalable simulation of large multiprocessor and multi-machine setups (since most parts of a system are not usually maximally loaded).</li>
<li>They are debugging operating systems on the simulator, not just user-level code.</li>
</ul>
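<p>The time slicing and the idle skip from the list above can be sketched in a few lines of Python. This is a toy under loud assumptions, not the 1967 implementation and not how Simics does it: every instruction is assumed to cost one cycle, a CPU is just a dictionary, and an idle CPU is assumed to wake up at the next slice boundary.</p>

```python
def run_slice(cpu, slice_len):
    """Advance one CPU by one time slice; return how many instructions were executed."""
    executed = 0
    end_time = cpu["time"] + slice_len
    while cpu["time"] != end_time:
        op = cpu["program"][cpu["pc"] % len(cpu["program"])]
        cpu["pc"] += 1
        if op == "idle":
            cpu["time"] = end_time   # idle skip: jump straight to the end of the slice
        else:
            cpu["time"] += 1         # assumption: every other instruction takes one cycle
            executed += 1
    return executed


def simulate(cpus, slice_len, slices):
    """Round-robin the virtual CPUs on a single host thread, one slice at a time."""
    executed = 0
    for _ in range(slices):
        for cpu in cpus:
            executed += run_slice(cpu, slice_len)
    return executed


if __name__ == "__main__":
    busy = {"program": ["work"], "pc": 0, "time": 0}
    idler = {"program": ["work", "idle"], "pc": 0, "time": 0}
    print(simulate([busy, idler], slice_len=100, slices=3))  # → 303
```

<p>Both CPUs end up at virtual time 300, but the mostly idle one cost only three simulated instructions. The tradeoff noted in the paper is visible in <code>slice_len</code>: a shorter slice interleaves the CPUs more finely, at the cost of more switching between them.</p>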
<p>The computer in question is a Japanese System/360-compatible machine called the <a href="http://www.ipsj.or.jp/katsudou/museum/computer/0610_e.html">HITAC-8400</a>. The work was reported in 1969, but actually carried out in 1967.</p>
<p>There are some differences in scale and kind compared to today&#8217;s virtual platforms, but none that detract from the underlying principles. The 1967 system is host-on-host, so it is not the kind of cross-environment that is most common in today&#8217;s virtual platforms (Power Arch on x86, ARM on x86, etc.). The IO system is much easier to simulate since it is part of the instruction set of the processor rather than being a set of complex memory-mapped peripherals.</p>
<p>So the 1970 rule strikes again. Not the IBM rule, this time, this was all done by Hitachi. There are traces of similar work at IBM in other papers, but I have not been able to locate actual copies of any publication.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/130/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Virtual Platform by Virtualization Extensions &#8212; 1969</title>
		<link>http://jakob.engbloms.se/archives/121?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/121#comments</comments>
		<pubDate>Sun, 11 May 2008 18:53:11 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[history of computing]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[1969]]></category>
		<category><![CDATA[conference paper]]></category>
		<category><![CDATA[HITAC-8400]]></category>
		<category><![CDATA[Hitachi]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[SOSP]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=121</guid>
		<description><![CDATA[By means of a trip down virtualization history, I found a real gem in a 1969 paper called A program simulator by partial interpretation, by Kazuhiro Fuchi, Hozumi Tanaka, Yuriko Manago, Toshitsugu Yuba of the Japanese Government Electrotechnical Laboratory. It was published at the second symposium on Operating systems principles (SOSP) in 1969. It describes a [...]]]></description>
			<content:encoded><![CDATA[<p>By means of a trip down virtualization history, I found a real gem in a 1969 paper called <a href="http://portal.acm.org/citation.cfm?id=961053.961092&amp;coll=ACM&amp;dl=ACM&amp;CFID=67556471&amp;CFTOKEN=25257537"><strong>A program simulator by partial interpretation</strong>,</a> by Kazuhiro Fuchi, Hozumi Tanaka, Yuriko Manago, Toshitsugu Yuba of the Japanese Government Electrotechnical Laboratory. It was published at the <span class="mediumb-text">second symposium on Operating systems principles</span> (SOSP) in 1969. It describes a system where regular target instructions are executed directly on the hardware, and any privileged instructions are trapped and simulated. Very similar to how VMware does it for x86, or any other modern virtualization solution.</p>
<p><span id="more-121"></span></p>
<p>The interesting bit is really the uses that this system is put to:</p>
<blockquote><p>In promoting the ETSS project a program simulator based on an idea of partial interpretation has been constructed, and its principle and design are described in the paper. This new approach has been introduced to provide the simulator with such features as high speed and high accuracy in simulation and simplification in implementation. The essence of the idea of partial interpretation is using direct execution of instructions by hardware and simulation of them by an interpreter in combination, wherewith the hardware interrupt mechanism intermediates the two phases of the whole simulation. An interruption takes place when executing a &#8220;privileged&#8221; instruction, which triggers the simulation of the instruction. The other type of instructions are normally rendered to direct execution by hardware. The simulation method for devices operating in parallel is also described with respect to the timing control and scheduling. <strong>A program simulator of this type provides a powerful tool for debugging &#8220;supervisor &#8221; programs and opens a new approach to system expansion</strong>.</p></blockquote>
<p>Note that last part. This is essentially a virtual machine used for operating-system debug. So far, the earliest mention of this idea that I have found. There are similar ideas in a classic 1972 IBM paper. If anyone has seen anything older, please comment and tell me!</p>
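<p>For readers who want the control flow of the abstract spelled out, here is a toy sketch in Python. It is in no way the HITAC-8400 implementation: the three-instruction machine, the <code>sio</code> "start I/O" instruction, and all function names are invented for the example, and a Python function stands in for the real hardware doing direct execution. Ordinary instructions go straight to the "hardware"; a privileged one raises a trap, and the simulator emulates its effect in software.</p>

```python
class PrivilegedTrap(Exception):
    """Stands in for the hardware interrupt raised by a privileged instruction."""


def host_execute(instr, regs):
    """Stand-in for direct execution: the 'hardware' runs ordinary instructions itself."""
    op, *args = instr
    if op == "li":                    # load immediate
        dst, value = args
        regs[dst] = value
    elif op == "add":
        dst, a, b = args
        regs[dst] = regs[a] + regs[b]
    else:
        raise PrivilegedTrap(instr)   # in this toy, anything else counts as privileged


def simulate_privileged(instr, regs, device_log):
    """Interpreter side: emulate the effect of a privileged instruction in software."""
    op, *args = instr
    if op == "sio":                   # hypothetical "start I/O" instruction
        device_log.append(regs[args[0]])
    else:
        raise ValueError(f"unknown instruction: {op}")


def run(program, regs, device_log):
    """Partial interpretation: execute directly, fall back to simulation on a trap."""
    for instr in program:
        try:
            host_execute(instr, regs)
        except PrivilegedTrap:
            simulate_privileged(instr, regs, device_log)


if __name__ == "__main__":
    regs = {"r1": 0, "r2": 0, "r3": 0}
    log = []
    run([("li", "r1", 2), ("li", "r2", 40), ("add", "r3", "r1", "r2"), ("sio", "r3")], regs, log)
    print(regs["r3"], log)  # → 42 [42]
```

<p>Because the simulated supervisor never touches the real devices, only the simulator's model of them, you can stop, inspect, and rerun it freely, which is what makes the scheme useful for debugging operating systems.</p>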
<p>It is also fun reading these old papers&#8230; they are usually scanned from a paper copy, and therefore really show how papers looked and felt forty years ago. Long before desktop publishing, or even TeX.</p>
<p><img class="aligncenter size-full wp-image-122" title="fuchi-1969" src="http://jakob.engbloms.se/wp-content/uploads/2008/05/fuchi-1969.png" alt="Abstract of Fuchi 1969 paper" /></p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/121/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VMM Detection Myths and Realities from a Simics and Embedded Perspective</title>
		<link>http://jakob.engbloms.se/archives/97?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/97#comments</comments>
		<pubDate>Sun, 20 Apr 2008 00:02:21 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Andrew Warfield]]></category>
		<category><![CDATA[HOTOS]]></category>
		<category><![CDATA[Jason Franklin]]></category>
		<category><![CDATA[Keith Adams]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[Tal Garfinkel]]></category>
		<category><![CDATA[Temporal decoupling]]></category>
		<category><![CDATA[Timing attack]]></category>
		<category><![CDATA[Virtual machine detection]]></category>
		<category><![CDATA[VMWare]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=97</guid>
		<description><![CDATA[It must have been Google Alerts that sent me a link to the HOTOS 2007 (Hot Topics in Operating Systems) paper by Tal Garfinkel, Keith Adams, Andrew Warfield, and Jason Franklin called Compatibility is not Transparency: VMM Detection Myths and Realities. This paper is slightly less than a year old today, so it is old [...]]]></description>
			<content:encoded><![CDATA[<p>It must have been Google Alerts that sent me a link to the <a href="http://www.usenix.org/events/hotos07/">HOTOS 2007</a> (Hot Topics in Operating Systems) paper by Tal Garfinkel, Keith Adams, Andrew Warfield, and Jason Franklin called <a href="http://www.usenix.org/events/hotos07/tech/full_papers/garfinkel/garfinkel_html/">Compatibility is not Transparency: VMM Detection Myths and Realities</a>. This paper is slightly less than a year old today, so it is old by blog standards and quite recent by research paper standards. It deals with the interesting problem of whether a virtual machine can be made undetectable by software running on it, in particular software that is actively trying to detect it. Their conclusion is that it is not feasible, and I agree with that. The reason WHY that is the case can use some more discussion, though&#8230; and here is my take on that issue from a Simics/embedded systems virtualization perspective.</p>
<p><span id="more-97"></span></p>
<p>Their main assumption is that the VMM cannot be tailored to avoid detection by any particular piece of software, but has to be sufficiently like the real thing to fool something the first time it appears. They argue from the perspective of virtualization solutions like VMware that aim at high performance before all else. The virtual PCs generated by VMware, Parallels, KQemu, and others are all compatible with physical PCs (they run the same software) but are not at all identical in detail. So they are not transparent, in the words of the paper. This means that they are quite easy to spot.</p>
<p>There are some holes in functional differences that VMMs can quite easily plug. The paper shows how you can get a different-sized TLB (compared to the physical hardware), for example, due to interference from the VMM. This can obviously be fixed in the VMM, at a cost in performance. The reason such differences are there is that VMMs are optimized for performance at almost any cost. As long as the requisite operating systems run as they should, the VMM is fine even if it does not actually correspond to any particular existing physical machine. This is a testament to the tolerance of modern operating systems towards their hardware. Basically, any OS that probes hardware and discovers what is there will work fine as long as the (virtual) hardware exposes devices that it can recognize. This is quite different from the 1970s or 1980s, when an OS would definitely expect a very particular hardware setup with very peculiar timing to run at all. Thus, making a VMM totally identical to some physical machine is a waste of effort and performance.</p>
<p>Paravirtual approaches like Xen, and what Sun has with Niagara and IBM has on its Power servers, where the OS is rewritten to use drivers for a purely virtual hardware/software interface, are an obvious generalization of the VMware compatibility approach. Compatible versus transparent/invisible virtualization is really only an issue in the x86 PC world, since all other datacenter architectures are virtual by definition and all operating systems work towards a standard virtual layer. In such an environment, I have a hard time seeing that the question posed in the paper even makes sense. You are always virtualized, period.</p>
<p><strong>Embedded Virtual Platforms</strong></p>
<p>Anyhow, back to the main thread. There is still a large set of targets where transparency and compatibility are of interest. x86 PCs are one such target, and it is also an interesting question for older architectures (Alpha, VAX, and older generations of Sun and IBM machines). In particular, it is an important topic for embedded systems where you want to use virtual or simulated approaches to develop and test software. As part of that software development process on a virtual machine, you could potentially be examining malware of various kinds. Mobile phone viruses are a good, not-too-hypothetical example.</p>
<p>If we look at embedded system virtual platforms, the functionality of the simulator is usually more complete and more like a particular physical machine than that of a VMware-style datacenter VMM. This is partially due to embedded software stacks tending to be a bit pickier about what they run on, and partially due to the simple fact that the goal really IS to expose the hardware/software interface of a particular piece of hardware as closely as possible. Also, since this is usually cross-target simulation (Power Architecture on x86, for example), there is no performance gain from using features of the host directly. So items like TLB counts, memory layout, memory content, flash memory programming, etc. are all going to be functionally identical to the physical machine.</p>
<p><strong>Timing is Key</strong></p>
<p>Thus, just like for a patched VMware-style VMM as discussed in the paper, the main attack vector remains <em>timing</em>.</p>
<p>The best way, according to the authors, to spot a VMM is to look for timing differences compared to the behavior on normal hardware. Despite the inherent variability of typical hardware, there are cases where the timing on a VMM by necessity differs by a detectable amount. I would say this means a factor of five or more, sustained over many runs of a test case.</p>
<p>The authors discuss whether tools like Virtutech Simics could be used to overcome this problem in the context of x86 PCs. I think the main argument for something like Simics for this purpose is that by simulating the entire hardware platform and providing all timing measurements from a strong virtual time base, you do not see the types of time differences that can be used to detect a &#8220;normal&#8221; VMM. However, since the paper considers Simics and SimNow (from AMD) to be about ten times slower than native hardware, you can always detect them using a non-local time source. That is likely true. But it is less obviously true for an embedded target, where the simulator running on a fast PC might well be just as fast as the target.</p>
<p><strong>The Multicore Timing Attack</strong></p>
<p>A more intriguing aspect of embedded virtual platforms that could be used to detect them is how simulation of multicore machines is handled. For performance reasons, simulators use <em>temporal decoupling</em>, where each virtual processor is run for a &#8220;long&#8221; time slice before switching to the next. We discussed the effect of this in a recent presentation at the Multicore Expo (<a href="http://jakob.engbloms.se/archives/89">link to previous blog post</a>), and some of that data is worth repeating.</p>
<p>Here is a slide explaining how temporal decoupling works:</p>
<p><img class="aligncenter size-full wp-image-105" style="vertical-align: middle;" title="temporaldecoupling-what-it-is" src="http://jakob.engbloms.se/wp-content/uploads/2008/04/temporaldecoupling-what-it-is.png" alt="Illustration of temporal decoupling" width="500" height="375" /></p>
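<p>For readers who prefer code to slides, here is a minimal sketch of the idea in Python. It is not how Simics or any other simulator is implemented; the function name <code>run_decoupled</code> and its parameters are made up for illustration. Each virtual CPU is advanced by a whole time slice while the others stand still, and the function reports the largest gap in virtual time that ever opens up between two CPUs.</p>

```python
def run_decoupled(num_cpus, time_slice, total_cycles):
    """Run each virtual CPU for a whole time slice before switching to the next.

    Returns the largest difference in virtual time between any two CPUs
    that was observed at any point during the run.
    """
    local_time = [0] * num_cpus      # every virtual CPU keeps its own clock
    worst_skew = 0
    while min(local_time) < total_cycles:
        for cpu in range(num_cpus):  # round-robin over the virtual CPUs
            # Advance only this CPU; the others are frozen in virtual time.
            local_time[cpu] = min(local_time[cpu] + time_slice, total_cycles)
            worst_skew = max(worst_skew, max(local_time) - min(local_time))
    return worst_skew


for time_slice in (1, 10, 1000):
    print(time_slice, run_decoupled(num_cpus=2, time_slice=time_slice, total_cycles=10000))
```

<p>In this toy model the worst-case skew between the virtual CPUs is simply the length of the time slice, which is the quantity that software running on the target can end up observing.</p>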
<p>So what does this mean in practice for detecting that you are running in a virtual machine?</p>
<p>It means that the communication latency between parallel threads is proportional to the length of the time slice. If you have two threads progressing in parallel and contending for a spinlock, on a real machine they will be stealing the lock from each other all the time. On a temporally decoupled simulator, you will instead see a behavior where you can take the lock and then recapture it a few times before missing it. This effect was captured by a simple test program that we wrote, and the data is shown in the slide below:</p>
<p><img class="aligncenter size-full wp-image-106" title="temporaldecoupling-visible-disturbance" src="http://jakob.engbloms.se/wp-content/uploads/2008/04/temporaldecoupling-visible-disturbance.png" alt="Visible disturbance from temporal decoupling" width="500" height="375" /></p>
<p>The program here is running two threads in parallel, updating a shared variable, with three types of locking for the accesses:</p>
<ul>
<li>No locking at all</li>
<li>A local lock to each thread being used (&#8220;fake locking&#8221;)</li>
<li>A proper lock</li>
</ul>
<p>The interesting behavior is the execution time of the program for each of these locking styles. Obviously, running with no lock is the fastest, and with proper locking the slowest. The relative speed of these is the factor to consider. On real hardware, this program observes a very steep increase in execution time when using proper locking. On the simulator, as seen above, the difference in execution time between fake locking and proper locking is significantly smaller when using a long time slice compared to when using a short time slice. The behavior on physical machines is much more like that observed at time slice lengths of ten than that at time slices of 10000.</p>
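<p>The lock hand-off pattern behind these numbers can be illustrated with a toy model. The sketch below is not the test program from the slide, and it measures something simpler than execution time: how many times in a row the same virtual CPU gets the lock. The names <code>lock_owners</code>, <code>mean_run_length</code> and <code>hold</code> (cycles spent inside the critical section) are assumptions made for this example.</p>

```python
def lock_owners(time_slice, hold=10, total_cycles=20000):
    """Toy model: two virtual CPUs fight over one lock under temporal decoupling.

    Returns the CPU ids in the order in which they acquired the lock.
    """
    owner = None          # which CPU currently holds the lock, if any
    remaining = 0         # cycles left in the owner's critical section
    acquired = []         # acquisition history
    executed = [0, 0]     # cycles executed so far by each CPU
    while min(executed) < total_cycles:
        for cpu in (0, 1):
            for _ in range(time_slice):
                executed[cpu] += 1
                if owner is None:
                    # Lock is free: take it and start the critical section.
                    owner, remaining = cpu, hold
                    acquired.append(cpu)
                if owner == cpu:
                    remaining -= 1
                    if remaining == 0:
                        owner = None   # release at the end of the critical section
                # Otherwise the other CPU holds the lock and this cycle is spent spinning.
    return acquired


def mean_run_length(acquired):
    """Average number of back-to-back acquisitions by the same CPU."""
    runs = 1
    for prev, cur in zip(acquired, acquired[1:]):
        if cur != prev:
            runs += 1
    return len(acquired) / runs


for time_slice in (1, 10, 1000, 10000):
    print(time_slice, mean_run_length(lock_owners(time_slice)))
```

<p>With a short time slice the two CPUs alternate, much like threads stealing the lock from each other on real hardware. With a long time slice one CPU reacquires the lock many times before the other gets a chance to run at all, which is why proper locking looks cheaper than it should.</p>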
<p>Normally, a multiprocessor simulator with any ambition to be fast has to use a time slice of 1000 or more. Thus, detecting that you are running inside a simulator is quite simple. If the outside world time seems right, check if you can see strange timing behavior when using locks. Since high speed requires a long time slice, a simulator cannot show both correct real-world timing and the large performance difference between fake and proper locking that real hardware exhibits. And on the other hand, if the behavior with locking seems reasonable, you should check the real-world time, as a simulator with a short time slice will be way slower than the real world.</p>
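<p>Put as code, the two checks combine roughly as follows. This is only a sketch of the reasoning above; the function name, the inputs and both thresholds are invented for the example and would have to be calibrated against a physical machine of the same type.</p>

```python
def looks_simulated(wall_clock_slowdown, lock_penalty, expected_lock_penalty,
                    max_slowdown=2.0, min_penalty_fraction=0.5):
    """Combine the two timing checks into one verdict.

    wall_clock_slowdown:   elapsed time by an outside clock divided by the
                           time the same work takes on physical hardware.
    lock_penalty:          run time with proper locking divided by run time
                           with fake locking, as measured on this machine.
    expected_lock_penalty: the same ratio measured on physical hardware.
    The two thresholds are arbitrary placeholders, not measured values.
    """
    # A short time slice gives realistic locking but a slow simulator.
    too_slow = wall_clock_slowdown > max_slowdown
    # A long time slice gives a fast simulator but locks that look too cheap.
    locks_too_cheap = lock_penalty < expected_lock_penalty * min_penalty_fraction
    return too_slow or locks_too_cheap


print(looks_simulated(1.0, 1.2, 8.0))   # fast, but locking looks far too cheap
print(looks_simulated(20.0, 8.0, 8.0))  # realistic locking, but much too slow
print(looks_simulated(1.0, 8.0, 8.0))   # both checks pass
```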
<p>The paper authors note a similar aspect in desktop/server x86 VMM detection. They discuss &#8220;performance cliffs&#8221; that appear when doing &#8220;unusual&#8221; things. For example, VMware is engineered assuming a minimum use of self-modifying code. Performance is much worse if you use it extensively, and this can be used to detect VMware quite effectively. This effect is quite similar to the time slice effect in embedded virtual platforms.</p>
<p>Hope you enjoyed this fairly long rant. And we have not even begun exhausting the contents of this topic&#8230; luckily, these discrepancies only very rarely impact the usefulness of virtual platforms, since most software, even on an embedded system, does not care about detailed timing like this. In the example above, we still see the lock contention. So we know that we are getting an increase in execution time from the lock; we just do not get a complete picture of what it means in absolute terms. We will still find missing locks and overused locks.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/97/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>AMP vs Virtualization</title>
		<link>http://jakob.engbloms.se/archives/22?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/22#comments</comments>
		<pubDate>Thu, 13 Sep 2007 20:26:29 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[uncategorized]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[AMP]]></category>
		<category><![CDATA[operating systems]]></category>
		<category><![CDATA[SMP]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/archives/22</guid>
		<description><![CDATA[It just dawned on me recently (and it sure must have been obvious to those working with configuring AMP, Asymmetric Multiprocessing, systems) that in an AMP setup, the operating systems involved actually know about each other and have to account for the fact that they are sharing a single processor chip with other operating [...]]]></description>
			<content:encoded><![CDATA[<p>It just dawned on me recently (and it sure must have been obvious to those working with configuring AMP, Asymmetric Multiprocessing, systems) that in an AMP setup, the operating systems involved actually know about each other and have to account for the fact that they are sharing a single processor chip with other operating systems. So you cannot just take two single-core operating system images from an existing multiple-processor (local memory) solution, put them on a single chip, and expect things to just work. You do need to prepare the boot process and find a way to nicely share the common I/O devices, timers, accelerator engines and other resources on the chip. This is materially different from a virtualized setup.</p>
<p><span id="more-22"></span><br />
In a virtualization-based setup, you use a single hypervisor program that then controls several single-processor operating systems running on the machine. That hypervisor also takes care of allocating shared resources to the operating systems, sometimes by sharing a single physical resource, sometimes by only letting one operating system access a certain resource. So in this case, you can actually reuse existing OS images on a new multicore chip and transparently transform an existing system.</p>
<p>Too bad there is still no embedded processor with strong support for heavy-duty virtualization like this.</p>
<p>On the other hand, it might be a passing need. The transition of old applications to new hardware will always involve some rewrite and retouch, and if that means doing a bit of change in the OS setup to handle an AMP case nicely, it is probably not too expensive (compared to redoing applications on top of the OS to be truly SMP).  And for virtualization, this means that you can use a Xen-style paravirtual approach where the OS is modified to run on top of a simple hypervisor.</p>
<p>Running and booting an unmodified binary install of an OS is likely more of a server/desktop problem than one for embedded applications. We are going to see virtualization support in hardware to help light-weight approaches be even more efficient, and also to tackle the security issues of rogue code getting into some OS image. Hardware support is needed to contain an OS that has been taken over by bad guys; no amount of cooperation between OSes in an AMP setting is going to prevent that.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/22/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fastscale minimal virtual machines &#8212; beautiful simple idea</title>
		<link>http://jakob.engbloms.se/archives/16?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/16#comments</comments>
		<pubDate>Tue, 28 Aug 2007 19:01:26 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[business issues]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Fastscale]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[operating systems]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/archives/16</guid>
		<description><![CDATA[A company called Fastscale Technologies has a product that is simple in concept and yet very powerful. Instead of using complete installs of heavy operating systems like Linux or Windows to run applications on virtual machines, they offer tools that provide minimal operating system configurations that are tailored to the needs of a particular application. [...]]]></description>
			<content:encoded><![CDATA[<p>A company called <a href="http://www.fastscale.com/">Fastscale Technologies</a> has a product that is simple in concept and yet very powerful. Instead of using complete installs of heavy operating systems like Linux or Windows to run applications on virtual machines, they offer tools that provide minimal operating system configurations that are tailored to the needs of a particular application. Since only that application is going to be run on the virtual machine, this is sufficient. <a href="http://www.theregister.co.uk/2007/08/27/fastscale_vmware_virtual_manager/">According to press reports, </a>this means that you can run several times more virtual machines on a given host, compared to default OS installs, and boot an order of magnitude faster.</p>
<p><span id="more-16"></span>The basic premise is definitely one that makes sense. We have seen this at <a href="http://www.virtutech.com">Virtutech</a>, working with stripped-down Linux images on various embedded boards. Turning on and off individual services often has a significant impact on both execution speed and memory consumption of a particular target. The gut reaction when setting up a new target is to think about what can be stripped out rather than to put in everything that could be useful, since the latter results in a bloated image that consumes resources with little additional value.</p>
<p>The difference is even more obvious for Windows machines, where the speed of the same virtual machine running NT, 2000, XP, or Vista is incredibly different. With Vista and XP in particular, there is much going on in the background eating up memory and processor time. Some simple tuning tricks like turning off animations in the GUI or background indexing can have a very large impact on speed. Same thing goes for graphical desktop Linux distributions, where turning off eye candy results in significant speed increases.</p>
<p>So I believe that Fastscale can offer what they promise, it is a nice simple idea. But as always with a simple idea that makes for a powerful product, there is something more than just the idea. There is a somewhat tricky piece of execution.</p>
<p>The trick that Fastscale brings is to automate the process of creating a minimal but sufficient substrate for a particular application, given the app, target OS, and server hardware. Sounds like dependency checking once you have the data on what needs what, but finding out the particular dependence chains in any particular OS is hard work. Doing it manually is pretty painful, and the Linux kernel configurator is not always up to the task.</p>
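<p>The easy half of the problem, once you have the data on what needs what, looks something like the sketch below. This is not Fastscale's actual method, which I have not seen; the function name and the component names in the example map are invented. The hard part is producing a correct dependency map in the first place.</p>

```python
def minimal_image(app, needs):
    """Return every component reachable from `app` in the dependency map.

    `needs` maps a component name to the components it directly requires.
    """
    keep, stack = set(), [app]
    while stack:
        item = stack.pop()
        if item in keep:
            continue                      # already included, also guards against cycles
        keep.add(item)
        stack.extend(needs.get(item, ()))
    return keep


# Hypothetical dependency data for a small web application.
needs = {
    "webapp": ["libc", "tcpip"],
    "tcpip": ["eth-driver"],
    "desktop": ["x11", "sound"],          # never pulled in by the web application
    "sound": ["sound-driver"],
}
print(sorted(minimal_image("webapp", needs)))
```

<p>Everything that is not in the returned set, such as the desktop and sound components here, can be left out of the image.</p>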
<p>It is also a nice example of a case where open source makes new innovation possible. Doing this kind of configuration for Windows is much harder in practice, if for no other reason than that you do not get to recompile the kernel yourself.</p>
<p>Also, Linux in particular has rather good support for removing non-essential components, I believe to some extent thanks to its extensive use in embedded software. Embedded people have a tradition of application-specific configurations, and here is a nice case of when that results in a better desktop/server product in the end. Who said embedded requirements were just for a niche market?</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/16/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Real-Time in Sweden (RTiS) 2007</title>
		<link>http://jakob.engbloms.se/archives/11?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/11#comments</comments>
		<pubDate>Wed, 22 Aug 2007 19:47:23 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[appearances]]></category>
		<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Real-Time in Sweden]]></category>
		<category><![CDATA[Simics]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/archives/11</guid>
		<description><![CDATA[RTiS 2007 just took place in Västerås, Sweden. It is a biennial event where Swedish real-time research (and that really means embedded in general these days) presents new results and summarizes results from the past two years. For someone who has worked in the field for ten years, it really feels like a gathering of [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.idt.mdh.se/RTiS2007/">RTiS 2007</a> just took place in Västerås, Sweden. It is a biennial event where Swedish real-time research (and that really means embedded in general these days) presents new results and summarizes results from the past two years. For someone who has worked in the field for ten years, it really feels like a gathering of friends and old acquaintances. And always some fresh new faces. Due to a scheduling conflict, I was only able to make it to day one of two.</p>
<p>I presented a short summary of a paper that a colleague at Virtutech and I wrote last year together with Ericsson and TietoEnator, on the <a href="http://www.virtutech.com">Simics</a>-based simulator for the Ericsson CPP system (see the <a href="http://www.engbloms.se/jakob_publications.html">publications page for 2006 and soon for 2007</a>). I also presented the Simics tool and demoed it in the demo session. Overall, it was nice to be talking to the mixed academic-industrial audience.</p>
<p><span id="more-11"></span>The papers presented really indicate current trends in embedded and real-time systems development. Focus is on topics like model-driven architecture and component-based software development, and on how to make these abstract ways of developing software produce code that can actually run well on resource-limited processors. This leads into scheduling analysis and into deriving the properties of collections of objects from the properties of the component objects. For example, what is the worst-case execution time of a certain piece of software, given the context of a particular system configuration? The industry cases gravitated towards the automotive domain, which is not surprising considering the make-up of Swedish industry. Mobile phones were notably absent, but they are not very real-time systems anyway.</p>
<p>The keynotes by <a href="http://ptolemy.eecs.berkeley.edu/~eal/">Edward Lee</a> of Berkeley (on his ideas for programming parallel processor machines) and by <a href="http://user.it.uu.se/~eh/">Erik Hagersten </a>(on why multicore is different from multiprocessors) were well-received and well-presented. Nothing particularly new, but nice overviews by good speakers. I really like the ideas of Lee on how to program parallel machines using sequential programs coordinated by a higher-level coordination layer. It is a nice way to separate concerns.</p>
<p>Overall, I appreciate the effort of the organizers and really regret that I could not stay for the conference dinner.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/11/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

