<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala &#187; VMWare</title>
	<atom:link href="http://jakob.engbloms.se/archives/tag/vmware/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Technology: Simulation, Virtualization, Virtual Platforms, Embedded, Multicore and Multiprocessing (by Jakob Engblom)</description>
	<lastBuildDate>Sun, 05 Sep 2010 06:08:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>Driving an Old Canon Scanner using a VM</title>
		<link>http://jakob.engbloms.se/archives/842?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/842#comments</comments>
		<pubDate>Wed, 15 Jul 2009 18:43:50 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[desktop software]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[Canon]]></category>
		<category><![CDATA[LIDE30]]></category>
		<category><![CDATA[scanner]]></category>
		<category><![CDATA[USB]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Vista]]></category>
		<category><![CDATA[VMWare]]></category>
		<category><![CDATA[Windows]]></category>
		<category><![CDATA[XP]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=842</guid>
		<description><![CDATA[I have an old Canon LIDE 30 scanner that I purchased sometime late in 2003. At that time, it was connected to a PC running Windows XP, and drivers worked just fine. However, after I got my new computer in early 2009, with Vista 64, there are no more drivers available. There is a funny [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-843" style="margin-left: 5px; margin-right: 5px;" title="lide30" src="http://jakob.engbloms.se/wp-content/uploads/2009/07/lide30.gif" alt="lide30" width="100" height="67" />I have an old <a href="http://www.canon-europe.com/For_Home/Product_Finder/Scanners/Flatbed/LIDE30/index.asp">Canon LIDE 30 </a>scanner that I purchased sometime late in 2003. At that time, it was connected to a PC running Windows XP, and drivers worked just fine. However, after I got my new computer in early 2009, with Vista 64, there are no more drivers available. There is a funny way around this though, using a virtual machine.</p>
<p><span id="more-842"></span>What I ended up doing to keep using my scanner (whose hardware is still very much intact and solid) is fairly obvious: I installed my old Windows XP license on a VMWare virtual machine (I had the good luck to have a full license with physical media), and then installed the Canon LIDE30 driver on that virtualized XP.</p>
<p>VMWare Player is sufficient to let me attach the physical scanner to the virtual machine&#8217;s USB interface, and drive it without the host Vista 64 machine being any the wiser. To get the scanned pictures out, I have to resort to drag-and-drop, as I have failed to get shared folders to work with Player for some unknown reason.</p>
<p>The end result can be pretty complex&#8230; To send some emails from my work computer including scans with this scanner, I had to:</p>
<ul>
<li> Scan on the virtual XP machine</li>
<li>Drag-and-drop to the Pictures folder on my Vista 64 machine</li>
<li>Use file-sharing in Windows to move to my work laptop</li>
<li>Attach in Outlook</li>
</ul>
<p>Workable. It is also a pretty good demo of the power afforded by modern consumer operating systems. Imagine trying to do that in 1995&#8230; would not have been quite as fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/842/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Simulation Determinism: Necessary or Evil?</title>
		<link>http://jakob.engbloms.se/archives/734?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/734#comments</comments>
		<pubDate>Sun, 19 Apr 2009 20:36:02 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[multicore debug]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[debugging]]></category>
		<category><![CDATA[determinism]]></category>
		<category><![CDATA[multicore]]></category>
		<category><![CDATA[repeatability]]></category>
		<category><![CDATA[reverse execution]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[VMWare]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=734</guid>
		<description><![CDATA[In my series (well, I have one previous post about checkpointing) about misunderstood simulation technology items, the turn has come to the most difficult of all it seems: determinism. Determinism is often misunderstood as meaning &#8220;unchanging&#8221; or &#8220;constant&#8221; behavior of the simulation. People tend to assume that a deterministic simulation will not reveal errors due [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-735" style="margin-left: 10px; margin-right: 10px;" title="gears" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears.png" alt="gears" width="56" height="57" />In my series (well, I have one previous post about <a href="http://jakob.engbloms.se/archives/714"><em>checkpointing</em></a>) about misunderstood simulation technology items, the turn has come to the most difficult of all it seems: <em>determinism.</em> Determinism is often misunderstood as meaning &#8220;unchanging&#8221; or &#8220;constant&#8221; behavior of the simulation. People tend to assume that a deterministic simulation will not reveal errors due to nondeterministic behavior or races in the modeled system, which is a complete misunderstanding. Determinism is a necessary feature of any simulation system that wants to be really helpful to its users, not an evil that hides errors.</p>
<p><span id="more-734"></span></p>
<h2>What?</h2>
<p>Determinism really means this:</p>
<ul>
<li>Given a certain initial state</li>
<li>And a certain sequence of external inputs</li>
<li>The end result and state of the simulation will always be the same</li>
</ul>
<p>The key to note is that you need to require both the starting state and the sequence of external inputs to be the same in order to get the same result. If either of these changes, you can well get a different result. Implementing a deterministic simulator requires all internal events and activities in the simulator to be performed in the same order and at the same time in each simulation run. It means that the host computer environment state cannot be allowed to affect the simulator execution, and that in turn means that all sorting of internal events has to be done in defined orders in all instances.</p>
<p>I have a story about how hard that can be in practice. I once talked to some compiler developers who had the issue that when recompiling the same program with the same set of compiler options, the results might come out different, even on the same machine. The problem was that each run of the compiler was done in a different overall system state, and this might affect how the OS memory allocation functions allocated items in memory. It turned out that in some cases, the precise values of the <em>pointers </em>to the items in a complex data structure were used by standard libraries to handle iteration over nodes in the data structures. Thus, a different memory allocation pattern gave a different iteration order and a different traversal order of nodes, and in the end an almost arbitrarily different result. The correct solution they had to implement was to use a defined lexical ordering to traverse and iterate, not anything dependent on the state of the host machine. It is no different in a simulator: define the order of <em>everything</em>, in order to be deterministic.</p>
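<p>The pitfall in the compiler story can be sketched in a few lines of Python. This is only an illustration under assumptions: the <code>Node</code> class and the device names are made up, and the address-based ordering relies on the CPython behavior that <code>id()</code> returns the memory address of an object.</p>

```python
class Node:
    """A node in some internal data structure (hypothetical example type)."""

    def __init__(self, name):
        self.name = name


def traverse_by_address(nodes):
    # Fragile: in CPython, id() is the memory address of the object, so this
    # order depends on where the allocator happened to place each node.
    return [node.name for node in sorted(nodes, key=id)]


def traverse_by_name(nodes):
    # Deterministic: the order depends only on the data itself.
    return [node.name for node in sorted(nodes, key=lambda node: node.name)]


nodes = [Node("uart0"), Node("cpu1"), Node("timer"), Node("cpu0")]
print(traverse_by_name(nodes))  # ['cpu0', 'cpu1', 'timer', 'uart0']
```

<p>The name-based traversal gives the same answer in every run and on every host, which is the property a deterministic simulator needs from all of its internal orderings.</p>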
<h2>Why?</h2>
<p>The crucial benefit that determinism brings to a simulation in general and a virtual platform in particular is <em>repeatable debugging</em>. With determinism and an appropriate recording mechanism (and most practically <a href="http://jakob.engbloms.se/archives/714">checkpointing</a>) you can rely on being able to repeat a run resulting in a bug any number of times with the precise same sequence of events in the simulation. In particular, events visible to and relevant for the software running on the virtual platform occur in the same sequence, at the same time, and at the same point relative to the instructions executed. Especially for multicore and parallel computing systems this is incredibly powerful, and something that just cannot be achieved on physical hardware (due to its inherent randomness and chaotic behavior, see my 2006 and 2007 ESC Silicon Valley talks for more on this, at my <a href="http://www.engbloms.se/jakob_publications.html">publications </a>and <a href="http://www.engbloms.se/jakob_presentations.html">presentations </a>pages).</p>
<p>If you assume stability of the simulation infrastructure and the simulation platform, determinism also makes debugging the simulation itself easier. Often, a bug in a simulation model is repeatable, and with determinism, it is easy to repeat the same external stimulus sequence to the module and debug it repeatably.</p>
<p>Determinism also makes it easy to detect change in the behavior of a simulation: if the same simulation setup results in a different result or final simulation state, you know something in the setup (model, model parameters, or software) changed. There is no randomness that causes changes without some fundamental parameter being changed. Such boring reliable behavior is generally exactly what you want when testing and debugging large, complex systems.</p>
<p>Obviously, once determinism becomes a requirement, missing determinism in a model is a bug in itself &#8212; and finding such bugs can certainly be interesting exercises.</p>
<h2>Why Not?</h2>
<p>Just like for checkpointing, one reason not to do determinism is that it is hard, as discussed above.</p>
<p>The most common reason that people claim to want to avoid determinism is that they want to explore alternatives within their simulation. Basically, there is a need for <em>variability </em>that would seem to be at odds with determinism. The typical argument is that &#8220;if my simulation model contains a non-deterministic choice, I want the simulation to expose that and not just make the same decision every time&#8221;. This is where determinism tends to be considered <em>evil</em>. However, this argument is not correct.</p>
<p>If we take the case that at some point P in a simulation run there are two different events <em>E</em> and <em>F</em> that can fire (since they are both posted to the same point in virtual time), a deterministic simulator will always select one and the same. This is necessary to reap the system-level benefits discussed above. However, nothing prevents us from programming a change from this behavior into our system explicitly, <em>introducing controlled and repeatable variation. </em>In such a setup, we will have a random decision being made in each simulation run, but one where the outcome in any particular run can be repeated by setting the same random seed parameter.</p>
<p>This brings the best of both worlds: variation to expose issues where there is potential non-determinism or lack of synchronization in the model, and perfect repeatability of the issues this poses in terms of target software and simulation system behavior. The reason for the simultaneous readiness can be considered to be lacking synchronization in the model, in general, and such a randomizer of behavior will expose that at several different levels. But uncontrolled randomness is not the answer.</p>
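<p>The controlled and repeatable variation described above can be sketched as follows. This is a minimal illustration, not how any particular simulator does it: the function <code>firing_order</code>, the event names, and the seed value are all made up for the example.</p>

```python
import random


def firing_order(events, seed):
    """Return event names in firing order.

    events is a list of (virtual_time, name) pairs. Events posted to the same
    point in virtual time are shuffled by a generator seeded with `seed`.
    """
    rng = random.Random(seed)
    by_time = {}
    for virtual_time, name in events:
        by_time.setdefault(virtual_time, []).append(name)
    order = []
    for virtual_time in sorted(by_time):
        ready = sorted(by_time[virtual_time])  # defined base order first
        rng.shuffle(ready)                     # seeded, hence repeatable
        order.extend(ready)
    return order


events = [(10, "E"), (10, "F"), (5, "boot"), (20, "halt")]
first = firing_order(events, seed=42)
again = firing_order(events, seed=42)
print(first == again)  # True
```

<p>Different seeds may pick a different order for E and F, but any single run can be reproduced exactly by reusing its seed, so the seed becomes just another part of the recorded simulation setup.</p>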
<p>Another common misconception is that at a higher level, determinism in a virtual platform means that target software will always run in the same way. That is not true, and misses the importance of state in the deterministic behavior equation. If the initial state when a program starts is different, a different execution will result. If software is run on top of any non-trivial operating system, there is plenty of such variation. In one of our simplest Simics demos, we show this by running an intentionally buggy race-condition-ridden program. Each time it is run, it hits a different number of race conditions. But thanks to determinism (best demoed using reverse execution), we can repeat each run perfectly.</p>
<p>Thus, determinism is not equal to constant behavior or lack of variation.</p>
<h2>The reverse argument</h2>
<p>Finally, determinism is the simplest way to implement reverse execution: if you have recording, determinism, and checkpointing, you can easily virtually reverse the execution by going back to a checkpoint and replaying the execution from that point. If you stop one instruction before the current instruction, you have in essence stepped backwards one step in time. This is how both VMWare and Simics implement reverse execution and debugging. And it could not happen without determinism.</p>
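<p>The checkpoint-and-replay idea can be sketched in a toy form. Everything here is an assumption made for illustration: <code>ToySim</code> and <code>ReverseDebugger</code> are hypothetical classes, the simulation has no external inputs (so no recording is needed), and the simulation is assumed to start at step zero. It is not a description of how VMWare or Simics are implemented internally.</p>

```python
import copy


class ToySim:
    """A deterministic toy simulation with no external inputs (an assumption)."""

    def __init__(self):
        self.steps = 0
        self.state = {"acc": 1}

    def step(self):
        # Any deterministic state update will do for the illustration.
        self.state["acc"] = (self.state["acc"] * 31 + 7) % 1000003
        self.steps += 1


class ReverseDebugger:
    def __init__(self, sim, interval=100):
        self.sim = sim
        self.interval = interval
        self.checkpoints = {0: copy.deepcopy(sim.state)}

    def run(self, count):
        for _ in range(count):
            self.sim.step()
            if self.sim.steps % self.interval == 0:
                self.checkpoints[self.sim.steps] = copy.deepcopy(self.sim.state)

    def reverse_step(self):
        if self.sim.steps == 0:
            return  # already at the start of time
        target = self.sim.steps - 1
        # Latest checkpoint at or before the target step.
        base = (target // self.interval) * self.interval
        self.sim.state = copy.deepcopy(self.checkpoints[base])
        self.sim.steps = base
        for _ in range(target - base):
            self.sim.step()  # deterministic replay up to the target


dbg = ReverseDebugger(ToySim(), interval=10)
dbg.run(25)
dbg.reverse_step()
print(dbg.sim.steps)  # 24
```

<p>The replay only lands in the right state because <code>step</code> is deterministic; with any uncontrolled randomness in the model, the state reached after replay would not match the state the original run passed through.</p>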
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/734/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Checkpointing: Meaningless, Difficult, or just Overlooked?</title>
		<link>http://jakob.engbloms.se/archives/714?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/714#comments</comments>
		<pubDate>Thu, 09 Apr 2009 19:56:16 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Checkpointing]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Mambo]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[VMWare]]></category>
		<category><![CDATA[ZX Spectrum]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=714</guid>
		<description><![CDATA[One thing that surprises me is how rare the feature of checkpointing or snapshotting is in the land of virtual platforms, despite the obvious benefits of that feature. Indeed, checkpointing was one of the first cool things demonstrated to me when I joined Virtutech back in 2002. Today, I could not ever imagine doing without [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-737" style="margin-left: 10px; margin-right: 10px;" title="gears1" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears1.png" alt="gears1" width="56" height="57" />One thing that surprises me is how rare the feature of checkpointing or snapshotting is in the land of virtual platforms, despite the obvious benefits of that feature. Indeed, checkpointing was one of the first cool things demonstrated to me when I joined Virtutech back in 2002. Today, I could not ever imagine doing without it. Not having checkpointing is like having a word processor where you only get to save once, when your document is finished, with no option of saving intermediate states.</p>
<p>But not everyone seems to consider this an important feature, judging from its relative rarity in the world of EDA and virtual platforms. Why is this? Let&#8217;s look at some possible explanations.</p>
<p><span id="more-714"></span></p>
<p>But first, let&#8217;s examine the subject of this post a bit more. What is checkpointing, precisely?</p>
<h2>What?</h2>
<p>In short, it is the ability of a virtual platform or virtualization environment to save the state of an executing simulation to disk (or memory or something) and later bring the saved state back and continue the simulation as if nothing had happened.</p>
<p>In detail, there are four operations that need to be supported for this to be truly useful:</p>
<p><img class="aligncenter size-full wp-image-715" title="checkpoints" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/checkpoints.png" alt="checkpoints" width="632" height="494" /></p>
<ul>
<li>Saving and restoring to the same simulation system on the same host machine (i.e., into the exact same program binary for the simulation).</li>
<li>Restoring on a different machine (where different can mean a machine with a different word-length, endianness, and operating system).</li>
<li>Restoring into a bug-fixed version of the same simulation model.</li>
<li>Restoring into a completely different simulation model that happens to have the same state.</li>
</ul>
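<p>The first two operations can be sketched as follows, assuming each model exposes all of its state through a getter and a setter. The class <code>TimerModel</code> and the functions <code>save_checkpoint</code> and <code>restore_checkpoint</code> are hypothetical names for this illustration, and JSON is just one possible host-independent format.</p>

```python
import json


class TimerModel:
    """Hypothetical device model; all of its state lives in plain attributes."""

    def __init__(self, count=0, reload=100):
        self.count = count
        self.reload = reload

    def tick(self):
        self.count = (self.count + 1) % self.reload

    def get_state(self):
        return {"count": self.count, "reload": self.reload}

    def set_state(self, state):
        self.count = state["count"]
        self.reload = state["reload"]


def save_checkpoint(objects):
    # Plain text does not depend on host word length or endianness.
    states = {name: obj.get_state() for name, obj in objects.items()}
    return json.dumps(states, sort_keys=True)


def restore_checkpoint(text, objects):
    for name, state in json.loads(text).items():
        objects[name].set_state(state)


running = {"timer0": TimerModel(reload=7)}
for _ in range(10):
    running["timer0"].tick()
saved = save_checkpoint(running)

restored = {"timer0": TimerModel()}
restore_checkpoint(saved, restored)
print(restored["timer0"].get_state())  # {'count': 3, 'reload': 7}
```

<p>Because the checkpoint only names objects and their state, the receiving side could just as well be a bug-fixed or more detailed model, as long as it accepts the same state.</p>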
<h2>Why?</h2>
<p>Let&#8217;s look at some use cases for checkpointing:</p>
<p>The last operation is very interesting, since it carries with it the ability to change abstraction level. It is used in IBM Mambo (see a <a href="http://www.research.ibm.com/journal/rd/502/peterson.html">2006 IBM paper that you now have to buy due to an annoying change in IBM policy</a>) to exactly this effect, and in Simics for the Freescale QorIQ P4080 as well. It is also well exploited by academic research frameworks for Simics, such as <a href="http://www.cs.wisc.edu/gems/">GEMS </a>and <a href="http://www.ece.cmu.edu/~simflex/">SimFlex</a>. Essentially, the idea is to position using fast mode, and then move over to detailed mode. The advantage to doing this over a checkpoint is that you can farm out the experiments across many different hosts, save the precise starting point for future regression tests, and try different detailed settings from a known common starting position.</p>
<p>The most obvious use for checkpoints is to avoid repeating simulation work that does not add value, in particular booting of operating systems. A modern OS boot  easily takes billions of instructions (say 10 seconds on a dual-core gigahertz machine&#8230; do the math). Being able to save a simulation effort like this for instant reuse is such a standard part of how I work with virtual platforms that I could not imagine the pain of not having it.</p>
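<p>Doing the math, as a rough upper bound: the sketch below assumes a 1 GHz clock, one instruction per cycle, and both cores busy for the whole boot. All three are simplifying assumptions, and the function name is made up for the example.</p>

```python
def boot_instructions(seconds, cores, clock_hz, instructions_per_cycle=1.0):
    # Upper bound: assumes every core is busy for the entire boot.
    return seconds * cores * clock_hz * instructions_per_cycle


print(boot_instructions(seconds=10, cores=2, clock_hz=1e9))  # 20000000000.0
```

<p>That is on the order of twenty billion target instructions that a simulator would have to execute again for every run without a checkpoint.</p>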
<p>Checkpointing is also a useful communications tool: it makes it possible for any user of a virtual platform to precisely communicate the system state and configuration to anybody else with access to the same virtual platform system (note that a checkpoint, at least in Simics land, contains the list of objects in the simulation and how they are connected, so you do not need any other description of the simulation setup). This helps in debugging models &#8211; a user testing it can easily package problems and report them to the modeling team. And it helps in debugging software running on the virtual platform, as a tester can package up the precise system state right before a bug hits and send it back to development. Incredibly powerful! Here, portability of checkpoints across hosts is obviously very important, as well as across model versions. Once you have a fix for a model bug, you test it using the checkpoint, and check that things now proceed as they should.</p>
<p>Checkpointing also comes in handy as a backup-save ability when configuring an interactive target system. In many cases, the loading and configuration of software on a target is a very valuable and hard-to-repeat-exactly activity. Adding in software, configuring it, starting servers, assigning network addresses, configuring communications paths for backplanes can take a lot of time. On physical machines or virtual platforms, if you mess up, you have to go back and start over. With checkpointing, you can incrementally save work as you go along. This is a common use case for the snapshotting ability in VmWare, for example. But it works equally well for embedded targets modeled as virtual platforms.</p>
<p>There are more uses, the paragraphs above just scratch the surface of the utility of checkpoints.</p>
<h2>Why Not?</h2>
<p>But despite the obvious benefits, this feature is very rarely found in virtual platforms. I can see three main lines of argument:</p>
<ul>
<li><em>Meaningless</em>: for tests comprising only short software runs, like a few million or tens of millions of instructions, rerunning them is fast enough, or the changes between runs are major enough, that checkpointing seems pointless. I can buy that, but only until the simple target is part of a greater context. If a DSP, for example, is part of a big system setup, you want to save its state even if it is only running a few small million-instruction loops.</li>
<li><em>Difficult</em>: I think this might be the most important explanation. Doing checkpointing right puts requirements on the simulation kernel and on all processors and device models. All models have to be coded with discipline so that all state is available and can be set at any point in time. In particular, this means that explicit threading as employed in SystemC SC_THREAD is out. It must also be admitted that certain types of models like detailed processor models can be very difficult to serialize and deserialize from disk, simply due to the enormous intricacies of their implementations. But had they been designed with checkpointing in mind from the start, it would have been less difficult.</li>
<li><em>Overlooked</em>: The virtual platform was designed without thinking of checkpointing. Alternatively, no customers asked for it, so it was not built.</li>
</ul>
<p>I find the last argument very interesting, since I can see what happens once you have tried checkpointing. In my experience, once a user of a virtual platform has tried checkpointing, they want it. It goes from an interesting idea to a must-have feature very quickly. No arguments about why it is hard or why they can do without it work, as they have seen how things should be done.</p>
<p>For me, I think it is akin to my first encounter with a Macintosh computer, and the concept of &#8220;undo&#8221; in programs. Before that, I was happily editing code on a ZX Spectrum, in an environment where &#8220;undo&#8221; meant &#8220;manually remember how it looked and change it&#8221;. I had no problems with that, but once I saw how things could be done, there was no going back.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/714/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>VMM Detection Myths and Realities from a Simics and Embedded Perspective</title>
		<link>http://jakob.engbloms.se/archives/97?&amp;owa_from=feed&amp;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/97#comments</comments>
		<pubDate>Sun, 20 Apr 2008 00:02:21 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[virtual machines]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Andrew Warfield]]></category>
		<category><![CDATA[HOTOS]]></category>
		<category><![CDATA[Jason Franklin]]></category>
		<category><![CDATA[Keith Adams]]></category>
		<category><![CDATA[Simics]]></category>
		<category><![CDATA[Tal Garfinkel]]></category>
		<category><![CDATA[Temporal decoupling]]></category>
		<category><![CDATA[Timing attack]]></category>
		<category><![CDATA[Virtual machine detection]]></category>
		<category><![CDATA[VMWare]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=97</guid>
		<description><![CDATA[It must have been Google Alerts that sent me a link to the HOTOS 2007 (Hot Topics in Operating Systems) paper by Tal Garfinkel, Keith Adams, Andrew Warfield, and Jason Franklin called Compatibility is not Transparency: VMM Detection Myths and Realities. This paper is slightly less than a year old today, so it is old [...]]]></description>
			<content:encoded><![CDATA[<p>It must have been Google Alerts that sent me a link to the <a href="http://www.usenix.org/events/hotos07/">HOTOS 2007</a> (Hot Topics in Operating Systems) paper by Tal Garfinkel, Keith Adams, Andrew Warfield, and Jason Franklin called <a href="http://www.usenix.org/events/hotos07/tech/full_papers/garfinkel/garfinkel_html/">Compatibility is not Transparency: VMM Detection Myths and Realities</a>. This paper is slightly less than a year old today, so it is old by blog standards and quite recent by research paper standards. It deals with the interesting problem of whether a virtual machine can be made undetectable by software running on it, even software that is trying to detect it. Their conclusion is that it is not feasible, and I agree with that. The reason WHY that is the case can use some more discussion, though&#8230; and here is my take on that issue from a Simics/embedded systems virtualization perspective.</p>
<p><span id="more-97"></span></p>
<p>Their main important assumption is that the VMM cannot be tailored to avoid detection by any particular piece of software, but has to be sufficiently like the real thing to fool something the first time it appears. They discuss from the perspective of virtualization solutions like VmWare that aim at high performance before all else. The virtual PCs generated by VmWare, Parallels, KQemu, and others are all compatible with physical PCs &#8212; run the same software &#8212; but are not at all identical in detail. So they are not transparent in the words of the paper. This means that they are quite easy to spot.</p>
<p>There are some holes in functional differences that VMMs can quite easily plug. The paper shows how you can get a different-sized TLB (compared to the physical hardware), for example, from interference from the VMM. This can obviously be fixed in the VMM, at a cost in performance. The reason such differences are there is that VMMs are optimized for performance at almost any cost. As long as the requisite operating systems run as they should, the VMM is fine even if it does not actually correspond to any particular existing physical machine. This is a testament to the tolerance of modern operating systems towards their hardware. Basically, any OS that probes hardware and discovers what is there will work fine as long as the (virtual) hardware exposes devices that it can recognize. This is quite different from the 1970s or 1980s, when an OS would definitely expect a very particular hardware setup with very peculiar timing to run at all. Thus, making a VMM totally identical to some physical machine is a waste of effort and performance.</p>
<p>Paravirtual approaches like Xen, and what Sun has with Niagara and IBM on their Power servers, where the OS is rewritten to have drivers for a purely virtual hardware/software interface, are an obvious generalization of the VmWare compatibility approach. Compatible versus transparent/invisible virtualization is really only an issue in the x86 PC world, since all other datacenter architectures are virtual by definition and all operating systems work towards a standard virtual layer. In such an environment, I have a hard time seeing that the question posed in the paper even makes sense. You are always virtualized, period.</p>
<p><strong>Embedded Virtual Platforms</strong></p>
<p>Anyhow, back to the main thread. There is still a large set of targets where transparency and compatibility are of interest. x86 PCs are one such target, and it is also an interesting question for older architectures (Alpha, Vax, Sun and IBM in older generations). In particular, it is an important topic for embedded systems where you want to use virtual or simulated approaches to develop and test software. As part of that software development process on a virtual machine, you could potentially be examining malware of various kinds. A good not-too-hypothetical example is mobile phone viruses.</p>
<p>If we look at embedded system virtual platforms, the functionality of the simulator is usually more complete and more like a particular physical machine than what a VmWare-style datacenter VMM provides. This is partially due to embedded software stacks tending to be a bit pickier about what they run on, and partially due to the simple fact that the goal really IS to expose the hardware/software interface of a particular piece of hardware as closely as possible. Also, since this is usually cross-target simulation (Power Arch on x86, for example), there is no performance gain from using features of the host directly. So items like TLB counts, memory layout, memory content, flash memory programming, etc. are all going to be functionally identical to the physical machine.</p>
<p><strong>Timing is Key</strong></p>
<p>Thus, just like for a patched VmWare-style VMM as discussed in the article, the main attack vector remains <em>timing</em>.</p>
<p>The best way, according to the authors, to spot a VMM is to look for timing differences compared to the behavior on normal hardware. Despite the inherent variability of typical hardware, there are cases where VMMs by necessity vary by detectable amounts. I would say this means a factor of five or more over many tests of a case.</p>
<p>The authors discuss whether tools like Virtutech Simics could be used to overcome this problem in the context of x86 PCs. I think the main argument for something like Simics for this purpose is that by simulating the entire hardware platform and providing all timing measurements from a strong virtual time base, you do not see the types of time differences that can be used to detect a &#8220;normal&#8221; VMM. However, since the paper considers Simics and SimNow (from AMD) to be about ten times slower than native hardware, you can always detect them using a non-local time source. That is likely true. But it is less obviously true for an embedded target where the simulator running on a fast PC might well be just as fast as the target.</p>
<p><strong>The Multicore Timing Attack</strong></p>
<p>A more intriguing aspect of embedded virtual platforms that could be used to detect virtual platforms is how simulation of multicore machines is handled. For performance reasons, simulators use <em>temporal decoupling</em>, where each virtual processor is run for a &#8220;long&#8221; time slice before switching to the next. We discussed the effect of this in a recent presentation at the multicore expo (<a href="http://jakob.engbloms.se/archives/89">link to previous blog post</a>), and some of that data is worth repeating.</p>
<p>Here is a slide explaining how temporal decoupling works:</p>
<p><img class="aligncenter size-full wp-image-105" style="vertical-align: middle;" title="temporaldecoupling-what-it-is" src="http://jakob.engbloms.se/wp-content/uploads/2008/04/temporaldecoupling-what-it-is.png" alt="Illustration of temporal decoupling" width="500" height="375" /></p>
<p>So what does this mean in practice for detecting that you are running in a virtual machine?</p>
<p>It means that the communication latency between parallel threads is proportional to the size of the time slicing. If you have two threads progressing in parallel doing spinlocks, on a real machine they will be stealing the lock from each other all the time. On a temporally decoupled simulator, you will rather see a behavior where you can take the lock and then recapture it a few times before missing it. This effect was captured by a simple test program that we wrote, and the data is shown in the slide below:</p>
<p><img class="aligncenter size-full wp-image-106" title="temporaldecoupling-visible-disturbance" src="http://jakob.engbloms.se/wp-content/uploads/2008/04/temporaldecoupling-visible-disturbance.png" alt="Visible disturbance from temporal decoupling" width="500" height="375" /></p>
<p>The program here is running two threads in parallel, updating a shared variable, with three types of locking for the accesses:</p>
<ul>
<li>No locking at all</li>
<li>A local lock to each thread being used (&#8220;fake locking&#8221;)</li>
<li>A proper lock</li>
</ul>
<p>The interesting behavior is the execution time of the program for each of these locking styles. Obviously, running with no lock is the fastest, and with proper locking the slowest. The relative speed of these is the factor to consider. On real hardware, this program observes a very steep increase in execution time when using proper locking. On the simulator, as seen above, the difference in execution time between fake locking and proper locking is significantly smaller when using a long time slice compared to when using a short time slice. The behavior on physical machines is much more like that observed at time slice lengths of ten than that at time slices of 10000.</p>
<p>Normally, a multiprocessor simulator with any ambition to be fast has to use a time slice of 1000 or more. Thus, detecting that you are running inside a simulator is quite simple. If the outside world time seems right, check if you can see strange timing behavior when using locks. Since high speed requires a long time slice, you cannot have both correct real-world timing and a realistically large performance difference between fake locking and proper locking. And on the other hand, if the behavior with locking seems reasonable, you should check the real-world time &#8212; as a simulator with a short time slice will be way slower than the real world.</p>
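<p>The lock recapture effect can be reproduced in a toy model. The sketch below is not the test program behind the slides and not Simics code: it is a made-up model of two virtual processors that take turns running for one time slice each, where taking the lock costs one cycle and the lock is then held for an assumed three cycles. It counts how often the lock changes owner rather than measuring execution time.</p>

```python
def lock_stats(time_slice, total_cycles=10000, hold=3):
    """Return (acquisitions, handoffs) for two CPUs competing for one lock.

    Each CPU runs `time_slice` cycles before the other gets to run, as in a
    temporally decoupled simulator.
    """
    owner = None
    remaining = [0, 0]
    acquisitions = 0
    handoffs = 0
    last_owner = None
    for _round in range(total_cycles // time_slice):
        for cpu in (0, 1):
            for _cycle in range(time_slice):
                if owner == cpu:
                    remaining[cpu] -= 1
                    if remaining[cpu] == 0:
                        owner = None  # release the lock
                elif owner is None:
                    owner = cpu       # take the free lock
                    remaining[cpu] = hold
                    acquisitions += 1
                    if last_owner is not None and last_owner != cpu:
                        handoffs += 1
                    last_owner = cpu
                # otherwise spin: the other CPU holds the lock


    return acquisitions, handoffs


for time_slice in (1, 10, 1000):
    print(time_slice, lock_stats(time_slice))
```

<p>With a time slice of one cycle, every acquisition in this model takes the lock over from the other processor. With a time slice of 1000, each processor recaptures the lock hundreds of times before the other one gets a chance, which is the kind of pattern that guest software could look for.</p>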
<p>The paper authors note a similar aspect in desktop/server x86 VMM detection. They discuss &#8220;performance cliffs&#8221; that appear when doing &#8220;unusual&#8221; things. For example, VmWare is engineered assuming a minimum use of self-modifying code. Performance is much worse if you use it extensively, and this can be used to detect VmWare quite effectively. This effect is quite similar to the time slice effect in embedded virtual platforms.</p>
<p>Hope you enjoyed this fairly long rant. And we have not even begun exhausting the contents of this topic&#8230; Luckily, these discrepancies only very rarely impact the usefulness of virtual platforms, since most software even on an embedded system does not care about detailed timing like this. In the example above, we still see the lock contention, so we know that we are getting an increase in execution time from the lock; we just do not get a complete picture of what it means in absolute terms. We will still find missing locks and overused locks.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/97/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
