<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala &#187; parallelized software</title>
	<atom:link href="http://jakob.engbloms.se/archives/tag/parallelized-software/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Technology: Simulation, Virtualization, Virtual Platforms, Embedded, Multicore and Multiprocessing (by Jakob Engblom)</description>
	<lastBuildDate>Sun, 29 Jan 2012 19:45:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>Parallel SystemC Simulation</title>
		<link>http://jakob.engbloms.se/archives/1327?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1327#comments</comments>
		<pubDate>Fri, 26 Nov 2010 19:08:47 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[virtual platforms]]></category>
		<category><![CDATA[Christopher Schumacher]]></category>
		<category><![CDATA[CODES]]></category>
		<category><![CDATA[ISSS]]></category>
		<category><![CDATA[multicore]]></category>
		<category><![CDATA[parallelized software]]></category>
		<category><![CDATA[Rainer Leupers]]></category>
		<category><![CDATA[SystemC]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1327</guid>
		<description><![CDATA[I just found a recent paper on the topic of parallel simulation of computer systems. Christopher Schumacher et al. published an article at CODES+ISSS in October 2010 titled &#8220;parSC: Synchronous Parallel SystemC Simulation on Multicore Architectures&#8221;. Essentially, parallel SystemC. This is very much a hot topic: for the past few years, everyone has [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears.png"><img class="alignleft size-full wp-image-735" style="margin: 5px 10px;" title="gears" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears.png" alt="" width="56" height="57" /></a>I just found a recent paper on the topic of parallel simulation of computer systems. Christopher Schumacher et al. published an article at <a href="http://www.public.asu.edu/~ashriva6/esweek2010/codesisss2010/">CODES+ISSS in October 2010</a> titled &#8220;<a href="http://doi.acm.org/10.1145/1878961.1879005">parSC: Synchronous Parallel SystemC Simulation on Multicore Architectures</a>&#8221;. Essentially, parallel SystemC.</p>
<p><span id="more-1327"></span></p>
<p>This is very much a hot topic: for the past few years, everyone has been looking for ways to run various forms of simulators in parallel. We had some good discussions on this only last Wednesday at a seminar at KTH where I was giving a presentation on Simics.</p>
<p>The approach taken in this paper is different from what you find being done in tools like Simics (as I briefly discussed at <a href="http://jakob.engbloms.se/archives/1023">MCC 2009</a> and <a href="http://jakob.engbloms.se/archives/246">SiCS Multicore Days 2008</a>). They do not exploit <a href="http://jakob.engbloms.se/archives/97">temporal decoupling</a> or islands with different local time. Instead, they have a single global clock in the entire simulation, and just parallelize the work that is done during each cycle.</p>
<p>The key for this to be beneficial and practical is that the work done per cycle is far greater than the cost to drive the simulation forward &#8220;between&#8221; cycles. In a high-level TLM model where the work per cycle might be as small as a single host instruction (JIT translation of a simple integer instruction from target to host), it is obvious that this approach would not work at all. However, this work explicitly targets clock-cycle-level simulations, where the work per cycle per hardware unit can be very large. The paper discusses actions that take 1000 to 2000 host cycles per step, and at that level of effort, there is definitely some potential for parallel gain.</p>
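<p>To make the single-global-clock idea concrete, here is a minimal sketch in Python. It is not the parSC implementation: the names <code>Counter</code>, <code>run</code> and <code>make_ring</code> are made up for illustration, and Python threads will not give a real speedup because of the interpreter lock. It only shows the structure: every unit is evaluated in parallel within a cycle, all of them are waited for, and only then is the new state published for the next cycle.</p>

```python
from concurrent.futures import ThreadPoolExecutor


class Counter:
    """Hypothetical hardware unit: adds its neighbour's value to its own each cycle."""

    def __init__(self, value):
        self.value = value        # committed state, visible to other units
        self.next_value = value   # private state computed during the current cycle
        self.neighbour = None

    def evaluate(self):
        # Read only committed state from the previous cycle, write only private state.
        self.next_value = self.value + self.neighbour.value

    def commit(self):
        self.value = self.next_value


def run(units, cycles, workers=4):
    """Advance all units in lock step under one global clock."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(cycles):
            # Parallel phase: list() waits for every evaluate() to finish (the barrier).
            list(pool.map(lambda u: u.evaluate(), units))
            # Sequential phase: publish the new state before the next cycle starts.
            for u in units:
                u.commit()


def make_ring(values):
    """Connect the units in a ring so that each one reads its successor."""
    units = [Counter(v) for v in values]
    for i, u in enumerate(units):
        u.neighbour = units[(i + 1) % len(units)]
    return units
```

<p>Because each unit reads only committed values and writes only its own next value, the result does not depend on the number of workers: a ring started as 1, 2, 3 holds 8, 9, 7 after two cycles whether it runs on one thread or four. The barrier per cycle is also where the overhead sits, which is why the work per cycle has to be large.</p>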
<p>What is nice about the approach is that they peg the semantics to a sequential reference, which aids debugging. Due to the very tight synchronization, it would seem to be deterministic, at least on the same host (the SystemC kernel can theoretically behave differently on different hosts).</p>
<p>They do have one example that simulates a shared-memory multiprocessor using temporally decoupled CPU models (100 target cycles per invocation, probably 1000 to 10000 host cycles). This achieves fairly neat speedups on a very symmetric case. However, this comes at the cost of making the simulation nondeterministic, even for the single-threaded case, which is pretty scary.</p>
<p>Overall, an interesting paper showing that there is more to be discovered in parallel simulation.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1327/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Concurrency in Lego Mindstorms NXT</title>
		<link>http://jakob.engbloms.se/archives/1058?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1058#comments</comments>
		<pubDate>Fri, 08 Jan 2010 21:19:54 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[parallel computing]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[AVR]]></category>
		<category><![CDATA[Domain-specific languages]]></category>
		<category><![CDATA[LabView]]></category>
		<category><![CDATA[lego]]></category>
		<category><![CDATA[Mindstorms]]></category>
		<category><![CDATA[NXT]]></category>
		<category><![CDATA[parallelized software]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1058</guid>
		<description><![CDATA[For my parental leave, I have just bought myself a Lego Mindstorms NXT 2.0 kit. It is not much fun for our youngest, who mostly gets a bit scared by a piece of Lego driving around making noises, but I hope to be able to use it to teach my older child (almost five) to [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1057" style="margin: 5px 10px;" title="lego mindstorms nxt2" src="http://jakob.engbloms.se/wp-content/uploads/2010/01/lego-mindstorms-nxt2.png" alt="lego mindstorms nxt2" width="146" height="126" /></p>
<p>For my parental leave, I have just bought myself a <a href="http://www.mindstorms.com">Lego Mindstorms NXT 2.0 kit</a>. It is not much fun for our youngest, who mostly gets a bit scared by a piece of Lego driving around making noises, but I hope to be able to use it to teach my older child (almost five) to program. Let&#8217;s see how that turns out. It looks hard to make the NXT environment provide the kind of <a href="http://www.boardgamegeek.com/boardgame/18/roborally">Roborally</a>-style programming blocks that I had hoped to create, as for some reason I cannot get a sufficiently custom icon onto custom blocks.</p>
<p>It also presented me with an opportunity to try some domain-specific high-level graphical programming. The programming environment provided for the NXT series of Mindstorms kits is based on LabView from National Instruments, and it really does seem to work. It even features parallel tasks, which I tried to use&#8230;</p>
<p><span id="more-1058"></span>It turned out that it is not that easy to get it right. I was trying to have a few different tasks look at different sensors and steer the robot in different directions based on the sensor readings. However, this quickly turned into a literal deadlock: the robot just sat there, doing nothing, after the first time that my ultrasound sensor task tried to steer it. Failure.</p>
<p>The manual is quite silent on the tasking semantics, and there are no (or I have not yet found) shared variables, locks, or message-passing mechanisms to synchronize the tasks.</p>
<p>Looking for an answer on the web, I came across a nice tutorial on tasking:</p>
<ul>
<li><a href="http://www.ortop.org/NXT_Tutorial/tasks.html">http://www.ortop.org/NXT_Tutorial/tasks.html</a></li>
</ul>
<p>It is a video, and it ends with the note that the only reliable way to multitask in the NXT environment is to NOT try to control the same thing from multiple tasks. If I want to listen to multiple sensors, the only way is to create a loop with switching on inputs. I.e., manual polling. So much for using the natural language of tasks to handle naturally concurrent activities. At some point, I might figure out if I can construct it from tasks and data value transfers, but for now, I will just go with the flow and write (or rather draw) some polling loops.</p>
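<p>In textual form, such a polling loop might look like the Python sketch below. The NXT environment is graphical, so this is only an analogy: the function names, the command strings and the 30 cm threshold are all assumptions made for illustration. The point is that one task owns the motors and checks every sensor in a fixed priority order on each pass.</p>

```python
def polling_step(touch_pressed, distance_cm):
    """Decide one steering command from all sensor readings, in priority order."""
    if touch_pressed:
        # Bumper contact takes priority over everything else.
        return "reverse"
    if distance_cm is not None and distance_cm < 30:
        # Hypothetical threshold: an obstacle closer than 30 cm.
        return "turn_left"
    return "forward"


def control_loop(readings):
    """Single task that owns the motors; readings yields (touch, distance) samples."""
    return [polling_step(touch, dist) for touch, dist in readings]
```

<p>Since only this one loop ever issues a motor command, two tasks can never fight over the same motor, which is what caused my deadlock.</p>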
<p>Here is an example from the manual. Note that the top &#8220;beam&#8221; controls motor A, and the bottom controls motors B and C:</p>
<p><img class="aligncenter size-full wp-image-1059" title="mindstorms manual excerpt" src="http://jakob.engbloms.se/wp-content/uploads/2010/01/mindstorms-manual-excerpt.png" alt="mindstorms manual excerpt" width="691" height="448" /></p>
<p>That last part about data wires is intriguing&#8230; but I will need to find some more reference materials.</p>
<p>Anyway, the way to express parallel tasks here is really quite neat. At least as long as they are dealing with completely separate aspects of the control of the robot.</p>
<p>If this had been a more hard-core environment, it would have been fun to put different tasks on the ARM7 and AVR processors that are inside the NXT 2.0 brick. Yes, it is a true multiprocessor, if very limited in capabilities. At <a href="http://mindstorms.lego.com/en-us/whatisnxt/default.aspx">http://mindstorms.lego.com/en-us/whatisnxt/default.aspx</a> you have the specs. The ARM7 has 256 kB of FLASH and 64 kB of RAM, and the AVR has 512 bytes of RAM and 4 kB of FLASH. A nice real embedded machine!</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1058/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How (Not) To Present Parallel Programming Results</title>
		<link>http://jakob.engbloms.se/archives/946?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/946#comments</comments>
		<pubDate>Mon, 05 Oct 2009 13:06:42 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[conferences]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[DAC]]></category>
		<category><![CDATA[DAC 2009]]></category>
		<category><![CDATA[parallelized software]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=946</guid>
		<description><![CDATA[SCDSource ran a short but good article summarizing a few DAC talks that I would have liked to attend. It is mostly about the experience of long-term parallel programming researcher David Bailey in presenting results in the field&#8230; Or more importantly: how not to present results, or how to mislead the audience as to the efficiency of [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-824" title="46daclogo" src="http://jakob.engbloms.se/wp-content/uploads/2009/07/46daclogo.gif" alt="46daclogo" width="81" height="73" /><a href="http://www.scdsource.com/article.php?id=360">SCDSource ran a short but good article</a> summarizing a few DAC talks that I would have liked to attend. It is mostly about the experience of long-term parallel programming researcher David Bailey in presenting results in the field&#8230;</p>
<p><span id="more-946"></span>Or more importantly: how not to present results, or how to mislead the audience as to the efficiency of your approach. <a href="http://crd.lbl.gov/~dhbailey/dhbpapers/twelve-ways.pdf">His old 1991 paper</a> on how to do this is still worth a read, even if some things have changed (64-bit FP is pretty much as fast as 32-bit these days, for example). The fundamentals of parallelism are still pretty much the same.</p>
<p>The results were from a <a href="http://www.dac.com/events/eventdetails.aspx?id=95-32">DAC panel about multicore and EDA</a>. I wonder whether that panel dealt with how to make EDA software itself parallel, or with how to help semiconductor companies help their end-user programmers harness the multicore hardware being designed using EDA tools. That does not seem clear to me.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/946/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>StackOverflow interviews CouchDB</title>
		<link>http://jakob.engbloms.se/archives/830?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/830#comments</comments>
		<pubDate>Tue, 07 Jul 2009 18:29:55 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[desktop software]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[couchDB]]></category>
		<category><![CDATA[Damien Katz]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Jan Lehnard]]></category>
		<category><![CDATA[Jeff Atwood]]></category>
		<category><![CDATA[Joel Spolsky]]></category>
		<category><![CDATA[parallelized software]]></category>
		<category><![CDATA[stackoverflow.com]]></category>
		<category><![CDATA[transactions]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=830</guid>
		<description><![CDATA[Last year, FLOSS Weekly interviewed Jan Lehnardt of the CouchDB project. I put up a blog post on this, noting that it was interesting to see a scalable parallel program written in Erlang, a truly concurrent language. The interview was interesting, but not very deeply technical. Now, almost a year later, the StackOverflow podcast, number 59, [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-237" style="margin: 5px 10px;" title="couchdb" src="http://jakob.engbloms.se/wp-content/uploads/2008/08/couchdb.png" alt="couchdb" width="158" height="96" />Last year, <a href="http://www.twit.tv/floss">FLOSS Weekly</a> interviewed Jan Lehnardt of the CouchDB project. I put up a <a href="http://jakob.engbloms.se/archives/236">blog post</a> on this, noting that it was interesting to see a scalable parallel program written in Erlang, a truly concurrent language. The interview was interesting, but not very deeply technical. Now, almost a year later, <a href="http://blog.stackoverflow.com/category/podcasts/">the StackOverflow podcast</a>, <a href="http://blog.stackoverflow.com/2009/06/podcast-59/">number 59,</a> interviewed the founder of the project, Damien Katz. This interview goes a bit more into the technical details and what CouchDB is good for and what not, as well as some details on the use and performance of Erlang.</p>
<p><span id="more-830"></span>An interesting point made is that the light-weight user-level threading of the virtual machine in Erlang optimizes for massively threaded performance. The key property is that the context for each thread is very small compared to an OS-level application thread (like pthreads, for example), and this means that the context switch cost is dramatically smaller thanks to less cache and TLB contents needing to be swapped in and out. Thus, for lots of threads, Erlang tends to get more work done per time unit, as there is less execution time lost to friction in the memory system. I am not sure you can emulate this in C using a user-level package. The very small initial stack and heap size of the Erlang VM is partially achieved by the very fact that in a VM, you have more insight into and control over when memory allocation happens, and thus you can more easily do stack and heap grow operations in small units.</p>
<p>Another interesting aspect of Erlang as opposed to C/C++ brought out in the interview is how to do error handling. In Erlang, this is part of the language, while in C/C++, writing code to handle all cases (and handle them correctly) quickly gets painful and overwhelming. In Erlang, you instead have a system policy to kill any thread that does something bad and restart it. With that simple strategy imposed on you, the code gets much simpler.</p>
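<p>The kill-and-restart policy can be imitated in a few lines of Python. This is only a sketch of the idea, not Erlang or its supervisor API: <code>supervise</code>, <code>make_divider</code> and the restart limit are names and values I made up for illustration.</p>

```python
def supervise(make_worker, jobs, max_restarts=3):
    """Run jobs through a worker; on any failure, discard it and start a fresh one."""
    worker = make_worker()
    restarts = 0
    results = []
    for job in jobs:
        try:
            results.append(worker(job))
        except Exception:
            # No per-case recovery code: drop the failed worker and its state.
            restarts += 1
            if restarts > max_restarts:
                raise
            worker = make_worker()
            results.append(None)  # the failed job produces no result
    return results, restarts


def make_divider():
    """Hypothetical worker that crashes on bad input (division by zero)."""
    return lambda x: 100 // x
```

<p>The worker itself contains no error handling at all; feeding it 5, 0 and 20 gives the results 20, nothing, and 5, with one restart in between.</p>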
<p><img class="alignright size-full wp-image-300" title="stackoverflowlogo250hq2" src="http://jakob.engbloms.se/wp-content/uploads/2008/10/stackoverflowlogo250hq2.png" alt="stackoverflowlogo250hq2" width="47" height="61" />The podcast also brought up <a href="http://stackoverflow.com/questions/299723/can-i-do-transactions-and-locks-in-couchdb">a StackOverflow question about CouchDB</a> that resulted in a good explanation of the concurrency model (optimistic concurrency on entire documents, and nothing smaller or larger than that). Damien Katz came in with some more insights on transactions and CouchDB, in a discussion on how to solve the classic bank account problem: moving money from one account to another. The &#8220;ACID&#8221; solution is to make sure that the changes to the two accounts are either both done or neither done. The CouchDB solution is to put a record of the account-to-account money transfer (I won&#8217;t use the word &#8220;transaction&#8221; as that is overloaded in this context) in the database, and just go through all records pertaining to a particular account to arrive at its current balance. That does feel more like proper bookkeeping practice, rather than having a single unauditable balance in an account record&#8230;</p>
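<p>As a rough illustration of the transfer-record approach, here is a Python sketch in which plain dictionaries stand in for CouchDB documents. The field names and the <code>balances</code> function are assumptions for illustration; in CouchDB itself the summing would presumably be done by a view rather than by application code.</p>

```python
from collections import defaultdict


def balances(transfers):
    """Derive every account balance by replaying immutable transfer records."""
    totals = defaultdict(int)
    for t in transfers:
        # Each record is written once as a whole, so both sides always agree.
        totals[t["from"]] -= t["amount"]
        totals[t["to"]] += t["amount"]
    return dict(totals)


# Hypothetical transfer documents; "opening" is a made-up source account.
transfers = [
    {"from": "opening", "to": "alice", "amount": 100},
    {"from": "alice", "to": "bob", "amount": 30},
]
```

<p>Moving money is then a single document insert, which fits the whole-document concurrency model, and the balance of an account is always derived from the audit trail rather than stored next to it.</p>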
<p>Overall, well worth the time to listen to.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/830/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parallelism in Action</title>
		<link>http://jakob.engbloms.se/archives/793?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/793#comments</comments>
		<pubDate>Sun, 24 May 2009 12:53:27 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[Embarrassingly Parallel]]></category>
		<category><![CDATA[iPod]]></category>
		<category><![CDATA[Nero]]></category>
		<category><![CDATA[parallelized software]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=793</guid>
		<description><![CDATA[Last year in a blog post on video encoding for the iPod Nano, I complained about the lack of performance on my old Athlon. A bit later, I noted that (obviously) video encoding is a good example of an application that can take advantage of parallelism. Yesterday I put these two topics together in a [...]]]></description>
			<content:encoded><![CDATA[<p><img class="size-full wp-image-125 alignleft" style="margin: 5px;" title="coreshrink1" src="http://jakob.engbloms.se/wp-content/uploads/2008/05/coreshrink1.png" alt="Shrinking cores" width="100" height="100" /></p>
<p>Last year in a blog post on <a href="http://jakob.engbloms.se/archives/28">video encoding for the iPod Nano</a>, I complained about the lack of performance on my old Athlon. A bit later, I noted that (obviously) <a href="http://jakob.engbloms.se/archives/31">video encoding is a good example of an application that can take advantage of parallelism</a>. Yesterday I put these two topics together in a practical test. And it worked nicely enough.</p>
<p><span id="more-793"></span></p>
<p>My new Core i7 920-based machine was very well utilized by the Nero 8 suite&#8217;s Nero Recode 3 application when converting some children&#8217;s movies for use on my Nano. Here is a screenshot of the CPU load at one point in the computation:</p>
<p><img class="aligncenter size-full wp-image-794" title="skarmklipp" src="http://jakob.engbloms.se/wp-content/uploads/2009/05/skarmklipp.png" alt="skarmklipp" width="162" height="139" />It was much higher than this at times, but capturing that using the <a href="http://jakob.engbloms.se/archives/580">snipping tool </a>was harder than expected.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/793/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Few Parallel EDA Tools</title>
		<link>http://jakob.engbloms.se/archives/324?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/324#comments</comments>
		<pubDate>Wed, 29 Oct 2008 12:48:58 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer simulation technology]]></category>
		<category><![CDATA[EDA]]></category>
		<category><![CDATA[multicore computer architecture]]></category>
		<category><![CDATA[multicore software]]></category>
		<category><![CDATA[parallelized software]]></category>
		<category><![CDATA[SPICE]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=324</guid>
		<description><![CDATA[I keep looking out for interesting examples of parallel software, and there is a constant trickle of these. This past week I spotted a couple of new ones in the EDA field: SPICE simulation and chip timing analysis. Mentor Graphics Olympus-SoC Richard Goering at SCDSource has a good write-up of a recent announcement from Mentor Graphics [...]]]></description>
			<content:encoded><![CDATA[<p>I keep looking out for interesting examples of parallel software, and there is a constant trickle of these. This past week I spotted a couple of new ones in the EDA field: SPICE simulation and chip timing analysis.</p>
<p><span id="more-324"></span></p>
<h2>Mentor Graphics Olympus-SoC</h2>
<p>Richard Goering at SCDSource has <a href="http://www.scdsource.com/article.php?id=315">a good write-up of a recent announcement from Mentor Graphics</a> on a parallelized version of the Olympus-SoC tool suite for timing analysis. The best bit is the description of how they found parallelism in what used to be a serial program: they went down to very small components of the overall computation, and did a data-flow analysis to find independent atomic units to compute on in parallel. Here, fine-grained is the key to finding lots of parallelism, while using larger units does not work as well.</p>
<p>Quoting the article:</p>
<blockquote>
<div>“If you don’t work at the atomic level, it is very difficult to come up with tasks that are not dependent on each other,” Srinivas said. “We collect a lot of tasks, and we just keep all the cores busy all the time.” The goal, he said, is “minimal starvation” so that individual CPUs are not starved for tasks.</div>
<div>A key technology that makes this possible is what Mentor calls “pin levelization.” With this approach, each node is assigned a level number. If another node has a higher number, there is a possible dependency. Pins at the same level, however, are independent, and their tasks can be collected together into one heterogeneous chunk.</div>
</blockquote>
<p>Go read the rest of it for nice illustrations and more background.</p>
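<p>My reading of the levelization idea, as a small Python sketch: this is not Mentor&#8217;s algorithm, just the textbook way of assigning levels in an acyclic graph, and the function names and the tiny example netlist are made up for illustration. Each node gets a level one higher than the highest level among the nodes it depends on, and all nodes on the same level can then be handed out as one chunk of independent tasks.</p>

```python
from collections import defaultdict


def levelize(fanin):
    """Map each node to a level: 0 with no inputs, else 1 + max level of its fan-in."""
    levels = {}

    def level_of(node):
        if node not in levels:
            preds = fanin.get(node, [])
            levels[node] = 1 + max(map(level_of, preds)) if preds else 0
        return levels[node]

    for node in fanin:
        level_of(node)
    return levels


def chunks_by_level(levels):
    """Group nodes sharing a level; the nodes within one group are independent."""
    groups = defaultdict(list)
    for node, lvl in levels.items():
        groups[lvl].append(node)
    return [sorted(groups[lvl]) for lvl in sorted(groups)]


# Hypothetical netlist: each node lists the nodes it depends on (assumed acyclic).
netlist = {"a": [], "b": [], "c": ["a", "b"], "d": ["a"], "e": ["c", "d"]}
```

<p>For this netlist the chunks come out as a and b, then c and d, then e. The chunks are processed in level order, and within each chunk the tasks can be spread over all cores.</p>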
<h2>Gemini SPICE Simulator</h2>
<p><a href="http://www.chipdesignmag.com/payne/">Daniel Payne at Chip Design writes about another fast SPICE simulator.</a> Not as much detail here, but very nice graphs from the Gemini marketing folks. Not that they could not have been done in 2D with better information density, though. SPICE simulation would seem to be fairly parallelizable, which is not too surprising, considering the inherent parallelism of the domain. But as always, implementing a program to take advantage of such domain parallelism can be harder than expected if you did not do it from scratch. Which is what the Gemini people did, apparently.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/324/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

