<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Observations from Uppsala &#187; Gary Stringham</title>
	<atom:link href="http://jakob.engbloms.se/archives/tag/gary-stringham/feed" rel="self" type="application/rss+xml" />
	<link>http://jakob.engbloms.se</link>
	<description>Computer Technology: Simulation, Virtualization, Virtual Platforms, Embedded, Multicore and Multiprocessing (by Jakob Engblom)</description>
	<lastBuildDate>Sun, 29 Jan 2012 19:45:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<image>
    <title>Observations from Uppsala</title>
    <url>http://jakob.engbloms.se/favicon.png</url>
    <link>http://jakob.engbloms.se</link>
    <width>32</width>
    <height>32</height>
    <description>Observations from Uppsala - http://jakob.engbloms.se</description>
    </image>		<item>
		<title>Register Design Languages &#8211; DSL or not?</title>
		<link>http://jakob.engbloms.se/archives/1462?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1462#comments</comments>
		<pubDate>Wed, 27 Jul 2011 20:20:25 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[ESL]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Domain-specific languages]]></category>
		<category><![CDATA[Gary Stringham]]></category>
		<category><![CDATA[Register Design Languages]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1462</guid>
		<description><![CDATA[Recently, Gary Stringham has been running a series of interviews with providers of register design tools on his website. Register design tools seem to be an active area with several small companies (and some open-source tools) fighting for the market. I have written about Gary Stringham and register designs before, and it is an area [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears1.png"><img class="alignleft size-full wp-image-737" style="margin: 5px 10px;" title="gears1" src="http://jakob.engbloms.se/wp-content/uploads/2009/04/gears1.png" alt="" width="56" height="57" /></a>Recently, Gary Stringham has been running a series of <a href="http://www.garystringham.com/rdt.shtml">interviews with providers of register design tools on his website</a>. Register design tools seem to be an active area, with several small companies (and some open-source tools) fighting for the market. I have written about Gary Stringham and register designs <a href="http://jakob.engbloms.se/archives/358">before</a>, and it is an area that keeps fascinating me. There is something about the task of register design that keeps it separate from the main hardware design languages, tools, and flows. The different approaches taken by the tools supporting the register design task also illustrate some points about programming language standards, domain-specific languages, and exchange formats that I want to address.</p>
<p><span id="more-1462"></span>First of all, what makes register design such a special task? My guess is that it is a higher-level aspect of the design: it describes an interface, which can be implemented in many different ways in hardware. Regular HDLs do not help you reason about register layouts in a good way. The register specification should also be used in places other than the hardware: in address definitions for device drivers, in manuals, and in other documentation. All of this indicates that the information needs to be explicit, high-level, and in a format that facilitates automatic processing by tools. It basically screams for a domain-specific language (DSL) at some level of ambition (read my old post about how <a href="http://jakob.engbloms.se/archives/747">languages are grown from problems</a> for more background on just how simple a DSL can be).</p>
<p>If you look at the register design tool vendors that Gary interviewed, you can essentially see two different approaches to how to support the task of register design. In particular, the input language differs quite radically between the tools.</p>
<p>The first is the <em>domain-specific programming language</em> approach. A register-design DSL makes it easy to work with register designs, and pretty hard to do anything else (imagine writing a sorting algorithm in a register-design language &#8211; I don&#8217;t think it can be done, and if it can be done, the result has to be absolutely contorted). The DSL approach assumes a willingness on the part of the user to learn a new language, but it seems that for this domain, the languages are simple enough that learning them saves you time in the end, especially if you maintain large register maps that keep changing or are updated and extended across generations of chips (indeed, ease of maintenance is an undersold aspect of most DSLs in my experience).</p>
<p>The underlying assumption of the DSL approach is that the entry language is not too important, as long as you can export data into other formats. To me, this makes sense. From a single source with sufficient descriptive power, you can ideally generate HDL code to implement the decoder in hardware, documentation, header files, and maybe even skeletons for virtual platform models, as well as standard formats like IP-XACT to feed into any other tool.</p>
<p>It does not matter if there is a single tool using each input language, since the outputs are what matters. This leaves the vendors free to invent on the input side, and provide a really powerful tool.</p>
<p>The second approach is <em>transformation-based</em>. The idea is not to use a dedicated input language, but rather to rely on powerful import functions that read whatever specifications already exist in text files, Microsoft Word documents, Microsoft Excel spreadsheets, FrameMaker documentation files, IP-XACT, or whatever you can come up with. The assumption is that the tool is really about transforming data and not about compiling a language. Register designs kind of already exist, and since there is no standard language to compile, you just import whatever there is. This approach assumes that the user base is not willing to learn a new language to handle register design. A funky side-effect is that we might end up actually doing programming work in <a href="http://www.garystringham.com/rdt/AgnisysInterview.shtml">Word docs</a>.</p>
<p>Still, the transformation approach ends up generating the same outputs as the DSL approach. In both cases, some of the outputs can be considered &#8220;industry standards&#8221;, like IP-XACT, while others are decidedly ad-hoc, like Excel sheets. To me, this demonstrates that the true value of standards lies in enabling tool interoperability, and not so much in providing input formats.</p>
<p>Indeed, there is a clear difference between good input languages and good exchange formats. An input language tends to drive for expressive power and human readability, and it is OK to have a heavy compilation process associated with it. An exchange format like IP-XACT does not need to be human-readable, nor efficient to code in. It needs to be easy to parse and compile, so that as many tools as possible can work on it.</p>
<p>A typical example from the field of register design is dealing with repeated groups of registers (such as banks of registers for a DMA channel) and parametrized register designs. The input languages used by the tools that Gary Stringham covers all allow this &#8211; while the standard for register design exchange, IP-XACT, only uses a flat compiled map. It just lists all registers with sizes and offsets, not how these offsets were arrived at or if there is some logical grouping or iterated structure. Quite OK for a tool to work on, but not very useful as a repository for the actual information.</p>
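<p>To make the difference concrete, here is a minimal Python sketch of how a parameterized description of a repeated register group can be expanded into a flat list of names, addresses, and sizes. The function name <code>flatten_group</code>, the DMA register names, the base address, and the stride are all made up for illustration; they are not taken from any of the tools Gary covers, nor from IP-XACT.</p>

```python
def flatten_group(prefix, base, count, stride, registers):
    """Expand `count` copies of a register group into a flat list.

    `registers` holds (name, offset, size_in_bytes) tuples for one copy.
    Returns (name, absolute_address, size_in_bytes) tuples.
    """
    flat = []
    for index in range(count):
        group_base = base + index * stride
        for name, offset, size in registers:
            flat.append((f"{prefix}{index}_{name}", group_base + offset, size))
    return flat


# Hypothetical DMA channel: four 32-bit registers, repeated four times.
DMA_CHANNEL = [("src", 0x0, 4), ("dst", 0x4, 4), ("len", 0x8, 4), ("ctrl", 0xC, 4)]
flat_map = flatten_group("dma", base=0x1000, count=4, stride=0x10, registers=DMA_CHANNEL)

# Show the first few entries of the flat view.
for name, address, size in flat_map[:5]:
    print(f"{name:10s} 0x{address:04X} {size}")
```

<p>The flat list has sixteen entries, and nothing in it records that they came from four identical channels placed 0x10 bytes apart. That lost structure is exactly what a good input language keeps.</p>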
<p>This distinction is worth keeping in mind.</p>
<p>I definitely side with the DSL idea, as that fits my idea of a good tool, but I can see why many people find the transformation from other formats attractive. In all cases, the goal is to have an original design specification that is as clear, succinct, and flexible as possible.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1462/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Neat Register Design to Avoid Races</title>
		<link>http://jakob.engbloms.se/archives/1070?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/1070#comments</comments>
		<pubDate>Thu, 28 Jan 2010 18:59:53 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[embedded software]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[ESL]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[64-bit computing]]></category>
		<category><![CDATA[device driver]]></category>
		<category><![CDATA[Gary Stringham]]></category>
		<category><![CDATA[high-level synthesis]]></category>
		<category><![CDATA[programming register]]></category>
		<category><![CDATA[race condition]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=1070</guid>
		<description><![CDATA[In his most recent Embedded Bridge Newsletter, Gary Stringham describes solutions to a common read-modify-write race-condition hazard on device registers accessed by multiple software units in parallel. Some of the solutions are really neat! I have seen the &#8220;write 1 clears&#8221; solution before in real hardware, but I was not aware of the other [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-589" style="margin: 5px 10px;" title="racecondition" src="http://jakob.engbloms.se/wp-content/uploads/2008/01/racecondition.png" alt="racecondition" width="99" height="78" />In his most recent <a href="http://garystringham.com/newsletter.shtml?nid=039">Embedded Bridge Newsletter</a>, Gary Stringham describes solutions to a common read-modify-write race-condition hazard on device registers accessed by multiple software units in parallel. Some of the solutions are really neat!</p>
<p>I have seen the &#8220;write 1 clears&#8221; solution before in real hardware, but I was not aware of the other two variants. The idea of having a &#8220;write mask&#8221; in one half of a 32-bit word is really clever.</p>
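<p>As a rough illustration of the write-mask idea, here is a small Python model of such a register. It assumes that the upper 16 bits of each 32-bit write select which of the lower 16 bits get updated; the actual layout in Gary&#8217;s newsletter may differ, and the names <code>MaskedRegister</code> and <code>set_field</code> are invented for this sketch.</p>

```python
class MaskedRegister:
    """Toy model of a 16-bit register written through a 32-bit word."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    def write(self, word):
        mask = (word >> 16) & 0xFFFF  # assumed layout: upper half selects bits
        data = word & 0xFFFF          # assumed layout: lower half holds new values
        # Only the bits selected by the mask change; all others keep their state.
        self.value = (self.value & ~mask & 0xFFFF) | (data & mask)


def set_field(reg, shift, width, field_value):
    """Update one field with a single write, without reading the register first."""
    mask = ((1 << width) - 1) << shift
    reg.write((mask << 16) | ((field_value << shift) & mask))


reg = MaskedRegister(0x00F0)
set_field(reg, shift=0, width=4, field_value=0xA)  # one driver updates bits 3..0
set_field(reg, shift=4, width=4, field_value=0x3)  # another updates bits 7..4
```

<p>Since each update is a single write that carries its own mask, there is no read-modify-write window in which another software unit can sneak in a conflicting change.</p>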
<p>However, this got me thinking about what the fundamental issue here really is.</p>
<p><span id="more-1070"></span></p>
<p>As I see it, it is the fact that the processor cannot address small enough units atomically. The <a href="http://garystringham.com/newsletter.shtml?nid=037">read-modify-write that was used to start the discussion in the Embedded Bridge #37</a> was needed in order to get the current state of a configuration register, change some setting that only occupied a few bits in it, and write back the result to the register. This is how most configuration registers that I have seen in practice work.</p>
<p>But if each setting could be given its own register, the problem would go away. Each operation would target a unique address, achieving the same effect as the bit-wise masks or write-1 solutions proposed. The core problem is that hardware tends to pack several settings into each register, as it has been considered too expensive to put information that might cover a range as small as [0,1] into a 32-bit register. The likely reason is a shortage of register addresses: you cannot let a simple device with 1000 settings use up 1000 words of physical address space.</p>
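<p>The following Python sketch spells out the difference by hand-interleaving two drivers. It is only a model: the bit positions and the per-setting addresses are hypothetical, and real drivers would of course run concurrently rather than in a scripted order.</p>

```python
# Shared register: two drivers each do read, modify, write on the same word.
shared = 0b0000
a_copy = shared            # driver A reads
b_copy = shared            # driver B reads before A has written back
shared = a_copy | 0b0001   # A sets bit 0 and writes back
shared = b_copy | 0b0100   # B sets bit 2 and overwrites A's update
lost_update = shared       # bit 0 has been lost

# One address per setting: each write touches only its own location.
settings = {0x00: 0, 0x04: 0, 0x08: 0}  # hypothetical per-setting addresses
settings[0x00] = 1         # A writes its setting, no read needed
settings[0x08] = 1         # B writes its setting, no read needed

print(f"shared register after both drivers: {lost_update:04b}")
print(f"per-setting registers: {settings}")
```

<p>With one address per setting, the order in which the two writes arrive no longer matters, since neither write carries a stale copy of the other driver&#8217;s bits.</p>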
<p>But is that really an issue, if we look forward?</p>
<p>It seems to me that, as 64-bit instruction sets and addressing systems penetrate down into more and more embedded systems, a simple solution would be to throw address space at the problem. I don&#8217;t think it is uneconomical to allocate huge chunks of memory space to each device, giving each setting its own register, when you have 64-bit virtual addresses to work with. There is no way you can fill up a physical memory system (I guess that will some day come back to haunt me)&#8230; even the highest-end machines today only use something like 40 bits for actually addressing physical memories.</p>
<p>The software would be simpler and more robust, at virtually no cost.</p>
<p>Another solution that I have also seen starting to appear is to dispense with register settings altogether, and instead define a command API that the processor &#8220;calls&#8221; by putting command packets into some memory area. This does require quite a bit of silicon for a decoder, but it provides for a much higher level of interaction with devices. As hardware devices get defined in successively higher-level languages (C, C++, UML, MATLAB, &#8230;), and <a href="http://jakob.engbloms.se/archives/871">their programming interfaces and associated drivers get autogenerated</a>, this solution makes eminent sense.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/1070/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Hardware-Software Interface is where the Action Is</title>
		<link>http://jakob.engbloms.se/archives/799?&#038;owa_medium=feed&#038;owa_sid=</link>
		<comments>http://jakob.engbloms.se/archives/799#comments</comments>
		<pubDate>Sun, 07 Jun 2009 19:52:47 +0000</pubDate>
		<dc:creator>Jakob</dc:creator>
				<category><![CDATA[computer architecture]]></category>
		<category><![CDATA[embedded systeme]]></category>
		<category><![CDATA[Brian Cantrill]]></category>
		<category><![CDATA[Gary Stringham]]></category>
		<category><![CDATA[hardware design]]></category>
		<category><![CDATA[hardware-software interface]]></category>
		<category><![CDATA[Keith Adams]]></category>
		<category><![CDATA[Steve Gibson]]></category>

		<guid isPermaLink="false">http://jakob.engbloms.se/?p=799</guid>
		<description><![CDATA[When I started out doing computer science &#8220;for real&#8221; way back, the emphasis and a lot of the fun was in the basics of algorithms, optimizing code, getting complex trees and sorts and hashes right and efficient. It was very much about computing defined as processor and memory (with maybe a bit of disk or [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-800" title="pn4_quad-gigaswift-utp-adapter" src="http://jakob.engbloms.se/wp-content/uploads/2009/06/pn4_quad-gigaswift-utp-adapter.gif" alt="pn4_quad-gigaswift-utp-adapter" width="100" height="73" />When I started out doing computer science &#8220;for real&#8221; way back, the emphasis and a lot of the fun was in the basics of algorithms, optimizing code, getting complex trees and sorts and hashes right and efficient. It was very much about computing defined as processor and memory (with maybe a bit of disk or printing or user interface accessed at a very high level, and providing the data for the interesting stuff). However, as time has gone on, I have come to feel that this is almost too clean, too easy to abstract&#8230; and I have gone back to where I started on my first home computer, programming close to the metal.</p>
<p><span id="more-799"></span>As I dig deeper into operating systems and the hardware-software interface layer (mostly with the help of virtual platforms), I have come to appreciate just how hard and interesting that part of the computing stack is. I guess it is partially because that is the level where most of the nice thick layers of middleware and API software we use these days (and which, to be frank, I find fairly boring) just break down and have to start dealing with the real world. For some reason, web servers and their programming feel barren and boring compared to dealing with interrupts, memory maps, and bit twiddling.</p>
<p>Several things I have read and heard about recently touch on this subject in various ways. All of them point to the fact that hardware-software interface design is important, and that there are many right and wrong ways of doing it&#8230; which are rarely taught in universities and rarely approached in computing literature.</p>
<p>First, <a href="http://blogs.sun.com/bmc/entry/concurrency_s_shysters">Bryan Cantrill of Sun wrote a blog post blasting transactional memory</a> in November of 2008, which I recently reread, and I got a bit of an epiphany from this paragraph:</p>
<blockquote><p>&#8230; Even if one assumes that writing a transaction is conceptually easier than acquiring a lock, and even if one further assumes that transaction-based pathologies like livelock are easier on the brain than lock-based pathologies like deadlock, there remains a fatal flaw with transactional memory: much system software can never be in a transaction <strong>because it does not merely operate on memory</strong>. That is, system software frequently takes action outside of its own memory, requesting services from software or hardware operating on a disjoint memory (the operating system kernel, an I/O device, a hypervisor, firmware, another process &#8212; or any of these on a remote machine). In much system software, the in-memory state that corresponds to these services is protected by a lock &#8212; and the manipulation of such state will never be representable in a transaction. So for me at least, transactional memory is an unacceptable solution to a non-problem.</p></blockquote>
<p>In the same style, <a href="http://x86vmm.blogspot.com/2008/11/cantrill-and-bonwick-get-all-concurrent.html">Keith Adams at VMware</a> picked up on the above and applied it to the microkernel idea:</p>
<blockquote><p>It&#8217;s interesting to me that, as with microkernels, one of the principle reasons TM will fail is the messy, messy reality of peripheral devices. One of the claims made by microkernel proponents is that, since microkernel drivers are &#8220;just user-level processes&#8221;, they&#8217;ll survive driver failures. And this is almost true, for some definition of &#8220;survive.&#8221; Suppose you&#8217;re a microkernel, and you restart a failed user-level driver; the new driver instance has no way of knowing what state the borked-out driver left the actual, physical hardware in. Sometimes, a blind reset procedure can safely be carried out, but sometimes it can&#8217;t. Also, the devices being driven are DMA masters, so they might very well have done something horrible to the kernel even though the buggy driver was &#8220;just a user-level app.&#8221; And if there were I/Os in flight at failure time, have they happened, or not? Remember, they might not be idempotent&#8230; I&#8217;m not saying that some best-effort way of dealing with many of these problems is impossible, just that it&#8217;s unclear that moving the driver into userspace has helped the situation at all.</p></blockquote>
<p>So what this shows is that the hardware-software interface is where the really hard and interesting problems start to pop up. I am a big fan of abstraction and layers of indirection as programming methodologies; I am not a <a href="http://www.grc.com">Steve Gibson</a> who feels that programs are best written in assembly&#8230; but the abstractions do have to allow for the truth that is underneath the system. Bad abstractions or too simple abstractions make things more complex, rather than less.</p>
<p>Moving on from the software side of things to the hardware design side, <a href="http://www.garystringham.com/newsletter.shtml">Gary Stringham is running a nice series of tips for hardware design</a>. Here, too, there are lots of interesting issues to confront in making hardware easy or worthwhile to use. He recently ran a link to a <a href="http://www.microsoft.com/whdc/resources/MVP/xtremeMVP_hw.mspx">2004 Microsoft article on how hardware should be designed</a>, based on the experience of the Windows driver team at Microsoft.</p>
<blockquote><p>If every hardware engineer just understood that write-only registers make debugging almost impossible, our job would be a lot easier. Many products are designed with registers that can be written, but not read. This makes the hardware design easier, but it means there is no way to snapshot the current state of the hardware, or do a debug dump of the registers, or do read-modify-write operations. Now that virtually all hardware design is done in Verilog or VHDL, it takes only a tiny bit of additional effort to make the registers readable.</p>
<p>Another typical hardware trick is registers that automatically clear themselves when written. Although this is sometimes useful, it also makes debugging difficult when overused.</p></blockquote>
<p>I guess it is kind of sad that even five years later, these same issues still seem to crop up in new products and merit volumes of venom from driver developers&#8230; On the other hand, some companies do seem to be getting it. To me, the Freescale designs of recent years seem to be fairly easy to configure and debug, and do not feature write-only bits in any large number.</p>
<p>The article about <a href="http://jakob.engbloms.se/archives/770">hardware acceleration for TCP/IP by Mike Odell</a> that I discussed in a previous blog post is also relevant: when does the complexity of hardware interfacing negate any performance benefit from an accelerator?</p>
<p>(<em>for some reason, the initial posting of this post had an incomplete last paragraph, something weird in WordPress updates happened</em>)</p>
<p>To sum up, I think the interaction of hardware and software in the context of full operating systems and device driver stacks is a really interesting topic that seems not to have gotten very much academic coverage. I hope to be able to help remedy some of this, once I get the Simics setup used in my <a href="http://jakob.engbloms.se/archives/709">experiments with hardware accelerators</a> packaged and available for academia. Full-system virtual platforms make for a very good experimental system, especially those where you use some third-party or standard operating system rather than just your own controlled code.</p>
]]></content:encoded>
			<wfw:commentRss>http://jakob.engbloms.se/archives/799/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

