Observations from Uppsala Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)

SiCS Multicore Days: The Debate Points

2008 September 19 22:14 / 7 Comments / Jakob

It has been a week now, and sometimes it is good to let impressions sink in and get processed a bit before writing about an event like the SiCS Multicore Days. Overall, the event was serious fun: I found the speakers very insightful, and the panel discussion and audience questions added even more information.

What was quite striking this year was the greater difference of opinion between the speakers. I guess that in 2007, most of the discussion was on the level of “ouch, here comes multicore and what are we going to do about it”. This year, we got a bit deeper: with one more year of experience and massive research work, the collective world of multicore has made some progress and gained insights. And that’s when the differences start to show up; the fact that we have differences of opinion tells us that we are starting to dig into details and turning up different answers due to different viewpoints and user experiences.

So where were the differences this time?

  • Heterogeneous vs homogeneous cores (on a single chip). Kunle Olukotun clearly supported the heterogeneous style (which is what you get with Sun’s Niagara, which he designed the basis for). Erik Hagersten was more interested in the difference between thin and fat cores of the same basic ISA, and Anant Agarwal was strongly in favor of completely homogeneous systems (which is what they build at Tilera). In my biased view, the pure energy-efficiency argument for heterogeneous is always going to prevail. See some of my previous blog posts on this topic for background:
    • DNS Hardware Acceleration.
    • Interview with Kunle Olukotun at the Register.
    • Homogeneous vs heterogenous.
    • Homogeneous vs heterogeneous, continued.
    • IBM Z6 accelerators.
    • Montalvo and heterogeneous x86.
  • Domain-specific vs general-purpose programming languages. The same sides here, with Kunle advocating domain-specific languages, and Anant and David Padua more in the general-purpose camp. I like domain-specific better; it seems to agree more with what I see people actually doing today to increase programming productivity overall.
  • Memory bottleneck or not? The most interesting discussion came when memory bandwidth and cache sizes were discussed. One quite common school of thought over the past few years teaches that caches per core will shrink, and that the bandwidth to get data into and out of a chip is going to be a severe restriction on what can be done. Not all in the panel agreed with this. There was the idea (mostly from Kunle) that in some way the massive bandwidths and low latencies achievable within a chip (compared to between chips in a classic discrete-processor multiprocessor) could make this less of a problem. Personally, I think this is going to be some kind of problem, but maybe not as big a one as feared, since passing data around faster might reduce the need to store it temporarily. Despite the need for more bandwidth, nobody really agreed with Erik’s thought that maybe it makes sense to build chips that do not max out on the number of cores they contain, but rather try to balance core count with achievable IO bandwidth. That idea has some merit.
  • Core counts. Moore’s law tells us there are going to be thousands of cores on a chip fairly soon… but if we do not manage to make good use of them, maybe the growth in core counts will slow soon. Putting four or six or eight cores into a general-purpose system makes sense today, but more than that might turn out to be a waste for the vast majority of users, who do not have problems to solve and programs to run that can make use of more than that. In the same sense, maybe it is better to have slightly fewer, more powerful cores than a maximum number of minimalistic cores, considering the state of software available today. So it sounds like a fairly divergent future here.
  • Shared memory or local memories? Most of the panel seemed to be in the camp proposing that shared memory is too convenient not to have, even when it really is bad for you. There were several bad jokes comparing shared memory to alcohol, with the moderator of the panel suggesting that a good way to avoid the hangover of shared memory is to stay drunk… whatever that means in practice.
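
The core-count and bandwidth points above can be made a bit more concrete with a back-of-the-envelope sketch. The Python below is not something presented at the panel; it assumes Amdahl's law for the software side and a fixed off-chip bandwidth demand per core for the hardware side. The function names and all the numbers (64 cores, 90% parallel code, 32 GB/s of chip bandwidth, 2 GB/s per core) are made up for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law for a program whose
    parallelizable share of the work is `parallel_fraction`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def bandwidth_limited_cores(chip_bandwidth_gbs: float,
                            per_core_demand_gbs: float) -> int:
    """Largest number of cores the off-chip bandwidth can keep fed,
    assuming (hypothetically) that every core needs the same bandwidth."""
    return max(1, int(chip_bandwidth_gbs // per_core_demand_gbs))


def useful_cores(cores_on_chip: int, parallel_fraction: float,
                 chip_bandwidth_gbs: float, per_core_demand_gbs: float):
    """Return (cores that can actually be kept busy, resulting speedup)."""
    fed = min(cores_on_chip,
              bandwidth_limited_cores(chip_bandwidth_gbs, per_core_demand_gbs))
    return fed, amdahl_speedup(parallel_fraction, fed)


if __name__ == "__main__":
    # Illustrative numbers only: a 64-core chip, 90% parallel code,
    # 32 GB/s off-chip bandwidth, 2 GB/s demanded by each core.
    fed, speedup = useful_cores(64, 0.9, 32.0, 2.0)
    print(f"cores kept fed: {fed}, speedup: {speedup:.1f}x")
```

With these made-up numbers only 16 of the 64 cores can be kept fed, and the speedup comes out at about 6.4x. Under such assumptions, balancing core count against achievable bandwidth costs little, and the serial part of the software limits the gain long before the core count does.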

Some things were generally agreed upon, though.

  • Programming is an issue, shared-memory or local-memory or whatever. The ideas for the solution varied, however, as discussed above.
  • Cores will be plentiful, and operating systems that focus on sharing time on a single very valuable core are an idea of the past. The keyword for the future is spatial sharing and reducing the overhead of management (I have some previous blog posts on this topic, especially on the subject of IMA and real-time control when cores are free).
  • Virtualization and isolating partitions of a multicore chip from each other are necessary mechanisms. Running multiple different operating systems on a single chip will be quite normal, probably under the control of some global hypervisor.

Any comments on this from my small audience? I think the topics under discussion are quite fascinating and the kind of issues on which the success of major chip design projects will be decided. A good architecture with a good programming model has a great chance of success (as long as it looks like a continuation of something existing :) ).

Posted in: conferences, multicore computer architecture, multicore software, security / Tagged: conference, heterogeneous, homogeneous, memory bandwidth, multicore, panel discussion, SiCS Multicore days, software tools

7 Thoughts on “SiCS Multicore Days: The Debate Points”

  1. Niklas Ekström on 2008 September 21 at 13:37 said:

    Re Homogeneous vs heterogeneous architecture. It seems that you are saying that given a certain piece of software it is possible to assemble custom hardware that runs that software more energy efficiently than some general purpose hardware. I guess this is hard to argue with.

    In this interview, http://arstechnica.com/articles/paedia/gpu-sweeney-interview.ars/, Tim Sweeney seems to argue that homogeneous architectures are better because “…it could dramatically simplify the toolset and the processes for creating software.”, and that heterogeneous ones are worse because “…a lot of the complexity is unnecessary and makes load-balancing more difficult.”

    Perhaps it would be possible to compile a list of arguments for and against homogeneous/heterogeneous architectures (hopefully arguments that everyone can agree on), and then use those arguments to reason about what architecture is better for running different sets of software.

  2. Jakob on 2008 September 21 at 21:44 said:

    Thanks for the link!

    I can see his point, but it is equally important to recognize why GPUs are not the same as CPUs today — if it was as simple as simple programming trumping raw power, the GPU would be dead. But even today’s fairly general GPUs are orders of magnitude more efficient than general-purpose processors at churning through their target loads. And nothing is going to change that.

    As I see it, an important facet of programming is that form follows function — a good program should be designed after the environment it is going to work in and the manipulations and computations it is supposed to achieve. For graphics, this would mean that program structure is still quite domain-driven, which can be exploited by domain-specific architectures. Not being domain-specific is really going to make the hardware quite inefficient.

    Just look at how much more efficiently a multithreaded architecture cuts through web servers compared to single-threaded processors. A “general purpose” computer is a swiss army knife: decent at a lot of things, great at nothing. When you need true greatness, you specialize the architecture to suit the domain.

    Also, as soon as power and efficiency per chip size become a real issue, heterogeneity becomes much more attractive. That is not really the case on a PC: even laptops have quite generous power budgets of 60 to 90W, and I have heard of desktops reaching 900W today. On battery power, specialization really helps.

  3. Andras Vajda on 2008 September 29 at 20:49 said:

    Hi Jakob,
    I’m glad to see you liked the panel debate.
    On the issue of heterogeneous vs homogeneous: there’s one thing that is often overlooked, the issue of cost vs benefit when it comes to chip design. Designing a new chip costs roughly the same, irrespective of its nature (at least there isn’t an order of magnitude difference), but the more specialized a chip is, the smaller the market it can address. Hence, I would argue that more generic chips with good interconnects, memory architecture etc. will have greater economic viability.
    I’ll soon post an article on this to my blog as well, http://www.a-vajda.eu/blog
    Cheers,
    Andras (the one who moderated the panel)

  4. Jakob on 2008 September 29 at 21:39 said:

    Andras, that is a good point, but I think it overlooks the running costs of the chip. If you can get an order of magnitude better power efficiency, the cost might be worth it. The alternative is quite often that you cannot keep up at all. No CPU can do routing or graphics displaying or pattern matching quite as fast as specialized hardware.

    Quite often the cost of designing-in an existing accelerator appears quite small for a particular chip design. Most chips are mostly regroupings of existing IP, not brand new from-scratch development.

    But it is a sliding scale of economics, and weighing the chip design cost against run-time energy efficiency, processing latencies, and total system cost is not an easy tradeoff. I think that a general system will always be less efficient, and the question is really when that lack of efficiency makes it inapplicable and therefore cuts off a part of the market that a more specialized chip can address. There is a reason that Intel chips are rare in embedded: 100W is simply a bit much to stomach for many systems just to get a few processor cores.

    Too bad that question never got asked in the panel in that direct a way.

  5. Andras on 2008 September 30 at 16:20 said:

    Jakob, I’ve just posted on my blog a reasoning around the economics:
    http://a-vajda.eu/blog/?p=39

  6. Pingback: Observations from Uppsala » Cadence Industry Insight: “Virtual Platforms Unite HW and SW”

  7. Pingback: Observations from Uppsala » SiCS Multicore Day 2009

© Copyright 2013 - Observations from Uppsala