• About Jakob Engblom and this blog
Observations from Uppsala Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)

SICS Multicore Day August 31

2007 September 2 21:13 / 10 Comments / Jakob

The SICS Multicore Day August 31 was a really great event! We had some fantastic speakers presenting the latest industry research view on multicores and how to program them. Marc Tremblay did the first presentation in Europe of Sun’s upcoming Rock processor. Tim Mattson from Intel tried hard to provoke the crowd, and Vijay Saraswat of IBM presented their X10 language. Erik Hagersten from Uppsala University provided a short scene-setting talk about how multicore is becoming the norm.


The Rock is a very interesting piece of work. It tries to be both a throughput-oriented design like the Niagara/Ultrasparc T machines, and a single-thread high-performance design. Even though on balance, it is more skewed towards the throughput computing aspect. What is very cool is how they use additional threads to help boost the performance of a main thread using “scout threads” (a concept I saw presented back at ISCA 2004). This makes it possible to use threads to either boost single-thread performance OR do throughput, creating a more flexible design than is usually the case. It is also the first commercial implementation of transactional memory. And 16-way. And due for next year.

So far, Rock seems like a very successful and very visionary project that is trying in yet another way to gain momentum by pure hardware innovation. Just like the UltraSparc T line, Sun is trying to out-invent IBM and Intel/AMD. Who seem to be mostly progressing by just piling on more of the same old features. I really hope this play goes well, if we were down to just IBM/PPC & System Z and Intel-AMD/x86-64 on the server and desktop side, the world would just be too boring.

The Intel and IBM talks on programming were both grounded in the idea that to make people accept a new programming language/API, it has to be an evolution of what the programmers already know. Which pretty much ties us down to C/C++/Java/C# with extensions and modified semantics.

X10 is basically Java with some nicely considered features to support local and global memories and programs that can scale to BlueGene-style massively clustered machines. Tim basically tells everyone to stop inventing new languages and focus on improving existing frameworks like MPI and OpenMP in collaboration with industry. Presented in a very funny style, Tim is a great presenter, and tries hard to get the audience to react. In this crowd, most people agreed. Except the Erlang people, who feel that they do have a better solution to multithreading and multicore than any patched-up language in the C-Java family. I must agree with them, and I do feel that Erlang today is mature enough to serve that purpose.

The panel session at the end was very entertaining, where some people (including myself and Joe Armstrong) tried to ask tough questions to the keynote speakers (and Ulf Wiger of Ericsson). Quite engaging and a rare chance to directly engage with some industry heavyweights who otherwise tend to sit on the other side of the Atlantic.

I think the prize for coolest tech of the day goes to QuviQ, a spin-off from Chalmers doing automated testing tools that really work well for parallel and distributed systems. Their method of minimizing the trace of a failed test case is really interesting, and finds things that no human tester would ever find.

I also presented a talk on “Debugging Multicore Software using Virtual Hardware”, in the breakout sessions. I guess our Tools track was the least visited of the three tracks, but the audience asked some good questions. And there were some good discussions afterwards.

However, to summarize the day, I am a bit disappointed that not more is being done on the hardware side to help people debug their multicore and multiprocessor parallel programs. Transactional memory is all nice and dandy and can help simplify low-level locking primitives for threaded programs. But I would like to see much more in terms of smart tracing, hardware breakpoints and triggers, massive synchronized stops, and similar features. And instructions and features that make parallel expressions simpler. Here, the embedded folks doing things like ARM CoreSight seems to have been much more successful than the server-class designers at Sun, Intel, and IBM. But even ARM do not spend more than 10-15% of the chip area on debug support.

I think it would be interesting to see what would happen if you could spend 25-30% of the chip on some seriously powerful debug features. Full support for remote control of all cores at the same time, lots of bandwidth for debug data and commands, and fat traces of all traffic on and off the chip. Performance and event counters everywhere. That would make the peak performance of chip likely less than a competing chip not spending as much space on debug support — but it would make achieving a high utilization much easier, and that might actually make the debug-intense chip more economical. Would be interesting to try. But I guess nobody would dare to buy such a design.

Tweet
Posted in: appearances, conferences, embedded software, embedded systeme, multicore computer architecture, multicore debug, multicore software, parallel computing, uncategorized / Tagged: AMD, Erlang, Hardware debug support, IBM, Intel, Joe Armstrong, Niagara, QuviQ, SiCS Multicore days, Sun, transactional memory, UltraSPARC

10 Thoughts on “SICS Multicore Day August 31”

  1. Jakob on 2007 September 5 at 21:17 said:

    Also see my later follow-up post with more on programming parallel machines: http://jakob.engbloms.se/archives/20

  2. JoachimS on 2007 September 25 at 10:44 said:

    Aloha!

    En sak man behöver fundera *ordentligt* på är hur du ser till att skydda dina debugfunktioner så att dom inte blir en fin dörr in för den som försöker plocka ut information, styra systemet på ett icke-applikations-planerat sätt etc.

    Det finns flera attacker mot ex kryptoimplementationer i chip där nyckeln extraherats via scan-kedjor över JTAG.

    Att det behövs bra stöd för debug och inte minst för profilering när SoC ökar i komplexitet är dock helt klart. Hur skall man ex kunna utveckla applikationer som utnyttjar ens delar av den tillgängliga prestandan i ett multicore-chip om det inte går att få reda på hur applikationsmönster ger upphov till cache-trashing, accesskrockar etc?

  3. Jakob on 2007 September 25 at 10:48 said:

    Mycket bra poäng. Den “lätta” lösningen är varianten där det finns ett vanligt chip som används i produkter, och sedan ett annat chip med antingen extra pinnar eller till och med extra kisel pÃ¥ som används under utvecklingen. SÃ¥ ser ICE-systemen för Tricore och V850 ut, t.ex. Men det tar bort poängen med att debugga i det system man faktiskt skeppar. Lurigt.

  4. Pingback: Observations from Uppsala » Blog Archive » FTF Paris: Debug connections threat to secure network devices

  5. Jakob on 2008 March 16 at 22:58 said:

    Note that in http://jakob.engbloms.se/archives/87 I have another panel summary, from DATE 2008. Quite different ideas on the same topic, from a very different viewpoint.

  6. Pingback: Observations from Uppsala » Blog Archive » DATE 2008 Panel on Multicore Programming

  7. Pingback: Observations from Uppsala » Blog Archive » SiCS Multicore Days 2008: Talk about Threading Simics

  8. Pingback: Observations from Uppsala » Blog Archive » The JVM as Universal Parallel Glue?

  9. Pingback: Observations from Uppsala » Blog Archive » What is Efficiency when Cores are Free?

  10. Pingback: Observations from Uppsala » SiCS Multicore Day 2012

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Post Navigation

← Previous Post
Next Post →

Recent Posts

  • Wind River Blog: Simics 4.8 is Here
  • A Few Electrons too Many
  • Wind River Blog: Visuality NQ CIFS Server on Simics
  • Everything in the Cloud?
  • Wind River Blog: TCF and Simics
  • Off-Topic: Moving Bad Piggies Save Games
  • Two Cores, Four Cores, Eight Cores – Mobile Variety
  • Bliss: Failing to Pivot for Ideology
  • Wind River Blog and Movie: Demo of Simics Debugging
  • Simulation vs Reality in Schlock Mercenary
  • Programming like Lego
  • Does ISA Matter for Performance?
  • Wind River Blog: Debugging Simics using Simics
  • Wind River Blog: Simics and Flying Piggies
  • Dragons can be Useful – when AT Models Make Sense

Categories

  • appearances (30)
  • articles (21)
  • blogging (10)
  • books (6)
  • business issues (31)
  • computer architecture (35)
  • conferences (34)
  • EDA (50)
    • ESL (35)
  • embedded (78)
    • embedded software (57)
    • embedded systeme (50)
  • general research (6)
  • history (32)
    • general history (7)
    • history of computing (26)
  • off-topic (94)
    • biking (5)
    • board games (1)
    • computer games (3)
    • desktop software (35)
    • food and drink (1)
    • funny (12)
    • gadgets (24)
    • Politics (3)
    • popular culture (5)
    • trains (5)
    • transportation (10)
    • travel (10)
    • websites (3)
  • parallel computing (92)
    • multicore computer architecture (51)
    • multicore debug (22)
    • multicore software (65)
  • programming (107)
  • review (8)
  • security (19)
  • teaching (7)
  • testing (9)
  • uncategorized (12)
  • virtual things (129)
    • computer simulation technology (68)
    • virtual machines (17)
    • virtual platforms (98)
    • virtualization (14)
  • Wind River Blog (40)

Tags

ARM blog commentary Cadence Checkpointing clock-cycle models Communications of the ACM computer architecture conference cycle accuracy debugging DML Domain-specific languages embedded freescale G900 heterogeneous homogeneous IBM Intel iPod lego linux mobile phones multicore off-topic office 2007 operating systems p4080 podcast commentary power architecture rant research reverse debugging reverse execution S4D SiCS Multicore days Simics simulation software tools Sun SystemC video virtualization Vista Windows

1

  • F-Secure Blog

Blogs and news

  • Andras Vajda's blog (on multicore)
  • Embedded in Academia (John Regehr)
  • Grant Martin
  • Jack Ganssle
  • My Wind River Blog
  • Security Now podcast
  • Secworks (Joachim Strömbergson)
  • Simon Kågström
  • Synopsys View from the Top
  • Worse Than Failure

Archives

  • May 2013 (2)
  • April 2013 (1)
  • March 2013 (4)
  • February 2013 (1)
  • January 2013 (3)
  • December 2012 (2)
  • November 2012 (2)
  • October 2012 (1)
  • September 2012 (6)
  • August 2012 (4)
  • July 2012 (4)
  • June 2012 (3)
  • May 2012 (4)
  • April 2012 (2)
  • March 2012 (3)
  • February 2012 (1)
  • January 2012 (6)
  • December 2011 (2)
  • November 2011 (3)
  • October 2011 (4)
  • September 2011 (5)
  • August 2011 (4)
  • July 2011 (3)
  • June 2011 (4)
  • May 2011 (7)
  • April 2011 (1)
  • March 2011 (3)
  • February 2011 (5)
  • January 2011 (1)
  • December 2010 (4)
  • November 2010 (3)
  • October 2010 (5)
  • September 2010 (5)
  • August 2010 (5)
  • July 2010 (6)
  • June 2010 (5)
  • May 2010 (3)
  • April 2010 (4)
  • March 2010 (3)
  • February 2010 (4)
  • January 2010 (7)
  • December 2009 (6)
  • November 2009 (6)
  • October 2009 (7)
  • September 2009 (6)
  • August 2009 (7)
  • July 2009 (11)
  • June 2009 (5)
  • May 2009 (10)
  • April 2009 (7)
  • March 2009 (8)
  • February 2009 (9)
  • January 2009 (12)
  • December 2008 (8)
  • November 2008 (9)
  • October 2008 (9)
  • September 2008 (10)
  • August 2008 (13)
  • July 2008 (12)
  • June 2008 (8)
  • May 2008 (9)
  • April 2008 (10)
  • March 2008 (7)
  • February 2008 (8)
  • January 2008 (5)
  • December 2007 (5)
  • November 2007 (7)
  • October 2007 (7)
  • September 2007 (12)
  • August 2007 (9)
  • July 2007 (2)
© Copyright 2013 - Observations from Uppsala
Infinity Theme by DesignCoral / WordPress