• About Jakob Engblom and this blog
Observations from Uppsala Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)

Memory Models: x86 is TSO, TSO is Good

2011 June 22 17:16 / 1 Comment / Jakob

By chance, I got to attend a day at the UPMARC Summer School with a very enjoyable talk by Francesco Zappa Nardelli from INRIA. He described his work (along with others) on understanding and modeling multiprocessor memory models. It is a very complex subject, but he managed to explain it very well.

He showed a very interesting discussion from a few years ago on the x86 memory model and the implementation of spinlocks in the Linux kernel. Various experts went back and forth over whether the final MOV that sets a lock variable to 1 needed to be prefixed by LOCK or not. The discussion ended when Linus Torvalds said “I know that it is needed”. Only to see an Intel architect finally intervene and say “you know, really, it isn’t needed”. This was followed by a series of releases of Intel manuals documenting the x86 memory model, with increasing precision in each release. Intel also actually changed the published rules along the road, withdrawing some optimizations as they realized that they would break existing software.

Note that such a description of a memory model must both describe existing hardware, and serve as the guideline for future hardware. Therefore, there are optimizations that are not implemented today but which are possible given the rules. Such optimization opportunities can be removed from the rulebook as long as they have never been part of shipping hardware, so it is not as crazy as it might sound.

Anyway, the point that Francesco made was both to tell an interesting story from history, and making the point that describing and understanding memory models is hard. I certainly agree with that. I recall an ISCA many years ago when some computer architecture professors all agreed that very few people really understand consistency and weak memory models.

To make life easier for programmers, Francesco and Peter Sewell (in Cambridge) has defined their own set of rules for x86 memory consistency. This is not an architecture spec, but a rule set for regular programmers. It is found at http://www.cl.cam.ac.uk/~pes20/weakmemory/. Essentially, the conclusion is that x86 in practice implements the old SPARC TSO memory model.

They have also attempted to formalize the Power Architecture memory model. Both the actual memory model and their model of it can only be described as very complex. The programmer’s model is expressed in terms of store queues, speculative instruction execution, and commits of instructions. Not something you easily keep in your head. It is interesting to note that ARM MPCore essentially copied the Power Architecture.

He showed an interactive simulation of the Power memory model, and the way that you need to think about it in terms of propagating information between threads and committing them. It is possible to propagate values and then another propagation overrides a value before the thread commits… Fun. Or a headache.

The big take-away from the talk for me is that it confirms the observation made may times before that SPARC TSO seems to be the optimal memory model. It is sufficiently understandable that programmers can write correct code without having barriers everywhere. It is sufficiently weak that you can build fast hardware implementation that can scale to big machines.

Maybe TSO does not theoretically scale in the same insane way as Power or Alpha does/did. But the cost of that theoretical scalability is that programmers might have to litter their code with sync operations just to get it to run correctly. With too many sync operations, the code will run very slowly negating any advantage on the hardware level. Note that sync operations can be very expensive. Doug Lea, in the audience, pointed out that a sync can cost up to 300 cycles on a POWER5.

Tweet
Posted in: computer simulation technology, conferences, multicore computer architecture, multicore software, parallel computing / Tagged: ARM, Doug Lea, Francesco Zappa Nardelli, memory consistency, power architecture, SPARC, UpMarc, x86

One Thought on “Memory Models: x86 is TSO, TSO is Good”

  1. Pingback: Weak vs. Strong Memory Models

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Post Navigation

← Previous Post
Next Post →

Recent Posts

  • A Few Electrons too Many
  • Wind River Blog: Visuality NQ CIFS Server on Simics
  • Everything in the Cloud?
  • Wind River Blog: TCF and Simics
  • Off-Topic: Moving Bad Piggies Save Games
  • Two Cores, Four Cores, Eight Cores – Mobile Variety
  • Bliss: Failing to Pivot for Ideology
  • Wind River Blog and Movie: Demo of Simics Debugging
  • Simulation vs Reality in Schlock Mercenary
  • Programming like Lego
  • Does ISA Matter for Performance?
  • Wind River Blog: Debugging Simics using Simics
  • Wind River Blog: Simics and Flying Piggies
  • Dragons can be Useful – when AT Models Make Sense
  • Logging (Some More Thoughts)

Categories

  • appearances (30)
  • articles (21)
  • blogging (10)
  • books (6)
  • business issues (31)
  • computer architecture (35)
  • conferences (34)
  • EDA (50)
    • ESL (35)
  • embedded (78)
    • embedded software (57)
    • embedded systeme (50)
  • general research (6)
  • history (32)
    • general history (7)
    • history of computing (26)
  • off-topic (94)
    • biking (5)
    • board games (1)
    • computer games (3)
    • desktop software (35)
    • food and drink (1)
    • funny (12)
    • gadgets (24)
    • Politics (3)
    • popular culture (5)
    • trains (5)
    • transportation (10)
    • travel (10)
    • websites (3)
  • parallel computing (92)
    • multicore computer architecture (51)
    • multicore debug (22)
    • multicore software (65)
  • programming (107)
  • review (8)
  • security (19)
  • teaching (7)
  • testing (9)
  • uncategorized (12)
  • virtual things (128)
    • computer simulation technology (68)
    • virtual machines (17)
    • virtual platforms (97)
    • virtualization (14)
  • Wind River Blog (39)

Tags

ARM blog commentary Cadence Checkpointing clock-cycle models Communications of the ACM computer architecture conference cycle accuracy debugging DML Domain-specific languages embedded freescale G900 heterogeneous homogeneous IBM Intel iPod lego linux mobile phones multicore off-topic office 2007 operating systems p4080 podcast commentary power architecture rant research reverse debugging reverse execution S4D SiCS Multicore days Simics simulation software tools Sun SystemC video virtualization Vista Windows

1

  • F-Secure Blog

Blogs and news

  • Andras Vajda's blog (on multicore)
  • Embedded in Academia (John Regehr)
  • Grant Martin
  • Jack Ganssle
  • My Wind River Blog
  • Security Now podcast
  • Secworks (Joachim Strömbergson)
  • Simon Kågström
  • Synopsys View from the Top
  • Worse Than Failure

Archives

  • May 2013 (1)
  • April 2013 (1)
  • March 2013 (4)
  • February 2013 (1)
  • January 2013 (3)
  • December 2012 (2)
  • November 2012 (2)
  • October 2012 (1)
  • September 2012 (6)
  • August 2012 (4)
  • July 2012 (4)
  • June 2012 (3)
  • May 2012 (4)
  • April 2012 (2)
  • March 2012 (3)
  • February 2012 (1)
  • January 2012 (6)
  • December 2011 (2)
  • November 2011 (3)
  • October 2011 (4)
  • September 2011 (5)
  • August 2011 (4)
  • July 2011 (3)
  • June 2011 (4)
  • May 2011 (7)
  • April 2011 (1)
  • March 2011 (3)
  • February 2011 (5)
  • January 2011 (1)
  • December 2010 (4)
  • November 2010 (3)
  • October 2010 (5)
  • September 2010 (5)
  • August 2010 (5)
  • July 2010 (6)
  • June 2010 (5)
  • May 2010 (3)
  • April 2010 (4)
  • March 2010 (3)
  • February 2010 (4)
  • January 2010 (7)
  • December 2009 (6)
  • November 2009 (6)
  • October 2009 (7)
  • September 2009 (6)
  • August 2009 (7)
  • July 2009 (11)
  • June 2009 (5)
  • May 2009 (10)
  • April 2009 (7)
  • March 2009 (8)
  • February 2009 (9)
  • January 2009 (12)
  • December 2008 (8)
  • November 2008 (9)
  • October 2008 (9)
  • September 2008 (10)
  • August 2008 (13)
  • July 2008 (12)
  • June 2008 (8)
  • May 2008 (9)
  • April 2008 (10)
  • March 2008 (7)
  • February 2008 (8)
  • January 2008 (5)
  • December 2007 (5)
  • November 2007 (7)
  • October 2007 (7)
  • September 2007 (12)
  • August 2007 (9)
  • July 2007 (2)
© Copyright 2013 - Observations from Uppsala
Infinity Theme by DesignCoral / WordPress