• About Jakob Engblom and this blog
Observations from Uppsala Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)

Some Fun Cache Results from Carbon

2012 July 6 20:40 / 2 Comments / Jakob

Carbon Design Systems have been on a veritable blogging spree recently, pushing out a large number of posts around various topics. Maybe a bit brief for my taste in most cases (I have a tendency to throw out 1000+ word pseudo-articles when I take the time to write a blog), but sometimes very interesting nevertheless. I particularly liked a few posts on cache analysis, as they presented some good insight into not-quite-expected processor and cache behaviors.

The posts in questions are Cortex-A9 Cache Optimization part 2 and Cortex-A9 Cache Optimization part 3. They are not really that much about Cortex-A9, as the most interesting results seem to come from old ARM11 cores…. but the Cortex-A9 certainly shines in comparison. What I like about these kinds of small experiments and write-ups of them is when you discover something unexpected that you did not foresee at a higher level of abstraction.

In one case, we see behaviors at the hardware level (RTL) that would not normally be expected if looking at a cache system from a designer perspective. In the part 2 blog, the ARM11 core suffers extra cache misses when an L2 cache is attached to it – but not activated. The attached L2 changes some subtle core timing, causing some other core mechanisms to change their observed behavior, leading to more cache misses. Normally, you would assume that the result would be totally neutral. On the other hand, why build an L2 cache unless you intend to use it? Still, interesting observation.

It was also surprising to me that with the same size cache, the Cortex-A9 has fewer cache misses than the ARM11 on the same code. I wonder what lies behind that? It would have been nice with some deeper analysis, my guess is that it is something to do with the replacement policy in use. Or some little tweak like a prefetcher or victim cache or something.

Part 3 shows the well-known fact that an L2 cache is necessary to clean out the access stream to main memory, as well as indicating that with an L2, the L1 does not affect the main memory accesses much. The interaction between L2 and branch prediction was surprising, though. Once again, more explanations would have been in order.

But I can see the fun in playing around with these models and observing behaviors that would otherwise go unnoticed simply due to the difficulty of measuring such information on hardware.

Tweet
Posted in: computer architecture, computer simulation technology, EDA, ESL / Tagged: cache, Carbon, cycle accuracy, performance optimization

2 Thoughts on “Some Fun Cache Results from Carbon”

  1. zero_energy on 2012 July 10 at 10:41 said:

    Having the ultimate powerful design that will scale up to N cores and well into petascale computing is not a trivial task. Imagine a mobile phone with Tflps computing power in your hand? What kind of apps will be using all that power? How about a Pfls mobile device? Again what could such a device be used for?

  2. zero_energy on 2012 July 13 at 15:42 said:

    And once this hand held petascale computer is available … somebody will drop it on the pavement and destroy it. Somebody will hit it with a stone and shatter.

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Post Navigation

← Previous Post
Next Post →

Recent Posts

  • Wind River Blog: Simics 4.8 is Here
  • A Few Electrons too Many
  • Wind River Blog: Visuality NQ CIFS Server on Simics
  • Everything in the Cloud?
  • Wind River Blog: TCF and Simics
  • Off-Topic: Moving Bad Piggies Save Games
  • Two Cores, Four Cores, Eight Cores – Mobile Variety
  • Bliss: Failing to Pivot for Ideology
  • Wind River Blog and Movie: Demo of Simics Debugging
  • Simulation vs Reality in Schlock Mercenary
  • Programming like Lego
  • Does ISA Matter for Performance?
  • Wind River Blog: Debugging Simics using Simics
  • Wind River Blog: Simics and Flying Piggies
  • Dragons can be Useful – when AT Models Make Sense

Categories

  • appearances (30)
  • articles (21)
  • blogging (10)
  • books (6)
  • business issues (31)
  • computer architecture (35)
  • conferences (34)
  • EDA (50)
    • ESL (35)
  • embedded (78)
    • embedded software (57)
    • embedded systeme (50)
  • general research (6)
  • history (32)
    • general history (7)
    • history of computing (26)
  • off-topic (94)
    • biking (5)
    • board games (1)
    • computer games (3)
    • desktop software (35)
    • food and drink (1)
    • funny (12)
    • gadgets (24)
    • Politics (3)
    • popular culture (5)
    • trains (5)
    • transportation (10)
    • travel (10)
    • websites (3)
  • parallel computing (92)
    • multicore computer architecture (51)
    • multicore debug (22)
    • multicore software (65)
  • programming (107)
  • review (8)
  • security (19)
  • teaching (7)
  • testing (9)
  • uncategorized (12)
  • virtual things (129)
    • computer simulation technology (68)
    • virtual machines (17)
    • virtual platforms (98)
    • virtualization (14)
  • Wind River Blog (40)

Tags

ARM blog commentary Cadence Checkpointing clock-cycle models Communications of the ACM computer architecture conference cycle accuracy debugging DML Domain-specific languages embedded freescale G900 heterogeneous homogeneous IBM Intel iPod lego linux mobile phones multicore off-topic office 2007 operating systems p4080 podcast commentary power architecture rant research reverse debugging reverse execution S4D SiCS Multicore days Simics simulation software tools Sun SystemC video virtualization Vista Windows

1

  • F-Secure Blog

Blogs and news

  • Andras Vajda's blog (on multicore)
  • Embedded in Academia (John Regehr)
  • Grant Martin
  • Jack Ganssle
  • My Wind River Blog
  • Security Now podcast
  • Secworks (Joachim Strömbergson)
  • Simon Kågström
  • Synopsys View from the Top
  • Worse Than Failure

Archives

  • May 2013 (2)
  • April 2013 (1)
  • March 2013 (4)
  • February 2013 (1)
  • January 2013 (3)
  • December 2012 (2)
  • November 2012 (2)
  • October 2012 (1)
  • September 2012 (6)
  • August 2012 (4)
  • July 2012 (4)
  • June 2012 (3)
  • May 2012 (4)
  • April 2012 (2)
  • March 2012 (3)
  • February 2012 (1)
  • January 2012 (6)
  • December 2011 (2)
  • November 2011 (3)
  • October 2011 (4)
  • September 2011 (5)
  • August 2011 (4)
  • July 2011 (3)
  • June 2011 (4)
  • May 2011 (7)
  • April 2011 (1)
  • March 2011 (3)
  • February 2011 (5)
  • January 2011 (1)
  • December 2010 (4)
  • November 2010 (3)
  • October 2010 (5)
  • September 2010 (5)
  • August 2010 (5)
  • July 2010 (6)
  • June 2010 (5)
  • May 2010 (3)
  • April 2010 (4)
  • March 2010 (3)
  • February 2010 (4)
  • January 2010 (7)
  • December 2009 (6)
  • November 2009 (6)
  • October 2009 (7)
  • September 2009 (6)
  • August 2009 (7)
  • July 2009 (11)
  • June 2009 (5)
  • May 2009 (10)
  • April 2009 (7)
  • March 2009 (8)
  • February 2009 (9)
  • January 2009 (12)
  • December 2008 (8)
  • November 2008 (9)
  • October 2008 (9)
  • September 2008 (10)
  • August 2008 (13)
  • July 2008 (12)
  • June 2008 (8)
  • May 2008 (9)
  • April 2008 (10)
  • March 2008 (7)
  • February 2008 (8)
  • January 2008 (5)
  • December 2007 (5)
  • November 2007 (7)
  • October 2007 (7)
  • September 2007 (12)
  • August 2007 (9)
  • July 2007 (2)
© Copyright 2013 - Observations from Uppsala
Infinity Theme by DesignCoral / WordPress