• About Jakob Engblom and this blog
Observations from Uppsala Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)

Nvidia “Kal-El” Variable SMP

2011 September 23 21:16 / 2 Comments / Jakob

Nvidia recently announced that their already-known “Kal-El” quad-core ARM Cortex-A9 SoC actually contains five processor cores, not just four as a “normal” quad-core would. They call the architecture “Variable SMP”, and it is a pretty smart design. The one where you think, “I should have thought of that”, which is the best sign of something truly good.

It is common practice in multicore computing today to dynamically change the clock frequency of a processor and turn cores on and off in order to adjust the compute power available to the current workload. Such operations tend to be limited in scope, as processors have minimum clock frequencies that make sense, and often the memory system requires all cores to be at the same frequency. Operating systems also tend to want to work with homogeneous sets of cores, as that makes scheduling reasonably straight-forward. This is probably what has kept the idea of “small + large” cores of the same ISA out of the mainstream of SMP design, despite all its advantages in principle.

Now, Nvidia has managed to implement some of that idea in Kal-El.

The key observation is that if you can turn cores on and off, once you get down to a single active core, any system is by definition homogeneous across all cores regardless of what that core is. Changing the nature of this core should then be much easier, since there is only a single core to contend with.

What Nvidia does in Kal-El is to add a fifth low-power core to the main group of four high-performance cores. The fifth core is architecturally identical (ARM Cortex-A9), so that the system state can be moved from the high-performance to the low-performance cores without undue complexities. Indeed, this is all done in hardware, so the OS (typically, Android) thinks it is running on a homogeneous quad-core. When the system is lightly loaded and the OS decides to only have a single core on, the hardware can detect the load is really light, and effectively change the nature of the active core to a low-power-optimized version.

Once more compute power is needed, the hardware invisible slips back to the first high-power core, and then the OS can start increasing clocks and turning on cores as usual. It is effectively the same as a regular ARM Cortex-A9 quad-core setup, but with better low-power performance. The following graph from the Nvidia white paper shows it pretty clearly (red text is my added comment):

Note the slope of the green line: that core is not a good one if you want high performance. It is optimized to scale within a range of low compute-power requirements, rather than provide the best performance per watt at the high end. Using Variable SMP, Nvidia lets us have both.

Neat.

More reading:

  • ArsTechnica has a short summary
  • There does not seem to be much more right now, everyone is really just reiterating the points from the white paper.

 

Tweet
Posted in: computer architecture, multicore computer architecture / Tagged: ARM, heterogeneous, Kal-El, Nvidia

2 Thoughts on “Nvidia “Kal-El” Variable SMP”

  1. John Regehr on 2011 September 23 at 21:35 said:

    I think another reason we haven’t seen this idea in mainstream machines is that it’s a policy nightmare. On the phone/tab things are easier: screen off -> use the slow core; screen on -> use one or more fast cores.

  2. Jakob on 2011 September 24 at 10:03 said:

    Excellent point. Yes, on a desktop is would be harder to determine when a “background processing” mode has been reached. You won’t see 24-hour bulk processing jobs run on a phone.

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Post Navigation

← Previous Post
Next Post →

Recent Posts

  • Military Science Fiction – The Books Blur Together
  • Wind River Blog: Starting & Configuring Simics
  • Wind River Blog:
  • Nudge Theory and Graphical User Interfaces
  • Wind River Blog: Collaborating with Recording Checkpoints
  • Wind River Blog: Simics 4.8 is Here
  • A Few Electrons too Many
  • Wind River Blog: Visuality NQ CIFS Server on Simics
  • Everything in the Cloud?
  • Wind River Blog: TCF and Simics
  • Off-Topic: Moving Bad Piggies Save Games
  • Two Cores, Four Cores, Eight Cores – Mobile Variety
  • Bliss: Failing to Pivot for Ideology
  • Wind River Blog and Movie: Demo of Simics Debugging
  • Simulation vs Reality in Schlock Mercenary

Categories

  • appearances (30)
  • articles (21)
  • blogging (10)
  • books (7)
  • business issues (31)
  • computer architecture (35)
  • conferences (34)
  • EDA (50)
    • ESL (35)
  • embedded (78)
    • embedded software (57)
    • embedded systeme (50)
  • general research (6)
  • history (32)
    • general history (7)
    • history of computing (26)
  • off-topic (94)
    • biking (5)
    • board games (1)
    • computer games (3)
    • desktop software (35)
    • food and drink (1)
    • funny (12)
    • gadgets (24)
    • Politics (3)
    • popular culture (5)
    • trains (5)
    • transportation (10)
    • travel (10)
    • websites (3)
  • parallel computing (92)
    • multicore computer architecture (51)
    • multicore debug (22)
    • multicore software (65)
  • programming (109)
  • review (8)
  • security (19)
  • teaching (7)
  • testing (9)
  • uncategorized (12)
  • virtual things (131)
    • computer simulation technology (68)
    • virtual machines (18)
    • virtual platforms (99)
    • virtualization (14)
  • Wind River Blog (43)

Tags

ARM blog commentary Cadence Checkpointing clock-cycle models Communications of the ACM computer architecture conference cycle accuracy debugging Domain-specific languages eclipse embedded freescale G900 heterogeneous homogeneous IBM Intel iPod lego linux mobile phones multicore off-topic office 2007 operating systems p4080 podcast commentary power architecture rant research reverse debugging reverse execution S4D SiCS Multicore days Simics simulation software tools Sun SystemC video virtualization Vista Windows

1

  • F-Secure Blog

Blogs and news

  • Andras Vajda's blog (on multicore)
  • Embedded in Academia (John Regehr)
  • Grant Martin
  • Jack Ganssle
  • My Wind River Blog
  • Security Now podcast
  • Secworks (Joachim Strömbergson)
  • Simon Kågström
  • Synopsys View from the Top
  • Worse Than Failure

Archives

  • June 2013 (3)
  • May 2013 (4)
  • April 2013 (1)
  • March 2013 (4)
  • February 2013 (1)
  • January 2013 (3)
  • December 2012 (2)
  • November 2012 (2)
  • October 2012 (1)
  • September 2012 (6)
  • August 2012 (4)
  • July 2012 (4)
  • June 2012 (3)
  • May 2012 (4)
  • April 2012 (2)
  • March 2012 (3)
  • February 2012 (1)
  • January 2012 (6)
  • December 2011 (2)
  • November 2011 (3)
  • October 2011 (4)
  • September 2011 (5)
  • August 2011 (4)
  • July 2011 (3)
  • June 2011 (4)
  • May 2011 (7)
  • April 2011 (1)
  • March 2011 (3)
  • February 2011 (5)
  • January 2011 (1)
  • December 2010 (4)
  • November 2010 (3)
  • October 2010 (5)
  • September 2010 (5)
  • August 2010 (5)
  • July 2010 (6)
  • June 2010 (5)
  • May 2010 (3)
  • April 2010 (4)
  • March 2010 (3)
  • February 2010 (4)
  • January 2010 (7)
  • December 2009 (6)
  • November 2009 (6)
  • October 2009 (7)
  • September 2009 (6)
  • August 2009 (7)
  • July 2009 (11)
  • June 2009 (5)
  • May 2009 (10)
  • April 2009 (7)
  • March 2009 (8)
  • February 2009 (9)
  • January 2009 (12)
  • December 2008 (8)
  • November 2008 (9)
  • October 2008 (9)
  • September 2008 (10)
  • August 2008 (13)
  • July 2008 (12)
  • June 2008 (8)
  • May 2008 (9)
  • April 2008 (10)
  • March 2008 (7)
  • February 2008 (8)
  • January 2008 (5)
  • December 2007 (5)
  • November 2007 (7)
  • October 2007 (7)
  • September 2007 (12)
  • August 2007 (9)
  • July 2007 (2)
© Copyright 2013 - Observations from Uppsala
Infinity Theme by DesignCoral / WordPress