Observations from Uppsala Computer Simulation, Virtual Platforms, Embedded Programming, Multicore and More (by Jakob Engblom)

Two Cores, Four Cores, Eight Cores – Mobile Variety

2013 March 3 22:26 / Jakob

Probably thanks to the yearly Mobile World Congress, there has been a slew of recent announcements of mobile application processors. Everything is ARM-based, but the chips show quite some variety in the CPU core configurations used. Indeed, I think this variety has something to say about the general state of multicore.

My starting point is the marketing hype surrounding core counts (something we saw in PCs a few years ago). An “octacore” device sure sounds powerful, right? Beats an old dualcore any day, right? To some extent this is just hype, but it also shows that there is a need for more performance in phones and tablets. I have also been looking for a new TV lately, and was quite flabbergasted to see several brands touting dual-core processors as a key feature of their offerings. Processor specs for selling a dumb thing like a TV? Things are weird. Sure, Tim Cook of Apple blasted this as being stupid marketing “because they cannot provide a great experience”, but the truth of the matter is that when vendors compete within a single software ecosystem, this is what they compete on. The experience of Android or Windows is mostly the same across devices, so differentiation has to come from the hardware, and core counts are easy to understand. Just like horsepower in cars.

But the key question is: can we really use more cores than two or four in any meaningful way?

Looked at objectively, there are four ways to improve performance of the processor portion of an SoC:

  • More cores
  • Better microarchitecture
  • Higher clock frequency
  • Allow the device to go hotter and use more power

In ARM land, there is clearly room for all four, while in Intel PCs, it is hard to make more than minor improvements in anything except core count. So we are left with a more interesting competitive space than most. Better microarchitecture is an arena where Qualcomm and Apple have shown that you can do better than ARM's standard cores.

Some smart and brave souls at ST Ericsson (one of whom I met many years ago and have great respect for) put out an interesting whitepaper on the quad-core mania, making the case that their new very-high-clock-frequency dual Cortex-A9 is more useful than a lower-clocked quad-core Cortex-A9. Basically, the software typically does not make good use of multiple cores, while driving the clock frequency higher will immediately accelerate the software that people really care about. When it comes to reducing the latency of operations, higher clocks tend to be hard to beat. Still, some marketing person at ST Ericsson decided that quad is the word of the day, and dubbed their Novathor L8580 2.5 GHz dual-core an “eQuad”, for something equivalent to a quad. Silly. Had the core been a bit more modern, I think this could have been a real winner. For most real applications, this should be much, much faster than the last-generation 1.0 to 1.5 GHz Cortex-A9 dual-cores or quad-cores.
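A rough way to put numbers on the dual-versus-quad argument is Amdahl's law. The sketch below is only an illustration, not anything taken from the ST Ericsson whitepaper: it assumes identical cores whose speed scales linearly with clock frequency, and the parallel fractions are hypothetical values picked to show the trend. The function name `relative_speed` is made up for this example.

```python
def relative_speed(cores: int, clock_ghz: float, parallel_fraction: float) -> float:
    """Amdahl-style throughput estimate in arbitrary units.

    Assumes identical cores whose speed scales linearly with clock
    frequency, and ignores memory, thermal, and scheduling effects.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial = 1.0 - parallel_fraction
    # Time per unit of work at 1 GHz: the serial part runs on one core,
    # the parallel part is split evenly across all cores.
    time_at_1ghz = serial + parallel_fraction / cores
    return clock_ghz / time_at_1ghz


if __name__ == "__main__":
    # Hypothetical parallel fractions; real workloads must be measured.
    for p in (0.0, 0.5, 0.8, 0.95):
        dual = relative_speed(2, 2.5, p)
        quad = relative_speed(4, 1.5, p)
        print(f"p={p:.2f}  dual@2.5GHz={dual:.2f}  quad@1.5GHz={quad:.2f}")
```

Under these simplified assumptions, a 1.5 GHz quad only pulls ahead of a 2.5 GHz dual once more than 8/9 (roughly 89%) of the work is parallel, which is a high bar for typical client software.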

Another interesting new release was the Renesas Mobile MP6530. It is a chip targeting midrange smartphones, using an ARM bigLITTLE setup with two Cortex-A15 and two Cortex-A7 cores in a single core complex. What I found interesting here was that for the first time, I saw a bigLITTLE setup described as activating both types of cores at once (look at the diagram on the press release page). Until now, I have only seen designs that switch between only using A7 cores and only using A15s. It was just a matter of time before software matured to this point (the OS scheduler needs to understand that it is dealing with different types of cores, and behave accordingly), but it is still good to see it happening. I expect the same idea to be used on other mixed-core setups; it would be strange if it was Renesas-specific.

This setup makes eminent sense to me. Having lived with multicore PCs for several years now, it is clear to me that having more than one core is useful for any multitasking environment. There is enough work simmering around even in a lightly-used system that using two cores clearly reduces latencies and provides a smoother experience with less locking-up of the UI. As long as nothing heavy is running, you can use the two power-efficient A7s and get a nice long battery life. What is interesting is what happens when a heavy task with high demands on latency appears. I think a good example is a web browser doing a rendering pass or running a JavaScript application, or a game with heavy AI (which seems to be pretty serial in general). In this case, to get a snappy user experience you can kick in a powerful core at high frequency and get the job done quickly. Perfect, just activate one of the A15 cores and keep the rest of the background noise running on the A7s. If something really heavy comes along, you activate the second A15 core too.
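The policy described above can be sketched as a toy placement function. This is purely illustrative and not how any real OS scheduler works: the `Task` type, the `demand` estimate (1.0 meaning one fully busy A7), and the `heavy_threshold` cut-off are all hypothetical, and a real scheduler would also have to handle migration, priorities, and power states.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    demand: float  # hypothetical load estimate, 1.0 = one fully busy A7


def place(tasks, n_little=2, n_big=2, heavy_threshold=0.8):
    """Toy placement policy: heavy tasks go to big cores, the rest to little ones.

    Returns a dict mapping core names to lists of task names. Only big
    cores that receive a task appear in the result, modelling the idea
    that unused A15s stay powered down.
    """
    placement = {f"A7-{i}": [] for i in range(n_little)}
    heavy = sorted((t for t in tasks if t.demand >= heavy_threshold),
                   key=lambda t: t.demand, reverse=True)
    light = [t for t in tasks if t.demand < heavy_threshold]

    # Wake big cores one at a time, sharing them if heavy tasks outnumber cores.
    for i, task in enumerate(heavy):
        placement.setdefault(f"A15-{i % n_big}", []).append(task.name)

    # Spread background work round-robin over the little cores.
    for i, task in enumerate(light):
        placement[f"A7-{i % n_little}"].append(task.name)
    return placement
```

With two light background tasks and one heavy JavaScript task, for example, the result uses both A7s and a single A15; a second heavy task would bring the second A15 into the mapping.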

So far, so good. But what about the new octacores sporting four Cortex-A15 cores alongside four Cortex-A7 cores? To me, this seems strange. Assuming that we are able to use any combination of cores, this setup would only make sense if four A7s could somehow do a better job than a single A15 while using less power. I heard some numbers indicating that the magic difference is about a factor of 3 in performance… so in that case, just what are all these A7s there for? It would make more sense with a hexacore device: 2 A7s for background noise, and then you turn on 1 to 4 A15 cores to process progressively heavier software tasks. I have no doubt that we can imagine and invent software that can make good use of any number of powerful cores, but where is the case for a large number of weak cores in a client machine (it makes perfect sense on a server)?
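To see how demanding the "four A7s versus one A15" comparison is, one can compute the parallel fraction a workload would need before a group of little cores matches a single big core on raw performance. The sketch below uses Amdahl's law and takes the factor of 3 mentioned above as a given; it says nothing about power, for which I have no numbers. The function name is made up for this example.

```python
def min_parallel_fraction(n_little: int, big_to_little_ratio: float):
    """Smallest parallel fraction at which n little cores match one big core.

    Uses Amdahl's law: n little cores give a speedup of
    1 / ((1 - p) + p / n) over a single little core. Returns None if
    even a perfectly parallel workload cannot reach the big core's speed.
    """
    if n_little < 1 or big_to_little_ratio <= 0:
        raise ValueError("need at least one core and a positive ratio")
    if n_little == 1:
        return 0.0 if big_to_little_ratio <= 1 else None
    # Solve 1 / ((1 - p) + p / n) = ratio for p.
    p = (1 - 1 / big_to_little_ratio) / (1 - 1 / n_little)
    if p > 1.0:
        return None
    return max(p, 0.0)
```

With a ratio of 3, four A7s need a workload that is at least 8/9 (about 89%) parallel just to break even with one A15, and two A7s can never get there.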

Considering the existence of a reference platform from ARM with two A15s and three A7s, you can clearly mix the numbers. No need for the same number of A7s and A15s. Maybe this will change as the use of truly heterogeneous multiprocessing spreads within the ARM ecosystem, while right now the safe bet is a quad + quad setup where you switch from one cluster to another. I have no real data and can only speculate.

What I would like to see done is to run long-term CPU profiling on an all-cores-active-all-the-time bigLITTLE platform and see just how much use you get out of each type of core. Unfortunately, I do not have such a platform right now, and maybe nothing of the sort will be on the market for quite some time.
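As a sketch of what such profiling could look like on a Linux-based system, the code below computes per-core busy fractions from two snapshots of `/proc/stat`-style text. It assumes the standard per-core line layout (`cpuN user nice system idle iowait irq softirq steal ...`); which core numbers correspond to A7s and which to A15s is platform-specific and not something this code can know. The function names are made up for this example.

```python
def parse_proc_stat(text):
    """Return {cpu_id: (busy_ticks, total_ticks)} from /proc/stat-style text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only per-core lines such as "cpu0 ..."; skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        cpu_id = int(fields[0][3:])
        # Use at most the first eight counters (user .. steal);
        # guest time is already accounted for inside user time.
        ticks = [int(v) for v in fields[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
        total = sum(ticks)
        counters[cpu_id] = (total - idle, total)
    return counters


def utilization(before, after):
    """Busy fraction per core between two snapshots from parse_proc_stat.

    Cores missing from the earlier snapshot (for example, ones that were
    offline at the time) are skipped rather than guessed at.
    """
    result = {}
    for cpu_id, (busy1, total1) in after.items():
        if cpu_id not in before:
            continue
        busy0, total0 = before[cpu_id]
        elapsed = total1 - total0
        result[cpu_id] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result
```

Reading `/proc/stat` at regular intervals over days of normal use and feeding consecutive snapshots to `utilization` would give a per-core usage history that could then be grouped by core type.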

In conclusion, I think there is some merit to bumping up mobile processors to at least quad, and there is definitely potential in mixing fast and slow cores. Making good use of eight cores seems a bit unlikely though (unless you switch between two separate clusters). One wonders if that silicon area could not have been used for something else instead.

Posted in: computer architecture, multicore computer architecture, multicore software / Tagged: ARM, bigLITTLE, Cortex-A15, Cortex-A7, Cortex-A9, eQuad, mobile, MP6530, Renesas, ST Ericsson

© Copyright 2013 - Observations from Uppsala