Grant Martin is a nice fellow from Tensilica who has a blog at ChipDesignMag. In a recent post, he raises the question of nomenclature and taxonomy for multicore processor designs:
…the discussion, and the need to constantly define our terms (and redefine them, and discuss them when people disagree) makes me wish that the world of electronics, system and software design had some agreement on what the right terms are and what they mean…
I think this is a good idea, but we need to keep the core count out of it…
The reason for the confusion of terms, and the strong urge to keep coining new ones, is really that people feel there is a real difference between a dual-core x86 processor used in a laptop and a highly integrated 100-core-or-more embedded design for traffic processing in a large switch. For that reason, they want a term that defines them out of the mainstream desktop/server space with its few large cores.
But the number of cores is probably the least useful parameter to use as a differentiator. If 4 cores is multicore and 32 cores is manycore today, in a few years' time the decrease in feature width will have moved 32 cores into multi and 128 cores into many, and so on. So that is really something that is bound to change over time.
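The drift of any fixed threshold can be made concrete with a small sketch. Assume (purely for illustration; the growth rate and function names are mine, not from the post) that the transistor budget roughly doubles each process generation and that the extra budget goes into more identical cores:

```python
# Illustrative assumption: core count doubles per process generation.
def cores_over_generations(start_cores, generations, growth=2):
    """Core count at each generation, assuming each new node doubles it."""
    return [start_cores * growth**g for g in range(generations)]

counts = cores_over_generations(4, 6)
print(counts)  # [4, 8, 16, 32, 64, 128]

# A fixed definition like "manycore >= 32 cores" is crossed in
# just three generations under this assumption:
threshold = 32
crossed = next(g for g, c in enumerate(counts) if c >= threshold)
print(crossed)  # 3
```

Under this (admittedly simplified) growth model, whatever number we pick as the multicore/manycore boundary becomes mainstream within a handful of process generations, which is the point made above.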
I think we instead need to look at other aspects of a chip design, in particular those that are not just a straight multiplication of features: the aspects that really determine which kinds of programs the chip takes nicely to, and that architects have to think hard about.
Programming models are not the right answer here either. As Grant says, programming models need a taxonomy of their own:
A kind of taxonomy of multicore related terms, together with a taxonomy of programming models (SMP, AMP, etc.) that everyone could be referred to when these discussions are held and that everyone could begin to build a consensus around would be of great value to all.
If nothing else, we all know that any programming model can be put onto pretty much any piece of silicon, given a sufficiently thick layer of middleware. It might not be the most efficient way to program any particular hardware in terms of hardware resources used, but someone is going to do it anyway.
So what is left in the chip taxonomy?
I think we need to look at things like where memories are located (global, local to each core, shared by a small group), the number of memory levels, and whether they are caches or program-controlled. How interrupts and I/O are routed is another interesting aspect. Can any core do anything, or are there master nodes that can do more? Are all cores equal in performance and computational ability, or do they differ?
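The axes listed above could be captured as a small data structure. The field names and categories below are my own illustration of what a taxonomy entry might record, not an established classification:

```python
from dataclasses import dataclass
from enum import Enum

class MemoryScope(Enum):
    GLOBAL = "global"        # one memory visible to all cores
    PER_CORE = "per-core"    # local to each core
    CLUSTER = "cluster"      # shared by a small group of cores

class MemoryControl(Enum):
    CACHE = "cache"                    # hardware-managed
    SCRATCHPAD = "program-controlled"  # software-managed local store

@dataclass
class ChipTaxonomy:
    memory_scopes: list          # where memories sit (MemoryScope values)
    memory_levels: int           # depth of the memory hierarchy
    memory_control: MemoryControl
    centralized_io: bool         # interrupts/IO routed via master nodes?
    homogeneous_cores: bool      # all cores equal in capability?

# Two hypothetical entries: a laptop-style part vs. a packet-processing chip.
laptop = ChipTaxonomy([MemoryScope.GLOBAL], 3, MemoryControl.CACHE,
                      centralized_io=False, homogeneous_cores=True)
packet_engine = ChipTaxonomy([MemoryScope.PER_CORE, MemoryScope.CLUSTER], 2,
                             MemoryControl.SCRATCHPAD,
                             centralized_io=True, homogeneous_cores=False)
print(laptop.homogeneous_cores != packet_engine.homogeneous_cores)  # True
```

Note that core count appears nowhere in the record: the two example chips differ on every axis before you ever count their cores, which is the argument being made here.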
As Grant says, a great subject for academia to dig into.
The comments by Markus Levy at the end of the post, about some secret activities within the Multicore Association, make me agree with Grant: please get the ideas and drafts out into the open, and make sure to get the widest input possible!
Jakob, thanks for the kind words. Your blog is also valued and on my bookmarks for regular reading. However, I don't really agree with you that numbers don't matter in the N-core discussions. It may be that numbers matter more in heterogeneous N-core systems and less in homogeneous ones.

The 2-core laptop and the 100+ core chip for a large network system will tend to be homogeneous, in that all cores (or almost all cores: the networking chip may have some control cores and a lot of packet-crunching cores), even if tuned to the application, will tend to be the same. In the heterogeneous asymmetric-multiprocessing world of cell phones, iPhones, etc., by contrast, most processors are different ASIPs for different applications.

The homogeneous style tends toward a step-and-repeat approach, and as technology proceeds it is indeed true that 2, 4, 8, 16, 32, 64, 128, ... cores, all the same, may really fall into an evolving single class of multicore architectures. There I agree with Jakob that other architectural concerns, such as memory systems and interconnect, may be the profound differentiators.
But heterogeneous AMP multicore devices are different enough from the homogeneous world that I think numbers matter there. I see a profound difference between one to a few tens of different ASIPs on a die and hundreds or thousands of different ASIPs on a single die. It seems to me that something profoundly different may happen in our design approach as the numbers climb: no stepping and repeating will be possible here! Right now, I just don't know where the dividing line is, or whether we will discover it only post facto.
But a good debate – ideally one we should be conducting on a taxonomy site if one gets started!
Good points in the debate — if only we had somewhere to keep them in a coherent manner for the community to discuss…
In any case, I might have overstated my case in the original post. Basically, I do not think that numbers alone are a useful indicator of the nature of a particular beast (chip design). The homogeneous/heterogeneous distinction is much more important, as is the particular nature of the cores included.
I agree that a 100-core or 1000-core chip IS qualitatively different from a 10-core chip, since it poses a very different programming and design challenge. But I have a feeling that this difference is exactly what we want to capture in a good taxonomy: the fact that a design gets to 1000 cores is going to be due to some design aspect that makes that many cores feasible, and those design aspects are going to be very different from the choices made for a 10-core chip.
I guess in the end, I think that core count is a visible but secondary outcome of the deeper, more interesting design decisions. Just saying that "manycore is >= 100 cores" is not helpful; we need to express how the chip got to 100 cores.
In the old world of servers, before multicore, it used to be said that as processor counts climbed from 4 to 8 to 32 to 128, the processors became more or less free and the interconnect absolutely dominated system cost. I think similar economics and laws are waiting to be discovered in what people are designing today, in what for lack of a better term has to be called MPSoC, multiprocessor systems-on-chip.