Thin Phone, Fat Core

When mobile phones first appeared, they were powered by very simple cores like the venerable ARM7 and later the ARM9. Low clock frequencies, zero microarchitectural sophistication, sufficient for the job. In recent years, as smartphones have come into their own as the most important computing device for most people, the processor performance of mobile phones has increased tremendously. Today, cutting-edge phones and tablets contain four or eight cores, running at clock frequencies well above 2 gigahertz. The performance race for most of the market (more about that in a moment) was mostly about pushing higher clock frequencies and more cores, even while microarchitecture was left comparatively simple. Mobile meant “fairly simple”, and IPC was nowhere near what you would get with a typical Intel processor for a laptop or desktop.

Today, that seems to be changing, as the Nvidia Denver core and Apple’s Cyclone core both go the route of a few fat cores rather than many thin cores.

David May on Multicore: Heterogeneity not Needed

Via the EETimes, I found a very interesting talk by Bristol professor David May, presented at the 4th Annual Bristol Multicore Challenge in June of 2013. The talk can be found as a YouTube video here, and the slides are available here. The EETimes focused on the idea of cutting down ARM to be really RISC, but I think the more interesting part is Professor May’s observations on multicore computing in general, and the case for and against heterogeneity in (parallel) computers.

Two Cores, Four Cores, Eight Cores – Mobile Variety

Probably thanks to the yearly Mobile World Congress, there has been a slew of announcements of mobile application processors recently. Everything is ARM-based, but the chips show quite some variety in the CPU core configurations used. Indeed, I think this variety has something to say on the general state of multicore.

Does ISA Matter for Performance?

When I grew up with computers, the big RISC vs CISC debate was raging. At the time, in the late 1980s, it did indeed seem that RISC was inherently superior to CISC. SPARCs, MIPS, and Alpha all outpaced boring old x86, VAX and 68000 processors. This turned out to be a historical parenthesis, as the Pentium Pro from Intel showed how RISC-style performance could be mated to a CISC ISA. However, maybe ISAs still do matter.

Nvidia “Kal-El” Variable SMP

Nvidia recently announced that their already-known “Kal-El” quad-core ARM Cortex-A9 SoC actually contains five processor cores, not just four as a “normal” quad-core would. They call the architecture “Variable SMP”, and it is a pretty smart design. The one where you think, “I should have thought of that”, which is the best sign of something truly good.

Steve Furber: Emulated BBC Micro on Archimedes on PC

I just read an interview with Steve Furber, the original ARM designer, in the May 2011 issue of the Communications of the ACM. It is a good read about the early days of the home computing revolution in the UK. He not only designed the ARM processor, but also the BBC Micro and some other early machines.

Memory Models: x86 is TSO, TSO is Good

By chance, I got to attend a day at the UPMARC Summer School with a very enjoyable talk by Francesco Zappa Nardelli from INRIA. He described his work (along with others) on understanding and modeling multiprocessor memory models. It is a very complex subject, but he managed to explain it very well.

S4D 2010

Looks like S4D (and the co-located FDL) is becoming my most regular conference. S4D is a very interactive event. With some 20 to 30 people in the room, many of them also presenting papers at the conference, it turns into a workshop at its best. There was plenty of discussion going on during the sessions and the breaks, and I think we all got new insights and ideas.

Concurrency in Lego Mindstorms NXT

For my parental leave, I have just bought myself a Lego Mindstorms NXT 2.0 kit. It is not much fun for our youngest, who mostly gets a bit scared by a piece of Lego driving around making noises, but I hope to be able to use it to teach my older child (almost five) to program. Let’s see how that turns out. It looks hard to make the NXT environment provide the kind of RoboRally-style programming blocks that I had hoped to create, as for some reason I cannot get a sufficiently custom icon onto custom blocks.

It also presented me with an opportunity to try some domain-specific high-level graphical programming. The programming environment provided for the NXT series of Mindstorms kits is based on LabVIEW from National Instruments, and it really does seem to work. It even features parallel tasks, which I tried to use…

Is Cycle Accuracy a bad Idea?

In a funny coincidence, I published an article at about the need for cycle-accurate models for virtual platforms on the same day that ARM announced that they were selling their cycle-accurate simulators and associated tool chain to Carbon Design Systems. That makes one wonder where cycle accuracy is going, or whether it is a valid idea at all… is ARM right or am I right, or are we both right since we are talking about different things?

Let’s look at this in more detail.
