Today here at the MCC 2009 workshop, I heard an interesting talk by David Black-Schaffer of Stanford university. His work is on stream programming for image processing (“2D streams”). Pretty simple basic idea, to use 2D blobs of pixels as kernel inputs rather than single values or vectors. Makes eminent sense for image processing.
More from the SiCS multicore days 2008.
There were some interesting comments on how to define efficiency in a world of plentiful cores. The theme from my previous blog post called “Real-Time Control when Cores Become Free” came up several times during the talks, panels, and discussions. It seems that this year, everybody agreed that we are heading to 100s or 1000s of “self-respecting” cores on a single chip, and that with that kind of core count, it is not too important to keep them all busy at all times at any cost. As I stated earlier, cores and instructions are now free, while other aspects are limiting, turning the classic optimization imperatives of computing on its head. Operating systems will become more about space-sharing than time-sharing, and it might make sense to dedicate processing cores to the sole job of impersonating peripheral units or doing polling work. Operating systems can also be simplified when the job of time-sharing is taken away, even if communications and resource management might well bring in some new interesting issues.
So, what is efficiency in this kind of environment?
This was a refreshingly different post: Too Many Cores, not Enough Brains:
More importantly, I believe the whole movement is misguided. Remember that we already know how to exploit multicore processors: with now-standard multithreading techniques. Multithreaded programming is notoriously difficult and error-prone, so the challenge is to invent techniques that will make it easier. But I just don’t see vast hordes of programmers needing to do multithreaded programming, and I don’t see large application domains where it is needed. Internet server apps are architected to scale across a CPU farm far beyond the limits of multicore. Likewise CGI rendering farms. Desktop apps don’t really need more CPU cycles: they just absorb them in lieu of performance tuning. It is mostly specialized performance-intensive domains that are truly in need of multithreading: like OS kernels and database engines and video codecs. Such code will continue to be written in C no matter what.
The argument at core is that multicore is about performance, and performance optimization is generally something that we do prematurely rather than focussing on how to solve the core problem in the best way. You have to respect Jonathan Edwards, and often this is true: programmers optimize themselves into a horrible design that is also slow.
A very interesting idea that has been bandied around for a while in manycore land is the notion that in the future, we will see a total inversion in today’s cost intuition for computers. Today, we are all versed in the idea that processor cores and processing times are quite precious, while memory is free. For best performance, you need to care about the cache system, but in the end, the goal is to keep those processor pipelines as busy as possible. Processors have traditionally been the most expensive part of a system, and ideas such as Integrated Modular Avionics are invented to make the best use of a resource perceived as rare and expensive…
But is that really always going to be true? Is it reasonably to think of CPU cores are being free but other resources as expensive? And what happens to program and system design then?