Photoshop Scalability and “-10% overhead”

I just finished reading the October 2010 issue of Communications of the ACM. It contained some very good articles on performance and parallel computing. In particular, I found the ACM Case Study on the parallelism of Photoshop a fascinating read. There was also the second part of Cary Millsap’s articles about “Thinking Clearly about Performance”.

Continue reading “Photoshop Scalability and “-10% overhead””

Wind River Blog: “IMA on Simics”

I have a fairly lengthy new blog post at my Wind River blog. This time, I interview Tennessee Carmel-Veilleux, a Canadian MSc student who have done some very smart things with Simics. His research is in IMA, Integrated Modular Avionics, and how to make that work on multicore.

Continue reading “Wind River Blog: “IMA on Simics””

S4D 2010

Looks like S4D (and the co-located FDL) is becoming my most regular conference. S4D is a very interactive event. With some 20 to 30 people in the room, many of them also presenting papers at the conference, it turns into a workshop at its best. There were plenty of discussion going on during sessions and the breaks, and I think we all got new insights and ideas.

Continue reading “S4D 2010”

Multicore is not That Bad

I recently read a couple of articles on multicore that felt a bit like jumping back in time. In IEEE Spectrum, David Patterson at Berkeley’s parallel computing lab brings up the issue of just how hard it is to program in parallel and that this makes the wholesale move to multicore into something like a “hail Mary pass” for the computer industry. In Computer World, Chris Nicols at NICTA in Australia asks what you will do with a hundred cores – implying that there is not much you can do today. While both articles make some good points, I also think they should be taken with a grain of salt. Things are better than they make them seem. Continue reading “Multicore is not That Bad”

Wind River Blog: True Concurrency is Different

I have another blog up at Wind River. This one is about multicore bugs that cannot happen on multithreaded systems, and is called True Concurrency is Truly Different (Again). It bounces from a recent interesting Windows security flaw into how Simics works with multicore systems.

Concurrency in Lego Mindstorms NXT

lego mindstorms nxt2

For my parental leave, I have just bought myself a Lego Mindstorm NXT 2.0 kit. It is not much fun for our youngest, who mostly gets a bit scared by a piece of Lego driving around making noises, but I hope to be able to use it to teach my older child (almost five) to program. Let’s see how that turns out. It looks hard to make the NXT environment provide the kind of Roborally-style programming blocks that I had hoped to create, as I cannot for some reason get a sufficiently custom icon onto custom blocks.

It also presented me with an opportunity to try some domain-specific high-level graphical programming. The programming environment provided for the NXT series of Mindstorms kits is based on LabView from National Instruments, and it really does seem to work. It even features parallel tasks, which I tried to use…

Continue reading “Concurrency in Lego Mindstorms NXT”

MCC 2009 Presentations Online

UPMARC_700x150The presentations from the 2009 Swedish Workshop on Multicore Computing (MCC 2009) are now online at the program page for the workshop. Let me add some comments on the workshop per se.

Continue reading “MCC 2009 Presentations Online”

MCC 2009: 2D Stream Processing for Manycore

UPMARC_700x150Today here at the MCC 2009 workshop, I heard an interesting talk by David Black-Schaffer of Stanford university.  His work is on stream programming for image processing (“2D streams”). Pretty simple basic idea, to use 2D blobs of pixels as kernel inputs rather than single values or vectors. Makes eminent sense for image processing.

Continue reading “MCC 2009: 2D Stream Processing for Manycore”

Finally, a Bug!

butterflyPart of my daily work at Virtutech is building demos. One particularly interesting and frustrating aspect of demo-building is getting good raw material. I might have an idea like “let’s show how we unravel a randomly occurring hard-to-reproduce bug using Simics“. This then turns into a hard hunt for a program with a suitable bug in it… not the Simics tooling to resolve the bug. For some reason, when I best need bugs, I have hard time getting them into my code.

I guess it is Murphy’s law — if you really set out to want a bug to show up in your code, your code will stubbornly be perfect and refuse to break. If you set out to build a perfect piece of software, it will never work…

So I was actually quite happy a few weeks ago when I started to get random freezes in a test program I wrote to show multicore scaling. It was the perfect bug! It broke some demos that I wanted to have working, but fixing the code to make the other demos work was a very instructive lesson in multicore debug that would make for a nice demo in its own right. In the end, it managed to nicely illustrate some common wisdom about multicore software. It was not a trivial problem, fortunately.

Continue reading “Finally, a Bug!”

Ericsson Blog Post about DSL

ericsson_logoAndras Vajda of Ericsson wrote an interesting blog post on domain-specific languages (DSLs). Thanks for some success stories and support in what sometimes feels like an uphill battle trying to make people accept that DSLs are a large part of the future of programming. In particular for parallel computing, as they let you hide the complexities of parallel programming.

Continue reading “Ericsson Blog Post about DSL”

How (Not) To Present Parallel Programming Results

46daclogoSCDSource ran a short but good article summarizing a few DAC talks that I would liked to attend. it mostly about the experience of long-term parallel programming research David Bailey in presenting results in the field…

Continue reading “How (Not) To Present Parallel Programming Results”

Freescale P4080, in Physical Form

freescale-logo-iconPast Tuesday, I attended the Freescale Design With Freescale (DWF) one-day technology event in Kista, Stockholm. This is a small-scale version of the big Freescale Technology Forum, and featured four tracks of talks running from the morning into the afternoon. All very technical, aimed at designing engineers.

Continue reading “Freescale P4080, in Physical Form”

GPGPU – a new type of DSP?

My post on SiCS multicore, as well as the SiCS multicore day itself, put a renewed spotlight on the GPGPU phenomenon. I have been following this at a distance, since it does not feel very applicable to neither my job of running Simics, nor do I see such processors appear in any customer applications. Still, I think it is worth thinking about what a GPGPU really is, at a high level.

Continue reading “GPGPU – a new type of DSP?”

SiCS Multicore Day 2009

Last Friday, I attended this year’s edition of the SiCS Multicore Day. It was smaller in scale than last year, being only a single day rather than two days. The program was very high quality nevertheless, with keynote talks from Hazim Shafi of Microsoft, Richard Kaufmann of HP, and Anders Landin of Sun. Additionally, there was a mid-day three-track session with research and industry talks from the Swedish multicore community. Continue reading “SiCS Multicore Day 2009”

Øredev 2009: Meanwhile, Parallel

Öredev logoØredev is the premier software development conference in Sweden and Europe (they claim). I gave some presentations there in 2006 and 2007, but since then they have dropped the general embedded software development track and just focused on programming for mobile phones. Most of the material is “general IT”. If you are doing software development on the desktop or for servers, it is a good place to go to learn new things from the general world of computing.

Continue reading “Øredev 2009: Meanwhile, Parallel”

Downloadable Book about Embedded Multicore

freescale-logo-iconFreescale has now released the collected, updated, and restyled book version of the article series on embedded multicore that I wrote last year together with Patrik Strömblad of Enea, and Jonas Svennebring, and John Logan of Freescale. The book covers the basics of multicore software and hardware, as well as operating systems issues and virtual platforms. Obviously, the virtual platform part was my contribution.

Continue reading “Downloadable Book about Embedded Multicore”

Coding Horror on Big Iron Hardware

opinionIn a post from late June, Jeff Atwood at Coding Horror discusses the horrible cost of a large HP server (scaling up to 32 processor cores in eight AMD x86 sockets), compared to a bunch of simple single-socket basic servers. There are some interesting notes on relative costs of small-and-simple servers, including things like administration and power. There is an undercurrent to the post and the comments that the big HP machine is “overpriced”. I don’t think it is. If you have ever had Erik Hagersten as a teacher in computer architecture, you will know why.

Continue reading “Coding Horror on Big Iron Hardware”

StackOverflow interviews CouchDB

couchdbLast year, FLOSS Weekly interviewed Jan Lehnard of the CouchDB project. I put up a blog post on this, noting that it was interesting with a scalable parallel program written in Erlang, a true concurrent language. The interview was interesting,  but not very deeply technical. Now, almost a year later, the StackOverflow podcast, number 59, interviewed the founder of the project, Damien Katz. This interview goes a bit more into the technical details and what CouchDB is good for and what not, as well as some details on the use and performance of Erlang.

Continue reading “StackOverflow interviews CouchDB”

Cavium Octeon II: Short Notes

octeon-iiAbout two months ago, Cavium Networks launched their second generation of Octeon chips, the Octeon II. The most obvious difference to the previous generation (Octeon, Octeon Plus) is a new MIPS64 core with much better support for hypervisors and virtualization. There are some other interesting aspects to this chip, though.

Continue reading “Cavium Octeon II: Short Notes”

Parallelism in Action

Shrinking cores

Last year in a blog post on video encoding for the iPod Nano, I complained about the lack of performance on my old Athlon. A bit later, I noted that (obviously) video encoding is a good example of an application that can take advantage of parallelism. Yesterday I put these two topics together in a practical test. And it worked nicely enough.

Continue reading “Parallelism in Action”

When does Hardware Acceleration make Sense in Networking?

q_stampYes, when does hardware acceleration make sense in networking? Hardware acceleration in the common sense of “TCP offload”. This question was answered by a very nicely reasoned “no” in an article by Mike Odell in ACM Queue called “Network Front-End Processors, Yet Again“.

Continue reading “When does Hardware Acceleration make Sense in Networking?”

Simulation Determinism: Necessary or Evil?

gearsIn my series (well, I have one previous post about checkpointing) about misunderstood simulation technology items, the turn has come to the most difficult of all it seems: determinism. Determinism is often misunderstood as meaning “unchanging” or “constant” behavior of the simulation. People tend to assume that a deterministic simulation will not reveal errors due to nondeterministic behavior or races in the modeled system, which is a complete misunderstanding. Determinism is a necessary feature of any simulation system that wants to be really helpful to its users, not an evil that hides errors.

Continue reading “Simulation Determinism: Necessary or Evil?” – Multicore CPUs face slow road in comms

eetimes logoThe  EETimes article Multicore CPUs face slow road in comms piqued my interest. There is an interesting chart in there about just how slow more-than-one-core processors will be in penetrating a vaguely defined “comms” market place. I can believe that, but I think their comments on the PowerQUICC series require some commentary…

Continue reading “ – Multicore CPUs face slow road in comms”

Enea and Freescale Article on SMP OS

Elektronik i Norden just published a technical insight article about the SMP kernels of Enea OSE and Linux, by Patrik Strömblad and Jonas Svennebring.

Continue reading “Enea and Freescale Article on SMP OS”

Adding to Schirrmeister’s Virtual Platform Myth Busting

opinionFrank Schirrmeister of Synopsys recently published a blog post called “Busting Virtual Platform Myths – Part 1: “Virtual Platforms are for application software only”. In it, he is refuting a claim by Eve that virtual platforms are for application-level software-development only, basically claiming that they are mostly for driver and OS development and citing some Synopsys-Virtio Innovator examples of such uses. In his view, most appication-software is being developed using host-compiled techniques.  I want to add to this refutal by adding that application-software is surely a very important — and large — use case for virtual platforms.

Continue reading “Adding to Schirrmeister’s Virtual Platform Myth Busting”

IBM z10 Heavy-Duty Virtual Platform

ibm_z10Unknown to most, IBM has one of the world’s longest records of using virtual platforms for software and firmware development and verification. This project has been ongoing since at least the days of the zSeries 900 machines, through z990, z9, and now z10. An excellent article on this virtual platform and its uses is found in the IBM Journal of Research and Development, number 1, 2009, . It is called “IBM System z10 Firmware Simulation”, by Körner et al.

Continue reading “IBM z10 Heavy-Duty Virtual Platform”