Schloss Dagstuhl (and a Seminar and Cerebras)

A month ago, I participated in a seminar at Schloss Dagstuhl in Germany on “Discrete Algorithms on Modern and Emerging Compute Infrastructure”. Not my usual cup of tea, but it was very interesting and insightful nevertheless. I had attended a Dagstuhl seminar once before, back in 2003.


When is Redundancy Cheaper?

I find the subject of fault tolerance and resiliency in computers quite interesting. It is also very interesting to look into what kinds of faults actually happen in the real world, and what impact they have. I recently found a couple of good sources on this. The first is a paper from Supercomputing 2012 by Fiala et al., called “Detection and Correction of Silent Data Corruption for Large-Scale High-Performance Computing” (ACM Digital Library). One of its references is a 2011 talk by Al Geist, “What is the Monster in the Closet”, which provides some more data on how common faults are.
