Negative Results

In the past year, I have started listening to various podcasts from the “Skeptic” community. Although much of the discussion tends to center on medicine (because of the sadly enormous market for quackery) and natural science (because of the sad fight over evolution), it has made me think and reflect more about the nature of science and publishing. Indeed, it would have been great if this kind of material had been easy to find back when I was doing my PhD.

My little area of interest in computer science and engineering covers programming, compilers, languages, simulators, computer architecture and a bit of hardware design in general. Every year, I review papers for some conferences or journals. Over the years, I have seen a torrent of papers proving, showing, or at least claiming that something new is better than something old. Only very rarely do we see a paper admitting that something was tried and it failed. Or a paper that tries to replicate somebody else’s results, and shows that maybe the original idea did not work.

This seems to be common to most other fields of science. I recommend listening to the interview with Chris French in Skeptikerpodden #80, and the interview with Ed Yong in EconTalk. While these two interviews mostly deal with psychology and parapsychology, the patterns are definitely generalizable to most scientific fields.

What was particularly interesting was that there seem to be quite a few “established truths” out there that are really nothing more than a random blip where in one particular case, one particular experiment produced measurements that support some hypothesis. The first paper comes out, showing some exciting or surprising result. It gets published, and the popular press picks up on it. After this, we have an established truth in most people’s minds. When scrupulous researchers then recreate the experiment and do not get the results, or show that there was some methodological flaw in the initial experiment, they have a very hard time getting published. Even the best-known scientific journals favor newsworthy papers, and (science) journalists rarely write about such “cold water” stories. Thus, truths are easy to establish and hard to debunk, unfortunately.

I was reminded of an idea I had at the tail end of my PhD student days.

After five years on the conference circuit, I was pretty jaded and skeptical about many of the types of claims made in typical papers. I was also somewhat bored by papers that just presented minor tweaks to existing ideas with minor (positive) effects on results. There were papers that were clearly examples of “data fishing”, where a correlation was confused with causation, and that correlation was found only by trying all kinds of possible connections between data sets (to be correct, these papers should then have tried to falsify the proposed effect). I had seen researchers tweak the initial conditions or change the assumptions for the experimental evaluations of their ideas to get something positive to show in their measurements – but in the process changing the assumptions into something totally unrealistic with no basis in reality (I will not name names here). I remember with some pride that I built a reputation as the guy who poked holes in experiments and asked pointed questions. I also felt that there was a large chunk of knowledge missing from the conferences – the stuff that did not work out at all, but which would be good to know about.
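To make the “data fishing” problem concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own choosing: the function names, the sample size of 20, the 200 candidate data sets, and the seed are all made up for the example. Every series is pure random noise, so any “significant” correlation it finds is a fluke. The threshold 0.444 is the usual two-tailed 5% critical value of Pearson's r for 20 data points.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fish_for_correlations(n_points=20, n_candidates=200,
                          threshold=0.444, seed=1):
    """Correlate one noise series against many unrelated noise series.

    Returns (index, r) for every candidate whose |r| reaches the
    threshold. All parameters are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    hits = []
    for i in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) >= threshold:
            hits.append((i, r))
    return hits


if __name__ == "__main__":
    hits = fish_for_correlations()
    print(f"{len(hits)} of 200 unrelated data sets look 'significant'")
    for index, r in hits:
        print(f"  candidate {index}: r = {r:+.3f}")
```

With a 5% threshold and 200 attempts, one would expect roughly ten such flukes. Reporting only the best one, without mentioning the 199 other attempts or trying to falsify it on fresh data, is exactly the kind of result that looks fine on paper and fails to replicate.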

The idea, which I regrettably never realized, was to organize a conference or workshop on the ideas that did not work. The ones that looked promising, were implemented, but turned out to totally fall down when confronted with more complex situations or simply reality (as in applying a method to an industrial code base rather than a toy example, or realizing that the span of parameters was bigger than assumed and that most of the space gave horrendous results). I felt that such a workshop would have been far more entertaining and insightful than most of the conference sessions that I had attended.

Indeed, when do we learn the best? It is not when we succeed, but when we fail. And propagating knowledge of failures should be just as important for the progress of science and technology as propagating knowledge of successes. Such knowledge tends to be transferred informally, as part of the culture of the field… and maybe in textbooks, but very rarely do we see specific studies showing that an idea just won’t work.

It is human nature to not want to admit failure, but isn’t the scientific process we have built our current standard of living and standard of technology on all about overcoming our human nature and being rational and honest?

There are some notable examples of published failures, in particular the recent mistake at CERN that led to the belief that neutrinos could travel faster than light. Or various space missions over the years where failures were investigated and causes laid bare for others to learn from (missing Mars by confusing metric and imperial units, the acceleration of Ariane 5 compared to Ariane 4, and others).

The Silicon Valley culture that embraces failure as a learning experience is a good example of how important failures are to progress. An entrepreneur who has failed a few times is mostly considered to have learnt something and thus to be more likely to succeed the next time – rather than a proven failure who will never amount to anything. By allowing and embracing failure, people are allowed to try things and push them all the way – rather than having them chicken out and abandon a project at the first sign of trouble.

That is what I want to see published in academic conferences too – the spectacular failures that others can learn from. I think many such stories could be quite entertaining too. It is rarely interesting to hear a straight success story; as humans, we love the drama of a good failure and the struggle to recover. Both tragedies and comedies, the classic forms of drama, are about failures, if you think about it.

Update: here are a couple of additional angles on truth and how science works:

 

One thought on “Negative Results”

  1. Jakob, I agree with all this! In particular, a large proportion of academics are determined to not learn anything from failure. This is very bad. Part of the problem is that, as you suggest, much research has no real potential for failure because it is based on minor tweaks that map onto incidental effects. Since nobody is remotely interested in attempting to refute these results, they are completely safe to publish.
