Why can I Debug some Numerical Programs that You Can’t? Why should we care? (cs.berkeley.edu)

29 points by raviksharma 14 years ago | 18 comments

Abstract: The future promises teraflops and terabytes in your laptop, petaflops and petabytes in your supercomputer, and the inability to debug numerical programs on either. Why can’t they be debugged? What should we do instead?

gjm11 14 years ago |

It may be worth offering a brief summary, so here is one. (Note: statements of opinion and exhortations in what follows are Prof. Kahan's, not mine.)

-- 1 --

A very effective way to diagnose some kinds of problem in numerical software is to run it with different settings for floating-point rounding. That way, if you're using an algorithm whose outputs are pathologically sensitive to small variations in its inputs (or in intermediate results), you're likely to be able to tell, because the final results will differ by more than a few bits in the lowest places.

Unfortunately, support for doing this is lacking in most programming languages and environments. This is a Bad Thing.
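
A minimal sketch of the idea in point 1, in Python. The standard library offers no portable way to switch the hardware's binary rounding mode, so this swaps in the `decimal` module at 16 significant digits as a stand-in. The helper names (`run_under_modes`, `relative_spread`, `cancelling`, `harmonic`) are made up for illustration and do not come from Kahan's talk.

```python
from decimal import Decimal, localcontext
from decimal import ROUND_CEILING, ROUND_FLOOR, ROUND_HALF_EVEN

# Rounding modes to compare: nearest-even, toward -inf, toward +inf.
MODES = (ROUND_HALF_EVEN, ROUND_FLOOR, ROUND_CEILING)


def run_under_modes(fn, prec=16):
    """Run fn() once per rounding mode at the given decimal precision."""
    results = {}
    for mode in MODES:
        with localcontext() as ctx:
            ctx.prec = prec
            ctx.rounding = mode
            results[mode] = fn()
    return results


def relative_spread(results):
    """Largest relative disagreement between the per-mode results."""
    values = list(results.values())
    lo, hi = min(values), max(values)
    scale = max(abs(lo), abs(hi))
    return 0.0 if scale == 0 else float((hi - lo) / scale)


def cancelling():
    """((1 + x) - 1) / x for tiny x: mathematically 1, numerically fragile."""
    x = Decimal(1) / Decimal(3) * Decimal("1e-14")
    return ((Decimal(1) + x) - Decimal(1)) / x


def harmonic():
    """Sum of 1/k for k = 1..100: a well-behaved computation."""
    return sum(Decimal(1) / Decimal(k) for k in range(1, 101))


for name, fn in (("cancelling", cancelling), ("harmonic", harmonic)):
    print(name, relative_spread(run_under_modes(fn)))
```

The well-behaved sum agrees across rounding modes to nearly all of its digits, while the cancelling expression disagrees in its leading digits, which is the warning sign the summary describes.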

-- 2 --

When doing FP computation that mixes single and double precision, it is tempting to treat mixed-precision operations as single-precision rather than double-precision. The latest version of MATLAB (at the time when Kahan gave this talk) does this.

This can be very bad, because doing more of the computation than necessary in single precision can produce needlessly inaccurate results and therefore slow down convergence of algorithms (or just plain screw them up).

This is also a Bad Thing.
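
A rough illustration of point 2, again in Python. Python floats are IEEE doubles, so single precision is emulated here by round-tripping through `struct`; `to_single` and `accumulate` are hypothetical helpers, and the "demote" branch is only an approximation of the mixed-precision rule described above, not MATLAB's actual implementation.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, demote: bool) -> float:
    """Sum single-precision data into an accumulator.

    demote=False: each mixed add is carried out and kept in double.
    demote=True:  each mixed add is rounded back to single.
    """
    acc = 0.0
    for v in values:
        acc = acc + v  # evaluated in double
        if demote:
            acc = to_single(acc)  # result treated as single precision
    return acc


x = to_single(0.1)  # the data itself is stored in single precision
n = 100_000
data = [x] * n

exact = n * x  # this product is exactly representable in double
promoted = accumulate(data, demote=False)
demoted = accumulate(data, demote=True)

print("promoted error:", promoted - exact)
print("demoted error: ", demoted - exact)
```

With the same single-precision inputs, keeping the running sum in double reproduces the exact total, while rounding every step back to single drifts by more than a whole unit.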

-- 3 --

Computer architectures, languages and programming environments should be designed so that following the path of least resistance leads to good, not bad, numerical behaviour. Lots of more detailed proposals along these lines can be found on Kahan's web pages.

Making this happen is a big job. Kahan is likely to be dead before it's finished. So go and make it happen.

saljam 14 years ago |

For what it's worth, that presentation was by William Kahan, “The Father of Floating Point” and the guy behind the IEEE 754 standard. He also has a Turing Award.

Sniffnoy 14 years ago |

As someone who doesn't typically have reason to use either -- is using floats rather than doubles actually common? I was under the impression this was pretty rare.

vilya 14 years ago | |

It's not rare at all. A double takes up twice as much memory as a float and memory bandwidth is a precious commodity these days. If you have a lot of numbers (e.g. in 3D models or similar) and don't need the extra precision, using floats is an obvious choice.
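
A small sketch of that trade-off using Python's `array` module, assuming the usual platforms where the `"f"` typecode is a 4-byte IEEE single and `"d"` an 8-byte IEEE double. The sample data is made up.

```python
from array import array

# Made-up sample data, e.g. vertex coordinates in a 3D model.
values = [0.1 * i for i in range(1000)]

as_float = array("f", values)   # single precision
as_double = array("d", values)  # double precision

# Bytes per element, then total bytes for the same 1000 numbers.
print(as_float.itemsize, as_double.itemsize)
print(len(as_float.tobytes()), len(as_double.tobytes()))

# What the smaller format costs: the worst rounding error in this data set.
worst = max(abs(a - b) for a, b in zip(as_float, as_double))
print(worst)
```

The same `tobytes()` sizes apply when the numbers are written to a file, so the halving carries over to disk and network traffic as well.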

copper 14 years ago | | |

Or when you're reading/writing a great deal of numerical data in a file.

xedarius 14 years ago | |

Most video games use floats. This is because doubles will be emulated in software on some platforms, thus making them incredibly slow. The Unreal Engine, for example, uses floats extensively.

jensnockert 14 years ago | | |

Also, most modern x86 processors can perform single-precision operations twice as fast as double-precision, which helps (and saving memory bandwidth is good too.)

On PPC/ARM, single-precision allows you to use AltiVec/NEON, which can give significant performance boosts as well.

lambada 14 years ago |

Is there any reason this is marked as scribd, when the link goes to a normal PDF? Usually I avoid scribd documents, but I noticed that the link didn't go to scribd.

[Apologies that this is off-topic, but I was unsure where to post this question.]

Mvandenbergh 14 years ago | |

That's not a tag, the [scribd] is a separate link to a scribd version of the pdf.

keeperofdakeys 14 years ago | |

Hacker News will automatically attach a scribd link to any pdf.

ars 14 years ago | |

It's not marked as scribd - there is an alternate link to a scribd version of the pdf. All pdfs have this extra link.

(Scribd is a Y Combinator company.)

raviksharma 14 years ago | |

Here: http://news.ycombinator.com/item?id=175378

lucian1900 14 years ago |

Is it infeasible to use something other than floats for such computations? There are many schemes for dealing with bigintegers, bigdecimals and such with reasonable performance.

nzmsv 14 years ago | |

There are some simulations that squeeze everything they can out of a GPU or a cluster or a TOP500 machine. And I don't think arbitrary precision can be done easily on a GPU; if someone knows more, I'd love to be corrected on this one.

lucian1900 14 years ago | | |

TOP500 machines probably have CPU features that could help with that.

But yeah, afaik arbitrary precision on GPUs isn't possible with reasonable performance.

roel_v 14 years ago | |

By 'schemes', do you mean hardware support? Because if not, using floats will be much much faster, in my (admittedly fairly limited) intuition.
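
One way to see the cost question raised in this subthread, as a sketch only: Python's exact rationals (`fractions.Fraction`) are used here as one example of a software "scheme". The logistic-map iteration and its starting values are arbitrary choices for illustration, and a fixed-precision bigdecimal would behave differently, since it rounds instead of growing.

```python
from fractions import Fraction


def logistic_step(x, r):
    """One step of the iteration x <- r * x * (1 - x)."""
    return r * x * (1 - x)


x_exact, r_exact = Fraction(1, 3), Fraction(7, 2)  # exact rationals
x_float, r_float = 1 / 3, 3.5                      # ordinary doubles

for step in range(1, 11):
    x_exact = logistic_step(x_exact, r_exact)
    x_float = logistic_step(x_float, r_float)
    # Bits needed for the exact denominator vs. the fixed 64 bits of a double.
    print(step, x_exact.denominator.bit_length(), x_float)
```

The exact denominator squares on every step and reaches 3**1024 after ten iterations, so each operation gets slower and larger, while the double stays at 64 bits and still agrees with the exact value to many digits for this particular iteration.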