Verde Technologies was an aerial photography firm that used the Streamed Image Transformation Editor (SITE) to process digital framegrabs on Sun 3/160 workstations. Each workstation cost about $40,000.00.
We required about 90 seconds for each step of our calibration process. There were several steps per image, we shot many images each day, and we promised our clients 24-hour turnaround.
So Scott Lydiard, the company president, spent $5,000.00 on a Floating Point Accelerator card, as Sun promised that it would double the speed of numeric code. It was a full-size VME bus card with a 68881 in the middle. I'm not real clear why it needed that much circuitry, as Macs just had a single socket where you could pop in a 68881, much like the 8087 socket on DOS PCs.
So one night I installed our new card. To my dismay, the calibration time dropped only to 85 seconds.
Not wanting to lose my job, I discovered that very night what a profiler was. It turned out that SITE used getc() to read each pixel of an image, and putc() to write it. This was because the "streamed" part was meant for building UNIX pipelines. You know, simple tools that only do one thing.
I patched the code to read all the pixels in one read() system call.
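The original patch is long gone, so what follows is only a minimal sketch of that kind of change, not the actual SITE code. The helper name read_pixels and the one-byte-per-pixel buffer are my assumptions for illustration. The idea is to ask the kernel for the whole image at once rather than one pixel at a time. read() is allowed to return fewer bytes than requested, especially on a pipe, so the sketch loops until the buffer is full or the input ends.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from SITE): fill buf with up to n bytes
 * of pixel data from fd. Returns the number of bytes actually read,
 * which is less than n only if end of file was reached first,
 * or -1 on a read error. */
ssize_t read_pixels(int fd, unsigned char *buf, size_t n)
{
    size_t got = 0;

    assert(buf != NULL);
    while (got < n) {
        /* Request everything that is still missing in one call. */
        ssize_t r = read(fd, buf + got, n - got);
        if (r == 0)
            break;              /* end of file */
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;          /* real error */
        }
        got += (size_t)r;
    }
    return (ssize_t)got;
}
```

For a width-by-height image with one byte per pixel, the call would be something like read_pixels(0, image, width * height) to read from standard input, which keeps the tool usable in a pipeline.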
"Hey Mike. Our new FPA is AWESOME! Only five seconds to calibrate an image."
"You made a wise purchase, Scott."
Then a few minutes later...
"Mike, could you explain why calibration now takes ten seconds on the workstation that doesn't have an FPA?"
"That's because it really does double the speed of numerical code."
SITE was written by a computer science graduate student. It is for this and many similar reasons that I don't regard a CS degree as being of much use to computer programmers. Mine is in physics; among other things, physicists figure out the way things work.