The Birth of Standard Error (2013)

The Birth of Standard Error (2013) (www2.dmst.aueb.gr)

160 points by marbu 1 year ago | 49 comments

dn3500 1 year ago

Separate error output was around for at least a decade before this. I know MTS had it in the 1960s, and I don't think it was their original idea. I used a CDC for a while and they had it too. So while this is the story of how standard error was introduced into Unix, it is not the origin story of the concept of standard error.

lapsed_lisper 1 year ago

Unix's standard error is definitely not the first invention of a sink for errors. According to Doug McIlroy, Unix got standard error in its 6th Edition, released in May 1975 (http://www.cs.dartmouth.edu/~doug/reader.pdf). 5th Edition was released in June 1974, so it's reasonable to suppose Unix's standard error was developed during that 11-month interval. By that time, Multics already had a dedicated error stream, called error_output (see https://multicians.org/mtbs/mtb763.html, dated October 1973).

All the same, I'd be willing to believe that Unix's standard error could have been an "independent rediscovery" of one feature made highly desirable by other features (redirection and pipes). It's not clear how much communication there was among distinct OS researcher groups back then, so even if other systems had an analogue, Bell Labs people might not have been aware of it.

dbcurtis 1 year ago

The story that I recall about the origins of stderr is that without it, pipes are a mess. Keeping stdout to just the text that you want to pipe between tools and diverting all “noise” elsewhere is what makes pipes useable.

temporarely 1 year ago

I've always felt stderr should have been stdmeta.

p.s.

Well, actually more completely, something like this:

                  +---------+
    [meta-in] --> |         | --> meta-out
                  | p r o c | 
     input    ==> |         | ==> output
                  +---------+

dotancohen 1 year ago

You might like this proposal:

https://unix.stackexchange.com/questions/197809/propose-addi...

The idea is that some output is metadata (such as ps headers) and some is data. With stdmeta we could differentiate between the two.

nerdponx 1 year ago

Powershell has seven "output streams": https://learn.microsoft.com/en-us/powershell/module/microsof...

jasonjayr 1 year ago

I bet you'd also be onboard with files having data forks and resource forks too ... ?

TBH, it's a great idea, but history proved that we apparently prefer a single stream of data and solving all the problems it brings ...

lapsed_lisper 1 year ago

Among conceptually Unix-like OSes, at least one tried to do something along these lines: see http://www.bitsavers.org/pdf/apollo/SR10/011021-A00_Using_Yo... PDF pp 151 and following.

dfee 1 year ago

This is interesting: MIMO channels to a process. Single stdin/stderr/stdout is effective for a single OS process, but with so much pulled up to user land (e.g. workers via green threads) maybe it makes sense to introduce multichannel i/e/o.

euroderf 1 year ago

Sure. Make metadata out-of-band rather than in-band, so that the ungovernable mess of Unix-standard plain ol' text streams is replaced by structured data.

So, well then: allowing programs to consume and emit JSON - is this progress?

BoingBoomTschak 1 year ago

Why not more than 4? Another thing CL did "better":

https://www.lispworks.com/documentation/lw50/CLHS/Body/v_deb...

https://www.lispworks.com/documentation/lw50/CLHS/Body/v_ter...

pmarreck 1 year ago

so... named pipes/named file descriptors?

dredmorbius 1 year ago

Similarly, the SAS System was originally written on an IBM mainframe in 1971 (probably OS/360 / MVS) and featured an input file (the SAS program itself) and two outputs: a LIST (the desired analytic output) and a LOG, which contained status, warning, and error messages. It's not quite stderr, but it clearly reflects similar thinking and was probably based on extant practices at the time.

fnord77 1 year ago

> "One afternoon several of us had the same experience -- typesetting something, feeding the paper through the developer, only to find a single, beautifully typeset line: 'cannot open file foobar'"

reminds me of those t-shirts or digital billboards displaying some system error that we've all seen as memes

amelius 1 year ago

What I hate about stderr is that it's character-based, not line-based.

I often get output from multiple threads or multiple processes garbled together on the same line. I know how to fix this, but I feel my OS should do it for me.

Bombinator 1 year ago

You can set the buffering mode of any file stream with setvbuf. For example, setvbuf(stderr, NULL, _IOLBF, BUFSIZ) sets stderr to line buffered I/O.

kragen 1 year ago

that may help, but if a write writes more than PIPE_BUF bytes, it isn't guaranteed atomic by the kernel. similarly, stdioing a line of more than BUFSIZ may result in multiple write calls. i don't think posix makes any guarantees there (this is just an empirically based speculation) and i'm fairly sure the c standard doesn't

Sharlin 1 year ago

stderr being unbuffered by default is intentional. You want to see the output immediately and not have it stuck in a buffer that might be lost if the program crashes or freezes.

o11c 1 year ago

In my experience, the biggest offender is programs trying to do syscalls directly (possibly for async-signal-safety), but not being aware of `writev`. Especially programs that do colored output can be really stupid here. Some stupid programs even use multiple processes to do colored output (IIRC CMake is a big offender here, but CMake is infamous for refusing to fix bugs)!

The pipe buffer is big enough that sane programs aren't likely to run into problems. The math:

PIPE_BUF is 512 per POSIX but in practice 4096 on Linux (probably others too?). If we assume a horrible-and-unlikely 12 formatting characters per real character (and assume a real character is non-BMP and thus 4 bytes, but still single-column), Linux has enough for 64 characters. With more reasonable assumptions (mostly ASCII, no more than 4 formatting changes per line) we get more like 6 lines of output being atomic on Linux, and even POSIX being likely to get at least one whole line.

chrisrhoden 1 year ago

I think it makes sense as a default to avoid issues discerning timing due to a buffer.

PhilipRoman 1 year ago

Lack of a standard way to control standard stream buffering is a big pain point for me sometimes. I'm still salty the libc+environment based approach was rejected by maintainers. And it also cannot be fixed on the kernel side since buffering is a purely userspace feature.

YuxiLiuWired 1 year ago

I thought this was going to be "the birth of standard deviation".

jchw 1 year ago

This is not exactly the most earth-shattering revelation, but man: handling and communicating errors always seems to be a source of a vast amount of inelegance in software.

I'd argue we haven't really "solved" the optimal way to do error handling in programming: using union types remains one of the best options, but even that has its downsides. Consider the ergonomics of forwarding an error type multiple layers in a Rust program: you can remove some of the boilerplate by strapping macros on top, but I'd argue that's more of a bandage than a fix. Most other programming languages are either using exceptions, which I don't like as they complicate control flow behavior significantly, or simply ignore error handling entirely (like C and Go; both of them provide some standard facilities for dealing with error values, but handling them is completely manual. I do like this, since it's very straightforward, but it nonetheless is just sidestepping the problem.) And even trying to keep it simple can create new problems, like of course the way pthreads has to contort errno into a thread-local, for reasons obvious.

And while stderr has created a somewhat unified channel for dumping errors into, once they've bubbled up to the point where the program needs to output them, there's an almost unlimited number of opinions on exactly how error logging should work. Some software won't use stderr by default, others only use stderr for specific types of errors. Some software dumps everything that isn't data output into stderr, including e.g. `--help` text, whereas some software uses stdout for anything that isn't explicitly an error (which often leads to me needing to pipe --help to less twice: once without, and once with 2>&1). Categorization of error logging is also somewhat contentious: should there be a "warning" severity? Should you split errors into modules? Formatting, too: what should be in a log line? Should logs be structured into a machine-readable format such as JSON?

It was probably a bad omen that even very old versions of UNIX ran into problems dealing with error logging and wound up needing to bifurcate things. Few programs feel as 'lazy' as UNIX; if UNIX couldn't ignore the problem, god knows the rest of the software was doomed.

JadeNB 1 year ago

Here I was thinking I might have another fun little bit of trivia for my Statistics class, like the T-test one from the other day (https://news.ycombinator.com/item?id=40485313). Different standard error!

stuaxo 1 year ago

Some things I want from std streams:

Timestamping or sync points, so that if I pipe multiple streams (say stdout and stderr) I can keep them in sync further along when various buffers may have been involved.

Metadata, such as magic file types.

Structured data (this may link with meta data, and maybe there is even a way programs could negotiate what to send to each other).

nikeee 1 year ago

Have there been systems which support other additional streams?

When using PowerShell, I find it useful that it handles progress separately, so it doesn't interfere with piping (putting aside that cmdlets are .NET-based objects anyway). Is there something like stdprogress?

jgalt212 1 year ago

> As one might expect, Bell Labs didn't use the paper tape input

tangentially related.

The Great 202 Jailbreak - Computerphile

https://www.youtube.com/watch?v=CVxeuwlvf8w

starring the inimitable Professor Brailsford.

nintendo1889 1 year ago

Wow. And now we can run webservers on printers.

immibis 1 year ago

https://thedailywtf.com/articles/The-Killing-Job

kragen 1 year ago

i think contiki needs less than 16k of ram to run a webserver, most of which is the tcp stack. my own httpdito is about 2k of code and uses a 1024-byte data buffer, but that's sweeping tcp under the linux kernel rug

this is just to say that you could probably have run a webserver on a pdp-8/s, which was about the size of an atx case and would be a reasonable controller to build into a phototypesetter at the time