Syslog woes (vitobotta.com)
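while (written < 1024 * 1024) {          // 1 MB total
    written += write(fd, buf, chunksz);  // one write() per chunk; errors ignored
    // fsync(fd);                        // uncomment for the fsync run
    // fdatasync(fd);                    // uncomment for the fdatasync run
}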
Here are some of the times it took to write a 1MB file (Fedora 14, old hard drive).
chunksz = 16:
No sync'ing: 926ms
fdatasync: 727114ms
fsync: 3024498ms
(yes, THAT slow)
chunksz = 256:
No sync'ing: 65ms
fdatasync: 53786ms
fsync: 191553ms
chunksz = 1024:
No sync'ing: 20ms
fdatasync: 20232ms
fsync: 48039ms

If your apps have some heavy logging to do, and you can't afford some more expensive setup, then you really should forget about syslog, because normally it doesn't scale well if used with high traffic web applications. It's best used for things like logging done by daemons and system components that do not suffer from the load that a web app with decent traffic can have.
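For anyone who wants to reproduce numbers like the ones above, the write loop posted earlier can be fleshed out into something that compiles. This is only a sketch under assumptions: the original test program was not posted in full, so the function name `timed_write`, the `sync_mode` enum and the 4096-byte cap on the chunk size are all made up here. It times the same three variants (no sync, fdatasync after every write, fsync after every write).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Sync policy applied after every write(); the names are illustrative. */
enum sync_mode { SYNC_NONE, SYNC_FDATASYNC, SYNC_FSYNC };

/*
 * Write `total` bytes to `path` in chunks of `chunksz` bytes, syncing after
 * each chunk according to `mode`. Returns elapsed wall-clock milliseconds,
 * or -1 on an I/O error.
 */
long timed_write(const char *path, size_t total, size_t chunksz,
                 enum sync_mode mode)
{
    char buf[4096];
    struct timespec t0, t1;
    size_t written = 0;

    /* Assumption of this sketch: chunks fit in the local buffer. */
    assert(chunksz > 0 && chunksz <= sizeof buf);
    memset(buf, 'x', sizeof buf);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while (written < total) {
        /* Never write past `total`, even if it is not a multiple of chunksz. */
        size_t want = (total - written < chunksz) ? total - written : chunksz;
        ssize_t n = write(fd, buf, want);
        if (n < 0) {
            close(fd);
            return -1;
        }
        written += (size_t)n;

        if (mode == SYNC_FSYNC && fsync(fd) != 0) {
            close(fd);
            return -1;
        }
        if (mode == SYNC_FDATASYNC && fdatasync(fd) != 0) {
            close(fd);
            return -1;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    return (long)(t1.tv_sec - t0.tv_sec) * 1000L
         + (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

Calling `timed_write("/tmp/test.bin", 1024 * 1024, 16, SYNC_FSYNC)` corresponds to the slowest row above. Expect the absolute numbers to vary a lot with the disk and filesystem.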
I've seen syslog incite a holy war on HN recently, with one camp espousing that syslog (really rsyslog and syslog-ng) does not scale. As an interested third party, it'd be very helpful to have references to articles or blog posts detailing just how it didn't scale, or at least some first-hand commentary. The article describes an issue with the syslogd daemon in which it grinds to a halt at a mere 10 requests sent amongst 5 concurrent threads. That kind of issue is not going to burn you months down the road when your production deployment refuses to scale. Any stories of the newer open source implementations of syslog failing or severely degrading at production load, or conversely, syslog success stories?
I'm on rsyslogd or syslog-ng, depending on the machine at this point. Configurations vary, but they both seem to hold up well to consistently writing 100 messages/sec (and peaks of 20k/minute when I get portscanned) on a VM without causing stuff to break.
The no-fsync option is important, and really, fsync is only important for kernel logs in the event of a crash, and not a whole lot else. Not mail, not messages, not syslog, and certainly not debug. You get (roughly) max 100 fsyncs a second unless you're spending extra money on your disks. That's a really damn small budget given that you have megabytes and gigacycles for other things.
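To put that budget in code: if every line gets its own fsync, roughly 100 fsyncs a second means roughly 100 lines a second. Syncing once per batch of N lines stretches the same budget to about 100 * N lines a second, at the cost of losing up to a batch's worth of lines in a crash. A minimal sketch of that idea, assuming hypothetical names (`batch_log`, `batch_log_write`, `batch_log_flush`); it is not how rsyslog or syslog-ng are implemented.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical append-only log that fsyncs once per `batch` lines. */
struct batch_log {
    int fd;
    unsigned batch;      /* lines per fsync */
    unsigned pending;    /* lines written since the last fsync */
    unsigned long syncs; /* fsync calls issued so far */
};

/* Open (or create) the log file in append mode. Returns 0 on success. */
int batch_log_open(struct batch_log *bl, const char *path, unsigned batch)
{
    assert(batch > 0);
    bl->fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    bl->batch = batch;
    bl->pending = 0;
    bl->syncs = 0;
    return bl->fd < 0 ? -1 : 0;
}

/* Force any unsynced lines to disk. Does nothing if there are none. */
int batch_log_flush(struct batch_log *bl)
{
    if (bl->pending == 0)
        return 0;
    if (fsync(bl->fd) != 0)
        return -1;
    bl->pending = 0;
    bl->syncs++;
    return 0;
}

/* Append one line; spend an fsync only when a full batch has accumulated. */
int batch_log_write(struct batch_log *bl, const char *line)
{
    size_t len = strlen(line);
    if (write(bl->fd, line, len) != (ssize_t)len)
        return -1;
    if (++bl->pending >= bl->batch)
        return batch_log_flush(bl);
    return 0;
}
```

With `batch` set to 1 this degenerates to fsync-per-line, which is the case the 100-a-second ceiling applies to directly. A real logger would probably also flush on a timer so that a quiet log does not sit unsynced indefinitely.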
The default configuration (at least in Debian/Ubuntu) writes entries to lots of logs. There are catch-alls, there's mail.info, mail.err, mail.log, daemon, and such. You need to prune that down and make sure that you're not writing your debug logs in 2 or 3 places with fsync.
I'm finding munin and my log greppers are a whole lot more demanding on the box than syslog.
One of the first things I used to do was compile my own rsyslog and disable syslog. Haven't had to do that for a while.
FreeBSD
Syslog (over the network, that means not UDP but a reliable TCP- or SCTP-based protocol) with fsync() after each write is a reliable solution.
If you're logging mostly pointless runtime data (like webserver access logs for static files, which, most of the time, nobody ever cares about) with reliable syslog, you're doing it wrong. (That's why webservers don't generally use syslog, but write directly to files.) If you're logging important transactions with an unreliable logging system, you're doing it wrong, too.
There's nothing wrong with syslog. Just use the right tool for the right job.
...message packets are limited to 1024 bytes
linking to the older RFC 3164 and not to RFC 5424[1], which states:
There is no upper limit per se. Each transport mapping defines the minimum maximum required message length support, and the minimum maximum MUST be at least 480 octets in length.
Servers like syslog-ng do support messages larger than 1024 bytes.
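To make the two limits concrete, here is a rough sketch that formats a minimal RFC 5424 message and checks its length against both numbers. The helper names (`format_rfc5424`, `always_accepted`, `fits_legacy_packet`) are invented for this example. The message leaves PROCID, MSGID and STRUCTURED-DATA as the nil value "-", and the caller is assumed to pass an already formatted timestamp.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum {
    RFC3164_MAX_PACKET = 1024, /* RFC 3164: total packet length limit */
    RFC5424_MIN_MAX    = 480   /* RFC 5424: every receiver must accept this much */
};

/*
 * Format a minimal RFC 5424 message into `out`. Returns the full message
 * length in octets (as snprintf does, even if `out` was too small), or a
 * negative value on an output error.
 */
int format_rfc5424(char *out, size_t outsz, int facility, int severity,
                   const char *timestamp, const char *host,
                   const char *app, const char *msg)
{
    assert(facility >= 0 && facility <= 23);
    assert(severity >= 0 && severity <= 7);

    int pri = facility * 8 + severity; /* PRI value: facility * 8 + severity */
    return snprintf(out, outsz, "<%d>1 %s %s %s - - - %s",
                    pri, timestamp, host, app, msg);
}

/* 1 if every RFC 5424 transport is required to accept this message. */
int always_accepted(const char *message)
{
    return strlen(message) <= RFC5424_MIN_MAX;
}

/* 1 if the message also fits the older 1024-byte packet limit. */
int fits_legacy_packet(const char *message)
{
    return strlen(message) <= RFC3164_MAX_PACKET;
}
```

Anything between the two checks (longer than 480 octets but within whatever the transport and server actually support) depends on the implementation, which is the point being made about syslog-ng above.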
Why does he even mention webapps?