API Monitoring: Up Is Not Enough (2014) (pagerduty.com)
monitoring and QA are the same thing. You'd never think so until you try doing a big SOA. But when your service says "oh yes, I'm fine", it may well be the case that the only thing still functioning in the server is the little component that knows how to say "I'm fine, roger roger, over and out" in a cheery droid voice. In order to tell whether the service is actually responding, you have to make individual calls. The problem continues recursively until your monitoring is doing comprehensive semantics checking of your entire range of services and data, at which point it's indistinguishable from automated QA. So they're a continuum.
https://gist.github.com/chitchcock/1281611
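To make the "I'm fine" problem concrete, here is a minimal sketch of a check that looks at what a call actually returned instead of trusting a 200. The function name `semantic_check` and the required fields `id` and `status` are made up for illustration; a real check would validate whatever the service in question is supposed to return.

```python
import json
from typing import List, Sequence


def semantic_check(status: int, body: str,
                   required_fields: Sequence[str] = ("id", "status")) -> List[str]:
    """Return a list of problems with a response; an empty list means it looks sane.

    The field names are hypothetical placeholders for a real service's contract.
    """
    if status != 200:
        return [f"unexpected status {status}"]
    try:
        payload = json.loads(body)
    except ValueError:
        # A cheery "I'm fine" string lands here: the server answered,
        # but not with anything a client could use.
        return ["body is not valid JSON"]
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]
    return [f"missing field {name!r}" for name in required_fields
            if name not in payload]
```

The more fields and invariants a check like this covers, the closer it gets to an automated QA suite, which is the continuum described above.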
In smaller projects I've worked on, I'm not willing to expend the level of effort required for truly comprehensive monitoring, but I try to do some kind of end-to-end test of the whole system as part of the monitoring -- something that requires all components to be alive and working -- so that I'm aware of issues at least as fast as the people using the system.
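A rough sketch of what such an end-to-end probe could look like, assuming the system exposes some way to write a value and read it back. The `write` and `read` callables, the `ProbeResult` type, and the 2-second latency budget are all assumptions for illustration; in practice they would wrap real client calls that pass through every component.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    ok: bool
    latency_s: float
    detail: str


def end_to_end_probe(write: Callable[[str, str], None],
                     read: Callable[[str], Optional[str]],
                     max_latency_s: float = 2.0) -> ProbeResult:
    """Write a unique value through the system and read it back.

    The probe passes only if every component on the path did its job
    and the round trip stayed within the (assumed) latency budget.
    """
    key = f"probe-{uuid.uuid4()}"
    value = uuid.uuid4().hex
    start = time.monotonic()
    try:
        write(key, value)
        got = read(key)
    except Exception as exc:
        return ProbeResult(False, time.monotonic() - start, f"exception: {exc!r}")
    elapsed = time.monotonic() - start
    if got != value:
        return ProbeResult(False, elapsed, "read-back mismatch")
    if elapsed > max_latency_s:
        return ProbeResult(False, elapsed, "too slow")
    return ProbeResult(True, elapsed, "ok")
```

For example, with an in-memory dict standing in for the real system, `store = {}; end_to_end_probe(store.__setitem__, store.get)` returns a passing result. Run on a schedule and wired to an alert, a failing result shows up at roughly the same time users would notice.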
This kind of end-to-end monitoring shouldn't replace other system-level monitoring, which can alert on abnormal memory usage, disk space, error rates, etc.
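As a small illustration of the system-level side, here is a sketch of a threshold check that could run alongside an end-to-end test. The metric names and limits are invented for the example; `shutil.disk_usage` is from the Python standard library, and the other metrics would come from whatever collector is already in place.

```python
import shutil
from typing import Dict, List


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def breached(metrics: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """Names of metrics that exceed their limit (metrics without a limit are ignored)."""
    return sorted(name for name, value in metrics.items()
                  if name in limits and value > limits[name])
```

Something like `breached({"disk_pct": disk_used_percent(), "error_rate": 0.01}, {"disk_pct": 90.0, "error_rate": 0.05})` would then list whichever metrics need attention.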