API Monitoring: Up Is Not Enough (2014) (pagerduty.com)

33 points by heitortsergent 9 years ago | 7 comments

elevensies 9 years ago

This reminds me of Steve Yegge's Google Platforms rant, which I'm sure I've posted and reposted before:

monitoring and QA are the same thing. You'd never think so until you try doing a big SOA. But when your service says "oh yes, I'm fine", it may well be the case that the only thing still functioning in the server is the little component that knows how to say "I'm fine, roger roger, over and out" in a cheery droid voice. In order to tell whether the service is actually responding, you have to make individual calls. The problem continues recursively until your monitoring is doing comprehensive semantics checking of your entire range of services and data, at which point it's indistinguishable from automated QA. So they're a continuum.

https://gist.github.com/chitchcock/1281611

In smaller projects I've worked on, I'm not willing to expend the level of effort required for truly comprehensive monitoring, but I try to do some kind of end-to-end test of the whole system as part of the monitoring -- something that requires all components to be alive and working -- so that I'm aware of issues at least as fast as the people using the system.
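A minimal sketch of that difference in Python. Everything here is hypothetical (the `FakeService` class, its `db_alive` flag, the `monitor:probe` key); it only illustrates how a shallow health check can keep saying "ok" while a write-then-read probe notices that a dependency is gone.

```python
from dataclasses import dataclass, field


@dataclass
class FakeService:
    """Hypothetical stand-in for a real system: a health endpoint plus a datastore."""
    db_alive: bool = True
    store: dict = field(default_factory=dict)

    def health(self) -> str:
        # Shallow check: only proves that the process can still answer.
        return "ok"

    def write(self, key: str, value: str) -> None:
        if not self.db_alive:
            raise ConnectionError("datastore unreachable")
        self.store[key] = value

    def read(self, key: str) -> str:
        if not self.db_alive:
            raise ConnectionError("datastore unreachable")
        return self.store[key]


def end_to_end_check(service: FakeService) -> bool:
    """Write a sentinel value and read it back, so the datastore is exercised too."""
    try:
        service.write("monitor:probe", "ping")
        return service.read("monitor:probe") == "ping"
    except Exception:
        # Any failure along the path counts as "down" for monitoring purposes.
        return False
```

With `db_alive` set to `False`, `health()` still returns `"ok"` while `end_to_end_check()` returns `False`, which is the "roger roger" situation from the quote.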

ryen 9 years ago

Some shops call this "synthetic monitoring": the tests are a subset of a full integration suite, small enough to run at regular, frequent intervals, say every 10 or 15 minutes. They cover "the happy path" for some typical use cases.

This monitoring shouldn't replace other system-level monitoring, which can alert on abnormal memory use, disk space, error rates, etc.
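A rough sketch of such a synthetic check in Python, under stated assumptions: the step names, the `alert` callback, and the `runs` and `sleep` parameters are all made up for illustration, and a real setup would more likely hand the scheduling to cron or a monitoring service.

```python
import time
from typing import Callable, List, Optional, Tuple

# A step is (name, action); the action raises an exception on failure.
Step = Tuple[str, Callable[[], None]]


def run_happy_path(steps: List[Step]) -> Tuple[bool, Optional[str], float]:
    """Run the steps in order; return (ok, name_of_failed_step, elapsed_seconds)."""
    start = time.monotonic()
    for name, action in steps:
        try:
            action()
        except Exception:
            # Stop at the first failing step so the alert can name it.
            return False, name, time.monotonic() - start
    return True, None, time.monotonic() - start


def monitor(
    steps: List[Step],
    interval_seconds: float,
    alert: Callable[[str], None],
    runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Repeat the happy path `runs` times, calling `alert` on every failure."""
    for i in range(runs):
        ok, failed, elapsed = run_happy_path(steps)
        if not ok:
            alert(f"synthetic check failed at step '{failed}' after {elapsed:.2f}s")
        if i < runs - 1:
            sleep(interval_seconds)
```

Something like `monitor(steps, 600, print, runs=6)` would run the path every 10 minutes for an hour; the `sleep` parameter is injectable only so the loop can be exercised without waiting.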

dasil003 9 years ago

The split is also analogous to unit testing vs. integration/system testing. You want system-wide monitoring to give the strongest guarantee that a service is actually up and available to customers, but you also want monitoring on each component so you can pinpoint the source of failures more quickly.
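As a small illustration of combining the two levels, here is a Python sketch. The `diagnose` function and the component names in the usage note are hypothetical; the point is only that the system-wide check answers "are customers affected?" while the per-component checks answer "where should I look?".

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[], bool]


def safe(check: Check) -> bool:
    """Treat an exception from a check the same as a failed check."""
    try:
        return bool(check())
    except Exception:
        return False


def diagnose(
    system_check: Check, component_checks: Dict[str, Check]
) -> Tuple[bool, List[str]]:
    """Return (system_up, names of failing components)."""
    system_up = safe(system_check)
    failing = [name for name, check in component_checks.items() if not safe(check)]
    return system_up, failing
```

For example, `diagnose(end_to_end, {"api": check_api, "db": check_db})` (all three callables being whatever checks you already have) might return `(False, ["db"])`, which tells you both that the service is down for users and which component to inspect first.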

johns 9 years ago

Author here. It's been a while since I wrote this and we've learned a ton more about what good API monitoring looks like. So if you have any questions, let me know!

andyfleming 9 years ago

Are there any updated resources you've written or recommendations based on what you've since learned?

dozzie 9 years ago

Now that we have an idea of what API monitoring looks like, how would you monitor documentation?

nytopop 9 years ago

With documentation documentation.