Practical Introduction to Prometheus for Developers (github.com)

41 points by melzarei 6 years ago | 3 comments

personomas 6 years ago

> the second is how the 99th percentile reported by the summary (1s) is quite different than the one estimated by the histogram_quantile() function (~2.2s). How can this be?

> ... for the quantile estimation from the buckets of a histogram to be accurate, we need to be careful when choosing the bucket layout; if it doesn't match the range and distribution of the actual observed durations, you will get inaccurate quantiles as a result.

> According to the previous plot, all slow requests from our application are falling into the 1s-2.5s bucket, resulting in this loss of precision when calculating the 99th percentile.

Can anyone explain mathematically why this is happening? I think I understand conceptually, but if I could also perhaps understand it from a mathematical perspective, I would feel much more confident!

By the way, I think it's a great introduction!
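
On the mathematical side of the question above: a histogram only stores how many observations fell at or below each bucket boundary, so histogram_quantile() has to assume the observations are spread evenly inside the bucket that contains the target rank and interpolate linearly: estimate = lower + (upper - lower) * (rank - count_below) / count_in_bucket. Below is a minimal sketch of that calculation in Python. It is not the Prometheus implementation. The bucket bounds and the 95 fast / 5 slow split are assumptions picked to illustrate the effect, not numbers taken from the article.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    ending with the +Inf bucket. Simplified sketch of linear interpolation.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, count_below = 0.0, 0  # the first bucket is assumed to start at 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # No finite upper edge to interpolate towards.
                return lower_bound
            in_bucket = cumulative - count_below
            fraction = (rank - count_below) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, count_below = upper_bound, cumulative
    return math.nan

# Assumed workload: 95 fast requests, 5 slow ones just over one second.
samples = [0.1] * 95 + [1.001] * 5

# Assumed bucket layout (upper bounds in seconds).
bounds = [0.5, 1.0, 2.5, 5.0, math.inf]
buckets = [(b, sum(1 for s in samples if s <= b)) for b in bounds]

# Exact 99th percentile from the raw samples (nearest-rank method).
true_p99 = sorted(samples)[math.ceil(0.99 * len(samples)) - 1]

# Estimate from the buckets alone.
estimate = histogram_quantile(0.99, buckets)

print(f"true p99 = {true_p99}s, bucket estimate = {estimate:.2f}s")
```

With these assumed numbers the target rank is 99 out of 100, 95 observations sit below the 1s boundary, and all 5 slow ones land in the 1s-2.5s bucket, so the estimate is 1 + 1.5 * (99 - 95) / 5 = 2.2s even though every slow request took about 1s. The summary works from the observed values themselves rather than from bucket counts, so it stays close to 1s. A bucket boundary just above 1s would shrink the interpolation range and pull the estimate back towards the true value.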

collyw 6 years ago

Good timing, I just started working with Prometheus this week.

machawinka 6 years ago

Well written.