A summary of how not to measure latency (bravenewgeek.com)
This is completely not true. Latency depends heavily on the client making the request and on the object being requested. You are going to get clustering, not an even distribution.
Now, if the point is that something will be delayed, that's true. And it's true that many people don't realize that. There's the classic example where if everyone tries to be five minutes early, your group is still going to be late.
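To put a rough number on the "five minutes early" example, here is a minimal sketch. It assumes each person (or each backend call behind a page) is late independently with the same probability, which is a simplification given the clustering mentioned above. The function name `prob_any_late` and the figures in the usage lines are illustrative, not taken from the article.

```python
def prob_any_late(p_late: float, n: int) -> float:
    """Chance that at least one of n participants is late.

    Assumption: each participant is late independently with
    probability p_late. Real latencies tend to cluster, so this
    is only a first-order estimate.
    """
    if not 0.0 <= p_late <= 1.0:
        raise ValueError("p_late must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # P(everyone on time) = (1 - p_late) ** n, so take the complement.
    return 1.0 - (1.0 - p_late) ** n


# A group of 20 where each person is on time 95% of the time.
group_late = prob_any_late(0.05, 20)

# A page that fans out to 100 calls, each with a 1% chance of being slow.
page_slow = prob_any_late(0.01, 100)

print(f"group late: {group_late:.3f}, page slow: {page_slow:.3f}")
```

Under the independence assumption, both cases come out at roughly two chances in three that something is late, even though each individual is almost always on time.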
The real lesson is to analyse your critical path to death and ensure it is as resilient as possible. And if possible, real data is buckets more meaningful than conventional load tests.
I also don't see anything in here about how to get meaningful metrics from real users. The W3C Navigation Timing API has really shed a ton of light on things that are commonly forgotten.
In summary, let's focus on the max latency, home in on which backend exhibited said latency, identify the depth of the queue at the time that latency was experienced, and use that information to model the impact on users. From this, I expect you can draw some meaningful percentiles in terms of latency distributions, without having to measure more data points than is feasible without degrading latency further.
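A rough sketch of the first three steps (find the max latency, find the backend, read off the queue depth) might look like the following. The `Sample` record, its field names, and the sample values are all hypothetical, and the impact estimate assumes a FIFO queue in which every request waiting behind the slow one was delayed as well.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Sample:
    """One latency observation (hypothetical record layout)."""
    backend: str
    latency_ms: float
    queue_depth: int  # requests waiting on that backend when sampled


def worst_case(samples: Iterable[Sample]) -> Tuple[Sample, int]:
    """Return the slowest sample and an estimate of affected requests."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    worst = max(samples, key=lambda s: s.latency_ms)
    # Assumption: FIFO queue, so the slow request plus everything
    # queued behind it felt the delay.
    affected = worst.queue_depth + 1
    return worst, affected


# Illustrative data only.
observations = [
    Sample("db-1", 12.0, 0),
    Sample("db-2", 950.0, 12),
    Sample("cache-1", 3.5, 1),
]

worst, affected = worst_case(observations)
print(f"{worst.backend}: {worst.latency_ms} ms, ~{affected} requests affected")
```

Turning this into percentiles would still need a model of how long each queued request waited, which this sketch does not attempt.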
Am I misunderstanding something? I'm no math whiz, this is mostly intuition.