MTR: 'traceroute' and 'ping' in a single tool

MTR: 'traceroute' and 'ping' in a single tool (bitwizard.nl)

106 points by program 1 year ago | 47 comments

I normally have `mtr 1.1`¹ running in the background, in the third display mode, which is a 2D histogram—time in the x axis, hops in the y axis, and ASCII character/colour for ping time. When problems occur, this tends to let you easily see the nature of the problem—total loss, elevated packet loss, elevated response times; and to see the location of the problem—local network, local ISP, public internet. There are definitely occasions for loss%, sent, last/average/best/worst/stddev ping and such things as are found in the first display mode, but most of the time I find the histogram view most useful as the starting point.

You can make mtr start in this view with --displaymode=2 (direct command line arguments, `mtr --displaymode=2 …`; or shell alias, `alias mtr="mtr --displaymode=2"`; or set environment variable MTR_OPTIONS=--displaymode=2).

Screenshot of this mode: https://temp.chrismorgan.info/2025-02-06-hn-42924182-mtr-dis...

—⁂—

¹ 1.1 = 1.0.0.1 = Cloudflare public DNS, a convenient nearby public internet endpoint.

jlmcguire 1 year ago |

MTR is a useful tool, but it is a somewhat common source of illusory issues, since it generates so many ICMP Time Exceeded packets that routers stop replying to other folks running traces. It's important, as others said, to understand that these probes aren't testing the data path of a network but rather the control plane path.

cootsnuck 1 year ago | |

What tools exist for people to test the data path of a network reliably? In my experience MTR has worked well enough to approximate network routing issues. But it has always felt like a blunt tool given it can't do anything about hops with firewalls.

jlmcguire 1 year ago | | |

It's tough. iperf is a reasonable tool. It works by setting up TCP connections and actually transferring data.
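A minimal sketch of what that looks like in practice, assuming iperf3 is the variant installed on both ends and that you control a machine on the far side of the path. The wrapper name `throughput_test` and the 10-second duration are illustrative choices for this sketch only:

```shell
# Assumption: the far end is already running `iperf3 -s` (server mode).
# throughput_test is a hypothetical helper name used only for this sketch.
throughput_test() {
  server=${1:-}
  if [ -z "$server" ]; then
    echo "usage: throughput_test SERVER" >&2
    return 2
  fi
  iperf3 -c "$server" -t 10      # TCP test, this host sends to the server
  iperf3 -c "$server" -t 10 -R   # -R reverses direction: the server sends to this host
}
```

Usage would be something like `throughput_test iperf.example.net` (a placeholder hostname). Testing both directions is a design choice here, since the forward and return paths can differ.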

I like the work https://fasterdata.es.net/ does. They provide clear guides and set expectations if you want to get more bandwidth out of a connection.

neilv 1 year ago |

MTR has long been one of the first little tools that I install on workstations.

    sudo apt install mtr-tiny

I also have a hotkey to pop it up in a window, pinging to some host that'll always be somewhere on the other side of any ISP from me. Whenever I suddenly suspect a networking problem from my laptop, I hit the hotkey as the first troubleshooting step. MTR starts to narrow down a few different problems very quickly.

eudhxhdhsb32 1 year ago |

Mtr is indeed nice.

One thing I've not understood: why do some hops show consistently higher ping times than hops that come after them in the same trace?

Is it indicating that the router is faster at forwarding packets than responding to ping requests?

p_ing 1 year ago | |

This is always worth a (re)read to understand traceroute:

https://archive.nanog.org/sites/default/files/traceroute-201...

Bluecobra 1 year ago | | |

^ This should be required reading for anyone using traceroute.

wrigby 1 year ago | |

> Is it indicating that the router is faster at forwarding packets than responding to ping requests?

Exactly this. In most “real” routers, forwarding (usually) happens in the “data plane”. It’s handled by an ASIC that has a routing table accessible to it in RAM. A packet comes in on an interface, a routing decision is made, and it goes out another interface - all of this happens with dedicated hardware. Pings (ICMP Echo requests), however, get forwarded by this ASIC to a local CPU, where they are handled by software (in the “control plane”).

You’re really seeing different response times from the two control planes - one may be more loaded or less powerful than another, regardless of the capacity of their data planes.

linsomniac 1 year ago | | |

This is also why you may see packet loss at one particular hop but then responses from hops beyond it. The hop with packet loss in this case probably has an overwhelmed CPU, rather than indicating that a particular network link has packet loss. mtr reporting packet loss at a hop is only reliable if every hop after it has similar packet loss.
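That rule of thumb can be sketched as a small filter over `mtr --report` output. This assumes the plain-text report layout in which hop lines start with `N.|--` followed by the host and Loss% columns, and it treats "similar" as "within 5 percentage points", which is an arbitrary threshold. `flag_persistent_loss` and `TOLERANCE` are made-up names:

```shell
# flag_persistent_loss and TOLERANCE are hypothetical names for this sketch.
# Reads `mtr --report` text on stdin; prints: hop host loss verdict
flag_persistent_loss() {
  awk -v tol="${TOLERANCE:-5}" '
    # Hop lines look like "  2.|-- 10.0.0.1  30.0%  10 ..."; headers are skipped.
    $1 ~ /^[0-9]+[.][|]--$/ {
      n++
      host[n] = $2
      loss[n] = $3 + 0            # numeric prefix of "30.0%" is 30
    }
    END {
      min_later = 101             # lowest loss among hops after hop i
      for (i = n; i >= 1; i--) {
        if (loss[i] == 0)                     verdict[i] = "ok"
        else if (min_later >= loss[i] - tol)  verdict[i] = "persists"
        else                                  verdict[i] = "isolated"
        if (loss[i] < min_later) min_later = loss[i]
      }
      for (i = 1; i <= n; i++)
        printf "%d %s %.1f%% %s\n", i, host[i], loss[i], verdict[i]
    }'
}
```

Something like `mtr --report --report-cycles 100 1.1 | flag_persistent_loss` would feed it. A hop marked `isolated` shows loss that vanishes at later hops, which by the reasoning above points at that router deprioritising its replies rather than at a lossy link. A hop marked `persists` is the kind worth investigating.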

Maybe the only thing I've explained more in my career than this is why it's ok that your Linux box has no "free" memory.

commandersaki 1 year ago | | |

It also doesn’t help that mtr’s ICMP handling code is just bad: it counts packets that actually arrive as lost.

oxygen_crisis 1 year ago | |

Traceroute doesn't use ping requests except with the old Windows binary. Usually it uses "Time-to-live (TTL) exceeded in transit" messages.

Beyond that technicality, your guess is often right... Routers will frequently prioritize forwarding packets over sending the TTL exceeded packets that tools like MTR use to measure response times.

ta1243 1 year ago | | |

Also, you can easily have the TTL expired message take a different route on the return path. Indeed, the same applies to your normal connections: asymmetric routing can be a pain, especially in networks with RPF issues (multicast ones are a particular pain point) and with stateful firewalls, but most of the time it's fine. You just need to be aware.

Obviously you know, but for anyone else reading, a modern traceroute tool (like mtr) can send ICMP, UDP or TCP probes, on generic or specific ports. Indeed, the default for mtr on my laptop is to use ICMP.
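For anyone wanting to try the different probe types, here is a small sketch using mtr's `--udp`, `--tcp` and `--port` options. The wrapper name `mtr_proto` is made up, and port 443 is an arbitrary choice for the TCP case:

```shell
# mtr_proto is a hypothetical wrapper; it only picks flags and hands off to mtr.
mtr_proto() {
  proto=${1:-}
  [ $# -gt 0 ] && shift
  case "$proto" in
    icmp) mtr "$@" ;;                    # no protocol flag: ICMP echo probes
    udp)  mtr --udp "$@" ;;              # UDP datagrams
    tcp)  mtr --tcp --port 443 "$@" ;;   # TCP SYN probes to port 443 (arbitrary)
    *)    echo "usage: mtr_proto icmp|udp|tcp HOST" >&2
          return 2 ;;
  esac
}
```

For example, `mtr_proto tcp 1.1` versus `mtr_proto icmp 1.1`. Comparing the two against the same host can hint at whether a firewall along the way treats the probe types differently.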

toast0 1 year ago | |

Most likely it's as you described: router N forwards packets much faster than it generates ICMP TTL exceeded messages, and router N+1 is nearby and generates ICMP replies faster.

However, it could also be the case that the routing back to you is significantly different, so you can have a much longer path to you from router N than router N+1.

This is more likely to happen on routes that cross oceans. Say you're tracing from the US to Brazil. If router N and N+1 are both in Brazil, but N sends return packets through Europe and N+1 sends through Florida, N+1 returns will arrive significantly sooner.

rixed 1 year ago | |

> Is it indicating that the router is faster at forwarding packets than responding to ping requests?

I believe this is indeed the reason most of the time. Replying to a TTL expiration with an ICMP error, or replying to an echo request, is very low priority for a router.

This latency in error message generation may even be a better signal of the router load than the latency of the actual trip through it.

commandersaki 1 year ago |

Great tool for misleading results.