Average latency is easy to market because it fits cleanly into one number. Trading systems do not trade inside averages. They trade inside distributions, tails, and unstable path behavior.

Why averages hide the operational risk

A candidate path can look attractive on average while still producing inconsistent behavior that shows up in the moments the desk actually cares about. A cleaner median with worse tails can still be the wrong path for the workload.

That is why benchmark comparisons need percentile-aware logic. Median RTT matters, but it is not enough on its own.
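A minimal sketch of what percentile-aware comparison can look like, assuming RTT samples are available as plain lists of microsecond values. The sample data, the `percentile` helper (nearest-rank method), and the `summarize` function are illustrative assumptions, not part of any specific benchmark tool.

```python
import math
from statistics import mean, median


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Report the average next to the statistics it tends to hide."""
    return {
        "mean": mean(samples),
        "p50": median(samples),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


# Hypothetical RTT samples in microseconds (made up for illustration).
path_a = [100] * 98 + [400, 500]  # lower mean and median, heavy tail
path_b = [110] * 98 + [120, 125]  # slightly slower typical case, tight tail

for name, samples in (("path_a", path_a), ("path_b", path_b)):
    print(name, summarize(samples))
```

In this made-up data, path_a wins on both mean and median, yet its p99 is more than three times that of path_b. A single-number comparison would never surface that.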

What tails and jitter change

Jitter and p99 (99th-percentile) behavior answer a more practical question: how often does the path stop behaving like the median case? For market making, quoting, hedging, and venue-to-venue balancing, those deviations can matter more than a small median improvement.

  • Lower jitter improves consistency of decision timing
  • Lower tails reduce worst-case execution surprises
  • Stable path behavior improves trust in the runtime choice

Tail-aware benchmarking is not academic overhead. It is the difference between a path that looks good in a chart and a path the desk can actually rely on.
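One way to put numbers on "how often does the path stop behaving like the median case" is sketched below. Two assumptions are baked in: jitter is taken as the mean absolute difference between consecutive samples (one of several definitions in use), and a sample counts as a deviation when it exceeds a hypothetical tolerance of 1.5 times the median. Both function names and the sample values are illustrative.

```python
from statistics import mean, median


def jitter(samples):
    """Mean absolute difference between consecutive samples (assumed definition)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return mean(abs(b - a) for a, b in zip(samples, samples[1:]))


def deviation_rate(samples, tolerance=1.5):
    """Fraction of samples slower than tolerance x median (tolerance is a hypothetical choice)."""
    threshold = tolerance * median(samples)
    return sum(1 for s in samples if s > threshold) / len(samples)


# Hypothetical RTT samples in microseconds, in arrival order.
rtts = [100, 102, 99, 101, 300, 100, 98, 101, 100, 250]

print("jitter:", round(jitter(rtts), 2))
print("deviation rate:", deviation_rate(rtts))
```

The tolerance should come from what the workload can absorb, not from the benchmark itself. A quoting strategy and a slow hedging loop would likely pick different values.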

How candidate paths should be compared

The benchmark should compare the current path against feasible candidates under the same measurement conditions, then show where the distribution actually shifts. A path that wins only on one flattering statistic is not a production recommendation yet.
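The comparison can be sketched as a per-percentile delta between the current path and a candidate, so the report shows where the candidate is better and where it is worse. The helper names, the percentile set, and the sample data are assumptions for illustration. The sketch also assumes both sample sets were collected over the same window and between the same endpoints.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def distribution_shift(current, candidate, percentiles=(50, 90, 99)):
    """Candidate minus current at each percentile; negative means the candidate is faster."""
    return {
        f"p{p}": percentile(candidate, p) - percentile(current, p)
        for p in percentiles
    }


# Hypothetical RTT samples in microseconds (made up for illustration).
current = [120] * 90 + [200] * 9 + [400]
candidate = [110] * 90 + [260] * 9 + [900]

print(distribution_shift(current, candidate))
```

With this data the candidate is 10 microseconds faster at p50 and p90 but 60 microseconds slower at p99. That is exactly the kind of path that wins on one flattering statistic.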

Route diversity also belongs here. Sometimes the first path wins the median case, while the second path produces a cleaner tail profile or a better failover posture.

What this means for rollout

Do not add routing complexity by default. Add it when the benchmark shows that the cleaner tail profile or the better resilience posture justifies it.
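The rollout rule can be expressed as an explicit gate, sketched here under loud assumptions: the thresholds (`min_tail_gain`, `max_median_cost`) are hypothetical placeholders a desk would set for itself, and the gate covers only the tail-profile criterion. Resilience posture would need its own evidence.

```python
def justifies_switch(current, candidate, min_tail_gain=0.10, max_median_cost=0.02):
    """Return True only when the candidate's p99 gain clears the bar without
    giving back too much at the median. Both thresholds are hypothetical."""
    tail_gain = (current["p99"] - candidate["p99"]) / current["p99"]
    median_cost = (candidate["p50"] - current["p50"]) / current["p50"]
    return tail_gain >= min_tail_gain and median_cost <= max_median_cost


# Hypothetical summaries in microseconds (made up for illustration).
current = {"p50": 100, "p99": 400}
cleaner_tail = {"p50": 101, "p99": 150}  # small median cost, much better tail
marginal = {"p50": 99, "p99": 390}       # slightly better everywhere, not by much

print(justifies_switch(current, cleaner_tail))
print(justifies_switch(current, marginal))
```

The default answer is "no change": a candidate has to earn the added routing complexity with a measured tail improvement, not a marginal win.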

If median RTT is the only number you know, you do not know enough to choose the production path.