OmniRoute Metrics

Which model is fastest right now — and is it healthy?

Output TPS
Completion tokens per second over the full request duration (tokens_out ÷ duration).
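
As a rough illustration of the formula above, here is a minimal sketch. The function name `output_tps`, the assumption that duration is measured in seconds, and the choice to return 0 for a non-positive duration are all hypothetical and not taken from OmniRoute itself.

```python
def output_tps(tokens_out: int, duration_s: float) -> float:
    """Completion tokens per second over the full request duration.

    Assumption: duration_s is the end-to-end request time in seconds.
    """
    if duration_s <= 0:
        # Assumption: report 0 rather than raising on an invalid duration.
        return 0.0
    return tokens_out / duration_s


# 500 completion tokens delivered over a 4-second request.
print(output_tps(500, 4.0))  # 125.0
```

Because the denominator is the full request duration, time spent waiting for the first token lowers this figure just as slow streaming does.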
End-to-end latency
Total request time from OmniRoute's perspective — not time-to-first-token.
Cache hit rate
Share of input tokens served from cache (cache_read ÷ tokens_in).
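
The ratio above might be computed along these lines. This is a sketch only: the function name `cache_hit_rate` and the zero-input guard are assumptions, and the result is returned as a fraction between 0 and 1 rather than a percentage.

```python
def cache_hit_rate(cache_read: int, tokens_in: int) -> float:
    """Share of input tokens served from cache, as a fraction in [0, 1]."""
    if tokens_in <= 0:
        # Assumption: a request with no input tokens has a hit rate of 0.
        return 0.0
    return cache_read / tokens_in


# 300 of 1,200 input tokens were read from cache.
print(cache_hit_rate(300, 1200))  # 0.25
```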
Semantic vs upstream cache
Semantic = OmniRoute cache; upstream = provider-native cache.
Reasoning ratio
Reasoning tokens as a percentage of output tokens.
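
A small sketch of this percentage, assuming reasoning tokens are counted as part of the output total. The function name `reasoning_ratio` and the zero-output guard are hypothetical.

```python
def reasoning_ratio(reasoning_tokens: int, tokens_out: int) -> float:
    """Reasoning tokens as a percentage of output tokens."""
    if tokens_out <= 0:
        # Assumption: no output means a ratio of 0 rather than an error.
        return 0.0
    return 100.0 * reasoning_tokens / tokens_out


# 200 reasoning tokens out of 800 output tokens.
print(reasoning_ratio(200, 800))  # 25.0
```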
Compression savings
Tokens removed by OmniRoute compression before upstream delivery.
Provider health
Weighted score: 50% success rate, 30% TPS, 20% latency.
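
The weighting above can be sketched as follows. The page gives only the weights, so the normalization is an assumption here: each input is taken to be already scaled to [0, 1], with higher meaning better (so the latency input is an inverted, lower-is-better score). The function name `health_score` is hypothetical.

```python
def health_score(success_rate: float, tps_score: float, latency_score: float) -> float:
    """Weighted provider health: 50% success rate, 30% TPS, 20% latency.

    Assumption: all three inputs are pre-normalized to [0, 1], where 1 is best.
    How OmniRoute normalizes TPS and latency is not stated in the source.
    """
    return 0.5 * success_rate + 0.3 * tps_score + 0.2 * latency_score


# A provider with 90% success and mid-range TPS and latency scores.
score = health_score(0.9, 0.5, 0.5)
print(round(score, 2))
```

With these weights, a drop in success rate moves the score more than an equal drop in TPS or latency, which matches the emphasis in the stated weighting.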
