OmniRoute Metrics

Which model is fastest right now, and is it healthy?

Output TPS: Completion tokens per second over the full request duration (tokens_out ÷ duration).
End-to-end latency: Total request time as measured by OmniRoute, not time-to-first-token.
Cache hit rate: Share of input tokens served from cache (cache_read ÷ tokens_in).
Semantic vs upstream cache: Semantic = OmniRoute cache; upstream = provider-native cache.
Reasoning ratio: Reasoning tokens as a percentage of output tokens.
Compression savings: Tokens removed by OmniRoute compression before upstream delivery.
Provider health: Weighted score: 50% success rate, 30% TPS, 20% latency.
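
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not OmniRoute's implementation: the RequestLog record, its field names, and the helper functions are hypothetical, and the health score assumes that TPS and latency have already been normalized to a 0–1 scale (the page does not say how that normalization is done).

```python
from dataclasses import dataclass


@dataclass
class RequestLog:
    # Hypothetical record shape; field names echo the terms used in the definitions.
    tokens_in: int          # input (prompt) tokens
    tokens_out: int         # completion tokens
    cache_read: int         # input tokens served from cache
    reasoning_tokens: int   # reasoning tokens counted within the output
    duration_s: float       # full end-to-end request duration, in seconds


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def output_tps(r: RequestLog) -> float:
    """Output TPS: tokens_out divided by the full request duration."""
    return safe_div(r.tokens_out, r.duration_s)


def cache_hit_rate(r: RequestLog) -> float:
    """Cache hit rate: cache_read divided by tokens_in, as a fraction."""
    return safe_div(r.cache_read, r.tokens_in)


def reasoning_ratio(r: RequestLog) -> float:
    """Reasoning ratio as a fraction of output tokens (multiply by 100 for a percentage)."""
    return safe_div(r.reasoning_tokens, r.tokens_out)


def provider_health(success_rate: float, tps_score: float, latency_score: float) -> float:
    """Weighted score: 50% success rate, 30% TPS, 20% latency.

    Assumption: all three inputs are already normalized to [0, 1], and a
    higher latency_score means lower (better) latency.
    """
    return 0.5 * success_rate + 0.3 * tps_score + 0.2 * latency_score


if __name__ == "__main__":
    r = RequestLog(tokens_in=1000, tokens_out=200, cache_read=250,
                   reasoning_tokens=50, duration_s=4.0)
    print(output_tps(r))       # 50.0
    print(cache_hit_rate(r))   # 0.25
    print(reasoning_ratio(r))  # 0.25
    print(provider_health(0.98, 0.5, 0.75))
```

Because Output TPS divides by the full request duration rather than the generation time alone, a request with a long time-to-first-token will show a lower TPS even if the provider streams quickly once it starts.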

Today's pick

● Healthy — 12h window

Total Requests

-

Output TPS

-

End-to-end Latency

-

Success Rate

-

Semantic Cache

-

Avg Compression

-

Output TPS by Model

Latency Distribution (top models)

Top Models

Model	Requests	Success	Output TPS	P95 Latency	Errors

Model Comparison

Model	Requests	Success	Output TPS	P95 Latency	Errors

Pipeline State

P50 / P95 / P99 Latency

Provider Comparison

Provider	Health	Requests	Success	Output TPS	P95 Latency	Errors

Provider TPS

API Key Performance

API Key	Total Tokens	Requests	Success	Cache Hit	Reasoning	Output TPS	Latency	Errors

Error Breakdown

Errors by Model

Model	Total Errors	Error Rate	Top Error Type

Cache Hit Rate

-

Compression Saved

-

Reasoning Ratio

-

Out / In Ratio

-

End-to-end Latency

-

Semantic Cache by Model

Model	Entries	Hits	Tokens Saved	Avg Hits/Entry

Compression by Model

Model	Count	Saved	Ratio

Per-Model Efficiency

Model	Cache Hit %	Compression Saved	Reasoning %	End-to-end Latency

Daily Requests (30d)

Hourly Traffic (7d)

Daily Token Volume

Daily Cost (from usage summaries)

Routing Decisions by Provider

Provider	Decisions	Success Rate	Avg Score	Avg Latency

Provider Nodes

Name	Type	Prefix	Base URL	Updated

Middleware Performance

Middleware	Executions	Success Rate	Avg Duration

Combo vs Direct (call_logs)

Route	Requests	Success	Output TPS	End-to-end Latency	Total Tokens

Usage History by Combo Strategy

Strategy	Requests	Success Rate	End-to-end Latency	Total Tokens	Errors

Service Tier Breakdown

Tier	Requests	Success	Total Tokens	End-to-end Latency

Provider Connections

Account	Requests	Success	Total Tokens	End-to-end Latency