Auriko vs OpenRouter: which LLM gateway should you use?
A practical comparison of OpenRouter and Auriko across model coverage, routing, pricing, BYOK, and production inference cost economics.
Both Auriko and OpenRouter are OpenAI-compatible LLM gateways that route requests across multiple providers through a single API. OpenRouter's strength is broader provider and model coverage plus a larger integration surface. Auriko is built for teams that care most about production inference cost economics: routing calibrated to your usage pattern, provider-aware routing, pricing transparency, zero markup, and free BYOK at scale.
TL;DR
Choose OpenRouter if you need broader provider and model coverage or existing third-party integrations for your workflow.
Choose Auriko if your goal is better inference cost economics, zero inference markup and free BYOK at scale.
Both are OpenAI SDK compatible, support BYOK, and handle fallback chains.
What is OpenRouter
OpenRouter offers 400+ active models on 60+ providers through a single OpenAI-compatible API, with a larger third-party integration surface across developer tools and frameworks. You buy credits with a 5.5% platform fee. Per-token rates match what providers charge directly.
What is Auriko
Auriko is an OpenAI-compatible gateway built for production inference cost economics. It supports 300+ models across 12 providers and is actively expanding provider and model coverage. Its core differentiation is a routing engine built by ex-quant traders that models provider pricing, cache mechanics, and user workload patterns. It uses those inputs to predict expected cost and choose the lowest-cost provider. Auriko charges 0 markup on provider token prices.
What to compare
The dimensions that matter most depend on what you're building. For production LLM applications, the criteria that usually drive the decision are:
- Cost structure. How each gateway charges and how that interacts with your usage patterns.
- Model and provider coverage. Whether the models you need are available.
- Routing approach. How each gateway selects a provider for a given request.
- Ecosystem. SDK maturity, tool integrations, observability.
- Operational visibility. What each gateway exposes about provider health, routing decisions, cost, and data handling.
The table below covers these and more. The deep-dive sections explain the dimensions where the gateways differ most.
Head-to-head
| Dimension | OpenRouter | Auriko | Notes |
|---|---|---|---|
| Pricing model | 5.5% platform fee on credit purchases | 0% markup | OpenRouter's per-token rates match providers; the fee is on credit purchase |
| BYOK cost | Free under 1M req/month, 5% after | Free on all tiers | |
| Model and provider coverage | 400+ active models on 60+ providers | 300+ models, 12 providers | |
| Routing approach | Default provider routing uses price-based load balancing with recent-outage filtering and inverse-square price weighting | 4D provider scoring across expected cost, TTFT, throughput, and capacity headroom | |
| Cost modeling | Routes by published per-token rates | Models provider pricing, cache mechanics, and workload patterns to predict expected cost | |
| Provider signals | Uses recent-outage filtering plus latency and throughput signals in routing controls | Tracks latency distributions, provider health, throughput, pricing changes, and capacity signals | |
| Free tier limit | 50 req/day, 20 RPM | 1,000 RPM | |
| Free tier BYOK | Not available | 500 RPM, 10K req/month | |
| Paid tier limit | Unlimited for paid models | Unlimited | |
| Paid tier BYOK | 1M req/month free, then 5% | Free at any volume | |
| SDKs | Client SDKs + Vercel AI SDK provider | Client SDKs + Vercel AI SDK provider |
Where OpenRouter Is the Better Fit
Broader model access. OpenRouter is the better fit when your priority is broader provider and model coverage, especially for niche models, newly launched models, or many model variants behind one API.
Existing integration surface. OpenRouter is the better fit when broader third-party integration coverage is the deciding factor for an existing workflow.
Feature breadth outside provider routing. OpenRouter includes features beyond core provider routing, including Auto Router for prompt-based model selection, web search tooling, file/PDF input support, embeddings, quantization filtering, and model variant suffixes.
Where Auriko Is the Better Fit
Better inference cost economics. Auriko is the better fit when cost efficiency is a production priority. It helps teams reduce production inference cost by up to 40%. Its routing engine models provider pricing, cache mechanics, and workload patterns to predict expected cost and choose the lowest-cost provider for each request.
Routing calibrated to your usage pattern. Auriko helps lower inference cost by routing based on how your production traffic actually behaves, not only provider list prices.
Provider-aware routing. Auriko helps reduce the risk of degraded routes by accounting for provider health, performance, throughput, and capacity signals before sending the request.
Zero markup, free BYOK, and durable credits. Auriko charges 0 markup on provider token prices, keeps BYOK free across tiers, and does not expire purchased credits.
Why Auriko's Cost Optimization Is Different
Cost modeling
Two providers quoting identical per-token rates can produce different bills on the same workload. The difference comes from caching mechanics that per-token rates don't capture:
- Minimum token thresholds. Some providers won't cache content below a certain token count. A system prompt under the threshold pays full price on every request, regardless of the published cache discount.
- Block granularity. Providers cache in fixed-size blocks. Tokens that don't fill a complete block pay full price.
- Expiration windows. Cache durations vary by provider. A cached entry only saves money if matching requests arrive before it expires.
- Write and storage costs. Some providers charge a premium for the initial cache write and for ongoing storage. These costs only pay off through subsequent reuse.
- Tiered pricing. Rates shift based on context length, so the rate on the pricing page may not apply to a specific request.
OpenRouter's default provider routing prioritizes price among stable providers. Auriko maintains a cost model built by ex-quant traders that tracks provider pricing, per-provider cache mechanics, and usage patterns from request metadata (token counts and context length, not prompts or content). It computes expected cost per provider per request and selects the lowest-cost provider for the actual workload after accounting for pricing, cache behavior, and usage patterns.
For multi-turn conversations, the engine models cache economics across the full conversation. A provider that charges a write premium but offers deep read discounts can be cheaper over 10 turns than one with lower single-request rates.
Requests can set cost optimization strategies: "cost" balances affordability with performance, "cost-focus" minimizes cost aggressively. Users can also set hard constraints: max_ttft_ms for latency ceilings, max_cost_per_1m to exclude expensive providers, and only_byok to restrict routing to bring-your-own-key providers.
For the full breakdown of how provider caching mechanics affect cost, see The variables that affect your inference cost.
Provider-aware routing
After Auriko estimates expected cost, the routing engine still has to choose a provider that can serve the request under current operating conditions. It filters qualified provider candidates, then scores them across four dimensions: expected cost, time to first token, throughput, and capacity headroom.
For cost-focused strategies, Auriko keeps cost as the primary optimization goal while using performance and capacity signals to reduce the risk of choosing routes that look inexpensive from published price alone but are weaker under current performance or capacity conditions. Requests can also set hard constraints such as max_ttft_ms, max_cost_per_1m, and only_byok, or provide custom weights when a workload needs a different balance.
The engine runs on a provider data pipeline that tracks latency distributions, provider health, throughput, pricing changes, and capacity signals in real time. Routing strategies are backtested under stress scenarios including provider degradation, price changes, capacity shocks, and partial outages.
The direct provider-routing comparison is OpenRouter's default price-based load balancing. For a specified model, OpenRouter prioritizes providers without recent outages, weights low-cost candidates by inverse square of price, and uses remaining providers as fallbacks. Users can override that behavior with provider sorting by price, throughput, or latency, plus preferred latency and throughput thresholds.
Separately, OpenRouter offers Auto Router, powered by NotDiamond, for prompt-based model selection when the request uses openrouter/auto. That is a model-selection feature, not the default provider-routing path for a specified model.
Pricing and BYOK
OpenRouter charges a 5.5% platform fee on credit purchases. Per-token rates match provider rates, so the fee is on purchase, not per-token. OpenRouter reserves the right to expire unused credits after 365 days.
Auriko charges 0% markup and does not expire purchased credits. Unused balance stays available until you spend it.
OpenRouter's BYOK is free for the first 1 million requests per month. After that, requests are charged at 5%. Auriko's BYOK is free on all tiers with no volume threshold.
Who should use what
| If you need... | Use |
|---|---|
| Broader provider and model coverage | OpenRouter |
| Existing third-party integrations | OpenRouter |
| Auto Router / prompt-based model selection | OpenRouter |
| Better production inference cost economics | Auriko |
| Routing calibrated to your usage pattern | Auriko |
| Provider-aware routing with health, performance, and capacity signals | Auriko |
| Zero markup pricing, free BYOK at scale, or non-expiring credits | Auriko |
Switching
Both gateways are OpenAI SDK compatible. Switching is a base URL and API key change.
Python:
from openai import OpenAI
# OpenRouter
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="sk-or-..."
)
# Auriko
client = OpenAI(
base_url="https://api.auriko.ai/v1",
api_key="ak_..."
)
TypeScript:
import OpenAI from "openai";
// OpenRouter
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: "sk-or-...",
});
// Auriko
const client = new OpenAI({
baseURL: "https://api.auriko.ai/v1",
apiKey: "ak_...",
});
For standard chat completions with common parameters, the request and response handling is compatible. Response metadata fields, streaming metadata delivery, and provider-specific extension parameters differ between gateways. Test those paths when switching.
Get started
To try Auriko, see the quickstart guide. To compare pricing on your specific models, see the model directory.