One API · every model · your infrastructure

Route every model.
Keep every byte.

For global enterprises: Cudator unifies AI spend across every provider, subsidiary, and country into one consolidated invoice — while routing each request to the best model on ground you control.

No card to start · Zero data retention · One invoice, any currency
POST /v1/chat/completions
app → OpenAI · Claude · Gemini · vLLM · routed in 38ms

Powering AI at teams that can't afford to get infrastructure wrong

What Cudator does

One gateway between your app and every model you'll ever use.

Stop wiring keys, quotas, and failover logic per vendor. Point your SDK at Cudator once and let the platform handle routing, residency, and money.

Model routing

Send one request; Cudator picks the right model on price, latency, and quality — then load-balances and fails over across a pool of credentials.

  • Policy-based routing by cost, latency or task
  • Weighted load-balancing & automatic failover
  • One OpenAI-compatible endpoint for all vendors
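
To make "policy-based routing" concrete, here is a minimal sketch of how a policy could pick one model from a pool. It is an illustration only, not Cudator's implementation: the `Candidate` type, the policy names, and all prices, latencies, and quality scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical model label
    usd_per_1k_tokens: float  # illustrative price, not a real quote
    p50_latency_ms: float     # illustrative latency
    quality: float            # illustrative score between 0 and 1


# Each policy maps a candidate to a sort key; the lowest key wins.
POLICIES = {
    "cheapest": lambda c: c.usd_per_1k_tokens,
    "fastest": lambda c: c.p50_latency_ms,
    "best-quality": lambda c: -c.quality,
}


def pick_model(candidates, policy):
    """Return the candidate that ranks first under the named policy."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    return min(candidates, key=POLICIES[policy])


pool = [
    Candidate("model-a", 0.50, 900.0, 0.95),
    Candidate("model-b", 0.10, 400.0, 0.80),
    Candidate("model-c", 0.20, 150.0, 0.70),
]

for policy in POLICIES:
    print(policy, "->", pick_model(pool, policy).name)
```

A real gateway would likely blend these signals and refresh them continuously; a single sort key per policy keeps the idea readable.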

Sovereign private routing

Pin sensitive traffic to providers and regions you approve — or to your own self-hosted models. Data never leaves the ground you control.

  • Route to on-prem & VPC-hosted models (vLLM)
  • Region & residency pinning, per workspace
  • Zero retention, full request-level audit trail

Wallet & payments

Every provider's spend, in every country, rolls into one wallet. Set caps per key, team, and legal entity — then settle in the currency you choose.

  • One wallet across subsidiaries & currencies
  • Spend limits per key, user & workspace
  • Itemised usage settlement & export
How it works

From request to settled invoice, in one hop.

01 · Point your SDK

Swap your base URL and key. Cudator speaks the OpenAI API, so the rest of your code keeps working unchanged.

02 · Apply a routing policy

Choose the cheapest, fastest, or highest-quality path — and pin sensitive workloads to approved regions or self-hosted models.

03 · Cudator routes & fails over

Requests load-balance across a credential pool. If a provider errors or rate-limits, traffic shifts automatically.
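
The behaviour described in this step can be sketched in a few lines. This is an assumption-laden illustration, not the gateway's actual code: `call_with_failover`, `ProviderError`, and the credential names are hypothetical, and `send` stands in for whatever performs the upstream call.

```python
import random


class ProviderError(Exception):
    """Stands in for a provider error or a rate-limit response."""


def call_with_failover(credentials, send, rng=random):
    """Pick a credential by weight; on failure, drop it and try another.

    credentials: list of (name, weight) pairs (hypothetical shape).
    send: callable taking a credential name and returning a response.
    """
    remaining = list(credentials)
    errors = []
    while remaining:
        names = [name for name, _ in remaining]
        weights = [weight for _, weight in remaining]
        choice = rng.choices(names, weights=weights, k=1)[0]
        try:
            return choice, send(choice)
        except ProviderError as exc:
            # Remove the failing credential so traffic shifts to the rest.
            errors.append((choice, str(exc)))
            remaining = [(n, w) for n, w in remaining if n != choice]
    raise ProviderError(f"all credentials failed: {errors}")


def flaky_send(name):
    # Simulated upstream: one credential is rate-limited.
    if name == "key-a":
        raise ProviderError("429 rate limited")
    return "ok"


used, response = call_with_failover([("key-a", 9), ("key-b", 1)], flaky_send)
print(used, response)
```

Even though `key-a` carries most of the weight, the call still succeeds through `key-b` once `key-a` is removed from the pool.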

04 · Spend settles to the wallet

Every token is metered, attributed, and deducted from one wallet — with caps enforced before the call ever leaves.
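
As a rough sketch of "caps enforced before the call ever leaves", the check and the metering could look like this. Everything here is hypothetical: the `Wallet` class, the scope names, and the per-token price are made up for illustration and do not describe Cudator's internals.

```python
from decimal import Decimal


class CapExceeded(Exception):
    """Raised when a request would breach a cap or the wallet balance."""


class Wallet:
    def __init__(self, balance, caps):
        self.balance = Decimal(balance)
        self.caps = {scope: Decimal(cap) for scope, cap in caps.items()}
        self.spent = {scope: Decimal("0") for scope in caps}
        self.ledger = []  # (scopes, tokens, cost) entries for attribution

    def authorize(self, scopes, estimate):
        """Run before the upstream call: reject if any cap would be breached."""
        estimate = Decimal(estimate)
        if estimate > self.balance:
            raise CapExceeded("wallet balance too low")
        for scope in scopes:
            if self.spent[scope] + estimate > self.caps[scope]:
                raise CapExceeded(f"cap reached for {scope}")

    def settle(self, scopes, tokens, usd_per_1k):
        """Run after the call: meter tokens, attribute, and deduct."""
        cost = Decimal(tokens) * Decimal(usd_per_1k) / Decimal(1000)
        self.balance -= cost
        for scope in scopes:
            self.spent[scope] += cost
        self.ledger.append((tuple(scopes), tokens, cost))
        return cost


wallet = Wallet("10.00", {"key:ci": "1.00", "team:search": "5.00"})
scopes = ["key:ci", "team:search"]

wallet.authorize(scopes, "0.40")
cost = wallet.settle(scopes, tokens=2000, usd_per_1k="0.20")
print(cost, wallet.balance)
```

`Decimal` is used instead of floats so that repeated small deductions do not accumulate rounding error.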

Global billing

One invoice for AI, across every country you operate in.

Large enterprises run AI across dozens of subsidiaries, currencies, and tax regimes. Cudator meters every entity locally, handles FX and tax, and rolls it all into a single consolidated bill — settled in the currency your finance team chooses.

  • Consolidated invoicing

    Every subsidiary's AI spend on one monthly bill — or split by entity for local books.

  • Multi-currency wallets

    Fund and settle in USD, EUR, GBP, JPY and 40+ more. FX is handled automatically.

  • Local payment methods

    SEPA, ACH, wire, and card — pay the way each region's finance team already works.

  • Tax & compliant invoices

    VAT, GST, and local e-invoicing applied per jurisdiction — audit-ready out of the box.

USD · EUR · GBP · JPY · SGD +40
SEPA · ACH · Wire · Card
VAT / GST handled
Acme Global Industries
consolidated · 5 entities · Mar 2026 · settles in USD

     Entity                  Country                  Amount        In USD
US   Acme US Inc.            United States            $48,200.00    USD
GB   Acme UK Ltd.            United Kingdom           £28,100.00    ≈ $35,640
DE   Acme Deutschland GmbH   Germany · incl. VAT      €19,540.00    ≈ $21,100
JP   Acme KK                 Japan · incl. JCT        ¥2,410,000    ≈ $16,180
SG   Acme APAC Pte.          Singapore · incl. GST    S$12,800.00   ≈ $9,520

Total due · all entities     $130,640.00
FX & tax reconciled by Cudator

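
The arithmetic behind the sample invoice above can be checked in a few lines. The amounts are copied from the invoice; the implied exchange rates are derived from its rounded USD equivalents, so treat them as approximations rather than quoted rates.

```python
from decimal import Decimal

# (entity, local amount, currency, USD equivalent) as shown on the sample invoice
lines = [
    ("Acme US Inc.", Decimal("48200.00"), "USD", Decimal("48200")),
    ("Acme UK Ltd.", Decimal("28100.00"), "GBP", Decimal("35640")),
    ("Acme Deutschland GmbH", Decimal("19540.00"), "EUR", Decimal("21100")),
    ("Acme KK", Decimal("2410000"), "JPY", Decimal("16180")),
    ("Acme APAC Pte.", Decimal("12800.00"), "SGD", Decimal("9520")),
]

# Consolidated total: the sum of every entity's USD equivalent.
total_usd = sum((usd for *_, usd in lines), Decimal("0"))

# Rate implied by each line: USD equivalent divided by the local amount.
implied_rates = {
    currency: (usd / local).quantize(Decimal("0.0001"))
    for _, local, currency, usd in lines
}

print(f"Total due: ${total_usd:,.2f}")
for currency, rate in implied_rates.items():
    print(f"1 {currency} is roughly {rate} USD")
```

The five USD equivalents sum to 130,640, which matches the total due on the invoice.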
Providers & models

Bring the frontier, the open, and your own.

Hosted frontier labs, embedding specialists, and OpenAI-compatible self-hosted clusters — managed as one routable pool.

  • OpenAI: chat · embeddings
  • Anthropic: chat
  • Google: chat
  • Mistral: chat
  • Cohere: embeddings
  • Voyage AI: embeddings
  • Llama · vLLM: self-hosted
  • DeepSeek: self-hosted

Any OpenAI-compatible endpoint plugs in — add your own model behind a base URL in minutes.

Sovereign by design

Run AI on ground you control — and prove it.

For teams in health, finance, and defense, "where does the data go?" isn't a footnote. Cudator makes residency a routing rule, not a leap of faith.

  • Private & self-hosted routing

    Keep regulated traffic on your VPC or on-prem GPUs. The frontier handles the rest — split by policy, never by accident.

  • Region & residency pinning

    Bind a workspace to approved regions. Out-of-region credentials are simply not in the pool.

  • Request-level audit & zero retention

    Every call is logged with model, region, and cost — payloads are never stored. Export the trail for your auditors.
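
One way to read "out-of-region credentials are simply not in the pool" is as a filter applied before any routing decision. The sketch below assumes a hypothetical `Credential` record and `build_pool` helper; the credential names are invented and this is not Cudator's configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    name: str          # hypothetical credential label
    region: str        # where the model endpoint runs
    self_hosted: bool  # True for on-prem or VPC-hosted models


def build_pool(credentials, allowed_regions, block_public_egress=False):
    """Keep only credentials the workspace policy permits.

    Routing then chooses from this pool, so a disallowed
    credential can never be selected by accident.
    """
    pool = [c for c in credentials if c.region in allowed_regions]
    if block_public_egress:
        pool = [c for c in pool if c.self_hosted]
    return pool


all_credentials = [
    Credential("vllm-onprem", "eu-west-1", self_hosted=True),
    Credential("hosted-eu", "eu-west-1", self_hosted=False),
    Credential("hosted-us", "us-east-1", self_hosted=False),
]

eu_pool = build_pool(all_credentials, {"eu-west-1"})
phi_pool = build_pool(all_credentials, {"eu-west-1"}, block_public_egress=True)
print([c.name for c in eu_pool])
print([c.name for c in phi_pool])
```

Filtering first and routing second means the residency rule does not depend on every routing policy remembering to check it.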

workspace · meridian-health
residency policy · active
Allowed region: eu-west-1
PHI workloads: self-hosted · vLLM
Egress to public APIs: blocked
Payload retention: none
SOC 2 Type II · HIPAA · GDPR · ISO 27001 · Data residency

Drop-in API

Already built? Change two lines.

Cudator speaks the OpenAI API. Point the base URL at the gateway, use a Cudator key, and routing, residency, and global billing come along for free.

OpenAI SDKs · Streaming & tools · Embeddings · One policy header
route.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cudator.ai/v1",
    api_key="cud_live_••••••••",
)

# route by policy — Cudator picks the model
resp = client.chat.completions.create(
    model="auto",
    extra_headers={"X-Cudator-Policy": "sovereign-eu"},
    messages=[{"role": "user",
               "content": "Summarise this record."}],
)

# → served by self-hosted vLLM · eu-west-1
# → $0.0004 metered to wallet · 38ms
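
Not using an SDK? The same call can be sketched as a plain HTTP request with the standard library alone. This assumes the gateway accepts OpenAI-style Bearer authentication, which the page implies but does not state; `build_chat_request` and the key placeholder are hypothetical.

```python
import json
import urllib.request


def build_chat_request(api_key, policy, prompt):
    """Build (but do not send) a chat completion request for the gateway."""
    body = {
        "model": "auto",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.cudator.ai/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: Bearer auth, as in the OpenAI API.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-Cudator-Policy": policy,
        },
    )


req = build_chat_request("YOUR_CUDATOR_KEY", "sovereign-eu", "Summarise this record.")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```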
"We moved seven AI products onto Cudator in a quarter. Our auditors got a request-level trail, our finance team got one invoice, and engineering stopped babysitting provider keys."
Juancho Irigaray
Founder · StrategicSyntax
8+ providers, one pool
99.98% routing uptime
<40ms routing overhead
0 payloads retained

Start here

Ship AI on ground you control.

Create a workspace, drop in your base URL, and route your first request in minutes. No card required to start.

Prepaid or invoiced · multi-currency · cancel anytime