Enterprise

The aggregator is table stakes.
Primitives win the committee.

Evals, guardrails, replay, and prompt management — the features that only an aggregator can build, because they work the same way across every model underneath. Plus the procurement essentials: SSO, SLA, dedicated endpoint, DPA, private audit log export.

Talk to sales | See pricing
Evals

Pick the cheapest model that passes — on your traffic.

Replay real production requests against any other model in the catalog and score the outputs against your rubric. Swap cheaper routes in once they pass; freeze expensive ones out. Runs on anonymized copies of your own logs, not public benchmarks that don't match your use case.

What you get
  • Shadow-replay up to 100% of traffic on a candidate model
  • LLM-judge + deterministic scorers (JSON validity, regex, length, latency)
  • One-click promotion of a candidate behind an existing route
  • Weekly cost-and-quality delta reports
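A minimal sketch of how the deterministic scorers listed above (JSON validity, regex, length, latency) could gate a "cheapest model that passes" decision. The `Candidate` shape, the `passes` and `cheapest_passing` helpers, and the rubric parameters are all hypothetical names chosen for illustration; they are not the platform's API, and the LLM-judge scorer is left out.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    """One replayed response from a candidate model (hypothetical shape)."""
    model: str
    output: str
    latency_ms: float
    cost_usd: float


def passes(c: Candidate, pattern: str, max_chars: int, max_latency_ms: float) -> bool:
    """Apply four deterministic checks: JSON validity, regex, length, latency."""
    try:
        json.loads(c.output)
    except json.JSONDecodeError:
        return False
    if re.search(pattern, c.output) is None:
        return False
    if len(c.output) > max_chars:
        return False
    return c.latency_ms <= max_latency_ms


def cheapest_passing(candidates, **rubric):
    """Return the lowest-cost candidate that passes every check, or None."""
    ok = [c for c in candidates if passes(c, **rubric)]
    return min(ok, key=lambda c: c.cost_usd) if ok else None
```

In practice the rubric would be applied across many replayed requests rather than one, and a candidate would be promoted only if its pass rate clears a threshold you choose.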
Guardrails

One policy. Every provider.

Content safety, PII redaction, topic restrictions, and output JSON enforcement applied at the gateway layer — before the request leaves us and before the response reaches you. Because it runs at the aggregator, the same policy covers OpenAI, Anthropic, Google, Groq, Moonshot, and every other provider identically.

What you get
  • Inbound + outbound filtering with configurable severity
  • PII detection + redaction (GDPR / HIPAA classes)
  • Prompt-injection detection on user-supplied text
  • JSON-schema enforcement for structured outputs
  • Audit log of every block, by key, by user
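To make the gateway-layer idea concrete, here is a rough sketch of a provider-agnostic policy step: block on a restricted topic, otherwise redact PII before the text moves on. The two regex patterns, the `redact` and `apply_policy` names, and the result dictionary are assumptions for illustration only; production PII detection covers far more classes than an email and a US SSN pattern.

```python
import re

# Illustrative patterns only; real PII detection covers many more classes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a [LABEL] placeholder and report what was found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


def apply_policy(text, blocked_topics):
    """Block on a restricted topic, otherwise pass the redacted text through."""
    lowered = text.lower()
    for topic in blocked_topics:
        if topic in lowered:
            return {"action": "block", "reason": topic, "text": None}
    redacted, found = redact(text)
    return {"action": "allow", "redacted": found, "text": redacted}
```

Because the function only sees text, the same call can sit in front of any provider, which is the point of applying the policy at the aggregator.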
Replay & shadow A/B

Re-run any logged request, deterministically.

Every request is captured (opt-in on Pro, default on Enterprise) and replayable — against the original model, a different model, or a new prompt version. Use it to reproduce production bugs, validate a prompt change, or run a shadow A/B without touching user-facing traffic.

What you get
  • Deterministic replay with original seed + params
  • Bulk replay across thousands of logged requests
  • Diff view: before / after on cost, latency, output
  • Canary a new model against 1% of live traffic, then flip when you're happy
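One common way to carve out a stable 1% canary is to hash a request or user ID into a bucket, so the same ID always lands on the same side. The sketch below assumes that approach; `in_canary`, `route`, and the 10,000-bucket granularity are hypothetical choices, not a description of how the platform assigns traffic.

```python
import hashlib


def in_canary(request_id: str, percent: float) -> bool:
    """Deterministically assign an ID to the canary by hashing it into a bucket."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # buckets 0..9999
    return bucket < percent * 100  # 1% maps to buckets 0..99


def route(request_id: str, stable_model: str, canary_model: str, percent: float = 1.0) -> str:
    """Pick the canary model for the selected slice, the stable model otherwise."""
    return canary_model if in_canary(request_id, percent) else stable_model
```

Raising `percent` widens the slice without reshuffling IDs already in the canary, and setting it to 100 is the "flip".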
Prompt IDs

Version prompts server-side. Cache them automatically.

Pin prompts server-side and reference them by ID from your app. Deploy a new version without shipping code. Every pinned prompt gets automatic prefix caching — cache hits return in under 10ms at 10% of the uncached cost.
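As a back-of-envelope illustration of the 10% figure above, the expected per-request cost depends on how often requests hit the cache. The function below is a sketch of that arithmetic; `blended_cost` and the `hit_rate` parameter are assumptions, and your actual hit rate depends on how much of each request is a pinned prefix.

```python
def blended_cost(uncached_cost: float, hit_rate: float, hit_discount: float = 0.10) -> float:
    """Expected per-request cost when a fraction `hit_rate` of requests hit the cache.

    `hit_discount` is the cached price as a fraction of the uncached price
    (0.10 matches the 10% figure quoted above).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return uncached_cost * ((1.0 - hit_rate) + hit_rate * hit_discount)
```

For example, at a 50% hit rate the blended cost works out to 55% of the uncached cost: half the requests pay full price and half pay a tenth.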

What you get
  • Named prompt IDs with versions + environments (staging / prod)
  • Automatic prefix caching on pinned prompts
  • Rollback in one click; audit log of every change
  • Programmatic diff against prior versions
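The versioning model above (named IDs, per-environment pins, rollback, diff) can be pictured with a small in-memory stand-in. `PromptRegistry` and its method names are hypothetical and exist only to show the data model; the real service stores prompts server-side and is reached through its own API.

```python
import difflib


class PromptRegistry:
    """In-memory stand-in for server-side prompt storage (hypothetical)."""

    def __init__(self):
        self._versions = {}  # prompt_id -> list of texts, index = version - 1
        self._pinned = {}    # (prompt_id, environment) -> version number

    def publish(self, prompt_id, text):
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault(prompt_id, [])
        versions.append(text)
        return len(versions)

    def pin(self, prompt_id, environment, version):
        """Point an environment at a version; pinning an older one is a rollback."""
        if not 1 <= version <= len(self._versions.get(prompt_id, [])):
            raise ValueError("unknown version")
        self._pinned[(prompt_id, environment)] = version

    def resolve(self, prompt_id, environment):
        """Return the prompt text currently pinned for this environment."""
        version = self._pinned[(prompt_id, environment)]
        return self._versions[prompt_id][version - 1]

    def diff(self, prompt_id, old, new):
        """Return a unified diff between two versions as a list of lines."""
        a = self._versions[prompt_id][old - 1].splitlines()
        b = self._versions[prompt_id][new - 1].splitlines()
        return list(difflib.unified_diff(a, b, f"v{old}", f"v{new}", lineterm=""))
```

The app only ever calls something like `resolve("support", "prod")`, so changing which version is pinned changes behavior without a code deploy.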
Platform

The procurement checklist.

Everything your security, legal, and finance teams ask for — without a 12-week implementation.

Single sign-on
SAML + OIDC, provisioned via SCIM. Custom IdP supported.
Dedicated endpoint
Private hostname with IP allow-listing and region pinning.
Direct-provider agreements
We apply your existing OpenAI / Anthropic / Google volume discounts on top of our routing.
SLA
99.95% uptime on dedicated tier, with latency SLO by modality.
SOC 2 Type II
In progress, targeted for Q3 2026. SOC 2 Type I available today; DPA available on request.
Audit log export
Every key create, every guardrail block, every replay — streamed to your SIEM.
Committed spend
Annual commits with rollover. Prepaid discount schedule available.
Named support
A real engineer on Slack Connect. 24-hour response on weekends, 4-hour on business days.
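For a sense of scale on the 99.95% uptime figure in the SLA item above, the allowed downtime can be computed directly. This is plain arithmetic, not a statement of how the SLA measures or credits downtime; the `downtime_budget_minutes` name and the 30-day window are assumptions.

```python
def downtime_budget_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return (1.0 - uptime_percent / 100.0) * total_minutes
```

At 99.95%, that is roughly 21.6 minutes over a 30-day window, or about 4.4 hours over 365 days.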
Next step

30 minutes with a real engineer.

We walk through your stack, show the platform running against your workload, and send a DPA. No SDRs. No discovery decks.

enterprise@aigateway.sh | Security posture