Managed API BYOK
One endpoint for any app, agent, or workflow. Tell us the task — or let us read it — and Oriora routes to the best available model for the job, scored on quality, cost, and latency, within the preferences you set. Then it gets out of the way.
Pick your lane
The one thing Oriora always manages is the model choice. From there it's a dial: take just the pick and run the call yourself, or hand us the call too. You pay only for the layers you switch on.
You run the call
1 layer · just the pick
Your key stays on your own infrastructure. Ask us which model fits; we return the best-fit model plus ranked alternatives. You make the call yourself. We never see your key or your output, and we see your prompt only if you choose to have us classify it.
Pick this if you want maximum privacy or control, or already have your own call setup.
One flat fee per recommendation.
We run the call
2 layers · pick + run
Hand us your vendor key. In one request, we pick the best model, run it with caching, fallback, and retries, and return the output. Point your tool at one endpoint.
Pick this if you want the least work — one endpoint and we do the rest.
Two flat fees per call.
Both run on the same scoring brain — quality, cost, and latency across every supported vendor, within the preferences you set. Two things are yours to set on top: how many layers you hand us (the pick, or the pick and the call), and whether you declare the task or let us classify it for you.
Client-side, you make the call — so it comes down to whether you control the code. Writing your own? Ask for the model in one line. Using an off-the-shelf agent? Drop in our local adapter — no code change.
Your own code
Ask which model fits, then make the call yourself on your key. One request to /api/select — pass the task type if you know it, or send the prompt and we classify it server-side. Same endpoint, same flat fee. The Python SDK covers the declared form in a line.
pip install oriora · client.model_select(task_type)
Off-the-shelf agent
Available
Hermes, OpenClaw, the raw OpenAI SDK, LangChain: any OpenAI-compatible agent you can point at a base URL. Run our local adapter; it asks for the best model per call and routes on your key. No agent code to change.
pip install oriora-c2 · oriora-c2 serve
Either way, your key and output never leave your side, and your prompt stays local too — unless you opt to have us classify it server-side. The adapter runs on 127.0.0.1, on your machine.
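For the "Your own code" path, the selection request can be sketched with only the Python standard library. This is a minimal sketch, not the Oriora SDK: the helper names `build_select_request` and `select_model` are hypothetical, and the request shape (Bearer key, JSON body with either `task_type` or `messages`) is taken from the /api/select description on this page.

```python
import json
import urllib.request

SELECT_URL = "https://api.orioralabs.com/api/select"


def build_select_request(api_key, task_type=None, messages=None):
    """Build (but do not send) a POST to /api/select.

    Pass task_type to declare the task, or messages to have the prompt
    classified server-side. Exactly one of the two is required here
    (an assumption of this sketch, to keep the two modes explicit).
    """
    if (task_type is None) == (messages is None):
        raise ValueError("pass exactly one of task_type or messages")
    if task_type is not None:
        body = {"task_type": task_type}
    else:
        body = {"messages": messages}
    return urllib.request.Request(
        SELECT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def select_model(api_key, **kwargs):
    """Send the request and return the parsed JSON recommendation."""
    request = build_select_request(api_key, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

A call such as `select_model("sk_oriora_...", task_type="coding")` would return the recommendation; the model call itself still runs in your own code, on your own vendor key.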
Don't want to wire anything?
The easiest way to watch Oriora pick the model for you — a capable AI terminal on your own key, set up in one step.
First time? Open a terminal on your Mac — 3 steps
Don't want a terminal at all? Use the one-click installer below — double-click, paste your key, done. · On Windows, run the commands in WSL or Git Bash.
One-click installer
Double-click, paste your key, done. Sets everything up for you — no terminal experience needed.
↓ Download for macOS
Already have Claude Code?
Add the Oriora layer to your existing install — nothing re-downloaded.
curl -fsSL https://orioralabs.com/terminal/add-oriora.sh | bash
What it does, plainly
Claude Code is a product of Anthropic — we only add the layer + the routing. Not affiliated with or endorsed by Anthropic.
Prefer your own agent? (Hermes, OpenClaw, or any OpenAI tool)
A different setup from the terminal above: here your vendor keys live on your Oriora account (not your Mac), and Oriora runs each call for you. The agent only ever needs your Oriora key. Good if you'd rather not keep keys on your own machine.
Hermes
curl -fsSL https://orioralabs.com/terminal/add-oriora-hermes.sh | bash
OpenClaw
curl -fsSL https://orioralabs.com/terminal/add-oriora-openclaw.sh | bash
Any OpenAI-compatible agent or SDK works the same way — point its base URL at https://api.orioralabs.com/v1, use your Oriora key, model oriora-auto.
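As a rough illustration of that setup, here is a standard-library Python sketch of the request an OpenAI-compatible client would send. The base URL, the `oriora-auto` model name, and the optional `x-oriora-app` label come from this page; the function names are hypothetical, and the response parsing assumes the usual OpenAI `chat.completion` shape.

```python
import json
import urllib.request

BASE_URL = "https://api.orioralabs.com/v1"


def build_chat_request(oriora_key, messages, app_label=None):
    """Build (but do not send) an OpenAI-style chat completion request."""
    headers = {
        "Authorization": f"Bearer {oriora_key}",
        "Content-Type": "application/json",
    }
    if app_label:
        # Optional label that keeps each app's usage breakdown separate.
        headers["x-oriora-app"] = app_label
    body = {"model": "oriora-auto", "messages": messages}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def chat(oriora_key, messages, app_label=None):
    """Send the request and return the first choice's text."""
    request = build_chat_request(oriora_key, messages, app_label)
    with urllib.request.urlopen(request, timeout=120) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

With an OpenAI SDK the same three settings (base URL, Oriora key, `oriora-auto`) go into the client configuration instead of a hand-built request.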
Never used a terminal? This is the whole thing. The wizard does the setup — you just paste your key when it asks.
One OpenRouter key
Oriora picks the best model for each job from our whole catalogue — every vendor, one key.
One vendor's key
Oriora picks the best of that vendor's models for each job. Either way, the picking is the value.
Note on four vendors: OpenAI, Google, Mistral and Cohere speak OpenAI's API format, which the Claude terminal can't call directly — pick OpenRouter to use those (it carries them). The rest — OpenRouter, DeepSeek, Anthropic, Zhipu, Moonshot, MiniMax, xAI — connect directly.
Optional · give your agent more power
Want to give your terminal more power? Let your agent see your screen, click, and type to run real on-screen tasks (fill a form, organize files, grab a screenshot) while you're away. It runs on your own machine, on your own key, billed per call like everything else.
You stay in control — keep it sandboxed and watch what it does. An agent driving your computer is powerful; treat it like handing over the keyboard.
Claude Code & MCP agents
Add a computer-use MCP server — it gives the agent screenshot, click and type tools.
Hermes & agents with it built in
Just ask it — "install computer use and control my Mac." No config wrangling.
OpenAI-compatible
Oriora uses the same API shape as OpenAI. Any tool that accepts a custom base URL works today — set it to Oriora's endpoint, drop in your Oriora key, and the tool gets intelligent model routing on your own vendor keys. No code changes, no new SDK. Your vendor keys live on your Oriora account (Settings → Provider keys); the tool only needs your Oriora key.
Claude Code
One-click installer above. An OpenRouter key lets Oriora route across our full catalogue per request; a single vendor key uses that vendor — your own key either way. Docs →
Vercel AI SDK
Building a web app? createOpenAI({ baseURL: "https://api.orioralabs.com/v1" }) — drop-in, every call routed on your keys. Docs →
Cloudflare AI Gateway
Add Oriora as a custom provider in your CF AI Gateway → your traffic routes through us to the best model on your keys. Docs →
Any OpenAI SDK
Python / Node / Go / Rust — set the base URL to Oriora, keep your code. Every call routed to the best model on your keys. Docs →
The pattern
base_url = "https://api.orioralabs.com/api/route"
api_key = "<your-oriora-key>"
# Everything else stays the same: model name, messages, stream, all of it.
Or — Oriora as an MCP server
Plug Oriora into Claude Desktop, Cursor, or any MCP client and your agent gains one tool — recommend_model — that picks the best model for each task from your router, so it's never locked to one model or guessing which to use. Your agent then runs the call on your own keys; Oriora never sees your prompt or vendor key. $0.001 per recommendation. Nothing to install or self-host — it's hosted at api.orioralabs.com/mcp.
{
"mcpServers": {
"oriora": {
"command": "npx",
"args": ["-y", "mcp-remote", "https://api.orioralabs.com/mcp",
"--header", "Authorization: Bearer ${ORIORA_API_KEY}"]
}
}
}
Get your Oriora key at Settings → Provider keys. The tool returns the same fields as /api/select (model · provider · alternatives); see the Reference below. What is MCP? →
Oriora is an independent product — not affiliated with, partnered with, or endorsed by any of the tools listed above. We work with them because Oriora is OpenAI-compatible; anyone can point a compatible client at our endpoint.
If your app knows its job, declare the task and we route on that — we never read your prompt. If it can't — an off-the-shelf agent, a mixed-purpose tool — let us classify the prompt instead: read in memory, stored nowhere. Same router, same scores. You choose whether we read it.
Classify — when you can't declare
For agents & mixed-purpose tools
An off-the-shelf agent can't pass a task label, so we read the prompt to work out what kind of task it is.
Read in memory, stored nowhere
The prompt is sorted into a task type in memory and never written down — the same zero-storage promise as everywhere else.
Local or server-side — privacy vs accuracy
Classify on your machine with a fast rules classifier (your prompt never leaves) for maximum privacy — or send it to our sharper server-side classifier, which catches the ambiguous, mixed prompts that simple rules miss and routes them better. Either way: in memory, stored nowhere.
Same router, same scores
Once the task is known, routing is identical to declaring it — classifying only fills in the label.
Declare — the clean default
You declare the task
Your app is a code reviewer, or a support bot, or a research tool. It always will be. You pass taskType — we never need to read your prompt.
No black box
Routing is deterministic — the top scorer for your task within the vendors and priority you set. The response tells you which model actually ran.
Zero classification overhead
No ML inference before routing. Your request goes straight to the best model.
Built for apps with a purpose
If your app knows what it does, declare it. That's the only requirement.
The full menu
Three ways the task gets decided — all still client-side (you run the call, your key never leaves, one flat fee). It's a menu, not an either-or: mix them per workload, even per task type.
Declare it
Prompt never sent
You pass the task type yourself. No classifier runs, which makes this the cleanest, fastest path.
Classify locally
Prompt never leaves your machine
The oriora-c2 adapter classifies on your side with a fast rules classifier. Maximum privacy, built for off-the-shelf agents that can’t pass a label.
Classify server-side
Read in memory, stored nowhere
Send the prompt to /api/select and our sharper classifier reads it. It does better on the ambiguous, mixed prompts that simple rules miss.
Privacy
Not a policy. Not a contractual add-on. The architecture itself has nowhere for your prompt to land.
Managed selection
Logs the task type, the model that ran, and the flat fee per request. That's it. Prompt content is never written to disk or any database.
Model gateway
Forwards your request to the model provider. Configured for zero prompt retention — nothing stored in transit.
Model provider
Runs inference and returns a response. Same as calling them directly — the routing layer adds no extra data surface.
Relevant for any privacy-conscious product where prompt content shouldn't pass through a third-party logging layer. Some routing tools store prompts by default and charge extra for zero-data-retention. We don't store them at all.
Powered by Oriora — every call is labeled
Each AI result can carry an honest, un-fakeable credit line — proof that Oriora routed the call and your key was never retained. It rides on the apps you build and spreads wherever Oriora ran — proof, not a logo.
Bring Your Own Memory
Bring your own key — and bring your own memory. Your rules, your preferences, your way of working, kept in a file you own. Any Oriora app or agent reads it the moment you act, uses it for that one reply, then forgets it. One memory, everywhere you work — and we never store it.
It stays yours
Your memory lives in a file you own — your repo, your device. You edit the source; the next reply already has it. We keep no copy.
Read per call, stored nowhere
When you act, the memory is added to the prompt for that single call — then it is gone. Nothing of it is written to disk or any database.
One memory, every surface
The same memory follows you across every app and agent you use. Not per-app silos — one you, everywhere.
It's a gate, not a sync. Each call checks one thing — is your memory connected? — and either reads it or skips it. There's nothing to upload to us, and nothing of yours to delete from us later: we only ever hold a pointer that says it's there, never the memory itself.
Building an agent or app?
Just include your memory in the prompt you already send. No new endpoint, no SDK — it's text in your system message. It works today.
Using an Oriora app?
Point it at your memory file once — in the GitHub you already connect. From then on every app reads the live file. Edit it anytime; the change lands on your next reply.
Your memory file stays private — read through access you already grant, never made public, never copied to us. And because we read the live source each time, it can never go stale.
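For the "building an agent or app" case, including memory amounts to prepending the file's text as a system message. The sketch below assumes a local text file and a hypothetical helper name `with_memory`; it mirrors the gate described above by reading the file if it is there and skipping it otherwise.

```python
from pathlib import Path


def with_memory(messages, memory_path):
    """Return a new message list with the memory file as a system message.

    If the file is missing or empty, the messages are returned unchanged,
    so the call proceeds without memory.
    """
    path = Path(memory_path)
    if not path.is_file():
        return list(messages)
    memory = path.read_text(encoding="utf-8").strip()
    if not memory:
        return list(messages)
    # The memory is read fresh on every call and never cached here.
    return [{"role": "system", "content": memory}, *messages]
```

Because the file is read on each call, an edit to it is picked up by the next request without any extra step.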
You don't have to connect a separate key for every vendor. Connect a single OpenRouter key and Oriora routes through it — you reach the whole catalogue with one credential and deal with just two parties: OpenRouter for the AI, Oriora for the routing. Same selection brain, plus two independent controls over what runs and where.
Exclude a vendor on Oriora
A hard skip at the source. Oriora never picks it, never recommends it, never attempts it — as if that vendor does not exist for you. Nothing is sent.
Exclude a provider on OpenRouter
Keep the model, change the host. Oriora still routes it — but OpenRouter runs it on a host you allow, not the excluded company's own servers. The model works; the excluded company never receives your data.
Example (DeepSeek): leave DeepSeek allowed on Oriora, but in your OpenRouter account exclude DeepSeek's own host and keep Western hosts (e.g. Together, Fireworks). Oriora picks the DeepSeek model → your OpenRouter key runs it on a US host → DeepSeek-quality output, and your data never reaches DeepSeek the company.
Where your data ultimately goes is configured and owned by you, in OpenRouter — Oriora routes the model and never overrides your host choice. It is a control you own, not a guarantee Oriora makes on your behalf. Host-switching applies to open-weight models (DeepSeek, Llama, Qwen, open Mistral); closed models (GPT, Claude, Gemini) run only on their owner's servers.
Oriora is an independent product — we build the apps and the platform that run on top of third-party AI models and gateways; we don't make the models ourselves. We are not affiliated with OpenRouter or any of the third-party tools and services named on this page. Oriora has no formal partnership, sponsorship, or endorsement arrangement with any of them unless explicitly disclosed.
Smart model selection, prompt cache pass-through, fallback, and circuit breakers across every supported vendor — full mechanics and current models on the pricing page.
How it works + supported models →
01
Create an account
Sign up at orioralabs.com. Connect your vendor API keys — Oriora charges only its flat per-call fees.
02
Generate a key
Inside your account, generate an sk_oriora_... key. Yours in seconds.
03
Pass your task type
One POST request with taskType declared. Oriora handles everything from there.
Opens Settings → Oriora API keys (sign in required).
Oriora returns the best-fit model; you make the call with your own vendor key. Your key never touches us. Flat $0.001 per recommendation.
curl https://api.orioralabs.com/api/select \
-H "Authorization: Bearer sk_oriora_..." \
-H "Content-Type: application/json" \
-d '{"task_type": "coding_extra_hard"}'
# → { "model": "anthropic/claude-fable-5", "provider": "anthropic",
# "native_model": "claude-fable-5", "alternatives": [...],
# "task_type": "coding_extra_hard",
# "orchestration": { "pattern": "quality-verify",
# "verifier_model": "deepseek/deepseek-v4-pro",
# "verify_prompt": "Review the answer above for correctness
# and completeness, then produce an improved, final version.
# Respond with only the final answer." } }
#
# Task tiers: coding → coding_hard → coding_extra_hard (same for reasoning/agentic).
# coding_extra_hard = premium model + a quality-verify recommendation. To use it:
# run the call on "model", then run a second pass on "verifier_model" (a different
# vendor) feeding it your prompt + the first answer + "verify_prompt" — that second
# model reviews and returns the corrected final answer. To disable the recommendation
# account-wide, toggle "Orchestration hints" off in Settings. Oriora runs no extra call.
#
# *_extra_hard is reached by declaration (as above) or by auto-classification when the
# classifier detects high-consequence signals (security vulnerabilities, whole-system
# scope, completeness requirements). orchestration is null on all other task types.
#
# "models" is optional — omit it and Oriora ranks across the full catalogue.
# Prefer not to name the task? Send "messages" instead of "task_type" and
# Oriora classifies the prompt in memory, stored nowhere (see above).
Discover task types (no auth): GET /api/select/task-types
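The quality-verify recommendation described in the comments above can be acted on client-side roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the message layout for the second pass (prompt, first answer, then `verify_prompt`) is one reasonable reading of "your prompt + the first answer + verify_prompt", not a documented format.

```python
def plan_calls(selection):
    """List the models to call, in order, for a /api/select response."""
    plan = [selection["model"]]
    orchestration = selection.get("orchestration")
    # orchestration is null except on *_extra_hard task types.
    if orchestration and orchestration.get("pattern") == "quality-verify":
        plan.append(orchestration["verifier_model"])
    return plan


def build_verify_messages(user_prompt, first_answer, verify_prompt):
    """Messages for the second pass, sent to the verifier model."""
    return [
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": first_answer},
        {"role": "user", "content": verify_prompt},
    ]
```

Both calls run on your own vendor keys; the selection response only tells you which models to use and what to ask the verifier.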
Every request is classified into a routing tier. You can set the tier explicitly with task_type, or let Oriora infer it from your prompt. Explicit task_type is always honored exactly; prompt-based inference is a convenience for when you'd rather not classify yourself.
| To request | Set task_type, or signal in your prompt |
|---|---|
| Standard routing — the best model for the task | The default; no action needed |
| A higher-capability model for demanding work | Phrasing such as “this is complex / difficult” → the _hard tier |
| A higher-capability model plus a second, independent model that reviews the answer | Declare task_type: "coding_extra_hard" (or reasoning_/agentic_), or phrasing such as “double-check this”, “must be correct”, “high-stakes” |
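If you would rather decide the tier yourself and always declare it, the table above can be approximated in a few lines. This is a toy illustration only, not Oriora's classifier: the function name and the exact keyword lists are assumptions drawn from the example phrasings in the table.

```python
BASE_TASKS = ("coding", "reasoning", "agentic")
HARD_SIGNALS = ("complex", "difficult")
EXTRA_HARD_SIGNALS = ("double-check", "must be correct", "high-stakes")


def declare_tier(base_task, prompt):
    """Pick a task_type to declare, escalating on simple keyword signals."""
    if base_task not in BASE_TASKS:
        raise ValueError(f"unsupported base task: {base_task!r}")
    text = prompt.lower()
    # Check the strongest signals first so they win over the _hard tier.
    if any(signal in text for signal in EXTRA_HARD_SIGNALS):
        return f"{base_task}_extra_hard"
    if any(signal in text for signal in HARD_SIGNALS):
        return f"{base_task}_hard"
    return base_task
```

The returned string would go into `task_type`, where it is honored exactly, so no prompt needs to be sent for classification.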
Point your existing OpenAI SDK at Oriora. We route to the best model and run it on your own vendor key (BYOK) — your vendor bills the tokens; Oriora charges only the platform & routing fees.
curl https://api.orioralabs.com/v1/chat/completions \
-H "Authorization: Bearer sk_oriora_..." \
-H "x-oriora-app: my-app" \
-H "Content-Type: application/json" \
-d '{
"model": "oriora-auto",
"messages": [{"role": "user", "content": "Review this PR diff..."}]
}'
# Returns an OpenAI-shaped chat.completion.
# model:"oriora-auto" lets Oriora route. x-oriora-app is an optional label —
# it keeps each app's usage breakdown and cache/route warmth separate.
# Vendor keys connect account-wide under Settings -> Provider keys.
The simplest integration: one POST, no SDK shape. Same routing and BYOK execution as the OpenAI-compatible endpoint, except that you name the task type yourself. Flat $0.002 per call.
curl https://api.orioralabs.com/api/route \
-H "Authorization: Bearer sk_oriora_..." \
-H "Content-Type: application/json" \
-d '{
"taskType": "coding",
"messages": [{"role": "user", "content": "Write a regex for..."}]
}'
# → { "response": "...", "amount_usd": 0.002, "provider": "...", "latency_ms": ... }
# You declare the task (full list: GET /api/select/task-types). Add
# "classify_task": true to let Oriora read the prompt and escalate the task.
# Three tiers: coding → coding_hard (premium model) → coding_extra_hard
# (premium model + quality-verify orchestration; also works for agentic/reasoning).
What you can rely on, and what each status code means. Failed BYOK calls are not charged.
Rate limit 60 calls/min per account on the AI endpoints.
Every response carries standard RateLimit-* headers.
Input cap 800,000 characters (~200k tokens) per request.
Output cap 16,384 tokens (max_tokens above this is clamped).
401 missing or invalid key — these endpoints take sk_oriora_... API
keys, not website session logins
402 insufficient_funds — top up your wallet at orioralabs.com
403 requires BYOK — connect a vendor key in Settings -> Provider keys
400 unknown task_type — valid values: GET /api/select/task-types
413 input too large (see caps above)
429 rate limited — wait for the RateLimit-Reset header, then retry
502 routing or vendor failure — the call was not charged (BYOK)
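A client might turn the limits and status codes above into a small retry policy like the one below. Treat it as a sketch with labeled assumptions: the function names are hypothetical, `RateLimit-Reset` is assumed to carry a number of seconds to wait, and retrying a 502 with exponential backoff is a client-side choice rather than something this page prescribes.

```python
MAX_OUTPUT_TOKENS = 16_384  # output cap listed above


def clamp_max_tokens(requested):
    """Mirror the server-side clamp so the client knows the real limit."""
    return min(requested, MAX_OUTPUT_TOKENS)


def retry_delay(status, headers, attempt):
    """Seconds to wait before retrying, or None if a retry will not help.

    attempt is the zero-based count of retries already made.
    """
    backoff = float(min(2 ** attempt, 30))
    if status == 429:
        # Header names are case-insensitive, so normalize before lookup.
        reset = {k.lower(): v for k, v in headers.items()}.get("ratelimit-reset")
        try:
            return max(float(reset), 0.0)
        except (TypeError, ValueError):
            return backoff  # header missing or not numeric
    if status == 502:
        return backoff  # assumption: vendor failures may be transient
    # 400, 401, 402, 403, 413 need a change to the request or the account.
    return None
```

Codes that return `None` should be surfaced to the caller instead of retried, since repeating the same request would fail the same way.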