A $99.99 USD monthly subscription gets you access to the private LLM — Zoras Chat, Zoras Coding, and the OpenAI-compatible API. On top, pay only per million tokens at rates up to 80× cheaper than frontier US vendors.
Every customer is on the same plan: a flat $99.99 monthly platform fee plus what you use, priced by model. Simple to forecast, nothing hidden in the fine print.
Covers access to the private LLM — Zoras Chat for your team, Zoras Coding for your developers, and the OpenAI-compatible API. Token usage (see the table below) is billed on top at per-model rates.
Token usage priced per million, separately for input (your prompt) and output (the response). Zoras and Zoras Coder are flat $0.12 in / $0.12 out; other models tiered. No minimum, no commitment.
Typical developer spend: $99.99 base + $5–$50 in usage for light prototyping, $99.99 + $100–$500 for active production apps. Need a private deployment with no per-token fees? See enterprise pricing ↓
All models below run on the same sovereign Zoras infrastructure — your data never leaves. Pricing is per million tokens (1M = 1,000,000), charged separately for input (your prompt) and output (the response).
| Model | Alias | Input | Output |
|---|---|---|---|
| — Zoras native models | |||
|
Zoras RECOMMENDED
Our general-purpose workhorse. Fast, versatile, flat pricing.
|
zoras |
$0.12/1M | $0.12/1M |
|
Zoras Coder
Tuned for code completion, agent mode, and repo-aware tasks.
|
zoras-coder |
$0.12/1M | $0.12/1M |
| — GLM family | |||
|
GLM 5.1
Long context (1M tokens), strong reasoning.
|
glm-5.1 |
$1.68/1M | $5.28/1M |
|
GLM 5
Flagship GLM generation; prior version of 5.1.
|
glm-5 |
$1.68/1M | $5.28/1M |
|
GLM 4.7
Mid-tier GLM; excellent quality-to-price.
|
glm-4.7 |
$0.72/1M | $2.64/1M |
| — DeepSeek family | |||
|
DeepSeek V3.1
Strong generalist with competitive throughput.
|
deepseek-v3.1 |
$0.67/1M | $2.02/1M |
|
DeepSeek V3.2
Latest DeepSeek; sharper reasoning, same pricing.
|
deepseek-v3.2 |
$0.67/1M | $2.02/1M |
| — Qwen & Kimi | |||
|
Qwen3 Plus
High-quality multilingual with cheap input.
|
qwen3-plus |
$0.60/1M | $3.60/1M |
|
Kimi K2.5
Moonshot custom variant; solid generalist.
|
kimi-k2.5-custom |
$0.72/1M | $3.60/1M |
|
Kimi K2.6
Latest Kimi; stronger reasoning and tool-use.
|
kimi-k2.6-custom |
$1.14/1M | $4.80/1M |
All prices in USD per 1,000,000 tokens. Input tokens are counted from your prompt (system + user messages + function schemas). Output tokens are counted from the model's response. Embeddings and fine-tuning priced separately — contact us.
Our Zoras models are priced to make high-volume workloads — agent loops, document processing, scheduled tasks — economically viable. Here's how $0.12 in / $0.12 out stacks up.
Competitive prices as of April 2026, published vendor list rates. Your workloads may run at scale with volume discounts from all vendors; contact sales for a side-by-side for your actual usage.
If you need Zoras deployed in your own VPC or on-prem — with the API, Chat, and Code running on your infrastructure and your hardware — pricing moves to a flat annual platform fee plus an implementation engagement. No per-token markup. You own the GPU capacity.
Sign up, get a key, point your OpenAI SDK at our base URL. You'll be making calls inside five minutes — and your CFO won't flinch at the bill.