Pricing

$99.99 a month. Plus what you use.

A $99.99 USD monthly subscription gets you access to the private LLM — Zoras Chat, Zoras Coding, and the OpenAI-compatible API. On top, pay only per million tokens at rates up to 80× cheaper than frontier US vendors.

Base$99.99 USD / month
UsagePer 1M tokens, by model
MinimumsNone on usage
BillingMonthly · auto-invoiced
How pricing works

One plan. Base plus usage.

Every customer is on the same plan: a flat $99.99 monthly platform fee plus what you use, priced by model. Simple to forecast, nothing hidden in the fine print.

● PLATFORM FEE

Developer plan

$99.99 USD / month

Covers access to the private LLM — Zoras Chat for your team, Zoras Coding for your developers, and the OpenAI-compatible API. Token usage (see the table below) is billed on top at per-model rates.

  • Zoras Chat — branded web app on your subdomain
  • Zoras Coding — Cursor / VS Code / JetBrains backend
  • API access with private keys & usage analytics
  • Every model in the table below available
  • Standard rate limits (10 RPM · 100K TPM)
  • Email support with same-day responses
Start for $99.99 →
◆ USAGE · PER 1M TOKENS

+ API usage

$0.12 starting · Zoras in/out per 1M

Token usage priced per million, separately for input (your prompt) and output (the response). Zoras and Zoras Coder are flat $0.12 in / $0.12 out; other models tiered. No minimum, no commitment.

  • Zoras & Zoras Coder at $0.12 / $0.12 per 1M
  • Full table of 10 models — see below
  • No minimum spend on usage
  • Rate limits & daily spend caps per key
  • Volume discounts available at scale
  • Embeddings priced separately ($0.05 per 1M)
See model pricing ↓

Typical developer spend: $99.99 base + $5–$50 in usage for light prototyping, $99.99 + $100–$500 for active production apps. Need a private deployment with no per-token fees? See enterprise pricing ↓

Per-model usage · billed on top of $99.99 base

Every model, on one private endpoint.

All models below run on the same sovereign Zoras infrastructure — your data never leaves. Pricing is per million tokens (1M = 1,000,000), charged separately for input (your prompt) and output (the response).

Model Alias Input Output
— Zoras native models
— GLM family
GLM 5.1
Long context (1M tokens), strong reasoning.
glm-5.1 $1.68/1M $5.28/1M
GLM 5
Flagship GLM generation; prior version of 5.1.
glm-5 $1.68/1M $5.28/1M
GLM 4.7
Mid-tier GLM; excellent quality-to-price.
glm-4.7 $0.72/1M $2.64/1M
— DeepSeek family
DeepSeek V3.1
Strong generalist with competitive throughput.
deepseek-v3.1 $0.67/1M $2.02/1M
DeepSeek V3.2
Latest DeepSeek; sharper reasoning, same pricing.
deepseek-v3.2 $0.67/1M $2.02/1M
— Qwen & Kimi
Qwen3 Plus
High-quality multilingual with cheap input.
qwen3-plus $0.60/1M $3.60/1M
Kimi K2.5
Moonshot custom variant; solid generalist.
kimi-k2.5-custom $0.72/1M $3.60/1M
Kimi K2.6
Latest Kimi; stronger reasoning and tool-use.
kimi-k2.6-custom $1.14/1M $4.80/1M

All prices in USD per 1,000,000 tokens. Input tokens are counted from your prompt (system + user messages + function schemas). Output tokens are counted from the model's response. Embeddings and fine-tuning priced separately — contact us.

Market comparison

Up to 80× cheaper than frontier US vendors.

Our Zoras models are priced to make high-volume workloads — agent loops, document processing, scheduled tasks — economically viable. Here's how $0.12 in / $0.12 out stacks up.

Zoras
$0.12 / $0.12
Our price per 1M tokens
OpenAI GPT-4o
$2.50 / $10.00
~20× input · 83× output
Anthropic Sonnet
$3.00 / $15.00
~25× input · 125× output
Google Gemini Pro
$1.25 / $5.00
~10× input · 42× output

Competitive prices as of April 2026, published vendor list rates. Your workloads may run at scale with volume discounts from all vendors; contact sales for a side-by-side for your actual usage.

Enterprise

Private deployments: flat platform fee.

If you need Zoras deployed in your own VPC or on-prem — with the API, Chat, and Code running on your infrastructure and your hardware — pricing moves to a flat annual platform fee plus an implementation engagement. No per-token markup. You own the GPU capacity.

  • Flat annual platform fee (site license)
  • Implementation engagement (2–6 weeks)
  • Managed support with SLA & runbooks
  • Air-gapped and sovereign-region available
  • BAA, DPA, SOC 2 report, security questionnaires

What's included

Unlimited API usage
On your own GPU capacity
✓ Included
Chat + Code + Automate
All products, all users
✓ Included
Integrations & MCP
120+ native connectors
✓ Included
Managed upgrades
On your maintenance schedule
✓ Included
99.9% SLA
With support tiers
✓ Included
FAQ

Common pricing questions.

What's a token, in plain English?+
A token is roughly 3-4 characters of English. One million tokens is approximately 750,000 words — about a 3,000-page book. For our Zoras models at $0.12 per million tokens, that's roughly $0.12 to read a small library, or draft 750,000 words of output.
Why are Zoras and Zoras Coder so much cheaper than everything else?+
Zoras runs on open-weight models we've fine-tuned and self-host on our own GPU capacity. We don't pay licensing to OpenAI or Anthropic, and we don't mark up third-party API costs. You get the savings.
What does the $99.99 monthly fee cover?+
Access to the private LLM itself — that's Zoras Chat (the branded web app on your subdomain), Zoras Coding (the backend for Cursor, VS Code, JetBrains), and the OpenAI-compatible API with your private keys. Standard rate limits (10 requests per minute, 100K tokens per minute) and email support are included. Token usage is billed on top at the per-model rates in the table above.
Is there a minimum usage commitment on top of the base?+
No. The $99.99/month is the only fixed cost. Token usage is pure pay-as-you-go with no minimums — if you don't make a single API call in a month, you still only pay the base subscription. Set rate limits and monthly spend caps per API key to stay in control.
Can I pause or cancel the subscription?+
Yes. Monthly subscription — cancel anytime, prorated to the end of the billing period. No annual commitment required for the Developer plan.
Are there volume discounts?+
Yes. At ~$5K/month and above, we tier into volume-discounted contracts. Above ~$15K/month, a private deployment with flat platform pricing is usually cheaper than usage-based — talk to us.
Do you charge for embeddings or function calls?+
Embeddings are billed at $0.05 per 1M tokens on our Zoras embedding model. Function calling is billed as part of the regular chat completion — your function schema counts as input tokens, the tool-call response counts as output.
Can we run this in our own VPC?+
Yes — see the enterprise section above. Private deployments move to a flat annual platform fee. All models above are available on your own infrastructure. Air-gapped deployments available for defense and regulated customers.
How do I pay?+
Prepaid credits via credit card for self-serve customers. Monthly invoicing with net-30 terms for organizations on annual contracts. Payments in USD, CAD, or EUR.
Get started

Build on a $0.12 API.

Sign up, get a key, point your OpenAI SDK at our base URL. You'll be making calls inside five minutes — and your CFO won't flinch at the bill.