Pricing — Zoras

How pricing works

One plan. Base plus usage.

Every customer is on the same plan: a flat $99.99 monthly platform fee plus what you use, priced by model. Simple to forecast, nothing hidden in the fine print.

● PLATFORM FEE

Developer plan

$99.99 USD / month

Covers access to the private LLM — Zoras Chat for your team, Zoras Coding for your developers, and the OpenAI-compatible API. Token usage (see the table below) is billed on top at per-model rates.

Zoras Chat — branded web app on your subdomain
Zoras Coding — Cursor / VS Code / JetBrains backend
API access with private keys & usage analytics
Every model in the table below available
Standard rate limits (10 RPM · 100K TPM)
Email support with same-day responses

Start for $99.99 →

◆ USAGE · PER 1M TOKENS

+ API usage

$0.12 starting · Zoras in/out per 1M

Token usage priced per million, separately for input (your prompt) and output (the response). Zoras and Zoras Coder are flat $0.12 in / $0.12 out; other models tiered. No minimum, no commitment.

Zoras & Zoras Coder at $0.12 / $0.12 per 1M
Full table of 10 models — see below
No minimum spend on usage
Rate limits & daily spend caps per key
Volume discounts available at scale
Embeddings priced separately ($0.05 per 1M)

See model pricing ↓

Typical developer spend: $99.99 base + $5–$50 in usage for light prototyping, $99.99 + $100–$500 for active production apps. Need a private deployment with no per-token fees? See enterprise pricing ↓

Per-model usage · billed on top of $99.99 base

Every model, on one private endpoint.

All models below run on the same sovereign Zoras infrastructure — your data never leaves. Pricing is per million tokens (1M = 1,000,000), charged separately for input (your prompt) and output (the response).

Model	Alias	Input	Output
— Zoras native models
Zoras RECOMMENDED Our general-purpose workhorse. Fast, versatile, flat pricing.	`zoras`	$0.12/1M	$0.12/1M
Zoras Coder Tuned for code completion, agent mode, and repo-aware tasks.	`zoras-coder`	$0.12/1M	$0.12/1M
— GLM family
GLM 5.1 Long context (1M tokens), strong reasoning.	`glm-5.1`	$1.68/1M	$5.28/1M
GLM 5 Flagship GLM generation; prior version of 5.1.	`glm-5`	$1.68/1M	$5.28/1M
GLM 4.7 Mid-tier GLM; excellent quality-to-price.	`glm-4.7`	$0.72/1M	$2.64/1M
— DeepSeek family
DeepSeek V3.1 Strong generalist with competitive throughput.	`deepseek-v3.1`	$0.67/1M	$2.02/1M
DeepSeek V3.2 Latest DeepSeek; sharper reasoning, same pricing.	`deepseek-v3.2`	$0.67/1M	$2.02/1M
— Qwen & Kimi
Qwen3 Plus High-quality multilingual with cheap input.	`qwen3-plus`	$0.60/1M	$3.60/1M
Kimi K2.5 Moonshot custom variant; solid generalist.	`kimi-k2.5-custom`	$0.72/1M	$3.60/1M
Kimi K2.6 Latest Kimi; stronger reasoning and tool-use.	`kimi-k2.6-custom`	$1.14/1M	$4.80/1M

All prices in USD per 1,000,000 tokens. Input tokens are counted from your prompt (system + user messages + function schemas). Output tokens are counted from the model's response. Embeddings and fine-tuning priced separately — contact us.

Market comparison

Up to 80× cheaper than frontier US vendors.

Our Zoras models are priced to make high-volume workloads — agent loops, document processing, scheduled tasks — economically viable. Here's how $0.12 in / $0.12 out stacks up.

Zoras

$0.12 / $0.12

Our price per 1M tokens

OpenAI GPT-4o

$2.50 / $10.00

~20× input · 83× output

Anthropic Sonnet

$3.00 / $15.00

~25× input · 125× output

Google Gemini Pro

$1.25 / $5.00

~10× input · 42× output

Competitive prices as of April 2026, published vendor list rates. Your workloads may run at scale with volume discounts from all vendors; contact sales for a side-by-side for your actual usage.

Enterprise

Private deployments: flat platform fee.

If you need Zoras deployed in your own VPC or on-prem — with the API, Chat, and Code running on your infrastructure and your hardware — pricing moves to a flat annual platform fee plus an implementation engagement. No per-token markup. You own the GPU capacity.

Flat annual platform fee (site license)
Implementation engagement (2–6 weeks)
Managed support with SLA & runbooks
Air-gapped and sovereign-region available
BAA, DPA, SOC 2 report, security questionnaires

Talk to enterprise → About & security

What's included

Unlimited API usage

On your own GPU capacity

✓ Included

Chat + Code + Automate

All products, all users

✓ Included

Integrations & MCP

120+ native connectors

✓ Included

Managed upgrades

On your maintenance schedule

✓ Included

99.9% SLA

With support tiers

✓ Included

FAQ

Common pricing questions.

What's a token, in plain English?+

A token is roughly 3-4 characters of English. One million tokens is approximately 750,000 words — about a 3,000-page book. For our Zoras models at $0.12 per million tokens, that's roughly $0.12 to read a small library, or draft 750,000 words of output.

Why are Zoras and Zoras Coder so much cheaper than everything else?+

Zoras runs on open-weight models we've fine-tuned and self-host on our own GPU capacity. We don't pay licensing to OpenAI or Anthropic, and we don't mark up third-party API costs. You get the savings.

What does the $99.99 monthly fee cover?+

Access to the private LLM itself — that's Zoras Chat (the branded web app on your subdomain), Zoras Coding (the backend for Cursor, VS Code, JetBrains), and the OpenAI-compatible API with your private keys. Standard rate limits (10 requests per minute, 100K tokens per minute) and email support are included. Token usage is billed on top at the per-model rates in the table above.

Is there a minimum usage commitment on top of the base?+

No. The $99.99/month is the only fixed cost. Token usage is pure pay-as-you-go with no minimums — if you don't make a single API call in a month, you still only pay the base subscription. Set rate limits and monthly spend caps per API key to stay in control.

Can I pause or cancel the subscription?+

Yes. Monthly subscription — cancel anytime, prorated to the end of the billing period. No annual commitment required for the Developer plan.

Are there volume discounts?+

Yes. At ~$5K/month and above, we tier into volume-discounted contracts. Above ~$15K/month, a private deployment with flat platform pricing is usually cheaper than usage-based — talk to us.

Do you charge for embeddings or function calls?+

Embeddings are billed at $0.05 per 1M tokens on our Zoras embedding model. Function calling is billed as part of the regular chat completion — your function schema counts as input tokens, the tool-call response counts as output.

Can we run this in our own VPC?+

Yes — see the enterprise section above. Private deployments move to a flat annual platform fee. All models above are available on your own infrastructure. Air-gapped deployments available for defense and regulated customers.

How do I pay?+

Prepaid credits via credit card for self-serve customers. Monthly invoicing with net-30 terms for organizations on annual contracts. Payments in USD, CAD, or EUR.

Get started

Build on a $0.12 API.

Sign up, get a key, point your OpenAI SDK at our base URL. You'll be making calls inside five minutes — and your CFO won't flinch at the bill.

Get an API key → See Zoras Coding

$99.99 a month. Plus what you use.

One plan. Base plus usage.

Developer plan

+ API usage

Every model, on one private endpoint.

Up to 80× cheaper than frontier US vendors.

Private deployments: flat platform fee.

What's included

Common pricing questions.

Build on a $0.12 API.