Billing
Pay-as-you-go credits, no subscription. $1 in free credits on signup. Stripe-powered top-ups, full-retail pricing, and credits that don't expire.
How billing works
Every API call writes a usage_events row through the proxy. The Worker reads DeepSeek's usage object off the final SSE chunk, looks up the active rate card for the model, and computes cost in microcents (1 microcent = 1e-6 USD). Microcent precision means we never round mid-calculation and never store floats — the ledger only ever adds and subtracts integers.
The same transaction debits your balance from credit_ledger. The whole thing is idempotent on request_id, so a retry of the same upstream call cannot double-charge you. If the meter write fails, the row is buffered to a Cloudflare KV queue and a scheduled drainer retries up to 12 times before dead-lettering. The proxy never silently drops a metered call.
- Atomic — usage row + ledger debit land in one transaction.
- Idempotent — keyed on
request_id, so retries can't double-charge. - Microcent precision — integer arithmetic end to end. No floats. No rounding drift.
- Audit-grade — the verbatim DeepSeek
usageobject is stored inraw_usage_jsonalongside the resolvedrate_card_version.
Rates
Two SKUs. Prices are USD per 1M tokens. The "billed" column is what your account is charged; the "upstream" column is what we pay DeepSeek and is published purely for transparency.
| Model | Input (uncached) | Input (cached) | Output |
|---|---|---|---|
deepseek-v4-flash — billed | $0.35 | $0.007 | $0.70 |
deepseek-v4-flash — upstream | $0.14 | $0.0028 | $0.28 |
deepseek-v4-pro — billed | $4.35 | $0.03625 | $8.70 |
deepseek-v4-pro — upstream | $1.74 | $0.0145 | $3.48 |
The 2.5× margin covers infrastructure, Stripe fees, free signup credit, and the proxy's engineering. The split between billed and upstream is fixed — there's no enterprise tier, no volume discount, and no negotiation. If that's a dealbreaker for your use case, the API is OpenAI-compatible and DeepSeek sells direct.
Fred bills full retail. We don't pass through DeepSeek's promotional or off-peak discounts; you pay the same regardless of when you call. That keeps your bill predictable and immune to mid-month rate changes — and it's enforced at the type level in our pricing module, so we couldn't ship a "promo override" by accident even if we wanted to.
Reasoning tokens
On deepseek-v4-pro, the model emits a chain-of-thought before the visible output. Those reasoning tokens count toward the output total and are billed at the output rate — there's no separate "reasoning" rate card.
DeepSeek surfaces them in usage.completion_tokens_details.reasoning_tokens on the final SSE chunk. The proxy captures that into the per-row breakdown so you can see how much of an expensive turn was thinking versus answering — but the total cost is unchanged. If completion_tokens says 4,000, you pay for 4,000 output tokens, full stop.
Practically: deepseek-v4-pro is roughly 12× the output cost of flash, and reasoning typically inflates a turn's output count by 3–10× depending on difficulty. If you're seeing surprise bills, swap back to flash with /flash for the easy turns.
Cached prompt input
DeepSeek caches identical prompt prefixes across requests. When the next request comes in with the same prefix, the cached portion bills at the much lower cached rate — 50× cheaper for flash, 120× cheaper for pro. That's not a typo.
Structure long prompts so the stable parts come first and the volatile parts come last:
- System prompt — the most stable, byte-identical across every call.
- Tool definitions — also stable; only changes when you ship a new tool.
- Long context (file contents, docs) — stable within a session; whole-block replacements bust the cache, but appending doesn't.
- User query last — the volatile part. Goes at the bottom so the prefix above it stays cacheable.
The Fred CLI's system prompt and tool definitions are already structured this way; that's why a long agent loop is much cheaper than its raw token count would suggest. If you're building your own client, the same trick applies — keep your stable prefix byte-identical across calls and DeepSeek's cache will earn the savings back.
Top-ups
Stripe Checkout. Click Top up at /billing on the dashboard, enter a USD amount (min $5, max $1000), pay, done. Funds land as a stripe_topup ledger row immediately on webhook delivery — usually a couple seconds after the Stripe success redirect.
- Min $5, max $1,000 per Checkout session. Need bigger? Email
support@fredcode.netand we'll arrange invoiced top-ups. - No expiry. Credits sit on your account forever.
- No auto-recharge by default. Opt in on the billing page if you want a refill threshold.
Free signup credit
New accounts auto-get $1 in credits via the signup_bonus ledger row. That's enough for a few hundred turns on deepseek-v4-flash at typical agent-loop sizes — long enough to evaluate the product without ever opening Stripe.
The bonus is one-time per account. Closing and reopening the account does not re-trigger it.
Out of credits
When your balance won't cover the next call, the proxy returns 402 insufficient_credits with a top-up link in the body:
{
"error": {
"code": "insufficient_credits",
"message": "Account balance is below the cost of this request.",
"topup_url": "https://app.fredcode.net/billing?topup=1"
}
}
The Fred CLI catches this and prompts you to top up inline before retrying the turn — no silent over-spend, no in-flight call partially completing on credit you don't have. Bare API consumers should handle the 402 the same way: pause, open topup_url in a browser, retry once funds land.
Internal accounts
If your account has is_internal=true (set by an admin during onboarding for the team or beta testers), API calls are billed at zero. The proxy still records the row with cost_billed_microcents=0, but cost_upstream_microcents stays at full retail — the rate card never lies about what we paid DeepSeek, even for free users. That's a load-bearing invariant for monthly reconciliation.
Internal accounts can't be created through the dashboard; they're flipped on by an admin. If you think you should be one and aren't, ping us.
Reconciliation
Once a month, a Vercel cron computes per-month aggregates from usage_events and we cross-check them against DeepSeek's invoice. Each invoice month becomes a row in reconciliation_runs, keyed by period_start. We compute drift_microcents and drift_bps against the invoice; anything over 200 bps gets investigated before the run is marked matched.
You don't see this directly — it's a backstop. Mention it to your finance team if they ask how we know the bill is right.
Where to see your usage
app.fredcode.net/usage— per-day stacked breakdown by model, with the share of spend ondeepseek-v4-prohighlighted (it's where the money goes).app.fredcode.net/billing— full ledger, current rate cards, top-up button, download invoices.- From the CLI —
/costshows session totals,fred usageshows the last 30 days at the terminal.
Refunds
Email support@fredcode.net from the address on your account. We'll book a refund ledger entry against the original top-up and Stripe will reverse the charge. There are no automatic refunds for unused credits because credits don't expire — if you want a chunk of your balance back, just ask.
If you set up a CI run that accidentally goes wide — a stuck loop, an infinite recursion, a misconfigured agent — we'll work with you on it. Email us before paying $$$ to fix the regret. We see the rows; we can usually tell when a bill is the product working correctly versus a runaway.
See also
API reference — endpoint shapes, error codes, correlation headers. Models — when to pick flash vs pro. /pricing — the same rate cards in marketing format.