Models
Fred ships with two DeepSeek SKUs. Default to flash; switch to pro for the hard turns. That's the whole strategy.
deepseek-v4-flash (default)
Fast, cheap, surprisingly capable. This is what every session starts on, and it's the right choice for the overwhelming majority of coding work — refactors, file edits, search, test scaffolding, lint fixes, anything where the bottleneck is "look at the code and apply the obvious change."
| Bucket | Upstream rate | What you pay (2.5× margin) |
|---|---|---|
| Input (cache miss) | $0.14 / M tokens | $0.35 / M tokens |
| Input (cache hit) | $0.0028 / M tokens | $0.007 / M tokens |
| Output | $0.28 / M tokens | $0.70 / M tokens |
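As a sanity check on the table, a turn's cost is just token counts times these per-million rates. A minimal sketch, using the "what you pay" flash rates above (the `turn_cost` helper is illustrative, not part of Fred):

```python
# Per-million-token rates from the flash table above ("what you pay").
FLASH = {"input_miss": 0.35, "input_hit": 0.007, "output": 0.70}

def turn_cost(rates, miss_tokens, hit_tokens, out_tokens):
    """Dollar cost of one turn, given token counts per billing bucket."""
    return (miss_tokens * rates["input_miss"]
            + hit_tokens * rates["input_hit"]
            + out_tokens * rates["output"]) / 1_000_000

# A 50k-input turn with 40k of it served from cache, plus 1k of output:
cost = turn_cost(FLASH, miss_tokens=10_000, hit_tokens=40_000, out_tokens=1_000)
print(f"${cost:.4f}")  # → $0.0045
```

Note how heavily the cache-hit rate matters: the 40k cached tokens cost less than half a cent's worth of the total.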
deepseek-v4-pro
The reasoning SKU. Slower, much more expensive, and noticeably smarter on hard problems. Worth it for: hairy debugging where the bug spans files, architectural design from a blank page, math/algorithms work, or anytime flash gets stuck in a loop and you can feel it guessing.
| Bucket | Upstream rate | What you pay (2.5× margin) |
|---|---|---|
| Input (cache miss) | $1.74 / M tokens | $4.35 / M tokens |
| Input (cache hit) | $0.0145 / M tokens | $0.0363 / M tokens |
| Output | $3.48 / M tokens | $8.70 / M tokens |
Output is roughly 12× the cost of flash. Pro is only worth running on the turn that actually needs the reasoning — flip back to flash as soon as the gnarly part is done.
Switching with /flash and /pro
Inside the REPL, two slash commands swap your main slot:
```
> /flash   # → deepseek-v4-flash
> /pro     # → deepseek-v4-pro
```

The first /pro in a session prompts you to confirm, with a quick reminder of the cost ratio. Subsequent uses don't re-prompt. Your choice persists to ~/.config/fred/preferences.json so it survives across restarts — start a new Fred next morning and you're still on whatever you last picked.
If you're scripting Fred (CI, batch jobs, fred -p pipelines) and don't want the confirmation prompt, set:
```
export FRED_NO_CONFIRM_PRO=1
```

You can /pro halfway through a session, ask the hard question, then /flash to keep going on cleanup. The slot swap takes effect on the very next turn — there's no re-warming, no session reset.
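The confirmation gate amounts to a simple check. A sketch under stated assumptions: only the FRED_NO_CONFIRM_PRO variable is from this page; the function name and session flag are hypothetical, not Fred's actual internals:

```python
import os

def should_confirm_pro(already_confirmed_this_session: bool) -> bool:
    """Prompt before the first /pro of a session, unless suppressed for scripting."""
    if os.environ.get("FRED_NO_CONFIRM_PRO") == "1":
        return False  # CI / batch / fred -p pipelines: never prompt
    return not already_confirmed_this_session

print(should_confirm_pro(False))  # first /pro of an interactive session
print(should_confirm_pro(True))   # later /pro uses don't re-prompt
```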
Slot architecture
Fred has three model slots, each with its own job:
- main — does the actual reasoning. Reads your code, writes the diffs, makes the decisions. This is the one /pro swaps.
- editor — formats and validates diffs. (TODO: not yet wired up — currently inherits whatever main is on. Once landed, will run on flash regardless of the main slot.)
- weak — auto-compaction, summarization of old turns, side-channel work that doesn't need reasoning. Always flash by default.
/pro deliberately only swaps main. Paying 12× output rates for a 3,000-token compaction summary is wasted spend — those summaries are exactly the work weak is for. If you actually want every slot on pro (and you have a reason), set them individually:
```
> /model main pro
> /model weak pro
> /model editor pro
```

Or one at a time if you prefer. /model with no args prints the current slot assignments.
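The slot behavior described above can be pictured as plain state. A minimal sketch — the slot and model names are from this page, but the functions are illustrative stand-ins, not Fred's implementation:

```python
# Default assignments: every slot starts on flash.
slots = {"main": "deepseek-v4-flash",
         "editor": "deepseek-v4-flash",  # TODO upstream: will pin to flash
         "weak": "deepseek-v4-flash"}

def slash_pro(slots):
    """/pro swaps only the main slot; editor and weak stay put."""
    slots["main"] = "deepseek-v4-pro"

def slash_model(slots, slot, model):
    """/model <slot> <name> sets one slot explicitly."""
    slots[slot] = model

slash_pro(slots)
assert slots["main"] == "deepseek-v4-pro"
assert slots["weak"] == "deepseek-v4-flash"   # compaction stays cheap
slash_model(slots, "weak", "deepseek-v4-pro") # only if you really mean it
```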
Persistence and precedence
Several things can set a model. When they conflict, this is the order — top wins:
1. CLI flag --model <name> (per-invocation, applies to main).
2. Environment variables: FRED_MAIN_MODEL, FRED_EDITOR_MODEL, FRED_WEAK_MODEL. Legacy DSC_* names are still honored.
3. ~/.config/fred/preferences.json — what /model, /flash, and /pro write to.
4. The compiled-in default (DEFAULT_MODEL = deepseek-v4-flash).
So a --model deepseek-v4-pro on the command line beats whatever your preferences file says, which beats the default. Useful for one-off pro runs without polluting your saved preference.
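That precedence order boils down to "first source that answers wins." A sketch under assumptions: the resolver function is ours, the preferences-file key `"main"` and the exact legacy name `DSC_MAIN_MODEL` are guesses from the patterns on this page:

```python
import os

DEFAULT_MODEL = "deepseek-v4-flash"

def resolve_main_model(cli_model=None, prefs=None):
    """Top wins: CLI flag > env var > preferences file > compiled-in default."""
    if cli_model:                       # --model <name>
        return cli_model
    env = os.environ.get("FRED_MAIN_MODEL") or os.environ.get("DSC_MAIN_MODEL")
    if env:                             # legacy DSC_* names still honored
        return env
    if prefs and prefs.get("main"):     # ~/.config/fred/preferences.json
        return prefs["main"]
    return DEFAULT_MODEL

# CLI flag beats a saved preference:
print(resolve_main_model("deepseek-v4-pro", {"main": "deepseek-v4-flash"}))
# → deepseek-v4-pro
```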
How rates appear in the CLI
After every turn, Fred prints a status line that ends with the live cost:
```
· deepseek-v4-flash in=12,431 cached=8,200 out=412 $0.0058
```

Those numbers come from the rate-card cache at ~/.config/fred/rate-card.json, refreshed on every fred login from https://api.fredcode.net/api/pricing. There's a bundled fallback rate card baked into the package for offline first-runs, so a brand-new install can still print accurate costs before its first network round-trip.
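A client can turn those counters into a dollar figure straight from a cached rate card. A sketch under loud assumptions: the JSON schema below mirrors the flash pricing table on this page but is not Fred's actual rate-card format, and whether `in=` includes or excludes the cached tokens is also a guess (here, `cached` is treated as a subset of `in`):

```python
import json

# Hypothetical rate-card entry, shaped after the flash pricing table above.
RATE_CARD = json.loads(
    '{"deepseek-v4-flash":'
    ' {"input_miss": 0.35, "input_hit": 0.007, "output": 0.70}}'
)

def status_cost(model, in_tokens, cached, out_tokens):
    """Dollar cost of one turn; `cached` is the slice of input served from cache."""
    r = RATE_CARD[model]
    miss = in_tokens - cached
    return (miss * r["input_miss"] + cached * r["input_hit"]
            + out_tokens * r["output"]) / 1_000_000

cost = status_cost("deepseek-v4-flash", 12_431, 8_200, 412)
print(f"${cost:.4f}")  # → $0.0018
```

Treat the function as illustrative of the mechanism, not as a reproduction of Fred's exact accounting.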
Adding new models
DeepSeek occasionally ships new SKUs. When that happens, we bump the Fred package — the CLI's allowlist updates and you can pin the new model with /model main <new-name> after upgrading.
Until a model is in the allowlist, the proxy returns HTTP 400 with unknown_model rather than silently passing it through. If you try a model name Fred doesn't know about and get that error, run fred update and try again.
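The proxy-side allowlist check amounts to a membership test before anything is forwarded upstream. A minimal sketch — the HTTP 400 status and unknown_model code are from this page, but the error payload shape and function name are assumptions:

```python
ALLOWLIST = {"deepseek-v4-flash", "deepseek-v4-pro"}

def validate_model(name):
    """Reject unknown SKUs up front instead of silently passing them through."""
    if name not in ALLOWLIST:
        # Documented behavior: HTTP 400 with an `unknown_model` error code.
        return 400, {"error": "unknown_model", "model": name}
    return 200, None

print(validate_model("deepseek-v5-flash")[0])  # → 400
```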
A 50,000-input (uncached), 1,000-output turn costs roughly $0.0182 on flash and $0.2262 on pro, about 12.4× end-to-end: per the tables, pro carries the same ~12.4× premium on cache-miss input and on output, so the mix of input versus output barely moves the ratio. What does move it is caching: pro's cache-hit input rate is only about 5.2× flash's, so long sessions with warm prompt caches pay a smaller overall premium.
If you've been running on pro for a while, check your spend with /usage.
Next
- Slash commands — every command available in the REPL.
- CLI reference — flags, env vars, subcommands.